Strings Flashcards

Question 1

Q

chr() and ord()

Answer

A

ord(b'A')

returns 65

chr(65)

returns 'A'. It doesn’t return b'A'.

A standard string (the output of chr()) doesn’t work exactly like a byte string.

For conversions to and from bytes, there is the struct module.
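A quick sketch of the difference, using only built-ins (the variable names are just for illustration):

```python
# ord() accepts a one-character str or a length-1 bytes object.
code_from_bytes = ord(b'A')
code_from_str = ord('A')

# chr() always returns a str, never bytes.
text_char = chr(65)

# One way to get a single byte back: build bytes from a list of ints.
byte_char = bytes([65])

print(code_from_bytes, code_from_str)    # 65 65
print(repr(text_char), repr(byte_char))  # 'A' b'A'
print(text_char == byte_char)            # False: str and bytes are different types
```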

Question 2

Q

struct module

Answer

A

Performs complex conversions

chr() and ord() each handle only a single character or byte at a time. Converting a number to a single byte limits its value to the range 0 to 255.

struct.pack() writes values out as byte strings.
struct.unpack() reads those values back into Python.

>>> import struct

>>> struct.pack(b'B', 65)

b'A'

>>> struct.pack(b'B', 33)

b'!'

>>> struct.pack(b'BBBBBBB', 69, 120, 97, 109, etc….)

b'Example'

If the input is signed, there are still 256 values, but they range from -128 to 127.

>>> struct.pack(b'bb', 65, -23)

b'A\xe9'

A lowercase format code assumes a signed value.

For two-byte numbers, use H and h.

There are 65,536 possible values.

unpack

>>> struct.unpack(b'H', b'\x00*')

(10752,)

>>> struct.unpack(b'H', b'*\x00')

(42,)

(These results are for a little-endian machine; with no prefix, struct uses the native byte order.)

pack and unpack are true inverses.

Four-byte numbers use I and i; eight-byte numbers use Q and q.
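A small sketch to check the sizes and the round trip. The '<' prefix is used here so the sizes are the standard ones rather than whatever the platform's C compiler uses; the sample values are arbitrary.

```python
import struct

# A leading '<' selects standard sizes (and little-endian order),
# so the results do not depend on the platform's native C types.
for code in ('B', 'H', 'I', 'Q'):
    print(code, struct.calcsize('<' + code))  # 1, 2, 4 and 8 bytes

# Round trip: unpack() returns a tuple matching what pack() was given.
values = (65, 1000, 70000, 2**40)
packed = struct.pack('<BHIQ', *values)
restored = struct.unpack('<BHIQ', packed)

print(len(packed))         # 15
print(restored == values)  # True
```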

Question 3

Q

Endianness

Answer

A

Term for how the bytes of a value are ordered.

Big-endian: the byte that provides the largest part of the number gets stored first.
Little-endian: the byte that provides the smallest part of the number gets stored first.

'<' before the format specification ('H', 'h', 'B', or 'b') signifies little-endian.

'>' signifies big-endian.

Little-endian is typically used on modern systems.
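To see the two orders side by side, here is a short sketch that packs the same number both ways (the value 42 is just an example):

```python
import struct
import sys

value = 42

little = struct.pack('<H', value)  # least significant byte first
big = struct.pack('>H', value)     # most significant byte first

print(little)  # b'*\x00'
print(big)     # b'\x00*'

# Reading bytes with the wrong order gives a different number.
misread = struct.unpack('>H', little)
print(misread)  # (10752,)

# sys.byteorder reports the native order of the current machine.
print(sys.byteorder)
```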

Question 4

Q

Converting strings

Answer

A

Not terribly interesting.

Consider (in Python 3 the 's' code requires bytes, so the names are byte strings):

>>> first_name = b'Marty'

>>> last_name = b'Alchin'

>>> age = 28

>>> data = struct.pack(b'10s10sB', last_name, first_name, age)

>>> data

b'Alchin\x00\x00\x00\x00Marty\x00\x00\x00\x00\x00\x1c'

sorta handy, I guess.
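Going the other way, a sketch of reading such a record back. It assumes the names were packed as bytes (which the 's' code requires in Python 3) and that stripping the null padding is what you want; record_format is just a local name.

```python
import struct

record_format = '10s10sB'
data = struct.pack(record_format, b'Alchin', b'Marty', 28)

last_name, first_name, age = struct.unpack(record_format, data)

# The 's' fields come back padded with null bytes to their full width.
print(last_name)  # b'Alchin\x00\x00\x00\x00'

# Strip the padding, then decode to get ordinary strings.
last_name = last_name.rstrip(b'\x00').decode('ascii')
first_name = first_name.rstrip(b'\x00').decode('ascii')
print(first_name, last_name, age)  # Marty Alchin 28
```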

Question 5

Q

Text

Answer

A

History:

ASCII - American Standard Code for Information Interchange

128 characters, 95 of them printable.

It only covers 7 bits of each byte, but even another 128 values weren’t enough for the needs of languages other than English.

Unicode - the standard for strings in Python 3. The 'u' prefix is no longer needed in Python 3 (it was removed in 3.0 and accepted again from 3.3 on, purely for compatibility). The 'b' prefix signifies bytes.

Encoding - usually you don’t need the full range of Unicode, so there are encodings that map text to bytes. 'Python string'.encode('ascii') returns the same text as a bytes object in ASCII: one byte per character.

UTF-8 is the most common encoding.
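A minimal sketch of encoding and decoding; the sample strings are arbitrary:

```python
text = 'Python string'

ascii_bytes = text.encode('ascii')
utf8_bytes = text.encode('utf-8')

print(ascii_bytes)                # b'Python string'
print(ascii_bytes == utf8_bytes)  # True: pure ASCII text encodes identically

# decode() reverses encode().
print(ascii_bytes.decode('ascii') == text)  # True

# Characters outside ASCII cannot be encoded with the 'ascii' codec.
try:
    'café'.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)  # True
```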

Question 6

Q

UTF-8

Answer

A

Characters within a certain range take a single byte. Others take two bytes, three, or even four.

It is desirable because:

It can support any Unicode code point. This is not unique to UTF-8, but it is an improvement over ASCII.
The more common the character, the less space its code point takes up.
The single-byte range coincides precisely with ASCII, so it’s perfectly backward compatible with ASCII.
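The variable width is easy to check. This sketch encodes one sample character from each size class (the characters are picked only for illustration):

```python
samples = ['A', 'é', '€', '😀']

for char in samples:
    encoded = char.encode('utf-8')
    # Show the character, its code point, and how many bytes UTF-8 needs.
    print(repr(char), hex(ord(char)), len(encoded))

lengths = [len(char.encode('utf-8')) for char in samples]
print(lengths)  # [1, 2, 3, 4]

# Backward compatibility: ASCII text is byte-for-byte identical in UTF-8.
print('abc'.encode('utf-8') == 'abc'.encode('ascii'))  # True
```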

Question 7

Q

string formatting codes

Answer

A

%s = str

%r = repr

plenty more

Objects can be inserted by keyword as well:

def log(*args):
    for i, arg in enumerate(args):
        print('Argument %(i)s: %(arg)r' % {'i': i, 'arg': arg})

log('test', 'ing')

Argument 0: 'test'

Argument 1: 'ing'
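A few of the other codes, as a rough sketch (the names and values are made up for the example):

```python
name = 'spam'
count = 3
price = 2.5

str_vs_repr = '%s vs %r' % (name, name)
integer = '%d items' % count
padded = '%05.2f' % price  # width 5, zero-padded, 2 decimal places
by_keyword = '%(n)s costs %(p).2f' % {'n': name, 'p': price}

print(str_vs_repr)  # spam vs 'spam'
print(integer)      # 3 items
print(padded)       # 02.50
print(by_keyword)   # spam costs 2.50
```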

Question 8

Q

new formatting

Answer

A

New, more robust method:

>>> 'This is argument 0: {0}'.format('test')

'This is argument 0: test'

>>> 'This is argument key: {key}'.format(key='value')

'This is argument key: value'

Because it’s a method call rather than an operator, you can mix positional and keyword arguments, referencing them in any order.
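For instance, a sketch mixing both kinds of argument and reusing one of them (the template text is invented for the example):

```python
template = '{greeting}, {0}! You are visitor {1}. Goodbye, {0}.'

# Positional arguments fill {0} and {1}; the keyword fills {greeting}.
result = template.format('Marty', 42, greeting='Hello')

print(result)  # Hello, Marty! You are visitor 42. Goodbye, Marty.
```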

Question 9

Q

Looking up values within objects

Answer

A

An example shows it best:

>>> import datetime

>>> def format_time(time):
...     return '{0.minute} past {0.hour}'.format(time)

>>> format_time(datetime.time(8, 10))

'10 past 8'

>>> '{0[spam]}'.format({'spam': 'eggs'})

'eggs'

Question 10

Q

Distinguishing types of strings

Answer

A

Immediately after the object reference (index or keyword), add a '!' followed by 's' or 'r' to choose between str() and repr().

exact_match is a simple function from a previous exercise.

>>> validate_test = exact_match('test', 'Expected {1!r}, got {0!r}')

>>> validate_test('invalid')

Traceback…

ValueError: Expected 'test', got 'invalid'
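Since exact_match isn't shown on this card, here is a standalone sketch of the two conversions on their own (the sample values are arbitrary):

```python
import datetime

value = 'test'
as_str = '{0!s}'.format(value)   # uses str(value)
as_repr = '{0!r}'.format(value)  # uses repr(value)

print(as_str)   # test
print(as_repr)  # 'test'

# The difference is more visible on objects with distinct str() and repr().
day = datetime.date(2020, 1, 2)
both = '{0!s} / {0!r}'.format(day)
print(both)  # 2020-01-02 / datetime.date(2020, 1, 2)
```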

Question 11

Q

Standard format specification

Answer

A

After the field reference in string formatting, you can include a colon followed by a string that controls the formatting.

'{0:>20}{1}' translates to “index 0, right-aligned in a field 20 characters wide, then index 1.”

'{0:=^40}'.format(text)

translates to “take text and center it between equal signs, for a total of 40 characters.”

If the text is longer than 40 characters, format() leaves it at full length rather than truncating it. There would be no '=' on either side.
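A sketch that tries the specifications described above; the sample strings are placeholders:

```python
# Center ' Results ' between equal signs, 40 characters in total.
heading = '{0:=^40}'.format(' Results ')
print(heading)
print(len(heading))  # 40

# Right-align the first field in 20 characters, then append the second.
row = '{0:>20}{1}'.format('total:', ' 12')
print(repr(row))

# Text longer than the width is left untouched: no padding, no truncation.
long_text = 'x' * 50
unchanged = '{0:=^40}'.format(long_text)
print(unchanged == long_text)  # True
```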

Question 12

Q

custom format specification

Answer

A

pretty dumb,

format() isn’t in control of the formatting syntax described so far.

It instead delegates that control to __format__() defined on the object.

class Verb:
    def __init__(self, present, past=None):
        self.present = present
        self.past = past

    def __format__(self, tense):
        if tense == 'past':
            return self.past
        else:
            return self.present

>>> format = Verb('format', past='formatted')

>>> message = '{0:present} strings with {0:past} objects.'

>>> message.format(format)

'format strings with formatted objects.'

especially since ‘formatted’, here, is an adjective.