strings Flashcards

1
Q

chr() and ord()

A

ord(b'A') returns 65.

chr(65) returns 'A', a standard string. It doesn’t return b'A'.

A standard string (the output of chr()) doesn’t work exactly like a byte string.

For converting numbers to and from byte strings, there is the struct module.
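
As a quick check of the point above, this small sketch round-trips a value through ord() and chr() and compares the result with a byte string. The bytes([65]) call is not mentioned on the card; it is included only as one standard way to build b'A' from a number.

```python
code_point = ord('A')      # ord() on a one-character str
from_bytes = ord(b'A')     # ord() also accepts a length-1 bytes object
text = chr(65)             # chr() always returns a str
raw = bytes([65])          # one way to get a byte string from a number

# A str and a bytes object never compare equal, even for the "same" letter.
same = (text == raw)
```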

2
Q

struct module

A

Performs more complex conversions between Python values and byte strings.

chr() and ord() only handle one value at a time, and a single byte limits values to the range 0 to 255.

struct.pack() writes values out as byte strings.
struct.unpack() reads those values back into Python.

>>> import struct

>>> struct.pack(b'B', 65)
b'A'

>>> struct.pack(b'B', 33)
b'!'

>>> struct.pack(b'BBBBBBB', 69, 120, 97, 109, etc…)
b'Example'

If the input is signed, there are still 256 values, but they range from -128 to 127.

>>> struct.pack(b'bb', 65, -23)
b'A\xe9'

A lowercase format code assumes a signed value.

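For two-byte numbers, use H (unsigned) and h (signed).

There are 65,536 possible values.

unpack

>>> struct.unpack(b'H', b'\x00*')
(10752,)

>>> struct.unpack(b'H', b'*\x00')
(42,)

These results come from a little-endian machine; without a prefix, struct uses the machine’s native byte order.

pack and unpack are true inverses.

Four-byte numbers use I and i; eight-byte numbers use Q and q.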
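
The sketch below pulls the pack/unpack points on this card together. It is only an illustration: the '7B' repeat count and the '<' prefix (used so that calcsize reports the standard sizes regardless of platform) are choices made here, not something the card prescribes.

```python
import struct

# Pack the seven byte values of b'Example' as unsigned bytes ('7B' == 'BBBBBBB').
word = struct.pack('7B', *b'Example')

# Signed bytes: -23 is stored as 0xe9 (256 - 23 = 233).
signed = struct.pack('bb', 65, -23)

# Unpacking with the same format returns the original values as a tuple.
roundtrip = struct.unpack('bb', signed)

# calcsize reports how many bytes each format code occupies.
sizes = {code: struct.calcsize('<' + code) for code in 'BHIQ'}
```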

3
Q

Endianness

A

Term for how the bytes of a multi-byte value are ordered.

  1. Big endian: the byte that provides the largest part of the number gets stored first.
  2. Little endian: the byte that provides the smallest part of the number gets stored first.

'<' before the format specification ('H', 'h', 'B', or 'b') signifies little endian.

'>' is the opposite: big endian.

Little endian is typically used on modern systems.
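
To make the two byte orders concrete, here is a short sketch that packs the same number both ways. The int.to_bytes call at the end is not from the card; it is shown only because it exposes the same choice by name.

```python
import struct

little = struct.pack('<H', 42)   # least significant byte first
big = struct.pack('>H', 42)      # most significant byte first

# Reading little-endian bytes as big-endian gives a different number.
misread = struct.unpack('>H', little)[0]

# The built-in int type offers the same choice with a named argument.
also_little = (42).to_bytes(2, 'little')
```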

4
Q

Converting strings

A

Not terribly interesting.

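consider:

>>> first_name = b'Marty'
>>> last_name = b'Alchin'
>>> age = 28
>>> data = struct.pack(b'10s10sB', last_name, first_name, age)
>>> data
b'Alchin\x00\x00\x00\x00Marty\x00\x00\x00\x00\x00\x1c'

The 's' code needs byte strings in Python 3, so the names are bytes literals here. Each name is padded with null bytes to fill its 10-byte field.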

sorta handy, I guess.
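
Going the other way is a matter of unpacking with the same format and cleaning up the padding. This is a sketch under the assumption that the names are plain ASCII and never contain null bytes themselves; the variable names are illustrative.

```python
import struct

record_format = '10s10sB'   # two 10-byte fields followed by one unsigned byte
data = struct.pack(record_format, 'Alchin'.encode('ascii'),
                   'Marty'.encode('ascii'), 28)

last_raw, first_raw, age = struct.unpack(record_format, data)

# Strip the null padding and decode the bytes back into str.
last_name = last_raw.rstrip(b'\x00').decode('ascii')
first_name = first_raw.rstrip(b'\x00').decode('ascii')
```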

5
Q

Text

A

History:

ASCII - American Standard Code for Information Interchange

128 characters, 95 of them printable

It only used 7 bits of each byte, but even the other 128 values weren’t enough for languages other than English.

Unicode - the standard for strings in Python 3. The 'u' prefix is no longer needed (Python 3.3 and later accept it again, but it changes nothing). The 'b' prefix signifies bytes.

Encoding - you usually don’t need the full range of Unicode, so there are encodings that map text to bytes. 'Python string'.encode('ascii') returns the same text as a bytes object in ASCII, one byte per character.

UTF-8 is the most common encoding.
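
A brief sketch of encoding and decoding, including what happens when a character falls outside ASCII. The sample word is an arbitrary choice made for illustration.

```python
text = 'Python string'
ascii_bytes = text.encode('ascii')   # str -> bytes, one byte per character
back = ascii_bytes.decode('ascii')   # bytes -> str

# A character outside the ASCII range cannot be encoded as ASCII.
word = 'caf\u00e9'                   # 'cafe' with an accented final letter
try:
    word.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# UTF-8 can encode it; the accented letter takes two bytes.
utf8_bytes = word.encode('utf-8')
```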

6
Q

UTF-8

A

Characters within a certain range take a single byte. Others take two, three, or even four bytes.

It is desirable because:

  1. It can support any Unicode code point. This is not unique to UTF-8, but it is an improvement over ASCII.
  2. The more common the character, the less space its code point takes up.
  3. The single-byte range coincides exactly with ASCII, so any ASCII text is already valid UTF-8.
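
The variable width and the ASCII compatibility listed above can be checked directly. The four sample characters are chosen here only because they land in the one-, two-, three- and four-byte ranges.

```python
# 'A', e-acute, the euro sign, and an emoji, written as escapes.
samples = ['A', '\u00e9', '\u20ac', '\U0001f600']

# Number of bytes each character needs in UTF-8.
lengths = [len(ch.encode('utf-8')) for ch in samples]

# Pure ASCII text encodes to identical bytes under both encodings.
ascii_same = 'plain text'.encode('ascii') == 'plain text'.encode('utf-8')
```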
7
Q

string formatting codes

A

%s = str

%r = repr

There are plenty more.

Objects can be inserted by keyword as well:

def log(*args):
    for i, arg in enumerate(args):
        print('Argument %(i)s: %(arg)r' % {'i': i, 'arg': arg})

log('test', 'ing')

Argument 0: 'test'
Argument 1: 'ing'
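
For comparison, here is a small sketch of the difference between %s and %r, and of positional versus keyword insertion. The values are made up for illustration.

```python
value = 'ing'
as_str = '%s' % value     # uses str(): no quotes
as_repr = '%r' % value    # uses repr(): quotes included

# Positional values come from a tuple, keyword values from a dict.
positional = '%s has %d items' % ('cart', 3)
keyword = '%(name)s has %(count)d items' % {'name': 'cart', 'count': 3}
```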

8
Q

new formatting

A

New, more robust method:

>>> 'This is argument 0: {0}'.format('test')
'This is argument 0: test'

>>> 'This is argument key: {key}'.format(key='value')
'This is argument key: value'

Because it’s a method call rather than an operator, you can mix positional and keyword arguments and reference them in any order.
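
A short sketch of mixing the two kinds of arguments, with illustrative values. Note that the positional fields are referenced out of order and one is used more than once.

```python
template = '{greeting}, {1} and {0}!'
message = template.format('Bob', 'Alice', greeting='Hello')

# The same argument can be referenced as many times as needed.
repeated = '{0}-{0}-{name}'.format('a', name='b')
```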

9
Q

Looking up values within objects

A

An example shows it best:

>>> import datetime

>>> def format_time(time):
...     return '{0.minute} past {0.hour}'.format(time)
...
>>> format_time(datetime.time(8, 10))
'10 past 8'

>>> '{0[spam]}'.format({'spam': 'eggs'})
'eggs'
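
Attribute and index lookups can also be chained and used with keyword arguments. The data in this sketch is invented for illustration.

```python
import datetime

# Index into a list, then read an attribute of the element.
dates = [datetime.date(2020, 1, 2)]
year = '{0[0].year}'.format(dates)

# Dictionary keys are written without quotes inside the brackets.
user = {'name': 'Marty', 'age': 28}
summary = '{user[name]} is {user[age]}'.format(user=user)
```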

10
Q

Distinguishing types of strings

A

Immediately after the object reference (index or keyword), add a '!' and then 's' or 'r', depending on whether you want str() or repr().

exact_match is a simple function from a previous exercise.

>>> validate_test = exact_match('test', 'Expected {1!r}, got {0!r}')

>>> validate_test('invalid')
Traceback…
ValueError: Expected 'test', got 'invalid'
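
Since exact_match itself is not shown on this card, the sketch below demonstrates the two conversion flags on their own, using a date object because its str() and repr() differ visibly.

```python
import datetime

day = datetime.date(2020, 1, 2)
as_str = '{0!s}'.format(day)     # str(day)
as_repr = '{0!r}'.format(day)    # repr(day)

# With a plain string, !r is what adds the quotes seen in the error message.
quoted = 'got {0!r}'.format('invalid')
```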

11
Q

Standard format specification

A

After the field reference in string formatting, you can include a colon followed by a string that controls the formatting.

'{0:>20}{1}' translates to “right-align index 0 in a field 20 characters wide, then index 1.”

'{0:=^40}'.format(text)

translates to “take text and center it between equal signs, for a total of 40 characters.”

If the text is longer than 40 characters, format() leaves it at full length rather than truncating it. There would be no '=' on either side.
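
The alignment rules above, including the no-truncation case, can be sketched as follows. The sample strings are arbitrary.

```python
# Right-align 'name' in 20 characters, then append the second argument.
right = '{0:>20}{1}'.format('name', '!')

# Center the text in 40 characters, filling with '='.
centered = '{0:=^40}'.format(' text ')

# Text longer than the width is returned unchanged, with no fill characters.
long_text = 'x' * 50
untouched = '{0:=^40}'.format(long_text)
```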

12
Q

custom format specification

A

pretty dumb,

format() isn’t in control of the formatting syntax described so far.

It instead delegates that control to __format__() defined on the object.

class Verb:
    def __init__(self, present, past=None):
        self.present = present
        self.past = past

    def __format__(self, tense):
        # The text after the colon arrives here as the format specification.
        if tense == 'past':
            return self.past
        else:
            return self.present

>>> format = Verb('format', past='formatted')

>>> message = '{0:present} strings with {0:past} objects.'

>>> message.format(format)
'format strings with formatted objects.'

especially since ‘formatted’, here, is an adjective.
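
The same delegation applies to the built-in format() function and to f-strings, not only to str.format(). The Temperature class below is hypothetical and exists only to show that; its 'f' specification is an invented convention, not a standard one.

```python
class Temperature:
    """Hypothetical example: spec 'f' means Fahrenheit, anything else Celsius."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __format__(self, spec):
        if spec == 'f':
            return '{0:.1f}F'.format(self.celsius * 9 / 5 + 32)
        return '{0:.1f}C'.format(self.celsius)


boiling = Temperature(100)
via_method = '{0:f} is {0}'.format(boiling)   # an empty spec falls back to Celsius
via_builtin = format(boiling, 'f')
via_fstring = f'{boiling:f}'
```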
