Chapters 20-end Flashcards
What is a bit?
A bit is a bi(nary digi)t. In computing terms it represents a 1 or a 0.
What is a byte?
By convention a byte is a sequence of 8 bits. Its values range from 0 (binary 00000000) to 255 (binary 11111111).
How can a byte be used to represent values?
A byte has 256 possible values (0-255). By agreeing that each value stands for a particular letter, character or symbol, a byte can be used to encode that character. Sequences of bytes can then be used to represent information (e.g. words).
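A minimal sketch of this idea using Python's built-in bytes type. The variable names are illustrative only, and ASCII is assumed as the agreed mapping:

```python
# Each byte holds a value from 0 to 255.
raw = bytes([72, 105])        # two bytes with the values 72 and 105

# Interpreting each value through the ASCII table turns the bytes into text.
text = raw.decode("ascii")

print(list(raw))   # [72, 105]
print(text)        # Hi
```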
What is ASCII?
ASCII is the American Standard Code for Information Interchange. It is a 7-bit encoding (codec) and one of the first to be widely adopted. Its 128 values (0-127) cover the English letters, digits, punctuation and control characters. Accented letters used in other Western European languages are not part of ASCII; they were added later by 8-bit extensions such as Latin-1.
What is Unicode and why do we need it?
ASCII can only handle English with its 128 characters, and its 8-bit extensions add only a few other European languages each.
Unicode is a universal character set that assigns a unique number (a code point) to the characters of all written human languages. Its code space runs from 0 to 0x10FFFF, giving 1,114,112 possible code points.
Unicode code points are turned into bytes by a UTF (Unicode Transformation Format) encoding, with UTF-8, UTF-16 and UTF-32 available. UTF-8 uses between 1 and 4 bytes per character.
What is UTF-8?
UTF-8 stands for Unicode Transformation Format 8 bit. UTF-8 is a variable-width encoding that can represent every Unicode character, using between 1 and 4 bytes per character. UTF-8 is backwards compatible with ASCII: the 128 ASCII characters are encoded as the same single bytes.
UTF-8 is the most commonly used codec, used by over 95% of websites
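A short sketch of the variable width of UTF-8 and its ASCII compatibility. The sample characters are arbitrary choices for illustration:

```python
# Characters from different parts of Unicode need different numbers of bytes.
samples = ["a", "\u00e9", "文", "😀"]   # "\u00e9" is the letter e with an acute accent

lengths = [len(ch.encode("utf-8")) for ch in samples]
print(lengths)   # [1, 2, 3, 4]

# Plain ASCII text produces identical bytes under both encodings.
ascii_compatible = "hello".encode("utf-8") == "hello".encode("ascii")
print(ascii_compatible)   # True
```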
In Python how do you convert a decimal value to binary?
Use bin()
z = bin(125)
populates z with the string '0b1111101'
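As a quick sketch, the conversion can also be reversed with int(), and format() gives a zero-padded version without the 0b prefix:

```python
z = bin(125)
print(z)                    # 0b1111101

# int() with base 2 converts the binary string back to a number.
back = int(z, 2)
print(back)                 # 125

# format() with "08b" pads to 8 digits, showing the value as a full byte.
padded = format(125, "08b")
print(padded)               # 01111101
```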
What does ord() do in Python?
e.g.
print(ord('h'))
ord() returns the Unicode code point (an integer) that represents the specified character
e.g.
print(ord('h')) prints 104 (the ASCII / Unicode value for 'h')
What does chr() do in Python?
e.g.
print(chr(97))
chr() returns the character represented by the specified Unicode code point
e.g.
print(chr(97)) prints a, the Unicode character mapped to the value 97
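A small sketch showing that ord() and chr() are inverses of each other, and that ord() also works beyond the ASCII range:

```python
code = ord("h")
print(code)                  # 104

# chr() reverses ord(), so the round trip gives back the original character.
char = chr(code)
print(char)                  # h

# ord() takes one character at a time; loop over a string for several.
codes = [ord(c) for c in "hi"]
print(codes)                 # [104, 105]

# Characters outside ASCII simply have larger code points.
big = ord("文")
print(big, hex(big))         # 25991 0x6587
```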
Describe how the internal representation of a string is handled in Python
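In Python 3 a string (str) is a sequence of Unicode characters (code points) for displaying or working with text. It is not a sequence of UTF-8 bytes; bytes only appear when the string is encoded.
Text entered from the keyboard or read from files opened in text mode is decoded from bytes into a str. The default encoding depends on the platform and is often UTF-8.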
What does DBES mean?
Decode bytes and encode strings.
If you have a string and want the underlying bytes, encode it to get a bytes object e.g. b'\xe6\x96\x87\xe8\xa8\x80'. If you have bytes and want the text, decode them to get a string.
my_string = '文言'
print('original string is ', my_string)
raw_bytes = my_string.encode()
print('encoded raw bytes is ', raw_bytes)  # outputs b'\xe6\x96\x87\xe8\xa8\x80'
decoded_bytes = raw_bytes.decode()
print('decoded raw bytes is ', decoded_bytes)  # outputs '文言'
Most operations on a string work without any conversion, but sometimes Python raises a UnicodeEncodeError or UnicodeDecodeError. In that case check which direction you are going: use .encode() on a string to get the bytes you need, and .decode() on bytes to get a string.
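A sketch of the two error cases, assuming the text is pushed through the ASCII codec, which cannot represent these characters. The variable names are illustrative only:

```python
my_string = "文言"
utf8_bytes = my_string.encode("utf-8")   # str -> bytes works with UTF-8

# Encoding a string with a codec that lacks the characters fails.
try:
    my_string.encode("ascii")
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = type(exc).__name__

# Decoding bytes with the wrong codec fails in the other direction.
try:
    utf8_bytes.decode("ascii")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = type(exc).__name__

print(encode_error)   # UnicodeEncodeError
print(decode_error)   # UnicodeDecodeError
```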
How do you create Python documentation for your own functions and classes? e.g. so that when someone types help(my_function) at the Python command line they will see your help documentation
Add a docstring: a string literal enclosed in triple quotes (""") placed immediately after the function definition line
def break_words(stuff):
    """This function will break up words for us"""
    words = stuff.split(' ')
    # words breaks the sentence up and stores individual words in a list
    # the sentence is broken into str elements whenever a ' ' occurs
    # other delimiters could be used for other effects
    return words
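A brief sketch of where the docstring ends up. Python stores it on the function's __doc__ attribute, which is the text help() displays. The docstring wording here is only an example:

```python
def break_words(stuff):
    """Split a sentence on spaces and return the words as a list."""
    return stuff.split(' ')

# The docstring is stored on the function object itself.
doc = break_words.__doc__
print(doc)   # Split a sentence on spaces and return the words as a list.

# In an interactive session, help(break_words) shows the same text.
```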
How do you access functions in an imported module e.g. call a function in mymodule?
import mymodule
requires each call to be prefixed with the module name:
mymodule.my_function()
Alternatively,
from mymodule import *
lets you call the function directly without the module name prefix:
my_function()
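Since mymodule is only a placeholder name, here is a runnable sketch of both styles using the standard library's math module instead:

```python
# Style 1: import the module, then prefix each call with its name.
import math
print(math.sqrt(16))   # 4.0

# Style 2: import a name directly, then call it without the prefix.
from math import sqrt
print(sqrt(16))        # 4.0

# Both names refer to the same function object.
same_function = sqrt is math.sqrt
print(same_function)   # True
```

Importing specific names, as in style 2, is usually safer than import *, because a star import can silently overwrite names you have already defined.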
If stuff is a sentence (str) what does the following code do?
words = stuff.split(' ')
stuff is split at each occurrence of a space (' ') and a new list (words) is created to hold the pieces. The spaces themselves are discarded.
Works with other delimiters too.
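A short sketch of split() with a different delimiter, plus one detail to be aware of when a string contains repeated spaces. The sample strings are made up for illustration:

```python
# Any string can act as the delimiter, for example a comma.
csv_line = "red,green,blue"
colours = csv_line.split(",")
print(colours)        # ['red', 'green', 'blue']

# With an explicit ' ' delimiter, two spaces in a row yield an empty string.
spaced = "a  b"
explicit = spaced.split(" ")
print(explicit)       # ['a', '', 'b']

# With no argument, split() treats any run of whitespace as one separator.
default = spaced.split()
print(default)        # ['a', 'b']
```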
If stuff is a sentence (str) what does the following code do?
words = stuff.split(' ')
sorted_words = sorted(words)
words is a list containing the individual words in the sentence.
The built-in function sorted() returns a new sorted list of the words and leaves words unchanged
e.g.
stuff = "All good things come to those who wait."
words => ['All', 'good', 'things', 'come', 'to', 'those', 'who', 'wait.']
sorted_words => ['All', 'come', 'good', 'things', 'those', 'to', 'wait.', 'who']
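One thing worth knowing about sorted() on strings: it compares code points, so uppercase letters sort before lowercase ones. A sketch with a made-up sentence, using key=str.lower to sort without regard to case:

```python
words = "the Quick brown Fox".split(" ")

# The default sort puts capitalised words first.
default_sorted = sorted(words)
print(default_sorted)     # ['Fox', 'Quick', 'brown', 'the']

# key=str.lower compares the lowercase form of each word instead.
caseless_sorted = sorted(words, key=str.lower)
print(caseless_sorted)    # ['brown', 'Fox', 'Quick', 'the']

# sorted() returns a new list; the original order is untouched.
print(words)              # ['the', 'Quick', 'brown', 'Fox']
```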