1.04 - Internal coding of texts Flashcards
Unicode is developed alongside the Universal Character Set (UCS) scheme, which is standardised as…
ISO/IEC 10646
What is the most widely used encoding of Unicode?
UTF-8
The 8 in UTF-8 refers to the 8-bit units (bytes) from which each code is built; a code is 1, 2, 3 or 4 bytes long
A 1-byte (Unicode) code reproduces the …-bit ASCII code
7
In 2-, 3- and 4-byte codes, every byte after the first (a continuation byte) has its two most significant bits set to 10
When a byte has its two most significant bits set to 11…
there will be at least one continuation byte following
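The prefix rules on these cards can be sketched in Python. This is only an illustration: the helper name `classify_byte` is hypothetical, and it does not reject the few byte values that start with 11 but never occur in valid UTF-8.

```python
def classify_byte(b: int) -> str:
    """Classify one UTF-8 byte by its most significant bits."""
    if b >> 7 == 0b0:
        return "single"        # 0xxxxxxx: one-byte code, same as 7-bit ASCII
    if b >> 6 == 0b10:
        return "continuation"  # 10xxxxxx: a second, third or fourth byte
    return "lead"              # 11xxxxxx: at least one continuation byte follows


# "é" (U+00E9) is stored as two bytes in UTF-8.
for byte in "é".encode("utf-8"):
    print(f"{byte:08b}", classify_byte(byte))
# 11000011 lead
# 10101001 continuation
```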
What is the long form of ASCII?
American Standard Code for Information Interchange
How many codes are available for a 7-bit code in ASCII?
2^7 = 128
Five
Important points to remember about the ASCII code
- A limited number of the codes represent non-printing or control characters
- The majority of the codes are for characters found in English text
- Includes upper and lower case letters, punctuation, etc.
- The codes for letters and numbers are in sequence
- Upper and lower case codes are different from each other
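These points can be checked with a short Python sketch. It relies on Python's own `str.isprintable()` as a stand-in for "non-printing", which is an assumption of this example rather than part of the ASCII definition.

```python
# A 7-bit code gives 2**7 distinct codes.
print(2 ** 7)                          # 128

# Letters and digits occupy consecutive codes.
print(ord("A"), ord("B"), ord("C"))    # 65 66 67
print(ord("0"), ord("1"), ord("2"))    # 48 49 50

# Upper and lower case letters have different codes, a fixed distance apart.
print(ord("a") - ord("A"))             # 32

# Codes 0 to 31 and code 127 are control (non-printing) codes.
control = [c for c in range(128) if not chr(c).isprintable()]
print(len(control))                    # 33
```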
How is the number of codes available determined in Unicode?
By the number of bits that are not predefined by the format
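As a rough illustration, the free bits can be counted from the standard UTF-8 byte patterns, where each x marks a bit not fixed by the format. The resulting counts are upper bounds, since some bit patterns are not valid codes.

```python
# UTF-8 byte patterns: fixed prefix bits are 0/1, free bits are marked x.
patterns = {
    1: "0xxxxxxx",
    2: "110xxxxx 10xxxxxx",
    3: "1110xxxx 10xxxxxx 10xxxxxx",
    4: "11110xxx 10xxxxxx 10xxxxxx 10xxxxxx",
}

# Count the free bits in each pattern.
free_bits = {n: p.count("x") for n, p in patterns.items()}

for n, bits in free_bits.items():
    print(f"{n} byte(s): {bits} free bits, at most {2 ** bits} codes")
# 1 byte(s): 7 free bits, at most 128 codes
# 2 byte(s): 11 free bits, at most 2048 codes
# 3 byte(s): 16 free bits, at most 65536 codes
# 4 byte(s): 21 free bits, at most 2097152 codes
```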
What is a character code referred to as in Unicode?
Code point
What is a character code?
A coding scheme that provides a unique binary code for each distinct individual component item of the text to be stored.
How is a code point identified in Unicode?
By U+ followed by a hexadecimal number of at least four digits (for example, U+0041)
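A small Python sketch of the notation; the function name `code_point_label` is hypothetical. Code points above U+FFFF need more than four hexadecimal digits.

```python
def code_point_label(ch: str) -> str:
    """Return the U+ label for a single character, padded to four hex digits."""
    return f"U+{ord(ch):04X}"


print(code_point_label("A"))   # U+0041
print(code_point_label("é"))   # U+00E9
print(code_point_label("😀"))  # U+1F600
```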
What do the code points U+0000 to U+00FF define?
(Unicode)
Characters that duplicate those in the standard Latin-1 scheme
What do the binary codes corresponding to U+0000 to U+007F use?
(Unicode)
One byte only
Ranges from 00000000 to 01111111
What do the binary codes for U+0080 to U+00FF require?
(Unicode)
Two bytes
Ranges from 11000010 for the first byte followed by 10000000 for the second byte (U+0080) through to 11000011 followed by 10111111 (U+00FF)
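The two-byte layout can be worked through in Python. The helper `encode_two_byte` is a hypothetical name for this sketch; it splits the code point into the top 5 bits and the low 6 bits and checks the result against Python's built-in UTF-8 encoder.

```python
def encode_two_byte(cp: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as 110xxxxx 10xxxxxx."""
    if not 0x80 <= cp <= 0x7FF:
        raise ValueError("code point does not use the two-byte form")
    first = 0b11000000 | (cp >> 6)          # prefix 110 + top 5 bits
    second = 0b10000000 | (cp & 0b111111)   # prefix 10 + low 6 bits
    return bytes([first, second])


for cp in (0x80, 0xFF):
    encoded = encode_two_byte(cp)
    # Cross-check against the built-in encoder.
    assert encoded == chr(cp).encode("utf-8")
    print(f"U+{cp:04X}", " ".join(f"{b:08b}" for b in encoded))
# U+0080 11000010 10000000
# U+00FF 11000011 10111111
```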