Character Encoding Flashcards

Question 1

Q

Unicode

Answer

A

Question 2

Q

Text documents

Answer

A

represented as a series of numbers
simplest form of encoding is fixed-width, i.e. a fixed number of digits (bits) represents the code point of each character
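
A minimal sketch of the fixed-width idea above, assuming 8 bits per character (enough only for small code points; the width and the sample string are illustrative choices, not part of the card):

```python
# Text as a series of numbers: each character maps to a code point.
text = "Hi!"
code_points = [ord(ch) for ch in text]

# Fixed-width encoding: every code point is written with the same
# number of binary digits (8 here, an assumed width for illustration).
fixed_width = " ".join(f"{cp:08b}" for cp in code_points)

# Decoding simply reads the digits back in equal-sized groups.
decoded = "".join(chr(int(bits, 2)) for bits in fixed_width.split())

print(code_points)
print(fixed_width)
print(decoded)
```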

Question 3

Q

ASCII

Answer

A

Question 4

Q

UTF-32

Answer

A

- can encode all Unicode characters, but bloated: every code point takes 4 bytes, even plain ASCII letters
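
To make "bloated" concrete, the sketch below compares byte counts for the same string in UTF-32 and UTF-8. The sample string is an arbitrary choice, and the big-endian codec without a byte order mark is used only so the counts are easy to read:

```python
# "h\u00e9llo" is "hello" with an e-acute as its second character.
text = "h\u00e9llo"

utf32 = text.encode("utf-32-be")  # fixed 4 bytes per code point, no BOM
utf8 = text.encode("utf-8")       # 1 byte for ASCII, 2 for the e-acute

print(len(text), len(utf32), len(utf8))

# Even a character outside the Basic Multilingual Plane fits in one
# 4-byte UTF-32 unit.
emoji = "\U0001F600"
print(len(emoji.encode("utf-32-be")))
```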

Question 5

Q

ISO-8859

Answer

A

single-byte encoding built on top of ASCII that adds an extra 128 characters
can represent small orthographies such as Thai, but cannot support large ones, e.g. Japanese
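
A small sketch of the limits described above, using Python's built-in iso8859_11 codec (the Thai part of ISO-8859). The sample characters are arbitrary picks for illustration:

```python
thai_char = "\u0e01"      # THAI CHARACTER KO KAI
japanese_char = "\u65e5"  # a common kanji

# ASCII stays unchanged, and a Thai letter fits in a single byte
# taken from the upper 128 values.
ascii_bytes = "A".encode("iso8859_11")
thai_bytes = thai_char.encode("iso8859_11")

# A Japanese character has no slot in the 256-value table.
try:
    japanese_char.encode("iso8859_11")
    japanese_supported = True
except UnicodeEncodeError:
    japanese_supported = False

print(ascii_bytes, thai_bytes, japanese_supported)
```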

Question 6

Q

Variable-width encoding

Answer

A

variable number of bytes per character
encodes each code point using a variable number of fixed-size code units
e.g. UTF-8 (8-bit code units), UTF-16 (16-bit code units)
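
The sketch below counts how many code units a single character needs in UTF-8 and UTF-16. The helper name code_units is hypothetical, introduced only for this illustration:

```python
def code_units(ch: str) -> tuple[int, int]:
    """Return (UTF-8 units, UTF-16 units) needed for one character."""
    utf8_units = len(ch.encode("utf-8"))            # units are 1 byte each
    utf16_units = len(ch.encode("utf-16-be")) // 2  # units are 2 bytes each
    return utf8_units, utf16_units

# The unit size is fixed per encoding; the unit count varies per character.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", code_units(ch))
```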

Question 7

Q

UTF-8

Answer

A

Question 8

Q

Declaring character encoding

Answer

A

manually: specify the character encoding in the document, e.g. charset=ISO-8859-8
automatically: the encoding is detected based on compatibility, user preferences, or a statistical model
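
A rough sketch of both approaches, under stated assumptions: the function name decode_document and the candidate list are hypothetical, and trying candidate encodings in order is only a crude stand-in for real detection, not a statistical model:

```python
import re

def decode_document(raw: bytes, candidates=("utf-8", "iso-8859-1")) -> str:
    # Manual declaration: look for a charset=... label in the bytes.
    match = re.search(rb"charset\s*=\s*[\"']?([\w-]+)", raw)
    if match:
        declared = match.group(1).decode("ascii")
        try:
            return raw.decode(declared)
        except (LookupError, UnicodeDecodeError):
            pass  # unknown or wrong label, fall through to guessing

    # Automatic fallback: try each candidate encoding in preference order.
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return raw.decode("utf-8", errors="replace")

# A document that declares ISO-8859-8 and contains one Hebrew letter.
declared_doc = b'<meta charset="iso-8859-8">' + "\u05e9".encode("iso-8859-8")
print(decode_document(declared_doc))
```

With no declaration present, the function simply returns the first candidate that decodes without error, which is why the order of the candidates matters.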

(8 cards)