Data Representation Flashcards
Number systems
Natural numbers = N, all the positive whole numbers and zero
Integers = Z, all the whole numbers, both positive and negative
Contains all the natural numbers
Rational numbers = Q, all the numbers that can be represented as x/y, where both x and y are integers and y is not zero
Contain all the integers
Irrational numbers = Real numbers that cannot be represented as x/y, where x and y are integers
Examples include π and √2
Real numbers = R, all possible real-world quantities, with or without a fractional part
Contains all rational and irrational numbers
Ordinal numbers = Numbers that indicate a position in a list
Binary and decimal units
1 byte = 8 bits
Binary units:
- Kibibyte = 2^10 bytes
- Mebibyte = 2^20 bytes
- Gibibyte = 2^30 bytes
- Tebibyte = 2^40 bytes
Decimal units:
- Kilobyte = 10^3 bytes
- Megabyte = 10^6 bytes
- Gigabyte = 10^9 bytes
- Terabyte = 10^12 bytes
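A minimal sketch of how the binary and decimal units differ in practice. The helper name `to_units` and the unit abbreviations (KiB, kB, etc.) are assumptions made for illustration:

```python
# Binary prefixes are powers of 2; decimal prefixes are powers of 10.
BINARY_UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
DECIMAL_UNITS = {"kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def to_units(num_bytes: int, unit: str) -> float:
    """Convert a byte count to the named unit (hypothetical helper)."""
    size = {**BINARY_UNITS, **DECIMAL_UNITS}[unit]
    return num_bytes / size


# 10^12 bytes is exactly 1 terabyte but less than 1 tebibyte.
one_terabyte = 10**12
print(to_units(one_terabyte, "TB"))   # 1.0
print(to_units(one_terabyte, "TiB"))  # roughly 0.909
```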
Unsigned and Signed binary
Unsigned binary can only represent zero and positive numbers, while signed binary can represent both positive and negative numbers
With n bits, unsigned binary can represent the values 0 to 2^n - 1
(e.g. 8 bits can represent the numbers 0 to 255 because 2^8 - 1 = 255)
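The unsigned range can be checked with a short sketch. The function name `unsigned_range` is an assumption used only for this example:

```python
def unsigned_range(n_bits: int) -> tuple[int, int]:
    """Return the (smallest, largest) value that n_bits of unsigned binary can hold."""
    return 0, 2**n_bits - 1


print(unsigned_range(8))   # (0, 255)
print(unsigned_range(16))  # (0, 65535)

# The largest 8-bit pattern is all 1s.
print(int("11111111", 2))  # 255
```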
Signed binary uses two’s complement to represent both positive and negative numbers - in two’s complement, the most significant bit is given a negative place value
Signed binary with two’s complement can represent 2^(n-1) - 1 to -2^(n-1), where n is the number of bits
(e.g. 8 bits can represent the numbers 127 to -128;
2^(8-1) - 1 is 127 and -2^(8-1) is -128)
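A small sketch of two’s complement, assuming the bits are given as a string with the most significant bit first. The function names are illustrative assumptions. With n bits the range runs from -2^(n-1) up to 2^(n-1) - 1:

```python
def signed_range(n_bits: int) -> tuple[int, int]:
    """Return the (most negative, most positive) two's complement value for n_bits."""
    return -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1


def from_twos_complement(bits: str) -> int:
    """Decode a bit string by giving the most significant bit a negative place value."""
    n = len(bits)
    msb = int(bits[0])
    rest = int(bits[1:], 2) if n > 1 else 0
    return -msb * 2 ** (n - 1) + rest


print(signed_range(8))                   # (-128, 127)
print(from_twos_complement("01111111"))  # 127
print(from_twos_complement("10000000"))  # -128
print(from_twos_complement("11111111"))  # -1
```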
Numbers with a fractional part
There are 2 ways to represent numbers with a fractional part in binary: fixed point and floating point
Fixed point = A specified number of bits are placed before and after the binary point (e.g. for 8 bits, 4 are placed before and 4 are placed after)
Floating point = Comparable to scientific notation, where 3,100,000 would be represented as 3.1*10^6: 3.1 is the mantissa (the significant digits of the number) and 6 is the exponent (the power that indicates the size of the number and where the point should be positioned). In binary floating point, the exponent is a power of 2 rather than a power of 10
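Both representations can be sketched briefly. The helper `fixed_point_to_decimal` is a hypothetical name, and it assumes an unsigned bit pattern. `math.frexp` is a standard library function that splits a float into a mantissa and a power-of-2 exponent:

```python
import math


def fixed_point_to_decimal(bits: str, frac_bits: int) -> float:
    """Interpret an unsigned bit string with frac_bits bits after the binary point."""
    return int(bits, 2) / 2**frac_bits


# 8 bits split 4 before and 4 after the point: 0110.1100 = 4 + 2 + 0.5 + 0.25
print(fixed_point_to_decimal("01101100", 4))  # 6.75

# Floating point view of the same value: 6.75 == mantissa * 2**exponent
mantissa, exponent = math.frexp(6.75)
print(mantissa, exponent)  # 0.84375 3
```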
Rounding errors
Some decimal numbers (e.g. 0.1) cannot be represented exactly with a finite number of binary digits, whether fixed point or floating point notation is used.
This means that both fixed and floating point representations can be inaccurate
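A quick illustration using Python floats, which are assumed here to be standard 64-bit binary floating point. `Decimal(0.1)` shows the exact value that is actually stored:

```python
from decimal import Decimal
from fractions import Fraction

# The stored value of 0.1 is slightly larger than one tenth.
stored = Decimal(0.1)
print(stored > Decimal("0.1"))  # True

# Small rounding errors show up in ordinary arithmetic.
sum_matches = (0.1 + 0.2 == 0.3)
print(sum_matches)  # False

# 0.5 is a power of 2 (2^-1), so it is stored exactly.
half_is_exact = Fraction(0.5) == Fraction(1, 2)
tenth_is_exact = Fraction(0.1) == Fraction(1, 10)
print(half_is_exact, tenth_is_exact)  # True False
```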
Absolute and relative errors
Absolute error = The actual amount by which a value is inaccurate, found by taking the magnitude of (Real value - Value stored as binary)
Relative error = The size of the absolute error compared with the real value, which shows how significant the error is for a value of that size.
Found by (Absolute error / Real value)
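A worked sketch of both calculations. The function names are assumptions, and taking the magnitude of the difference is a choice made here so that the error is always reported as a positive amount. The example stores 0.1 in fixed point with 4 bits after the binary point, where the nearest available value is 0.0010 in binary (0.125):

```python
def absolute_error(real_value: float, stored_value: float) -> float:
    """Size of the difference between the real value and the stored value."""
    return abs(real_value - stored_value)


def relative_error(real_value: float, stored_value: float) -> float:
    """Absolute error as a proportion of the real value."""
    return absolute_error(real_value, stored_value) / abs(real_value)


real, stored = 0.1, 0.125
print(absolute_error(real, stored))  # about 0.025
print(relative_error(real, stored))  # about 0.25, i.e. 25%
```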
Fixed point vs Floating point
Precision = the maximum number of significant digits that can be represented
Fixed point = A binary point close to the left of a number gives good precision but only a small range of numbers.
However, a binary point close to the right increases the range while decreasing precision.
Floating point = Allows for the representation of a greater range of numbers with a given number of bits than fixed point.
- This is because floating point can take advantage of an exponent which can be either positive or negative.
- The number of bits allocated to each part of a floating point number affects the numbers that can be represented.
- A large exponent and a small mantissa allow for a large range but little precision.
- A small exponent and a large mantissa allow for good precision but only a small range.
Normalisation
Floating point numbers are normalised in order to provide the maximum level of precision for a given number of bits.
Normalisation involves ensuring that the mantissa of a floating point number starts with 0.1 (for a positive number) or 1.0 (for a negative number).
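A minimal sketch of normalising a positive number, assuming the mantissa is given as the bits after the leading "0." and the exponent as an ordinary integer. The function name and this string format are illustrative assumptions. Each left shift of the mantissa is balanced by subtracting 1 from the exponent, so the value is unchanged:

```python
def normalise_positive(mantissa_bits: str, exponent: int) -> tuple[str, int]:
    """Shift a positive mantissa left until it starts with 0.1.

    mantissa_bits holds the bits after '0.', so '0011' means 0.0011 in binary.
    """
    if "1" not in mantissa_bits:
        raise ValueError("zero cannot be normalised")
    while mantissa_bits[0] == "0":
        mantissa_bits = mantissa_bits[1:] + "0"  # shift left by one place
        exponent -= 1                            # compensate in the exponent
    return mantissa_bits, exponent


# 0.0011 x 2^5 becomes 0.1100 x 2^3 (both equal 6 in decimal)
print(normalise_positive("0011", 5))  # ('1100', 3)
```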
Cancellation errors
These cause a loss of accuracy during the addition or subtraction of numbers of widely differing sizes
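An illustration with Python floats, assumed here to be 64-bit binary floating point. The smaller number is lost entirely when added to a much larger one, because the mantissa has too few bits to hold both:

```python
big = 1e16
small = 1.0

total = big + small
print(total == big)  # True: the 1.0 has been lost

# Subtracting the large number again does not recover the small one.
recovered = total - big
print(recovered)  # 0.0 rather than 1.0
```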
Underflow and overflow
Underflow happens when a number is too small (too close to zero) to be accurately represented with the available number of bits
(e.g. multiplying 2 very small fractions or dividing a small fraction by a much larger number)
Overflow happens when a number is too large to be accurately represented with the available number of bits
(e.g. multiplying 2 very large numbers or dividing a large number by a small fraction)
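A sketch of both effects. The float examples assume Python floats are 64-bit binary floating point, and the 8-bit example uses a bit mask to imitate a fixed-size register, which is an assumption for illustration:

```python
tiny = 1e-200
huge = 1e200

# True result 1e-400 is too close to zero to store, so it becomes 0.0.
underflowed = tiny * tiny
print(underflowed)  # 0.0

# True result 1e400 is too large to store, so it becomes infinity.
overflowed = huge * huge
print(overflowed)  # inf

# Overflow in 8-bit unsigned binary: 200 + 100 = 300 does not fit in 8 bits,
# so only the lowest 8 bits of the result survive.
wrapped = (200 + 100) & 0xFF
print(wrapped)  # 44
```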
Character code
A unique binary representation of a character
The character code of a decimal digit is different to the digit’s binary value
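For example, the character "7" and the number 7 have different bit patterns. A short sketch using Python's built-in `ord`, which returns a character's code:

```python
digit = "7"

code = ord(digit)    # character code of the character "7"
value = int(digit)   # numeric value of the digit

print(code, format(code, "07b"))    # 55 0110111
print(value, format(value, "07b"))  # 7 0000111

# Subtracting the code for "0" converts a digit's character code to its value.
print(code - ord("0"))  # 7
```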
ASCII
American Standard Code for Information Interchange
Uses 7 bits to represent characters, so it can represent a maximum of 2^7 = 128 characters, with character codes ranging from 0 to 2^7 - 1 = 127
Based on the Latin script - can’t represent Arabic, Chinese, or Cyrillic letters
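A small sketch of the 7-bit limit. The helper name `is_ascii_encodable` is an assumption; it relies on Python's built-in "ascii" codec, which rejects any character with a code above 127:

```python
def is_ascii_encodable(text: str) -> bool:
    """Return True if every character in text has an ASCII code (0 to 127)."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(ord("A"), format(ord("A"), "07b"))  # 65 1000001
print(is_ascii_encodable("Hello"))        # True
print(is_ascii_encodable("Привет"))       # False: Cyrillic letters
print(is_ascii_encodable("中文"))          # False: Chinese characters
```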
Unicode
Unicode is an information coding system that was introduced to represent characters that ASCII cannot, and therefore allows more languages and alphabets to be represented
It is also universal - the same character code is used no matter where or how Unicode is used, whereas multiple variants of ASCII existed on different systems
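A brief sketch showing that each character has a single Unicode character code (code point), including characters outside ASCII. The helper name `code_points` and the U+XXXX output format are choices made for this example:

```python
def code_points(text: str) -> list[str]:
    """Return the Unicode code point of each character in U+XXXX form."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points("A"))  # ['U+0041'], the same value as ASCII code 65
print(code_points("Я"))  # ['U+042F'], a Cyrillic letter
print(code_points("ع"))  # ['U+0639'], an Arabic letter
print(code_points("中"))  # ['U+4E2D'], a Chinese character
```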
Error checking methods
Error checking is done to ensure that transmitted data is correct and to reduce the chances of incorrect data being used
Methods include:
- Parity bits
- Majority voting
- Checksums
- Check digits
Parity bits
Parity bit = A bit added onto a bit sequence that is used to validate the data.
The parity bit of a sequence is calculated from the sequence itself and is set to 0 or 1 depending on the parity used
Parity bits are efficient because only one extra bit is added to each sequence
- Even parity = In even parity, the parity bit is set to make the number of 1s in the sequence an even number
- Odd parity = In odd parity, the parity bit is set to make the number of 1s in the sequence an odd number
The sending computer calculates and attaches the parity bit to the data transmission and the receiving computer performs a parity check to validate the data.
If the value of the parity bit matches the parity of the bit sequence, the data is accepted. If it doesn’t, an error has occurred and the receiving computer will request that the sending computer re-transmits the data.
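The sending and checking steps can be sketched as follows. The function names are assumptions, as is the choice to attach the parity bit at the end of the sequence (some systems place it at the start instead):

```python
def parity_bit(data_bits: str, even: bool = True) -> str:
    """Calculate the parity bit for a string of 0s and 1s."""
    ones = data_bits.count("1")
    if even:
        return "0" if ones % 2 == 0 else "1"
    return "1" if ones % 2 == 0 else "0"


def add_parity(data_bits: str, even: bool = True) -> str:
    """Sender: attach the parity bit to the end of the data (assumed position)."""
    return data_bits + parity_bit(data_bits, even)


def check_parity(received: str, even: bool = True) -> bool:
    """Receiver: accept the data only if the total number of 1s has the agreed parity."""
    ones = received.count("1")
    return ones % 2 == 0 if even else ones % 2 == 1


sent = add_parity("1011001")  # four 1s, so the even parity bit is 0
print(sent)                   # 10110010
print(check_parity(sent))     # True: data accepted

corrupted = "00110010"        # first bit flipped during transmission
print(check_parity(corrupted))  # False: request re-transmission
```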