Data Representation Flashcards

Question 1

Q

Number systems

Answer

A

Natural numbers = N, all the positive whole numbers and zero

Integers = Z, all the whole numbers, both positive and negative
Contains all the natural numbers

Rational numbers = Q, all the numbers that can be represented as x/y, where both x and y are integers and y is not zero
Contain all the integers

Irrational numbers = Numbers that cannot be represented by x/y
Examples include π and √2

Real numbers = R, all possible real world quantities, with or without a fractional part
Contains all rational and irrational numbers

Ordinal numbers = Numbers that indicate a position in a list

Question 2

Q

Binary and decimal units

Answer

A

1 byte = 8 bits
Binary units:

Kibibyte = 2^10 bytes
Mebibyte = 2^20 bytes
Gibibyte = 2^30 bytes
Tebibyte = 2^40 bytes

Decimal units:

Kilobyte = 10^3 bytes
Megabyte = 10^6 bytes
Gigabyte = 10^9 bytes
Terabyte = 10^12 bytes
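
The gap between the binary and decimal units can be checked with a short Python sketch. The dictionary names and the to_bytes helper are made up for illustration only.

```python
# Sizes in bytes for the binary and decimal units listed above.
BINARY_UNITS = {
    "kibibyte": 2 ** 10,
    "mebibyte": 2 ** 20,
    "gibibyte": 2 ** 30,
    "tebibyte": 2 ** 40,
}
DECIMAL_UNITS = {
    "kilobyte": 10 ** 3,
    "megabyte": 10 ** 6,
    "gigabyte": 10 ** 9,
    "terabyte": 10 ** 12,
}


def to_bytes(amount, unit):
    """Convert an amount of the named unit into bytes."""
    units = {**BINARY_UNITS, **DECIMAL_UNITS}
    return amount * units[unit]


print(to_bytes(1, "gibibyte"))                            # 1073741824
print(to_bytes(1, "gigabyte"))                            # 1000000000
print(to_bytes(1, "gibibyte") - to_bytes(1, "gigabyte"))  # 73741824
```

So a gibibyte is roughly 7% larger than a gigabyte, and the gap widens for the larger units.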

Question 3

Q

Unsigned and Signed binary

Answer

A

Unsigned binary can only represent zero and positive numbers, while signed binary can represent both positive and negative numbers
With n bits, unsigned binary can represent the values 0 to 2^n - 1
(eg. 8 bits can represent the numbers 0 to 255 because 2^8 - 1 = 255)

Signed binary uses two’s complement to represent both positive and negative numbers - in two’s complement, the most significant bit is given a negative place value
With n bits, two’s complement can represent the values -2^(n-1) to 2^(n-1) - 1
(eg. 8 bits can represent the numbers -128 to 127;
-2^(8-1) is -128 and 2^(8-1) - 1 is 127)
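
A minimal Python sketch of these ranges, plus decoding a bit pattern by giving the most significant bit a negative place value. The function names are illustrative and not from any library.

```python
def unsigned_range(n_bits):
    """Lowest and highest values for unsigned binary with n_bits bits."""
    return 0, 2 ** n_bits - 1


def twos_complement_range(n_bits):
    """Lowest and highest values for two's complement with n_bits bits."""
    return -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1


def twos_complement_value(bits):
    """Decode a string of 0s and 1s as a two's complement number."""
    n = len(bits)
    # The most significant bit has a negative place value.
    total = -int(bits[0]) * 2 ** (n - 1)
    # Every other bit has its usual positive place value.
    for position, bit in enumerate(bits[1:], start=1):
        total += int(bit) * 2 ** (n - 1 - position)
    return total


print(unsigned_range(8))                  # (0, 255)
print(twos_complement_range(8))           # (-128, 127)
print(twos_complement_value("10000000"))  # -128
print(twos_complement_value("11111111"))  # -1
```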

Question 4

Q

Numbers with a fractional part

Answer

A

There are 2 ways to represent numbers with a fractional part in binary; fixed point and floating point

Fixed point = A specified number of bits are placed before and after the binary point (eg. for 8 bits, 4 are placed before and 4 are placed after)

Floating point = Comparable to scientific notation, where 3,100,000 would be represented as 3.1*10^6: 3.1 is the mantissa (the significant digits of the number) and 6 is the exponent (the power that indicates the size of the number and where the point should be positioned). Binary floating point works the same way but the exponent is a power of 2 rather than a power of 10
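
A rough Python sketch of both ideas, assuming an unsigned fixed point pattern and a mantissa and exponent that have already been converted to ordinary numbers. Both helper names are hypothetical.

```python
def fixed_point_value(bits, fraction_bits):
    """Value of an unsigned fixed point pattern.

    The last fraction_bits bits sit after the binary point.
    """
    return int(bits, 2) / 2 ** fraction_bits


def floating_point_value(mantissa, exponent):
    """Value of a binary floating point number: mantissa * 2^exponent."""
    return mantissa * 2 ** exponent


# 0110.1100 with 4 bits before and 4 bits after the binary point
print(fixed_point_value("01101100", 4))  # 6.75

# Mantissa 0.11 in binary (0.75) with an exponent of 3
print(floating_point_value(0.75, 3))     # 6.0
```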

Question 5

Q

Rounding errors

Answer

A

There are some decimal numbers that cannot possibly be represented exactly in binary, even with the use of fixed point or floating point notation.
This means that both fixed and floating point representations can be inaccurate
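
For example, decimal 0.1 has no exact binary representation. The sketch below rounds it to the nearest fixed point value for a chosen number of fractional bits. The nearest_fixed_point helper is an illustrative name, and the last line relies on Python's built-in floating point numbers.

```python
def nearest_fixed_point(value, fraction_bits):
    """Round value to the nearest number that fits in fraction_bits
    bits after the binary point."""
    scale = 2 ** fraction_bits
    return round(value * scale) / scale


print(nearest_fixed_point(0.1, 4))  # 0.125
print(nearest_fixed_point(0.1, 8))  # 0.1015625
print(0.1 + 0.2 == 0.3)             # False, because of rounding errors
```

More fractional bits get closer to 0.1, but no finite number of bits reaches it exactly.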

Question 6

Q

Absolute and relative errors

Answer

A

Absolute error = The actual amount by which a value is inaccurate, found by taking the difference between the real value and the value stored as binary (ignoring the sign)

Relative error = The absolute error compared to the size of the real value, often given as a percentage.
Found by (Absolute error / Real value)
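
A minimal Python sketch of both calculations. The function names are illustrative, and the example assumes a real value of 8 that has been stored as 7.5.

```python
def absolute_error(real_value, stored_value):
    """Size of the difference between the real and stored values."""
    return abs(real_value - stored_value)


def relative_error(real_value, stored_value):
    """Absolute error as a fraction of the real value."""
    return absolute_error(real_value, stored_value) / abs(real_value)


print(absolute_error(8, 7.5))  # 0.5
print(relative_error(8, 7.5))  # 0.0625, ie. 6.25%
```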

Question 7

Q

Fixed point vs Floating point

Answer

A

Precision = the maximum number of significant digits that can be represented

Fixed point = A binary point close to the left of a number gives good precision but only a small range of numbers.
However, a binary point close to the right increases the range while decreasing precision.

Floating point = Allows for the representation of a greater range of numbers with a given number of bits than fixed point.
- This is because floating point can take advantage of an exponent which can be either positive or negative.
- The number of bits allocated to each part of a floating point number affects the numbers that can be represented.
- A large exponent and a small mantissa allows for a large range but little precision.
- A small exponent and a large mantissa allows for good precision but only a small range.

Question 8

Q

Normalisation

Answer

A

Floating point numbers are normalised in order to provide the maximum level of precision for a given number of bits.
Normalisation involves ensuring that the mantissa of a floating point number starts with 0.1 (for positive numbers) or 1.0 (for negative numbers).

Question 9

Q

Cancellation errors

Answer

A

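These cause a loss of accuracy during the addition or subtraction of numbers of widely differing sizes - the smaller number can be partly or completely lost because the result can’t hold enough significant digits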
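
A small demonstration using Python's built-in floating point numbers, assuming the usual 64-bit format. The variable names are for illustration only.

```python
large = 1e17   # a very large number
small = 1.0    # a much smaller number

total = large + small

print(total == large)  # True: adding 1.0 made no difference
print(total - large)   # 0.0 rather than the expected 1.0
```

The mantissa doesn't have enough bits to hold both the large number and the extra 1, so the smaller value is lost.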

Question 10

Q

Underflow and overflow

Answer

A

Underflow happens when the number is too small (too close to zero) to be accurately represented with the available number of bits
(eg. Multiplying 2 very small fractions or dividing a small fraction by a much larger number)

Overflow happens when the number is too large to be accurately represented with the available number of bits
(eg. Multiplying 2 very large numbers or dividing a large number by a small fraction)
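
Both effects can be seen with Python's built-in floating point numbers (assumed here to be the usual 64-bit format). The add_unsigned helper is a hypothetical sketch of 8-bit unsigned addition that assumes the result wraps around when it doesn't fit.

```python
tiny = 1e-200 * 1e-200  # true answer 1e-400 is too close to zero
huge = 1e200 * 1e200    # true answer 1e400 is too large

print(tiny)  # 0.0 (underflow)
print(huge)  # inf (overflow)


def add_unsigned(a, b, bits=8):
    """Add two unsigned integers, keeping only the lowest `bits` bits.

    Returns the stored result and whether overflow happened.
    """
    total = a + b
    overflowed = total > 2 ** bits - 1
    return total % 2 ** bits, overflowed


print(add_unsigned(200, 100))  # (44, True): 300 doesn't fit in 8 bits
print(add_unsigned(100, 27))   # (127, False)
```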

Question 11

Q

Character code

Answer

A

A unique binary representation of a character
The character code of a decimal digit is different to the digit’s binary value (eg. the character ‘7’ has the ASCII character code 55, not 7)
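
The difference can be checked in Python, where the built-in ord gives a character's code and int gives a digit's numeric value. The as_bits helper is an illustrative name.

```python
def as_bits(value, width=7):
    """Show a whole number as a fixed-width binary string."""
    return format(value, f"0{width}b")


digit = "7"

print(ord(digit))           # 55, the character code of '7'
print(int(digit))           # 7, the numeric value of the digit
print(as_bits(ord(digit)))  # 0110111
print(as_bits(int(digit)))  # 0000111
```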

Question 12

Q

ASCII

Answer

A

American Standard Code for Information Interchange
Uses 7 bits to represent characters - this means it can represent a maximum of 2^7 = 128 characters, with character codes ranging from 0 to 2^7 - 1 = 127
Based on the Latin script - can’t represent Arabic, Chinese, or Cyrillic letters

Question 13

Q

Unicode

Answer

A

Unicode is an information coding system that was introduced to represent the characters that ASCII couldn’t, and therefore allows more languages and alphabets to be represented

It is also universal - the same character code is used no matter where or how Unicode is used, whereas multiple incompatible versions of ASCII existed on different systems

Question 14

Q

Error checking methods

Answer

A

Error checking is done to ensure that transmitted data is correct and to reduce the chances of incorrect data being used
Methods include:
- Parity bits
- Majority voting
- Checksums
- Check digits

Question 15

Q

Parity bits

Answer

A

Parity bits = Bits that are added onto a bit sequence and used to validate the data.
The parity bit of a sequence is calculated from the sequence itself and is set to 0 or 1 depending on the parity used
They are very efficient

Even parity = In even parity, the parity bit is set to make the number of 1s in the sequence an even number
Odd parity = In odd parity, the parity bit is set to make the number of 1s in the sequence an odd number

The sending computer calculates and attaches the parity bit to the data transmission and the receiving computer performs a parity check to validate the data.
If the value of the parity bit matches the parity of the bit sequence, the data is accepted. If it doesn’t, an error has occurred and the receiving computer will request that the sending computer re-transmits the data.
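
A minimal Python sketch of the sender and receiver sides. It assumes the parity bit is attached to the end of the sequence (the position varies between systems), and the function names are illustrative.

```python
def parity_bit(bits, even=True):
    """Work out the parity bit for a string of 0s and 1s."""
    ones = bits.count("1")
    bit = ones % 2       # 1 if the number of 1s is currently odd
    if not even:
        bit = 1 - bit    # odd parity needs the opposite bit
    return str(bit)


def add_parity(bits, even=True):
    """Sender: attach the parity bit to the end of the data."""
    return bits + parity_bit(bits, even)


def check_parity(received, even=True):
    """Receiver: True if the received bits have the expected parity."""
    ones = received.count("1")
    return ones % 2 == (0 if even else 1)


sent = add_parity("1011001")
print(sent)                      # 10110010
print(check_parity(sent))        # True
print(check_parity("10110011"))  # False, one bit was flipped
```

If two bits are flipped, the parity comes out the same as before, so the check wouldn't notice the error.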

Question 16

Q

Majority voting

Answer

A

Majority voting is where each bit is sent multiple times - when the data is received, the most commonly occurring value is taken to be the value of that bit
(eg. 0100 would be transmitted as 000 111 000 000; if it was received with errors as 000 011 010 001, it would still be read as 0100)

Majority voting is capable of correcting errors in data transmission, which means there’s no need for re-transmission unlike with parity bits
However, more data is transmitted with the repetition of bits - this means the data would take more time to transmit and makes majority voting inefficient
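
A short Python sketch of majority voting, assuming each bit is sent three times (any odd number of repeats would work). The function names are illustrative.

```python
def encode(bits, repeats=3):
    """Sender: repeat every bit `repeats` times."""
    return "".join(bit * repeats for bit in bits)


def decode(received, repeats=3):
    """Receiver: take the most common value in each group of bits."""
    decoded = []
    for start in range(0, len(received), repeats):
        group = received[start:start + repeats]
        decoded.append("1" if group.count("1") > repeats // 2 else "0")
    return "".join(decoded)


print(encode("0100"))          # 000111000000
print(decode("000011010001"))  # 0100, despite three flipped bits
```

The three errors are corrected because no group of three has more than one wrong bit.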

Question 17

Q

Checksums

Answer

A

Checksums are used to check the integrity of data blocks rather than individual bytes - they are calculated from the block of data with an algorithm such as the modulus function
How efficient they are depends on the complexity of the algorithm used to calculate them

The sending computer applies an algorithm to the block of data to calculate its checksum; it then attaches the checksum to the data and transmits it to the receiving computer

The receiving computer removes the checksum and applies the same algorithm to the data block to ensure that the checksum produced matches the checksum received.
If they do, the data is validated; if they don’t, the receiving computer will request that the sending computer re-transmits the data
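
The process can be sketched in Python with a very simple checksum: the sum of all the byte values modulo 256, attached as one extra byte. This algorithm and the function names are assumptions for illustration, and real systems often use stronger algorithms.

```python
def checksum(data):
    """Sum of all the byte values, modulo 256 so it fits in one byte."""
    return sum(data) % 256


def make_packet(data):
    """Sender: attach the checksum to the end of the data block."""
    return data + bytes([checksum(data)])


def verify_packet(packet):
    """Receiver: remove the checksum, recalculate it and compare."""
    data, received_checksum = packet[:-1], packet[-1]
    return checksum(data) == received_checksum


packet = make_packet(b"AB")
print(checksum(b"AB"))              # 131 (65 + 66)
print(verify_packet(packet))        # True
print(verify_packet(b"AC" + packet[-1:]))  # False, data was corrupted
```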

Question 18

Q

Check digits

Answer

A

A type of checksum where only one digit is added to the transmitted data block - this limits how complex the algorithms that calculate check digits can be
They are used to validate product and book codes.
They are efficient since only one digit needs to be calculated

A check digit is an extra digit that is calculated from the original digits in the data block; it’s attached to the original data block and transmitted as a part of it.

A program will calculate a check digit by multiplying each digit in the data block alternately by 1 and 3 (as in ISBN-13 book codes) - the products are added together and the total modulo 10 (the remainder after dividing by 10) is calculated.
This value is subtracted from 10 to get the check digit (a result of 10 becomes 0), which is added to the end of the data block
When this data block is input / received by another program or computer, the check digit is recalculated from the data block. The calculated check digit is then compared to the inputted / received check digit - if they don’t match then the data needs to be re-entered or re-transmitted
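
A Python sketch of this calculation, following the ISBN-13 scheme where the weights alternate 1, 3, 1, 3 and so on starting from the first digit. The function names are illustrative.

```python
def check_digit(digits):
    """Calculate the check digit for a string of decimal digits."""
    total = 0
    for position, digit in enumerate(digits):
        weight = 1 if position % 2 == 0 else 3
        total += int(digit) * weight
    # Subtract the remainder from 10; a result of 10 becomes 0.
    return (10 - total % 10) % 10


def is_valid(code):
    """Recalculate the check digit and compare it with the last digit."""
    return check_digit(code[:-1]) == int(code[-1])


print(check_digit("978030640615"))  # 7
print(is_valid("9780306406157"))    # True
print(is_valid("9780306406158"))    # False
```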

Question 19

Q

Analogue and digital data / signals

Answer

A

Analogue data is continuous while digital data is discrete.

Analogue signals vary in a continuous manner, so they can take any value and change as frequently as needed. Digital signals vary in a discrete manner, so they can only take certain values
Computers can’t process analogue signals but can process digital signals.

Question 20

Q

ADC - Analogue to Digital Conversion

Answer

A

An ADC works by:
1) Sampling an analogue signal at regular intervals at a specific frequency - this determines the number of samples per second

2) Measuring the amplitude of each sample

3) Quantising each amplitude measurement - the height of each sample is given an integer value

4) Encoding each integer as a binary value using a fixed number of bits

5) Outputting the digital representation of the analogue signal, which can be processed by computers - the PCM signal
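
The steps above can be sketched roughly in Python. This is only a model: the analogue signal is assumed to be a function of time with amplitudes between v_min and v_max, and every name here is made up for illustration.

```python
import math


def sample_and_quantise(signal, duration, sample_rate, bits,
                        v_min=-1.0, v_max=1.0):
    """Return a list of binary strings, one per sample."""
    levels = 2 ** bits                     # number of integer levels
    step = (v_max - v_min) / (levels - 1)  # amplitude gap between levels
    codes = []
    for i in range(round(duration * sample_rate)):
        t = i / sample_rate                # 1) sample at regular intervals
        amplitude = signal(t)              # 2) measure the amplitude
        level = round((amplitude - v_min) / step)  # 3) quantise
        level = max(0, min(levels - 1, level))     # keep within range
        codes.append(format(level, f"0{bits}b"))   # 4) encode in binary
    return codes                           # 5) the digital output


def sine_wave(t):
    """A 1 Hz sine wave standing in for the analogue signal."""
    return math.sin(2 * math.pi * t)


# One second sampled 8 times with 3 bits per sample
print(sample_and_quantise(sine_wave, 1, 8, 3))
```

Raising sample_rate gives more samples per second, and raising bits gives each sample more possible levels.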

Question 21

Q

DAC - Digital to Analogue Conversion

Answer

A

A DAC works by:
1) Reading a bit pattern that represents an analogue signal - the PCM signal - and converting it into a PAM signal

2) Sampling the PAM signal at regular intervals, then giving the sampled digital values discrete levels and converting these values into analogue voltages

3) Outputting these voltages

The output is an approximation of the original analogue signal since the sample values have been rounded to the nearest value that can be represented - the difference between the original analogue signal and the DAC’s output is called quantisation noise

Question 22

Q

Nyquist’s Theorem

Answer

A

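Nyquist’s Theorem states that the sampling rate must be at least double the highest frequency of an analogue signal for the signal to be reproduced accurately
(eg. a signal with a highest frequency of 20,000 Hz must be sampled at 40,000 Hz or more)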

Question 23

Q

Musical Instrument Digital Interface (MIDI)

Answer

A

MIDI is a standard adopted by the music industry for controlling devices that emit music

MIDI isn’t music and doesn’t contain any actual sounds - it is nothing more than a set of instructions on how to produce sounds

MIDI data contain a list of events, messages or instructions that tell an electronic device how to generate a certain sound

These event messages specify:
- When to play a note
- What pitch a note is
- Control signals for volume
- Clock signals to set tempo

Question 24

Q

MIDI advantages

Answer

A

.MID files are much smaller than audio files like .mp3 or .wav - this is useful for devices with less memory such as mobile phones

.MID files load faster - good for embedding into web pages

MIDI supports a wide variety of musical instruments

Question 25

Q

Bitmaps

Answer

A

Bitmap images are represented by a grid made up of blocks of colour called pixels, which each have a binary value assigned to them

Pixel = The smallest addressable part of an image

The resolution of a bitmap image is measured in pixels and is the no. of pixels per row multiplied by the no. of pixels per column

The value assigned to a pixel determines the colour of the pixel

Colour depth = the number of bits assigned to each pixel; the more bits, the more colours that each pixel can take (n bits allow 2^n different colours)

Bitmaps are used for real world images

Question 26

Q

Metadata

Answer

A

Metadata is the additional details of an image and can include:

Colour depth
When it was taken / processed
File size
Date and time of when the image was created
If the image is compressed or not
GPS location of where the image was created

Question 27

Q

Vector images

Answer

A

Vector images are represented by shapes and geometric objects such as lines and rectangles
The properties of each object in a vector image are stored in a list called the drawing list

Vector graphics are used for drawing images such as logos and maps since these can be made up of different shapes and objects

Question 28

Q

The drawing list

Answer

A

The drawing list contains all the objects that make up the image and the necessary information about each object
These properties include:

Fill colour
Line thickness
Shading
Border thickness
Height and width

Question 29

Q

Vectors v Bitmaps

Answer

A

Vector images can be scaled without losing quality

Vector images use less storage space compared to bitmap images

Vectors are better for simpler drawn images; bitmaps are much better for real world photographs

Bitmaps can’t be scaled without distortion

Question 30

Q

Memory requirements for sound files

Answer

A

File size (bits) = Sampling rate (Hz) x Sampling resolution (bits) x Time period (seconds)
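
A worked example in Python, assuming a single channel of audio and no metadata or compression. The function name is illustrative.

```python
def sound_file_size_bits(sample_rate_hz, resolution_bits, seconds):
    """Sampling rate x sampling resolution x time period, in bits."""
    return sample_rate_hz * resolution_bits * seconds


# 60 seconds sampled at 44,100 Hz with 16 bits per sample
size_bits = sound_file_size_bits(44100, 16, 60)
print(size_bits)       # 42336000 bits
print(size_bits // 8)  # 5292000 bytes
```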

Question 31

Q

Memory requirements for bitmap images

Answer

A

File size (bits) = Colour depth (bits) x Image resolution (total number of pixels)
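
A worked example in Python that ignores metadata and compression. The function names are illustrative.

```python
def bitmap_size_bits(width, height, colour_depth_bits):
    """Colour depth x total number of pixels, in bits."""
    return colour_depth_bits * width * height


def number_of_colours(colour_depth_bits):
    """How many different colours a pixel can take."""
    return 2 ** colour_depth_bits


# An 800 x 600 image with a 24-bit colour depth
size_bits = bitmap_size_bits(800, 600, 24)
print(size_bits)              # 11520000 bits
print(size_bits // 8)         # 1440000 bytes
print(number_of_colours(24))  # 16777216
```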

Question 32

Q

Compression

Answer

A

Compression is the process of encoding data so that the data is squeezed into a smaller number of bytes than the data would occupy if uncompressed

Question 33

Q

Lossy compression

Answer

A

Lossy compression removes data from the original file to reduce the file’s size - this data can’t be restored when the file is decompressed
This could involve reducing the resolution of an image or reducing the sampling resolution of a sound file

There’s no fixed limit to how much the file can be compressed, but the more data is removed, the lower the quality

Question 34

Q

Lossless compression

Answer

A

Lossless compression works by identifying and encoding patterns within the data rather than encoding the actual data itself - there is no loss of data and the size of a file can be reduced without affecting its quality

There is a limit on how much the file can be compressed without removing data

Question 35

Q

Run Length Encoding

Answer

A

RLE reduces the size of a file by removing runs of repeated information and replacing each run with a single data value and a count of how many times that value is repeated in a row

Not all data is suitable for RLE since some data might not have any repeating parts
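
A minimal Python sketch of RLE on a string of characters, storing each run as a (value, count) pair. The function names and the pair format are assumptions for illustration.

```python
from itertools import groupby


def rle_encode(text):
    """Replace each run of repeated characters with (character, count)."""
    return [(char, len(list(run))) for char, run in groupby(text)]


def rle_decode(pairs):
    """Rebuild the original text from the (character, count) pairs."""
    return "".join(char * count for char, count in pairs)


print(rle_encode("AAAABBBCCD"))  # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_encode("ABCD"))        # [('A', 1), ('B', 1), ('C', 1), ('D', 1)]
```

The second example has no repeats, so every character still needs its own pair and nothing is saved.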

Question 36

Q

Dictionary-based methods

Answer

A

Dictionary-based methods use algorithms that substitute repeated information in a file with data values that identify the information in a dictionary - this dictionary is added to the file itself

These data values take up less space than the information they replace. However, the dictionary itself must also be stored, which adds to the file size
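
A simplified Python sketch of the idea, using whole words as the dictionary entries and their positions in the dictionary as the data values. Real dictionary-based methods work on byte patterns and are more involved, and the function names here are made up for illustration.

```python
def dictionary_compress(text):
    """Replace each word with its index in a dictionary of unique words."""
    dictionary = []  # stored alongside the compressed data
    index_of = {}    # quick lookup from word to its index
    codes = []
    for word in text.split(" "):
        if word not in index_of:
            index_of[word] = len(dictionary)
            dictionary.append(word)
        codes.append(index_of[word])
    return dictionary, codes


def dictionary_decompress(dictionary, codes):
    """Look each index up in the dictionary to rebuild the text."""
    return " ".join(dictionary[code] for code in codes)


dictionary, codes = dictionary_compress("the cat sat on the mat the end")
print(dictionary)  # ['the', 'cat', 'sat', 'on', 'mat', 'end']
print(codes)       # [0, 1, 2, 3, 0, 4, 0, 5]
```

Only the repeated word "the" saves any space here, which shows why the cost of storing the dictionary matters for small files.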