Data Representation Flashcards

1
Q

Number systems

A

Natural numbers = N, all the positive whole numbers and zero

Integers = Z, all the whole numbers, both positive and negative
Contains all the natural numbers

Rational numbers = Q, all the numbers that can be represented as x/y, where both x and y are integers and y is not zero
Contain all the integers

Irrational numbers = Numbers that cannot be represented as x/y, where x and y are integers
Examples include π and √2

Real numbers = R, all possible real world quantities, with or without a fractional part
Contains all rational and irrational numbers

Ordinal numbers = Numbers that indicate a position in a list

2
Q

Binary and decimal units

A

1 byte = 8 bits
Binary units:

  • Kibibyte = 2^10 bytes
  • Mebibyte = 2^20 bytes
  • Gibibyte = 2^30 bytes
  • Tebibyte = 2^40 bytes

Decimal units:

  • Kilobyte = 10^3 bytes
  • Megabyte = 10^6 bytes
  • Gigabyte = 10^9 bytes
  • Terabyte = 10^12 bytes
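
The gap between the binary and decimal units can be checked with a short sketch. The constant names below are made up for this example; the values come straight from the definitions above.

```python
# Sizes in bytes, following the definitions on this card
KIBIBYTE, MEBIBYTE, GIBIBYTE, TEBIBYTE = 2 ** 10, 2 ** 20, 2 ** 30, 2 ** 40
KILOBYTE, MEGABYTE, GIGABYTE, TERABYTE = 10 ** 3, 10 ** 6, 10 ** 9, 10 ** 12

print(KIBIBYTE - KILOBYTE)   # 24
print(GIBIBYTE - GIGABYTE)   # 73741824
print(GIBIBYTE * 8)          # 8589934592 (bits in one gibibyte)
```

The difference grows with each prefix, so a gibibyte is about 7% larger than a gigabyte.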
3
Q

Unsigned and Signed binary

A

Unsigned binary can only represent zero and positive numbers, while signed binary can represent both positive and negative numbers
With n bits, unsigned binary can represent the values 0 to 2^n - 1
(eg. 8 bits can represent the numbers 0 to 255 because 2^8 - 1 = 255)

Signed binary uses two’s complement to represent both positive and negative numbers - in two’s complement, the most significant bit is given a negative place value
Signed binary with two’s complement can represent 2^(n-1) - 1 down to -2^(n-1), where n is the number of bits
(eg. 8 bits can represent the numbers 127 to -128;
2^(8-1) - 1 is 127 and -2^(8-1) is -128)
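
As a rough illustration, the sketch below works out both ranges for a given number of bits and encodes a value in two’s complement. The function names are invented for this example.

```python
def unsigned_range(n):
    """Smallest and largest values that n unsigned bits can hold."""
    return 0, 2 ** n - 1


def signed_range(n):
    """Smallest and largest values that n two's complement bits can hold."""
    return -(2 ** (n - 1)), 2 ** (n - 1) - 1


def to_twos_complement(value, n):
    """Return value as an n-bit two's complement bit string."""
    low, high = signed_range(n)
    if not low <= value <= high:
        raise ValueError("value does not fit in the given number of bits")
    # Masking with 2^n - 1 wraps a negative value round to its bit pattern
    return format(value & (2 ** n - 1), f"0{n}b")


print(unsigned_range(8))             # (0, 255)
print(signed_range(8))               # (-128, 127)
print(to_twos_complement(-128, 8))   # 10000000
print(to_twos_complement(-1, 8))     # 11111111
```

In the last two outputs the leading 1 is the bit with the negative place value (-128 for 8 bits).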

4
Q

Numbers with a fractional part

A

There are 2 ways to represent numbers with a fractional part in binary: fixed point and floating point

Fixed point = A specified number of bits are placed before and after the binary point (eg. for 8 bits, 4 are placed before and 4 are placed after)

Floating point = Comparable to scientific notation, where 3,100,000 would be represented as 3.1*10^6: 3.1 is the mantissa (the significant digits of the number) and 6 is the exponent (the power that indicates the size of the number and where the point should be positioned). In binary floating point the exponent is a power of 2 rather than 10, and it positions the binary point
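
A minimal sketch of how each representation turns into a value, assuming unsigned numbers for simplicity. Both function names are invented for this example.

```python
def fixed_point_value(bits, fraction_bits):
    """Value of an unsigned fixed point bit string."""
    # Read the bits as a whole number, then move the binary point left
    return int(bits, 2) / 2 ** fraction_bits


def floating_point_value(mantissa, exponent):
    """Value of a binary floating point number: mantissa x 2^exponent."""
    return mantissa * 2 ** exponent


# 0110.1100 with 4 bits after the binary point
print(fixed_point_value("01101100", 4))   # 6.75

# Mantissa 0.1011 in binary is 0.6875; an exponent of 3 moves the point 3 places
print(floating_point_value(0.6875, 3))    # 5.5
```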

5
Q

Rounding errors

A

There are some decimal numbers that cannot possibly be represented exactly in binary, even with the use of fixed point or floating point notation.
This means that both fixed and floating point representations can be inaccurate
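
The effect is easy to see in Python, whose floats are binary floating point. The helper `nearest_fixed_point` is a made-up name for this sketch; it finds the closest value that a given number of fractional bits can store.

```python
from fractions import Fraction

# 0.1 has no exact binary representation, so small errors creep in
print(0.1 + 0.2 == 0.3)                   # False
print(Fraction(0.1) == Fraction(1, 10))   # False


def nearest_fixed_point(value, fraction_bits):
    """Closest value storable with the given number of fractional bits."""
    scale = 2 ** fraction_bits
    return round(value * scale) / scale


print(nearest_fixed_point(0.1, 4))    # 0.125
print(nearest_fixed_point(0.75, 4))   # 0.75
```

0.75 is stored exactly because it is 1/2 + 1/4, while 0.1 is not a sum of a finite number of powers of 2.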

6
Q

Absolute and relative errors

A

Absolute error = The actual amount by which a value is inaccurate, found as the size of the difference (Real value - Value stored as binary), ignoring the sign

Relative error = The absolute error measured relative to the size of the real value, which shows how significant the error is.
Found by (Absolute error / Real value); often given as a percentage
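
A short worked example, assuming the real value 2.3 has been stored as 2.25 (the nearest value with two fractional bits). The numbers and function names are chosen purely for illustration.

```python
def absolute_error(real_value, stored_value):
    """Size of the difference between the real and stored values."""
    return abs(real_value - stored_value)


def relative_error(real_value, stored_value):
    """Absolute error as a fraction of the real value."""
    return absolute_error(real_value, stored_value) / abs(real_value)


real, stored = 2.3, 2.25
print(round(absolute_error(real, stored), 4))   # 0.05
print(round(relative_error(real, stored), 4))   # 0.0217
```

The same absolute error of 0.05 would be far less significant on a real value of 2300, which is why the relative error is the more useful comparison.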

7
Q

Fixed point vs Floating point

A

Precision = the maximum number of significant digits that can be represented

Fixed point = A binary point close to the left of a number gives good precision but only a small range of numbers.
However, a binary point close to the right increases the range while decreasing precision.

Floating point = Allows for the representation of a greater range of numbers with a given number of bits than fixed point.
- This is because floating point can take advantage of an exponent which can be either positive or negative.
- The number of bits allocated to each part of a floating point number affects the numbers that can be represented.
- A large exponent and a small mantissa allows for a large range but little precision.
- A small exponent and a large mantissa allows for good precision but only a small range.

8
Q

Normalisation

A

Floating point numbers are normalised in order to provide the maximum level of precision for a given number of bits.
Normalisation involves ensuring that the mantissa of a floating point number starts with 0.1 (for a positive number) or 1.0 (for a negative number), adjusting the exponent to compensate so that the value is unchanged.
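
One way to check for normalisation, sketched under the assumption that the mantissa is a two’s complement bit string with the binary point after the first bit. The function name is made up for this example.

```python
def is_normalised(mantissa_bits):
    """True if a two's complement mantissa starts with 01 or 10."""
    # A normalised mantissa always has different first and second bits
    return mantissa_bits[0] != mantissa_bits[1]


print(is_normalised("01011000"))   # True  (positive, starts 0.1)
print(is_normalised("00010110"))   # False (leading zeros waste precision)
print(is_normalised("10100000"))   # True  (negative, starts 1.0)
print(is_normalised("11101000"))   # False (leading ones waste precision)
```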

9
Q

Cancellation errors

A

These cause a loss of accuracy during the addition or subtraction of numbers of widely differing values
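
A small demonstration using Python floats, which have roughly 16 significant decimal digits. The variable names are only for illustration.

```python
big = 1e16
small = 1.0

# The small value is lost when it is added to the much larger one
result = (big + small) - big
print(result)                # 0.0
print(big + small == big)    # True
```

Mathematically the result should be 1.0, but the mantissa does not have enough bits to hold both values at once.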

10
Q

Underflow and overflow

A

Underflow happens when the number is too small (too close to zero) to be accurately represented with the available number of bits
(eg. Multiplying 2 very small fractions or dividing a small fraction by a much larger number)

Overflow happens when the number is too large to be accurately represented with the available number of bits
(eg. Multiplying 2 very large numbers or dividing a large number by a small fraction)
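
Both effects can be seen with Python floats, and overflow can also be shown for a fixed-width integer. `add_unsigned_8bit` is a hypothetical helper that imitates an 8-bit register by discarding any carry out of the top bit.

```python
import sys

# Overflow: the result is too large for a float, so it becomes infinity
overflowed = sys.float_info.max * 10
print(overflowed)    # inf

# Underflow: the result is too close to zero, so it becomes 0.0
underflowed = 1e-200 * 1e-200
print(underflowed)   # 0.0


def add_unsigned_8bit(a, b):
    """Add two values, keeping only the lowest 8 bits of the result."""
    return (a + b) & 0xFF


# 200 + 100 = 300, which needs 9 bits, so the stored result is wrong
print(add_unsigned_8bit(200, 100))   # 44
```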

11
Q

Character code

A

A unique binary representation of a character
The character code of a decimal digit is different to the digit’s binary value (eg. in ASCII the character 7 has the code 55, not 7)
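
The difference can be seen with Python’s built-in `ord`, which returns a character’s code. The variable name is only for illustration.

```python
digit_char = "7"

print(ord(digit_char))                  # 55
print(format(ord(digit_char), "08b"))   # 00110111 (the character code)
print(format(7, "08b"))                 # 00000111 (the pure binary value)

# Subtracting the code for "0" recovers the digit's numeric value
print(ord(digit_char) - ord("0"))       # 7
```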

12
Q

ASCII

A

American Standard Code for Information Interchange
Uses 7 bits to represent characters - this means it can represent a maximum of 2^7 = 128 characters, with codes ranging from 0 to 2^7 - 1 = 127
Based on the Latin alphabet - can’t represent Arabic, Chinese or Cyrillic characters

13
Q

Unicode

A

Unicode is an information coding system that was introduced to allow the representation of more characters that ASCII couldn’t represent, and therefore allows more languages and alphabets to be represented

It is also universal - the same character code is used no matter where or how Unicode is used, whereas multiple incompatible extensions of ASCII existed on different systems
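
A brief sketch of the difference in Python, whose strings are Unicode. `fits_in_ascii` is a made-up helper for this example.

```python
def fits_in_ascii(text):
    """True if every character has a code in the 7-bit ASCII range."""
    return all(ord(ch) < 128 for ch in text)


print(ord("A"))                  # 65
print(ord("Ж"))                  # 1046 (a Cyrillic letter)
print(fits_in_ascii("Hello"))    # True
print(fits_in_ascii("Жук"))      # False

try:
    "Ж".encode("ascii")
except UnicodeEncodeError:
    print("not representable in ASCII")

# UTF-8, one encoding of Unicode, needs two bytes for this character
print(len("Ж".encode("utf-8")))  # 2
```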

14
Q

Error checking methods

A

Error checking is done to ensure that transmitted data is correct and to reduce the chances of incorrect data being used
Methods include:
- Parity bits
- Majority voting
- Checksums
- Check digits

15
Q

Parity bits

A

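Parity bit = A single bit added to a bit sequence and used to validate the data.
The parity bit of a sequence is calculated from the sequence itself and is set to 0 or 1 depending on the parity used
Parity bits are very efficient, since only one extra bit is sent, but a parity check cannot detect an even number of flipped bits or say which bit is wrong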

  • Even parity = In even parity, the parity bit is set to make the number of 1s in the sequence an even number
  • Odd parity = In odd parity, the parity bit is set to make the number of 1s in the sequence an odd number

The sending computer calculates and attaches the parity bit to the data transmission and the receiving computer performs a parity check to validate the data.
If the number of 1s received matches the agreed parity, the data is accepted. If it doesn’t, an error has occurred and the receiving computer will request that the sending computer re-transmits the data.
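
The send and check steps might look like the sketch below. It assumes the parity bit is appended to the end of the sequence (some systems put it at the front), and the function names are invented for this example.

```python
def add_parity_bit(data_bits, even=True):
    """Append a parity bit so the total number of 1s is even (or odd)."""
    ones = data_bits.count("1")
    parity = ones % 2 if even else (ones + 1) % 2
    return data_bits + str(parity)


def parity_ok(received_bits, even=True):
    """Check that the received bits, parity bit included, have the agreed parity."""
    ones = received_bits.count("1")
    return ones % 2 == (0 if even else 1)


sent = add_parity_bit("1011001")             # four 1s, so the parity bit is 0
print(sent)                                  # 10110010
print(add_parity_bit("1011001", even=False)) # 10110011
print(parity_ok(sent))                       # True
print(parity_ok("00110010"))                 # False (one bit flipped)
print(parity_ok("01110010"))                 # True  (two bits flipped, not detected)
```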

16
Q

Majority voting

A

Majority voting is where each bit is sent multiple times - when the data is received, the most commonly occurring value is taken to be the value of that bit
(eg. with each bit sent three times, 0100 would be transmitted as 000 111 000 000; if it arrived as 000 011 010 001 it would still be read as 0100)

Majority voting is capable of correcting errors in data transmission, which means there’s no need for re-transmission unlike with parity bits
However, more data is transmitted with the repetition of bits - this means the data would take more time to transmit and makes majority voting inefficient
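
A minimal sketch of the idea, assuming each bit is sent an odd number of times (three by default) so that there is always a clear majority. The function names are invented for this example.

```python
def encode_majority(bits, repeats=3):
    """Repeat every bit the given number of times."""
    return "".join(bit * repeats for bit in bits)


def decode_majority(received, repeats=3):
    """Take the most common value in each group of repeated bits."""
    decoded = ""
    for i in range(0, len(received), repeats):
        group = received[i:i + repeats]
        decoded += "1" if group.count("1") > repeats // 2 else "0"
    return decoded


print(encode_majority("0100"))            # 000111000000

# Three of the groups below contain a single flipped bit
print(decode_majority("000011010001"))    # 0100
```

Three times as many bits are sent as in the original data, which is the cost described above.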

17
Q

Checksums

A

Checksums are used to check the integrity of data blocks rather than individual bytes - they are calculated from the block of data with an algorithm such as the modulus function
How efficient they are depends on the complexity of the algorithm used to calculate them

The sending computer applies an algorithm to the block of data to calculate its checksum; it then attaches the checksum to the data and transmits it to the receiving computer

The receiving computer removes the checksum and applies the same algorithm to the data block to ensure that the checksum produced matches the checksum received.
If they do, the data is validated; if they don’t, the receiving computer will request that the sending computer re-transmits the data
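
The exchange can be sketched as follows, assuming a very simple algorithm: the sum of the byte values modulo 256, sent as one extra byte. Real protocols use stronger algorithms, and the function names here are made up for this example.

```python
def checksum(block, modulus=256):
    """Sum of the byte values in the block, reduced with the modulus."""
    return sum(block) % modulus


def send(block):
    """Attach the checksum to the end of the data block."""
    return bytes(block) + bytes([checksum(block)])


def receive(packet):
    """Split off the checksum and compare it with a recalculated one."""
    block, received_sum = packet[:-1], packet[-1]
    return checksum(block) == received_sum


packet = send(b"HELLO")
print(packet[-1])                           # 116
print(receive(packet))                      # True
print(receive(b"HELLP" + bytes([116])))     # False (data changed in transit)
```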

18
Q

Check digits

A

A type of checksum where only one digit is added to the transmitted data block - this limits how complex the algorithms that calculate check digits can be
They are used to validate product and book codes.
They are efficient since only one digit needs to be calculated

A check digit is an extra digit that is calculated from the original digits in the data block; it’s attached to the original data block and transmitted as a part of it.

A program will calculate a check digit by multiplying the digits in the data block alternately by 1 and 3 - the products are added together and the total modulo 10 is calculated.
This value is subtracted from 10 to get the check digit (a result of 10 becomes 0), which is added to the data block
When this data block is input / received by another program or computer, the check digit is recalculated from the data block. The calculated check digit is then compared to the inputted / received check digit - if they don’t match then the data needs to be re-transmitted
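
A sketch of this calculation, assuming the ISBN-13 style scheme: weights alternate 1, 3, 1, 3 from the left, the weighted total modulo 10 is taken away from 10, and a result of 10 is treated as 0. The function names are invented for this example.

```python
def check_digit(digits):
    """Check digit for a string of digits, using alternating weights 1 and 3."""
    total = 0
    for position, char in enumerate(digits):
        weight = 1 if position % 2 == 0 else 3
        total += int(char) * weight
    # The outer % 10 turns a result of 10 into 0
    return (10 - total % 10) % 10


def is_valid(code):
    """Recalculate the check digit and compare it with the last digit."""
    return check_digit(code[:-1]) == int(code[-1])


print(check_digit("978030640615"))   # 7
print(is_valid("9780306406157"))     # True
print(is_valid("9780306406517"))     # False (two digits swapped)
```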

19
Q

Analogue and digital data / signals

A

Analogue data is continuous while digital data is discrete.

Analogue signals vary in a continuous manner: they can take any value and change as frequently as needed. Digital signals vary in a discrete manner, taking only a limited set of values
Computers can’t process analogue signals but can process digital signals.

20
Q

ADC - Analogue to Digital Conversion

A

An ADC works by:
1) Sampling an analogue signal at regular intervals at a specific frequency - this determines the number of samples per second

2) Measuring the amplitude of each sample

3) Each amplitude measurement is quantised - the height of each sample is given an integer value

4) Each integer is encoded as a binary value using a fixed number of bits

5) The ADC then outputs the digital representation of the analogue signal, which can be processed by computers - the PCM signal
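
The steps above can be imitated in software. This is only a sketch: it assumes the signal is a function of time with amplitudes between -1 and 1, and the function and parameter names are invented for this example.

```python
import math


def sample_and_quantise(signal, sample_rate, duration, bits):
    """Sample a signal and return each sample as a fixed-width bit string."""
    levels = 2 ** bits
    codes = []
    for n in range(int(sample_rate * duration)):
        t = n / sample_rate                                  # step 1: regular intervals
        amplitude = signal(t)                                # step 2: measure amplitude
        level = round((amplitude + 1) / 2 * (levels - 1))    # step 3: quantise to an integer
        codes.append(format(level, f"0{bits}b"))             # step 4: encode in binary
    return codes                                             # step 5: the digital output


def sine_wave(t):
    """A 1 Hz sine wave with amplitudes between -1 and 1."""
    return math.sin(2 * math.pi * t)


codes = sample_and_quantise(sine_wave, sample_rate=8, duration=1, bits=3)
print(codes)
```

With 3 bits there are only 8 levels, so the peak of the wave is stored as 111 and the trough as 000, with every other sample rounded to the nearest level.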

21
Q

DAC - Digital to Analogue Conversion

A

A DAC works by:
1) Reading a bit pattern that represents an analogue signal - the PCM signal - and converting it into a PAM signal

2) Sampling the PAM signal at regular intervals, giving the sampled digital values discrete levels and converting these values into analogue voltages

3) Outputting these voltages

The output is an approximation of the original analogue signal since the DAC rounds the discrete values to the nearest value that can be represented - the difference between the original analogue signal and the DAC’s output is called quantisation noise

22
Q

Nyquist’s Theorem

A

Nyquist’s Theorem states that the sampling rate must be at least double the highest frequency in an analogue signal for the signal to be captured accurately

23
Q

Musical Instrument Digital Interface (MIDI)

A

MIDI is a standard adopted by the music industry for controlling devices that emit music

MIDI isn’t music and doesn’t contain any actual sounds - it is nothing more than a set of instructions on how to produce sounds

MIDI data contain a list of events, messages or instructions that tell an electronic device how to generate a certain sound

These event messages specify:
- When to play a note
- What pitch a note is
- Control signals for volume
- Clock signals to set tempo

24
Q

MIDI advantages

A

.MID files are much smaller than audio files like .mp3 or .wav - this is useful for devices with less memory such as mobile phones

.MID files load faster - good for embedding into web pages

MIDI supports a wide variety of musical instruments

25
Q

Bitmaps

A

Bitmap images are represented by a grid made up of blocks of colour called pixels, which each have a binary value assigned to them

Pixel = The smallest addressable part of an image

The resolution of a bitmap image is measured in pixels and is the no. of pixels per row multiplied by the no. of pixels per column

The value assigned to a pixel determines the colour of the pixel

Colour depth = the number of bits assigned to each pixel; the more bits, the more colours that each pixel can take

Bitmaps are used for real world images

26
Q

Metadata

A

Metadata is the additional details of an image and can include:

  • Colour depth
  • When it was taken / processed
  • File size
  • Date and time of when the image was created
  • If the image is compressed or not
  • GPS location of where the image was created
27
Q

Vector images

A

Vector images are represented by shapes and geometric objects such as lines and rectangles
The properties of the objects in a vector image are stored in a list called the drawing list

Vector graphics are used for drawing images such as logos and maps since these can be made up of different shapes and objects

28
Q

The drawing list

A

The drawing list contains all the objects that make up the image and the necessary information about each object
These properties include:

  • Fill colour
  • Line thickness
  • Shading
  • Border thickness
  • Height and width
29
Q

Vectors v Bitmaps

A

Vector images can be scaled without losing quality

Vector images use less storage space compared to bitmap images

Vectors are better for simpler drawn images; bitmaps are much better for real world photographs

Bitmaps can’t be scaled without distortion

30
Q

Memory requirements for sound files

A

File size (bits) = Sampling rate (Hz) x Sampling resolution (bits) x Time period (seconds)
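
A worked example of the formula, assuming one minute of single-channel audio sampled at 44,100 Hz with 16 bits per sample. The function name and figures are chosen for illustration.

```python
def sound_file_size_bits(sample_rate_hz, sample_resolution_bits, seconds):
    """Size of an uncompressed, single-channel sound file in bits."""
    return sample_rate_hz * sample_resolution_bits * seconds


size_bits = sound_file_size_bits(44100, 16, 60)
print(size_bits)        # 42336000
print(size_bits // 8)   # 5292000 (bytes)
```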

31
Q

Memory requirements for bitmap images

A

File size (bits) = Colour depth (bits) x Image resolution (total number of pixels)
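
A worked example, assuming an 800 by 600 pixel image with a colour depth of 24 bits and ignoring metadata. The function names are made up for this sketch.

```python
def bitmap_size_bits(width, height, colour_depth):
    """Size of the pixel data in bits: colour depth x number of pixels."""
    return width * height * colour_depth


def colours_available(colour_depth):
    """Number of different colours that a pixel can take."""
    return 2 ** colour_depth


size_bits = bitmap_size_bits(800, 600, 24)
print(size_bits)               # 11520000
print(size_bits // 8)          # 1440000 (bytes)
print(colours_available(24))   # 16777216
```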

32
Q

Compression

A

Compression is the process of encoding data so that the data is squeezed into a smaller number of bytes than the data would occupy if uncompressed

33
Q

Lossy compression

A

Lossy compression removes data from the original file to reduce the file’s size - this data can’t be restored when the file is decompressed
This could involve reducing the resolution of an image or reducing the sampling resolution of a sound file

There’s no fixed limit to how much the file can be compressed, but quality falls as more data is removed

34
Q

Lossless compression

A

Lossless compression works by identifying and encoding patterns within the data rather than encoding the actual data itself - there is no loss of data and the size of a file can be reduced without affecting its quality

There is a limit on how much the file can be compressed without removing data

35
Q

Run Length Encoding

A

RLE reduces the size of a file by replacing each run of repeated data with a single data value and a count of how many times it is repeated

Not all data is suitable for RLE since some data might not have any repeating parts
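
A minimal sketch of RLE on a string, storing each run as a (value, count) pair. The function names and the pair format are choices made for this example rather than a standard file format.

```python
def rle_encode(data):
    """Replace each run of repeated symbols with a (symbol, count) pair."""
    runs = []
    for symbol in data:
        if runs and runs[-1][0] == symbol:
            runs[-1][1] += 1            # extend the current run
        else:
            runs.append([symbol, 1])    # start a new run
    return [(symbol, count) for symbol, count in runs]


def rle_decode(runs):
    """Rebuild the original data from the (symbol, count) pairs."""
    return "".join(symbol * count for symbol, count in runs)


print(rle_encode("WWWWBBBWW"))   # [('W', 4), ('B', 3), ('W', 2)]
print(rle_encode("ABCD"))        # [('A', 1), ('B', 1), ('C', 1), ('D', 1)]
print(rle_decode(rle_encode("WWWWBBBWW")))   # WWWWBBBWW
```

The second example has no repeats, so the encoded form is larger than the original data.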

36
Q

Dictionary-based methods

A

Dictionary-based methods use algorithms that substitute repeated information in a file with data values that identify the information in a dictionary - this dictionary is added to the file itself

These data values take up less space than the information they replace. However, the dictionary itself has to be stored, which could raise the file size
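
A simplified sketch of the idea, assuming the dictionary holds whole words and each word is replaced by its position in the dictionary. Real dictionary-based algorithms work on byte patterns, and the function names here are invented for this example.

```python
def dictionary_encode(text):
    """Return a dictionary of distinct words and the codes that replace them."""
    dictionary = []
    codes = []
    for word in text.split(" "):
        if word not in dictionary:
            dictionary.append(word)
        codes.append(dictionary.index(word))
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Look each code up in the dictionary to rebuild the text."""
    return " ".join(dictionary[code] for code in codes)


dictionary, codes = dictionary_encode("the cat sat on the mat the end")
print(dictionary)   # ['the', 'cat', 'sat', 'on', 'mat', 'end']
print(codes)        # [0, 1, 2, 3, 0, 4, 0, 5]
print(dictionary_decode(dictionary, codes))   # the cat sat on the mat the end
```

Both the dictionary and the codes have to be stored, so the saving only appears when words repeat often enough.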