Data Rep - Coding Systems Flashcards

1
Q

ASCII Full Form

A

American Standard Code for Information Interchange

2
Q

Limitations of ASCII

A
  • Standard ASCII uses 7 bits, giving only 128 characters (256 in 8-bit extended ASCII), which is not sufficient to represent all possible characters, numbers and symbols.
  • Because it was designed for English, it cannot represent the characters of most other languages.
3
Q

What are the two common encodings of Unicode?

A
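  • UTF-8, a variable-width encoding that uses 1 to 4 bytes per character (its first 128 code points match ASCII).
  • UTF-16, a variable-width encoding that uses 2 or 4 bytes per character.
  • Both can represent every Unicode character; they differ only in how many bytes each character takes.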
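A minimal Python sketch of how many bytes each encoding needs for a given character. The sample characters and the helper name `encoded_sizes` are chosen only for illustration.

```python
def encoded_sizes(ch):
    """Return (UTF-8 byte count, UTF-16 byte count) for one character."""
    # "utf-16-le" is used so the count excludes the 2-byte byte-order mark
    # that Python's plain "utf-16" codec prepends.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))

# Illustrative samples: a Latin letter, an accented letter,
# a currency symbol and an emoji.
for ch in ["A", "é", "€", "😀"]:
    print(ch, encoded_sizes(ch))
```

Characters from the ASCII range take a single byte in UTF-8, while characters outside the first 65,536 code points take 4 bytes in both encodings.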
4
Q

What is a parity bit?

A
  • Standard ASCII characters use only 7 bits.
  • The MSB (the 8th bit) can be used as a parity bit.
  • This is a method of detecting errors during data transmission.
  • However, it does not identify all errors: if an even number of bits are flipped, the parity still matches and the error goes undetected.
5
Q

What is one cause for the corruption of data being transmitted?

A
  • Data is sent on carrier waves, and interference or slight variations in the signal can mean that a 0 is misinterpreted as a 1 (or vice versa).
  • Depending on the nature of the data, even a single flipped bit could be critical and make the data unusable.
6
Q

How do parity bits work?

A
  • Counts the number of 1s in each byte before sending, to check whether it’s even/odd.
  • Even parity: The number of 1s is counted, and the parity bit is either 1 or 0, in order to make the total 1s even.
  • Odd parity: The parity bit is either 1 or 0, to ensure the overall number of 1s in the byte is odd.
  • The received data is then checked, and if the no. of 1s is still even/odd, the data is assumed to have been received correctly.
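The even-parity case can be sketched in Python as follows. Storing bits as strings and the function names `add_even_parity` and `check_even_parity` are assumptions made for readability, not part of any standard.

```python
def add_even_parity(seven_bits):
    """Prefix a 7-bit string with a parity bit so the total number of 1s is even."""
    parity = seven_bits.count("1") % 2  # 1 only if the count of 1s is currently odd
    return str(parity) + seven_bits

def check_even_parity(byte):
    """Return True if the received 8-bit string has an even number of 1s."""
    return byte.count("1") % 2 == 0

sent = add_even_parity("1000001")     # 7-bit ASCII code for 'A'
print(sent)                           # 01000001
print(check_even_parity(sent))        # True: assumed to be received correctly
print(check_even_parity("01000011"))  # False: one flipped bit is detected
print(check_even_parity("01000111"))  # True: two flipped bits go unnoticed
```

The last line shows the limitation of a single parity bit: an even number of flipped bits leaves the parity unchanged.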
7
Q

Majority Voting

A
  • A method of identifying errors in transmission.
  • Sends the same bit multiple times (must be an odd number).
  • If the repeated bits aren’t the same, then majority voting checks which bit occurs most frequently and assumes this to be the correct bit.
  • For example, 000 111 110 010 would be decoded as 0110, where majority voting is needed to resolve the last 2 bits.
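A minimal Python sketch of the decoding step, assuming each bit is sent three times and the groups arrive separated by spaces (both are assumptions for illustration; `majority_decode` is a hypothetical helper name).

```python
def majority_decode(received, repeats=3):
    """Decode space-separated groups of repeated bits by majority vote."""
    decoded = []
    for group in received.split():
        if len(group) != repeats:
            raise ValueError(f"expected {repeats} bits per group, got {group!r}")
        # With an odd number of repeats there is always a clear majority.
        decoded.append("1" if group.count("1") > repeats // 2 else "0")
    return "".join(decoded)

print(majority_decode("000 111 110 010"))  # 0110
```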
8
Q

Advantage and disadvantage of Majority Voting

A
  • The data does not have to be requested again, as majority voting decides the most likely correct bit.
  • The volume of bits being transmitted is much larger, increasing time for data transmission.
9
Q

Check Digits

A
  • A form of redundancy check used for error detection on identification numbers, such as an ISBN-10 number used on books.
  • This uses a process called modulo-11:
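  • Multiplies each digit of the original code by a weight (2, 3, 4… starting from the rightmost digit), adds the products, divides the total by 11, and subtracts the remainder from 11. The result is the check digit.
  • The number 23045 becomes 230456: (5 x 2) + (4 x 3) + (0 x 4) + (3 x 5) + (2 x 6) = 49, 49 / 11 leaves a remainder of 5, and 11 - 5 = 6.
  • To verify, the check digit (with a weight of 1) is added to the weighted sum of the other digits, and the total is divided by 11. If there is no remainder, the code is accepted as correct.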
  • If not, the data is resent.
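The modulo-11 calculation can be sketched in Python as below. The function names are hypothetical. Writing a check value of 10 as "X" follows the ISBN-10 convention, and treating a remainder of 0 as check digit 0 is also an assumption the card does not spell out.

```python
def mod11_check_digit(code):
    """Return the modulo-11 check digit for a string of digits."""
    digits = [int(c) for c in code]
    # Weights 2, 3, 4, ... are applied starting from the rightmost digit.
    total = sum(d * w for w, d in enumerate(reversed(digits), start=2))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)

def mod11_is_valid(full_code):
    """Return True if a code (including its check digit) passes the check."""
    values = [10 if c == "X" else int(c) for c in full_code]
    # The check digit has weight 1, the digit to its left weight 2, and so on.
    total = sum(v * w for w, v in enumerate(reversed(values), start=1))
    return total % 11 == 0

print(mod11_check_digit("23045"))  # 6
print(mod11_is_valid("230456"))    # True
print(mod11_is_valid("230465"))    # False: two digits were swapped
```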
10
Q

Colour Depth

A
  • The amount of memory allocated to each pixel, in bits.
  • If 24 bits were allocated to each pixel, it would give you 2^24 combinations or 16,777,216 different colours.
  • 24 bits is the most common bit depth, with 8 bits allocated to each primary colour.
11
Q

How do you calculate the total storage used by a bitmap image?

A
  • Resolution (width x height, in pixels) x Bit Depth (answer in bits)
  • 1920 x 1080 x 24 = 49,766,400 bits
  • 49,766,400/8 = 6,220,800 bytes (B)
  • 6,220,800/1,000,000 = 6.2208 megabytes (MB)
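The same calculation as a short Python sketch. The function name is hypothetical, and the result ignores any metadata or compression, as the worked example does.

```python
def bitmap_size_bits(width, height, bit_depth):
    """Storage for the raw pixel data of a bitmap, in bits."""
    return width * height * bit_depth

bits = bitmap_size_bits(1920, 1080, 24)
size_bytes = bits // 8
print(bits)                    # 49766400
print(size_bytes)              # 6220800
print(size_bytes / 1_000_000)  # size in megabytes (1 MB = 1,000,000 bytes)
```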
12
Q

State what metadata is, and include 4 examples

A
  • Data about data: information that describes a file, such as an image.
  • Includes information such as:
  • File type (png, jpeg…)
  • Resolution
  • Colour Depth
  • Location picture was taken
  • Date and time picture was taken
13
Q

State what an ADC is and describe how it works

A
  • Analogue to digital converter, converts analogue signals to digital bit patterns.
  • Measures the amplitude of an analogue sound signal at regular intervals, and records each value as a bit pattern. This is called sampling, and the number of samples taken per second is the sampling rate.
14
Q

DAC

A
  • Digital to analogue converter, converts digital bit patterns which represent sound into analogue signals.
  • Reads a bit pattern representing an analogue signal, and outputs it as a continuously varying (analogue) electrical signal.
15
Q

What is a MIDI? How does it ‘record’ sound?

A
  • Musical Instrument Digital Interface, a standard (protocol) for exchanging performance data between electronic musical instruments and computers.
  • Stores sound as a series of ‘event messages’.
  • Each event message is a series of instructions used to recreate a piece of music. They contain information such as:
  • The duration of a note.
  • The instrument with which a note is played.
  • How loud a note is.
  • If a note should be sustained.
16
Q

Analogue vs Digital data

A
  • Analogue data is continuous, meaning it can take any value at any point in time.
  • Digital data is discrete, so it can only take a set value at any given time.
  • Analogue data can change value as frequently as required, whereas digital data can only change at specified time intervals.
17
Q

How do computers represent sound?

A
  • As a sequence of sound samples, each of which is a discrete value stored as a bit pattern.
18
Q

Define sampling resolution, and the drawback to a high sampling resolution

A
  • The number of bits allocated to each sample.
  • A higher sampling resolution results in higher-quality audio, but an increased file size.
19
Q

Calculating the size of a sound file (including the units of each variable)

A
  • Duration of Recording (seconds) x Sampling Rate (hertz) x Sample Resolution (bits)
  • A minute long, 44 kHz audio file with a sample resolution of 24 bits:

60 x 44,000 x 24 = 63,360,000 bits
63,360,000 / 8 = 7,920,000 bytes = 7.92 MB

  • Additional metadata can add to this total file size.
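A short Python sketch of the formula above. The function name is hypothetical, and the calculation assumes a single (mono) channel with no metadata or compression, as in the worked example.

```python
def sound_size_bits(duration_s, sample_rate_hz, resolution_bits):
    """Storage for uncompressed single-channel audio, in bits."""
    return duration_s * sample_rate_hz * resolution_bits

bits = sound_size_bits(60, 44_000, 24)
size_bytes = bits // 8
print(bits)        # 63360000
print(size_bytes)  # 7920000
```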
20
Q

The Nyquist Theorem, and its implication on the average sampling rate for audio

A
  • The sampling rate of a digital audio file must be at least twice the highest frequency in the sound, in order to accurately represent the sound.
  • Hence, we often use about 44 kHz (44.1 kHz for CD audio) to represent sounds, as it is just above twice the upper limit of human hearing, roughly 20 kHz.
21
Q

Advantages and Disadvantages of MIDI

A
  • It allows for easy manipulation of music (e.g. the duration of a note can be altered), without a loss of quality.
  • MIDI files are smaller in size than sampled audio files, and are lossless.
  • MIDI cannot be used for storing speech, and sometimes results in a less realistic sound than sampled recordings.
22
Q

Lossless Compression

A
  • A data compression technique that employs an algorithm to reduce the size of a file without permanently discarding any of the original data.
  • The original data can be perfectly reconstructed from the compressed data.
23
Q

Lossy Compression

A
  • A data compression technique that permanently discards non-essential data from a file, leading to a decrease in the accuracy of the data but a significant decrease in file size.
  • Data removed from the original file is non-recoverable.
24
Q

Why do we use data compression techniques?

A
  • To reduce file sizes.
  • This reduces the storage requirements of a file.
  • This makes it quicker to transmit the data in these files.
25
Q

Lossless compression technique for images

A

Run-length encoding

26
Q

Lossless compression technique for text files

A

Dictionary-based compression

27
Q

Explain run-length encoding

A
  • RLE reduces the size of a file by representing repeating patterns of information with a single occurrence of the information, followed by the number of times it occurs.
  • For example, if the same colour occurs in 50 adjacent pixels in a bitmapped image, RLE would state the pixel colour once, and the number of times it repeats (50), saving space as fewer bits are used.
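A minimal Python sketch of run-length encoding on a row of pixels. Representing colours as single letters and runs as (value, count) pairs is an assumption for illustration; real image formats pack runs differently.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of identical values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]

row = ["W"] * 50 + ["B"] * 3          # 50 white pixels followed by 3 black
encoded = rle_encode(row)
print(encoded)                         # [('W', 50), ('B', 3)]
print(rle_decode(encoded) == row)      # True: no data is lost

print(rle_encode(["A", "B", "A", "B"]))  # four pairs for four values: no saving
```

The last line shows data with no repeated runs, where the encoded form is no shorter than the original.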
28
Q

Disadvantage of RLE

A
  • Not all data is suitable for run length encoding. If the data does not hold many repeated patterns of data, RLE would not be very efficient.
29
Q

Explain dictionary-based methods of lossless compression

A
  • A dictionary is built for the file and stored alongside (appended to) the compressed data.
  • The dictionary records common occurrences of strings, for example ‘ion’ is a very common part of many words.
  • Any occurrences of these strings can then be easily represented by a smaller code, allocated to a value in the dictionary.
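A simplified Python sketch of the idea. It builds the dictionary from whole words rather than common substrings such as ‘ion’, which is a deliberate simplification; the function names are hypothetical and practical dictionary coders (for example the LZ family) work differently in detail.

```python
def dict_compress(text):
    """Replace each word with its index in a dictionary of unique words."""
    dictionary = []
    codes = []
    for word in text.split(" "):
        if word not in dictionary:
            dictionary.append(word)
        codes.append(dictionary.index(word))
    return dictionary, codes

def dict_decompress(dictionary, codes):
    """Rebuild the original text by looking each code up in the dictionary."""
    return " ".join(dictionary[code] for code in codes)

text = "the cat and the dog and the bird"
dictionary, codes = dict_compress(text)
print(dictionary)  # ['the', 'cat', 'and', 'dog', 'bird']
print(codes)       # [0, 1, 2, 0, 3, 2, 0, 4]
print(dict_decompress(dictionary, codes) == text)  # True
```

The dictionary must be kept with the codes, otherwise the text cannot be reconstructed.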
30
Q

Advantage and disadvantage of dictionary-based compression

A
  • It can result in a significant reduction in file size, depending on the nature of the data.
  • The dictionary has to be appended to the file in order to reconstruct the original text, increasing the file size.
  • Hence dictionary-based compression is not suitable for files with few repeating strings of text.
31
Q

Advantages of lossy compression and of lossless compression

A
  • In lossy compression, a file can be compressed much further than with lossless compression, as more data can be removed as necessary (at the cost of quality).
  • Lossless compression results in no loss of information/quality.
32
Q

Lossy and lossless file formats

A

Lossy formats:
- JPEG
- MP3

Lossless formats:
- PNG
- ZIP
- GIF
- WAV (typically stores uncompressed audio, so no data is lost)

33
Q

Why is lossy compression (e.g. JPEG) used if it results in a permanent loss in file quality?

A
  • Small changes in the quality of a file (such as an image or audio file) are often indistinguishable to the human eye and ear.
34
Q

When is lossy compression unviable?

A
  • In important text files, where all text is necessary and no detail can be removed from the contents of the file.
  • The file could become unusable and redundant.
35
Q

Encryption definition

A
  • The process of converting the original text (plaintext) into a form which cannot be understood by unauthorised users (ciphertext) using an encryption algorithm (cipher).
36
Q

Caesar Cipher

A
  • A substitution cipher in which each letter of plaintext is substituted for another, which is a fixed number of letters ahead/behind in the alphabet.
  • The new string of substituted letters then becomes the ciphertext, as it cannot be understood without knowing how to decrypt it first.
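A minimal Python sketch of a Caesar cipher, assuming only the 26 unaccented Latin letters are shifted and everything else is left unchanged. The function name `caesar` is a hypothetical choice.

```python
def caesar(text, shift):
    """Shift each letter by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text:
        if "A" <= ch <= "Z":
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        elif "a" <= ch <= "z":
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            result.append(ch)  # spaces, digits and punctuation are not encrypted
    return "".join(result)

ciphertext = caesar("HELLO", 3)
print(ciphertext)              # KHOOR
print(caesar(ciphertext, -3))  # HELLO: decrypting is a shift in the other direction
```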
37
Q

Vernam Cipher definition

A
  • A cipher that uses a one-time pad (a secret random encryption key) to convert each character to ciphertext, by combining it with the corresponding character of the key using bitwise XOR (addition modulo 2). When the pad is used correctly, the ciphertext cannot be decrypted without the key.
38
Q

Disadvantage of Caesar Ciphers

A
  • They can be very easily cracked.
  • A Caesar cipher has only 25 possible shifts, so it can be brute forced by trying each one.
  • The frequency at which characters occur can provide a clue as to which letter is replaced with which.
  • Once a single character is cracked, the entire cipher can be cracked, as the key can be found.
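A short Python sketch of brute forcing a Caesar cipher by trying all 25 shifts. It assumes an upper-case message; the helper names are hypothetical, and a person (or a word list) still has to pick out the readable candidate.

```python
def shift_letters(text, shift):
    """Shift upper-case letters by `shift` places; leave other characters alone."""
    return "".join(
        chr((ord(ch) - ord("A") + shift) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in text
    )

def brute_force(ciphertext):
    """Return every candidate plaintext, keyed by the shift that was undone."""
    return {shift: shift_letters(ciphertext, -shift) for shift in range(1, 26)}

candidates = brute_force("KHOOR")
print(len(candidates))  # 25
print(candidates[3])    # HELLO
```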
39
Q

How does a Vernam Cipher work?

A
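  • The characters of the plaintext and the one-time pad are aligned.
  • Each character is converted into binary using an information coding system (such as ASCII).
  • Each pair of corresponding bits is combined using XOR to give the ciphertext; applying XOR with the same key again recovers the plaintext.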
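A Vernam cipher is usually implemented with bitwise XOR, which can be sketched in Python as below. The function name `vernam` is hypothetical, and `secrets.token_bytes` is used here only as a stand-in key source: it is a cryptographically strong pseudo-random generator, not the truly random pad the cipher formally requires.

```python
import secrets

def vernam(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the corresponding byte of the key."""
    if len(key) < len(data):
        raise ValueError("the key must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = "HELLO".encode("ascii")          # characters to binary via ASCII
key = secrets.token_bytes(len(plaintext))    # stand-in for a one-time pad
ciphertext = vernam(plaintext, key)

# XOR is its own inverse, so the same function decrypts.
print(vernam(ciphertext, key).decode("ascii"))  # HELLO
```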
40
Q

What is a MIDI file?

A
  • A MIDI file is not the same as a digital recording of a live source.
  • It is simply a “list of instructions” on how to recreate sound.
  • The sounds used during playback are often pre-recorded digital samples of real instruments.
  • As a result, a MIDI file uses up far less disk space than a traditional digital recording.
41
Q

What are the conditions for the Vernam Cipher to offer perfect security?

A
  • The encryption key (one-time pad) is equal to or longer than the plaintext message.
  • The key is truly random.
  • The key is used only once then destroyed.
  • The key is shared securely, for example by exchanging it in person.
  • Hence, this cipher does not rely on computational security: when these conditions are met, it cannot be cracked, however much computing power is available.
42
Q

What is the concept of computational security/hardness?

A
  • A cipher that is computationally secure is theoretically breakable, but cannot be broken with current technology within a usable timeframe.
  • Most encryption can be theoretically cracked, but in practice it is secure enough to withstand most threats.
43
Q

Brute force decryptions

A
  • Tries every possible key (or combination of characters) in turn, checking each decryption attempt until intelligible text is produced. This can take a very long time when the number of possible keys is large.
44
Q

Dictionary attacks

A
  • Using a dictionary with common words/phrases, to see if any words in each decryption attempt match.
45
Q

Reverse engineering

A
  • The process of going back step by step until you work out how something has been put together.
46
Q

Cracking encryption algorithms - Identifying commonly used techniques

A
  • Many ciphers are based on substitution or transpositions.
  • Experienced cryptographers are able to recognise patterns in data that has been encrypted using these methods.