1.3: Exchanging Data Flashcards
What is compression?
Compression is the process used to reduce the storage space required by a file, meaning you can store more files with the same amount of storage space.
Why is compression important?
Compression is particularly important for sharing files over networks or the Internet. The larger a file, the longer it takes to transfer and so compressing files increases the number of files that can be transferred in a given time.
For example, apps like Google Photos compress files so that they can be searched for and downloaded quickly. Downloading a compressed file over the Internet is faster than downloading the full version of the file.
Benefits of compression?
- Data is sent more quickly
- Less bandwidth is used, which matters where transfer limits apply
- Buffering on audio and video streams is less likely to occur
- Less storage is required
What is lossy compression?
As the name suggests, lossy compression reduces the size of a file by permanently removing data deemed non-essential. This could result in a more pixelated image or less clear audio recording.
Examples of lossy file types?
.JPG, .MP3, .MP4
What is lossless compression?
Lossless compression reduces the size of a file without losing any information by spotting and summarising patterns in the data.
Examples of lossless file types?
.zip, .png
Difference between lossy and lossless?
When using lossless compression, the original file can be recovered exactly from the compressed version. This is not possible with lossy compression, which reduces the size of the file by permanently discarding some information.
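The lossless round trip can be sketched with Python's standard `zlib` module. The sample data is made up for illustration; the point is only that decompressing gives back exactly the original bytes.

```python
import zlib

# Highly repetitive sample data (made up for this example).
original = b"AAAAAABBBBBCCC" * 100

compressed = zlib.compress(original)    # lossless compression
restored = zlib.decompress(compressed)  # reverse the compression

print(len(original), len(compressed))   # the compressed version is much smaller
print(restored == original)             # True: nothing was lost
```

A lossy format has no equivalent step that returns the original, because the discarded data is not stored anywhere.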
How can audio files be compressed using lossy compression?
For example, audio files can be compressed lossily by removing the very high or very low frequencies which are least noticeable to the ear and by removing quiet sounds that are overlapped with louder sounds. There’s no way to go from the lossy version of the recording back to the full version as there’s no record of what the high and low frequencies were.
What is RLE?
Run length encoding (RLE) is a method of lossless compression in which each run of consecutive identical values is replaced with one occurrence of the value followed by the number of times it should be repeated.
For example, the string AAAAAABBBBBCCC could be represented as A6B5C3.
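A minimal sketch of this scheme in Python is shown here. The function names `rle_encode` and `rle_decode` are chosen for illustration, and the sketch assumes the data contains no digits, since digits are used to store the counts.

```python
from itertools import groupby

def rle_encode(text: str) -> str:
    """Replace each run of identical characters with the character and its count."""
    return "".join(f"{char}{len(list(run))}" for char, run in groupby(text))

def rle_decode(encoded: str) -> str:
    """Rebuild the original text from character/count pairs (assumes no digits in the data)."""
    result = []
    i = 0
    while i < len(encoded):
        char = encoded[i]
        i += 1
        digits = ""
        # Read every digit that follows the character; counts may exceed 9.
        while i < len(encoded) and encoded[i].isdigit():
            digits += encoded[i]
            i += 1
        result.append(char * int(digits))
    return "".join(result)

print(rle_encode("AAAAAABBBBBCCC"))  # A6B5C3
print(rle_decode("A6B5C3"))          # AAAAAABBBBBCCC
```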
When does RLE work well?
In order to work well, run length encoding relies on consecutive pieces of data being the same - if there’s little repetition, run length encoding doesn’t offer a great reduction in file size. Image and sound data often have a lot of repetition.
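The effect of repetition can be seen by encoding two made-up strings of the same length, one with long runs and one with none. The encoder is a simple character-plus-count sketch, not a production format.

```python
from itertools import groupby

def rle_encode(text: str) -> str:
    """Character-plus-count run length encoding (illustrative sketch)."""
    return "".join(f"{char}{len(list(run))}" for char, run in groupby(text))

repetitive = "AAAAAABBBBBCCC"  # long runs of the same character
varied = "ABCABCABCABCAB"      # no two neighbouring characters match

for sample in (repetitive, varied):
    encoded = rle_encode(sample)
    print(sample, "->", encoded, f"({len(sample)} -> {len(encoded)} characters)")
```

With this scheme the repetitive string shrinks from 14 characters to 6, while the varied string grows from 14 to 28 because every character needs its own count.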
What is dictionary encoding?
Dictionary encoding is another method of lossless compression. Frequently occurring pieces of data are replaced with an index, and the compressed data is stored alongside a dictionary which matches each frequently occurring piece of data to its index. The original data can then be restored using the dictionary. The dictionary adds overhead, but for data with enough repetition the space saved usually outweighs it.
Can dictionary compressed data be used without the dictionary?
No - It’s important to remember that data compressed using dictionary compression must be transferred alongside its dictionary. Without a dictionary, the data cannot be used.
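A simplified sketch of dictionary encoding follows. As an assumption made to keep it short, every distinct word gets an index rather than only the frequent ones, and the function names are hypothetical.

```python
def dictionary_encode(text):
    """Replace each word with its index in a dictionary of distinct words."""
    dictionary = []  # index -> word
    index_of = {}    # word -> index
    indices = []
    for word in text.split(" "):
        if word not in index_of:
            index_of[word] = len(dictionary)
            dictionary.append(word)
        indices.append(index_of[word])
    return indices, dictionary

def dictionary_decode(indices, dictionary):
    """Restore the text; this is impossible without the dictionary."""
    return " ".join(dictionary[i] for i in indices)

message = "the cat sat on the mat and the cat slept"
indices, dictionary = dictionary_encode(message)

print(indices)     # [0, 1, 2, 3, 0, 4, 5, 0, 1, 6]
print(dictionary)  # ['the', 'cat', 'sat', 'on', 'mat', 'and', 'slept']
print(dictionary_decode(indices, dictionary) == message)  # True
```

The list of indices on its own is meaningless, which is why the dictionary has to travel with the compressed data.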
What is encryption?
Encryption is a way of making sure data cannot be understood by anyone who does not possess the means to decrypt it.
What is encryption used for?
Encryption is used to keep data secure when it’s being transmitted. There are a variety of different methods which can be used to scramble data before it’s transmitted and then decipher it once it arrives at its destination.
What are two methods of encryption?
- Caesar cipher
- Vernam cipher
Caesar cipher?
The Caesar cipher is the most basic type of encryption and one of the least secure.
Each letter of the alphabet is shifted by the same fixed amount, which acts as the key.
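The shift can be sketched in Python as follows. The function name `caesar_shift` is illustrative, and the sketch assumes upper-case English letters, leaving spaces and other characters unchanged.

```python
import string

def caesar_shift(text: str, shift: int) -> str:
    """Shift every letter by `shift` places, wrapping round from Z to A."""
    alphabet = string.ascii_uppercase
    offset = shift % 26
    shifted = alphabet[offset:] + alphabet[:offset]
    table = str.maketrans(alphabet, shifted)
    return text.upper().translate(table)

ciphertext = caesar_shift("HELLO WORLD", 3)
print(ciphertext)                    # KHOOR ZRUOG
print(caesar_shift(ciphertext, -3))  # HELLO WORLD
```

Decryption is the same operation with the shift reversed.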
Methods of cracking a Caesar cipher?
- Brute force attack
A brute force attack attempts to apply every possible key to decrypt the ciphertext until one works. A Caesar cipher has only 25 usable shifts, so this is quick.
- Frequency analysis
Frequency analysis consists of counting the occurrences of each letter in a text. It relies on the fact that, in any given language, certain letters and combinations of letters occur more often than others. Matching the most common ciphertext letters to the most common letters of the language helps decrypt some of the letters in the text.
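Both attacks can be sketched against a Caesar cipher. The message is made up, `caesar_shift` is an illustrative helper, and the frequency step assumes the most common plaintext letter is E, which is typical of English but can fail on short texts.

```python
import string
from collections import Counter

def caesar_shift(text: str, shift: int) -> str:
    """Shift every upper-case letter by `shift` places, wrapping round."""
    alphabet = string.ascii_uppercase
    offset = shift % 26
    table = str.maketrans(alphabet, alphabet[offset:] + alphabet[:offset])
    return text.upper().translate(table)

plaintext = "MEET ME HERE AT SEVEN"
ciphertext = caesar_shift(plaintext, 3)

# Brute force: try every key and look for the one that produces readable text.
candidates = {key: caesar_shift(ciphertext, -key) for key in range(26)}
for key, attempt in candidates.items():
    print(key, attempt)

# Frequency analysis: assume the most common ciphertext letter stands for E.
letters = [c for c in ciphertext if c.isalpha()]
most_common_letter, count = Counter(letters).most_common(1)[0]
guessed_key = (ord(most_common_letter) - ord("E")) % 26

print(most_common_letter, guessed_key)        # H 3
print(caesar_shift(ciphertext, -guessed_key)) # MEET ME HERE AT SEVEN
```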
Vernam cipher (+what should the key be)?
The Vernam cipher is the only cipher proven to be unbreakable, provided its encryption key, known as the one-time pad, meets the conditions below.
The key must be:
- A truly random sequence at least as long as the plaintext, and only ever used once
- Shared with the recipient by hand, independently of the message, and destroyed immediately after use
How are messages encrypted and decrypted? (Vernam)
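Encryption and decryption of a message are performed bit by bit using an exclusive or (XOR) operation with the shared key. XOR outputs 1 only when exactly one of the two input bits is 1; if both bits are 1 (or both are 0), the output is 0. Applying XOR with the same key a second time restores the original bit, which is why the same operation both encrypts and decrypts.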
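A minimal sketch of the XOR step in Python is given here. The helper name `xor_bytes` is hypothetical, and the key comes from the operating system's random source via `secrets` purely as a stand-in: it is not the truly random, hand-delivered pad the Vernam cipher requires.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the matching byte of the key."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

# XOR truth table: the output is 1 only when exactly one input is 1.
for a in (0, 1):
    for b in (0, 1):
        print(a, "XOR", b, "=", a ^ b)

plaintext = b"HELLO"
key = secrets.token_bytes(len(plaintext))  # stand-in for a one-time pad (assumption)

ciphertext = xor_bytes(plaintext, key)     # encrypt
recovered = xor_bytes(ciphertext, key)     # decrypt with the same key

print(recovered == plaintext)              # True
```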
Sources where the one-time pad can be generated from?
Sources may include: atmospheric noise, radioactive decay, the movements of a mouse or snapshots of a lava lamp.
Why is the one-time pad so effective?
A truly random key makes frequency analysis useless, because the resulting ciphertext has a uniform distribution of characters.
Why can you not use computer generated random sequences for the one-time pad?
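Computer generated ‘random’ sequences are not actually random: they are produced by a deterministic algorithm, so the same starting value (seed) always gives the same sequence.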
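This can be demonstrated with Python's standard `random` module, which is a pseudorandom generator. The seed value below is arbitrary; anyone who learns the seed can regenerate the whole sequence.

```python
import random

random.seed(2024)  # arbitrary seed chosen for illustration
first = [random.randint(0, 255) for _ in range(5)]

random.seed(2024)  # same seed again
second = [random.randint(0, 255) for _ in range(5)]

print(first)
print(second)
print(first == second)  # True: the "random" sequence is fully repeatable
```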
What are the things needed to crack a cipher?
Given enough ciphertext, computer power and time, any key (except the one-time pad) can be determined and the message cracked.