3.1 Compression, Encryption & Hashing Flashcards
What is compression?
The process of reducing the storage space a file requires, either by encoding its data more efficiently or by removing unnecessary or insignificant detail.
What are the 2 main advantages of compression?
Increases the number of files that can be stored in the same amount of storage space
Increases the number of files that can be transferred in a given amount of time
What are the 2 types of compression?
lossy
lossless
What is the difference between lossy/lossless compression?
Lossy compression reduces the size of a file by permanently removing some information, whereas lossless compression reduces file size without removing any information.
What are the advantages/disadvantages of lossy compression?
advantages -
- reduces file size more than lossless
disadvantages -
- original file cannot be restored
- some quality may be lost
What are the advantages/disadvantages of lossless compression?
advantages -
- no quality lost
- original file can be restored
disadvantages -
- reduces file size less than lossy
What are 2 methods of lossless compression?
run length encoding
dictionary encoding
What is run length encoding?
A method of lossless compression in which each run of repeated values is replaced by one occurrence of the value followed by the number of times it repeats.
What would happen to AAAAAABBBBBCCC with RLE?
It would become A6B5C3
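The example above can be reproduced with a short sketch. The function names rle_encode and rle_decode are illustrative only, and the decoder assumes the original data contains no digits.

```python
from itertools import groupby

def rle_encode(text: str) -> str:
    """Replace each run of a repeated character with the character and its run length."""
    return "".join(f"{char}{len(list(run))}" for char, run in groupby(text))

def rle_decode(encoded: str) -> str:
    """Rebuild the original text. Assumes the original data contains no digits."""
    result = []
    i = 0
    while i < len(encoded):
        char = encoded[i]
        i += 1
        digits = ""
        # Collect every digit that follows the character (counts may exceed 9).
        while i < len(encoded) and encoded[i].isdigit():
            digits += encoded[i]
            i += 1
        result.append(char * int(digits))
    return "".join(result)

print(rle_encode("AAAAAABBBBBCCC"))  # A6B5C3
print(rle_decode("A6B5C3"))          # AAAAAABBBBBCCC
print(rle_encode("ABC"))             # A1B1C1
```

The last call shows that data with no repeated runs is not reduced at all: here the encoded form is longer than the input.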
What is dictionary encoding?
A method of lossless compression in which frequently occurring pieces of data are replaced with an index. Each piece of data is matched to its index in a dictionary.
What is the disadvantage of run length encoding?
Relies on consecutive pieces of data being the same (i.e. lots of repetition); otherwise there is little or no reduction in file size.
What is the disadvantage of dictionary encoding?
If the data is transferred, its dictionary must be transferred alongside it for the original file to be obtained.
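Dictionary encoding can be sketched as follows, assuming the "pieces of data" are words separated by single spaces. The names dict_encode and dict_decode are hypothetical, and real schemes choose their entries differently.

```python
def dict_encode(text: str):
    """Replace each word with its index in a dictionary built from the text."""
    dictionary = []   # index -> word
    index_of = {}     # word -> index, for quick lookup while encoding
    encoded = []
    for word in text.split(" "):
        if word not in index_of:
            index_of[word] = len(dictionary)
            dictionary.append(word)
        encoded.append(index_of[word])
    return encoded, dictionary

def dict_decode(encoded, dictionary) -> str:
    """Look each index up in the dictionary to rebuild the original text."""
    return " ".join(dictionary[i] for i in encoded)

message = "the cat sat on the mat the cat ran"
encoded, dictionary = dict_encode(message)
print(encoded)     # [0, 1, 2, 3, 0, 4, 0, 1, 5]
print(dictionary)  # ['the', 'cat', 'sat', 'on', 'mat', 'ran']
print(dict_decode(encoded, dictionary) == message)  # True
```

Decoding needs both return values, which is why the dictionary has to travel with the encoded data.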
What is encryption?
Scrambling data with a specific cipher algorithm so that it cannot be understood unless it is converted back to the original data with the correct decryption key/algorithm.
What are 2 types of encryption?
symmetric
asymmetric
How does symmetric encryption work?
The sender and receiver share the same secret key, which is distributed between them in a key exchange. This one key is used both to encrypt and to decrypt the data.
What is the problem with symmetric encryption?
It requires the sender to transfer the key to the receiver. If the key is intercepted alongside the data, the interceptor can decrypt the data and read it.
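A toy illustration of the single shared key, using a repeating-key XOR cipher. This cipher is an assumption chosen for brevity and is not secure; the name xor_cipher and the key value are made up for the example.

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte of the data with the repeating key.
    Applying the same key a second time reverses the operation."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

shared_key = b"secret"  # both sender and receiver must hold this key

ciphertext = xor_cipher(b"meet at noon", shared_key)   # sender encrypts
plaintext = xor_cipher(ciphertext, shared_key)         # receiver decrypts

print(plaintext)                          # b'meet at noon'
print(xor_cipher(ciphertext, b"wrong!"))  # unreadable without the right key
```

Anyone who intercepts shared_key can run the same decryption step, which is the weakness described above.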
How does asymmetric encryption work?
The receiver has a public key that can be used to encrypt data and a private key that can be used to decrypt data. The sender uses the receiver's public key to encrypt data before sending it, but only the receiver has the private key needed to decrypt it.
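A minimal sketch of the public/private key idea using RSA with toy-sized primes. RSA is one possible asymmetric scheme (an assumption here, the flashcards do not name one), the numbers are far too small to be secure, and the three-argument pow with a negative exponent needs Python 3.8 or later.

```python
# Receiver generates a key pair (toy-sized primes, for illustration only).
p, q = 61, 53
n = p * q                   # 3233, shared by both keys
phi = (p - 1) * (q - 1)     # 3120
e = 17                      # public exponent
d = pow(e, -1, phi)         # private exponent: 2753

public_key = (e, n)         # published for anyone to use
private_key = (d, n)        # kept secret by the receiver

def encrypt(message: int, key) -> int:
    exponent, modulus = key
    return pow(message, exponent, modulus)

def decrypt(cipher: int, key) -> int:
    exponent, modulus = key
    return pow(cipher, exponent, modulus)

cipher = encrypt(65, public_key)     # sender uses the receiver's public key
print(cipher)                        # 2790
print(decrypt(cipher, private_key))  # 65
```

Knowing public_key is enough to encrypt, but only the holder of private_key can reverse it.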
How can asymmetric encryption be used to prove that you sent a message? (digital signatures)
If you encrypt a message with your private key, then anybody can decrypt that message with your public key. Since only you hold the private key, this proves that you sent it.
What is hashing?
When an irreversible (one-way) hash function/algorithm is applied to a key/input to produce a fixed-size value known as a hash.
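The "fixed size" property can be seen with Python's hashlib. SHA-256 is used here only as an example of a hash function.

```python
import hashlib

inputs = ["a", "hello", "a much longer piece of input data than the others"]

# Each digest is 256 bits, shown as 64 hexadecimal characters,
# regardless of how long the input is.
digests = [hashlib.sha256(text.encode("utf-8")).hexdigest() for text in inputs]

for text, digest in zip(inputs, digests):
    print(len(text), len(digest), digest[:16] + "...")
```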
What are 3 uses of hashing?
hash tables
password storage
digital signatures
How is hashing useful for password storage?
Only hashes of the passwords should be stored. When a user enters their password, the hash function is applied to it and the hash produced is checked against the stored hash to see if they match. If the stored hashes are stolen, a hacker can't find the users' passwords from them, as the hashes can't be reversed back to the original input.
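The store-and-compare process can be sketched as below. The names stored_hashes, hash_password and check_login are hypothetical, and plain SHA-256 is an assumption made to keep the example short; real systems normally use a slower, salted password-hashing function.

```python
import hashlib

def hash_password(password: str) -> str:
    """Apply the hash function to a password (plain SHA-256, for illustration)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Only the hash is stored, never the password itself.
stored_hashes = {"alice": hash_password("correct horse")}

def check_login(username: str, attempt: str) -> bool:
    """Hash the attempt and compare it with the stored hash."""
    stored = stored_hashes.get(username)
    return stored is not None and hash_password(attempt) == stored

print(check_login("alice", "correct horse"))  # True
print(check_login("alice", "wrong guess"))    # False
```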
What is a hash table?
A data structure which holds key-value pairs. A hash function is applied to each key to decide where its value is stored, which allows fast indexing/retrieval of data.
What is a collision?
When 2 different pieces of data produce identical hashes.
How can collisions be overcome?
Store items together in a linked list at the location of that hash. (chaining)
Store the item in the next available location in the hash table. (linear probing)
Use another hash function to generate a new hash whenever there is a collision.
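The first two approaches can be sketched as follows. The class names are illustrative, Python lists stand in for linked lists in the chaining version, and the example relies on CPython hashing small integers to themselves so that keys 1 and 9 collide in a table of size 8.

```python
class ChainedHashTable:
    """Chaining: colliding items are stored together at the same location."""
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def put(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:          # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        for existing_key, value in self.buckets[hash(key) % len(self.buckets)]:
            if existing_key == key:
                return value
        raise KeyError(key)


class ProbingHashTable:
    """Linear probing: a colliding item goes in the next available location."""
    def __init__(self, size=8):
        self.slots = [None] * size

    def put(self, key, value):
        start = hash(key) % len(self.slots)
        for step in range(len(self.slots)):
            i = (start + step) % len(self.slots)   # wrap around at the end
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return
        raise RuntimeError("hash table is full")

    def get(self, key):
        start = hash(key) % len(self.slots)
        for step in range(len(self.slots)):
            i = (start + step) % len(self.slots)
            if self.slots[i] is None:              # empty slot: key is absent
                break
            if self.slots[i][0] == key:
                return self.slots[i][1]
        raise KeyError(key)


chained, probing = ChainedHashTable(), ProbingHashTable()
for table in (chained, probing):
    table.put(1, "first")
    table.put(9, "second")   # 9 % 8 == 1, so this collides with key 1

print(chained.buckets[1])    # [(1, 'first'), (9, 'second')]
print(probing.slots[1:3])    # [(1, 'first'), (9, 'second')]
```

Both tables still return the right value for each key; they differ only in where the colliding item ends up.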
What are 2 features of a good hash function?
low chance of collision
quick to produce/calculate hashes
also -
hash should be smaller than the original input
How is hashing used for digital signatures alongside asymmetric encryption?
A encrypts the message with B’s public key
A generates a hash of the message
A encrypts the hash with A’s private key
B decrypts the message with B’s private key
B decrypts the hash with A’s public key
B hashes the message and checks if it matches with the decrypted hash
This proves that the message is uncorrupted and that the sender is who they claim to be.
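The six steps above can be traced with a toy sketch. It assumes RSA with tiny primes as the asymmetric scheme and SHA-256 (reduced to fit the small modulus) as the hash function; the helper names make_keys, apply_key and digest are hypothetical, and none of this is secure at these sizes. It needs Python 3.8 or later.

```python
import hashlib

def make_keys(p, q, e=17):
    """Return (public_key, private_key) for toy-sized primes p and q."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)

def apply_key(value: int, key) -> int:
    """Encrypt or decrypt a number with the given key."""
    exponent, n = key
    return pow(value, exponent, n)

def digest(message: bytes, n: int) -> int:
    """Hash the message, reduced modulo n so it fits the toy key size."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

a_public, a_private = make_keys(61, 53)
b_public, b_private = make_keys(89, 97)
message = b"pay Bob 5"

# A's side
ciphertext = [apply_key(byte, b_public) for byte in message]  # encrypt with B's public key
message_hash = digest(message, a_public[1])                   # hash the message
signature = apply_key(message_hash, a_private)                # encrypt hash with A's private key

# B's side
received = bytes(apply_key(c, b_private) for c in ciphertext)  # decrypt with B's private key
decrypted_hash = apply_key(signature, a_public)                # decrypt hash with A's public key
is_valid = digest(received, a_public[1]) == decrypted_hash     # re-hash and compare

print(received)   # b'pay Bob 5'
print(is_valid)   # True
```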