1.9 Compression, Encryption & Hashing Flashcards
Compression
Compression: Process used to reduce the storage space a file requires, so more files fit in the same amount of storage
- Important for sharing files over networks/Internet
- Larger files take longer to transfer
- E.g. Google Photos compresses files so they can be searched for & downloaded quickly
- Downloading a compressed file over the Internet is faster than downloading the full version of the file
Lossy & Lossless Compression
Lossy Compression: Reduces file size by discarding some information
- May lead to lower quality, pixelation in images, reduced clarity in audio
- Information loss is irreversible, making recovery of the original file impossible
- E.g: Removing less noticeable audio frequencies in a recording
Lossless Compression: Reduces file size without losing any information
- Enables complete recovery of the original file from the compressed version
- Doesn’t discard data, maintaining the full fidelity of the original content
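The difference can be seen in a short Python sketch. The lossless half uses the standard-library zlib module. The lossy half is only an illustrative assumption (rounding made-up sample values to a coarser step), not a real audio or image codec.

```python
import zlib

original = b"AAAAAABBBBBCCC" * 50

# Lossless: decompressing gives back exactly the original bytes.
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
lossless_ok = restored == original
smaller = len(compressed) < len(original)

# Lossy (toy example): round each sample to the nearest multiple of 10.
# The rounded-off detail is discarded and cannot be recovered.
samples = [12, 17, 23, 48, 51]
quantised = [round(s / 10) * 10 for s in samples]

print(lossless_ok, smaller, quantised)
```

After rounding, 17 and 23 both become 20, so there is no way to tell which original value each one came from.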
Run Length Encoding
Run Length Encoding (RLE): Lossless method that replaces each run of repeated values with a single occurrence of the data followed by the number of repetitions
- E.g: “AAAAAABBBBBCCC” = “A6B5C3”
- Requires consecutive identical data for efficient compression
- Effectiveness based on repetition:
- Offers significant reduction in file size when data has substantial repetition
- Less effective when there is minimal repetition, resulting in limited reduction in file size
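A minimal Python sketch of RLE, using the "character then count" format from the example above. The function names are hypothetical, and the decoder assumes the original data contains no digits.

```python
from itertools import groupby

def rle_encode(text):
    """Replace each run of identical characters with the character and its run length."""
    return "".join(f"{char}{len(list(run))}" for char, run in groupby(text))

def rle_decode(encoded):
    """Rebuild the original text; assumes the data itself contains no digits."""
    result = []
    i = 0
    while i < len(encoded):
        char = encoded[i]
        i += 1
        digits = ""
        # Collect every digit that follows the character (counts may exceed 9).
        while i < len(encoded) and encoded[i].isdigit():
            digits += encoded[i]
            i += 1
        result.append(char * int(digits))
    return "".join(result)

print(rle_encode("AAAAAABBBBBCCC"))  # A6B5C3
print(rle_encode("ABC"))             # A1B1C1
```

The second call shows the weakness: with no repetition, "ABC" becomes "A1B1C1", which is longer than the input.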
Dictionary Encoding
Dictionary encoding: Lossless method that replaces frequently occurring data (e.g. words or phrases) with an index, storing the compressed data alongside a matching dictionary
- Original data can be restored using the dictionary
- E.g. dictionary: 1: “We shall” 2: “fight” 3: “the” 4: “on” 5: “in” 6: “and”, so “We shall fight on the” is stored as 1 2 4 3
- Compressed data cannot be restored without the accompanying dictionary
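A minimal Python sketch of the idea. It makes a simplifying assumption: the dictionary is built per word, so it does not hold multi-word phrases such as “We shall”. The function names and the sample sentence are illustrative only.

```python
def dict_encode(text):
    """Return (dictionary, indices): each distinct word is stored only once."""
    dictionary = []
    indices = []
    for word in text.split():
        if word not in dictionary:
            dictionary.append(word)
        indices.append(dictionary.index(word) + 1)  # 1-based indices
    return dictionary, indices

def dict_decode(dictionary, indices):
    """Look each index up in the dictionary to restore the original text."""
    return " ".join(dictionary[i - 1] for i in indices)

text = "we shall fight on the beaches we shall fight on the landing grounds"
dictionary, indices = dict_encode(text)
print(indices)  # [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 7, 8]
```

The repeated words are stored once in the dictionary and appear in the data only as small numbers. Without the dictionary, the list of indices is meaningless.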
Encryption
Encryption: Keeps data secure during transmission. Data is scrambled (encrypted) before sending & decrypted upon arrival.
Symmetric Encryption: Both sender & receiver share the same secret (private) key, used for both encryption & decryption
- A key exchange distributes this key to both parties before they communicate
- The key must stay secret: anyone who intercepts it can decrypt the messages
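A toy Python illustration of the "same key both ways" idea, using a repeating-key XOR. This is an assumption chosen for simplicity and is not secure; real systems use vetted ciphers such as AES. The key and message are made up.

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR every byte with the repeating key; the same call encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

shared_key = b"secret"  # hypothetical key that both sides already hold

ciphertext = xor_cipher(b"meet at noon", shared_key)   # sender scrambles
plaintext = xor_cipher(ciphertext, shared_key)         # receiver unscrambles
wrong_guess = xor_cipher(ciphertext, b"guess!")        # wrong key gives garbage

print(plaintext)
```

Because one key does both jobs, anyone who obtains `shared_key` can read every message, which is why the key exchange matters.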
Asymmetric Encryption: Utilises 2 keys: public & private. Public key shared openly, while private key remains secret
- Keys are mathematically related & function as a key pair
- Requires recipient’s private key to decrypt messages encrypted with public key.
- Sending a message: Encrypt with the recipient’s public key, so that only they can decrypt it with their private key
- Digital Signatures: The sender encrypts (signs) with their own private key; anyone with the sender’s public key can decrypt the result, which verifies the sender’s identity
Public/Private Key Functionality: Encrypting with someone’s public key produces a message for their eyes only
- Encrypting with one’s own private key lets others verify the sender’s identity (a digital signature).
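Both uses of a key pair can be sketched with textbook RSA in Python. This is a toy under stated assumptions: the primes are tiny, there is no padding, and the message is just the number 65. Real systems use very large keys, and signatures are normally computed over a hash of the message.

```python
# Toy key pair built from tiny primes.
p, q = 61, 53
n = p * q                   # 3233, shared by both keys
phi = (p - 1) * (q - 1)     # 3120
e = 17                      # public exponent: public key is (e, n)
d = pow(e, -1, phi)         # private exponent: private key is (d, n); Python 3.8+

def apply_key(value, exponent, modulus):
    """Encrypting and decrypting are the same operation with different keys."""
    return pow(value, exponent, modulus)

message = 65

# Confidentiality: encrypt with the public key, decrypt with the private key.
ciphertext = apply_key(message, e, n)
decrypted = apply_key(ciphertext, d, n)

# Signature: sign with the private key, verify with the public key.
signature = apply_key(message, d, n)
verified = apply_key(signature, e, n) == message

print(ciphertext, decrypted, verified)  # 2790 65 True
```

The two keys are mathematically related (`d` is the inverse of `e` modulo `phi`), which is why whatever one key does, only the other can undo.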
Hashing
Hashing: Process that turns an input (key) into a fixed-size value (hash) using a hash function
- The hash function is one-way: its output cannot practically be reversed to obtain the original key
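The fixed-size property can be checked with Python's standard-library hashlib. SHA-256 is used here only as one example of a hash function; the inputs are made up.

```python
import hashlib

short_hash = hashlib.sha256(b"cat").hexdigest()
long_hash = hashlib.sha256(b"cat" * 10_000).hexdigest()
similar_hash = hashlib.sha256(b"cab").hexdigest()

# Both hashes are 64 hex characters (256 bits), whatever the input size.
print(len(short_hash), len(long_hash))

# A one-letter change in the input gives a completely different hash.
print(short_hash == similar_hash)  # False
```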
Password Storage: Passwords are stored as hashes rather than in plain text.
- At login, the password entered is hashed & compared against the stored hash for authentication
- Prevents unauthorized access to the passwords themselves if the stored hashes are obtained
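A minimal sketch of hash-then-compare login checking, using only Python's standard library. It goes slightly beyond the notes by adding a random salt and a deliberately slow hash (PBKDF2), which is common practice; the function names, iteration count, and passwords are assumptions for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch

def hash_password(password, salt=None):
    """Return (salt, digest); only these are stored, never the password itself."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def check_password(password, salt, stored_digest):
    """Hash the login attempt the same way and compare it with the stored hash."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)

salt, stored = hash_password("correct horse")
print(check_password("correct horse", salt, stored))  # True
print(check_password("wrong guess", salt, stored))    # False
```

The system never needs to reverse the hash: it only checks whether the attempt produces the same hash as the one on record.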
Hash Tables: Data structure for key-value pairs, utilizing a bucket array & hash function.
- Enables fast data retrieval, in constant time on average
- Commonly used in caches and databases for storing large volumes of data.
Collisions in Hashing:
- Occur when different keys produce the same hash.
- Methods to handle collisions:
- Store items in a list under the same hash value
- Use secondary hash function to generate a new hash
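The first method (a list under each hash value, often called chaining) can be sketched as a small Python hash table. The class and method names are hypothetical, and the example relies on CPython hashing small integers to themselves.

```python
class ChainedHashTable:
    def __init__(self, size=8):
        # The bucket array: one list per possible hash value.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        # Hash function: map any key to a bucket number.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))       # new key (or collision): add to the list

    def get(self, key):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)

table = ChainedHashTable(size=4)
table.put(1, "one")
table.put(5, "five")  # 1 % 4 == 5 % 4, so both keys collide in bucket 1
print(table.get(1), table.get(5))
```

Lookups stay fast as long as each list is short, which is why a low collision rate matters.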
Properties of a Good Hash Function:
- Low chance of collisions
- Quick to compute
- Output is smaller than the input, so searching by hash is quicker than searching by the full key