131-compression-encryption-and-hashing Flashcards
Compression
The process of reducing the size of a file
so that it takes up less storage space
Compression purposes
- Smaller files mean fewer packets, so transmission is faster, traffic over the internet is reduced, and there is less chance of collisions or transmission errors
- Quicker upload/download/load/transfer time
- better streaming of music and video
- Less storage space taken up on disk/servers
- Less mobile data/bandwidth usage
Lossless Compression
- Lossless compression does not remove data permanently
- Retains original data while making it smaller
- Lossless compression uses an algorithm to compress the file without losing any information
- Reduces file size less than lossy compression but retains better quality
- Suitable for text and code as no data is lost
- Lossless compression can reduce the size of an image file and works best on images with large areas of flat colour, such as logos, cartoons, and icons
- Text documents and executable files need to be restored in their entirety
- None of the original data is lost, and the original file can be recreated when uncompressed
- Lossless compression techniques include dictionary coding and run-length encoding.
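A minimal sketch of the "nothing is lost" property, using Python's standard `zlib` module (which implements DEFLATE, a lossless method). The sample text is made up for illustration; the point is only that the decompressed bytes match the original exactly.

```python
import zlib

# Hypothetical sample data: repetitive text compresses well.
original = b"the quick brown fox jumps over the lazy dog. " * 50

compressed = zlib.compress(original)    # lossless compression
restored = zlib.decompress(compressed)  # rebuilds the original bytes

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print("identical after decompression:", restored == original)
```

Highly repetitive input like this shrinks a lot; data with little repetition would shrink much less.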
Lossy Compression
- Lossy compression permanently deletes some data to reduce file size
- Lossy compression is suitable for images, audio, and video files where loss of quality is an acceptable trade-off for smaller files
- Lossy compression results in a more pixelated image or less clear audio recording
- Lossy compression cannot be used for text or code, as losing data would make text unreadable and code unable to execute
- Lossy compression reduces file size significantly more than lossless compression, but at the cost of reduced quality
- At moderate compression levels the loss is unlikely to be noticed by humans, as the algorithm removes the data that is least noticeable first
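One simple way to picture permanent data loss is quantisation: storing each value less precisely. The sketch below is an assumed toy scheme (made-up samples, a step size of 16), not how real formats such as JPEG or MP3 work, but it shows why the original cannot be rebuilt.

```python
# Hypothetical 8-bit audio samples (values 0-255).
samples = [200, 201, 203, 198, 64, 66, 65, 63]

STEP = 16  # assumed step size: 16 levels instead of 256 (4 bits, not 8)

def lossy_compress(values, step=STEP):
    """Record only which step each sample falls into; fine detail is discarded."""
    return [v // step for v in values]

def lossy_restore(levels, step=STEP):
    """Best-effort reconstruction; the discarded detail cannot be recovered."""
    return [level * step for level in levels]

levels = lossy_compress(samples)
approx = lossy_restore(levels)

print(levels)             # [12, 12, 12, 12, 4, 4, 4, 3]
print(approx)             # [192, 192, 192, 192, 64, 64, 64, 48]
print(approx == samples)  # False: the file is smaller but no longer identical
```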
Run Length Encoding:
Sequences of the same consecutive data value are stored as a single data value together with its number of occurrences
Relies on consecutive repeated data to offer a reduction in file size, so data with few repeats may not get any smaller
Suited to images and sound
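A short sketch of run-length encoding on a string of pixel colours. The function names and the sample data are chosen for illustration only.

```python
from itertools import groupby

def rle_encode(data):
    """Turn each run of identical values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(data)]

def rle_decode(pairs):
    """Expand each (value, count) pair back into the original run."""
    return "".join(value * count for value, count in pairs)

pixels = "WWWWWBBBWWWWRR"  # hypothetical row of pixels: W = white, B = black, R = red
encoded = rle_encode(pixels)

print(encoded)                        # [('W', 5), ('B', 3), ('W', 4), ('R', 2)]
print(rle_decode(encoded) == pixels)  # True: RLE is lossless
print(rle_encode("ABCD"))             # no repeats, so four pairs and no saving
```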
Dictionary Coding:
Frequently occurring pieces of data are replaced by tokens.
The compressed data is stored with a dictionary that matches each piece of data to its token index.
When decompressed, the dictionary is used to replace the tokens with the original data.
Suited to text
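A minimal sketch of dictionary coding at word level. As a simplifying assumption, it gives every word a token rather than only the frequent ones, and the sentence is just sample data.

```python
def dict_encode(text):
    """Replace each word with the index of its entry in a dictionary."""
    dictionary = []  # token index -> word
    index_of = {}    # word -> token index
    tokens = []
    for word in text.split(" "):
        if word not in index_of:
            index_of[word] = len(dictionary)
            dictionary.append(word)
        tokens.append(index_of[word])
    return dictionary, tokens

def dict_decode(dictionary, tokens):
    """Look each token up in the dictionary to rebuild the original text."""
    return " ".join(dictionary[t] for t in tokens)

text = "ask not what your country can do for you ask what you can do for your country"
dictionary, tokens = dict_encode(text)

print(dictionary)  # 9 distinct words
print(tokens)      # 17 small integers instead of 17 words
print(dict_decode(dictionary, tokens) == text)  # True
```

The saving comes from repeated words: each one is stored in full only once, in the dictionary.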
Encryption
- Scrambles a message to keep it secure during transmission.
- An algorithm converts plaintext into ciphertext that cannot be understood if intercepted in transit; a key (or set of keys) is required to encrypt and decrypt the data.
- Used to keep data secure when transmitting it over the internet, using different methods: symmetric encryption (one shared key) or asymmetric encryption (public and private keys).
- Helps prevent and minimize threats by making stolen data useless to unauthorized users.
Caesar Cipher
The Caesar cipher is a classic symmetric encryption technique.
It replaces each letter of the alphabet with another letter that is a fixed distance away from the original letter.
To decrypt the message, the recipient needs to know the number of places the alphabet has been shifted by - this is the key.
The Caesar cipher is easy to crack even without the key, because there are only 25 possible shifts to try. The ultimate aim of encryption, by contrast, is to make the original message practically impossible to recover without the key.
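A short sketch of the Caesar cipher, plus the brute-force attack that makes it weak. It assumes an uppercase 26-letter alphabet and leaves other characters unchanged; the function name is illustrative.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A' to 'Z'

def caesar(text, shift):
    """Shift each letter by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # spaces and punctuation are left as they are
    return "".join(result)

cipher = caesar("HELLO WORLD", 3)  # the key is 3
print(cipher)                      # KHOOR ZRUOG
print(caesar(cipher, -3))          # HELLO WORLD

# Cracking without the key: try all 25 shifts and look for readable text.
candidates = [caesar(cipher, -k) for k in range(1, 26)]
print("HELLO WORLD" in candidates)  # True
```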
Symmetric encryption
Symmetric encryption uses a single key to both encrypt and decrypt a message.
Both the sender and receiver share the same private key.
- Encrypting and decrypting are faster than with asymmetric encryption, which saves time on communication.
The key is shared between sender and receiver through a process called key exchange.
It is important that the private key is kept secret because if it is intercepted during the key exchange, any communication sent can be intercepted and decrypted.
(Note: the key is not necessarily sent with the file, but it has to reach the recipient somehow.)
- The same key can be used multiple times. Alternatively, a unique key can be generated each time in an attempt to make it harder to crack.
A danger of symmetric encryption is that the key can be intercepted, duplicated or compromised.
Systems that send or receive sensitive information like payment card details use more secure methods such as asymmetric encryption.
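To illustrate "one key for both directions", here is a toy repeating-key XOR cipher. This is an assumption made for illustration and is not secure; real systems use vetted algorithms such as AES. The key and message are invented.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the key (repeated as needed). Running it twice undoes it."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

shared_key = b"s3cret"     # hypothetical key, known to both sender and receiver
message = b"Meet at noon"

ciphertext = xor_cipher(message, shared_key)    # sender encrypts
recovered = xor_cipher(ciphertext, shared_key)  # receiver decrypts with the SAME key

print(ciphertext != message)  # True: scrambled in transit
print(recovered == message)   # True: same key reverses it
print(xor_cipher(ciphertext, b"wrong!") == message)  # False: a different key fails
```

Anyone who intercepts `shared_key` can run the same function on the ciphertext, which is exactly the danger described above.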
Asymmetric Encryption
Uses two different keys - a public key and a private key.
Public key is freely given out, while private key is kept secret.
Message is encrypted using recipient’s public key, and can only be decrypted using their private key.
Virtually impossible to derive the private key from the public key, so the public key can be shared openly; this avoids the key-exchange risk of symmetric encryption.
Widely used for internet transactions and securing webpages with the HTTPS protocol.
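A toy sketch of the public/private key idea using RSA with deliberately tiny primes. The numbers are illustrative assumptions only; real keys are thousands of bits long and use padding schemes. It needs Python 3.8+ for `pow(e, -1, phi)`.

```python
# Toy RSA key generation (tiny, insecure numbers chosen for illustration).
p, q = 61, 53
n = p * q                # 3233, part of both keys
phi = (p - 1) * (q - 1)  # 3120
e = 17                   # public exponent
d = pow(e, -1, phi)      # private exponent: the modular inverse of e (2753)

public_key = (e, n)      # freely given out
private_key = (d, n)     # kept secret

def encrypt(m, key):
    """Encrypt the number m (must be smaller than n) with the public key."""
    exponent, modulus = key
    return pow(m, exponent, modulus)

def decrypt(c, key):
    """Decrypt the number c with the matching private key."""
    exponent, modulus = key
    return pow(c, exponent, modulus)

message = 65
cipher = encrypt(message, public_key)
print(cipher)                        # 2790
print(decrypt(cipher, private_key))  # 65
```

With numbers this small `n` can be factorised instantly to find `d`; the security of real keys rests on `n` being far too large to factorise.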
Asymmetric Encryption Process
Each party starts with their own public/private key pair.
Sender and recipient exchange copies of their public keys.
Each party's private key is kept safe and never sent out; it is used to decrypt incoming messages.
When sending a message, we use our own private key and the recipient’s public key (together called the combined encryption key) to encrypt the message before sending it.
The recipient needs their own private key and our public key to decrypt the message.
Both parties can then be sure the message hasn’t been modified and hasn’t been read by anyone else.
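The steps above can be sketched with two toy RSA key pairs. As an assumption, the two keys are applied one after the other rather than merged into a single combined key, and the primes are tiny illustrative values (Python 3.8+).

```python
def make_keys(p, q, e):
    """Build a toy RSA key pair from two small primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e
    return (e, n), (d, n)  # (public key, private key)

def apply_key(value, key):
    """Encrypt or decrypt a number with the given key."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

# Each party has their own pair; only the public halves are exchanged.
sender_public, sender_private = make_keys(61, 53, 17)       # n = 3233
recipient_public, recipient_private = make_keys(89, 97, 5)  # n = 8633

message = 65  # must be smaller than the sender's n

# Sending: our private key first, then the recipient's public key.
signed = apply_key(message, sender_private)
sent = apply_key(signed, recipient_public)

# Receiving: recipient's private key first, then our public key.
unlocked = apply_key(sent, recipient_private)
received = apply_key(unlocked, sender_public)

print(received == message)  # True
```

Only the recipient's private key can undo the outer layer, and only the sender's private key could have produced the inner one. The sender's `n` is chosen smaller than the recipient's so the intermediate value always fits.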
Confirming a message as authentic (digital signatures)
Digital signatures are used to confirm the authenticity of a message.
To create a digital signature, one can encrypt a message using their private key.
Anyone can decrypt the message using the sender’s public key, which is available to everyone.
If the message can be decrypted with the sender’s public key, it confirms that the message was encrypted with the sender’s private key.
Digital signatures form the basis of a system for verifying the authenticity of messages.
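A toy sketch of signing and verifying with the same tiny RSA numbers often used in teaching examples. The message is treated as a small number for simplicity; practical schemes sign a hash of the message instead. All values are illustrative assumptions (Python 3.8+).

```python
# Sender's toy RSA key pair (tiny, insecure numbers).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

private_key = (d, n)  # known only to the sender
public_key = (e, n)   # available to everyone

def sign(message, key):
    """Create a signature by encrypting the message with the private key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)

def verify(message, signature, key):
    """Decrypt the signature with the public key and compare with the message."""
    exponent, modulus = key
    return pow(signature, exponent, modulus) == message

message = 65
signature = sign(message, private_key)

print(verify(message, signature, public_key))  # True: it came from the key owner
print(verify(66, signature, public_key))       # False: message was altered
```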
Public key (Asymmetric Encryption)
Public key is used for encrypting data and can be published anywhere
Messages encrypted with a recipient’s public key can only be decrypted with the recipient’s private key
Anyone can access a public key: you can give it out or publish it online, and public keys are often stored on secure servers known as key safes.
Private Key (Asymmetric Encryption)
Private key is a secret key used for decrypting encrypted data
Private key should never be shared or sent to anyone
Private key is mathematically related to its corresponding public key
Hashing
- Hashing is a one-way process that can’t be reversed to form the original input.
- Involves taking an input (usually a string or binary data) and generating a fixed-size output (called a hash) using a specific algorithm.
- The output is deterministic, meaning that the same input will always produce the same output. Different inputs should produce different outputs, although collisions are possible in principle.
- Commonly used for data storage, retrieval, and protection, such as in password storage.
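A short sketch of these properties using SHA-256 from Python's standard `hashlib` module. The passwords are made up, and the check function is a simplified illustration; real password storage also adds a salt and uses a deliberately slow hashing function.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as a 64-character hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

h1 = sha256_hex("password123")
h2 = sha256_hex("password123")
h3 = sha256_hex("password124")

print(h1 == h2)  # True: same input, same hash (deterministic)
print(h1 == h3)  # False: a one-character change gives a different hash
print(len(h1), len(sha256_hex("a" * 1000)))  # 64 64: fixed-size output

# Simplified password check: only the hash is stored, never the password.
stored_hash = sha256_hex("password123")

def check_password(attempt: str) -> bool:
    return sha256_hex(attempt) == stored_hash

print(check_password("password123"))  # True
print(check_password("letmein"))      # False
```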