1.3.1 Compression, Encryption, Hashing Flashcards
What is compression
Reducing the size of a file, for example by identifying repetitions of data, so that it needs less storage space and transfers faster
What are the attributes of Lossy Compression
Reduces File size,
Permanently deletes some data,
Useful for compressing image files
What are the attributes of Lossless Compression
Reduces File size,
Patterns in data are summarised in a shorter file format,
No data deleted,
Useful for compressing text documents
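As an illustration of "no data deleted", here is a minimal sketch of run-length encoding. This is one simple lossless technique chosen for the example and is not named in the cards; the function names are hypothetical.

```python
from itertools import groupby

def rle_encode(text: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into (character, run_length)."""
    return [(char, len(list(run))) for char, run in groupby(text)]

def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Rebuild the original text exactly from the (character, run_length) pairs."""
    return "".join(char * count for char, count in pairs)

original = "AAAABBBCCD"
encoded = rle_encode(original)   # runs of repeated characters are summarised
decoded = rle_decode(encoded)    # lossless: decoding gives back the original
```

Because `decoded` equals `original`, nothing was lost; a lossy scheme could not offer that guarantee.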
What’s dictionary compression
Finds repeated data patterns in a file,
Stores each pattern once in a dictionary and replaces its occurrences with a short reference to that entry
What’s a dictionary
A data structure that allows you to store key-value pairs. In dictionary compression, each repeated word can be stored once and referenced by a short code, e.g. an 8-bit value
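A minimal sketch of word-level dictionary compression, assuming words are separated by single spaces. The function names and the choice of integer codes are illustrative assumptions, not a specific real-world format.

```python
def dictionary_compress(text: str) -> tuple[dict[int, str], list[int]]:
    """Replace each word with an integer code; return (dictionary, codes)."""
    code_for_word: dict[str, int] = {}
    codes: list[int] = []
    for word in text.split(" "):
        if word not in code_for_word:
            # First time this word is seen: give it the next free code.
            code_for_word[word] = len(code_for_word)
        codes.append(code_for_word[word])
    # Invert the mapping so that decompression can look up code -> word.
    dictionary = {code: word for word, code in code_for_word.items()}
    return dictionary, codes

def dictionary_decompress(dictionary: dict[int, str], codes: list[int]) -> str:
    """Rebuild the text by looking every code up in the dictionary."""
    return " ".join(dictionary[code] for code in codes)

message = "the cat sat on the mat and the cat slept"
dictionary, codes = dictionary_compress(message)
restored = dictionary_decompress(dictionary, codes)
```

The more often a word repeats, the more is saved, because the word itself is stored only once in the dictionary.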
What’s encryption
The process of converting a message from plain text to cipher text,
Requires a key,
Prevents data from being understood if intercepted
What’s decryption
The process of converting cipher text back into plain text,
Requires a key
What’s a message
Data that will be communicated between 2 parties
What’s plaintext
A message in an easily human readable form
What’s cipher text
An encrypted message
What’s a cipher
A set of instructions (algorithm) for encrypting plain text
What’s authentication
Proving the identity of the sender
What’s symmetric encryption
The same private key is used to encrypt and decrypt,
The key must be shared between the sender and receiver,
Key can be easily intercepted when sharing
What’s Asymmetric Encryption
Two different (yet mathematically linked) keys are used to encrypt and decrypt,
The private key never needs to be shared, while the public key can be given to anyone,
The public key is used to encrypt,
The private key is used to decrypt
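To make the public/private split concrete, here is a toy sketch using RSA-style arithmetic. RSA is not named in the cards and is an assumption chosen for illustration; the primes are far too small to be secure, and real systems also add padding.

```python
# Toy key generation with tiny primes (illustration only, NOT secure).
p, q = 61, 53
n = p * q                    # 3233, part of both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent: (e, n) is the public key
d = pow(e, -1, phi)          # private exponent: (d, n) is the private key

message = 65                            # a message encoded as a number < n
ciphertext = pow(message, e, n)         # anyone can encrypt with the public key
recovered = pow(ciphertext, d, n)       # only the private key holder can decrypt
```

The two exponents are mathematically linked through `phi`, yet knowing `e` and `n` alone does not reveal `d` when the primes are large.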
Why do most systems use asymmetric encryption
To agree a shared symmetric key without sending it in the clear; the symmetric key then secures a single, limited communication session
What type of key exchange is used to generate a symmetric key
Diffie-Hellman key exchange, which is used, for example, within Transport Layer Security (TLS)
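A toy sketch of the Diffie-Hellman idea with very small numbers. The prime, generator, and secret values are illustrative assumptions; real exchanges use very large parameters.

```python
# Public parameters, agreed in the open (toy values).
prime = 23
generator = 5

# Each party picks a private value and never sends it.
alice_private = 4
bob_private = 3

# Each party sends only this public value across the network.
alice_public = pow(generator, alice_private, prime)
bob_public = pow(generator, bob_private, prime)

# Each side combines the other's public value with its own private value.
alice_shared = pow(bob_public, alice_private, prime)
bob_shared = pow(alice_public, bob_private, prime)
```

Both sides end up with the same shared value, which can then act as the symmetric key, even though the key itself was never transmitted.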
What is Transport Layer Security
A cryptographic protocol that encrypts data sent over TCP/IP connections, e.g. HTTPS traffic
What’s Caesar cipher encryption
Encryption that shifts each letter of plain text by an amount in a cyclical manner,
The amount is specified by the key
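A minimal sketch of the cyclical shift, assuming only the letters A-Z and a-z are shifted and every other character is left unchanged. The function names are hypothetical.

```python
def caesar_encrypt(plaintext: str, key: int) -> str:
    """Shift each ASCII letter by `key` places, wrapping around the alphabet."""
    shifted = []
    for char in plaintext:
        if char.isascii() and char.isalpha():
            base = ord("A") if char.isupper() else ord("a")
            # The modulo makes the shift cyclical: after Z comes A again.
            shifted.append(chr((ord(char) - base + key) % 26 + base))
        else:
            shifted.append(char)  # spaces, digits and punctuation are kept
    return "".join(shifted)

def caesar_decrypt(ciphertext: str, key: int) -> str:
    """Decrypting is the same shift applied in the opposite direction."""
    return caesar_encrypt(ciphertext, -key)

ciphertext = caesar_encrypt("HELLO", 3)
plaintext = caesar_decrypt(ciphertext, 3)
```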
Advantages of Caesar cipher
Quick and simple to encrypt and decrypt a message
Cons of Caesar cipher
Easily crackable - only 25 possible keys
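With only 25 useful keys, an attacker can simply try them all. The sketch below assumes an uppercase message with spaces; the function names are hypothetical.

```python
def shift_letters(text: str, key: int) -> str:
    """Shift uppercase letters by `key` places, wrapping around the alphabet."""
    return "".join(
        chr((ord(char) - ord("A") + key) % 26 + ord("A")) if "A" <= char <= "Z" else char
        for char in text
    )

def brute_force(ciphertext: str) -> dict[int, str]:
    """Try every key from 1 to 25 and return each candidate plaintext."""
    return {key: shift_letters(ciphertext, -key) for key in range(1, 26)}

candidates = brute_force("KHOOR ZRUOG")
```

A person (or a dictionary check) then only has to scan 25 candidates to spot the readable one.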
What’s the one time pad
A substitution cipher in which each character is encrypted using its own key character,
It’s theoretically impossible to crack when used correctly, and it is the key used by the Vernam cipher
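A minimal sketch of the Vernam cipher, assuming the message is handled as bytes and combined with the pad using XOR. Python's `secrets` module gives cryptographically strong randomness, used here as a stand-in for a truly random pad; the function name is hypothetical.

```python
import secrets

def vernam(data: bytes, pad: bytes) -> bytes:
    """XOR every byte of the data with the matching byte of the pad."""
    if len(pad) < len(data):
        raise ValueError("the pad must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, pad))

plaintext = b"MEET AT NOON"
pad = secrets.token_bytes(len(plaintext))   # one fresh key byte per character
ciphertext = vernam(plaintext, pad)
recovered = vernam(ciphertext, pad)         # XOR with the same pad undoes it
```

The same function both encrypts and decrypts because XOR-ing twice with the same byte returns the original byte; the pad must never be reused for a second message.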
Define computational security
When a scheme cannot be cracked in reasonable time
What are the prerequisites for the one-time pad to be perfectly secure
Key must be at least the same length as the plain text,
Characters in the key must be truly random,
The key must only be used ONCE,
There must only be two copies of the key, and both must be kept secret,
Key must be destroyed after use
What’s a hash table
A data structure that implements an associative array
What’s an associative array
An array in which data is stored as a collection of key-value pairs,
E.g. a dictionary
Why must an array be used in a hash table
You have to be able to access each position of the array directly
What’s hashing
The process of converting a string of characters or key into a different value, usually a shorter, fixed-length value, via a hashing algorithm
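To illustrate the fixed-length output, this sketch hashes a very short and a very long input. SHA-256 from Python's `hashlib` is used as an example algorithm; the cards do not name a particular one.

```python
import hashlib

short_input = b"a"
long_input = b"a" * 10_000

short_digest = hashlib.sha256(short_input).hexdigest()
long_digest = hashlib.sha256(long_input).hexdigest()

# Hashing the same input again always gives the same value.
repeat_digest = hashlib.sha256(short_input).hexdigest()
```

Both digests are 64 hexadecimal characters (256 bits) long, regardless of how long the input was.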
What’s the formula for the load factor of a hash table
Number of occupied buckets divided by the total number of buckets,
A commonly used maximum load factor is 0.75; above this the table is usually enlarged
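The formula can be worked through on a small table. The sketch below assumes a table held as a Python list in which `None` marks an empty bucket; the function name is hypothetical.

```python
def load_factor(table: list) -> float:
    """Occupied buckets divided by the total number of buckets."""
    occupied = sum(1 for bucket in table if bucket is not None)
    return occupied / len(table)

# 8 buckets, 6 of them occupied.
table = ["ann", None, "bob", "cat", "dan", None, "eve", "fay"]
factor = load_factor(table)
```

Here 6 / 8 = 0.75, so this table has just reached the 0.75 mark.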
What’s a hash function
An algorithm that converts a hash key to a hash value,
The hash key is the raw data that is input into the hashing algorithm,
The hash value is the output, which is used as the index for that element in a hash table
What are the requirements of a good hash function
Always produce the same hash value for the same key,
Provide a uniform distribution of hash values, meaning every value has an equal probability of being generated,
Minimise clustering
What causes clustering
Collisions: a collision occurs when different keys produce the same hash value, and the colliding items then bunch together in neighbouring positions of the table
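A small sketch of a collision. It assumes a deliberately weak hash function (the sum of character codes) and resolves the collision by moving to the next free slot; that resolution method, often called linear probing, is an assumption and is not named in the cards.

```python
def simple_hash(key: str, size: int) -> int:
    """Weak hash: sum of character codes, reduced to a valid table index."""
    return sum(ord(char) for char in key) % size

def insert(table: list, key: str, value: int) -> int:
    """Store (key, value) at the hashed slot, or the next free slot after it."""
    size = len(table)
    start = simple_hash(key, size)
    for step in range(size):
        slot = (start + step) % size
        if table[slot] is None or table[slot][0] == key:
            table[slot] = (key, value)
            return slot
    raise RuntimeError("hash table is full")

table = [None] * 7
slot_ab = insert(table, "ab", 1)
slot_ba = insert(table, "ba", 2)   # same letters, so the same hash value
```

"ab" and "ba" hash to the same index, so the second item is pushed into the neighbouring slot; many such collisions leave items bunched together.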
What’s the purpose of a hashing function
It provides a mapping between an arbitrary length input and a fixed length or smaller output,
It’s one-way, so the original input can’t be recovered from the hash value
What’s rehashing
When you rehash, a new hash table is created (larger if needed).
The key for each item in the existing table will be rehashed and the item will be inserted into the new hash table.
If the new table is larger, the hashing algorithm will need to be modified so that it generates a larger range of hash values.
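The rehashing steps above can be sketched as follows, assuming each bucket is a list of (key, value) pairs and the hash function takes the table size as a parameter, so that a larger table automatically gets a larger range of hash values. All names are hypothetical.

```python
def bucket_index(key: str, size: int) -> int:
    """Hash a key to an index in the range 0 .. size - 1."""
    return sum(ord(char) for char in key) % size

def rehash(old_table: list[list], new_size: int) -> list[list]:
    """Create a new table and re-insert every item using the new size."""
    new_table = [[] for _ in range(new_size)]
    for bucket in old_table:
        for key, value in bucket:
            new_table[bucket_index(key, new_size)].append((key, value))
    return new_table

# A crowded table: 4 items squeezed into 2 buckets.
old_table = [[] for _ in range(2)]
for key in "abcd":
    old_table[bucket_index(key, 2)].append((key, key.upper()))

new_table = rehash(old_table, 8)
```

After rehashing into 8 buckets, every item in this example has a bucket to itself, so lookups no longer need to search through collided items.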
Why might rehashing be needed
If the hash table starts to fill up, or a large number of items are not stored at the position given by their hash value (because of collisions), the performance of the hash table will degrade.