Component 8 - Key Definitions Flashcards
Lossy vs Lossless compression
Lossy compression reduces the size of files by removing unnecessary information, or by reducing quality in a way which is either “acceptable” or unlikely to be noticed by the user. In music, for example, MP3 discards sounds that the human ear is unlikely to perceive. This has the obvious advantage of large reductions in file size, but the disadvantage that the discarded data is gone for good, so the original can never be restored exactly.
Lossless compression reduces the file size without losing any quality, so the original file can be reproduced exactly. Common techniques include dictionary encoding and run length encoding, which reduce data by removing the need to represent repeated data explicitly.
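The difference can be sketched in a few lines of Python. This is only an illustration: the standard-library zlib module stands in for lossless compression in general, and the "round down to a multiple of 16" step is a made-up toy example of lossy reduction, not how MP3 or any real codec works.

```python
import zlib

original = b"AAAAABBBBBCCCCCAAAAABBBBBCCCCC" * 10

# Lossless: the data shrinks, and decompressing restores every byte exactly.
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
print(len(original), len(compressed), restored == original)

# Lossy (toy example): round each sample down to a multiple of 16.
# Fewer distinct values need storing, but the fine detail is discarded.
samples = [3, 18, 37, 200, 255]
quantised = [(s // 16) * 16 for s in samples]
print(quantised)  # [0, 16, 32, 192, 240]
```

Nothing in the quantised list records what the original samples were, which is why lossy compression cannot be undone.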
Run length encoding and dictionary coding for lossless compression
Run length encoding – anywhere in a file where the same value is repeated consecutively (a run of pixels of one colour in an image, a run of one letter in a text file), store only the colour and the number of pixels, or only the letter and the number of repetitions. This reduces the amount of data stored.
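A minimal sketch of run length encoding on a string, assuming the data is stored as (character, count) pairs. The function names rle_encode and rle_decode are illustrative only.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (character, count) pair."""
    return [(char, len(list(run))) for char, run in groupby(text)]

def rle_decode(pairs):
    """Rebuild the original text by repeating each character 'count' times."""
    return "".join(char * count for char, count in pairs)

encoded = rle_encode("WWWWBBBWW")
print(encoded)              # [('W', 4), ('B', 3), ('W', 2)]
print(rle_decode(encoded))  # WWWWBBBWW
```

Decoding gives back exactly the original string, which is what makes the technique lossless. It only saves space when the data contains long runs.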
Dictionary encoding – commonly used to compress text. Represents each new word as a “token” in a lookup table. The document is then reduced to a collection of numbers, one per word. This reduces overall file size because each distinct word is stored only once.
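A minimal sketch of dictionary encoding, assuming words are separated by single spaces and tokens are simply positions in the lookup table. The names dict_encode and dict_decode are illustrative only.

```python
def dict_encode(text):
    """Return (lookup table, token list) for a space-separated text."""
    table = []   # token number -> word
    index = {}   # word -> token number
    tokens = []
    for word in text.split():
        if word not in index:        # first time this word has been seen
            index[word] = len(table)
            table.append(word)
        tokens.append(index[word])
    return table, tokens

def dict_decode(table, tokens):
    """Look each token up in the table to rebuild the text."""
    return " ".join(table[t] for t in tokens)

table, tokens = dict_encode("the cat sat on the mat the end")
print(table)   # ['the', 'cat', 'sat', 'on', 'mat', 'end']
print(tokens)  # [0, 1, 2, 3, 0, 4, 0, 5]
```

The word "the" appears three times in the text but only once in the table. The saving grows as words are repeated more often.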
Symmetric and asymmetric encryption
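Symmetric encryption is any form of encryption in which the same key is used to encrypt and to decrypt, so decrypting is simply the encryption process reversed. The weakness is that the key has to be shared with the recipient: anyone who intercepts the key can read every message.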
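A toy illustration of the symmetric idea, using a repeating-key XOR. This is chosen only because it is short. It is not a secure cipher and is not what real systems use. The function name xor_cipher and the key are made up for the example.

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the repeating key; running it twice undoes it."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

key = b"secret"                                 # both sides must know this
ciphertext = xor_cipher(b"meet at noon", key)   # sender encrypts
plaintext = xor_cipher(ciphertext, key)         # receiver decrypts, same key
print(plaintext)  # b'meet at noon'
```

The same function and the same key are used in both directions, so whoever holds the key can both lock and unlock the message.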
Asymmetric encryption uses a pair of mathematically linked keys, one public and one private, to encode and decode data. It relies on “one way” mathematical functions which are easy to carry out but impractical to reverse, so knowing the method used to encode is not enough to return to the original data. Because the keys come as a “key pair”, one key can be published and used to encrypt, and a separate, secret key is then used to decrypt. This is essential for secure transmission of data.
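The key pair idea can be sketched with a toy version of RSA, one well-known asymmetric scheme. The numbers below are deliberately tiny so the arithmetic can be followed. Real keys use primes hundreds of digits long, and real systems add padding and other safeguards that are left out here.

```python
# Toy RSA: illustrative values only, far too small to be secure.
p, q = 61, 53                 # two secret primes
n = p * q                     # 3233, shared by both keys
phi = (p - 1) * (q - 1)       # 3120
e = 17                        # public exponent, shares no factor with phi
d = pow(e, -1, phi)           # private exponent (needs Python 3.8+)

public_key = (e, n)           # can be published
private_key = (d, n)          # must be kept secret

message = 65
ciphertext = pow(message, e, n)     # anyone can encrypt with the public key
recovered = pow(ciphertext, d, n)   # only the private key decrypts
print(ciphertext, recovered)        # 2790 65
```

Working out d from the public key requires splitting n back into p and q. That is easy for 3233 but impractical for numbers of the size used in practice, which is the "one way" property the scheme depends on.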
Different uses of hashing
Hashing uses the concept of an algorithm which, for any given input, will always produce the same output. It also has the useful property that a small change in the input will result in a large, obvious change in the output. This has several applications including:
File hashing – a quick and convenient way of ensuring two copies of a file are indeed identical and have not been corrupted or tampered with.
File systems – a quick method of locating a file by generating a hash which points to its location.
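The properties and uses above can be sketched with Python's standard hashlib module. SHA-256 is used here as one example of a hash function. The helper names sha256_hex and bucket_for, the sample data, and the choice of 8 buckets are assumptions for illustration, not how any particular file system works.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

# Same input always gives the same output; changing one character
# gives a completely different-looking digest.
print(sha256_hex(b"hello world"))
print(sha256_hex(b"hello world"))
print(sha256_hex(b"hello worle"))

# File hashing: matching digests indicate the two copies are identical.
original_copy = b"contents of a file"
downloaded_copy = b"contents of a file"
files_match = sha256_hex(original_copy) == sha256_hex(downloaded_copy)
print(files_match)  # True

# Locating data: turn a name into a bucket number, so a lookup can go
# straight to one bucket instead of searching everything.
def bucket_for(name: str, bucket_count: int = 8) -> int:
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % bucket_count

print(bucket_for("report.txt"))
```

In practice the bytes would be read from the files on disk rather than typed in, and the digest would be compared against one published by the file's source.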