L16 - Data Compression 2, LZW Coding Flashcards

1
Q

What is data compression?

A

The removal of redundancy from data packets being transmitted. This improves reliability by decreasing the extent to which noise can interfere with the data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is Source Coding?

A

Another word for data compression.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are the 3 models that compression algorithms can be based on? Define each…

A

Static Model: Applies the same compression technique for all text in the file.
Pro -> Fast
Con -> Non-optimal since different text have different statistical properties.
E.g. -> ASCII, Morse Code.

Adaptive Model: Compression models that progressively learn and update the model based on incoming text.
Pro -> Accuracy → Leads to better compression of data.
Con -> Decoding must start from the beginning of the data.
E.g. LZW Coding

Dynamic Model:
- Learn the statistics of the text, and generates a model based on what’s learnt.
- A preliminary test is required to generate the model.
- Model must be transmitted to the receiver.
- E.g Huffman Code

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is LZW coding?

A
  • An adaptive dictionary coding scheme for lossless data compression.
  • Widely used for UNIX file compression and GIF image format compression.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

LZW Coding uses Dictionary Coding, explain how dictionary coding works…

A
  • Doesn’t use statistical knowledge of data.
  • Encoding → Reads input data stream and searches for its pattern in the dictionary.
    • Found → Replaces the input data stream with the index of the dictionary entry that the pattern matches.
    • Not found → Inserts the input data pattern into the dictionary.
  • Decoding → Inverts the encoding process by following the encoded indices to reconstruct the original data.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Give pros and cons of LZW coding…

A

Pros:
- Efficiently compresses data with redundant data or repeating sequences.

Cons:
- If data is already compressed or is randomised, compression wont be as effective.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Walk through the steps of how the LZW algorithm compresses (encodes) data…

A
  1. Have an initial dictionary to build off, usually ASCII.
  2. From the first 2 elements in the input message, create the compression dictionary…
    • If the substring is a key in the dictionary, it has been visited before. Thus, move the right pointer 1 element to the left to create a new, larger substring and check again.
    • If substring is not a key in the dictionary, it hasn’t been visited before. Thus, this new substring becomes a key with an associated unique value.
    • Repeat until all unique substrings in the data have associated key-value pairs in the dictionary.
  3. Using the dictionary, the input data can be compressed.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Explain what LZW encoding is and how it works…

A
  • A dictionary based compression algorithm.
  • Builds off an initial dictionary, usually ASCII codes.
  • Works by creating a dictionary item for each unique sub-string encountered in the input message.
  • A specific dictionary value can be used whenever that item is encountered.
  • For example, if the substring ‘ought’ has 3 instances in the input text, the same code will be used for each to represent that substring in the compressed output.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What through the steps of how the LZW algorithm decompresses (decodes) data…

A
  • Start with the same base dictionary as used in encoding.
  • For each encoded element in the compressed data, check if it’s in the dictionary.
    • If yes, move right pointer to the next encoded element, append the codes together to create a new code. Then add the substring and code to the dictionary.
  • Due to the above, any time an encoded value re-encountered, the same substring is decoded.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly