Lecture 4 - LZW Compression Flashcards
What kind of method is LZW?
Dictionary-based.
What is a dictionary?
A collection of strings, each with a dedicated codeword to represent it. The codeword is a bit pattern that can be represented by an unsigned int.
What is written to the compressed file?
The codewords (bit patterns) of the matched strings. Codewords written at different points can have different lengths, as the codeword length changes while the algorithm runs.
Is the dictionary statically available?
No, it is built dynamically during both compression and decompression.
What does the dictionary initially contain?
All possible strings of length 1.
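To make the dynamic build-up concrete, here is a minimal sketch of LZW compression in Python. It is not necessarily the exact version from the lecture: the function name `lzw_compress` and the explicit `alphabet` parameter are assumptions made for illustration, and codewords are returned as plain integers rather than packed bit patterns.

```python
def lzw_compress(text, alphabet):
    """Sketch of LZW compression; returns (codewords, final dictionary)."""
    # Initial dictionary: every string of length 1 gets a codeword.
    dictionary = {ch: i for i, ch in enumerate(alphabet)}
    codes = []
    s = ""  # longest dictionary string matched so far
    for ch in text:
        if s + ch in dictionary:
            s += ch  # keep extending the current match
        else:
            codes.append(dictionary[s])           # output codeword for s
            dictionary[s + ch] = len(dictionary)  # add the new string s + ch
            s = ch                                # restart the match at ch
    if s:
        codes.append(dictionary[s])  # flush the final match
    return codes, dictionary


codes, dictionary = lzw_compress("ABABABA", "AB")
print(codes)             # [0, 1, 2, 4]
print(list(dictionary))  # ['A', 'B', 'AB', 'BA', 'ABA']
```

Every string added is an existing dictionary string plus one character, which is why the dictionary stays closed under prefix.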
What is ‘closed under prefix’?
If a string s is in the dictionary, then all prefixes of s are also in the dictionary.
What is the optimal representation of the dictionary?
A trie, due to the closed-under-prefix property.
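A rough sketch of how the dictionary could be stored as a trie. The names `TrieNode`, `build_initial_trie` and `lookup` are hypothetical, chosen only for this example. Because the dictionary is closed under prefix, every node on the path to a stored string is itself a stored string, so each node (apart from the root) carries a codeword.

```python
class TrieNode:
    def __init__(self, codeword):
        self.codeword = codeword  # codeword of the string spelled by the path
        self.children = {}        # next character -> child node


def build_initial_trie(alphabet):
    # The root stands for the empty string and has no codeword.
    root = TrieNode(None)
    for i, ch in enumerate(alphabet):
        root.children[ch] = TrieNode(i)  # all strings of length 1
    return root


def lookup(root, s):
    """Return the codeword of s, or None if s is not in the dictionary."""
    node = root
    for ch in s:
        node = node.children.get(ch)
        if node is None:
            return None
    return node.codeword


root = build_initial_trie("AB")
# Adding "AB" only needs one new child under the node for "A".
root.children["A"].children["B"] = TrieNode(2)
print(lookup(root, "AB"))  # 2
print(lookup(root, "BA"))  # None
```

Adding a new string s + c is a single child insertion under the node for s, which the compressor has already reached while matching.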
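How many more strings can we store in the dictionary if the available codeword length increases by 1?
The capacity doubles: k-bit codewords can represent 2^k strings, so k+1 bits can represent 2^(k+1). That is 2^k additional strings.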
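The doubling can be checked with a few lines of Python. The helper name `capacity` is made up for this example.

```python
def capacity(codeword_length):
    # Number of distinct bit patterns of the given length.
    return 2 ** codeword_length


for k in (8, 9, 10):
    extra = capacity(k + 1) - capacity(k)
    print(k, capacity(k), extra)  # extra always equals capacity(k)
```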
The 3 LZW Variants:
Constant codeword length: fix the codeword length for all time
-> the dictionary has fixed capacity: when full, just stop adding to it
Dynamic codeword length (the version we have been taught)
LRU version: when the dictionary is full and the codeword length is maximal
-> the current string replaces the Least Recently Used string in the dictionary
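The first two variants can be sketched together. This is only an illustration under stated assumptions, not the lecture's exact rule: it assumes the codeword length grows by one bit as soon as the dictionary is full, up to a hypothetical `max_len`, and that once `max_len` is reached and the dictionary is full no more strings are added. The name `lzw_compress_bits` and its parameters are invented for the example, and `start_len` must be large enough to give every single character a codeword.

```python
def lzw_compress_bits(text, alphabet, start_len, max_len):
    """Sketch: LZW output as bit strings with a growing codeword length."""
    dictionary = {ch: i for i, ch in enumerate(alphabet)}
    code_len = start_len  # assumed: 2 ** start_len >= len(alphabet)
    bits = []
    s = ""
    for ch in text:
        if s + ch in dictionary:
            s += ch
            continue
        # Write the codeword for s using the current codeword length.
        bits.append(format(dictionary[s], f"0{code_len}b"))
        # Assumed rule: dictionary full -> allow one more bit, if permitted.
        if len(dictionary) == 2 ** code_len and code_len < max_len:
            code_len += 1
        # Add s + ch only if a codeword is still free (fixed capacity at max_len).
        if len(dictionary) < 2 ** code_len:
            dictionary[s + ch] = len(dictionary)
        s = ch
    if s:
        bits.append(format(dictionary[s], f"0{code_len}b"))
    return bits


print(lzw_compress_bits("ABABABA", "AB", start_len=1, max_len=3))
# ['0', '01', '10', '100']
print(lzw_compress_bits("ABAB", "AB", start_len=1, max_len=1))
# ['0', '1', '0', '1']
```

With `max_len` equal to `start_len` the sketch behaves like the constant-length variant. The LRU variant is not shown; it would also need to track when each string was last used.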
Does the decompression stage build the same dictionary as the compression stage?
Yes, but one step out of phase.
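A minimal decompression sketch shows what "one step out of phase" means in practice. The name `lzw_decompress` and the integer codeword list are assumptions for illustration. The decompressor can only add a string once it has seen the first character of the next decoded string, so it may receive a codeword it has not added yet. In that case the string must be the previous string plus its own first character.

```python
def lzw_decompress(codes, alphabet):
    """Sketch of LZW decompression from a list of integer codewords."""
    # Same initial dictionary as the compressor, but codeword -> string.
    dictionary = {i: ch for i, ch in enumerate(alphabet)}
    if not codes:
        return ""
    prev = dictionary[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == len(dictionary):
            # Codeword not added yet: the compressor is one step ahead.
            entry = prev + prev[0]
        else:
            raise ValueError(f"invalid codeword: {code}")
        # Add the string the compressor added one step earlier.
        dictionary[len(dictionary)] = prev + entry[0]
        out.append(entry)
        prev = entry
    return "".join(out)


# Codeword 4 arrives before the decompressor has an entry for it.
print(lzw_decompress([0, 1, 2, 4], "AB"))  # ABABABA
```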
Complexity of compression and decompression?
O(n), where n is the length of the text, as we pass through the text once and update the dictionary at each iteration.