Coding and Information Compression Flashcards
The overall time cost of the Huffman Coding Algorithm depends on the data structure used to hold the forrest
The data structure holding the Huffman forest needs to support these operations:
• Create an initially empty structure
• Insert a tree into the structure
• Delete from the structure, and return, the tree with the smallest count in its root
The Priority Queue abstract data
type, used in many fundamental algorithms:
• Create an empty Priority Queue
• Insert an item into the Priority Queue
• Delete, and return, the item in the Priority Queue with highest priority
The overall time cost of the Huffman Coding Algorithm depends on the data structure used to hold the forrest
The data structure holding the Huffman forest needs to support these operations:
• Create an initially empty structure
• Insert a tree into the structure
• Delete from the structure, and return, the tree with the smallest count in its root
The Priority Queue abstract data
type, used in many fundamental algorithms:
• Create an empty Priority Queue
• Insert an item into the Priority Queue
• Delete, and return, the item in the Priority Queue with highest priority
When using a PQ, remember that the item/element/data used needs to be comparable to other items
For a Huffman Tree, the resulting tree as functions of N:
• How many leaf nodes are there?
How many internal (non-leaf) nodes are there?
How many nodes in the code tree?
N + (N-1) = total number of Nodes
2N-1 nodes
N-1 internal nodes
N is number of leaf nodes
Huffman’s algorithm does construct a binary trie, which represents a coding scheme to use for coding and decoding messages
• But for actually coding and decoding a message, it really isn’t required to use a binary trie
• What you need is:
• a way to go from message symbols to bit sequences, for coding a message
• a way to go from bit sequences to message symbols, for decoding a message
Other alternative Data structures for Huffman coding are:
An array of Strings of “1”’s and “0”s, indexed by message symbol
• A HashMap in which keys are Strings of “1”’s and “0”s, and values are message
symbols
Huffman requires the first step to assess the data and compute statistics about the occurrence of the symbols
Motivation is the most frequent symbol is at the top of the tree and would intrinsically have fewer bits needed to encode.
Breakthrough in Huffman was building this TREE from the leaves up to the root instead of from the root
Trie will be FULL but NOT necessarily COMPLETELY FILLED