Tries Flashcards
Trie Properties
trie height depends on key length k, but not on tree size n
* no rebalancing, deterministic structure
* lexicographic order
* the keys are implicitly contained in tree structure, can be reconstructed from paths
* automatically supports variable-size data
ART (adaptive radix tree) Span and Simple Physical Node Layout
for binary keys, the node fanout can be configured
* at each node, s bits (“span”) of the key are used
* each inner node is simply an array of 2^s pointe
span: number of bits that you look at in each node
How to choose span?
large s: Small height (fast), large space consumption
* small s: Large height (slow), small space consumption
you need k/s nodes (k is key length)
Adaptively Sized Nodes
s = 8: each inner node maps s bits of key to the next child node
physical layouts
Node256: direkt addressing
Node4: store up to 4 pointers
Node16: store up to 16 pointers (look up for key with binary search)
Node48: indirect addressing, can use vector instructions for comparison, 256 child indexes (8 bits each), 48 child pointers
Leaf Nodes options
- single-value leaves
- multi-value leaves
- combined pointer/value slots using pointer tagging (values up to 63 bits)
Height Optimizations for Long Keys
Patricia Trie optimization: remove all one-way nodes
path compresion: merge one-way node nto child node
lazy expansion: remove path to single leaf (F->O->O->(FOO) becomes F->(FOO)), chains that only have one pointer
implementation: each of the nodes gets a prefix and when you do the search, you first compare against the prefix and then continue with normal search
k-Constrained Tries (hight optimized trie HOT)
HOT vs ART: span is dynamic
* let’s fix the maximum fanout k instead of the span
* there are many ways to split a trie into nodes with a maximum fanout of k
HOT
dynamic span
* maximum per-node fanout of k = 32
* no unused pointers through copy on write (replace node for each
insert/delete)
* minimum height
Theoretical Properties
Determinism: for the same set of keys regardless of their insertion order the
resulting HOT will have the same structure and the same set of compound
nodes.
* Minimum height: for a given set of keys and a maximum fanout k no set of
compound nodes can be found, such that the height of the resulting tree is
lower than the height of a HOT with the same maximum fanout k and the
same set of keys.
* Recursive structure: each subtree of a HOT structure is itself a HOT structure
and thus of minimal height.
Binary COmparable keys: Key order
tries store data ordered lexicographically based on binary keys
* this may not be the order one wants
* if order is important, keys must be transformed before each operation
* many trie implementations (including ART, HOT) also require that no key is a prefix of another key (which may need extra effort for variable-length keys)
- how large are the nodes in a trie with span 5?
2^5
what re-balancing algorithms for tries are there?
structure is deterministic, no rebalancing
Trie vs B-tree
B-Tree constant structure, node content adapts to data
Trie constant content, tree structure adapts to data