Advanced Data Structures Flashcards
What are B-Trees?
B-Trees are balanced search trees designed to work well on disks or other storage devices.
They are similar to Red-Black Trees, but they are better at minimizing disk I/O operations.
A B-tree T is a rooted tree (whose root is T.root) having the following properties:
- Every node x has the following attributes:
- x.n, the number of keys currently stored in node x
- the x.n keys themselves, stored in nondecreasing order
- x.leaf, a boolean value that is TRUE if x is a leaf and FALSE if x is an internal node
- Each internal node x also contains x.n + 1 pointers to x.c_1, x.c_2, … x.cₓ.ₙ₊₁ to its children. Leaf nodes have no children, and so their c_i attributes are undefined.
- The keys x.key_i separate the ranges of keys stored in each subtree: if k_i is any key stored in the subtree with root x.c_i, then k_1 <= x.key_1 <= k_2 <= x.key_2 <= … <= x.key_x.n <= k_x.n+1
- All leaves have the same depth, which is tree’s height h
- Nodes have lower and upper bounds on the number of keys they can contain. We express these bounds in terms of a fixed integer t >= 2 called the minimum degree of the B-tree:
a) Every node other than the root must have at least t-1 keys. Every internal node other than the root thus has at least t children. If the tree is nonempty, the root must have at least one key.
b) Every node may contain at most 2t -1 keys. Therefore, an internal node may have at most 2t children. We say that a node is full if it contains exactly 2t-1 keys
What is the difference between B-Trees and Red-Black Trees?
B-Trees can have any number of child nodes, hence the branching factor is much larger
How are new keys inserted into a B-tree?
Inserting a key into a B-tree is significantly more complicated than inserting a key into a binary search tree. As with binary search trees, we search for the leaf position at which to insert the new key. With a B-tree, however, we cannot simply create a new leaf node and insert it, as the resulting tree would fail to be a valid B-tree.
Instead, we insert the new key into an existing leaf node. Since we cannot insert a key into a leaf node that is full, we introduce an operation that splits a full node y (having 2t-1 keys) around its median key y.key_t into tow nodes having only t-1 keys each. The median key moves up into y’s parent to identify the dividing point between the two new trees.