Indexing Flashcards
describe a disk
organised into tracks and sectors
a track in a sector is a block
contiguous blocks are clusters
what is an index
allows us to access data elements directly.
stored as a separate table.
points to an address
ordered
what is a primary index
primary key
ordered file
fixed length
two fields: key and pointer
what is a secondary index
defined on a field of our choosing
what is a pointer
points to a location in memory
what is an anchor record
the first record in each block of the data file
what is a sparse index
has entries for only some of the search values (less than half)
is a primary index sparse
yes, it only contains keys of anchors rather than every search value
what are the advantages of a sparse index
occupies a smaller space
binary search is faster
more index entries fit on a block
what are clustering indexes
records are physically ordered on a nonkey field
speeds up retrieval
is a clustering index sparse
depends if nonkey field is unique or not
nonkey so CAN be more than value for the field
one entry in the clustering index for each distinct value
pointer to the first block
what are secondary indexes
a means of accessing a data file for which primary access already exists
data doesn’t have to be ordered
created on a candidate or non key field
pointer is either a block or a record pointer
is a secondary index dense
yes
if we create a secondary index on a candidate key…
it will be unique but unordered so the index will be dense.
The index will create an ordering that allows us to perform binary search
if we create a secondary index on a non key field…
there will be duplicates
what are our 3 options for dealing with duplicates in a secondary index
1- create duplicate index entries
2- have a variable length index with repeated pointer fields
3- have fixed length index entries and add a layer of indirection
how do we perform binary search on a multi level index
considerer the first-level index file as a sorted data file
create a primary index for the first level
create a secondary index to the first level
because secondary index is on the primary key too, we can use pointers to anchors
what are B and B+ trees (general)
special cases of trees
what is a B tree
an m-way search tree.
each node has multiple indexes and multiple pointers
describe a search tree of order p
node contains p-1 search values
and p pointers
describe key values that contain pointers
we can also make our key values be pointers to disk
what is the rule when deleting from a b tree
if the deletion causes a node to be less than half full, combine with neighbours
what is a B+ tree
an implementation of a dynamic multi level index
- data pointers are only stored at leaf nodes
- leaf nodes are linked (base level index)
- internal nodes = other levels in the index
what is the difference between B trees and B+ trees
Btrees
- contain pointers at various levels.
- values not repeated
B+trees
- retain middle values to guide the search
- pointers only at leaf nodes creating a sequential linked list