Indexing Flashcards

1
Q

describe a disk

A

organised into tracks and sectors

a track in a sector is a block

contiguous blocks are clusters

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

what is an index

A

allows us to access data elements directly.

stored as a separate table.

points to an address

ordered

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

what is a primary index

A

primary key

ordered file

fixed length

two fields: key and pointer

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

what is a secondary index

A

defined on a field of our choosing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

what is a pointer

A

points to a location in memory

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

what is an anchor record

A

the first record in each block of the data file

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

what is a sparse index

A

has entries for only some of the search values (less than half)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

is a primary index sparse

A

yes, it only contains keys of anchors rather than every search value

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

what are the advantages of a sparse index

A

occupies a smaller space

binary search is faster

more index entries fit on a block

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

what are clustering indexes

A

records are physically ordered on a nonkey field

speeds up retrieval

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

is a clustering index sparse

A

depends if nonkey field is unique or not

nonkey so CAN be more than value for the field

one entry in the clustering index for each distinct value

pointer to the first block

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

what are secondary indexes

A

a means of accessing a data file for which primary access already exists

data doesn’t have to be ordered

created on a candidate or non key field

pointer is either a block or a record pointer

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

is a secondary index dense

A

yes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

if we create a secondary index on a candidate key…

A

it will be unique but unordered so the index will be dense.

The index will create an ordering that allows us to perform binary search

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

if we create a secondary index on a non key field…

A

there will be duplicates

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

what are our 3 options for dealing with duplicates in a secondary index

A

1- create duplicate index entries

2- have a variable length index with repeated pointer fields

3- have fixed length index entries and add a layer of indirection

17
Q

how do we perform binary search on a multi level index

A

considerer the first-level index file as a sorted data file
create a primary index for the first level
create a secondary index to the first level

because secondary index is on the primary key too, we can use pointers to anchors

18
Q

what are B and B+ trees (general)

A

special cases of trees

19
Q

what is a B tree

A

an m-way search tree.

each node has multiple indexes and multiple pointers

20
Q

describe a search tree of order p

A

node contains p-1 search values

and p pointers

21
Q

describe key values that contain pointers

A

we can also make our key values be pointers to disk

22
Q

what is the rule when deleting from a b tree

A

if the deletion causes a node to be less than half full, combine with neighbours

23
Q

what is a B+ tree

A

an implementation of a dynamic multi level index

  • data pointers are only stored at leaf nodes
  • leaf nodes are linked (base level index)
  • internal nodes = other levels in the index
24
Q

what is the difference between B trees and B+ trees

A

Btrees

  • contain pointers at various levels.
  • values not repeated

B+trees

  • retain middle values to guide the search
  • pointers only at leaf nodes creating a sequential linked list