Sorting Algorithms Flashcards

1
Q

Describe external sorting

A

Sorting algorithms that are suitable for very large files on disk whose records are not entirely stored in main memory

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Describe the sort-merge strategy

A

The strategy involves sorting small subfiles of the main file known as runs, these runs are then merged into sorted into larger sorted subfiles which are sorted in turn

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Describe the difference in sorting for relations that fit in the memory and those that do not

A

For those that fit, techniques such as quicksort, bubble sort and merge sort are used.
For those that do not, external sort-merge is used

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

In what cases is sorting not done

A

Sorting is not done when there exists an appropriate index on the file, eg primary or clustered, that allows ordered access to the files

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Describe the buffer space in the sort-merge algorithm

A

Like all algorithms, sort-merge requires a buffer space in the main memory where actual sorting and merging of the runs takes place

The buffer space is part of the DBMS cache, an area in the computer main memory that is controlled by the DBMS
The buffer is divided into individual buffers where each buffer is the same size in bytes as the size of one disk block. Each individual buffer corresponds to exactly one block

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Describe the sorting phase of the sort-merge algorithm

A

Runs of the file are read into the main memory (buffer space) sorted using an internal sorting algorithm and then written back to the disk as sorted subfiles.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

How is the number of initial runs required determined?

A

The size of the run and number of initial runs (nR) is determined by the number of file blocks (b) and the available buffer space (nB) using the formula:
nR=⎡(b/nB)⎤

Do example in page 35

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Describe the merging phase in sort-merge

A

The sorted runs are merged in one or more merge passes. These merge passes have one or more merge steps
-one buffer block is needed to hold one disk block from each of the sorted subfiles being merged, and one additional buffer is needed for containing one disk block of the merge result

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is the degree of merging (dM) and how is it determined

A

This is the number of sorted subfiles that can be merged in one merge step
dM is the smaller of (nB-1) and nR

*nB-available buffers nR- number of initial runs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How do you calculate the number of merge passes required

A

Done manually by counting or:
[(log dM (nR))]
Do example on page 37

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Describe the cost analysis of the merge-sort algorithm

A

The cost is measured based on the number of disk block reads and writes before the sorting is done/
To calculate:

(2b)+(2b*(log dM (nR)))

The first part is for the sorting phase. Each block is accessed twice (reading into main memory and writing back into disk)

The second part is for the merging phase. Similar to the sorting phase, each block is accessed twice. We the multiply by the number of merge passes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Do example in 41

A

irh

How well did you know this?
1
Not at all
2
3
4
5
Perfectly