Sorting Algorithms Flashcards

Question 1

Q

Describe external sorting

Answer

A

Sorting algorithms that are suitable for very large files on disk whose records are not entirely stored in main memory

Question 2

Q

Describe the sort-merge strategy

Answer

A

The strategy involves sorting small subfiles of the main file known as runs, these runs are then merged into sorted into larger sorted subfiles which are sorted in turn

Question 3

Q

Describe the difference in sorting for relations that fit in the memory and those that do not

Answer

A

For those that fit, techniques such as quicksort, bubble sort and merge sort are used.
For those that do not, external sort-merge is used

Question 4

Q

In what cases is sorting not done

Answer

A

Sorting is not done when there exists an appropriate index on the file, eg primary or clustered, that allows ordered access to the files

Question 5

Q

Describe the buffer space in the sort-merge algorithm

Answer

A

Like all algorithms, sort-merge requires a buffer space in the main memory where actual sorting and merging of the runs takes place

The buffer space is part of the DBMS cache, an area in the computer main memory that is controlled by the DBMS
The buffer is divided into individual buffers where each buffer is the same size in bytes as the size of one disk block. Each individual buffer corresponds to exactly one block

Question 6

Q

Describe the sorting phase of the sort-merge algorithm

Answer

A

Runs of the file are read into the main memory (buffer space) sorted using an internal sorting algorithm and then written back to the disk as sorted subfiles.

Question 7

Q

How is the number of initial runs required determined?

Answer

A

The size of the run and number of initial runs (nR) is determined by the number of file blocks (b) and the available buffer space (nB) using the formula:
nR=⎡(b/nB)⎤

Do example in page 35

Question 8

Q

Describe the merging phase in sort-merge

Answer

A

The sorted runs are merged in one or more merge passes. These merge passes have one or more merge steps
-one buffer block is needed to hold one disk block from each of the sorted subfiles being merged, and one additional buffer is needed for containing one disk block of the merge result

Question 9

Q

What is the degree of merging (dM) and how is it determined

Answer

A

This is the number of sorted subfiles that can be merged in one merge step
dM is the smaller of (nB-1) and nR

*nB-available buffers nR- number of initial runs

Question 10

Q

How do you calculate the number of merge passes required

Answer

A

Done manually by counting or:
[(log dM (nR))]
Do example on page 37

Question 11

Q

Describe the cost analysis of the merge-sort algorithm

Answer

A

The cost is measured based on the number of disk block reads and writes before the sorting is done/
To calculate:

(2b)+(2b*(log dM (nR)))

The first part is for the sorting phase. Each block is accessed twice (reading into main memory and writing back into disk)

The second part is for the merging phase. Similar to the sorting phase, each block is accessed twice. We the multiply by the number of merge passes.

Question 12

Q

Do example in 41

Sorting Algorithms Flashcards

(12 cards)