sorting Flashcards

1
Q

why is external sorting needed?

A

files are stored on the disk and it’s too large for in-memory sorting methods

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

in multi-way sort there are multiple passes

what does pass 0 do?
what about all the subsequent passes?

A

pass 0 is used to read a single page and sort it

subsequent passes keep X buffers X-1 buffers for input and 1 buffer for output, these will sort the pages between themselves

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

in a 2-way external merge sort how do you calculate
1. the number of passes
2. total I/O pages

A
  1. ceil(log2(N)) + 1
  2. 2N( ceil ( log2 (N) ) + 1 )
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

in a general external merge sort how do you calculate where you have B buffers
1. the number of passes
2. total I/O pages

A
  1. ceil ( logB-1 (N/B) ) + 1
  2. 2N ( ceil (logB-1 (N/B) + 1 )
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

what needs to be the case for us to be able to reduce I/O time per page

A

if the pages are sequentially stored we can reduce I/O time per page by having larger input buffers. We will take more pages in at a time. However it will cost additional seek time and rotational delay

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Regarding blocked buffers and their sorting costs, if
buffer (memory) size is B pages
input buffer block size is b pages
N is the number of pages

what is the
1. number of input buffers
2. number of runs after pass 0
3. I/O pages

A
  1. (B-1)/b
  2. ceil (N/B)
  3. 2N ( 1 + |logB-1/b ( ceil (N/b) ) | )
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

in blocked buffer sorting, a larger input buffer size does what?

A

a larger input buffer block size reduces fan in , which increases passes and I/O pages but reduces the time per I/O page

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

true or false: clustered B+ trees are always better for sorting than external sorting

A

true

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

true or false: unclustered B+ Trees are always better for sorting than external sorting

A

false, it could be more expensive than external sorting

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

true or false: iterating over unclustered index can be faster than external sorting

A

True. if the # of records are low enough, iterating over unclustered indices can be faster. However, the number of records per page are normally large enough enough so that external sorting is faster.

  • slide 25
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

how do you calculate the # of runs of B pages?

B = buffer pages

A

ceil ( N / B ) runs of B pages

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

what is the formula to calculate the # of runs merged

A

ceil ( (B-1) / b )

b: is input buffer block size.
B: # of buffer pages

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

if we have a larger b (input buffer block),
- does it increase or decrease I/O time per page?
- does it increase or decrease the # of runs merged?
- does it increase or decrease the number of passes?

b is

A

Larger b means:
- less I/O time per page
- smaller # of runs merged
- more passes and I/O pages

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

do we prefer an internal sort that produces longer runs or shorter runs

A

longer runs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

how many pages does replacement sort produce per page

A

Replacement sort produce runs of 2*B
pages each

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

describe the steps to replacement sort

A
  • Repeatedly append value k in Current Set to
    Output, where k is smallest >= max value in
    Output.
  • Terminate current run when no value in
    Current Set is >= max value in Output
17
Q

review the drawings on slide 29 because I didn’t draw them and girlfriend did

A