Query Processing Flashcards

1
Q

What are the three basic steps in query processing?

A
<ol>
<li>Parser and translator</li>
<li>Optimiser</li>
<li>Evaluation</li>
</ol>
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What happens at the parser and translator stage in query processing?

A

<ul>
<li>Syntax is checked and referenced relations are verified</li>
<li>Query parsed and translated into relational algebra</li>
</ul>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What happens at the optimiser stage in query processing?

A

<ul>
<li>Query converted to more efficient form</li>
<li>Query evaluation plan is created</li>
</ul>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What happens in the evaluation stage in query processing?

A

Query evaluation plan is executed and the answer is returned

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is contained within a query evaluation plan?

A

Evaluation primitives which are relational algebra operations annotated with execution guidance

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are the cost of execution plans estimated using?

A

Statistical information/estimation for

<ul>
<li>the data dictionary (number of tuples etc)</li>
<li>intermediate results for complex expressions</li>
<li>evaluation estimation</li>
</ul>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is a theta join?

A

Cartestian product combined with a select statement

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are the heuristics for getting a sensible equivalent SQL statement?

A

<ol>
<li>Perform selections as early as possible</li>
<li>Change conjunctive selection to nested selections</li>
<li>Perform projections as early as possible</li>
<li>Re-order joins to minimise intermediate data</li>
</ol>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is conjunctive selection?

A

Where two predicates are applied as an and statement, as opposed to applying one predicate then applying the next

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What are the two factors in estimating query sizes?

A

<ul>
<li>Size of relations</li>
<li>Distribution of values in tuples</li>
</ul>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What statistics stored in the systems catalogue can be useful for estimating the query size?

A
<ul>
<li>Number of tuples in a relation</li>
<li>Size of a tuple in a relation</li>
<li>Number of distinct values appearing for an attribute</li>
</ul>
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

How do we find the size of a cartesian product?

A

(number of tuples in q and r) times (the bytes per tuple for each table)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

How do we estimate the size of a selection?

A

(number of tuples)/(number of values they can taken)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What does the estimation of size of selection assume?

A

Uniform distribution of values that the attributes do take

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is the estimation of size of a natural join of two relations if they have no overlapping elements?

A

The size of their relations multipled together

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is the estimation of the size of a natural join of two relations if the common elements is a key for the first relation?

A

No more tuples than in the other relation

17
Q

What is the estimation of the size of a natural join of two relations if none of the clever conditions apply (no overlap or overlap is key of a relation)?

A

Number of tuples in one relation times number of tuples in the othr all divided by the number of values the attribute that is the common attribute can take

18
Q

What is cost generally measured as when looking at queries?

A

Total elapsed time for answering the query

19
Q

What is taken into account when looking at access costs?

A

<ul>
<li>Number of seeks * average seek cost</li>
<li>Number of blocks read * average block read cost</li>
<li>Number of blocks written * average block write cost</li>
</ul>

20
Q

What is tT?

A

Time to transfer one block

21
Q

What is tS?

A

Time for one seek

22
Q

What is br?

A

The number of blocks containing tuples of relation r

23
Q

What is fr?

A

The number of tuples of relation r that fit into one block (blocking factor)

24
Q

What is hi?

A

The height of the index

25
Q

What is the cost estimate for direct selections where the attribute does not have an index on it?

A

The time to seek a block + time to transfer a block * number of blocks needed

26
Q

What is the cost estimate for direct selections where the attribute does have a primary index on it and the attribute is a key?

A

(The height of the index plus one) times (the time to transfer + the time to seek a block)

27
Q

What is the cost estimate for direct selections where the attribute does have a primary index on it but the attribute is not a key?

A

(the height of the index)*(time to transfer + time to seek) + time to seek + (number of blocks with matching records * time to transfer)

28
Q

What is the cost estimate for direct selections where there is a secondary index on the attribute?

A

(height of the index + number of matching records) * (time to transfer + time to seek)

29
Q

How are comparison selections performed when there is no index on A?

A

Each block is scanned and all records are tested by the condition

30
Q

How are comparison selections performed when there is a primary index on A?

A

Scan sequentially from either the start ()

31
Q

How are comparison selections performed when there is a secondary index on A?

A

Linear file scan might be cheaper but find the index pointing to a record where the first of the condition holds then scan the INDEX sequentially from there

32
Q

What are evaluation primitives?

A

Relational algebra operations annotated with execution guidance