Query Processing Flashcards

1
Q

What are the three basic steps in query processing?

A
<ol>
<li>Parser and translator</li>
<li>Optimiser</li>
<li>Evaluation</li>
</ol>
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What happens at the parser and translator stage in query processing?

A

<ul>
<li>Syntax is checked and referenced relations are verified</li>
<li>Query parsed and translated into relational algebra</li>
</ul>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What happens at the optimiser stage in query processing?

A

<ul>
<li>Query converted to more efficient form</li>
<li>Query evaluation plan is created</li>
</ul>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What happens in the evaluation stage in query processing?

A

Query evaluation plan is executed and the answer is returned

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is contained within a query evaluation plan?

A

Evaluation primitives which are relational algebra operations annotated with execution guidance

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are the cost of execution plans estimated using?

A

Statistical information/estimation for

<ul>
<li>the data dictionary (number of tuples etc)</li>
<li>intermediate results for complex expressions</li>
<li>evaluation estimation</li>
</ul>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is a theta join?

A

Cartestian product combined with a select statement

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are the heuristics for getting a sensible equivalent SQL statement?

A

<ol>
<li>Perform selections as early as possible</li>
<li>Change conjunctive selection to nested selections</li>
<li>Perform projections as early as possible</li>
<li>Re-order joins to minimise intermediate data</li>
</ol>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is conjunctive selection?

A

Where two predicates are applied as an and statement, as opposed to applying one predicate then applying the next

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What are the two factors in estimating query sizes?

A

<ul>
<li>Size of relations</li>
<li>Distribution of values in tuples</li>
</ul>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What statistics stored in the systems catalogue can be useful for estimating the query size?

A
<ul>
<li>Number of tuples in a relation</li>
<li>Size of a tuple in a relation</li>
<li>Number of distinct values appearing for an attribute</li>
</ul>
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

How do we find the size of a cartesian product?

A

(number of tuples in q and r) times (the bytes per tuple for each table)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

How do we estimate the size of a selection?

A

(number of tuples)/(number of values they can taken)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What does the estimation of size of selection assume?

A

Uniform distribution of values that the attributes do take

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is the estimation of size of a natural join of two relations if they have no overlapping elements?

A

The size of their relations multipled together

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is the estimation of the size of a natural join of two relations if the common elements is a key for the first relation?

A

No more tuples than in the other relation

17
Q

What is the estimation of the size of a natural join of two relations if none of the clever conditions apply (no overlap or overlap is key of a relation)?

A

Number of tuples in one relation times number of tuples in the othr all divided by the number of values the attribute that is the common attribute can take

18
Q

What is cost generally measured as when looking at queries?

A

Total elapsed time for answering the query

19
Q

What is taken into account when looking at access costs?

A

<ul>
<li>Number of seeks * average seek cost</li>
<li>Number of blocks read * average block read cost</li>
<li>Number of blocks written * average block write cost</li>
</ul>

20
Q

What is tT?

A

Time to transfer one block

21
Q

What is tS?

A

Time for one seek

22
Q

What is br?

A

The number of blocks containing tuples of relation r

23
Q

What is fr?

A

The number of tuples of relation r that fit into one block (blocking factor)

24
Q

What is hi?

A

The height of the index

25
What is the cost estimate for direct selections where the attribute does not have an index on it?
The time to seek a block + time to transfer a block * number of blocks needed
26
What is the cost estimate for direct selections where the attribute does have a primary index on it and the attribute is a key?
(The height of the index plus one) times (the time to transfer + the time to seek a block)
27
What is the cost estimate for direct selections where the attribute does have a primary index on it but the attribute is not a key?
(the height of the index)*(time to transfer + time to seek) + time to seek + (number of blocks with matching records * time to transfer)
28
What is the cost estimate for direct selections where there is a secondary index on the attribute?
(height of the index + number of matching records) * (time to transfer + time to seek)
29
How are comparison selections performed when there is no index on A?
Each block is scanned and all records are tested by the condition
30
How are comparison selections performed when there is a primary index on A?
Scan sequentially from either the start ()
31
How are comparison selections performed when there is a secondary index on A?
Linear file scan might be cheaper but find the index pointing to a record where the first of the condition holds then scan the INDEX sequentially from there
32
What are evaluation primitives?
Relational algebra operations annotated with execution guidance