Query Processing Flashcards
What are the three basic steps in query processing?
<ol> <li>Parser and translator</li> <li>Optimiser</li> <li>Evaluation</li> </ol>
What happens at the parser and translator stage in query processing?
<ul>
<li>Syntax is checked and referenced relations are verified</li>
<li>Query parsed and translated into relational algebra</li>
</ul>
What happens at the optimiser stage in query processing?
<ul>
<li>Query converted to more efficient form</li>
<li>Query evaluation plan is created</li>
</ul>
What happens in the evaluation stage in query processing?
Query evaluation plan is executed and the answer is returned
What is contained within a query evaluation plan?
Evaluation primitives which are relational algebra operations annotated with execution guidance
What are the cost of execution plans estimated using?
Statistical information/estimation for
<ul>
<li>the data dictionary (number of tuples etc)</li>
<li>intermediate results for complex expressions</li>
<li>evaluation estimation</li>
</ul>
What is a theta join?
Cartestian product combined with a select statement
What are the heuristics for getting a sensible equivalent SQL statement?
<ol>
<li>Perform selections as early as possible</li>
<li>Change conjunctive selection to nested selections</li>
<li>Perform projections as early as possible</li>
<li>Re-order joins to minimise intermediate data</li>
</ol>
What is conjunctive selection?
Where two predicates are applied as an and statement, as opposed to applying one predicate then applying the next
What are the two factors in estimating query sizes?
<ul>
<li>Size of relations</li>
<li>Distribution of values in tuples</li>
</ul>
What statistics stored in the systems catalogue can be useful for estimating the query size?
<ul> <li>Number of tuples in a relation</li> <li>Size of a tuple in a relation</li> <li>Number of distinct values appearing for an attribute</li> </ul>
How do we find the size of a cartesian product?
(number of tuples in q and r) times (the bytes per tuple for each table)
How do we estimate the size of a selection?
(number of tuples)/(number of values they can taken)
What does the estimation of size of selection assume?
Uniform distribution of values that the attributes do take
What is the estimation of size of a natural join of two relations if they have no overlapping elements?
The size of their relations multipled together
What is the estimation of the size of a natural join of two relations if the common elements is a key for the first relation?
No more tuples than in the other relation
What is the estimation of the size of a natural join of two relations if none of the clever conditions apply (no overlap or overlap is key of a relation)?
Number of tuples in one relation times number of tuples in the othr all divided by the number of values the attribute that is the common attribute can take
What is cost generally measured as when looking at queries?
Total elapsed time for answering the query
What is taken into account when looking at access costs?
<ul>
<li>Number of seeks * average seek cost</li>
<li>Number of blocks read * average block read cost</li>
<li>Number of blocks written * average block write cost</li>
</ul>
What is tT?
Time to transfer one block
What is tS?
Time for one seek
What is br?
The number of blocks containing tuples of relation r
What is fr?
The number of tuples of relation r that fit into one block (blocking factor)
What is hi?
The height of the index