Query Processing Flashcards
What are the three basic steps in query processing?
<ol> <li>Parser and translator</li> <li>Optimiser</li> <li>Evaluation</li> </ol>
What happens at the parser and translator stage in query processing?
<ul>
<li>Syntax is checked and referenced relations are verified</li>
<li>Query parsed and translated into relational algebra</li>
</ul>
What happens at the optimiser stage in query processing?
<ul>
<li>Query converted to more efficient form</li>
<li>Query evaluation plan is created</li>
</ul>
What happens in the evaluation stage in query processing?
Query evaluation plan is executed and the answer is returned
What is contained within a query evaluation plan?
Evaluation primitives which are relational algebra operations annotated with execution guidance
What are the cost of execution plans estimated using?
Statistical information/estimation for
<ul>
<li>the data dictionary (number of tuples etc)</li>
<li>intermediate results for complex expressions</li>
<li>evaluation estimation</li>
</ul>
What is a theta join?
Cartestian product combined with a select statement
What are the heuristics for getting a sensible equivalent SQL statement?
<ol>
<li>Perform selections as early as possible</li>
<li>Change conjunctive selection to nested selections</li>
<li>Perform projections as early as possible</li>
<li>Re-order joins to minimise intermediate data</li>
</ol>
What is conjunctive selection?
Where two predicates are applied as an and statement, as opposed to applying one predicate then applying the next
What are the two factors in estimating query sizes?
<ul>
<li>Size of relations</li>
<li>Distribution of values in tuples</li>
</ul>
What statistics stored in the systems catalogue can be useful for estimating the query size?
<ul> <li>Number of tuples in a relation</li> <li>Size of a tuple in a relation</li> <li>Number of distinct values appearing for an attribute</li> </ul>
How do we find the size of a cartesian product?
(number of tuples in q and r) times (the bytes per tuple for each table)
How do we estimate the size of a selection?
(number of tuples)/(number of values they can taken)
What does the estimation of size of selection assume?
Uniform distribution of values that the attributes do take
What is the estimation of size of a natural join of two relations if they have no overlapping elements?
The size of their relations multipled together