DBMS - Query Optimization and Query Processing Flashcards
Query Language
Dedicated, High level language for accessing and updating the contents of a database
Query Models
Closely associated to their underlying data model
SELECT Statement based on what model
Relational model
Query Languages are either
Declarative or Procedural (RA)
If a query is specified in text
DBMS transforms it to a number of graphs
Why are declarative queries good
- Design the meaning of a query rather than implementation details
- Shift job of finding a better implementation to runtime
Disadvantage of Declarative queries
DBMS needs to dedicate more resources to compile
Declarative constructs must be converted to
Procedural constructs using RA
Are there any declarative queries that cannot be converted to procedural?
SELECT and GROUP BY for example
Canonical Translation from query to RA
- Run a sequence of products through all tables listed in the FROM clause
- Sift result with the WHERE clause conjuncts
- Project the output
Can the canonical translation be proven correctly
Yes and easily
Access Plan
Also referred to instead of the canonical translation and any RA expression in Query Processing
Access Plans are procedural and follow a data flow mechanism
Query Optimizer’s Role
To manipulate access plans into better plans. Better is not neccesarily the best since QO is np hard
What is the output of a query optimizer called
Query Execution Plan
DBMS components of QO and QP
Query Language interface, buffer manager (RAM buffer), Storage Manager, File Structures.
QO Environment of Interest
- One storage manager
- If multicore, parallelism is provided by other layers
Compilation of a non procedural query have some added procedures, ex:
- Typical scanning, parsing, type checking
- QP and QO converts query into detailed representation (query graph) and choosing execution plan that safely and efficiently implements data access.
QO Basic Plan
In Non-Trivial queries, there exists a number of possible solutions. For general queries it is an np problem. Heuristics and statistics may be used.
When a solution requires extensive work
base a quick solution on heuristics to offer an approximate solution that effectively has limited optimality and completeness.
Search space is
all the possible permutations of a query. Huge for most general queries.
Algebraic select can be implemented with
serial scans, B Tree, hash, index…
How to decide which options to prune in search space
Use statistics
Why do we use heuristics and statistics
To evaluate potential of access plans to be selected as the optimal execution plan. QO must search and generate access plans, estimate their cost of parts and whole plan.
No optimizer really optimizes
SQL QO can give good query execution plans, but not guaranteed to be the best.
Output of QO
passed to query processor to execute
Normalization
Assume the input query is restricted.
- Does Lexical and syntax analysis, grammar and type checks
- If good query is converted to normal form (PRENEX)
- Query can be transformed to CNF or DNF
CNF
Prefers AND
DNF
Prefers ORs
PRENEX
A query from where all quantifiers precede a quantifier free qualification.
Simplification
One can use different queries that will return the same result quicker for simplification. Use boolean rules.
Semantic Analysis
Refute incorrect queries (ex. contradictions)