Query Processing Flashcards
Query Processing
This is where we look at the query compiler, and execution engine. This is responsible for transforming SQL queries into sequences of database operations and then execute the queries.
Queries
In queries, we typically tell the DBMS what we want, not how we want it.
We’re typically going to look at select statements.
Relational Algebra
Set of operations that can be applied to relations to compute new relations (things like select, projection, natural join, union etc).
Relational Algebra Expression
Often called a logical query plan, and are often represented as trees.
Logical Query Plan example
SELECT department, name
FROM Stores, Employees
WHERE department=worksAt AND city=’Liverpool’
can be converted to
π_(department, name)(σ_(department=worksAt AND city=’Liverpool’)(Stores x Employees)
σ
Selection, used for conditions.
π
Projection, restricts attributes in attribute list.
ρ
Renaming, very simply used for representing one thing as another (like using AS in SQL).
✕
Cartesian product, pairs each tuple together (like using SELECT DISINCT * FROM _)
Query Plan steps
- Compute an intermediate result for each node
- For a leaf labeled with relation R, the intermediate result is R
- For an inner node labeled with operator op, get the intermediate result by applying op to the children’s intermediate results.
- Result of query -> intermediate result of the root.
Equijoins
Represented as R ⋈_a=b S.
Joins things together based upon attributes given (in this case, a and b).
Merging (merge sort)
Way of understanding equijoin:
1) Check values of a and b. If a is smaller than b, move down in a, and if b is smaller than a, move down in b.
2) If a and b are equal, we merge the cell of a and b, and move down in b.
3) Repeat
If there are duplicates in a, we remember that they are duplicates and repeat the b table again
Fast Join Algorithms
There is a fast join algorithm which is the sort join algorithm, and can be a fast way of computing R ⋈_a=b S.
Other Join Algorithms
- Index join
- Hash join
- Multiway join
Relations on disk
Relations are stored on disk, and RAM is not often big enough to store everything, but if we stored everything on disk then it would take longer than if on RAM.