5. Relational Algebra Flashcards
Why do we need relational algebra (in addition to SQL)?
SQL is a declarative programming language, i.e. you specify what you want, but you don’t have to specify how to do it.
When we study query optimization, we want a way to express the many different valid plans a database can use to execute a query. For this, we will use Relational Algebra, a procedural programming language (specifies exactly what operators to use and in what order)
What inputs and outputs have operators in relational algebra
All of the operators in relational algebra take in a relation and output a relation.
What operators relational algebra has? (9)
Projection (π)
Selection (σ)
Union (∪)
Set Difference (-)
Intersection (∩)
Cross Product (×)
Joins
Rename (ρ)
Group By / Aggregation (γ)
Projection (π) - explain
selects only the columns specified.
π name (dogs) - get only name column from dogs relation
equivalent to: SELECT name FROM dogs ;
Selection (σ) - explain
used to filter rows based on a certain condition.
σ age=12 (π name,age (dogs))
is equivalent to:
SELECT name, age FROM dogs WHERE age = 12;
Union (∪) - explain
we take all the rows from each relation and combine them removing duplicates along the way. Same as union in SQL.
E.g.
π name (dogs) ∪ π name (cats)
would return a set of names of both cats and dogs
Set Difference (-) - explain
It returns every row in the first table except the rows that also show up in the second table
E.g.
π name (dogs) − π name (cats)
will show names included in the dogs table and at the same time not included in the cats table.
Intersection (∩) - explain
it only keeps rows that occur in both tables in the intersection
E.g.
π name (dogs) ∩ π name (cats)
will return names included in both dogs and cats
Cross Product (×) - explain
Create one tuple for every possible pair of tuples (with removing duplicates).
E.g.
dogs × parks
will return a set of all possible dog-park pairs
so it we have N dogs and M parks, we’ll get N * M results
Joins - explain
Equivalent to SQL joins.
joins can actually be derived from just a cross product (×) and conjunction of selections (σ)
Rename (ρ) - explain
accomplishes the same thing as aliasing in SQL
E.g.
π dname( ρ name−>dname (dogs))
will return names of dogs, but the column name will be dname instead.
Group By / Aggregation (γ) - explain
equivalent to using the GROUP BY and HAVING clauses in SQL.
E.g.
γ age,COUNT(∗)>5 (dogs)
is equivalent to
SELECT ∗ FROM dogs GROUP BY age HAVING COUNT( ∗ ) > 5;
How duplicates are handled in relational algebra
In relational algebra, relations can’t have duplicates at all, so they are removed on every step.