Query Processing Flashcards
Codd defintions of relational algebra: What is a product?
create longer relations by combining others
Codd defintions of relational algebra: What is a select?
filter tuples of a longer relation, using a predicate
Codd defintions of relational algebra: What is a project?
creater shorter relation, with fewer places
Codd defintions of relational algebra: What is a join?
similar to product, but merging one or more places
What is a Natural Join of 2 tables?
joining two tables on same-named columns, matching attributes, omits rows with no match
What is an Inner Join of 2 tables?
joins tables on two explicitly named columns, not necessarily with same name
What is a Left outer (right outer) Join of 2 tables?
includes LH (RH) rows with no matches in the RH (LH) tables
WHat is a full outer join of 2 tables?
Union of left and right outer joins.
What are the main differences between SQl and relational algebra?
SQL returns lists, not sets
SQL has additional syntax for defining, manipulating and querying tables
SQL has poorly chosen operation names ( SQL select does project, SQL where does select)
WHat is an aggregate function?
Non-relational part of sql
summarising sets of data to yield statistics
Why are aggregate functions useful?
Summarising data is what users seem to want
What are somethings that cause issues with query efficiency?
- how many rows need to be found (time)
- how difficult it is to find them (joins)
- where to store intermediate results (space)
What are some possible approaches to improve query optimization?
- structure queries using subqueries
- filter large data sets before joining
- project columns before joining
What are the benefits of indexing in terms of searching data sets?
index refers to row position in a large file can offer faster random access to a row by its index
What method do we use to handle very large data sets?
B-tree; a memory graph that maps primary keys to row indices