Post midterm concepts Flashcards
What is the SQL syntax order and the SQL execution order?
SYNTAX: SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY.
EXECUTION: FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY.
Difference between filtering in HAVING and WHERE?
HAVING: quicker because it sorts attributes in the SELECT statement. Often it is optimal to use over WHERE.
WHERE: slower because it sorts attributes across the entire base table.
What should you do if you want to update two kinds of values and the update of one relies on the updated values of the other.
You must separate them into two UPDATE statements. Otherwise, it will not work as multiple updates in UPDATE happen simultaneously, not in order.
What is an equi-join?
an inefficient inner join. It’s a cross join and then non matching records are deleted. It is done by putting all tables in the FROM clause and specifying PK-FK relations in the WHERE clause.
Cross Join
All possible row joins are joined.
Inner Join
table records overlapping on an attribute are joined.
Left Join
All overlapping records on an attribute are joined as well as all other records from the Left table. The non matching records are matched with null records.
Full Join
All overlapping records on an attribute are joined as well as all other records from both tables. Those that do not have a matching row are a totally null record.
Union
two tables’ records are stacked on top of each other.
Outer Join
records from each table that do not match.
3 prerequisites for union?
- all query results must have the same number of columns.
- All columns must have the same sequence.
- All corresponding columns must share a common datatype.
UNION vs. UNION ALL
UNION removes duplicate rows while UNION ALL is more inclusive. UNION ALL is more efficient and faster as it cuts out a process.
Which two syntax elements only appears once when a UNION is used?
The ORDER BY clause only appears at the end and the aliases only appear in the first SELECT clause.
Non-correlated Subquery
Inner query is executed first and once. The outer query is then compared to the Inner query. The attribute of comparison and the attribute selected in the inner query must match. IN is less optimal than EXISTS because it checks all records in the inner query.
Ex: “WHERE EmpID IN (…);” or
“WHERE EmpID = (…);”
Correlated Subquery
Inner query is executed with every record of the outer query. EXISTS is more optimal than IN because it looks for matching values and when a match is found, it stops looking.
Ex: “WHERE EXISTS (…);”
How would you do an inner join using subqueries?
Put the other table in a subquery and find overlapping records.
Views, CTEs, and Derived Tables all come from what data language?
DML; data manipulation language.
What is a view? Why should we use it? Does it saved across multiple sessions? What is the Syntax for creating/dropping it?
- A view is a pre-compiled, saved query that can be referenced as a table.
- It enhances security and productivity while reducing training.
- Yes it persists across multiple sessions.
- CREATE VIEW view_name AS (…) & DROP VIEW view_name.
What is a CTE?
A CTE is a common table expression that exists in memory only in the instance that the query is run.
What is a derived table?
A subquery; usually in parenthesis and in the FROM clause.
How are RANK() and ROW_NUMBER() used syntactically? What is the difference? How are they used?
- ROW_NUMBER() COUNT(*) OVER(PARTITION BY attribute) AS ‘Row_no.’
- RANK() can have ties while ROW_NUMBER() just orders results in the order they are generated.
- They are typically put in an inner query, CTE, or view and attributes are put in the SELECT statement and ‘Row_no’ is put in the ORDER BY or WHERE statement of outer query.
Difference between Data Warehouse and Database?
Data Warehouse has fewer uses, read-only, not 3NF, and derived from the database.