Chapter 6 - Window Tables and Window Functions in SQL Flashcards
What is the idea of Windowed Table functions and how do they compare with traditional set functions?
- Operates on a Window of table, calculating one value for each row based on the other rows in that Window (moving/cumulative aggreegate values) -> tuple-based aggregation.
- Grouping with set functions -> one result per group
Explain the concept of window table, window and window function.
- Table can have multiple independent windows associated
- Window: defines for each row, a set of related rows used to compute additional atts
- Window function: applied over each row, together with other rows in the win, returning single value.
What are the elements of a window definition and how are they expressed in SQL?
Window partitioning, window ordering, window frame
- Window partitioning: Forming groups, but rows are retained.
- Window ordering: defines order of rows within partition
- Window frame: Defined relative to each row, further restricts set of rows
SELECT… FUNCTION(column) OVER (PARTITION BY expr1 ORDER BY expr2 (ROWS/RANGE)
How does the partitioning clause work?
- Defined like GROUP BY, with list of columns
-> Each group is a partition
(Windows are limited by partition boundaries)
Is it possible to apply set functions as window functions?
- Yes: set function + OVER = window function
- > aggregate value is computed for each row
- > if no frame specified, value is unique for a whole partition (like GROUP BY)
How does the ordering clause work?
- > Defines ordering with partition (independent from query ORDER BY -> no impact on sort)
- > List of columns: name (ASC/DESC) (NULLS FIRST/ NULLS LAST)
- > required for frame specification and ranking function
What is the functionality provided by ranking functions?
-> provide, based on the ordering, the rank of each row in the partition
Enumerate and explain the different kind of ranking functions.
RANK: draws are ranked the same, skipping positions -> 1,2,2,4
DENSE-RANK: draws are ranked the same, without skips -> 1,2,2,3
ROW_NUMBER: draws are ranked differently (non-deterministic) -> 1,2,3,4
How do ranking functions and window functions in general relate to other clauses in the query?
window functions are computed after FROM, WHERE, GROUP BY, HAVING -> cannot be referenced
- > possible to apply window function to rows aggregated by GROUP BY
- > set aggregate functions can be used in ordering clause (groups defined in GROUP BY)
What is the idea behind cumulative functions?
if no frame is specified, the value for each row is computed using all the previous (requires ordering!) rows in the partition -> cumulative effect
What are window frames and their different kinds supported in SQL?
refines the set of rows used to apply window function with ranges of values or row count
(ROWS, RANGE) (| BETWEEN AND )
What are the differences and pros/cons between RANGE and ROWS as frame specifications?
RANGE is better suited if duplicate/missing values occur (not dense data)
-> or situations in which groups must be made based on values
How can we explicitly define a window?
-> Provide names to windows, allowing the same window to be used in different functions
…f(arg) OVER w…
WINDOW w AS (partition by … order by…)
How do the OLAP operations fit in the query processing stages?
WINDOW functions are processed in SELECT clause = projection
- > after product, selection, grouping aggregation
- > allows to create partitions based on groups generated by GROUP BY
Give an overview of additional OLAP capabilities.
-> Hypothetical aggregate functions: RANK (col1, col2, col3)
(how would this tuple rank in this group?)
-> inverse distribution functions: percentile_ (DISC|CONT) (expr)
* return values of expressions in order by clause that correspond to percent value 0…1