Chapter 5 Grouping and Windowing Flashcards

Question

What is the CUBE clause and how do you use it?

Answer 1

The CUBE clause accepts a list of expressions as inputs and defines all possible grouping sets that can be generated from the inputs, including the empty grouping set, e.g. GROUP BY CUBE (shipperid, YEAR(shippeddate)); This produces 4 grouping sets: (1) (shipperid), (2) (YEAR(shippeddate)), (3) (shipperid, YEAR(shippeddate)), (4) () (empty).
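The "all possible grouping sets" rule is the power set of the inputs. A minimal Python sketch of that enumeration follows; the helper name `cube_grouping_sets` is hypothetical and only models which grouping sets CUBE defines, not how SQL Server evaluates them.

```python
from itertools import combinations

def cube_grouping_sets(expressions):
    """Model of CUBE: every subset of the inputs, largest first, ending with the empty set."""
    sets = []
    for size in range(len(expressions), -1, -1):
        sets.extend(combinations(expressions, size))
    return sets

# Two inputs give 2**2 = 4 grouping sets, as in the card above.
for grouping_set in cube_grouping_sets(["shipperid", "YEAR(shippeddate)"]):
    print(grouping_set)
```

With n inputs the sketch returns 2\*\*n grouping sets, which is why CUBE output grows quickly as inputs are added.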

Answer 2

The NTILE function arranges the rows within the partition into a requested number of roughly equally sized tiles based on the specified ordering. You specify the desired number of tiles as input to the function, e.g. NTILE(100). If there are 830 rows in the result set, the base tile size is 830 / 100 = 8 with a remainder of 30. Because there is a remainder, each of the first 30 tiles is assigned an extra row (9 rows each), and the remaining 70 tiles have 8 rows each.
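The tile-size arithmetic can be checked with a short Python sketch. The function name `ntile_sizes` is hypothetical; it only models how many rows land in each tile, under the rule described in the card.

```python
def ntile_sizes(row_count, tiles):
    """Return the number of rows in each tile: the first `remainder` tiles get one extra row."""
    base, remainder = divmod(row_count, tiles)
    return [base + 1 if i < remainder else base for i in range(tiles)]

sizes = ntile_sizes(830, 100)
print(sizes[:30] == [9] * 30)   # first 30 tiles hold 9 rows each
print(sizes[30:] == [8] * 70)   # remaining 70 tiles hold 8 rows each
print(sum(sizes))               # 830, every row is assigned to a tile
```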

Answer 3

Because all elements besides the aggregation and spreading elements are implicitly used for grouping. By using a table expression, you control which columns are used for grouping.

Answer 4

Pivot rotates data from a state of rows to a state of column headers. Unpivot rotates the data from a state of column headers to a state of row values.

Answer 5

The LAG function returns an element from the row in the current partition that is a requested number of rows \*before\* the current row. The LEAD function returns an element from the row that is in the requested offset \*after\* the current row. If no explicit offset is specified, both functions use a default of 1. If you want a different offset, specify it as the second argument, e.g. LAG(val, 3). If a row doesn't exist at the requested offset, NULL is returned. To return something different, specify it as the third argument, e.g. LAG(val, 3, 0). The LAG and LEAD functions support window partition and ordering clauses, but they do not support a window frame clause.
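The offset and default arguments can be modeled over a single, already ordered partition. This is a rough Python sketch, assuming a non-negative offset; the helper names `lag` and `lead` are hypothetical, and `None` stands in for NULL.

```python
def lag(values, offset=1, default=None):
    """Value `offset` rows before each row, or `default` when no such row exists."""
    return [values[i - offset] if i - offset >= 0 else default
            for i in range(len(values))]

def lead(values, offset=1, default=None):
    """Value `offset` rows after each row, or `default` when no such row exists."""
    return [values[i + offset] if i + offset < len(values) else default
            for i in range(len(values))]

vals = [10, 20, 30, 40]      # one partition, already in window order
print(lag(vals))             # [None, 10, 20, 30]
print(lead(vals))            # [20, 30, 40, None]
print(lag(vals, 3, 0))       # [0, 0, 0, 10]
```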

Answer 6

Process of elimination - it's what's left from the queried table besides the aggregation and spreading elements.

Answer 7

The ROLLUP clause accepts a list of expressions as inputs and defines the grouping sets implied by a hierarchy formed by the input elements, such as a location hierarchy (country, region, city), e.g. GROUP BY ROLLUP (country, region, city) produces 4 grouping sets: (1) (country, region, city), (2) (country, region), (3) (country), (4) () (empty).
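Because ROLLUP treats its inputs as a hierarchy, the grouping sets are the leading prefixes of the input list. A small Python sketch of that idea, with the hypothetical helper name `rollup_grouping_sets`:

```python
def rollup_grouping_sets(expressions):
    """Model of ROLLUP: each leading prefix of the inputs, from the full list down to empty."""
    return [tuple(expressions[:n]) for n in range(len(expressions), -1, -1)]

# Three inputs give 3 + 1 = 4 grouping sets.
for grouping_set in rollup_grouping_sets(["country", "region", "city"]):
    print(grouping_set)
```

With n inputs this yields n + 1 grouping sets, compared with 2\*\*n for CUBE over the same inputs.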

Answer 8

Yes. But the window order clause only determines ordering for the window function's computation, not for presentation. There's no guarantee that the rows will be presented in the same order as the window function's ordering. If you need such a guarantee, add a presentation ORDER BY clause.

Answer 9

Starting with pivoted data, when unpivoting you rotate the input data from a state of columns to a state of rows.

Answer 10

NULLs are used as placeholders in rows where an element isn't part of the grouping set.

Answer 11

(1) The aggregation and spreading elements cannot be the results of expressions; they must be column names from the queried table. You can, however, apply expressions in the query defining the table expression, assign aliases to those expressions, and then use the aliases in PIVOT. (2) The COUNT(\*) function isn't allowed as an aggregate function used by PIVOT. You must use COUNT( \< col name \>). There is a workaround using the table expression. (3) PIVOT is limited to using only one aggregate function. (4) The IN clause of the PIVOT operator accepts a static list of spreading values. It doesn't support a subquery as input. You need to know ahead of time what the distinct values are in the spreading column. You can use dynamic SQL to work around this.

Answer 12

Return an element from a single row that is in a given offset from the current row in the window partition, or from the first or last row in the window frame.

Answer 13

SELECT custid, orderid, orderdate, **val**, LAG/LEAD(**val**) OVER ( PARTITION BY custid ORDER BY orderdate, orderid ) FROM Sales.OrderValues

Answer 14

The beginning and the end of the partition.

Answer 15

All expressions that appear in those clauses must guarantee a single result value per group. There's no problem referring directly to elements that appear in the GROUP BY clause because each of those already returns one distinct value per group. For other elements from the underlying table, you must apply an aggregate function.

Answer 16

The RANK function returns the number of rows that have a lower ordering value than the current row, plus 1. It can have gaps between ranking values.

Answer 17

SELECT custid, shipperid, freight FROM Sales.FreightTotals UNPIVOT(freight FOR shipperid IN ([1],[2],[3])) AS U;

Answer 18

SELECT \< column list \>, \< **names column** \>, \< **values column** \> FROM \< source table \> UNPIVOT ( \< **values column** \> FOR \< **names column** \> IN ( \< source columns \> ) ) AS U;

Answer 19

SELECT custid, orderid, orderdate, **val**, FIRST\_VALUE(**val**) OVER ( PARTITION BY custid ORDER BY orderdate, orderid ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS first\_val FROM Sales.OrderValues;

Answer 20

These functions can be used for sorting data because they return a 0 bit for a detail element and a 1 bit for an aggregated element. So, if you want to see detail first, sort by the result of the function in ascending order.

Answer 21

The FIRST\_VALUE and LAST\_VALUE functions return a value expression from the first or last row in the window frame, respectively. These functions support window partition, order, and frame clauses.

Answer 22

They are supposed to operate on the underlying query's result, which is available only when logical query processing gets to the SELECT phase.

Answer 23

WITH PivotData AS ( SELECT custid /\* grouping column \*/, shipperid /\* spreading column \*/, freight /\* aggregation column \*/ FROM Sales.Orders ) SELECT custid, [1], [2], [3] FROM PivotData PIVOT(SUM(freight) FOR shipperid IN ([1], [2], [3])) AS P;

Answer 24

Yes. You need to be explicit and use the ROWS clause.

Answer 25

The names column is defined as nvarchar(128), and the values column is defined with the same type as the source columns that were unpivoted.

Answer 26

Yes. This results in one group with all rows for computation of grand aggregates.

Answer 27

Framing is a filtering option available to window aggregate functions. You define ordering within the partition by using a window order clause, and then based on that order you can confine a frame of rows between two delimiters.

Answer 28

Tells whether a NULL in the grouped results represents a placeholder or an original NULL. GROUPING accepts a single element as input and returns 0 when the element is part of the grouping set and 1 when it isn't, e.g. GROUPING(country) =\> 0/1

Answer 29

The HAVING clause uses a predicate but evaluates the predicate per group as opposed to per row. This means that you can refer to aggregate computations because the data has already been grouped.

Answer 30

ROWS or RANGE

Answer 31

The ROW\_NUMBER function computes a unique sequential integer starting with 1 within the window partition based on the window ordering. Note that if the ordering isn't unique, the ROW\_NUMBER function is not deterministic. If there's no "tie breaker" in the ordering, the choice of which row gets the higher number is arbitrary - optimization dependent.

Answer 32

ROWS UNBOUNDED PRECEDING

Answer 33

GROUPING\_ID accepts the list of grouped columns as inputs and returns an integer representing a bitmap. The rightmost bit represents the rightmost input. The bit is 0 when the respective element is part of the grouping set and 1 when it isn't. The result integer is the sum of the values representing elements that are not part of the grouping set, because their bits are turned on. For example, with GROUPING\_ID(country, region, city), 7 represents the empty grouping set: none of the 3 elements is part of the grouping set, so all of the respective bits are turned on (1 + 2 + 4 = 7).
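The bitmap arithmetic can be worked through in Python. This is an illustrative sketch only; the function name `grouping_id` and its list-based arguments are hypothetical stand-ins for the T-SQL function and the current grouping set.

```python
def grouping_id(inputs, grouping_set):
    """Build the bitmap left to right so the rightmost input ends up in the rightmost bit."""
    result = 0
    for element in inputs:
        bit = 0 if element in grouping_set else 1   # 1 means "not part of the grouping set"
        result = (result << 1) | bit
    return result

cols = ["country", "region", "city"]
print(grouping_id(cols, ["country", "region", "city"]))  # 0: all elements are grouped
print(grouping_id(cols, ["country", "region"]))          # 1: city bit on
print(grouping_id(cols, ["country"]))                    # 3: region (2) + city (1)
print(grouping_id(cols, []))                             # 7: 4 + 2 + 1, empty grouping set
```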

Answer 34

ROWS BETWEEN \***UNBOUNDED PRECEDING**\* AND **CURRENT ROW** - you need the first row in the partition.

Answer 35

The DENSE\_RANK function returns the number of \*distinct\* ordering values that are lower than the current row's, plus 1. It cannot have gaps between ranking values.
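The difference from RANK (which counts rows with a lower ordering value, plus 1) shows up only when there are ties. A minimal Python sketch over one partition's ordering values, using the hypothetical helper names `rank` and `dense_rank`:

```python
def rank(values):
    """Number of rows with a lower ordering value, plus 1 (gaps appear after ties)."""
    return [sum(1 for other in values if other < v) + 1 for v in values]

def dense_rank(values):
    """Number of distinct lower ordering values, plus 1 (no gaps)."""
    return [len({other for other in values if other < v}) + 1 for v in values]

vals = [10, 20, 20, 30]    # ordering values within one partition, with a tie on 20
print(rank(vals))          # [1, 2, 2, 4]
print(dense_rank(vals))    # [1, 2, 2, 3]
```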

Answer 36

Windowed queries do not hide detail - they return a row for every underlying query's row. This means you can mix detail and aggregated elements in the same query.

Answer 37

(1) The set of source columns that you're unpivoting, (2) The name you want to assign to the target values column (e.g. "freight"), (3) The name you want to assign to the target names column ("shipperid").

Answer 38

(1) GROUPING SETS, (2) CUBE, (3) ROLLUP. You use these in the GROUP BY clause.

Answer 39

SELECT and ORDER BY. If you need to refer to the result of a window function in any clause evaluated before SELECT, you need to use a table expression (e.g. a CTE).

Answer 40

Rows are arranged in one or more groups according to the grouping set of expressions, and the group function operates on each group.

Answer 41

ROWS BETWEEN **CURRENT ROW** AND \***UNBOUNDED FOLLOWING**\* - you need the last row in the partition.

Answer 42

They are table operators similar to JOIN, etc.

Answer 43

(1) UNBOUNDED PRECEDING or FOLLOWING, (2) CURRENT ROW, (3) \< n \> ROWS PRECEDING or FOLLOWING

Answer 44

You can use the GROUPING SETS clause to list all grouping sets that you want to define in the query. You list the grouping sets separated by commas within the outer pair of parentheses. You use an inner pair of parentheses to enclose each grouping set. If you don't use inner parentheses, each individual element is considered a separate grouping set, e.g. GROUP BY GROUPING SETS ( (shipperid, YEAR(shippeddate)), (shipperid), (YEAR(shippeddate)), () );

Answer 45

ROW\_NUMBER, RANK, DENSE\_RANK, and NTILE.

Answer 46

(1) What do you want to see on rows? This element is known as the "on rows" or "grouping element" (2) What do you want to see on columns? This element is known as the "on cols" or "spreading element" (3) What do you want to see in the intersection of each distinct row and column value? This element is known as the "data" or "aggregation element".

Answer 47

A data analysis function is a function applied to a set of rows and it returns a single value, e.g. the SUM aggregate function.

(71 cards)