DAX - Statistical Functions Flashcards
What do statistical functions do?
- Calculate values related to statistical distibutions and probability
- Calculations and comparisons that reveal meaningful information about the data. Such as:
Trends
Patterns
Relationships
Common statistical functions
- Average
- Median
- Count
- Distinct(Count)
- Min
- Max
How does MAX work?
Returns the largest numeric value or largest string in a column. Ignores logical values.
NOTE: As the MAX function ignores logical values, TRUE and FALSE in the table are interpreted as texts. Texts are considered “larger” than numbers in DAX alphabetically, so the largest value in the column is TRUE, as it’s the last in alphabetical order.
How does MIN work?
Returns the smallest numeric value or shortest string in a column. Ignores logical values.
NOTE: As the MIN function also ignores logical values, same as the MAX function, the sequence of blank values, numbers and texts is followed in ascending order. A blank value in the sample column is considered as the
minimum value.
How does the average function work?
- It sums up all the numbers in a dataset
- Divides the result by the total count of numbers
How does the median function work?
- Sorts the numbers in asc order
- Selects the middle number
- If there are two middle numbers, take the average of the two as the median.
For example, if the ItemPrice column in the Sales table contained 2 values,
the formula would be calculated as:
(160 + 180) / 2 = 170
Example of AVERAGEX function
AVERAGEX (Sales, Sales[Freight] + Sales[TaxAmt])
In this example:
1. Sales[Freight] + Sales[TaxAmt] is calculated for each row
2. Then the sums are averaged
How does AVERAGEX work?
AVERAGEX(Table, Sales[Quantity] * Sales[Price])
It goes row by row and calculates the Sales Amount (quantity x price). Then it calculates an average of Sales Amount amongst all the rows that were calculated
For example, if the ItemPrice column and the QuantitySold column each contained 8 values, the formula would be calculated as:
((120 * 5) + (140 * 7) + (180 * 3) + (130 * 6) + (200 * 4) + (240 * 9) + (160 * 2) + (190 * 8)) / 8 = 962.50
What does AVERAGEA do?
AVERAGEA([ColumnName)
Calculates the average of non blank cells in a column
NOTE
-Values that evaluate to TRUE count as 1.
-Values that evaluate to FALSE count as 0 (zero).
-Values that contain non-numeric text count as 0 (zero).
-Empty text (“”) counts as 0 (zero).
For example, if the ItemPrice_withText column contained 8 values, with non-numeric values being counted as 0, the formula would be calculated as:
(120 + 1 + 0 + 130 + 180 + 0 + 280 + 1) / 8 = 89
Is the Median or Average less affected by outliers and extreme values?
Median
It is therefore useful for datasets with skewed distributions
What is returned if there are no rows to count using COUNT or DISTINCTCOUNT
(blank)
What does count do?
Counts the total number of rows, containing any or no value
What does COUNTBLANK do?
Counts the number of blank or empty rows in a column
How does COUNTA work?
It counts the number of cells in a column that are not blank
How does COUNTAX work?
Goes row by row and and counts rows where the specified expression results in a non-blank result.
COUNTAX syntax
COUNTAX(<table>, <expression>) Example COUNTAX (FILTER (Reseller, [Status] = "Active"), [Phone])
In the example above, it counts the number of non-blank rows in the phone column using the table that results from filtering the reseller table on Status=Active
What does COUNTROWS count?
=COUNTROWS([‘Orders’])
=COUNTROWS(RELATEDTABLE(ResellerSales))
It counts the number of rows in the specified table or a table defined by an expression
How do MINA or MAXA work?
- Gets the smallest or largest value in a column and doesn’t ignore logical values and text.
- This could therefore be used to get the smallest or largest date. MIN and MAX only work on numbers.
MINX syntax
MINX(<table>, <expression>)
How does MINX work?
MINX (FILTER (Sales, [SalesTerritoryKey] = 5), [Freight])
An expression is evaluated for each table row and then the smallest value is returned.
In this example, the Sales table is filtered for SalesTerritoryKey 5. Then the function gets the MIN freight for territory 5.
How does the MAXX function work?
MAXX (Sales, Sales [TaxAmt]+ Sales [Freight])
This function evaluates an expression for each row and then returns the largest value.
In the above example, the tax plus frieght is calculated for each row and then the max value is returned.
MAXX syntax
MAXX(<table>, <expression>)
Variance of a Sample Population
VAR.S(<column>)
Returns the variance of a column containing a sample population.
Just writing a long sentence so it will left justify
Example Variance using sample population
Assuming the column on which the variance/deviation will be calculated refers to a sample population:
Example: VAR.S(Sales[ItemPrice]) = 1.629
Standard Deviation of sample population
~~~
STDEV.S(<column>)
~~~</column>
Returns the standard deviation of a column containing a sample population.
Example Standard Deviation using sample population
Assuming the column on which the variance/deviation will be calculated refers to a sample population:
Example: STDEV.S(Sales[ItemPrice])= 40
Variance of Entire Population
VAR.P(<column>)
Returns the variance of a column, assuming that the column refers to the entire population.
Example Variance using entire population
Assuming the column on which the variance/deviation will be calculated contains the entire population of data points:
Example: VAR.P(Sales[ItemPrice]) = 1.425
Standard Deviation of entire population
STDEV.P(<column>)
Returns the standard deviation of a column containing an entire population.
Example Standard Deviation using entire population
Assuming the column on which the variance/deviation will be calculated contains the entire population of data points:
Example: STDEV.P(Sales[ItemPrice]) = 38
Combination Function
COMBIN(n,k)
Returns the number of ways k items can be selected from n, where the order DOES NOT matter.
Example: COMBIN(49,6) = 13.983.816
Permutation Function
PERMUT(n,k)
Returns the number of ways k items can be selected from n, where the order DOES matter.
Example: PERMUT(49,6) = 10.068.347.520