Domain 1 & 2 Flashcards
DateDIFF Syntax
DATEDIFF(date_part,startdate,enddate,startofweek)
DATEDIFF(‘month’,[orderdate],[shipdate])
Where do columns and rows show up on axis?
Columns appear on X axis
Rows appear on Y axis
Replace a data source steps
- Open workbook with original data
2.Data Tab > New Data Source - Set up data in canvas (drag to canvas)
- Back to workbook
- Data Tab>Replace Data Source
6.Select current and replacement source
NOT- Go to data source page and select “add” - that causes a cross data
Ascending vs Descending
Ascending: going up 0 to 100
Descending: going down 100 to 0
Difference between Index and Rank Function
Index deals with physical position of the record. Rank deals with value of record with highest value at highest rank.
Rank_Dense
Optimist tie Returns the dense rank for the current row in the partition. Identical values are assigned an identical rank, but no gaps are inserted into the number sequence. Use the optional ‘asc’ | ‘desc’ argument to specify ascending or descending order. The default is descending.
With this function, the set of values (6, 9, 9, 14) would be ranked (3, 2, 2, 1).
Rank_Modified
Pessimist tie Returns the modified competition rank for the current row in the partition. Identical values are assigned an identical rank. Use the optional ‘asc’ | ‘desc’ argument to specify ascending or descending order. The default is descending.
With this function, the set of values (6, 9, 9, 14) would be ranked (4, 3, 3, 1).
Rank_Percentile
Returns the percentile rank for the current row in the partition. Use the optional ‘asc’ | ‘desc’ argument to specify ascending or descending order. The default is ascending.
With this function, the set of values (6, 9, 9, 14) would be ranked (0.00, 0.67, 0.67, 1.00).
Rank_Unique
Returns the unique rank for the current row in the partition. Identical values are assigned different ranks. Use the optional ‘asc’ | ‘desc’ argument to specify ascending or descending order. The default is descending.
With this function, the set of values (6, 9, 9, 14) would be ranked (4, 2, 3, 1).
FIRST () function
Returns the number of rows from the current row to the first row in the partition. For example, the view below shows quarterly sales. When FIRST() is computed within the Date partition, the offset of the first row from the second row is -1.
0
-1
-2
-3
DATEPARSE
Converts Original date into a date format.
DATEPARSE(date_format,date_string)
DATEPARSE(‘yyyy-MM-dd’, “1986-03-25”) = #March 25, 1986#
DATEPARSE(‘yyyy-MM-dd’, [originaldate]) = #March 25, 1986#
DATETRUNC
Rounds up to first day of date_part
DATETRUNC(date_part, date, [start_of_week])
DATETRUNC(quarter, #9/22/2018#) = #7/1/2018#
P-Value
Measure of statistical confidence
Low P-Value vs High P-Value
Lower the P-Value, the greater the statistical confidence. Values below .05 are considered “statistically significant”
How are blends and left joins different?
Blends do not introduce duplication since aggregation happens before data sources are combined. In a Left join, sources are combined and then aggregation happens
Difference between a join and a blend in data connections
A join combines data first then aggregates after. a blend aggregates first then combines data.
IQR
Interquartile Range:
difference between upper and lower hinge
Range for middle 50% of data
Q3-Q1
UH-LH
Lower Hinge
25% of data is below lower hinge
“Lowest quarter”
“1st Quarter”
Q1
25% percentile
Upper Hinge
Highest quarter of data
Only 25% if data is above/higher
75% percentile
3rd Quarter
Q3
Best chart for distribution? Best chart for dates?
Histogram is the best chart for Distribution.
Line chart is the best chart for dates.
Measure Names and Measure Value fields are most used for:
When you want to dynamically switch between multiple measures in a viz (parameters)
What color pills are continuous and discrete data?
Continuous- Green
Discrete- Blue
Measure attributes
Measures are aggregated to the current level of detail. They are Quantitative. ( clicks, sales, profit, height). Often continuous in nature
Dimension attributes
Dimensions increase the level of detail. They are Qualitative (eye color, country, region, name) Usually discrete
Density maps
You have to change the marks card to Density for it to convert
Tree maps
Measure is on marks card twice: once in size and another one in color.
Calc for : Return Middle name (if any) and Last Name of Customer
MID ([CustomerName], FIND([CustomerName],” “) +1)
Find tells it to start at the “ “ after first name and +1 makes sure the space isn’t included
Calc for : Return Middle name (if any) and Last Name of Customer
MID ([CustomerName], FIND([CustomerName],” “) +1)
Find tells it to start at the “ “ after first name and +1 makes sure the space isn’t included
Calc for: Return a “Profit” or “Loss” Status based on profit
IIF ([Profit]>0,’Profit’,’Loss’)
IIF(_____, then, else, [unknown])
IF() vs IIF() Syntax
IF is a statement; IIF is a function:
IF ___THEN _____END
IF ___THEN _____ELSE ______END
IIF(_____, then, else, [unknown])
FIND Syntax
What is the syntax and explain what it does
FIND(string, substring, [start]) FIND(“Calculation”, “a”, 1) = 2
Returns the index position of substring in string or 0 if the substring isn’t found.
If nested within another syntax, it’s telling you where to start.
MID ([Cust Name], FIND([Cust Name],” “) +1)
tells you to start at the first SPACE in the Customer name field. The +1 makes sure the space isn’t included
Calc: Compute the 50-day moving average of sales for data in window
WINDOW_AVG(Sales), -49,0)
Then add new field to Row. Right click field to create Dual Axis Chart
FIRST()
When FIRST() is computed alongside INDEX, what is the result for the row with index=3
-2 - First starts at 0 and counts in negatives
INDEX FIRST()
1 0
2 -1
3 -2
LAST()
When last is computed alongside INDEX, what is the result for the row with index=3
0
LAST() Starts at the max and goes down to 0. All numbers are positive.
INDEX LAST()
1 2
2 1
3 0
WINDOW_AVG SYNTAX
WINDOW_AVG(expression, [start, end])
WINDOW_AVG(SUM([sales]), FIRST(), LAST() )
- Can change out AVG with Sum, Min, Max, STDV, Median,
Calc: field to link parameter called [dimension swap] with list to visualization
CASE [dimensionswap]
WHEN “Category” THEN [Categoryfield] – (do for all list items)
END
Add to Context
Forcing Tableau to process Context filter first, then any other dimension filter.
Applying filter to other worksheets
Click down arrow on filter pill>Apply to worksheet>Selected Worksheets> Pick the ones you want to apply it to
Order of Operations: Filters
Extract Filters(sql)
Data Source Filters
Context Filters
Dimension Filters
Measure Filters
Table Calculation Filters
Order of Operations
Sets, Conditional Filters, Top N, Fixed LOD
Include/Exclude LOD, data blending
Forecasts, Table Calculations, Totals
Trend Lines, Reference Lines
Calc: isolate domains in emails
MID ( “xyz@google.com”, FIND (“xyz@google.com”, “@”)+1) =”google.com”
SPLIT Syntax
SPLIT ( CustomerName), “ “, 2)
SPLIT ([field], seperator, which component you want)
split can’t be used when there is a middle name
Turning Amanda Babineaux into
Babineaux, A
RIGHT([CustomerName], LEN([LastName]))
+ “, “
+ LEFT ([customer name], 1)
Left Syntax
Find all products that start with Xerox:
LEFT([productname], 5) = XEROX
Calc: Find customer’s first order
Calc: Number of orders by customer
{fixed[customerID] : MIN (orderdate)}
{fixed[customerID] : CountD(orderID)}
Compound Growth Rate of Sales
with original compound growth rate syntax
POWER ( ZN (SUM[sales])) / LOOKUP (ZN (SUM[sales]))
, -[N Years] , ZN (1/ [N Years] -1
(ZN(End Sales Value/Beginning Sales Value) ^ (ZN(1/n)) - 1
Syntax for : DOES NOT EQUAL
!=
Profit Ratio
SUM(profit) / SUM(sales)
Calc: LOD Expression to show total sales by month without the regional componet
{EXCLUDE[region] : SUM(sales)}
Connect to Published Data Source in Tableau Server and Create a new Workbook
- Select all Data sources you want in “All Data Sources”
- At top of page from Action menu select “New Workbook”
Combining Sets
Must be based on same dimension
Select two sets > RTCLICK>Create Combined Set
Convert Sales into Bins with discrete set of bins equal to 500 and create a histogram that shows how many products fall into each bin.
Sales>Create>Bin
Set to 500
Drag field to Columns; Drag Product name to Row.
Change Row Measure to COUNT instead of dimension.
Create a filled map to show profit by state
- Add longitude and latitude in row and column
- Add sum(profit) to Color Marks ; Add state to Detail Marks
- Make sure Mark’s dropdown is on Map
” Add a reference line to show the average sales value”
Add an average reference line based on SUM(sales) BUT make sure the LABEL= VALUE !
What trend line models do you have in Tableau (5)
Linear
Logarithmic
Exponential
Polynomial
Power
Which two functions can be used in tableau to build a predictive model?
Model_Quantile
Model_Percentile
Model_Percentile Syntax
Model_Quantile Syntax
Give function for Sales & OrderCount
MODEL_PERCENTILE( SUM([Sales]),COUNT([Orders]))
MODEL_QUANTILE(0.5, SUM([Sales]), COUNT([Orders]))
LOOKUP(expression, [offset])
DATEDIFF(date_part, date1, date2, [start_of_week])
FIND(string, substring, [start])
MID(string, start, [length])
SPLIT(string, spacer, side)