Domain 2: Explore and Analyze Data (41%) Flashcards

1
Q

When would you use the date formats “d” vs. “dd” and so on in calculations?

A

d: Display the day as a number without a leading zero (1-31).

dd: Display the day as a number with a leading zero (01-31).

ddd: Display the day as an abbreviation (Sun, Sat).

dddd: Display the day as a full name (Sunday, Saturday).

ddddd: Display the date as a complete date (including day, month, and year), formatted according to your system’s short date format setting. The default short date format is m/d/yy.

dddddd: Display a date serial number as a complete date (including day, month, and year) formatted according to the long date setting recognized by your system. The default long date format is mmmm dd, yyyy.

aaaa: The same as dddd, only it’s the localized version of the string.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

When would you use the week formats “w” vs. “ww”?

A

w: Display the day of the week as a number (1 for Sunday through 7 for Saturday).

ww: Display the week of the year as a number (1-54).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

How would you use the month formats “m” vs. “mm” and so on in calculations?

A

M: Display the month as a number without a leading zero (1 12). If m immediately follows h or hh, the minute rather than the month is displayed.

MM: Display the month as a number with a leading zero (01-12). If m immediately follows h or hh, the minute rather than the month is displayed.

MMM: Display the month as an abbreviation (Jan-Dec).

MMMM: Display the month as a full month name (January-December).

MMMMM: Display the month as a single letter abbreviation (J-D)

oooo: The same as MMMM, but localized.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Why would you use “AM/PM” vs. “am/pm” in a calculation?

A

You would use “AM/PM” if you want those letters capitalized.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Describe what happens when you use the following in a calculation:
* S
* Ss
* OOO
* tttt

A

S: Display the second as a number without leading zeros (0 59).

Ss: Display the second as a number with leading zeros (00 59).

000: Display milliseconds. Use a period character as a separator before specifying milliseconds.

t t t t t: Display a time as a complete time (including hour, minute, and second), formatted using the time separator defined by the time format recognized by your system. A leading zero is displayed if the leading zero option is selected and the time is before 10:00 A.M. or P.M. The default time format is h:mm:ss.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How would this calculation render in a viz: yy-mm-dd (dddd)?

A

18-01-04 (Thursday)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

How would this calculation render in a viz: “Fiscal Quarter” q “of” yyyy?

A

Fiscal Quarter 1 of 1976

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is the default start-of-week in the US and the EU?

A

Sunday is the first day of the week in the US
Monday is the first day in the EU

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

How can you specify fiscal year start for a datasource?
How do you do this for a specific date Dimension field?

A

Datasource:
1) Right-click (Control-click on Mac) the data source in the Data pane to open the Date Properties dialog box.
2) Set the Fiscal year start field to the appropriate month

Date Dimension field:
1) In the Data pane, right-click a date dimension (Ctrl-click on Mac)
2) Select Default Properties > Fiscal Year Start.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Cite the format for a calculated field designating a field as a date field.

A

DATE([field])
This expression returns a date given a number, string, or date.

DATE automatically recognizes many standard date formats.

MAKEDATE is another similar function, but MAKEDATE requires the input of numeric values for year, month, and day.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Cite the format for a calculated field that adds a specified number of date parts (months, days, etc) to the starting date.

A

DATEADD(date_part, interval, [date])

Example: Add 280 days to the date February 20, 2021:
* DATEADD(‘day’, 280, #2/20/21#) = #November 27, 2021#

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Cite the format for a calculated field that returns the number of date parts (weeks, years, etc) between two dates.

A

DATEDIFF(date_part, date1, date2, [start_of_week])

Example: Number of days between March 25, 1986 and February 20, 2021
* DATEDIFF(‘day’, #3/25/1986#, #2/20/2021#) = 12,751

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Cite the format for a calculated field that returns the name of the specified date part as a discrete string.

A

DATENAME(date_part, date, [start_of_week])

Example: DATENAME(‘month’, #1986-03-25#) = “March”

A very similar calculation is DATEPART, which returns the value of the specified date part as a continuous integer. DATEPART can be faster because it is a numerical operation.

By changing the attributes of the calculation’s result (dimension or measure, continuous or discrete) and the date formatting, the results of DATEPART and DATENAME can be formatted to be identical.

An inverse function is DATEPARSE, which takes a string value and formats it as a date.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Cite the format for a calculated field that returns specifically formatted strings as dates.

A

DATEPARSE(date_format, date_string)

Example: DATEPARSE(‘yyyy-MM-dd’, “1986-03-25”) = #March 25, 1986#

The <date_format> argument describes how the <date_string> field is arranged. Because of the variety of ways the string field can be ordered, the <date_format> must match exactly.</date_format></date_string></date_format>

DATE is a similar function that automatically recognizes many standard date formats. DATEPARSE may be a better option if DATE does not recognize the input pattern.

MAKEDATE is another similar function, but MAKEDATE requires the input of numeric values for year, month, and day.

Inverse functions, which take dates apart and return the value of their parts, are DATEPART (integer output) and DATENAME (string output).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Which calculated field function will return the name of the specified date part as an integer?

A

DATEPART(date_part, date, [start_of_week])

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Which calculated field function can be thought of as date rounding?

A

DATETRUNC(date_part, date, [start_of_week])

It takes a specific date and returns a version of that date at the desired specificity. Because every date must have a value for day, month, quarter, and year, DATETRUNC sets the values as the lowest value for each date part up to the date part specified.

Example: the first day of the quarter containing 9/22/2018
* DATETRUNC(quarter, #9/22/2018#) = #7/1/2018#

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Which functions do you use to display the number of a date part? Example: Sunday = 1, March = 3, etc.?

A

DAY(date)
WEEK(date)
MONTH(date)
QUARTER(date)
YEAR(date)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What does the ISDATE function do?

A

It returns a Boolean response: aka True/False

The required argument must be a string. ISDATE cannot be used for a field with a date data type—the calculation will return an error.

Example:
* ISDATE(09/22/2018) = true
* ISDATE(22SEP18) = false

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

When might you use the MAKEDATE function?

A

When you have separate fields that are numerical and refer to a date part. MAKEDATE Returns a date value constructed from the specified <year>, <month>, and <day>.</day></month></year>

Example:
* MAKEDATE ([year field],[month field],[day field])

Related:
MAKEDATETIME(date, time)
MAKETIME(hour, minute, second)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What calculated field functions return the date or time at the time of rendering?

A

NOW(): Returns date/time (Ex: NOW() = 1986-03-25 1:08:21 PM)
TODAY(): Returns current date. (Ex: TODAY() = 1986-03-25)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

List the valid “date_part” arguments available in Tableau in calculated date fields.

A

‘year’
‘quarter’
‘month’
‘dayofyear’
‘day’
‘weekday’
‘week’
‘hour’
‘minute’
‘second’

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What does the “start_of_week” argument in date calculated fields denote?

A

The start_of_week parameter can be used to specify what day is considered the first day of the week, such as “Sunday” or “Monday”. If it is omitted, the start of week is determined by the data source.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

In calculated date fields, you use “#” instead of double quotes when you want exact characters to render - Why?

A

It tells Tableau that the value inside the symbols is a date. Otherwise, it would consider the characters to be a string.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What does the CONTAINS function do in calculated string fields?

A

Returns true if the given string contains the specified substring

CONTAINS(string, substring)
Example: CONTAINS(“Calculation”, “alcu”) = true

ENDSWITH(string, substring) and STARTSWITH(string, substring), though they look just at the end or startof strings.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

Name the functions that reveal just parts of a string in a calculated string field.

A

LEFT(string, number) - Returns the left-most <number> of characters in the string.</number>

RIGHT(string, number) - Returns the right-most <number> of characters in the string.</number>

(MID(string, start, [length]) -
Returns a string starting at the specified start position. The first character in the string is position 1. If the optional numeric argument length is added, the returned string includes only that number of characters.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

Which calculated field function returns the length of the string (as an integer)?

A

LEN(string)

Example: LEN(“Matador”) = 7

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

Which calculated field functions affect the case of text?

A

LOWER(string)
UPPER(string)
PROPER(string)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

Which calculated field functions trim extraneous spaces?

A

LTRIM(string)
RTRIM(string)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

How can you use a calculated field to replace text in a string?

A

REPLACE(string, substring, replacement

Searches <string> for <substring> and replaces it with <replacement>. If <substring> is not found, the string is not changed.</substring></replacement></substring></string>

Example: REPLACE(“Version 3.8”, “3.8”, “4x”) = “Version 4x”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

Which calculated field functions snags parts of a field (when a consistent delimiter is present)?

A

SPLIT(string, delimiter, token number)

Returns a substring from a string, using a delimiter character to divide the string into a sequence of tokens.

Examples:
* SPLIT (“a-b-c-d”, “-“, 2) = “b”
* SPLIT (“a|b|c|d”, “|”, -2) = “c”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

Name the literal expressions for these data types:
* Numbers
* Strings
* Dates
* Boolean
* Nulls

A

Numeric literals: written as numbers.

String literals: written with quotation marks.

Date literals: written with the # symbol.

Boolean literals: written as either true or false.

Null literals: written as null.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

When assembling calculated fields, which operation character forces order of prededence?

A

Parentheses can be used as needed to force an order of precedence

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

How can you add comments into a calculated field (that will be ignored by Tableau when doing the calc)?

A

To add a comment to a calculation, type two forward slash (//) characters.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

In a logical calculation, when would you use ELSE vs. ELSEIF?

A

You use ELSEIF to specify conditions.
You use ELSE to encompass any other condition not cited yet. It’s the catch-all logical statement.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

What does the logical function CASE do?

A

Evaluates the expression and compares it to the specified options (<value1>, <value2>, etc.). When a value that matches the expression is encountered, CASE returns the corresponding return. If no match is found, the (optional) default is returned. If there is no default and no values match, then Null is returned.</value2></value1>

Example:
CASE [Season]
WHEN ‘Summer’ THEN ‘Sandals’
WHEN ‘Winter’ THEN ‘Boots’
ELSE ‘Sneakers’
END
“Look at the Season field. If the value is Summer, then return Sandals. If the value is Winter, then return Boots. If none of the options in the calculation match what is in the Season field, return a Sneakers.”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

Which logical function is used to close an IF or CASE expression?

A

END

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

Describe how the IF logical function works

A

Tests a series of expressions and returns the <then> value for the first true <test>.</test></then>

Example:
IF [Season] = “Summer” THEN ‘Sandals’
ELSEIF [Season] = “Winter” THEN ‘Boots’
ELSE ‘Sneakers’
END
“If Season = Summer, then return Sandals. If not, look at the next expression. If Season = Winter, then return Boots. If neither of the expressions are true, return Sneakers.”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

When would you use CASE vs. IF in a calculated field?

A

In Tableau, CASE statements are easier to read and perform better than IF statements when the data is simple and doesn’t require complex TRUE/FALSE evaluations. CASE statements are also better for evaluating a list of values.

However, IF statements are more flexible and allow boolean logic in the test. IF statements are also easier to read when nested

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

How can you change a NULL value into a specified string?

A

IFNULL(expr1, expr2)

Returns <expr1> if it's non-null, otherwise returns <expr2>.</expr2></expr1>

Example: IFNULL([Assigned Room], “TBD”)
“If the Assigned Room field isn’t null, return its value. If the Assigned room field is null, return TBD instead.”

Compare with IFNULL and ZN:
* IFNULL always returns a value. ISNULL returns a boolean.
* ZN swaps a zero for a NULL.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

How does the IIF function work?

A

IIF(<test>, <then>, <else>, [<unknown>])</unknown></else></then></test>

Checks whether a condition is met (<test>), and returns <then>if the test is true, <else> if the test is false, and an optional value for <unknown> if the test is null. If the optional unknown isn't specified, IIF returns null.</unknown></else></then></test>

Example: IIF([Season] = ‘Summer’, ‘Sandals’, ‘Other footwear’)
“If Season = Summer, then return Sandals. If not, return Other footwear”

IIF doesn’t have an equivalent to ELSEIF (like IF) or repeated WHEN clauses (like CASE). Instead, multiple tests can be evaluated sequentially by nesting IIF statements as the <unknown> element. The first (outermost) true is returned.</unknown>

41
Q

What does the IN function do in a calculated field?

A

<expr1> IN <expr2>

Returns TRUE if any value in <expr1> matches any value in <expr2>.

Examples:
* SUM([Cost]) IN (1000, 15, 200)
"Is the value of the Cost field 1000, 15, or 200?"
* [Field] IN [Set]
"Is the value of the field present in the set?"
</expr2></expr1></expr2></expr1>

42
Q

Which function do you use for returning the absolute value of a number?

A

ABS(number)

43
Q

Name the three rounding functions you can use in a calculated numeric field

A

ROUND(number, [decimals])
Rounds <number> to a specified number of digits.</number>

CEILING(number)
Rounds a <number> to the nearest integer of equal or greater value.</number>

FLOOR(number)
Rounds a <number> to the nearest integer of equal or lesser value.</number>

44
Q

When is the ATTR function useful?

A

ATTR(expression)
You use this when you want to returns the value of the expression if it has a single value for all rows. Otherwise returns an asterisk. Null values are ignored.

45
Q

Of the following numeric functions, which one is not supported by all data sources?
MIN
MAX
MEDIAN
AVE

A

MEDIAN is not available for the following data sources: Access, Amazon Redshift, Cloudera Hadoop, HP Vertica, IBM DB2, IBM PDA (Netezza), Microsoft SQL Server, MySQL, SAP HANA, Teradata.

For other data source types, you can extract your data into an extract file to use this function.

46
Q

There are three types of LODs: FIXED, INCLUDE, and EXCLUDE. Describe what each of them do.

A

FIXED level of detail expressions compute a value using the specified dimensions, without reference to the dimensions in the view.

INCLUDE level of detail expressions compute values using the specified dimensions in addition to whatever dimensions are in the view. INCLUDE level of detail expressions are most useful when including a dimension that isn’t in the view.

EXCLUDE level of detail expressions explicitly remove dimensions from the expression—that is, they subtract dimensions from the view level of detail. EXCLUDE level of detail expressions are most useful for eliminating a dimension in the view.

47
Q

What is a Level of Detail (LOD) calculation?

A

LOD expressions allow you to compute values at the data source level and the visualization level. Most importantly, LOD expressions let you control the granularity you want to compute.

They can be performed at a:
* more granular level (INCLUDE),
* less granular level (EXCLUDE), or
* entirely independent level (FIXED).

A level of detail expression has the following structure:
{[FIXED | INCLUDE | EXCLUDE] <dimension> : <aggregate>}</aggregate></dimension>

48
Q

What does this LOD do, in plain language?
{ FIXED [Customer Name] : SUM([Sales]) }

A

No matter what other aggregations are happening on the viz, you want to reveal the sales per customer.

49
Q

How do you create a quick LOD?

A

In the Data pane, control-click drag the measure you want to aggregate onto the desired dimension

50
Q

Which LODs will be affected by the other dimensions in the view?

A

INCLUDE and EXCLUDE

FIXED is the only LOD that ignores what is in the view.

51
Q

What is the Tableau Order of Operations (aka query pipeline)?

A

The Order of Operations is the order filters are enacted. They are:

1) Extract Filters
2) Data Source Filters
3) Context Filters
– Sets, conditional filters, Top N, FIXED
4) Dimension Filters
– INCLUDE, EXCLUDE, Data Blending
5) Measure Filters
– Forecasts, Table Calcs, clusters, totals
6) Table Calculation Filters
– Trend lines, reference lines

52
Q

What does a “moving calculation” (aka “rolling calculation”) mean?

A

For each mark in the view, a Moving Calculation table calculation (sometimes referred to as a rolling calculation) determines the value for a mark in the view by performing an aggregation (sum, average, minimum, or maximum) across a specified number of values before and/or after the current value.

A moving calculation is typically used to smooth short-term fluctuations in your data so that you can see long-term trends. For example, with securities data there are so many fluctuations every day that it is hard to see the big picture through all the ups and downs. You can use a moving calculation to define a range of values to summarize using an aggregation of your choice.

53
Q

What does a “percentile calculation” mean?

A

For each mark in the view, a Percentile table calculation computes a percentile rank for each value in a partition. You can use a Percentile table calculation to rank the total sales for each month in a year as a percentage, rather than a whole number (for example, 1 through 10).

54
Q

Which table calculation computes the difference between the current value and another value in the table for each mark in the visualization?

A

A “Difference From” table calculation.

With a Difference From, Percent Difference From, or Percent From calculation, there are always two values to consider: the current value, and the value from which the difference should be calculated. In most cases, you want to calculate the difference between the current value and the previous value, as in the procedure above. But in some cases you may want something different.

55
Q

Describe how a “Percent of total” table calc works.

A

For each mark in the view, a Percent of Total table calculation computes a value as a percentage of all values in the current partition. You can use a Percent of Total table calculation to calculate the percentage of total sales each month makes within a quarter.

56
Q

When ranking marks, what do each of these table calcs do?
RANK
RANK_DENSE
RANK_MODIFIED
RANK_PERCENTILE
RANK_UNIQUE

A

RANK: Returns the standard competition rank for the current row in the partition. Identical values are assigned an identical rank. Use the optional ‘asc’ | ‘desc’ argument to specify ascending or descending order. The default is descending.

RANK_DENSE: Returns the dense rank for the current row in the partition. Identical values are assigned an identical rank, but no gaps are inserted into the number sequence.

RANK_MODIFIED: Returns the modified competition rank for the current row in the partition. Identical values are assigned an identical rank.

RANK_PERCENTILE: Returns the percentile rank for the current row in the partition

RANK_UNIQUE: Returns the unique rank for the current row in the partition. Identical values are assigned different ranks.

57
Q

The following are table calcs used for assigning numbers for marks. Describe how they all work.

A

FIRST: Smallest number gets “1”; others get sequential negative numbers

LAST: Biggest number gets the biggest number; smallest gets a “0”

INDEX: Returns the index of the current row in the partition, without any sorting by value.

58
Q

What are the four categorical filter types (dimensions)?

A

General
Wildcard
Condition
Top

59
Q

What are the four quantitative filter types (measures)?

A

Range of Values
At Least
At Most
Special: Select the Special option to filter on Null values. Include only Null values, Non-null values, or All Values.

60
Q

How do you create a table calculation?

A

To create a table calculation filter, create a calculated field, and then place that field on the Filters shelf.

Filters based on table calculations do not filter out underlying data in the data set, because table calculation filters are applied last in the order of operations.

61
Q

Can you apply a filter from one sheet to all others using the same data source OR related data source?

A

Yes, to apply a filter to all worksheets using a related primary data source:
* On the Filters shelf, right-click the field
* Select Apply to Worksheets > All Using Related Data Sources OR
* Select Apply to Worksheets > All Using this Data Source

62
Q

Why would you use a context filter?

A

You may create a context filter to:
* Force a filter to be carried out first
* Create a dependent numerical or top N filter. You can set a context filter to include only the data of interest, and then set a numerical or a top N filter.

63
Q

How do you make a context filter?

A

To create a context filter, select Add to Context from the context menu of an existing categorical filter.

64
Q

What color do context filters appear?

A

Charcoal black

65
Q

What is a parameter?

A

A parameter is a workbook variable such as a number, date, or string that can replace a constant value in a calculation, filter, or reference line.

A parameter won’t do anything until it’s tied to an element in the viz.

Parameters can be referenced in calculations, filters, and reference lines.

66
Q

What are parameter actions?

A

Parameter actions let your audience change a parameter value through direct interaction with a viz, such as clicking or selecting a mark.

Parameter actions open up new possibilities for creating summary values and statistics without using calculations. You can configure parameter actions to let users select multiple marks that are automatically aggregated into a single parameter value.

67
Q

What are sets?

A

Sets are custom fields that define a subset of data based on some conditions.

You can choose to show them in two ways on a viz:
1) In/Out members
2) Show only members in the set

68
Q

What is a Set Control?

A

To give your audience the ability to quickly modify members of a set, you can also display a Set Control. A set control is a worksheet card that is very similar to a parameter control or filter card.

69
Q

Can you combine Sets?

A

Yes, you can combine two sets to compare the members.

When you combine sets you create a new set containing either the combination of all members, just the members that exist in both, or members that exist in one set but not the other.

Example: To determine the percentage of customers who purchased both last year and this year, you can combine two sets containing the customers from each year and return only the customers that exist in both sets.

70
Q

How does a dynamic set differ from a fixed set?

A

There are two types of sets: dynamic sets and fixed sets. The members of a dynamic set change when the underlying data changes.

Dynamic sets can only be based on a single dimension.
Fixed sets can be based on a single dimension or multiple dimensions.

71
Q

Why would you want to create bins for a continuous measure?

A

It’s a way of grouping values to better understand their distribution. A histogram is often used in this capacity.

72
Q

How do you create Bins in Desktop?

A

1) In the Data pane, right-click (control-click on Mac) a measure and select Create > Bins.

2) Either enter a value in the Size of bins field or have Tableau calculate a value for you.

3) After you click OK to dismiss the Create Bins dialog box, a new binned field appears in the Dimensions area of the Data pane.

73
Q

How can you create a histogram without using Show Me?

A

1) Click a [Field] (bin) dimension in the Data pane and choose Convert to continuous.

2) Drag the [Field] (bin) dimension from the Data pane and drop it on the Columns shelf.

3) Drag the original [Field] field from the Measures area of the Data pane and drop it on the Rows shelf.

4) Click SUM(Field) on Rows and change the aggregation from Sum to Count.

74
Q

How do you create a Hierarchy in Desktop?

A

1) In the Data pane, drag a field and drop it directly on top of another field.

2) Drag additional fields into the hierarchy as needed. You can also re-order fields in the hierarchy by dragging them to a new position.

75
Q

For what are Groups useful?

A

Groups are useful for both correcting data errors (e.g., combining CA, Calif., and California into one data point) as well as answering “what if” type questions (e.g., “What if we combined the East and West regions?).

76
Q

How do you create a Group in Desktop?

A

There are multiple ways to create a group.
* You can create a group from a field in the Data pane, or
* By selecting data in the view and then clicking the group icon.

When you create groups in Tableau, you have the option to group all remaining, or non-grouped members into an “Other” group.

77
Q

What types of map visualizations can Tableau create?

A
  • Proportional symbol maps
  • Choropleth maps (filled maps)
  • Point distribution maps
  • Density maps (heatmaps)
  • Flow maps (path maps)
  • Spider maps (origin-destination maps)
78
Q

For what are choropleth maps (filled maps) good?

A

Choropleth maps are great for showing ratio data.

Example: If you want to see obesity rates for every county across the United States, you might consider creating a choropleth map to see if you can spot any spatial trends.

79
Q

When would a proportional map work well?

A

Proportional symbol maps are great for showing quantitative data for individual locations.

Example: You can plot earthquakes around the world and size them by magnitude.

80
Q

When would a point distribution map work best?

A

Point distribution maps can be used when you want to show approximate locations and are looking for visual clusters of data.

Example: If you want to see where all the hailstorms were in the U.S. last year, you can create a point distribution map to see if you can spot any clusters.

81
Q

For what are density maps (heat maps) best at showing?

A

Density maps, also called heatmaps, can be used when you want to show a trend for visual clusters of data.

Example: If you want to find out which areas of Manhattan have the most taxi pickups, you can create a density map to see which areas are most popular.

82
Q

When would a path map work well in a viz?

A

You can use flow maps to connect paths across a map and to see where something went over time.

Example: You can track the paths of major storms across the world over a period of time.

83
Q

When would you use a spider map?

A

You can use a spider map to show how an origin location and one or more destination locations interact.

Example: You can connect paths between metro stations to plot them on a map, or you can track bike share rides from an origin to one or more destinations.

84
Q

How do you add Grand Totals to a viz?

A

1) Click the Analytics pane.

2) In the Analytics pane, under Summarize, drag Totals into the Add Totals dialog, and drop it over either the Row Grand Totals or Column Grand Totals option.

Row grand totals appear automatically on the right-side of the visualization. Column grand totals appear automatically at the bottom of the visualization.

85
Q

When using Grand Totals, why would a figure appear incorrect?

A

When you first turn on grand totals, the totals are computed using disaggregated data in the underlying data source. The discrepancy is likely due to the fact that Tableau is averaging the data in the underlying data source. There may be more numbers involved than is in the view.

To avoid this, do the following:
1) From the Analysis menu choose Totals > Total All Using > Average .

Now the average is performed on the values you see, and not on the disaggregated date in the data source.

86
Q

What is the difference between a reference line, a reference band, and a reference distribution?

A

Reference line: a constant or computed value on the axis

Reference band: Reference bands shade an area behind the marks in the view between two constant or computed values on the axis.

Reference distribution: Reference distributions add a gradient of shading to indicate the distribution of values along the axis. Distribution can be defined by percentages, percentiles, quantiles (as in the following image), or standard deviation.

Note: Reference distributions can also be used to create bullet charts.

87
Q

How do you create a reference line?

A

To add a reference line:

Drag “Reference Line” from the Analytics pane into the view.
Tableau shows the possible destinations.
The range of choices varies depending on the type of item and the current view (Table, Pane, Cell).

if the view contains a line chart with multiple or dual axes—Tableau shows you an expanded drop target area with specific continuous fields. If you drop onto those fields, Tableau will the line is added on the corresponding axis, with the specified scope.

88
Q

For what is a bullet chart good?

A

A bullet graph is a variation of a bar graph developed to replace dashboard gauges and meters. The bullet graph is generally used to compare a primary measure to one or more other measures in the context of qualitative ranges of performance such as poor, satisfactory, and good.

You can create a bullet graph by adding a distribution to indicate the qualitative ranges of performance, and a line to indicate the target.

Show Me is the easiest way to make these, but you can do it manually.

89
Q

What are the computation types possible with distribution bands?

A

Percentages: shades intervals at the specified percentiles

Quantiles: breaks the view into the specified number of tiles using shading and lines

Standard Deviation: places lines and shading to indicated the specified number of standard deviations above and below the mean

90
Q

How do you make a box plot without using Show Me?

A

1) Right-click (Control-click on a Mac) on a quantitative axis and select Add Reference Line.

2) In the Add Reference Line, Band, or Box dialog box, select Box Plot.

3) Under Plot Options, specify placement for the whiskers. You can choose either:
* “Data within 1.5 times the IQR” - places whiskers at a location that is 1.5 times the interquartile range—that is, 1.5 times further out than the width of the adjoining box. This is also known as a schematic box plot.
* “Maximum extent of the data” - places whiskers at the farthest data point (mark) in the distribution. This is also known as a skeletal box plot.

4) Specify whether to Hide underlying marks (except outliers)—that is, whether to hide all marks except those beyond the whiskers.

5) Configure the appearance of the plot by selecting a Style , Fill, Border, and Whiskers.

91
Q

When would you use the 5 trend line options available in Desktop: Linear, Logarithmic, Exponential, Power, and Polynomial

A

Linear: Used when something is increasing or decreasing at a steady rate.

Logarithmic: Best for data that increases or decreases quickly and then levels out.

Exponential: Most useful when data values rise or fall at increasingly higher rates

Power: Best used with data sets that compare measurements that increase at a specific rate

Polynomial: Used to represent data that fluctuates, such as gains and losses over a large data set.

92
Q

How do you add a trend line in Desktop?

A

To add a trend line to a visualization:

1) Select the Analytics pane.

2) From the Analytics pane, drag Trend Line into the view, and then drop it on the Linear, Logarithmic, Exponential, Polynomial, or Power model types.

93
Q

How can you add a forecast to your viz?

A

To create a forecast, your view must use at least one date dimension and one measure.

To turn forecasting on, right-click (control-click on Mac) on the visualization and either choose:
* Forecast >Show Forecast, or
* Choose Analysis >Forecast >Show Forecast.

94
Q

When creating a forecast, an options dialog box presents four options. Define each:
* Forecast length
* Source data
* Forecast model
* Prediction interval

A
  • Forecast length: Determines how far into the future the forecast extends.
  • Source data: Use the Source Data section to specify time aggregation granularity, periods to ignore, and replacing nulls with zeroes.
  • Forecast model: Specifies how the forecast model is to be produced.
  • Prediction interval: You can set the prediction interval to 90, 95, or 99 percent, or enter a custom value.
95
Q

What type of regression do predictive modeling functions in Tableau use as a default?

A

Predictive modeling functions in Tableau by default use linear regression to build predictive models and generate predictions about your data

96
Q

When would you use the two table calculation functions used in predictive modeling:
* MODEL_PERCENTILE
* MODEL_QUANTILE

A

MODEL_PERCENTILE tells you, as a percentile, where the observed mark falls within a range of probable values for each mark. If the percentile is very close to 0.5, the value observed is very close to the median value predicted. If the percentile is close to 0 or 1, the value observed is at the lower or upper boundaries of the model range and is relatively unexpected.

You can use MODEL_QUANTILE to generate a confidence interval, missing values such as future dates, or to generate categories that don’t exist in your underlying data set.

These can be used to identify outliers, estimate values for sparse or missing data, and predict values for future time periods.

97
Q

Explain each of the parts of this calculated field:
MODEL_QUANTILE(0.5,SUM([Sales]),ATTR(DATETRUNC(‘month’,[OrderDate])))

A

MODEL_QUANTILE(0.5,SUM([Sales]),ATTR(DATETRUNC(‘month’,[OrderDate])))

  • MODEL_QUANTILE - This is the predictive modeling function. It specifies a line to be drawn
  • (0.5, - Defines the quantile as the median
  • SUM([Sales]), - This is the target expression
  • ATTR(DATETRUNC(‘month’,[OrderDate]))) - this is predictor expression
    Note: The ATTR was needed to make sure that all functions are aggregations.
98
Q

When would you use the three types of regression within predictive models?
* Linear (linear)
* Regularized linear (rl)
* Gaussian process regression (gp)

A

Linear regression (default): Use when you have only one predictor, and that predictor has a linear relationship with your target metric.

Regularized linear regression: Use when you have multiple predictors, especially when those predictors have a linear relationship to the target metric and those predictors are likely affected by similar underlying relationships or trends.

Gaussian process regression: Use when you have time or space predictors, or when you’re using predictors that might not have a linear relationship with the target metric.

99
Q
A