Data transformation in Tableau Prep Flashcards
The _ tab allows you to select the connection to your data, which can be stored in a file (e.g., Excel), on a cloud server (e.g., BigQuery), or a different type of data storage solution.
Connections
(Image 1)
How many ways are there to connect to your data?
- “Connection” Tab
- “Connect to Data” button
(Image 1)
How can you select the connection to your data?
With the Connections tab at the upper left corner
e.g. Connections > To a File > Text File
OR
Select “Connect to Data” at the top of screen, above “Recent Flows”
(Image 1)
The connection to your data can be stored [in a file/ on a cloud server/both]
both
also a different type of data storage solution
(Image 1)
How to edit an existing data flow
Select “Open a Flow” (Above “Recent Flows”)
(Image 1)
How to create a new data flow
Select “Connect to Data” (Above “Recent Flows”)
(Image 1)
What is an easy way to familiarize yourself with the functionalities available in Tableau Prep Builder?
- Open the 2 “Sample Flows” at the bottom of the screen and view how the data flows were set up and manipulate them
- View “Training” and “Resources” on the “Discover” tab
(Image 1)
T/F
After Connecting to the data, you need to update the parameter settings
True
After connecting, you click the table on the left side and change the settings as needed
What do the settings in Image 2, where will the column header and data values start?
Column header will be the first row and the data values will start from the second row
T/F
Tableau Prep Builder cannot detect various data types (e.g., dates, numeric values, strings) and relationships between data fields.
False
Tableau Prep Builder can detect various data types (e.g., dates, numeric values, strings) and relationships between data fields.
T/F
Tableau Prep is helpful for loading data from multiple data sources that have different data types
False
Tableau Prep is helpful for loading data from multiple data sources that have the same data types
Tableau Prep is helpful for loading data from multiple data sources that have the same data types bc ___
Tableau Prep Builder detects various data types (e.g., dates, numeric values, strings) and relationships between data fields.
After opening the dataset, the second step is to _
clean the dataset
How do you clean a dataset?
- click on the + icon next to the dataset in the top workspace
- select “Clean Step”
(Image 3)
Change data types, grouping similar values together, and splitting column values is an example of _ in Tableau Prep
Cleaning data
How can you replace the negative values with null values in a column (let’s call it “Founded” column)?
What are the steps?
Image 6 shows what you have to what you want the chart to look like
1- Get to “Clean Step”:
- (click + icon next to the dataset in the top workspace
- “Clean Step”) (Image 3)
2- Create a calculated field:
- Select the “…” button in the Founded column
- “Create Calculated Field”
- “Custom Calculation” (Image 4)
- Fill in “Field Name” with “Replace Negative with Nulls” &
- write “IF [Founded] < 0 THEN NULL ELSE [Founded] END”)
- “Save” (Image 5)
T/F
Aggregating data allows you to adjust to the level of detail of your data table.
True
Aggregating data allows you to _
adjust to the level of detail of your data table
T/F
Aggregating data allows you to combine data from multiple tables together that have the same level of detail.
True
T/F
Aggregating data allows you to combine data from multiple tables together that have different levels of detail.
False
Aggregating data allows you to combine data from multiple tables together that have the same level of detail.
How can you aggregate data in Tableau Prep?
1- Add aggregate step to workflow canvas:
- (click + icon next to the dataset in the top workspace
- “Aggregate”) (Image 7) Results in Image 8
2- Add the appropriate fields:
- drag and drop any additional fields on the left pane onto the fields on the right** Image 9**
- Change to the appropriate aggretation (e.g. change from SUM to AVG)
See Image 10 for result
How can you create a pivot in Tableau Prep?
1- Add “Pivot”:
- (click + icon next to the dataset in the top workspace
- “Pivot”) (Image 11)
2- Edit “Pivoted Fields” pane:
- Select appropriate option: Columns to Rows VS Rows to Columns (Image 12)
- Drag & Drop the appropriate fields to “Pivoted Fields” panel
How can you create a join in Tableau Prep?
(Assume that “Table 1” is already in the Flow and “Table 2” is already connected)
Create a Join interface:
Way 1: Drag “Table 1” to the right of “Table 2”
* Hover “Table 2” onto the RIGHT of “Table 1”
* drop it in the “Join” popup. (See Image 15 & 16 to see that the table can be original or cleaned)
Way 2 ** Add a Join Step**
1. Click + icon next to the dataset in the top workspace and select “Join”) (Image 3)
2. Hover Table 2 over to the LEFT of the Join step
3. Drop Table 2 into the “Add” popup. (Note that if you hover to the RIGHT/BELOW the Join step (using the Join/Union popups), it will create another Join/Union step. Not what we want. We want to add a Table to Join step we just created in step 1 of Way 2.)
After either Way 1 or 2, edit join type & the join clause
Results: Image 13
How can you create a union in Tableau Prep?
(Assume that “Table 1” is already in the Flow and “Table 2” is already connected)
Create a Union interface:
1a- (Way 1) Drag “Table 1” below “Table 2”
* Hover “Table 2” UNDER “Table 1” and drop it in the “Union” popup. (See Image 15 & 16 to see that the table can be original or cleaned.)
1b- (Way 2)** Add a Union Step**
1. Click + icon next to the dataset in the top workspace and select “Union”) (Image 3)
2. Hover the 2nd table over to the LEFT of the Union step
3. Drop into the “Add” popup. (Image 17) (Note that if you hover to the RIGHT or BELOW of the Union step (using the Join/Union popups), it will create another Join/Union step. Not what we want. We want to add a Table to the existing Union step.)
Results: Image 14
T/F
The Union interface has info on resulting and mismatched fields.
True
T/F
To make your data run faster, rename steps to reflect their purpose or the transformation they perform.
False
To make your flow easier to understand and maintain, rename steps to reflect their purpose or the transformation they perform.
T/F
Using multi-field operations where possible helps to minimize steps, avoiding unnecessary complexity.
True
Which of these will simplify your workflow:
* cleaning and validating data
* limiting initial rows
* using meaningful step names
* using sampling
* using calculated fields wisely
* minimizing steps
* filtering early
- Use meaningful step names: Rename steps to reflect their purpose or the transformation they perform. This practice makes your flow easier to understand and maintain.
- Minimize steps: While it’s important to break down your workflow into logical steps, avoid unnecessary complexity. Use multi-field operations where possible and consolidate steps that perform similar transformations.
Which of these will optimize preformance:
* cleaning and validating data
* limiting initial rows
* using sampling
* using calculated fields wisely
* minimizing steps
* filtering early
- limiting initial rows
- using sampling
- filtering early
Apply filters as [early/late/centric] as possible in your flow to reduce the volume of data processed in subsequent steps.
early
When exploring and building your flow, limit the number of [columns/rows] initially loaded to speed up performance.
rows
Where can you limit the number of rows initially loaded?
In the input step setting
Try to keep your columns below _
10
The more columns that are in your input data, the [larger/smaller] the number of rows that Prep will allow
smaller
T/F
Prep will somtimes apply sampling if the number of rows in your dataset is larger than the maximum number of rows.
False
Prep will only apply sampling if the number of rows in your dataset is larger than the maximum number of rows.
https://www.tableau.com/blog/how-to-use-sampling-in-tableau-prep
Prep will only apply sampling if the number of rows in your dataset is larger than the maximum number of rows.
True
https://www.tableau.com/blog/how-to-use-sampling-in-tableau-prep
Which of these will Ensure Data Quality:
* cleaning and validating data
* limiting initial rows
* using sampling
* using calculated fields wisely
* minimizing steps
* filtering early
- Clean and validate data: Utilize the Clean step to address data quality issues such as duplicates, nulls, and outliers. Validate your data at various stages of the flow to ensure accuracy.
- Use calculated fields wisely: Calculated fields are powerful for data transformation, but complex calculations can impact performance and readability.
T/F
The Clean step is used to address data quality issues such as duplicates, nulls, and outliers.
True
T/F
Validate your data at early stages of the flow to ensure accuracy.
False
Validate your data at various stages of the flow to ensure accuracy.
T/F
Tableau Desktop focuses on data preperation & cleaning.
False
Tableau Prep Builder focuses on data preperation & cleaning.
Tableau Desktop VS Tableau Prep Builder
What’s their focus?
Visualizing data & creating dashboards
VS
data preparation & cleaning
T/F
Tableau Prep has a coding interface.
False
A Table is a dataset
True
A file is a dataset
True
A dataset is what?
A dataset is simply the collection of data, often organized as tables or files, that you use to create visualizations and dashboards.
T/F
Dataset, datasource, data frame, and database are all the same thing.
True
What are 3 other names for data source
Dataset, data frame, and database
A database with multiple tables is called a
Relational Database