Evaluate, Prepare and Connect to Data - 25% Flashcards
benefits of using an extract
underlying data source is slow
snapshot of the data is optimized for aggregation
can be used offline
privacy - data source filters and hiding unused fields keep that data out of the extract
when to use an extract
If you want your visualizations to show weekly/monthly/annual trends, use an extract. Extracts are also recommended when you’re publishing a multi-connection data source.
If workbook performance is more important than data freshness, then use an extract.
If your workbook contains sensitive data, an extract may be best.
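The trade-offs on these extract cards can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module, not Tableau's extract format or API; the `sales` table, its columns, and the `sales_extract` name are hypothetical.

```python
import sqlite3

# Hypothetical "live" source that includes a sensitive column.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE sales (day TEXT, region TEXT, amount REAL, customer_email TEXT)"
)
source.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2024-01-01", "East", 100.0, "a@example.com"),
        ("2024-01-02", "East", 50.0, "b@example.com"),
        ("2024-01-02", "West", 70.0, "c@example.com"),
    ],
)

# Take the snapshot: copy only the fields the analysis needs,
# so the sensitive column never reaches the local copy.
rows = source.execute("SELECT day, region, amount FROM sales").fetchall()
extract = sqlite3.connect(":memory:")
extract.execute("CREATE TABLE sales_extract (day TEXT, region TEXT, amount REAL)")
extract.executemany("INSERT INTO sales_extract VALUES (?, ?, ?)", rows)

# The source changes and then becomes unavailable; the snapshot is unaffected.
source.execute(
    "INSERT INTO sales VALUES ('2024-01-03', 'West', 999.0, 'd@example.com')"
)
source.close()

# Aggregation runs against the local snapshot only.
totals = extract.execute(
    "SELECT region, SUM(amount) FROM sales_extract GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

The printed totals leave out the row added after the snapshot was taken, which is the freshness cost of an extract. In return, the query still works with the source closed and the email column is absent from the copy.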
benefits of live connection
Real-time updates
cons of live connection
Databases are not always optimized for fast performance (unlike extracts) – As data queries go through the database, they can only be as fast as the database itself. Accordingly, working with a live connection may be slow.
Other factors can affect speed – e.g. poor network speed and network traffic can slow down your workbook.
Stress – Live connections, especially within complex workbooks, can stress some traditional databases.
when to use live connection
A business that needs incoming sales data to make real-time decisions would require a live database connection.
If business decisions need to be predicated upon real-time data, utilise a live connection.
How to pivot data in Tableau
Select two or more columns you want to pivot, then choose Pivot from the column drop-down menu
You can add more columns to an existing pivot
Troubleshooting pivots
Red fields in the view and fields with exclamation points in the Data pane: Because the original fields are replaced with new pivot fields, any references to the original fields in the view will no longer work.
Null values in the grid: If all of the original fields used in the pivot are removed, for example in an extract refresh, null values display in the pivot fields.
No pivot option: Pivot appears when you select two or more columns in a single Microsoft Excel, text file, Google Sheets, or .pdf data source. If you are using a different data source in Tableau Desktop, you can use custom SQL to pivot.
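A custom SQL pivot usually means stacking one SELECT per pivoted column with UNION ALL. The sketch below runs that pattern against an in-memory SQLite database from Python so the result can be inspected. The `sales_wide` table and its columns are hypothetical, and the exact SQL dialect depends on your database.

```python
import sqlite3

# Hypothetical wide table: one column per quarter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_wide (region TEXT, q1 REAL, q2 REAL)")
conn.executemany(
    "INSERT INTO sales_wide VALUES (?, ?, ?)",
    [("East", 100.0, 120.0), ("West", 80.0, 90.0)],
)

# One SELECT per pivoted column; UNION ALL stacks them so each row
# carries the original column name ('quarter') and its value ('sales').
# ORDER BY is only here to make the printed output deterministic.
PIVOT_SQL = """
SELECT region, 'q1' AS quarter, q1 AS sales FROM sales_wide
UNION ALL
SELECT region, 'q2' AS quarter, q2 AS sales FROM sales_wide
ORDER BY region, quarter
"""

pivoted = conn.execute(PIVOT_SQL).fetchall()
for row in pivoted:
    print(row)
```

Two wide rows with two quarter columns become four long rows: one per region and quarter.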
What changes can be made in the metadata
Data type of some fields
Renaming and hiding fields
Column aliases can be assigned
Do changes to the metadata modify the underlying data
no
What are relationships
Relationships are the default method in Tableau Desktop; they preserve each original table's level of detail (LOD) when combining information
Allow context-based joins to be performed sheet by sheet, which makes each data source more flexible
Can published data sources be used in joins
No; you must edit the original data sources to natively contain the join, or use a data blend
Advantages of relationships
Makes data easier to define, change, and reuse
Easier to analyze data across multiple tables at the correct LOD
Does not require LOD expressions
Only queries data from tables with fields used in the current viz
What are joins
Static ways to combine data
Must be defined between tables before analysis
There must be a matching field to join the data
Can’t be changed without impacting all sheets that use the data source
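Why level of detail matters when tables are combined can be shown with a small sketch. It uses Python's sqlite3 module with hypothetical `orders` and `items` tables. It illustrates the general row-duplication problem of a static join, not the queries Tableau actually generates for relationships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, shipping REAL);
CREATE TABLE items  (order_id INTEGER, product TEXT, price REAL);
INSERT INTO orders VALUES (1, 10.0), (2, 5.0);
INSERT INTO items  VALUES (1, 'pen', 2.0), (1, 'ink', 3.0), (2, 'pad', 4.0);
""")

# Join first, then aggregate: order 1 matches two item rows,
# so its shipping cost is counted twice.
joined_shipping = conn.execute(
    "SELECT SUM(o.shipping) FROM orders o JOIN items i ON o.order_id = i.order_id"
).fetchone()[0]

# Aggregate at the orders table's own level of detail instead.
native_shipping = conn.execute("SELECT SUM(shipping) FROM orders").fetchone()[0]

print(joined_shipping)  # 25.0 (inflated by the duplicated order row)
print(native_shipping)  # 15.0 (one row per order)
```

With a join, avoiding the inflated total typically takes an LOD expression or a pre-aggregated table. Keeping each table at its native level of detail is the behavior the relationship cards describe.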
Requirements for using relationships
Related fields must be same data type
You can’t define relationships based on geographic fields
Circular relationships aren’t supported in the data model
Can’t edit relationships in a published data source
Can’t define relationships between published data sources
Workbook must use an embedded data source to edit relationships and performance options on the Data Source page in Tableau Server/Online
Factors that limit the benefits of using related tables
Dirty data
Using data source filters limits Tableau’s ability to do join culling (removing unnecessary joins from queries)
Tables with a lot of unmatched values across relationships
Interrelating multiple fact tables with multiple dimension tables