Labs Flashcards

1
Q

Which 4 activities fall into data engineering (data pre-processing)?

A
  • data integration
  • data transformation
  • data reduction
  • data cleaning
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is ETL?

A

Extraction, Transformation & Load

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are the definitions of Extraction, Transformation & Load (ETL)?

A
  • Extraction: selecting data from one or more sources and reading the selected data
  • Transformation: converting data from their original form to whatever form the DW needs. This step often also includes cleansing of the data to remove as many errors as possible.
  • Loading: putting the converted (transformed) data into the DW
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Thinking from a Power BI view, what are the different activities in the data integration process (5 in total)?

A
  • Data Loading – read and convert from data sources
  • Data Transformations – join, aggregate, filter, convert data
  • Data de-duplication – finds multiple records referring to the same entity, merges them
  • Data Profiling – builds tables, histograms, etc. to summarize data
  • Data Quality – test against master values, known business rules, constraints, etc.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What does th Power Query Editor do?

A
  • you can connect to one or many data sources,
  • shape and transform the data to meet your needs,
  • then load that model into Power BI Desktop
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is appending in Power BI?

A

When you have additional rows of data that you’d like to add to an existing query, you append the query.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is merging in Power BI?

A

When you have one or more columns that you’d like to add to another query, you merge the queries

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is M language used for?

A

M is the query language that can be viewed in the Power Query Editor.

Power Query (or M) is an ETL (Extraction, Transformation, Load) tool, so it gets data and transforms it into tables that can be loaded and structured in a data model.

e.g. after filtering, if you go to the Advanced Editor, you can find the underlaying M code that was written by Power BI for filtering the values.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is DAX language used for?

A

DAX is the language you use when you create transformations in the Power BI Desktop main screens rather than via the query editor.

DAX creates Measures & Calculated columns.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is the difference between calculated columns and measures?

A

A calculated columns works at row level.

A Measure value is only activated given a certain context.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What are some of Power BI’s built-in time intelligence functions?

A
  • SAMEPERIODLASTYEAR
  • TOTALYTD
  • PARALLELPERIOD
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Which three views do you have in Power BI Desktop?

A
  • Report view
  • Data View
  • Relationship View
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is a dimensional model used for?

A

A Dimensional model is designed to read, summarize, analyze numeric information like values, balances, counts, weights, etc. in a data warehouse.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What are relation models used for?

A

relation models are optimized for addition, updating and deletion of data in a real-time Online Transaction System (Databases).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is a dimensional model?

A
  • A Dimensional Model is a data structure technique optimized for Data warehousing tools. It is the architecture behind a DW.
  • The concept of Dimensional Modelling was developed by Ralph Kimball and is comprised of “fact” and “dimension” tables.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is a surrogate key?

A

One should create anonymous integer Primary keys for every dimension, a Surrogate Key (SK).

17
Q

What are Additive, Semi-Additive and Non-Additive facts?

A
  • Additive facts can be summed up through all of the dimensions in the fact table. (Sales Amount, Quantity)
  • Semi-additive facts can be summed up for some of the dimensions in the fact table, but not the others. (Current Bank Account Balance)
  • Non-additive facts cannot be summed up for any of the dimensions present in the fact table. (Record Student attendance)