Prepare - Clean, transform, and load data in Power BI Flashcards
What is shaping the data?
Shaping data is about cleaning, simplifying and organising data to meet your needs and tell your story
What are the 7 steps in shaping data?
Step 1 - Identify column headers and names
Check that the column headers are correct
Remove non-descriptive columns
Step 2 - Promote headers
Power Query sometimes treats every row as data and does not recognise the first row as column headers; promote that row to headers when this happens
Step 3 - Rename Columns
Check to make sure columns are named correctly
Adopt naming conventions
Step 4 - Remove top rows
Remove top rows (for example, blank or title rows) so the table starts at the actual data
Step 5 - Remove columns
Remove unnecessary columns
Step 6 - Unpivot columns (many to 2)
Changes column headers into rows (an attribute column and a value column)
Step 7 - Pivot columns (2 to many)
Changes rows back into column headers (see the M sketch below, which walks through several of these steps)
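A minimal M sketch covering several of these steps, using placeholder data (table and column names are illustrative; the blank top row is removed before the headers are promoted):

let
    // Placeholder data standing in for a raw import whose headers were not recognised
    Source = #table(
        {"Column1", "Column2", "Column3", "Column4"},
        {
            {null, null, null, null},
            {"Customer", "Jan", "Feb", "Internal Code"},
            {"Contoso", 100, 120, "X1"},
            {"Fabrikam", 90, 80, "X2"}
        }
    ),
    // Step 4: remove the blank top row
    RemovedTopRows = Table.Skip(Source, 1),
    // Step 2: promote the first remaining row to column headers
    PromotedHeaders = Table.PromoteHeaders(RemovedTopRows, [PromoteAllScalars = true]),
    // Step 3: rename columns to follow a naming convention
    Renamed = Table.RenameColumns(PromotedHeaders, {{"Customer", "Customer Name"}}),
    // Step 5: remove unnecessary columns
    RemovedColumns = Table.RemoveColumns(Renamed, {"Internal Code"}),
    // Step 6: unpivot the month columns (many columns become two)
    Unpivoted = Table.UnpivotOtherColumns(RemovedColumns, {"Customer Name"}, "Month", "Sales")
in
    Unpivoted

Step 7 (Pivot) is the reverse operation: Table.Pivot turns an attribute/value pair of columns back into many columns.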
What is best practice in shaping data?
Best practice
Remove unnecessary columns as early as possible, before the data reaches the data model, so that only the most important information is retained
Best steps are
- Removing Columns
- Unpivot
- Pivot
What is simplifying the data structure?
When you import data from multiple sources into Power BI Desktop, the data retains its predefined table and column names. You might want to change some of these names so that they are in a consistent format, easier to work with, and more meaningful to a user.
What are the 4 steps in simplifying the data structure?
Rename a query
It’s good practice to change uncommon or unhelpful query names to names that are more obvious or that the user is more familiar with.
Remove unnecessary suffixes and prefixes
Replace values
Replace values that are uninformative or misspelled
Replace null values
Data sources often contain a null (nothing entered) where a zero value is intended
Replacing these is important because calculations on null values will not behave as expected
Remove duplicates
Remove duplicates so that only unique values remain (see the M sketch after this list)
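A minimal M sketch of the replace and remove-duplicates steps, using placeholder data (the misspelling, the null, and the duplicate row are illustrative):

let
    Source = #table(
        {"CustomerID", "City", "Discount"},
        {
            {1, "Melborne", 0.1},
            {2, "Sydney", null},
            {2, "Sydney", null}
        }
    ),
    // Replace a misspelled value with the correct one
    FixedSpelling = Table.ReplaceValue(Source, "Melborne", "Melbourne", Replacer.ReplaceText, {"City"}),
    // Replace null with 0 so calculations behave as expected
    NoNulls = Table.ReplaceValue(FixedSpelling, null, 0, Replacer.ReplaceValue, {"Discount"}),
    // Keep only unique rows based on the key column
    Deduplicated = Table.Distinct(NoNulls, {"CustomerID"})
in
    Deduplicated

Renaming the query itself is done in the Queries pane (or the query settings), not in the M code.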
What is best practice for naming tables, columns, and values?
General
Use language and abbreviations commonly used within your organisation
Replace underscores with spaces
Give tables, columns, measures descriptive business terms
Be consistent with abbreviations and prefixes
Short abbreviations can cause confusion
Remove prefixes and suffixes
Values
Think about how values might appear on a report or visual
Values that are too long might be difficult to read on a visual
Values that are too short might be difficult to interpret
Try to avoid acronyms as much as possible
Why evaluate column data types?
When you import a table from any data source, Power BI Desktop automatically starts scanning the first 1,000 rows (default setting) and tries to detect the type of data in the columns.
Some situations might occur where Power BI Desktop does not detect the correct data type. Where incorrect data types occur, you will experience performance issues.
When are data detection problems worst?
Non-Database: You have a higher chance of getting data type errors when you are dealing with flat files, such as comma-separated values (.CSV) files and Excel workbooks (.XLSX), because data was entered manually into the worksheets and mistakes were made.
Databases: Conversely, in databases, the data types are predefined when tables or views are created.
What are the implications of incorrect data types?
When Power BI does not detect the correct data type, various problems can arise:
- Certain calculations are prevented
- Hierarchies cannot be derived
- Proper relationships with other tables cannot be created
- A date hierarchy cannot be created (the column is treated as a flat list of values rather than dates)
What are the best practice ways to evaluate columns?
Data type
Evaluate column data types before loading data into a Power BI data model
Format data
Apply a format to the column (e.g. currency, decimal)
Change the default summarisation if needed
Date
It is best practice to use a date table and turn off auto date/time, which removes the auto-generated hierarchy (see the M sketch below for setting column types)
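A minimal M sketch of setting explicit column data types with placeholder data (turning off auto date/time is done in Power BI Desktop's options rather than in M):

let
    Source = #table(
        {"OrderDate", "Amount"},
        {
            {"2024-01-15", "100.50"},
            {"2024-02-03", "87.20"}
        }
    ),
    // Set explicit data types so relationships, date hierarchies,
    // and calculations work as expected downstream
    Typed = Table.TransformColumnTypes(
        Source,
        {{"OrderDate", type date}, {"Amount", Currency.Type}},
        "en-US"
    )
in
    Typed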
Why combine multiple tables into a single table?
There are numerous reasons why you might need to combine queries or tables together.
Data model: too many tables exist, making it difficult to navigate an overly-complicated data model.
Redundant tables: a) several tables have a similar role OR b) table has only a column or two that can fit into a different table.
Single source of truth: reports and dashboards are derived from one-source-of-truth
Custom Column: You want to use several columns from different tables in a custom column
What are the two ways to combine tables?
Append (vertical): adds rows to the bottom of a table
Merge (horizontal): adds columns to the side of a table
How can you append tables?
Select the tables you want to append from “Available tables” and move them to “Tables to append”. You can append queries as a new query or append to an existing one.
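A minimal M sketch of an append, assuming two placeholder queries with the same column layout:

let
    Sales2023 = #table({"Product", "Amount"}, {{"A", 100}, {"B", 200}}),
    Sales2024 = #table({"Product", "Amount"}, {{"A", 150}, {"C", 50}}),
    // Append: stack the rows of one table underneath the other
    Appended = Table.Combine({Sales2023, Sales2024})
in
    Appended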
How can you merge tables?
Select the tables you want to merge and a matching column between them, then select one of the following join kinds (see the M sketch after this list):
Left Outer - Displays all rows from the first table and only the matching rows from the second.
Full Outer - Displays all rows from both tables.
Inner - Displays the matched rows between the two tables.
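A minimal M sketch of a merge using a left outer join, with placeholder tables and a hypothetical ProductID key:

let
    Sales = #table({"ProductID", "Amount"}, {{1, 100}, {2, 200}, {3, 50}}),
    Products = #table({"ProductID", "ProductName"}, {{1, "Bike"}, {2, "Helmet"}}),
    // Merge: keep all Sales rows and bring in matches from Products
    Merged = Table.NestedJoin(Sales, {"ProductID"}, Products, {"ProductID"}, "Products", JoinKind.LeftOuter),
    // Expand the nested table column to surface ProductName next to the Sales columns
    Expanded = Table.ExpandTableColumn(Merged, "Products", {"ProductName"})
in
    Expanded

Here ProductID 3 has no match, so its ProductName is null, which is the left outer behaviour described above.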
What is best practice when combining tables?
Step 1: Edit and format first
- Delete all extraneous columns
- Keep only the pertinent columns
- Rename columns so the different tables share the same column names
Step 2: Join the tables
What is profiling data in Power BI?
Profiling data is about evaluating data quality: determining anomalies, examining and developing the underlying data structures, and querying data statistics such as row counts, value distributions, minimum and maximum values, averages, and so on.
Why is profiling data important?
This concept is important because it allows you to shape and organize the data by:
- Removing outliers
- Finding anomalies
- Checking distributions
- Investigating specific data points
Explain the different profiling tools?
Column Distribution
Distribution visualisation at the top of columns
Displays the counts of distinct and unique values
Column Quality
Gives the percentage of valid, error and empty values
Column Profile
Gives a more in-depth look at the statistics within a column, based on the first 1,000 rows of data
Row count = can indicate whether the import was successful
Column statistics = summary statistics such as minimum, maximum, average, and zero counts
Value distribution = the count for each distinct value within the column
What is a common troubleshoot with profile tools?
Profiling results can be misleading when a column does not use the correct data type (e.g. numbers stored as text)
What is advanced editor?
- Each time you shape data in Power Query, you create a step in the Power Query process.
- Each cleaning step that you made was likely created by using the graphical interface, but Power Query uses the M language behind the scenes.
- These steps can be read in the Power Query Advanced Editor
How is M code written?
- Each step is one or two lines of code
- M code is written top-down (later steps refer to earlier steps by the name on the left of the equals sign)
- It is case sensitive
- A query starts with let and ends with in (see the sketch below)
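A minimal example showing that structure with placeholder data (misspelling a name, such as writing table.SelectRows instead of Table.SelectRows, produces an error because M is case sensitive):

let
    // Each step is named on the left of the equals sign...
    Source = #table({"Value"}, {{1}, {2}, {3}}),
    // ...and later steps refer to earlier steps by that name
    Doubled = Table.TransformColumns(Source, {{"Value", each _ * 2, type number}}),
    Filtered = Table.SelectRows(Doubled, each [Value] > 2)
in
    // The expression after "in" is what the query returns
    Filtered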
What is the difference between DAX and M code?
DAX
- App = Power BI Desktop
- Purpose = used to create new calculated columns and measures within Power BI Desktop
M-code (Mashup)
- App = language used by Power Query
- Purpose = (a) connect to data sources (b) simplify, combine, and transform data in Power Query Editor
What are table transforms?
Transformations applied to a whole table rather than to values within individual columns (think of the table icon versus the column icon shown on functions)
Name the table transforms?
- Promote or demote column headers
- Filter by row position
- Filter by values
- Choose or remove columns
- Group or summarise rows
- Unpivot or pivot
- Transpose
- Reverse rows
- Data types
- Dealing with errors
- Working with duplicates
What are column transforms?
Transforms values within columns
Name the column transforms?
- Fill values
- Sort columns
- Rename columns
- Move columns
- Replace values
- Parse text
- Extract text
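A minimal sketch of a few of these column transforms (fill values, text cleanup, sort, rename), using placeholder data:

let
    Source = #table(
        {"Region", "Product"},
        {
            {"North", "  bike  "},
            {null, "helmet"}
        }
    ),
    // Fill values: copy the value above into the nulls below it
    Filled = Table.FillDown(Source, {"Region"}),
    // Clean up text within a column (trim whitespace, capitalise)
    Cleaned = Table.TransformColumns(Filled, {{"Product", each Text.Proper(Text.Trim(_)), type text}}),
    // Sort by a column and rename it
    Sorted = Table.Sort(Cleaned, {{"Product", Order.Ascending}}),
    Renamed = Table.RenameColumns(Sorted, {{"Product", "Product Name"}})
in
    Renamed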
How do you record profile statistics?
- Column quality = copy quality metrics
- Column distribution = copy distribution
- Column profile = copy value distribution
What is default profile behaviour and what is the alternative?
- The default is to profile the first 1,000 rows (which is not a full reflection of the data)
- Change to the entire dataset for a more complete picture (this can slow the Power Query Editor down); see the M sketch below for a code-based alternative
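Profiling is normally toggled from the status bar of the Power Query Editor, but as an alternative sketch, M's Table.Profile function computes similar statistics over the whole table (placeholder data):

let
    Source = #table({"Amount"}, {{10}, {25}, {null}, {25}}),
    // Returns Min, Max, Average, StandardDeviation, Count, NullCount,
    // and DistinctCount for each column of the input table
    Profile = Table.Profile(Source)
in
    Profile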