Data Preparation Flashcards

1
Q

order of operations with data

A
  1. gather raw data
  2. sort and filter
  3. format data
  4. remove duplicates
  5. summarize in better ways
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

how to remove duplicates

A

data- remove duplicates

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

fill a whole new column with just the first info before the column if the content looks like this: mirutzi suzuki, aos

A

write out content for 1-3 cells then, data - flash fill

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

how to create a column called Car ID?

A

write one under new column, car id, then highlight all the cells, then home- fill- series - series in: columns, type: linear, etc

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

import data

A

data - get data - From Text/CSV

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

choose table style that’s all white

A

click any cell - table design - table styles - etc

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

how do you highlight the duplicate values in the Product Id column in light red with dark red font.

A

highlight column - home - styles - cond. formatting - highlight cells rules - duplicate values

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

make a note of how many duplicates have been removed

A

count (bottom-right of excel window, or use =COUNTA(range) in an empty cell

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

tips to data prepare

A

create a lookup sheet that you can use for data prep

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

You have a column called Order Date that shows the date the order was placed with DataCo from 2015 to 2018. You might eventually want to add more date information to help with your analysis. A lookup sheet would be useful for this. how do you create a data lookup sheet?

A

create new sheet called Dates

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Using the advanced Fill Series feature, how do I populate the Date column to show all dates from “1/1/2015” to “12/31/2018”, ensuring that every calendar day is included in the range.

A

click cell with 1/1/2015, then home - editing group - fill - series - stop value field 12/31/2018

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

remove filters currently applied

A

data - push the filter button

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

calculate the length of the text string in each cell

A

=len()

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

remove all unwanted spaces (like trailing spaces) from the values in a cell

A

=trim()

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Ensure that a column is populated with values only, and not the formula used in previous steps

A

select whole column - ctrl+c - ctr+alt+v

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

A column shows USA as “EE.UU.” How do you find and select all in “customer country” column that contain “ee. uu.” and replace it with “USA”

A

column - home - find and select - replace all

17
Q

combine text

A

=concat(cell,”, “cell)

18
Q

Protect the Customers worksheet to only allow users to select locked and unlocked cells and format cells, columns, and rows

A

search “protect” or
Review - protect sheet

19
Q

create new column week day and ensure week starts sunday

A

create column, then
=WEEKDAY(A2,1)

20
Q

Display the current date in cell E3, adding a clear label above the date.

21
Q

what formula is needed to create a Male Over 30 column

A

=AND([@Gender]=”Male”,[@Age]>30)

or
=IF(AND(NOT([@Gender]=”Female”),[@Age]>30),”TRUE”,”FALSE”)

22
Q

Calculate the average product price from all products sold in price column.

A

=average (click column)

23
Q

using NOT(), create a flag that identifies products with a price higher than or equal to the average.

A

create a cell in H3 that displays avg product price, then,

=Not(@product price>=$h$3)

24
Q

If-And logic

A

=IF(AND(E2=0, F2=”Same Day”), “Same Day - On Time,” “Other”)

If(both (e2=0, and f2-same day) then “same day - on time”; otherwise= “other”)

25
Alter this"Cash Payments" formula by using a nested IF() so it has three output values: "Cash Over 200", "Cash Not Over 200", and "Non-Cash Payments" =IF(AND(Z2="CASH",Y2>200), "Cash Over 200","Other")
26
VLOOKUP
VLOOKUP(cell content I'm using as reference, table array that has both the reference column and the content (column) I'm after, the number col in the array that I highlighted, exact or appx match)
27
Using order id and order quantity in your pivot tables, calculate the average order quantity per order per market in column D. Add the header "Average Quantity per Order" above your calculation.
pivottable- columns: SUM of order quantity, COUNT of order Id; rows: market; divide
28
how do you see how sales vary per weekday?
1. create a new column called "Day of Week" in orders sheet 2. Pull in the Week Day (number) from the Dates ie. =vlookup(orderdate, dates array containing weekday#, etc), format as general, then new column =text(weekday#,"dddd") 3.new pivot - rows:weekday, columns:sum total order sales
29
how do you refresh a pivottable
click on pivtotable - pivottable analyze - refresh