Data Preparation Flashcards
order of operations with data
- gather raw data
- sort and filter
- format data
- remove duplicates
- summarize in better ways
how to remove duplicates
data- remove duplicates
fill a whole new column with just the first info before the column if the content looks like this: mirutzi suzuki, aos
write out content for 1-3 cells then, data - flash fill
how to create a column called Car ID?
write one under new column, car id, then highlight all the cells, then home- fill- series - series in: columns, type: linear, etc
import data
data - get data - From Text/CSV
choose table style that’s all white
click any cell - table design - table styles - etc
how do you highlight the duplicate values in the Product Id column in light red with dark red font.
highlight column - home - styles - cond. formatting - highlight cells rules - duplicate values
make a note of how many duplicates have been removed
count (bottom-right of excel window, or use =COUNTA(range) in an empty cell
tips to data prepare
create a lookup sheet that you can use for data prep
You have a column called Order Date that shows the date the order was placed with DataCo from 2015 to 2018. You might eventually want to add more date information to help with your analysis. A lookup sheet would be useful for this. how do you create a data lookup sheet?
create new sheet called Dates
Using the advanced Fill Series feature, how do I populate the Date column to show all dates from “1/1/2015” to “12/31/2018”, ensuring that every calendar day is included in the range.
click cell with 1/1/2015, then home - editing group - fill - series - stop value field 12/31/2018
remove filters currently applied
data - push the filter button
calculate the length of the text string in each cell
=len()
remove all unwanted spaces (like trailing spaces) from the values in a cell
=trim()
Ensure that a column is populated with values only, and not the formula used in previous steps
select whole column - ctrl+c - ctr+alt+v
A column shows USA as “EE.UU.” How do you find and select all in “customer country” column that contain “ee. uu.” and replace it with “USA”
column - home - find and select - replace all
combine text
=concat(cell,”, “cell)
Protect the Customers worksheet to only allow users to select locked and unlocked cells and format cells, columns, and rows
search “protect” or
Review - protect sheet
create new column week day and ensure week starts sunday
create column, then
=WEEKDAY(A2,1)
Display the current date in cell E3, adding a clear label above the date.
=today()
what formula is needed to create a Male Over 30 column
=AND([@Gender]=”Male”,[@Age]>30)
or
=IF(AND(NOT([@Gender]=”Female”),[@Age]>30),”TRUE”,”FALSE”)
Calculate the average product price from all products sold in price column.
=average (click column)
using NOT(), create a flag that identifies products with a price higher than or equal to the average.
create a cell in H3 that displays avg product price, then,
=Not(@product price>=$h$3)
If-And logic
=IF(AND(E2=0, F2=”Same Day”), “Same Day - On Time,” “Other”)
If(both (e2=0, and f2-same day) then “same day - on time”; otherwise= “other”)