Data Analysis Flashcards
What is EDA?
Exploratory Data Analysis
what is the process of eda
- prep (collecting, cleaning)
- explore (learn variables, compute summaries, find correlations/trends, visualize the data
- hypothesis and further analysis
what are summary statistics?
measures of central tendency: mean, median, avg
Format Sum of Licenses Bought (in your pivot table) as a number with a , separator and 0 decimal places.
Right-click # - number format - number
Create a Calculated Field that divides Sales Amount by Number of Users and call it “Sales per Sub”.
click in pivot - pivottable analyze - fields,items, and sets - calculated field
which is switch and which is and IF statement?
B2>100,”you’re rich!”,”Keep saving.”)
B2,1,”One”,2,”Two”,3,”Three”
IF=B2>100,”you’re rich!”,”Keep saving.”)
SWITCH=B2,1,”One”,2,”Two”,3,”Three”
how to find relationship between discounts and number of units sold
calculated field - =discounts/units sold
explain SWITCH logic
=switch[@[from this column]],
(if)”Basic”,(then)10,
“Premium”,10,
“Business”,25,
“Enterprise”,27,
(if none of these, then) 0)
0 is a necessary default value
IF formula for new column called “Non-Paying Users” that finds the difference on the row level between the Number of Users and Licenses Bought
=IF(#ofusers -licensesbought<0,0,#ofusers -licensesbought)
logic: if #users-licenses is less than 0, then 0, if not, then show the difference
how do you re-include a moved column into the same nominal table as the rest of the sheet?
click in table - table design - resize table
how do i Update a formula to only multiply if the Sales Month is the latest month, otherwise default to 0?
- find latest Sales Month =MAX([Sales Month])
- Modify the formula in your “Current Upsell $” column to only multiply when the Sales Month matches the latest month
=IF([@[Sales Month]]=[@[Max Sales month]],[@[Non-paying users]]*[@[Upsell Per non-paying customer]],0)
how do you Count the number of customers per month using Account Sales History, Customer ID, and Sales Month
=countif(whole sales month column, sales month specified on new sheet)
how do you calculate the total number of sales for the month
=sumif(range:Sales month from account sales history og sheet, single cell sales month from sales trends sheet, sales amount from og sheet)
how do you average the Total Sales per month by Number of Customers
=averageif(range:#of customers columnB:B, criteria:first cell #ofcust B:2,average_range: Total Sales column)
how do you test your hypothesis that hat the share of non-paying customers (“Basic” subscriptions) are growing faster than our paying customers?
use COUNTIFS() to count the number of customers with a Basic subscription for the Sales Month.