Marketing Analytics Flashcards
Why segment the market?
Offerings can be more valuable to the consumer; better consumer experience; more cost-effective
What is mass marketing?
Offer one thing to the entire market
What is segment-based marketing?
Offer a different offering to each segment - ex. Group A vs. Group B
What is one-to-one marketing?
Offer a different offering to each customer - ex. A -> Customer 1, B -> Customer 2, etc.
What are the 4 bases to segment consumer markets?
- Demographic: age, gender, income, ethnicity, etc.
- Geographical: world region, country, city vs. rural, etc.
- Behavioral: usage, loyalty, etc.
- Psychographic: lifestyles, beliefs, attitudes, interests, personality, values, etc.
What is the OCEAN model for segmenting markets?
O = openness - do they enjoy new experiences?
C = conscientiousness - do they prefer plans and order?
E = extraversion - do they like spending time with others?
A = agreeableness - do they put other people’s needs before theirs?
N = neuroticism - do they tend to worry a lot?
How would you calculate the distance between respondents? (3 steps)
- Decide on the clustering variables & importance of each of the characteristics
- Select a measure of (dis)similarity, such as the Euclidean distance
- Select a clustering method, such as hierarchical clustering
What is the Euclidean Distance formula?
Euclidean distance (De) between points (x1, y1) and (x2, y2) = √[(x2 - x1)^2 + (y2 - y1)^2]
You need a data matrix to be able to compute the distance matrix (the distance matrix holds the values from the Euclidean distance formula for every pair of respondents). This would be used if you wanted to calculate the distance between consumer perspectives (for example, importance of innovation vs. constant communication).
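A minimal sketch of going from a data matrix to a distance matrix. The respondent names and ratings are made-up illustration values, and the two columns are assumed to be measured on the same scale.

```python
import math

# Hypothetical data matrix: one row per respondent, one value per
# clustering variable (importance of innovation, importance of
# constant communication).
ratings = {
    "Respondent 1": (1.0, 2.0),
    "Respondent 2": (4.0, 6.0),
    "Respondent 3": (1.0, 6.0),
}

def euclidean_distance(a, b):
    """Square root of the summed squared differences across all variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(data):
    """Pairwise Euclidean distances between all respondents."""
    names = list(data)
    return {(p, q): euclidean_distance(data[p], data[q])
            for p in names for q in names}

distances = distance_matrix(ratings)
print(distances[("Respondent 1", "Respondent 2")])  # 5.0
```

Respondents with a small distance between them are candidates for the same segment.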
What are the 3 key criteria for actionable segmentation?
- Distinctive: customers within a segment are similar but differ from customers in other segments
- Substantial: sufficiently large to create value
- Accessible: ability to reach customers within segments
What does the intercept in a linear regression stand for?
It is the value of the dependent variable when all independent variables are 0. Ex. it’s the value of sales when radio and tv advertising are both 0.
What does the p-value stand for?
It is a measure of statistical significance: the probability of seeing a pattern at least as strong as the one observed if there were in fact no relationship. By convention, if the p-value is below 0.05, the pattern or relation we find is considered statistically significant.
What does the adjusted r-squared show?
It shows the share of variation in the dependent variable that is explained by the independent variables, adjusted for the number of predictors. Ex. how much of the variation in sales is explained by radio and tv advertising.
This is a measure of the goodness of fit of the model, so the higher it is, the better the model explains the response to your marketing mix.
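As a sketch, adjusted r-squared can be computed from the ordinary r-squared with the standard correction for the number of predictors. The numbers in the example (r-squared of 0.80, 23 observations, 2 predictors) are hypothetical.

```python
def r_squared(actual, predicted):
    """Share of variation in the actual values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_residual / ss_total

def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalise r-squared for each additional predictor in the model."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Hypothetical model: sales explained by radio and tv advertising.
print(round(adjusted_r_squared(0.80, n_obs=23, n_predictors=2), 2))  # 0.78
```

Adding a predictor that explains nothing leaves r-squared roughly unchanged but lowers the adjusted value.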
What is an elasticity?
It’s the % change in response variable for a 1% change in the predictor variable
Ex. the % change in sales for a 1% change in advertising spending.
How do you use the ratio of elasticities method? (3 steps)
- Sum elasticities
- Compute ratio of elasticities
- Multiply ratio of elasticities by total budget to spend & make recommendation
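The three steps above can be sketched as follows. The channel names, elasticities and budget are hypothetical values chosen for illustration.

```python
def allocate_budget(elasticities, total_budget):
    """Split a budget across channels in proportion to their elasticities."""
    total_elasticity = sum(elasticities.values())           # step 1: sum
    ratios = {channel: e / total_elasticity                  # step 2: ratio
              for channel, e in elasticities.items()}
    return {channel: ratio * total_budget                    # step 3: allocate
            for channel, ratio in ratios.items()}

# Hypothetical elasticities for two channels and a budget of 1000.
allocation = allocate_budget({"tv": 0.3, "radio": 0.1}, total_budget=1000)
print({channel: round(amount) for channel, amount in allocation.items()})
# {'tv': 750, 'radio': 250}
```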
How do you compute elasticities from a linear regression model?
Advertising elasticity = advertising estimate (from linear regression) * (baseline advertising / baseline sales)
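A small worked example of this formula. The coefficient and baseline values are hypothetical, and using sample means as the baselines is an assumption (a common choice, not something stated above).

```python
def elasticity_from_linear(coefficient, baseline_x, baseline_y):
    """Convert a linear regression coefficient into an elasticity."""
    return coefficient * baseline_x / baseline_y

# Hypothetical: each extra unit of advertising adds 2 units of sales;
# baseline advertising is 50 and baseline sales are 400.
advertising_elasticity = elasticity_from_linear(2.0, baseline_x=50, baseline_y=400)
print(advertising_elasticity)  # 0.25
```

Read as: a 1% increase in advertising spending goes with a 0.25% increase in sales at the baseline.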
What is underfitting?
low R-squared and low predictive accuracy
What is overfitting?
high R-squared and low predictive accuracy
If a model has a high R-squared and a low predictive accuracy, the model may be tailored to the specific data but fails to generalise outside the sample
What happens when a firm always spends the same amount on facebook campaigns and instagram campaigns in the same weeks?
Both variables will be perfectly correlated (ie. a correlation of 1), and the model will not be able to separate the impact of one from the other
Even when predictors are highly but not perfectly correlated (ie. higher than .80), the model can suffer from ‘multicollinearity’, which reduces the accuracy of the estimates because the observed effect of one variable might be overstated
How do you deal with multicollinearity?
- Calculate correlations between predictor variables
- If the correlations are high, you can reformulate the variables where possible
Ex. you can sum advertising spending across channels (ie. online vs. offline)
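A rough sketch of this check: compute the correlation between two predictors and, if it is above the .80 threshold, combine them into one variable. The weekly spend figures are made up, and `combine_if_collinear` is a hypothetical helper name.

```python
import math

def pearson_correlation(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

def combine_if_collinear(x, y, threshold=0.80):
    """Return the summed variable when |correlation| exceeds the threshold."""
    if abs(pearson_correlation(x, y)) > threshold:
        return [a + b for a, b in zip(x, y)]
    return None  # keep the variables separate

# Hypothetical weekly spend: identical amounts on both platforms.
facebook = [10.0, 20.0, 30.0, 40.0]
instagram = [10.0, 20.0, 30.0, 40.0]
online = combine_if_collinear(facebook, instagram)
print(online)  # [20.0, 40.0, 60.0, 80.0]
```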
Omitted variable bias exists if two conditions are met:
- an omitted variable affects your response variable
- AND the omitted variable correlates with one or more predictor variables
What do we know about carryover effects? When can they happen?
In reality, consumer response to advertising can be delayed. Not accounting for these carryover effects can therefore cause advertising elasticities to be underestimated.
How do you measure the effect of x (ex. advertising) beyond the current time period?
Adstock equation
Ex. Adstock(t) = Advertising(t) + lambda * Adstock(t-1)
For example, if lambda = 0.3, then 30% of the adstock from one time period ago still carries over into the current time period
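A minimal sketch of the adstock recursion, assuming the first term is the advertising spend in period t and that adstock starts at 0 before the first period. The spend series is hypothetical.

```python
def adstock(advertising, lam):
    """Adstock(t) = Advertising(t) + lam * Adstock(t-1), starting from 0."""
    result = []
    previous = 0.0
    for spend in advertising:
        current = spend + lam * previous
        result.append(current)
        previous = current
    return result

# Hypothetical: one burst of 100 in period 1, nothing afterwards.
series = adstock([100.0, 0.0, 0.0], lam=0.3)
print([round(value, 2) for value in series])  # [100.0, 30.0, 9.0]
```

The adstock series, rather than raw spend, would then be used as the predictor in the regression.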
What are synergy effects?
The combined use of marketing mix instruments has a greater effect than the sum of their separate effects
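One common way to capture synergy in a regression is an interaction term between two instruments. The sketch below uses made-up coefficients to show that the joint lift exceeds the sum of the individual lifts; the function and parameter names are hypothetical.

```python
def predicted_sales(tv, radio, b0=100, b_tv=2, b_radio=3, b_synergy=1):
    """Linear model with an interaction (synergy) term tv * radio."""
    return b0 + b_tv * tv + b_radio * radio + b_synergy * tv * radio

baseline = predicted_sales(0, 0)
lift_tv = predicted_sales(10, 0) - baseline      # tv alone
lift_radio = predicted_sales(0, 10) - baseline   # radio alone
lift_both = predicted_sales(10, 10) - baseline   # both together

print(lift_tv, lift_radio, lift_both)  # 20 30 150
```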
Why use predictive analytics in marketing?
To leverage historical data to identify the likelihood of future outcomes.
It also helps us predict how consumers will behave and then we can predict what/how to change the marketing strategy for them.
What is the difference between Statistical Modeling vs. Machine Learning?
Statistical Modeling makes statistical inference about relationships between variables using historical data
Ex. Linear regression model to understand return on marketing investment
Machine Learning models often outperform statistical models in terms of predictive accuracy. They vary in interpretability
What are the two types of Machine Learning algorithms?
Supervised vs. Unsupervised
Unsupervised learning: given a set of inputs (no outputs), the algorithm looks for patterns and structures in the data
Ex. Hierarchical clustering
Supervised learning: given a set of inputs and outputs, the algorithm learns to generate outputs from inputs
Ex. Classification And Regression Tree (CART)
How can overfitting occur in a model?
If a model has a high r-squared but a low predictive accuracy, the model may be tailored to the specific data, but fails to generalize outside the sample.
Including too many predictors when fitting the training sample can result in poor predictive performance.
What happens when we don’t include a predictor variable that drives response?
- Lower r-squared
- Higher MAPE
- Biased estimates
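MAPE is the mean absolute percentage error, a measure of predictive accuracy where lower is better. A minimal sketch, with made-up sales figures and assuming no actual value is 0:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)

# Hypothetical actual vs. predicted sales for two weeks.
print(round(mape([100.0, 200.0], [110.0, 180.0]), 1))  # 10.0
```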
What does CART stand for and what is it used for?
Classification And Regression Tree (CART)
- Classification Tree: suitable for discrete outcomes. Ex: Predicting customer churn (0/1)
- Regression Tree: suitable for continuous outcomes. Ex: Predicting brand sales
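To give a feel for how a classification tree picks a split, here is a sketch of a single split on one predictor, chosen by lowest weighted Gini impurity (a common splitting criterion for classification trees). The variable names and data are hypothetical, and a full tree would repeat this over all predictors and recursively within each branch.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    share_ones = sum(labels) / len(labels)
    return 1 - share_ones ** 2 - (1 - share_ones) ** 2

def best_split(values, labels):
    """Threshold on one predictor with the lowest weighted Gini impurity."""
    best_threshold, best_impurity = None, float("inf")
    for threshold in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        if not left or not right:
            continue  # a split must put customers on both sides
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity

# Hypothetical data: months since last purchase and churn (0/1).
months_since_purchase = [1, 2, 3, 8, 9, 10]
churned = [0, 0, 0, 1, 1, 1]
print(best_split(months_since_purchase, churned))  # (3, 0.0)
```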
Can a classification tree model account for:
- Non-linear effects?
- Synergy effects?
- Carryover effects?
- Multicollinearity?
- Predictive accuracy?
- Non-linear effects: Yes
- Synergy effects: Yes
- Carryover effects: Yes
- Multicollinearity: No, bc mutually exclusive
- Predictive accuracy: Yes, more accurate than the linear regression model bc it takes into account non-linear effects
What are the advantages of using a classification tree? (5)
- Easy to implement
- No need to specify relation in advance
- Implicitly performs variable selection
- Can automatically account for nonlinear relationships and interaction effects
- Predictive accuracy
What is the major disadvantage of using a classification tree and what is the solution?
Disadvantage = Overfitting, which makes it harder to generalize to new data
Solution = Random forest
What is a random forest?
- Random forests combine results from multiple trees
- They select a random set of predictors to build multiple trees, and then average the results
- This leads to more generalizable results and higher predictive accuracy
What are the benefits of A/B testing in marketing? (4)
- Increase website traffic
- Reduce bounce rates
- Increase conversion rates
- Reduce cart abandonment
What are some strengths and limitations of A/B testing? (3)
- A/B testing can help make data-driven decisions on the marketing mix (ie. discount) as well as on seemingly subjective questions of design, color, layout of marketing content
- Its power lies in the ability to easily and continuously test what is working for your brand or company in the market
- However, A/B testing is not free of costs (ie. set up, opportunity costs, analysis)
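As an illustration of the analysis step, one standard way to compare the conversion rates of versions A and B is a two-proportion z-test. This is a sketch under the assumption that visitors were randomly assigned; the visitor and conversion counts are made up.

```python
import math

def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value

# Hypothetical test: A converts 100 of 1000 visitors, B converts 150 of 1000.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(p_value < 0.05)  # True
```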
What is multivariate testing?
Where A/B testing tests different versions of a web page, multivariate testing tests multiple elements on a web page
What is algorithmic bias in marketing?
Algorithmic bias in marketing happens when the algorithms used to make marketing decisions favor one group of people over another, often unfairly.
Imagine you’re at a big party, and the bouncer only lets in people wearing blue shirts. That’s a form of bias, right?
- Can occur by incorporating potentially discriminating characteristics (ie. gender)
- Can also occur when excluding discriminating characteristics (ie. if gender correlates with behaviour and you include behaviour but not gender in your model, then an omitted variable bias can occur)
How can you mitigate algorithmic bias?
- Bias detection: does the algo produce the same outcome for identical customers differing on characteristics like gender or race?
- Diversity: does the programming team have different backgrounds, experiences and perspectives?
- Incorporate fairness into algos: can we optimize our desired outcome and fairness?
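The bias detection question above can be sketched as a simple check: score the same customer several times, changing only the sensitive characteristic, and see whether the outcome changes. The model, field names and discount values are hypothetical, and the toy model is biased on purpose.

```python
def outcomes_by_group(model, customer, attribute, values):
    """Score copies of one customer that differ only in the given attribute."""
    return {value: model({**customer, attribute: value}) for value in values}

def is_consistent(outcomes):
    """True when every variant of the customer received the same outcome."""
    return len(set(outcomes.values())) == 1

def toy_discount_model(customer):
    """Hypothetical, deliberately biased discount rule (percent off)."""
    discount = 10 if customer["past_purchases"] >= 5 else 5
    if customer["gender"] == "female":
        discount -= 2  # the unfair part this check should reveal
    return discount

customer = {"past_purchases": 6, "gender": "female"}
outcomes = outcomes_by_group(toy_discount_model, customer, "gender", ["female", "male"])
print(outcomes)                 # {'female': 8, 'male': 10}
print(is_consistent(outcomes))  # False
```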