3_2: Market Analytics: Analysing and Predicting Aggregated Demand and Competition Flashcards
What is Customer Segmentation?
Segmentation: slicing a pie
The process of dividing a heterogeneous customer base into distinct groups based on shared characteristics, behaviors, or needs
–>Groups are homogeneous within and heterogeneous between each other
Name the 3 difficulties of customer segmentation.
Finding an actionable outcome: is the result meaningful for the business need?
Choice of method: there is no method that is a priori preferable to others
Iterative approach: takes much time and several rounds of data collection
RFM analysis is used to predict …?
Used to predict the response rate and profitability generated by marketing campaigns
–>tool to identify an organization's best customers
What is the meaning behind “RFM”?
Recency (R): number of time units that have passed since the last purchase
Frequency (F): average number of purchases per time unit
Monetary value (M): total dollar amount spent per time unit
What is the RFM method used for?
- it is used to identify the most engaged customers, based on the observation that recency, frequency, and monetary metrics are often correlated with the probability of response and lifetime value
- it is common to use the same discrete scoring for all three metrics –>three-dimensional cube
- targeting decisions are made by selecting a subset of segments from the RFM cube
4 Steps of the RFM Analysis
- Recency: sort the database by most recent transaction and score your customers
- Frequency: re-sort the database on frequency
- Monetary value: re-sort the database on sales dollar volume (monetary value)
- Selection: use the three score columns and each customer's total score
–>highest scores are the best customers
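The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transaction log, the analysis date, and the helper `rank_scores` are hypothetical, and customers are scored by simple rank (1 = worst, n = best) rather than by the quintile scores often used in practice.

```python
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("A", date(2024, 6, 1), 120.0),
    ("A", date(2024, 6, 20), 80.0),
    ("B", date(2024, 1, 15), 40.0),
    ("C", date(2024, 5, 5), 300.0),
    ("C", date(2024, 6, 25), 150.0),
    ("C", date(2024, 6, 28), 50.0),
]
today = date(2024, 7, 1)  # assumed analysis date


def rank_scores(values, higher_is_better=True):
    """Score customers 1..n by rank (n = best); ties are broken by customer id."""
    ordered = sorted(values, key=lambda c: (values[c], c),
                     reverse=not higher_is_better)
    return {c: i + 1 for i, c in enumerate(ordered)}


recency, frequency, monetary = {}, {}, {}
for cust, day, amount in transactions:
    days_ago = (today - day).days
    recency[cust] = min(recency.get(cust, days_ago), days_ago)  # days since last purchase
    frequency[cust] = frequency.get(cust, 0) + 1                # number of purchases
    monetary[cust] = monetary.get(cust, 0.0) + amount           # total amount spent

r = rank_scores(recency, higher_is_better=False)  # fewer days since purchase is better
f = rank_scores(frequency)
m = rank_scores(monetary)

# Selection step: add the three scores and sort, best customers first.
total = {c: r[c] + f[c] + m[c] for c in recency}
best_first = sorted(total, key=total.get, reverse=True)
```

Summing the three scores weights R, F, and M equally; selecting a sub-cube instead (for example, only customers with a top recency score) would be a different targeting rule.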
Advantages and Disadvantages of RFM Analysis?
Pros:
- Valuable for short term financial orientation
- requires no marketing strategy
- Fast, simple, and easy to use, explain and implement
Cons:
- Limited marketing usage as it is only about engagement
- does not measure the factors that impact customer behavior
What is Behavioral segmentation?
Goal: to understand customer behavior (marketing orientation)
- uses behavioral data instead of financial data
–>customers' core behavior doesn't change
What is Supervised learning (Classification)?
Group affiliation: is known
Goal: to predict outcome data from independent variables
From RFM Analysis to Behavioral Segmentation
(Picture not reproduced)
What is Unsupervised learning (clustering)?
Group affiliation: unknown
Goal: discover groupings from data structure
What are the two methods of distance-based clustering?
Hierarchical clustering
K-means-based clustering
–>minimize the distance between group members while maximizing the distance to members of other groups
What are the two methods of Model-based clustering?
General description of Model-based clustering?
Model-based clustering
Latent class analysis
–>Model the data so that the observed variance can be represented by a small number of groups with specific distribution characteristics
How does Hierarchical clustering work?
Observations are grouped according to their similarity (distance matrix); the clustering method used is the complete linkage method
What are the steps in Hierarchical Clustering?
(Distance-based clustering)
- Calculate the distance between the observations as a dissimilarity matrix (Euclidean distance)
- the model uses the complete linkage method, comparing the distances between all group members
- Output: dendrogram, which is interpreted by height and by where observations are joined
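A minimal Python sketch of these steps, assuming numeric observations and Euclidean distance. The function name `complete_linkage` and the toy points are illustrative only; the list of merges it returns is the information a dendrogram draws (which groups join, and at what height).

```python
import math


def complete_linkage(points):
    """Agglomerative clustering; returns merges as (members_a, members_b, height)."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per observation
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: the distance between two clusters is the
                # largest distance between any pair of their members.
                d = max(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 2.0)]
merges = complete_linkage(points)
for left, right, height in merges:
    print(left, "+", right, "joined at height", round(height, 3))
```

The sketch recomputes all pairwise distances at every step, which is fine for a handful of points but far slower than a library implementation on real data.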
Dendrogram: Hierarchical clustering
(distance-based clustering)
At the lowest level, observations are combined into small groups that are relatively similar.
–>These groups are successively combined with less similar groups
Height = dissimilarity
What is k-means-based clustering?
(also k-means clustering) = find groups based on the sum-of-squares deviation from the multivariate center of the assigned group
–>the number of centers (k) needs to be specified
What are the steps in (k-)means-based clustering?
- Choose the number of clusters and the maximum distance
–>requires numeric data
- Find an observation for cluster 1
- Take a second observation; if it is far enough from 1 –>cluster 2
- Take the next observation and compare it with 1 and 2 (if applicable, cluster 3)
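For comparison, here is a minimal Python sketch of the standard k-means iteration (Lloyd's algorithm), which alternates between assigning observations to the nearest center and moving each center to the mean of its members. Note that this is the textbook refinement loop, not the seeding procedure listed above; the function name `k_means`, the toy data, and the choice of starting centers are assumptions for illustration.

```python
import math


def k_means(points, centers, max_iter=100):
    """Lloyd's algorithm: alternate assignment and center update until stable."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each observation joins its nearest center.
        labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(col) / len(members)
                                         for col in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
# Assumed starting centers: one observation from each visible group.
labels, centers = k_means(points, centers=[points[0], points[3]])
```

Because the result depends on the starting centers, it is common to rerun the procedure with several random starts and several values of k.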
What do k-means cluster plots show?
What are the limitations?
Whether it is possible to differentiate groups based on key variables
Limitation:
K-means requires an arbitrary specification of the number of clusters (use different values for k)
–>difficult to determine whether one solution is better than another
What is the problem with K-means cluster plots?
Difficult to determine whether one solution is better than another
–>Repeat the analysis for several numbers of clusters to compare the results
How can the outcome of the k-means-based model be tested?
(Distance-based clustering)
- Check mean values by using aggregate()
- Plot the k-means clusters to check whether it is possible to differentiate groups based on key variables
- Alternatively, plot two continuous variables by segment
How can the solution of the model-based clustering be tested? (Mclust())
How to compare between models?
- Check mean values by using aggregate()
- Plot the model-based clusters
- Use other values for G, and compare the model outputs:
- Log-likelihood –>less negative
- BIC –>lowest value
–>also used for model comparison
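The comparison can be made concrete with a small Python sketch. It uses the common "lower is better" definition BIC = k ln(n) - 2 ln(L), where k is the number of estimated parameters and n the number of observations. The log-likelihoods and parameter counts below are made-up numbers for illustration. One caveat worth checking in practice: the mclust package reports the sign-flipped quantity (2 ln(L) - k ln(n)), for which larger values are better, so it matters which convention a given output uses.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """BIC in the 'lower is better' convention: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fits of the same 300 observations with different numbers of clusters G.
candidates = {
    "G=2": {"log_likelihood": -1250.0, "n_params": 11},
    "G=3": {"log_likelihood": -1190.0, "n_params": 17},
    "G=4": {"log_likelihood": -1185.0, "n_params": 23},
}
scores = {name: bic(fit["log_likelihood"], fit["n_params"], 300)
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)  # model with the lowest BIC
```

In this made-up example G=4 has the least negative log-likelihood, but the penalty for its extra parameters makes G=3 the model with the lowest BIC.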
How can the solution of the Latent class analysis be tested?
- Check mean values using aggregate()
- Plot the LCA clusters
- compare predicted class memberships
How can the solution of hierarchical clustering be tested?
(Distance-based clustering)
- Zoom in and focus on certain branches of the dendrogram
- Use the cophenetic correlation coefficient (CPCC) = measures the correlation between the original dissimilarities and the cophenetic distances
- CPCC close to 1 –>strong positive correlation
What is the output of model-based clustering?
Shows the optimal number of clusters (if G was not predetermined), together with:
- BIC
- Log-likelihood
Key facts about model-based clustering?
(mclust)
- observations come from groups with different statistical distributions –>the algorithm tries to find the best set of such underlying distributions
- it models clusters as being drawn from a mixture of normal distributions
- Can only be used with numerical data
What is the Latent Class Analysis (LCA)?
(Model-based clustering)
Differences are attributable to unobserved groups that one wishes to uncover (nclass is predetermined) –>Bayesian technique
–>Goal: estimate the probabilities of membership in each class and assign individuals to their most likely class
Steps in Latent-class analysis?
(Model-based clustering)
- Variable scores are caused by the hidden groups
- LCA posits a latent variable that maximizes the likelihood of observing the scores on the variables
- It creates a probability of each observation belonging to each segment
- Each observation is placed in the segment for which it has the highest probability
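The last two steps can be illustrated with a short Python sketch of the assignment rule. It assumes an already fitted model with yes/no items that are independent within each class (the usual local-independence assumption); the class sizes and item probabilities are hypothetical, and `posterior_class_probs` is an illustrative helper, not part of any LCA package.

```python
def posterior_class_probs(answers, class_sizes, item_probs):
    """Bayes' rule: P(class | answers) is proportional to P(class) * P(answers | class)."""
    joint = []
    for prior, probs in zip(class_sizes, item_probs):
        likelihood = 1.0
        for answer, p_yes in zip(answers, probs):
            # Items are treated as independent given the class.
            likelihood *= p_yes if answer else (1.0 - p_yes)
        joint.append(prior * likelihood)
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical fitted model: two classes, three yes/no items.
class_sizes = [0.6, 0.4]            # share of each latent class
item_probs = [[0.9, 0.8, 0.1],      # P(yes) per item in class 0
              [0.2, 0.3, 0.7]]      # P(yes) per item in class 1

probs = posterior_class_probs([True, True, False], class_sizes, item_probs)
assigned = max(range(len(probs)), key=probs.__getitem__)  # most likely class
```

The sketch covers only the assignment of one respondent; estimating the class sizes and item probabilities themselves is the fitting step and is not shown.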
Advantages of LCA?
- Possible for complex data
- Provides optimal number of clusters
- Provides an indicator for variables
- Segment probability score
What are diagnostics to test for statistical fit? (Latent class analysis - model-based clustering)
- BIC: Bayesian information criterion (lower values are better)
- Error rate (better if lower)
- Log-likelihood, which is negative (better if less negative)
Clustering process in R?
- Transform to the right format
- Compute distance matrix
- Apply clustering method
- Analyze groups
- Examine solution in the model and apply
How does Classification work?
(Supervised learning)
= use observations with known status to derive predictors which then can be applied to new observations
Classification process in R
- Collect data (group membership is known)
- Split data: training set with 50-80% of observations and test set with 20-50%
- Build prediction model: identify predictors from training data
- Assess performance by applying predictors to test dataset
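The splitting step can be sketched in Python as follows. The helper name `train_test_split`, the 65% training share, and the seed are assumptions chosen for illustration; any share in the 50-80% range mentioned above would work the same way.

```python
import random


def train_test_split(rows, train_share=0.65, seed=42):
    """Shuffle a copy of the rows and cut it into a training and a test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    cut = round(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(100))  # stand-in for 100 observations with known group membership
train, test = train_test_split(rows)
```

The model is then built on `train` only, and `test` is held back until the performance check.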
How can the performance of the Naive Bayes Classification be assessed?
- Consider the raw agreement rate
mean(raw$Segment == prediction) = 0.92 –>92% correct predictions
- Compare performance against random chance using ARI
- 1 = perfect agreement, 0 = random, negative values = worse than chance
- Assess performance for each class using a confusion matrix
- actual segment is on the left (rows)
- predicted segment is on top (columns)
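A minimal Python sketch of the three checks, using made-up actual and predicted labels. The function names are illustrative; the adjusted Rand index is computed from pair counts in the confusion matrix, and the sketch does not handle the degenerate case where every observation falls in a single group.

```python
from collections import Counter
from math import comb


def raw_agreement(actual, predicted):
    """Share of observations whose predicted segment equals the actual one."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def confusion_matrix(actual, predicted):
    """Counts keyed by (actual, predicted): actual = rows, predicted = columns."""
    return Counter(zip(actual, predicted))


def adjusted_rand_index(actual, predicted):
    """Agreement corrected for chance: 1 = perfect, about 0 = random."""
    pairs = confusion_matrix(actual, predicted)
    sum_cells = sum(comb(n, 2) for n in pairs.values())
    sum_rows = sum(comb(n, 2) for n in Counter(actual).values())
    sum_cols = sum(comb(n, 2) for n in Counter(predicted).values())
    expected = sum_rows * sum_cols / comb(len(actual), 2)
    max_index = (sum_rows + sum_cols) / 2
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical test-set labels.
actual = ["a", "a", "a", "b", "b", "b"]
predicted = ["a", "a", "b", "b", "b", "b"]

agreement = raw_agreement(actual, predicted)
matrix = confusion_matrix(actual, predicted)
ari = adjusted_rand_index(actual, predicted)
```

Here five of six predictions are correct, yet the chance-corrected index is far lower than the raw agreement rate, which is why both figures are worth reporting.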
How can the performance of the Random forest be assessed?
- Compare performance against random chance using ARI (adjustedRandIndex)
- 1 = perfect agreement, 0 = random, negative values = worse than chance
- Assess performance for each class using a confusion matrix on the test data
- actual segment is on the left (rows)
- predicted segment is on top (columns)
How can the outcome of the Naive Bayes Classification be tested?
- use predict() with the trained model to predict values for the test data
- Review the segment frequencies and compare them to the initial a priori frequencies based on the training data
How can the outcome of the Random Forest be tested?
- use predict() with the trained model to predict values for the test data
- plot the clusters based on test data
What does Naive Bayes Classification do? (Supervised learning)
= Training data is used to learn the probability of class membership as a function of each predictor variable considered independently –>using Bayes' rule
–>starts with the observed probabilities of variables conditional on segments found in the training data
–>only uses one model
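The idea can be sketched in Python for categorical predictors. This is a simplified illustration, not the implementation of any particular package: the training rows are made up, the function names are hypothetical, and add-one (Laplace) smoothing is an assumption added so that unseen values do not produce zero probabilities.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Learn class counts and per-variable value counts conditional on class."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, variable index) -> value counts
    levels = defaultdict(set)            # variable index -> values seen in training
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(label, j)][value] += 1
            levels[j].add(value)
    return class_counts, value_counts, levels


def predict_naive_bayes(model, row):
    """Return P(class | row) for every class via Bayes' rule."""
    class_counts, value_counts, levels = model
    n = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / n  # a priori class probability from the training data
        for j, value in enumerate(row):
            # Each predictor is considered independently; add-one smoothing (assumed).
            score *= (value_counts[(label, j)][value] + 1) / (count + len(levels[j]))
        scores[label] = score
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


# Hypothetical training data: (subscribed?, area) with known segment.
rows = [("yes", "urban"), ("yes", "urban"), ("no", "rural"),
        ("no", "urban"), ("yes", "rural"), ("no", "rural")]
labels = ["buyer", "buyer", "non-buyer", "non-buyer", "buyer", "non-buyer"]

model = train_naive_bayes(rows, labels)
probs = predict_naive_bayes(model, ("yes", "urban"))
predicted = max(probs, key=probs.get)
```

Numeric predictors would need a density estimate (for example a normal distribution per class) in place of the value counts.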
What is the Random forest classification (compared to Naive Bayes classification)? (Supervised learning)
Instead of using a single model, it builds an ensemble of models that jointly classify the data by fitting many classification trees (forest)
–>not providing class membership
What is Class imbalance, and how can it be resolved in randomForest models?
When using randomForest for prediction, the model might generate predictions with 90% being in one group –>imbalance
–>Resolving:
- look at the frequency table of the training data to see the group allocation –>pick the size of the smallest group
- Set sampsize = "value" in the randomForest model
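The idea behind drawing an equal number of observations per class can be sketched in Python. This only illustrates balanced sampling on made-up data; it draws one sample, whereas a random forest with a per-class sample size would draw a fresh sample for each tree. The helper `balanced_sample` is hypothetical.

```python
import random
from collections import Counter, defaultdict


def balanced_sample(rows, labels, seed=1):
    """Draw the same number of rows from every class (size of the smallest class)."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    size = min(len(members) for members in by_class.values())  # smallest group
    rng = random.Random(seed)
    sample = []
    for label, members in by_class.items():
        sample += [(row, label) for row in rng.sample(members, size)]
    return sample


# Hypothetical training data with a 90/10 imbalance.
labels = ["big"] * 90 + ["small"] * 10
rows = list(range(100))

sample = balanced_sample(rows, labels)
counts = Counter(label for _, label in sample)
```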
What is meant by importance analysis in the Random Forest model?
the model uses many predictor variables, thus it is useful to know the importance of different classification variables
–>randomForest( importance = TRUE)
Advantages of the Random forest model?
- many classification trees (forest) instead of one model
- More accurate because more models are applied
- Useful to estimate the importance of predictor variables
Classification trees are used for?
Used to predict a categorical (and usually binary) dependent variable
What are the steps of classification trees?
- Identify the most effective predictor variable for predicting the binary dependent variable
- Start with the root node with all combinations, then use the independent variables to split the root node so that class separation improves the most
Classification trees:
What is a Pure decision node?
Pure decision node: all data points associated with that node have same value of dependent variable
Classification trees:
What is Impurity?
Impurity: assesses the degree of heterogeneity within a subset of data in the classification tree.
Impurity of a split: the weighted average of the impurities of the nodes involved in that split
Classification trees:
What is entropy?
–>Entropy is a measure of impurity or disorder
- Entropy is calculated using conditional probability
- Between 0 and 1 for a binary dependent variable (using log base 2)
–>Lower entropy –>decreasing impurity!
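A minimal Python sketch of entropy as an impurity measure, together with the weighted impurity of a split. It uses base-2 Shannon entropy; the function names and the toy labels are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of the class labels in one node."""
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())


def split_impurity(children):
    """Weighted average of the child-node entropies, weighted by node size."""
    n = sum(len(child) for child in children)
    return sum(len(child) / n * entropy(child) for child in children)


parent = ["yes"] * 5 + ["no"] * 5   # maximally mixed binary node
left = ["yes"] * 4 + ["no"]         # mostly "yes" after the split
right = ["yes"] + ["no"] * 4        # mostly "no" after the split

before = entropy(parent)
after = split_impurity([left, right])
gain = before - after               # improvement in class separation
```

A pure node has entropy 0 and a 50/50 binary node has entropy 1, so a split that lowers the weighted entropy has improved class separation.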
What is Collaborative filtering?
Filter and predict choices based on other people's behavior
–>Typically using rating matrices
Idea: people with similar preferences tend to like similar items
What are the two methods of collaborative filtering?
- Neighborhood-based method: predict unknown ratings by using the nearest neighbor approach
- Model-based method: use more complex, predictive models
Collaborative filtering: baseline estimates:
A rating matrix exhibits strong user and item biases
- one can account for these systematic user and item effects:
The model captures only the average user and item effects, but it can help to absorb the biases and isolate the signal that represents user-item interactions
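A common way to write such a baseline is: estimate = global mean + user effect + item effect. The Python sketch below assumes this additive form and estimates the effects with simple averages (user effects first, then item effects on what remains); the ratings and the function name `fit_baseline` are made up for illustration, and other estimation schemes exist.

```python
def fit_baseline(ratings):
    """Estimate the global mean, user effects, and item effects by simple averaging."""
    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    mu = mean(r for _, _, r in ratings)  # global average rating
    # User effect: how far a user's ratings sit above or below the global mean.
    user_bias = {u: mean(r - mu for uu, _, r in ratings if uu == u)
                 for u in {u for u, _, _ in ratings}}
    # Item effect: what remains on average after removing mean and user effect.
    item_bias = {i: mean(r - mu - user_bias[u] for u, ii, r in ratings if ii == i)
                 for i in {i for _, i, _ in ratings}}
    return mu, user_bias, item_bias


def baseline(model, user, item):
    """Baseline estimate; unknown users or items contribute no effect."""
    mu, user_bias, item_bias = model
    return mu + user_bias.get(user, 0.0) + item_bias.get(item, 0.0)


# Hypothetical ratings: (user, item, rating on a 1-5 scale).
ratings = [
    ("u1", "i1", 5), ("u1", "i2", 4),
    ("u2", "i1", 3), ("u2", "i2", 2), ("u2", "i3", 1),
    ("u3", "i3", 4),
]
model = fit_baseline(ratings)
estimate = baseline(model, "u3", "i1")  # u3 has not rated i1
```

The difference between an observed rating and its baseline estimate is the part left for a neighborhood or model-based method to explain.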
Advantages of Collaborative filtering?
Advantages:
- Make recommendations without any additional information about catalog items
- helps to produce non-trivial recommendations
- rating captures human tastes and judgments
What are the disadvantages of Collaborative filtering?
Disadvantages:
- difficult to build reliable prediction models with trustworthy ratings
- collaborative filtering is biased towards popular items and standard choices
- cold-start problem: does not work for new users
- Product standardization is difficult
- Arbitrary assumptions
Advantages of User-based filtering?
- easy to program
- more attractive when users are personally familiar
Advantage of Item-based filtering?
- preferred when customers are not familiar
- item-based matrix of correlation is more stable over time
- matrix needs to be updated less often