Clustering Flashcards
What is cluster analysis?
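A statistical technique that groups observations so that those in the same group are more similar to one another than to those in other groups. It is suited to data that exhibit natural groupings.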
What are some examples of measures of similarity?
Correlation coefficients, distance measures, association coefficients
What is one way of measuring distance?
Euclidean distance: the square root of the sum of squared differences between two observations across all variables.
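Euclidean distance between two observations is the square root of the sum of the squared differences on each variable. A minimal Python sketch follows; the function name and the two customers are hypothetical, chosen only for illustration.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two observations of equal length."""
    if len(a) != len(b):
        raise ValueError("observations must have the same number of variables")
    # Sum the squared differences over every variable, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two hypothetical customers described by (age, purchases per year)
customer_a = (30, 4)
customer_b = (33, 8)
distance_ab = euclidean_distance(customer_a, customer_b)  # sqrt(3**2 + 4**2) = 5.0
```

Variables measured on very different scales are often standardized first, since otherwise the variable with the largest range dominates the distance.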
What are the basic steps involved in cluster analysis?
- Formulate the problem and select the variables to use as the basis for clustering
- Compute the distances between customers
- Apply the clustering procedure to the distance measures
- Decide on the number of clusters
- Map and interpret the clusters, then draw conclusions
What is K-means clustering?
It is a popular clustering algorithm because it is simple and fast. The user must specify the number of clusters before running the algorithm.
What are the steps of k-means clustering?
1. Choose the number of clusters k
2. Generate k random points as cluster centroids
3. Assign each point to the nearest cluster centroid
4. Recompute each cluster centroid (the average of all the points in the cluster)
5. Repeat steps 3 and 4 until some convergence criterion is met. Usually the criterion is that the assignment of customers to clusters has not changed over multiple iterations.
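The steps above can be sketched in plain Python. This is an illustrative sketch, not a reference implementation. It assumes the initial centroids are drawn from the data points themselves, which is one common way to generate "random points". It also assumes the loop stops as soon as the assignments are unchanged from one iteration to the next. The function name, parameters, and sample data are hypothetical.

```python
import math
import random

def k_means(points, k, max_iterations=100, seed=0):
    """Return (centroids, labels) for a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Step 2: pick k distinct data points as the initial centroids (assumption).
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iterations):
        # Step 3: assign each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 5: stop once no assignment has changed.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

# Hypothetical data: two visibly separate groups of customers
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
```

Because the starting centroids are random, different seeds can produce different cluster numbering and, on less clearly separated data, different final clusters.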
What is the main issue with K-means clustering?
It does not provide an estimate of the number of clusters to use
What is the elbow criterion?
It is a way of determining the number of clusters to use. It states that you should choose the number of clusters at which adding another cluster no longer adds much information.
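One way to make the elbow criterion concrete is to track the within-cluster sum of squares (WCSS) for each candidate number of clusters and stop where the improvement becomes small. The sketch below is only one possible formalization. The 10% threshold, the function name, and the WCSS values are assumptions made up for illustration; in practice the elbow is often judged by eye from a plot.

```python
def pick_elbow(wcss_by_k, min_relative_gain=0.10):
    """Return the smallest k after which one more cluster gains little.

    wcss_by_k[i] is the within-cluster sum of squares for k = i + 1 clusters.
    The gain from k to k + 1 is measured relative to the WCSS at k = 1.
    """
    total = wcss_by_k[0]
    if total == 0:
        return 1  # all points identical: one cluster already fits perfectly
    for i in range(len(wcss_by_k) - 1):
        gain = (wcss_by_k[i] - wcss_by_k[i + 1]) / total
        if gain < min_relative_gain:
            return i + 1  # going beyond k = i + 1 adds too little
    return len(wcss_by_k)  # no elbow found within the range tried

# Hypothetical WCSS values for k = 1..6, made up purely for illustration
wcss_by_k = [1000.0, 520.0, 210.0, 180.0, 165.0, 158.0]
chosen_k = pick_elbow(wcss_by_k)  # gains: 0.48, 0.31, 0.03 -> elbow at k = 3
```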
What are the criteria frequently used to evaluate the effectiveness of the segmentation scheme?
Identifiability: the extent to which managers can recognize segments in the marketplace
Substantiality: satisfied if segments represent a large enough portion of the market to ensure profitable customization of the marketing program
Accessibility: the extent to which managers can reach the identified segments through marketing campaigns
Actionability: the needs of the target segment are consistent with the goals and core competencies of the firm
What is it called when you describe the clusters?
Profiling
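A common way to profile clusters is to compare the average of each variable across clusters. The sketch below assumes each customer is a dictionary of numeric variables and that cluster labels are already available. The function name and the data are hypothetical.

```python
from collections import defaultdict

def profile_clusters(rows, labels):
    """Return {cluster_label: {variable: mean}} for rows given as dicts."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: {var: sum(r[var] for r in members) / len(members) for var in members[0]}
        for label, members in grouped.items()
    }

# Hypothetical customers and cluster labels
rows = [
    {"age": 25, "spend": 100.0},
    {"age": 35, "spend": 300.0},
    {"age": 60, "spend": 50.0},
]
labels = [0, 0, 1]
profiles = profile_clusters(rows, labels)
# profiles == {0: {"age": 30.0, "spend": 200.0}, 1: {"age": 60.0, "spend": 50.0}}
```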
How would you categorize cluster analysis under source, methodology and objective?
Source: primary
Methodology: quantitative
Objective: descriptive