Data Analysis IV: Cluster Analysis (Week 11) Flashcards

1
Q

What is the aim of cluster analysis?

A

To form clusters/segments: groups of observations that are similar within a group and different between groups

2
Q

What is market segmentation?

A

Involves viewing a heterogeneous market as a number of smaller homogeneous markets, in response to differences between customers, and acting upon these differences between subgroups

3
Q

What are the steps involved for segmentation?

A
  • Determine segmentation BASIS
  • Determine segmentation METHOD
  • CREATE segments
  • DESCRIBE segments
4
Q

What are the steps involved for targeting?

A
  • SELECT one or more segments
5
Q

What are the steps involved for positioning?

A
  • Develop STRATEGY & TACTICS for selected segments
6
Q

What are the types of segmentation basis?

A

General (consumer-based) vs. product specific

Observable (objective) vs. unobservable (subjective)

7
Q

What are general & observable variables to form a segmentation basis?

A

Cultural, geographic, demographic & socio-economic variables

8
Q

What are product-specific & observable variables to form a segmentation basis?

A

User status, user frequency, store loyalty & patronage, situations

9
Q

What are general & unobservable variables to form a segmentation basis?

A

Psychographics, values, personality & lifestyle

10
Q

What are product-specific & unobservable variables to form a segmentation basis?

A

Psychographics, benefits, perceptions, elasticities, attributes, preferences, intentions

11
Q

What are the criteria for effective segmentation?

A

Segments should be:

  • Identifiable
  • Substantial
  • Accessible
  • Stable
  • Responsive
  • Actionable
12
Q

Why is a substantial segment necessary for effective segmentation?

A

A segment must be sizable to maintain profitability

It is very costly to create multiple marketing messages for many different small segments

13
Q

Why is an accessible segment necessary for effective segmentation?

A

Able to reach individual segments separately (and target specific groups)

14
Q

Why is a stable segment necessary for effective segmentation?

A

Segment membership is stable over time; customers do not keep switching from one segment to another

15
Q

Why is a responsive segment necessary for effective segmentation?

A

Different segments give different responses

16
Q

Why is an actionable segment necessary for effective segmentation?

A

Able to distinguish between different segments, different platforms

17
Q

How do we form clusters? (i.e. What are the segmentation methods?)

A
  1. A-priori
  2. Post-hoc

18
Q

What is the a-priori segmentation method?

A

Segments are determined in advance by the researchers

19
Q

What is the post-hoc segmentation method?

A

Segments are based on analyses of the data.
Hard clustering - e.g. cluster analysis
Soft clustering - e.g. latent class analysis

20
Q

What is hard clustering?

A

Person CANNOT belong to >1 segment, can only belong to 1 cluster

21
Q

What is soft clustering?

A

People have a certain PROBABILITY of belonging to each segment

22
Q

How do we want observations to be clustered?

A

We want observations WITHIN a cluster to be CLOSE TOGETHER

And CLUSTERS to be FAR from each other
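
The two goals above can be put into numbers. This is a minimal sketch, assuming 2-D data and using the within-cluster sum of squares and the distance between cluster means as the two measures; the data points and function names are hypothetical, and other measures are possible.

```python
import math

def centroid(points):
    """Mean point of a cluster."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def within_ss(points):
    """Sum of squared distances from each point to its cluster mean."""
    c = centroid(points)
    return sum(math.dist(p, c) ** 2 for p in points)

# Hypothetical data: two tight groups that lie far apart.
cluster_a = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
cluster_b = [(8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]

within = within_ss(cluster_a) + within_ss(cluster_b)            # want this SMALL
between = math.dist(centroid(cluster_a), centroid(cluster_b))   # want this LARGE
print(within, between)
```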

23
Q

Why is it impossible to try all options for clustering?

A
  • In real data, observations are closer together (clusters are less clearly separated)
  • Have more than 2 variables
  • Have many observations
  • How many clusters?
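
To get a feel for the "many observations" point, the number of ways to split n observations into k non-empty clusters can be counted with the standard recurrence for Stirling numbers of the second kind. This is only an illustrative sketch; the function name is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_partitions(n, k):
    """Ways to split n observations into exactly k non-empty clusters."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # The n-th observation either joins one of the k existing clusters
    # or starts a new cluster of its own.
    return k * count_partitions(n - 1, k) + count_partitions(n - 1, k - 1)

print(count_partitions(15, 3))  # 2375101 possible 3-cluster solutions for 15 observations
```

Even with only 15 observations and a fixed number of clusters, there are far too many solutions to inspect by hand.
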
24
Q

What are the types of algorithms for cluster analysis?

A
  1. Hierarchical
  2. Iterative (e.g. k-means)

25
Q

What is the hierarchical algorithm for cluster analysis?

A

Start with: all subjects separately (agglomerating) OR all in one cluster (divisive)
Combine/separate until reaching the end

26
Q

What is the iterative algorithm for cluster analysis?

A

Start with a solution

Move subjects between clusters until convergence (no subject changes cluster any more)
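
A minimal k-means style sketch of the iterative idea, assuming Euclidean distance and a starting solution given as initial cluster centres. The function names, data and starting centres are hypothetical; statistical packages differ in the details.

```python
import math

def mean_point(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def k_means(points, start_centres, max_iter=100):
    centres = list(start_centres)   # copy so the caller's list is not changed
    assignment = None
    for _ in range(max_iter):
        # Move each subject to the cluster with the nearest centre.
        new_assignment = [
            min(range(len(centres)), key=lambda c: math.dist(p, centres[c]))
            for p in points
        ]
        if new_assignment == assignment:   # convergence: nobody moved
            break
        assignment = new_assignment
        # Recompute each centre as the mean of its current members.
        for c in range(len(centres)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centres[c] = mean_point(members)
    return assignment, centres

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
assignment, centres = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
print(assignment)
```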

27
Q

What are the types of hierarchical algorithms?

A
  1. Agglomerating (e.g. Nearest Neighbour, Ward’s)
  2. Divisive

28
Q

What is the agglomerating hierarchical algorithm?

A

Start: Each subject separately
Process: Join subjects/clusters together
End: All subjects in 1 cluster

29
Q

What is the divisive hierarchical algorithm?

A

Start: All subjects in 1 cluster
Process: Split clusters
End: Each subject separately

30
Q

What are the steps involved in agglomerating hierarchical algorithm?

A

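E.g. 15 observations
Start: 15 clusters (each observation is its own cluster)
- Calculate distances between all points
- Combine the two points with the smallest distance (homogeneous within clusters, heterogeneous between clusters)
- Recalculate distances among the resulting 14 clusters (13 single points and 1 merged cluster)
- Form the next cluster from the pair with the smallest distance, and repeat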
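
The steps above can be sketched as a short loop. This is a simplified illustration, assuming Euclidean distance and nearest-neighbour (single linkage) distances between clusters; the function names and the five data points are hypothetical. It records an agglomeration schedule of (clusters remaining, merge distance).

```python
import math
from itertools import combinations

def single_link(a, b):
    """Distance between two clusters = distance between their closest members."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerate(points, linkage=single_link):
    clusters = [[p] for p in points]   # start: every subject is its own cluster
    schedule = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        d = linkage(clusters[i], clusters[j])
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
        schedule.append((len(clusters), d))
    return schedule

schedule = agglomerate([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0), (20.0, 0.0)])
print(schedule)
```

In this toy data the point at 20 is joined last and at a much larger distance than the other merges.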

31
Q

What are the methods to calculate the distance b/w a subject and a cluster, or between 2 clusters?

A
  1. Nearest Neighbour (Single Linkage)
  2. Centroid Method
  3. Furthest Neighbour (Complete Linkage)
  4. Ward’s Method
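
A small sketch comparing the four measures on the same pair of clusters, assuming Euclidean distance. For Ward's method the quantity shown is the increase in the within-cluster sum of squares that the merge would cause; the function names and data are hypothetical.

```python
import math

def centroid(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def nearest_neighbour(a, b):      # single linkage: closest pair of members
    return min(math.dist(p, q) for p in a for q in b)

def furthest_neighbour(a, b):     # complete linkage: most distant pair of members
    return max(math.dist(p, q) for p in a for q in b)

def centroid_distance(a, b):      # distance between the two cluster means
    return math.dist(centroid(a), centroid(b))

def ward_increase(a, b):          # growth in within-cluster sum of squares if merged
    na, nb = len(a), len(b)
    return na * nb / (na + nb) * math.dist(centroid(a), centroid(b)) ** 2

a = [(0.0, 0.0), (2.0, 0.0)]
b = [(5.0, 0.0), (9.0, 0.0)]
for measure in (nearest_neighbour, furthest_neighbour, centroid_distance, ward_increase):
    print(measure.__name__, measure(a, b))
```
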
32
Q

What are the steps involved for Nearest Neighbour (Single Linkage)?

A

Choose subject for which the distance to NEAREST subject is SHORTEST

33
Q

What are the properties of the Nearest Neighbour (Single Linkage) method?

A

Tendency to create chain-like clusters (a drawback)

Suitable for detecting outliers (the main advantage)

34
Q

What are the steps involved for the Centroid Method?

A
  • Choose subject for which the distance to the MEAN of the cluster is SHORTEST
35
Q

What is the advantage of using the Centroid Method?

A

Little influence of outliers

36
Q

What are the steps involved for Furthest Neighbour (Complete Linkage) method?

A

Choose subject for which the distance to the FURTHEST subject in the cluster is SHORTEST

37
Q

What is the disadvantage of using the Furthest Neighbour (Complete Linkage) method?

A

Very sensitive to outliers

38
Q

What are the steps involved for Ward’s Method?

A

Choose the subject/cluster whose addition gives the SMALLEST INCREASE IN WITHIN-CLUSTER VARIANCE

39
Q

What is the advantage of using Ward’s Method?

A

Creates clusters of SIMILAR SIZE that are relatively COMPACT

40
Q

How do we determine whether there is an improvement if we move a subject to the other cluster?

A

Need to tell the algorithm:

  1. No. of clusters
  2. Exact composition - who’s in what cluster
41
Q

What is the 3-step approach that combines hierarchical and iterative algorithms?

A
  1. Nearest Neighbour
    - To detect OUTLIERS
  2. Ward’s Method
    - To decide on NO. OF CLUSTERS and obtain initial solution for Step 3
  3. K-Means
    - To obtain FINAL cluster solution
42
Q

What is the output obtained from the Nearest Neighbour method? How do we identify outliers?

A

Output: DENDROGRAM
Identify outliers:
  • Dendrogram - indicates agglomeration order; last subjects to be added may be outliers
  • Agglomeration schedule

43
Q

How do we decide on the number of clusters when executing Ward’s Method?

A
  • Manageable no. of clusters?
  • Size of clusters?
  • Interpretation of clusters
  • Large horizontal jump in dendrogram
  • Large jump in coefficients (agglomeration schedule)
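
The "large jump in coefficients" rule can be sketched as follows. This is only a heuristic illustration with hypothetical coefficients and a hypothetical function name; in practice it is weighed against the other considerations in the list (manageability, cluster size, interpretation).

```python
def suggest_cluster_count(coefficients):
    """coefficients[i] is the agglomeration coefficient of merge step i (0-based).

    With n subjects there are n - 1 merges; merge i reduces the number of
    clusters from n - i to n - i - 1.
    """
    n_subjects = len(coefficients) + 1
    jumps = [coefficients[i] - coefficients[i - 1] for i in range(1, len(coefficients))]
    # Merge step whose coefficient rises most compared with the previous step.
    step = max(range(len(jumps)), key=jumps.__getitem__) + 1
    # Stop just before that merge: n - step clusters remain at that point.
    return n_subjects - step

# Hypothetical schedule for 7 subjects: the fifth merge is much more costly.
print(suggest_cluster_count([0.5, 0.8, 1.1, 1.5, 9.0, 11.0]))  # 3
```
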
44
Q

What is the output obtained from Ward’s Method?

A

Cluster membership in data columns, with frequency tables for e.g. the 3- and 4-cluster solutions

Cluster means

45
Q

What is the output obtained from K-Means Clustering?

A

Final cluster membership in data columns

Freq. table of final cluster size

46
Q

What are the 2 operational issues for cluster analysis?

A
  1. Distance measure
  2. Standardisation

47
Q

What is the operational issue regarding distance measure?

A
  • For continuous variables: Euclidean distance
  • For binary variables: e.g. simple matching coefficient
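
Both measures are short to compute. A minimal sketch with hypothetical function names; note that the simple matching coefficient is a similarity (share of variables on which two subjects agree), so 1 minus the coefficient can serve as a distance.

```python
import math

def euclidean(x, y):
    """Straight-line distance between two subjects on continuous variables."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def simple_matching(x, y):
    """Share of binary variables on which two subjects have the same value."""
    matches = sum(1 for a, b in zip(x, y) if a == b)
    return matches / len(x)

print(euclidean((1, 2), (4, 6)))                    # 5.0
print(simple_matching((1, 0, 1, 1), (1, 1, 1, 0)))  # 0.5
```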

48
Q

What is the operational issue regarding standardisation?

A
  • Per variable: Use if variables are measured on diff. scales
  • Per subject: Use if subjects have very diff. means
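
The two options differ only in the direction of standardising: down the columns (variables) or across the rows (subjects). A minimal sketch, assuming z-scores with the population standard deviation (an assumption; software may use the sample version) and data stored as one row per subject. Function names and data are hypothetical, and a variable or subject with zero spread would need separate handling.

```python
import statistics

def z_scores(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def standardise_per_variable(rows):
    """Standardise each column, so variables on different scales become comparable."""
    columns = [z_scores(col) for col in zip(*rows)]
    return [list(r) for r in zip(*columns)]

def standardise_per_subject(rows):
    """Standardise each row, removing differences in subjects' overall level."""
    return [z_scores(row) for row in rows]

# Second variable is measured on a scale 100 times larger than the first.
by_variable = standardise_per_variable([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
# Second subject scores everything 6 points higher than the first.
by_subject = standardise_per_subject([[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]])
print(by_variable)
print(by_subject)
```
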
49
Q

What are the considerations to decide which clusters to target?

A
  • Fit with cluster positioning?
  • Cluster size?
  • Cluster profitability
50
Q

What are the considerations for positioning?

A

Different strategy for different target groups?

  • One vs. multiple brand(s)
  • Diff. sales pitch