Data Analysis IV: Cluster Analysis (Week 11) Flashcards

Question 1

Q

What is the aim of cluster analysis?

Answer

A

To form clusters/segments: groups of observations that are similar WITHIN a group and different BETWEEN groups

Question 2

Q

What is market segmentation?

Answer

A

Involves viewing a heterogeneous mkt as a number of smaller homogeneous mkts,

in response to differences b/w customers, and acting upon these diffs. b/w subgroups

Question 3

Q

What are the steps involved for segmentation?

Answer

A

Determine segmentation BASIS
Determine segmentation METHOD
CREATE segments
DESCRIBE segments

Question 4

Q

What are the steps involved for targeting?

Answer

A

SELECT one or more segments

Question 5

Q

What are the steps involved for positioning?

Answer

A

Develop STRATEGY & TACTICS for selected segments

Question 6

Q

What are the types of segmentation basis?

Answer

A

General (consumer-based) vs. product specific

Observable (objective) vs. unobservable (subjective)

Question 7

Q

What are general & observable variables to form a segmentation basis?

Answer

A

Cultural, geographic, demographic & socio-economic variables

Question 8

Q

What are product-specific & observable variables to form a segmentation basis?

Answer

A

User status, user frequency, store loyalty & patronage, situations

Question 9

Q

What are general & unobservable variables to form a segmentation basis?

Answer

A

Psychographics, values, personality & lifestyle

Question 10

Q

What are product-specific & unobservable variables to form a segmentation basis?

Answer

A

Psychographics, benefits, perceptions, elasticities, attributes, preferences, intentions

Question 11

Q

What are the criteria for effective segmentation?

Answer

A

Segments should be:

Identifiable
Substantial
Accessible
Stable
Responsive
Actionable

Question 12

Q

Why is a substantial segment necessary for effective segmentation?

Answer

A

Sizable segment: To maintain profitability

Very costly to create multiple mktg msgs for diff. small segments

Question 13

Q

Why is an accessible segment necessary for effective segmentation?

Answer

A

Able to reach individual segments separately (and target specific groups)

Question 14

Q

Why is a stable segment necessary for effective segmentation?

Answer

A

Segment membership is stable over time: customers do not keep switching from one segment to another

Question 15

Q

Why is a responsive segment necessary for effective segmentation?

Answer

A

Responsive - Diff. segments give diff. responses

Question 16

Q

Why is an actionable segment necessary for effective segmentation?

Answer

A

Able to distinguish between diff. segments, diff. platforms

Question 17

Q

How do we form clusters? (i.e. What are the segmentation methods?)

Answer

A

1. A-priori

2. Post-hoc

Question 18

Q

What is the a-priori segmentation method?

Answer

A

Segments determined in advance by researchers (before any data analysis)

Question 19

Q

What is the post-hoc segmentation method?

Answer

A

Segments determined from analyses of the data:
Hard clustering - e.g. cluster analysis
Soft clustering - e.g. latent class analysis

Question 20

Q

What is hard clustering?

Answer

A

Person CANNOT belong to >1 segment, can only belong to 1 cluster

Question 21

Q

What is soft clustering?

Answer

A

Each person has a certain PROBABILITY of belonging to each segment

Question 22

Q

How do we want observations to be clustered?

Answer

A

We want observations WITHIN a cluster to be CLOSE TOGETHER

And CLUSTERS to be FAR from each other
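
A minimal sketch of how "close within, far between" could be put into numbers. It assumes squared Euclidean distance as the measure of closeness, which this card does not specify; the data points and the helper names (`centroid`, `sq_dist`, `within_ss`) are made up for illustration.

```python
from statistics import mean

def centroid(cluster):
    """Mean of each variable across the points in a cluster."""
    return [mean(values) for values in zip(*cluster)]

def sq_dist(p, q):
    """Squared Euclidean distance between two points (assumed measure)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def within_ss(cluster):
    """Sum of squared distances from each point to its cluster centroid."""
    c = centroid(cluster)
    return sum(sq_dist(p, c) for p in cluster)

# Made-up data: two tight, well-separated clusters.
cluster_a = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]
cluster_b = [(8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]

within = within_ss(cluster_a) + within_ss(cluster_b)         # want this SMALL
between = sq_dist(centroid(cluster_a), centroid(cluster_b))  # want this LARGE
print(within, between)
```

A good solution keeps `within` small relative to `between`.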

Question 23

Q

Why is it impossible to try all options for clustering?

Answer

A

Observations are closer together (clusters are not clearly separated)
Have more than 2 variables
Have many observations
How many clusters?
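
To illustrate why trying every option is out of reach, the number of ways to split n observations into k non-empty clusters can be counted with the Stirling numbers of the second kind. This count is not part of the card; it is added here as an illustration, and the function name `stirling2` is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of ways to split n observations into k non-empty clusters."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # Observation n either forms a cluster on its own,
    # or joins one of the k clusters formed by the other n - 1 observations.
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

for n in (5, 10, 15):
    print(n, stirling2(n, 3))
# prints:
# 5 25
# 10 9330
# 15 2375101
```

With only 15 observations and 3 clusters there are already more than 2 million possible solutions, before even considering other numbers of clusters.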

Question 24

Q

What are the types of algorithms for cluster analysis?

Answer

A

1. Hierarchical

2. Iterative (e.g. k-means)

Question 25

Q

What is the hierarchical algorithm for cluster analysis?

Answer

A

Start with: all subjects separately (agglomerative) OR all in one cluster (divisive)
Combine/separate until reaching the end: all subjects in 1 cluster (agglomerative) OR each subject separately (divisive)

Question 26

Q

What is the iterative algorithm for cluster analysis?

Answer

A

Start with a solution

Move subjects b/w clusters until convergence
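
A minimal k-means-style sketch of "start with a solution, move subjects until convergence". It assumes squared Euclidean distance and reassigns each subject to the cluster with the nearest mean; the function names and data are made up, and statistical packages may update the solution differently.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(points):
    """Mean of each variable across a list of points."""
    return tuple(sum(v) / len(points) for v in zip(*points))

def k_means(points, labels, max_iter=100):
    """Reassign points to the nearest cluster mean until nothing moves.

    labels[i] is the starting cluster of points[i] (the initial solution).
    A cluster that loses all its points simply disappears in this sketch.
    """
    labels = list(labels)
    for _ in range(max_iter):
        clusters = sorted(set(labels))
        means = {c: centroid([p for p, l in zip(points, labels) if l == c])
                 for c in clusters}
        new_labels = [min(clusters, key=lambda c: sq_dist(p, means[c]))
                      for p in points]
        if new_labels == labels:   # convergence: no subject moved
            break
        labels = new_labels
    return labels

# Made-up data with a deliberately poor starting solution.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
start = [0, 1, 0, 1, 0, 1]
final = k_means(points, start)
print(final)  # [0, 0, 0, 1, 1, 1]
```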

Question 27

Q

What are the types of hierarchical algorithms?

Answer

A

1. Agglomerative (e.g. Nearest Neighbour, Ward’s)

2. Divisive

Question 28

Q

What is the agglomerative hierarchical algorithm?

Answer

A

Start: Each subject separately
Process: Join subjects/clusters together
End: All subjects in 1 cluster

Question 29

Q

What is the divisive hierarchical algorithm?

Answer

A

Start: All subjects in 1 cluster
Process: Split clusters
End: Each subject separately

Question 30

Q

What are the steps involved in the agglomerative hierarchical algorithm?

Answer

A

E.g. 15 observations
Start: 15 clusters
- Calculate distances b/w points
- Combine points w/ smallest distance (homogeneous within clusters, heterogeneous b/w clusters)
- Recalculate distances b/w the resulting 14 clusters (13 single points and 1 merged pair)
- Form next cluster w/ smallest distance
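
The steps above can be sketched as a short loop. This is an illustration only: it uses 5 made-up points instead of 15, assumes Euclidean distance, and uses nearest-neighbour (single) linkage as the default way to measure distance b/w clusters. The names `single_linkage` and `agglomerate` are hypothetical.

```python
import math
from itertools import combinations

def single_linkage(a, b):
    """Distance b/w two clusters = distance b/w their two closest members."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerate(points, linkage=single_linkage):
    """Repeatedly merge the two closest clusters; return the merge history
    as (number of clusters after the merge, distance of the merge)."""
    clusters = [[p] for p in points]   # start: every point is its own cluster
    history = []
    while len(clusters) > 1:
        # find the pair of clusters with the smallest distance
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        d = linkage(clusters[i], clusters[j])
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        history.append((len(clusters), d))
    return history

points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
history = agglomerate(points)
for n_clusters, distance in history:
    print(n_clusters, round(distance, 3))
```

In this made-up data the point (20, 20) is the last to be joined, and at a much larger distance than the earlier merges.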

Question 31

Q

What are the methods to calculate the distance b/w a subject and a cluster, or between 2 clusters?

Answer

A

Nearest Neighbour (Single Linkage)
Centroid Method
Furthest Neighbour (Complete Linkage)
Ward’s Method
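
A hedged sketch of how each of the four methods could measure the distance b/w two clusters, assuming Euclidean distance b/w points. Function names are made up. For Ward's Method the sketch uses the standard formula for the increase in the within-cluster sum of squares caused by a merge.

```python
import math

def centroid(cluster):
    """Mean of each variable across the points in a cluster."""
    return tuple(sum(v) / len(cluster) for v in zip(*cluster))

def nearest_neighbour(a, b):
    """Single linkage: distance b/w the two CLOSEST members."""
    return min(math.dist(p, q) for p in a for q in b)

def furthest_neighbour(a, b):
    """Complete linkage: distance b/w the two FURTHEST members."""
    return max(math.dist(p, q) for p in a for q in b)

def centroid_method(a, b):
    """Distance b/w the two cluster MEANS."""
    return math.dist(centroid(a), centroid(b))

def ward_increase(a, b):
    """Increase in total within-cluster sum of squares if a and b merge."""
    n_a, n_b = len(a), len(b)
    return n_a * n_b / (n_a + n_b) * math.dist(centroid(a), centroid(b)) ** 2

# Made-up clusters of two points each.
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(4.0, 0.0), (4.0, 2.0)]
print(nearest_neighbour(a, b))             # 4.0
print(round(furthest_neighbour(a, b), 3))  # 4.472
print(centroid_method(a, b))               # 4.0
print(ward_increase(a, b))                 # 16.0
```

At each step the agglomerative algorithm would merge the pair of clusters with the smallest value of the chosen measure.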

Question 32

Q

What are the steps involved for Nearest Neighbour (Single Linkage)?

Answer

A

Choose subject for which the distance to the NEAREST subject in the cluster is SHORTEST

Question 33

Q

What are the advantages of using Nearest Neighbour (Single Linkage) method?

Answer

A

Tendency to create chain-like clusters

Suitable for detecting outliers

Question 34

Q

What are the steps involved for the Centroid Method?

Answer

A

Choose subject for which the distance to the MEAN of the cluster is SHORTEST

Question 35

Q

What is the advantage of using the Centroid Method?

Answer

A

Little influence of outliers

Question 36

Q

What are the steps involved for Furthest Neighbour (Complete Linkage) method?

Answer

A

Choose subject for which the distance to the FURTHEST subject in the cluster is SHORTEST

Question 37

Q

What is the disadvantage of using the Furthest Neighbour (Complete Linkage) method?

Answer

A

Very sensitive to outliers

Question 38

Q

What are the steps involved for Ward’s Method?

Answer

A

Choose the subject/cluster whose addition gives the SMALLEST INCREASE in WITHIN-CLUSTER VARIANCE

Question 39

Q

What is the advantage of using Ward’s Method?

Answer

A

Creates clusters of SIMILAR SIZE that are relatively COMPACT

Question 40

Q

How do we determine whether there is an improvement if we move a subject to the other cluster?

Answer

A

Need to tell the algorithm (as a starting solution):

No. of clusters
Exact composition - who’s in what cluster

Question 41

Q

What is the 3-step approach that combines hierarchical and iterative algorithms?

Answer

A

1. Nearest Neighbour
- To detect OUTLIERS
2. Ward’s Method
- To decide on NO. OF CLUSTERS and obtain initial solution for Step 3
3. K-Means
- To obtain FINAL cluster solution

Question 42

Q

What is the output obtained from the Nearest Neighbour method? How do we identify outliers?

Answer

A

Output: DENDROGRAM
Identify outliers:
- Dendrogram: indicates agglomeration order. Last subjects to be added may be outliers
- Agglomeration schedule

Question 43

Q

How do we decide on the number of clusters when executing Ward’s Method?

Answer

A

Manageable no. of clusters?
Size of clusters?
Interpretation of clusters
Large “horizontal jump” in dendrogram
Large jump in coefficients (agglomeration schedule)
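
A small sketch of the "large jump in coefficients" rule, assuming the agglomeration schedule lists one coefficient per merge step in order. The coefficients below are hypothetical, and `suggest_n_clusters` is an illustrative helper, not the output of any particular package. The result is a suggestion only; the other criteria above still apply.

```python
def suggest_n_clusters(coefficients):
    """coefficients[i] is the agglomeration coefficient of merge step i + 1
    (n observations give n - 1 steps). Returns the number of clusters that
    exist just BEFORE the largest jump in the coefficients."""
    n = len(coefficients) + 1   # number of observations
    jumps = [b - a for a, b in zip(coefficients, coefficients[1:])]
    step = max(range(len(jumps)), key=jumps.__getitem__)  # 0-based jump index
    # the jump lies between merge step (step + 1) and (step + 2);
    # stopping after merge step (step + 1) leaves n - (step + 1) clusters
    return n - (step + 1)

# Hypothetical schedule for 7 observations (6 merge steps).
coefficients = [1.0, 1.2, 1.5, 2.0, 9.0, 12.0]
print(suggest_n_clusters(coefficients))  # 3
```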

Question 44

Q

What is the output obtained from Ward’s Method?

Answer

A

Cluster membership in data columns - frequency tables for e.g. the 3- & 4-cluster solutions

Cluster means

Question 45

Q

What is the output obtained from K-Means Clustering?

Answer

A

Final cluster membership in data columns

Freq. table of final cluster size

Question 46

Q

What are the 2 operational issues for cluster analysis?

Answer

A

1. Distance measure

2. Standardisation

Question 47

Q

What is the operational issue regarding distance measure?

Answer

A

For continuous variables: Euclidean distance

For binary variables: e.g. simple matching coefficient
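
A minimal sketch of the two measures named above. Function names and example values are made up. Note that the simple matching coefficient is a SIMILARITY (share of variables on which two subjects agree), so 1 minus the coefficient can serve as a distance.

```python
import math

def euclidean(p, q):
    """Euclidean distance b/w two subjects on continuous variables."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def simple_matching(p, q):
    """Share of binary variables on which two subjects agree (0/0 or 1/1)."""
    return sum(a == b for a, b in zip(p, q)) / len(p)

print(euclidean((1, 2), (4, 6)))                      # 5.0
print(simple_matching((1, 0, 1, 1), (1, 1, 1, 0)))    # 0.5
```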

Question 48

Q

What is the operational issue regarding standardisation?

Answer

A

Per variable: Use if variables are measured on diff. scales
Per subject: Use if subjects have very diff. means
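
A hedged sketch of the two kinds of standardisation, assuming z-scores (subtract the mean, divide by the standard deviation). The card does not say which formula is used, and the choice of the population standard deviation here is also an assumption. Data and function names are made up; a variable or subject with zero variance would cause a division by zero in this sketch.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def standardise_per_variable(rows):
    """Standardise each column (variable) across subjects."""
    columns = [z_scores(col) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]

def standardise_per_subject(rows):
    """Standardise each row (subject) across variables."""
    return [z_scores(row) for row in rows]

# Made-up data: age (years) and income are on very different scales.
rows = [[20, 30000], [30, 50000], [40, 70000]]
by_variable = standardise_per_variable(rows)

# Made-up ratings: subject 2 scores everything 4 points higher than subject 1.
ratings = [[1, 2, 3], [5, 6, 7]]
by_subject = standardise_per_subject(ratings)
```

Without standardising per variable, income would dominate the Euclidean distance simply because its numbers are larger. After standardising per subject, the two raters in `ratings` get identical profiles.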

Question 49

Q

What are the considerations to decide which clusters to target?

Answer

A

Fit with cluster positioning?
Cluster size?
Cluster profitability

Question 50

Q

What are the considerations for positioning?

Answer

A

Diff. strategy for diff. target groups?

One vs. multiple brand(s)
Diff. sales pitch