Lecture 11 - MDS Flashcards

1
Q

What is different about MDS compared to clustering? How are they similar?

A
  • MDS is older
  • MDS has an explicit model (clustering has weak/no model)

SIMILAR: both analyse distance (dissimilarities) and both can do variables OR cases

2
Q

How do you represent MDS?

A
  • geometrical picture
  • each case/variable is a point in space
  • 1D: line
  • 2D: chart
    ^ can do more than this, but 1/2 usually preferred because easier to visualise on page
3
Q

What are the 3 types of MDS?

A
  • CLASSICAL: simplest, 1 proximity matrix, interval data (often ratio)
  • NON-METRIC: most common, 1 distance matrix, assumes ordinal only
  • > 1 MATRIX: can create matrix for each subject, unweighted = replicated MDS, weighted = individual diffs MDS
4
Q

What is the MDS problem?

A
  • locate points in space to represent variables, so that the distances b/w points are similar to the original distances
  • need to find: coordinates ajm and akm of points j and k on the mth of r dimensions
  • so that djk, the distance b/w j and k in r-dimensional Euclidean space, is as close as possible to the original distance
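
A minimal sketch of the distance being matched, assuming ordinary Euclidean distance over the r coordinates of each point. The function name and the example coordinates are hypothetical, made up for illustration.

```python
import math

def euclidean_distance(a_j, a_k):
    """Distance d_jk between points j and k, each given as r coordinates."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a_j, a_k)))

# Hypothetical 2-dimensional configuration (r = 2) for three points.
coords = {"A": [0.0, 0.0], "B": [3.0, 4.0], "C": [3.0, 0.0]}

d_ab = euclidean_distance(coords["A"], coords["B"])
d_ac = euclidean_distance(coords["A"], coords["C"])
```

MDS works the other way round: it is given the distances and has to find the coordinates.
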
5
Q

What is the goal of MDS?

A
  • dimension reduction
  • n variables, you have an n-dimensional space
  • want to “shrink” down from many dimensions to fewer dimensions
  • AIM: make data more understandable
6
Q

What are the advantages and issues with dimension reduction?

A
  • issues: always lose something
  • good: very useful simplification of the data
  • can oversimplify! (can lose too much info)
7
Q

How do we define distance in MDS?

A
  • we let the distance d be some function of r (here r = original distance, not the number of dimensions)
  • djk = f(rjk)
8
Q

What is the difference between classic and modern MDS?

A
  • classic: function is a linear regression, djk = a + b*rjk
  • modern: rank-ordered function (i.e. want rank order of distances to be the same as original)
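
A small sketch of the two kinds of function, assuming the proximities are held in a flat list with no ties. Both helper names and the numbers are hypothetical.

```python
def linear_transform(r, a, b):
    """Classic: every distance is a + b * r_jk."""
    return [a + b * r_jk for r_jk in r]

def same_rank_order(r, d):
    """Modern: only the ordering of d has to agree with the ordering of r."""
    order_r = sorted(range(len(r)), key=lambda i: r[i])
    order_d = sorted(range(len(d)), key=lambda i: d[i])
    return order_r == order_d

proximities = [1.0, 2.0, 3.0, 4.0]
classic_d = linear_transform(proximities, a=0.5, b=2.0)
modern_d = [0.1, 0.5, 0.6, 3.0]   # not linear in r, but same rank order
```

Here `modern_d` would be a poor fit under the classic linear function but a perfect fit under the rank-order one.
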

9
Q

How does MDS work?

A
  • start with points located randomly in space (of a dimensionality chosen by the user)
  • math procedures ‘move’ points around to minimise stress
  • many iterations > until rank order of their distances ‘best’ matches the rank order of the data (i.e. until stress reaches its lowest point)
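
A rough sketch of the "start random, then move points to reduce stress" idea. This is a simplification, not the lecture's or SPSS's algorithm: it uses plain gradient descent on raw squared error between distances (a metric version) rather than the rank-order version. All names, the step size, and the target matrix are assumptions.

```python
import math
import random

def pairwise_distances(coords):
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def raw_stress(coords, target):
    """Sum of squared differences between current and target distances."""
    d = pairwise_distances(coords)
    n = len(coords)
    return sum((d[i][j] - target[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def fit_mds(target, dims=2, step=0.02, iters=1000, seed=0):
    rng = random.Random(seed)
    n = len(target)
    # Start with points located randomly in a space of the chosen dimensionality.
    coords = [[rng.uniform(-1.0, 1.0) for _ in range(dims)] for _ in range(n)]
    history = [raw_stress(coords, target)]
    for _ in range(iters):
        d = pairwise_distances(coords)
        new_coords = []
        for i in range(n):
            grad = [0.0] * dims
            for j in range(n):
                if i == j or d[i][j] == 0.0:
                    continue
                coef = 2.0 * (d[i][j] - target[i][j]) / d[i][j]
                for m in range(dims):
                    grad[m] += coef * (coords[i][m] - coords[j][m])
            # Move point i a small step downhill.
            new_coords.append([coords[i][m] - step * grad[m] for m in range(dims)])
        coords = new_coords
        history.append(raw_stress(coords, target))
    return coords, history

# Hypothetical target: distances among the four corners of a unit square.
s2 = math.sqrt(2.0)
target = [[0, 1, s2, 1], [1, 0, 1, s2], [s2, 1, 0, 1], [1, s2, 1, 0]]
coords, history = fit_mds(target)
```

`history` holds the stress after each iteration, so it can be inspected to see the points settling.
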
10
Q

What is stress? What situations produce higher stress?

A
  • badness of fit: how well does the MDS representation fit the data? (lower = better)
  • measure devised by Kruskal
  • less than 0.15 is a good fit
  • more variables = higher stress; more dimensions = lower stress
  • want to use the smallest no. of dimensions that gives an acceptable stress value
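
A sketch of one common form of the measure, Kruskal's stress-1: the square root of the summed squared differences between the distances and the transformed proximities, divided by the summed squared distances. Which stress variant the lecture or SPSS reports is not stated in these notes, so treat the choice as an assumption; the numbers are made up.

```python
import math

def kruskal_stress1(distances, disparities):
    """Stress-1 from MDS distances and transformed proximities (disparities)."""
    num = sum((d - dh) ** 2 for d, dh in zip(distances, disparities))
    den = sum(d ** 2 for d in distances)
    return math.sqrt(num / den)

perfect = kruskal_stress1([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
imperfect = kruskal_stress1([1.0, 2.0, 3.0], [1.0, 2.0, 2.0])
```

A perfect match gives 0; any mismatch pushes the value up.
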
11
Q

What did Kruskal do? What is good about this?

A
  • devised a method of rank-order transformations
  • monotone regression
  • distance of the proximity matrix converted into rank order
  • no need for assumptions of ratio scales
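
A minimal sketch of monotone regression using the pool-adjacent-violators idea. It assumes the distances are already listed in the rank order of their proximities; the function name and data are hypothetical.

```python
def monotone_regression(values):
    """Least-squares non-decreasing fit to `values` (pool adjacent violators)."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge while the newest block's mean falls below the previous block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted

# Distances listed in order of increasing proximity value; 3.0 and 2.0 are out of order.
disparities = monotone_regression([1.0, 3.0, 2.0, 4.0])
```

The two out-of-order values are replaced by their mean, so the fitted values never decrease. Only the order of the proximities is used, which is why no ratio-scale assumption is needed.
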
12
Q

What are the 2 MDS methods in SPSS?

A
  • ALSCAL: method of steepest descent
  • PROXSCAL: iterative majorisation > use this

13
Q

Why use PROXSCAL?

A
  • doesn’t need a good starting point
  • always converges to minimum stress

14
Q

How do you know when to stop?

A
  • stress changes by less than a preset criterion
  • stress value reaches a preset min value
  • program has reached set no. of iterations
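
The three stopping rules above can be sketched as one check run after every iteration. The function name and the default thresholds are hypothetical, not SPSS's actual settings.

```python
def should_stop(stress_history, max_iters=100, min_stress=1e-4, min_change=1e-6):
    """Return the reason to stop, or None to keep iterating."""
    if len(stress_history) >= max_iters:
        return "max iterations reached"
    if stress_history and stress_history[-1] <= min_stress:
        return "minimum stress reached"
    if (len(stress_history) >= 2
            and abs(stress_history[-2] - stress_history[-1]) < min_change):
        return "stress change below criterion"
    return None
```
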
15
Q

What are the subjective vs. statistical interpretations of MDS?

A
  • subjective: look at diagram to see patterns and how they group together > MORE COMMON
  • statistical: regression of dimensional coordinates of an MDS solution against other variables
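
A sketch of the statistical route, assuming one MDS dimension and one external variable measured on the same items. The helper and the data are made up for illustration.

```python
def simple_regression(x, y):
    """Least-squares slope and intercept for y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: coordinates on dimension 1 and an external rating per item.
dim1 = [-1.0, 0.0, 1.0, 2.0]
rating = [1.0, 3.0, 5.0, 7.0]
slope, intercept = simple_regression(dim1, rating)
```

A strong relationship suggests the dimension can be read in terms of the external variable.
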
16
Q

What are the 2 spatial manifolds?

A
  • simplex: points form chain or sequence
  • circumplex: points form a circle

17
Q

What is the assumption of MDS? How do you check this?

A
  • relationship b/w the proximity data and the derived distances is smooth

CHECK transformation plot for DEGENERACY

  • want smooth plot
  • degeneracy: points of representation are located in a few tight clusters
  • these tight clusters may be small part of structure of data, but can swamp interpretation
  • may make stress close to 0
18
Q

Why is it r-dimensional?

A

r = the no. of dimensions we choose to represent the data

19
Q

How do we actually reduce dimensions?

A
  • TRANSFORMATION of distances!!!
  • distances in new space must be different to original data

20
Q

What is important about the axes in MDS?

A
  • they are arbitrary!!
  • relative distances are key
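
A quick check of why the axes are arbitrary: rotating a 2D configuration changes every coordinate but leaves every inter-point distance the same. The configuration and helper names are hypothetical.

```python
import math

def rotate(points, degrees):
    """Rotate 2D points about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def all_distances(points):
    """Distances for every pair of points, in a fixed pair order."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]

config = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
rotated = rotate(config, 30)
before = all_distances(config)
after = all_distances(rotated)
```
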

21
Q

What can you use to determine how many dimensions to use in MDS?

A

  • scree plot > choose no. of dimensions at the elbow
  • elbow may not be obvious if you have too few variables

22
Q

What is the difference between transformations in classic vs. modern MDS?

A
  • classical: linear
    djk = a + b*rjk
  • modern: rank-order, not linear
23
Q

What is the stress from random data thing? How was this done?

A
  • when stress is less than random data, we can accept the solution
  • graph: stress (y) v. dimensions (x), line for diff no. of variables
  • you use the no. of dimensions and variables you are using to find the stress of the appropriate random data, want yours to be less than this!
  • some dudes made random data, replicated many times and found stress for the data in a no. of solutions
24
Q

In an MDS solution, what are other names for “informal groups” and “partitioning space”?

A
  • informal groups: cluster analysis
  • partitioning space: regression

25
Q

What is in the transformation plot?

A
  • Y: transformed proximities
  • X: (original) proximities

26
Q

What does MDS transform ordinal proximities into?

A

distance data