Lecture 11 - MDS Flashcards
What is different about MDS compared to clustering? How are they similar?
- MDS is older
- MDS has an explicit model (clustering has weak/no model)
SIMILAR: both analyse distance (dissimilarities) and both can do variables OR cases
How do you represent MDS?
- geometrical picture
- each case/variable is a point in space
- 1D: line
- 2D: chart
^ can do more than this, but 1/2 usually preferred because easier to visualise on page
What are the 3 types of MDS?
- CLASSICAL: simplest, 1 proximity matrix, interval data (often ratio)
- NON-METRIC: most common, 1 distance matrix, assumes ordinal only
- > 1 MATRIX: can create matrix for each subject, unweighted = replicated MDS, weighted = individual diffs MDS
What is the MDS problem?
- locate points in space to represent variables, so that distances b/w points are similar to the original distances
- need to find: coordinates ajm and akm of points j and k on the mth of r dimensions
- so that distance djk is distance b/w j and k, in r-dimensional Euc space
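A minimal sketch of that distance, assuming the usual Euclidean formula djk = sqrt(sum over the r dimensions of (ajm - akm)^2). The function name and the sample coordinates are illustrative only.

```python
import math

def euclidean_distance(a_j, a_k):
    """Distance d_jk between points j and k, given their coordinates on each of the r dimensions."""
    if len(a_j) != len(a_k):
        raise ValueError("both points need the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a_j, a_k)))

# Two points in a 2-dimensional space (r = 2); the coordinates are made up.
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```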
What is the goal of MDS?
- dimension reduction
- n variables, you have an n-dimensional space
- want to “shrink” down from many dimensions to fewer dimensions
- AIM: make data more understandable
What are the advantages and issues with dimension reduction?
- issues: always lose something
- good: very useful simplification of the data
- can oversimplify! (can lose too much info)
How do we define distance in MDS?
- we let the distance d be some function of r (where r = original distance)
- djk = f(rjk)
What is the difference between classic and modern MDS?
- classic: function is linear regression, djk = a + b.rjk
- modern: rank-ordered function (i.e. want rank order of distances to be the same as original)
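A hedged sketch of the two ideas. For the classic case it assumes a and b are estimated by ordinary least squares; for the modern case it only checks whether two sets of values share the same rank order. Function names and numbers are hypothetical, and real software may fit the transformation differently.

```python
def fit_linear_transform(r, d):
    """Least-squares estimates of a and b in d_jk = a + b * r_jk."""
    n = len(r)
    mean_r = sum(r) / n
    mean_d = sum(d) / n
    sxx = sum((x - mean_r) ** 2 for x in r)
    sxy = sum((x - mean_r) * (y - mean_d) for x, y in zip(r, d))
    b = sxy / sxx
    a = mean_d - b * mean_r
    return a, b

def same_rank_order(r, d):
    """True if sorting by r and sorting by d put the pairs in the same order."""
    def order(values):
        return sorted(range(len(values)), key=lambda i: values[i])
    return order(r) == order(d)

r = [1.0, 2.0, 3.0, 4.0]   # original dissimilarities (made-up numbers)
d = [3.0, 5.0, 7.0, 9.0]   # distances that are exactly 1 + 2 * r
print(fit_linear_transform(r, d))                 # (1.0, 2.0)

# Far from linear in r, but the rank order is preserved,
# which is all the rank-order approach asks for.
print(same_rank_order(r, [0.1, 0.5, 0.6, 10.0]))  # True
```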
How does MDS work?
- start with points located randomly in space (of a dimensionality chosen by the user)
- math procedures ‘move’ points around to minimise stress
- many iterations > until the rank order of their distances ‘best’ matches the rank order of the data (i.e. until stress is at its lowest point)
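The steps above can be sketched as a simple loop. This is a simplified metric version: it matches the target distances directly with plain gradient descent on raw (unnormalised) stress, where a non-metric program would also re-rank the distances at every iteration. The step size, iteration count and all names are assumptions chosen for illustration.

```python
import math
import random

def pair_distance(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def raw_stress(points, target):
    """Sum of squared differences between configuration and target distances."""
    n = len(points)
    return sum((pair_distance(points[j], points[k]) - target[j][k]) ** 2
               for j in range(n) for k in range(j + 1, n))

def fit_mds(target, dims=2, iterations=500, step=0.05, seed=0):
    rng = random.Random(seed)
    n = len(target)
    # Start with points located randomly in a space of the chosen dimensionality.
    points = [[rng.uniform(-1, 1) for _ in range(dims)] for _ in range(n)]
    start = raw_stress(points, target)
    for _ in range(iterations):
        grads = [[0.0] * dims for _ in range(n)]
        for j in range(n):
            for k in range(n):
                if j == k:
                    continue
                d = max(pair_distance(points[j], points[k]), 1e-12)  # avoid dividing by 0
                scale = 2.0 * (d - target[j][k]) / d
                for m in range(dims):
                    grads[j][m] += scale * (points[j][m] - points[k][m])
        # 'Move' every point a small step in the direction that lowers stress.
        for j in range(n):
            for m in range(dims):
                points[j][m] -= step * grads[j][m]
    return points, start, raw_stress(points, target)

# Target distances: the four corners of a unit square.
s = math.sqrt(2)
target = [[0, 1, s, 1], [1, 0, 1, s], [s, 1, 0, 1], [1, s, 1, 0]]
points, start, end = fit_mds(target)
print("stress before:", start, "after:", end)
```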
What is stress? What situations produce higher stress?
- badness of fit: how well does the MDS representation fit the data?
- measure devised by Kruskal
- less than 0.15 is good fit
- more variables = higher stress; more dimensions = lower stress
- want to use the fewest dimensions that give an acceptable stress value
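A short sketch of how a stress value can be computed, assuming the card means Kruskal's stress-1: the square root of the summed squared differences between fitted distances and transformed proximities (disparities), divided by the summed squared distances. Names and numbers are illustrative.

```python
import math

def kruskal_stress1(distances, disparities):
    """Stress-1: 0 means a perfect fit, larger values mean a worse fit."""
    numerator = sum((d - dh) ** 2 for d, dh in zip(distances, disparities))
    denominator = sum(d ** 2 for d in distances)
    return math.sqrt(numerator / denominator)

# One value per pair of points; made-up numbers.
print(kruskal_stress1([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))            # 0.0
print(round(kruskal_stress1([1.0, 2.0, 3.0], [1.0, 2.0, 2.0]), 3))  # 0.267
```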
What did Kruskal do? What is good about this?
- devised a method of rank-order transformations
- monotone regression
- distances in the proximity matrix are converted into rank order
- no need for assumptions of ratio scales
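Monotone regression can be sketched with the pool-adjacent-violators idea: walk through the distances in the rank order of the proximities and average any neighbours that break the order. This is a minimal version that assumes no tied proximities; the function names are hypothetical.

```python
def pool_adjacent_violators(values):
    """Closest non-decreasing sequence to `values` in the least-squares sense."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

def monotone_disparities(proximities, distances):
    """Disparities that follow the rank order of the proximities."""
    order = sorted(range(len(proximities)), key=lambda i: proximities[i])
    fitted = pool_adjacent_violators([distances[i] for i in order])
    result = [0.0] * len(distances)
    for position, i in enumerate(order):
        result[i] = fitted[position]
    return result

# The second and third distances are out of order, so they are averaged.
print(monotone_disparities([1, 2, 3, 4], [1.0, 3.0, 2.0, 4.0]))  # [1.0, 2.5, 2.5, 4.0]
```

Only the ordering of the proximities is used, never their size, which is why ratio-scale assumptions are not needed.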
What are the 2 MDS methods in SPSS?
- ALSCAL: method of steepest descent
- PROXSCAL: iterative majorisation > use this
Why use PROXSCAL?
- doesn’t need a good starting point
- always converges to a stress minimum (stress never increases from one iteration to the next)
How do you know when to stop?
- stress changes by less than a preset criterion between iterations
- stress value reaches a preset min value
- program has reached set no. of iterations
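The three stopping rules can be sketched as a single check run after every iteration. The thresholds below are placeholders, not the defaults of any particular program, and the function name is hypothetical.

```python
def stopping_reason(stress_history, tol=1e-4, min_stress=1e-3, max_iter=100):
    """Return why iteration should stop, or None to keep going.

    stress_history[0] is the stress of the starting configuration,
    so len(stress_history) - 1 iterations have been run.
    """
    current = stress_history[-1]
    if current <= min_stress:
        return "minimum stress reached"
    if len(stress_history) >= 2 and abs(stress_history[-2] - current) < tol:
        return "stress change below criterion"
    if len(stress_history) - 1 >= max_iter:
        return "maximum iterations reached"
    return None

print(stopping_reason([0.5, 0.3]))           # None
print(stopping_reason([0.5, 0.30001, 0.3]))  # stress change below criterion
```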
What are the subjective vs. statistical interpretations of MDS?
- subjective: look at diagram to see patterns and how they group together > MORE COMMON
- statistical: regression of dimensional coordinates of an MDS solution against other variables
What are the 2 spatial manifolds?
- simplex: points form chain or sequence
- circumplex: points form a circle
What is the assumption of MDS? How do you check this?
- relationship b/w the proximity data and the derived distances is smooth
CHECK transformation plot for DEGENERACY
- want smooth plot
- degeneracy: points of representation are located in a few tight clusters
- these tight clusters may be small part of structure of data, but can swamp interpretation
- may make stress close to 0
Why is it r-dimensional?
r = the no. of dimensions we choose to represent the data
How do we actually reduce dimensions?
- TRANSFORMATION of distances!!!
- distances in new space must be different to original data
What is important about the axes in MDS?
- they are arbitrary!!
- relative distances are key
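A quick illustration of why the axes are arbitrary: rotating a 2D configuration changes every coordinate but leaves every distance between points untouched. The points and the angle are made up.

```python
import math

def rotate(points, angle):
    """Rotate 2D points about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def all_distances(points):
    """Distances between every pair of points, in a fixed pair order."""
    return [math.hypot(p[0] - q[0], p[1] - q[1])
            for i, p in enumerate(points) for q in points[i + 1:]]

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
rotated = rotate(points, 0.7)

# Coordinates differ, but the distances agree up to floating-point error.
print(all(math.isclose(a, b)
          for a, b in zip(all_distances(points), all_distances(rotated))))  # True
```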
What can you use to determine how many dimensions to use in MDS?
scree plot > at elbow
- elbow may not be obvious if you have too few variables
What is the difference between transformations in classic vs. modern MDS?
- classical: linear, djk = a + b*rjk
- modern: rank-order, not linear
What is the stress from random data thing? How was this done?
- when stress is less than the stress obtained from random data, we can accept the solution
- graph: stress (y) v. dimensions (x), line for diff no. of variables
- you use the no. of dimensions and variables you are using to find the stress of the appropriate random data, want yours to be less than this!
- some dudes made random data, replicated many times and found stress for the data in a no. of solutions
In an MDS solution, what are other names for “informal groups” and “partitioning space”?
- informal groups: cluster analysis
- partitioning space: regression
What is in the transformation plot?
- Y: transformed proximities
- X: (original) proximities
What does MDS transform ordinal proximities into?
distance data