Week 6 Flashcards
Classical Scaling (MDS/PCO) and non-metric MDS (nMDS)
PCA
- PCA is a projection of the sample points (objects) onto axes drawn in directions that maximize total variation along each axis.
- New axes are uncorrelated.
- Preserves Euclidean distances among points.
- Does not have formal assumptions, but we know that variables should not be highly skewed, and the relative scales of variables will matter (if done on the covariance matrix).
When to Use PCA
- Quantitative data (covariances have no meaning for qualitative data)
- Data are approx. normally distributed
- The number of variables (p) should be less than the number of replicates (n)
- Avoid using if there are lots of zeros in the data or if data are highly skewed
- PCA works best to reduce dimensions when there is high correlation among variables
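The points above can be sketched in code. This is a minimal illustration (made-up data, not from the lecture notes) assuming NumPy and scikit-learn: four highly correlated variables measured on 30 replicates, standardized so PCA effectively runs on the correlation matrix.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# 30 replicates (n) on 4 correlated variables (p), so p < n
base = rng.normal(size=(30, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(30, 1)) for _ in range(4)])

# Standardizing first is equivalent to PCA on the correlation matrix,
# which removes the effect of each variable's scale of measurement
Xs = StandardScaler().fit_transform(X)
pca = PCA()
scores = pca.fit_transform(Xs)  # PC scores on uncorrelated new axes

# With highly correlated variables, PC1 captures most of the variance
print(pca.explained_variance_ratio_)
```

Because the four variables are nearly copies of one another, almost all of the total variation ends up on the first axis, which is exactly the situation where PCA is most useful for dimension reduction.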
MDS - Multidimensional Scaling
MDS is a method that takes the distance matrix and tries to find a configuration of points in a space (in this case, a 2D space) where the distances between the points match the distances in the matrix as closely as possible.
principal coordinates analysis (PCO) - give example
Imagine you have a dataset with information about the heights and weights of a group of people. A classical solution like principal coordinates analysis (PCO) or classical scaling would take this data and find a way to represent each person as a point in a space where the distances between the points reflect how similar or dissimilar the people are based on their heights and weights.
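The height/weight example can be worked through by hand with a minimal classical-scaling (PCO) sketch: double-centre the squared distance matrix and take its eigendecomposition. The data values and variable names here are made up for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Toy height (cm) / weight (kg) data for 5 people (illustrative values)
X = np.array([[170, 70], [165, 60], [180, 85], [175, 75], [160, 55]], float)
D = squareform(pdist(X))              # Euclidean distance matrix

n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n   # centring matrix
B = -0.5 * J @ (D ** 2) @ J           # Gower's double-centred matrix

eigvals, eigvecs = np.linalg.eigh(B)
order = np.argsort(eigvals)[::-1]     # sort eigenvalues, largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Coordinates on the first two (positive-eigenvalue) axes reproduce D
# exactly, because Euclidean distance on 2-D data is metric
coords = eigvecs[:, :2] * np.sqrt(eigvals[:2])
print(np.allclose(squareform(pdist(coords)), D))  # True
```

Each row of `coords` is one person's position in the new space, and the inter-point distances match the original height/weight distances.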
“Euclidean space”
“Euclidean space” refers to the familiar space we live in, where distances are measured in a straight line, and the principles of Euclidean geometry apply. It’s named after the ancient Greek mathematician Euclid, who laid down the foundational rules for this type of space.
Here’s a simple way to understand it:
Imagine you have a piece of graph paper, with two axes: one horizontal (x-axis) and one vertical (y-axis). Each point on this graph represents a unique location in this space. For example, the point (2,3) would mean moving 2 units to the right along the x-axis and 3 units upwards along the y-axis from the origin (0,0).
Jaccard’s coefficient and Jaccard dissimilarity
Directly interpretable as the “proportion of species (or characters) shared” between two objects or sites: S_J = a/(a + b + c). Ignores joint-absence information.
D_J = 1 – S_J = Jaccard dissimilarity, interpretable as the proportion of unshared species (or characters).
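A tiny worked example of the a, b, c counts, using made-up presence/absence data for two sites; note the joint absences (species absent from both sites) never enter the formula.

```python
import numpy as np

site1 = np.array([1, 1, 0, 1, 0, 0])  # presence/absence of 6 species
site2 = np.array([1, 0, 0, 1, 1, 0])

a = np.sum((site1 == 1) & (site2 == 1))  # shared presences: species 1, 4
b = np.sum((site1 == 1) & (site2 == 0))  # present only at site 1: species 2
c = np.sum((site1 == 0) & (site2 == 1))  # present only at site 2: species 5

S_J = a / (a + b + c)   # 2 / 4 = 0.5: half the observed species are shared
D_J = 1 - S_J           # Jaccard dissimilarity = 0.5
print(S_J, D_J)
```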
Distance measures for continuous variables
- Euclidean distance
* “as the crow flies”, straight-line distance in variable space
* Pythagoras’ theorem: the square root of the sum of squared differences on each dimension
* most commonly used for continuous data
- Manhattan distance
* “as the taxi drives”, simply the sum of the absolute differences in each dimension
* less affected by outliers
- Mahalanobis distance
* takes into account covariances among variables
* effectively rescales ellipsoidal (correlated) clusters to spherical ones
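The three distances can be compared on a single pair of points with SciPy. The points and the covariance matrix below are made up for illustration; in practice the covariance passed to the Mahalanobis distance would be estimated from the data.

```python
import numpy as np
from scipy.spatial.distance import euclidean, cityblock, mahalanobis

x = np.array([2.0, 3.0])
y = np.array([5.0, 7.0])

d_euc = euclidean(x, y)   # sqrt((5-2)^2 + (7-3)^2) = 5.0
d_man = cityblock(x, y)   # |5-2| + |7-3| = 7.0

# Mahalanobis needs the inverse covariance of the data cloud;
# this covariance matrix is an illustrative assumption
cov = np.array([[2.0, 0.5], [0.5, 1.0]])
d_mah = mahalanobis(x, y, np.linalg.inv(cov))

print(d_euc, d_man, d_mah)
```

With an identity covariance, the Mahalanobis distance reduces to the Euclidean distance; the covariance term is what lets it account for correlated variables.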
Bray-Curtis dissimilarity
* It does not include double zeros.
* The comparison between two sampling units does not depend on the rest of the data set.
* It is a ratio with an upper limit: 0 ≤ d ≤ 1.
* Can be interpreted as a “percentage difference” in ecological terms.
* The contribution of variables to the measure will be relative to their scale of measurement, so an appropriate transformation a priori is generally recommended.
Bray-Curtis takes into account both the types of species in a community and how abundant they are. It’s like considering not only the types of houses in a neighborhood but also how many of each type there are.
For instance, if in Neighborhood A, there are mostly small houses and in Neighborhood B, there are mostly big houses, Bray-Curtis helps capture these differences in both types and abundances.
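A minimal sketch with made-up abundance counts, checked against SciPy’s built-in version. The fifth species is absent from both sites (a double zero) and contributes nothing to the measure.

```python
import numpy as np
from scipy.spatial.distance import braycurtis

# Abundances of 5 species at two sites (illustrative counts)
site_a = np.array([10, 0, 5, 3, 0])
site_b = np.array([6, 2, 5, 0, 0])   # species 5 is a double zero

# Sum of absolute differences over the sum of all abundances
d_bc = np.abs(site_a - site_b).sum() / (site_a + site_b).sum()
print(d_bc)  # 9 / 31, roughly 0.29

# Matches SciPy's implementation
print(np.isclose(d_bc, braycurtis(site_a, site_b)))  # True
```

A value of 0 would mean identical communities; 1 would mean no species shared at all.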
A complexity of PCO
- If the distance used is metric, then all eigenvalues are non-negative.
- If the distance used is semi-metric, then there will be some negative eigenvalues!
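This can be seen with a deliberately constructed toy example (my own, not from the notes): a dissimilarity matrix that violates the triangle inequality, so it is not metric, produces a negative eigenvalue when double-centred for PCO.

```python
import numpy as np

# d(1,2) = d(1,3) = 1 but d(2,3) = 3 > 1 + 1, so the triangle
# inequality fails and this dissimilarity is not metric
D = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0],
              [1.0, 3.0, 0.0]])

n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n   # centring matrix
B = -0.5 * J @ (D ** 2) @ J           # Gower-centred matrix

eigvals = np.linalg.eigvalsh(B)
print(eigvals)  # smallest eigenvalue is -5/6, i.e. about -0.83
```

A negative eigenvalue means those points cannot be placed in any Euclidean space so that their straight-line distances reproduce D exactly.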
Eigenvalues
Eigenvalues are like measures of how much variation or spread there is in the data. Higher eigenvalues mean more spread, while lower ones mean less spread.
Shepard plot
A Shepard plot shows the relationship between the distances in the original dissimilarity matrix and the distances we are seeing in Euclidean space on our plot. To check if your squishing job is any good, you can look at a Shepard plot: it compares the distances between points in the original data with the distances between those same points in the squished-down space. If the distances look similar, then your squishing job is probably pretty good. But if the distances are all wonky, then maybe you need to squish the data in a different way to capture the important stuff better.
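A hedged sketch of building a Shepard plot, assuming NumPy, SciPy, scikit-learn, and matplotlib are available; the data and the output filename are made up.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt
from scipy.spatial.distance import pdist
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
X = rng.normal(size=(15, 6))             # 15 objects, 6 variables

orig_d = pdist(X)                        # original distances (6-D)
mds = MDS(n_components=2, random_state=0)
coords = mds.fit_transform(X)
config_d = pdist(coords)                 # distances on the 2-D plot

# Shepard plot: original dissimilarities vs configuration distances
plt.scatter(orig_d, config_d, s=10)
plt.xlabel("Original dissimilarity")
plt.ylabel("Configuration distance")
plt.title("Shepard plot")
plt.savefig("shepard.png")               # illustrative filename
```

Points falling near a tight increasing band indicate the 2-D configuration is preserving the original distance structure well.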
Why use MDS?
* Sometimes PCO doesn’t do a very good job of
representing the relative inter-point
relationships in a small no. of dimensions.
* Instead of using an eigenvector technique, we
could use a criterion based on distances
themselves that focuses explicitly on
dimension reduction.
* Namely, why not use the concept underlying
the Shepard plot as our criterion for
ordination?
MDS
* Choose a number of dimensions (k) that you want for the configuration.
* Place the points into that space so that the Euclidean distances in the configuration (of reduced dimension) replicate the actual underlying dissimilarities as well as possible.
Metric vs Non-metric MDS
- The criteria used for metric vs non-metric MDS are different:
- Metric MDS: the relationship between the new Euclidean distances and the original dissimilarities being modeled is linear.
- Non-metric MDS: the relationship between the new Euclidean distances and the original dissimilarities being modeled is monotonic (i.e., we preserve the ranks of dissimilarities).
Non-metric MDS
- Considers absolute dissimilarities to be arbitrary; the only thing of interest is relative differences.
- Constructs a “map” or configuration of the points which imitates the higher-dimensional “map”.
- The goal is to order the points relative to one another in the same order as the original dissimilarities.
- We want to do this as well as possible in a small number of dimensions (the no. of dimensions is chosen a priori).
- Non-metric MDS is more robust than either PCO (classical scaling) or metric MDS for uncovering high-dimensional patterns, so it will be our focus here.
least squares monotone regression
The MDS technique finds the least-squares monotone regression of the fitted distances (d̂_rs) on the original dissimilarities (d_rs) while minimizing stress, ensuring that the rank order of the dissimilarities is preserved as much as possible in the low-dimensional space.
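The monotone-regression step can be sketched with scikit-learn’s `IsotonicRegression`, which implements exactly this least-squares fit under a monotonicity constraint. The distance values below are made up; in real nMDS this fit is repeated inside the stress-minimization loop.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

d_orig = np.array([0.1, 0.3, 0.4, 0.7, 0.9])  # original dissimilarities
d_conf = np.array([0.2, 0.5, 0.4, 0.6, 1.0])  # configuration distances

# Least-squares monotone (non-decreasing) fit of d_conf ordered by d_orig
iso = IsotonicRegression()
d_hat = iso.fit_transform(d_orig, d_conf)

# The out-of-order pair (0.5, 0.4) is pooled to its mean, 0.45
print(d_hat)  # [0.2, 0.45, 0.45, 0.6, 1.0]
```

The fitted d̂ values are as close as possible to the configuration distances while never decreasing as the original dissimilarities increase, which is the monotonic relationship nMDS preserves.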
stress in MDS
Stress is a measure of how much the distances between the points in the low-dimensional space differ from the original distances in the high-dimensional space. Minimizing stress means finding the configuration of points that best represents the original distances.
- STRESS is a measure of how good the fit is between the original distances and the new distances on the plot.
- The lower the stress, the better the MDS plot reflects the real distance structure in the data.
- A published MDS plot should always state a value for stress.