Chapter 15 Linear Discriminant Analysis Flashcards
WHEN DO WE USE LDA? P71
Logistic regression is a classification algorithm traditionally limited to two-class classification problems. If you have more than two classes then Linear Discriminant Analysis is the preferred linear classification technique. But even with binary classification problems, it is a good idea to try both logistic regression and linear discriminant analysis.
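A minimal sketch of trying both models side by side with scikit-learn; the synthetic dataset and cross-validation settings are arbitrary choices, not part of the original card.

# Compare logistic regression and LDA on a synthetic binary problem.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = make_classification(n_samples=500, n_features=10, random_state=1)

for name, model in [("LogisticRegression", LogisticRegression(max_iter=1000)),
                    ("LDA", LinearDiscriminantAnalysis())]:
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f}")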
WHAT ARE THE LIMITATIONS OF LOGISTIC REGRESSION? P71
Two-Class Problems. Logistic regression is intended for two-class or binary classification problems. It can be extended for multi-class classification, but is rarely used for this purpose.
Unstable With Well Separated Classes. Logistic regression can become unstable when the classes are well separated (a rough illustration follows this list).
Unstable With Few Examples. Logistic regression can become unstable when there are few examples from which to estimate the parameters.
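A rough illustration of the well-separated-classes point, assuming scikit-learn; the data here is made up, and a very weak penalty (large C) is used to expose the effect.

# With perfectly separated classes and very weak regularization, logistic
# regression coefficients tend to grow very large, while LDA stays stable.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.RandomState(0)
X0 = rng.normal(loc=-5.0, size=(20, 2))   # class 0, far from class 1
X1 = rng.normal(loc=+5.0, size=(20, 2))   # class 1
X = np.vstack([X0, X1])
y = np.array([0] * 20 + [1] * 20)

logreg = LogisticRegression(C=1e6, max_iter=10000).fit(X, y)  # weak penalty
lda = LinearDiscriminantAnalysis().fit(X, y)

print("logistic coefficients:", logreg.coef_)   # typically very large
print("LDA coefficients:     ", lda.coef_)      # modest, stable values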
WHAT ARE THE ASSUMPTIONS OF LDA? P72
That your data is Gaussian: each variable is shaped like a bell curve when plotted.
That each attribute has the same variance: the values of each variable vary around the mean by the same amount on average.
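A quick, informal way to eyeball both assumptions on your own data; the random matrix below is only a stand-in for your feature matrix.

import numpy as np
from scipy.stats import skew

X = np.random.RandomState(0).randn(200, 3)  # stand-in for your features

# Roughly Gaussian? Skewness near 0 suggests a bell-shaped distribution.
print("skewness per variable:", skew(X, axis=0))

# Similar variance? Large ratios between columns hint that standardization
# or a transform is needed before LDA.
print("variance per variable:", X.var(axis=0))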
HOW CAN WE PREPARE DATA FOR LDA? P73
Classification Problems. This might go without saying, but LDA is intended for classification problems where the output variable is categorical. LDA supports both binary and multiclass classification.
Gaussian Distribution. The standard implementation of the model assumes a Gaussian distribution of the input variables. Consider reviewing the univariate distributions of each attribute and using transforms to make them more Gaussian-looking (e.g. log and root for exponential distributions and Box-Cox for skewed distributions).
Remove Outliers. Consider removing outliers from your data. These can skew the basic statistics used to separate classes in LDA such as the mean and the standard deviation.
Same Variance. LDA assumes that each input variable has the same variance. It’s almost always a good idea to standardize your data before using LDA so that it has a mean of 0 and a standard deviation of 1 (a preparation pipeline is sketched after this list).
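A minimal sketch of these preparation steps as a scikit-learn pipeline: transform the inputs to look more Gaussian, standardize to zero mean and unit variance, then fit LDA. The dataset is synthetic and the parameter choices are assumptions.

from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PowerTransformer, StandardScaler
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=8, random_state=7)

# Yeo-Johnson handles zero/negative values; use method="box-cox" only when
# every input value is strictly positive.
model = make_pipeline(
    PowerTransformer(method="yeo-johnson", standardize=False),
    StandardScaler(),
    LinearDiscriminantAnalysis(),
)
print(cross_val_score(model, X, y, cv=5).mean())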
WHAT ARE THE NAMES OF THE EXTENSIONS TO THE LDA MODEL? P73
Quadratic Discriminant Analysis: Each class uses its own estimate of variance (or covariance when there are multiple input variables).
Flexible Discriminant Analysis: Where nonlinear combinations of the inputs are used, such as splines.
Regularized Discriminant Analysis: Introduces regularization into the estimate of the variance (or covariance), moderating the influence of different variables on LDA.
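A sketch of two of these variants in scikit-learn; Flexible Discriminant Analysis has no direct scikit-learn equivalent, so it is omitted, and the shrinkage setting below is only one way to approximate a regularized discriminant.

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)

X, y = make_classification(n_samples=400, n_features=6, random_state=3)

# QDA: a per-class covariance estimate.
qda = QuadraticDiscriminantAnalysis()

# Regularized LDA: shrink the covariance estimate (needs the lsqr or eigen solver).
rda_like = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")

for name, model in [("QDA", qda), ("shrinkage LDA", rda_like)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())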
What are the similarities and differences between LDA and PCA?
External
- Both LDA and PCA are linear transformations used for dimensionality reduction: they project the data onto new axes that capture as much variance as possible in fewer dimensions.
- Both methods rank the new axes in order of importance (PC1 accounts for the most variation in the data; LD1 accounts for the most variation between the categories).
- Unlike PCA, LDA uses the class labels: it finds the linear discriminants that maximize the variance between the different categories while minimizing the variance within each category (it focuses on maximizing the separability among known classes). PCA, by contrast, ignores the labels and transforms the data into a new coordinate system in which most of the variation can be described with fewer dimensions than the original data.
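A minimal sketch contrasting the two projections on the iris dataset (the dataset choice is an assumption): PCA is fit without labels, LDA uses them.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

X_pca = PCA(n_components=2).fit_transform(X)                            # labels not used
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)  # labels used

print("PCA projection shape:", X_pca.shape)
print("LDA projection shape:", X_lda.shape)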
When should we use LDA for dimensionality reduction?
External
We should use LDA for dimensionality reduction when the objective is to preserve class separability in the lower-dimensional space.
If we have 3 classes and 18 features, LDA will reduce from 18 features to only ____ features
External
2 (LDA produces at most one fewer discriminant components than the number of classes, so with 3 classes the maximum is 3 - 1 = 2).
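A sketch confirming the limit on a synthetic 18-feature, 3-class dataset (the dataset itself is made up for illustration).

from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = make_classification(n_samples=300, n_features=18, n_informative=6,
                           n_classes=3, random_state=42)

# With 3 classes, n_components can be at most 3 - 1 = 2.
X_reduced = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)
print(X_reduced.shape)  # (300, 2)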
Can we use LDA for a dataset with categorical features? Explain
External
Not directly: LDA works on continuous input variables. If the classification task includes categorical input variables, the equivalent technique is called discriminant correspondence analysis.