Factor Analysis Flashcards
The amount of variance shared by each VARIABLE with the others is technically called
communality
The amount of variance accounted for by each FACTOR is technically represented by the mathematical concept of an
EIGENVALUE
SPSS will automatically extract those factors with variance greater than ??? if no specific criterion is specified.
1
This criterion makes sense if we remember that all the input variables should be ???? to ensure a mean = 0 and a standard deviation = 1.
Standardized
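A minimal numpy sketch of that standardization step (the data matrix X is made up purely for illustration):

    import numpy as np

    # Hypothetical data matrix: rows = cases, columns = metric input variables.
    X = np.array([[2.0, 30.0, 1.5],
                  [4.0, 45.0, 2.5],
                  [6.0, 60.0, 3.5],
                  [8.0, 75.0, 4.5]])

    # z-score standardization: subtract each column mean, divide by each column SD.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)

    print(Z.mean(axis=0))  # approximately 0 for every variable
    print(Z.std(axis=0))   # 1 for every variable
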
Once the factors are extracted, we will interpret their meaning by exploring the ??? matrix.
Component
Coefficients in this matrix can be understood as ??? coefficients between Factors and Input Variables.
Correlation
If the interpretation is not straightforward, we can use a factor ??? to better interpret Factors.
Rotation
Among the different types available, Varimax keeps the property of ??? so the new factors will still be independent.
Orthogonality
Once the interpretation is clear, we can save factor values, technically called factor ??? in our dataset.
Scores
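As a rough end-to-end sketch of that workflow outside SPSS (standardize, extract, rotate with Varimax, inspect the loadings, save the scores), here is a Python version using scikit-learn. Note that scikit-learn's FactorAnalysis uses maximum-likelihood extraction rather than PCA, and the data frame below is invented for illustration:

    import pandas as pd
    from sklearn.decomposition import FactorAnalysis
    from sklearn.preprocessing import StandardScaler

    # Hypothetical questionnaire items; in practice this is your own dataset.
    df = pd.DataFrame({
        "item1": [1, 2, 3, 4, 5, 6, 7, 8],
        "item2": [2, 2, 3, 5, 5, 6, 8, 7],
        "item3": [8, 7, 6, 5, 4, 3, 2, 1],
        "item4": [7, 8, 5, 6, 3, 4, 1, 2],
    })

    # Standardize the inputs, extract 2 factors, apply a Varimax rotation.
    Z = StandardScaler().fit_transform(df)
    fa = FactorAnalysis(n_components=2, rotation="varimax")
    scores = fa.fit_transform(Z)            # factor scores, one column per factor

    # Loadings (variables x factors): correlation-like coefficients used to
    # interpret the meaning of each factor.
    loadings = pd.DataFrame(fa.components_.T, index=df.columns,
                            columns=["Factor1", "Factor2"])
    print(loadings)

    # Save the factor scores back into the dataset as new variables.
    df[["Factor1_score", "Factor2_score"]] = scores
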
The set of input variables for a FACTOR ANALYSIS should exhibit a low degree of redundancy
FALSE. Redundancy is, in fact, essential for a good factor analysis.
Orthogonality of factors is not necessarily a realistic assumption
TRUE. Sometimes, independence (orthogonality) is not realistic, and some degree of correlation / overlap between factors is more credible.
If input variables were orthogonal, then a factor analysis would not make sense
TRUE. If the original variables are orthogonal, that means there is no correlation between them and thus no room for common factors.
Standard/Classic factor analysis is suitable mainly for metric variables (not categorical)
YES. Although some technical variants exist for categorical factor analysis, standard PCA/Factor analysis is only prepared to process metric-scale variables.
In a good factor analysis, we expect to get as many factors as possible
FALSE. A main goal of factor analysis is dimensionality reduction, so the fewer the factors (without a great loss of information), the better.
When there exists an underlying factor, being unique and common, we will get a high eigenvalue for the first factor
TRUE. A high value means a factor accounting for a high proportion of variance.
VARIMAX, QUARTIMAX and EQUAMAX are orthogonal rotations.
TRUE. You can find it in the class document. All of them keep orthogonality of initial factors after rotation.
The proportion of variance accounted for by each factor / component equals its eigenvalue divided by the number of variables
TRUE. The eigenvalue is the variance (information) of each factor. Given that the variables are z-scores, the total amount of variance equals N (the number of variables). Dividing the factor eigenvalue (the variance of the factor) by N, we get the proportion of variance captured by that factor.
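A small numpy sketch of that arithmetic (random data with some built-in redundancy, purely for illustration). The eigenvalues of the correlation matrix sum to N, so dividing each by N gives proportions that sum to 1:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    X[:, 1] += X[:, 0]                         # build in some redundancy
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardized variables

    R = np.corrcoef(Z, rowvar=False)           # 5 x 5 correlation matrix
    eigenvalues = np.linalg.eigvalsh(R)[::-1]  # sorted, largest first

    n_vars = Z.shape[1]
    proportion = eigenvalues / n_vars          # share of total variance per factor
    print(eigenvalues)
    print(proportion, proportion.sum())        # proportions sum to 1
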
Using PCA as factor extraction, we will sometimes get oblique factors solutions
FALSE. PCA implies orthogonality between factors.
A low communality for a given input variable may suggest that we can remove that variable from the factor analysis
TRUE. A low communality implies that the variable does not share common factors with others.
Factor analysis can be used to get a metric measurement of an unobservable feature
TRUE. If we are able to measure some variables that are somehow related to an underlying unobservable feature, FACTOR analysis can extract a metric representation (measurement) of that unobservable feature.
Standardized versions of input variables will give all the input variables the same importance / weight in our factor analysis.
TRUE. Otherwise, we risk weighting some variables more than others (although this is not the case for PCA, it could be a risk when using other extraction methods).
We could use a FACTOR analysis to test if different items in a questionnaire are linked with the same latent/underlying concept
TRUE. As mentioned in class, FACTOR analysis can be used to test the consistency of a set of items in a list. Those exhibiting low communality with the others are presumably unconnected with the common underlying factor.
We could use a cluster membership group variable from a CLUSTER analysis as an input field in a FACTOR analysis
FALSE. That cluster membership would be a categorical variable and FACTOR is mainly about metric input variables.
Factor analysis is a technique normally used to remove variables that are redundant
FALSE. In fact, FACTOR analysis is based on existing redundancy between variables.
Is a SUPERVISED analysis technique
FALSE. This is not about forecasting / predictive analytics so we don’t have any target.
Can be used to create a COMPOSITE INDEX
TRUE. This is, in fact, one of the main usages of FACTOR analysis
Is a CLASSIFICATION method
FALSE. Classification methods are predictive techniques for categorical target variables; factor analysis is not a predictive technique.
Is something mandatory before a segmentation/cluster analysis
FALSE. Both techniques may complement each other in certain exercises, but they are not formally connected.
It is normally used to explore common underlying factors between categorical variables.
FALSE. It only works properly for metric variables.
Requires a set of variables exhibiting a high degree of multiple correlation
TRUE. A high correlation between variables is a good starting point for a factor analysis. Multiple correlation involving every variable is preferable to correlation limited to only a couple or a small subset of variables.
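One way to check that requirement numerically (a sketch with simulated data): the squared multiple correlation of each variable with all the others can be read off the diagonal of the inverse correlation matrix.

    import numpy as np

    rng = np.random.default_rng(1)
    common = rng.normal(size=200)
    # Five hypothetical variables driven by one common factor plus noise.
    X = np.column_stack([common + 0.5 * rng.normal(size=200) for _ in range(5)])

    R = np.corrcoef(X, rowvar=False)
    # Squared multiple correlation (SMC) of each variable with all the others.
    smc = 1 - 1 / np.diag(np.linalg.inv(R))
    print(smc)   # relatively high values signal the redundancy factor analysis needs
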
Is commonly used as a dimensionality reduction technique
TRUE. This is one of the main goals: to compress the information of several variables into a limited number of factors.
Always provides a clear interpretation of every factor extracted.
FALSE. The hardest part of a factor analysis is usually the interpretation of factors.
Produces some new variables (factors) that can be saved as new variables into the dataset
TRUE. The scores of the factors can be saved as new variables. This is, in fact, the output of every factor analysis: factor scores.
Rotation is used to improve factor interpretation
TRUE. This is the goal of rotation.
Varimax rotation can be used to increase communality
FALSE. Communality will remain the same after VARIMAX rotation
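A quick numpy check of that invariance. The loadings are made up, and since Varimax just produces one particular orthogonal matrix, a generic orthogonal rotation stands in for it here:

    import numpy as np

    # Hypothetical unrotated loadings: 4 variables x 2 factors.
    L = np.array([[0.80, 0.30],
                  [0.75, 0.25],
                  [0.20, 0.70],
                  [0.10, 0.85]])

    # Communality of each variable = sum of its squared loadings.
    communality_before = (L ** 2).sum(axis=1)

    # Any orthogonal rotation leaves the communalities untouched.
    theta = np.deg2rad(30)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    L_rotated = L @ R

    communality_after = (L_rotated ** 2).sum(axis=1)
    print(np.allclose(communality_before, communality_after))   # True
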
There are several techniques to produce rotation
TRUE. There are ORTHOGONAL and OBLIQUE rotation methods and also different algorithms for both types.
Orthogonal rotation is normally more realistic (it is realistic to assume independence between factors)
FALSE. Normally, it is difficult to find uncorrelated FACTORS in the real world.
Oblique rotation means a between-factors angle of 90 degrees
FALSE. It is the contrary: an oblique rotation allows non-orthogonal factors (an angle different from 90 degrees).
Oblique is more realistic because normally there is always HIGH correlation between VARIABLES
FALSE. Orthogonality or obliquity refers to the relationship between FACTORS, not between variables. A relationship between variables is a MUST, and it is not strictly tied to the relationship between factors.
Rotation does change the factor SCORES
TRUE. When we rotate, the factors are computed from a different combination of the variables, and that means different scores.
Related to information not explained by the Factors extracted
Specificity
Specific Method to extract Factors
Principal Components
Displays relationship between variables and factors
Component plot
Shows factor eigenvalues
Scree plot
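A minimal matplotlib sketch of a scree plot (the eigenvalues are invented; in practice take them from the extraction output):

    import numpy as np
    import matplotlib.pyplot as plt

    eigenvalues = np.array([2.9, 1.3, 0.4, 0.25, 0.15])   # hypothetical values

    plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, marker="o")
    plt.axhline(1, linestyle="--")          # the "eigenvalue > 1" retention rule
    plt.xlabel("Component number")
    plt.ylabel("Eigenvalue")
    plt.title("Scree plot")
    plt.show()
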
Independence / No correlation
Orthogonality
Measure that indicates amount of information conveyed by a factor (variance of factor)
Eigenvalue
Method for Factor Rotation
Oblimin
If the communality shown by a VARIABLE is very LOW we could normally discard this variable and re-run the analysis
TRUE. A low communality means a low degree of shared variance (common factors) with the other variables.
Each INPUT VARIABLE should exhibit high correlation with the others (at least with one other input variable and, ideally, with all the rest)
TRUE. This would be the minimum requirement for a variable to be part of a factor analysis. Sharing something in common with at least one other variable ensures that a common factor (at least for these two variables) is feasible.
By default, SPSS will retain factors with eigenvalues HIGHER than ONE
TRUE. Factors with an amount of variance higher than 1.
If wanted, we can extract as many factors as input variables
TRUE. Even though it would not make any sense (there would be no dimensionality reduction).
The factor SCORES will normally present positive and negative values
TRUE. This is because factor scores are standardized (z-scores), so a negative value means a score below the mean and a positive value means a score above the mean.
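A quick sketch of that behaviour with simulated data. Scikit-learn's factor scores are centred on zero, so negative and positive values appear on both sides of the mean (the exact standard deviation depends on the scoring method):

    import numpy as np
    from sklearn.decomposition import FactorAnalysis
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(2)
    X = rng.normal(size=(100, 4))
    X[:, 1] += X[:, 0]                      # inject some shared variance

    Z = StandardScaler().fit_transform(X)
    scores = FactorAnalysis(n_components=1).fit_transform(Z)

    print(scores.mean())                             # approximately 0
    print((scores < 0).sum(), (scores > 0).sum())    # scores below and above the mean
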
Using PCA extraction method, FACTORS will be always orthogonal if no rotation is applied
TRUE. PCA implies orthogonal factors.
The SIGN of the coefficients in the component / structure matrix is NOT very interesting since variables are normally standardized
FALSE. The sign of the coefficients / loadings indicates a positive or negative correlation between factors and variables.
The higher the coefficients for a variable on a factor (component / structure matrix), the higher the correlation between the factor and that variable
TRUE. The coefficients / loadings represent correlation between factors and variables.
Prior to rotation, a variable may exhibit high correlation with more than one factor at the same time
TRUE. This is normally what we try to avoid by running a rotation. Nevertheless, be aware that, even so, rotation does not guarantee a perfectly clean (simple) structure in the rotated component matrix.
Factors are always extracted and presented in decreasing order of information (explained variance)
TRUE. In the output table, factors are ordered by the amount of variance they explain (their eigenvalues).