Factor Analysis Flashcards
1
Q
What is Factor Analysis?
A
-
Factor Analysis is a family of techniques used to investigate the underlying structure of variables.
- Determining whether many variables can be reduced to fewer, higher order (latent) variables (called factors)
- Groups together highly correlated variables
-
Three main types:
- Principal Components Analysis (PCA); a data reduction technique that does not extract factors. All items are given the same regression weight, and it creates components (not factors).
- Exploratory Factor Analysis (EFA); extracts factors predicting the latent construct (items which covary together stay together), creates factors
- Confirmatory Factor Analysis (CFA); tests a factor structure specified in advance, used when you already have a theory
2
Q
What are the differences between EFA and CFA?
A
3
Q
What are the three main uses of factor analysis according to Field?
A
- To understand the structure of a set of variables; such as investigating intelligence, wellbeing, personality and other complex constructs
- To develop a questionnaire to measure a variable; Factor analysis is often used to test the validity and internal structure of questionnaires
- To reduce a data set to a more manageable size while retaining the data set’s essential qualities; Especially when there are issues with multicollinearity, highly correlated variables can be condensed into factors
4
Q
According to Fabrigar et al., what 5 decisions need to be made when using factor analysis?
A
- Study design and which variables are to be measured: consider the nature and number of common factors to be examined, and ensure each of those factors is represented by multiple measured variables
- Determine whether EFA is appropriate; goal of EFA is to produce a more parsimonious model of factors
- Choice of model fitting procedure (which factor extraction procedure to use)
- Numbers of factors: balancing parsimony with ability of model to account for correlations between variables (plausibility)
- Rotation Methods: whether to allow for correlations between factors
5
Q
What are three ways of viewing relationships in factor analysis?
A
- R-Matrix; a correlation table used to eyeball the data (look for groups of high correlations)
-
Plot; Present the proposed factors as axes on a graph, then plot the correlation of each variable with each factor to reveal clusters.
- Factor/Component Loading = coordinate of a variable along a classification axis (given by the Pearson correlation between the variable and the factor)
-
Mathematically; A matrix representation of the factor loadings
- Column per factor, Row per variable
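The R-matrix and the loading matrix (column per factor, row per variable) can be sketched in Python with NumPy. This is a minimal illustration, not a full factor analysis: the simulated data set, the variable names and the use of principal components as the extraction step are all assumptions made for the example.

```python
import numpy as np

# Simulated data (an assumption for illustration): 6 variables driven by 2 latent factors.
rng = np.random.default_rng(0)
n = 500
true_loadings = np.array([[.8, 0], [.7, 0], [.6, 0],
                          [0, .8], [0, .7], [0, .6]])
factors = rng.standard_normal((n, 2))
unique_sd = np.sqrt(1 - (true_loadings ** 2).sum(axis=1))
X = factors @ true_loadings.T + rng.standard_normal((n, 6)) * unique_sd

# R-matrix: Pearson correlations between every pair of variables.
R = np.corrcoef(X, rowvar=False)

# Component loadings: each eigenvector scaled by the square root of its eigenvalue.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]                  # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
loadings = eigvecs[:, :2] * np.sqrt(eigvals[:2])   # row per variable, column per component

# Cross-check: a loading is the Pearson correlation between a variable and a component score.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
component_scores = Z @ eigvecs[:, :2]
check = np.array([[np.corrcoef(X[:, i], component_scores[:, j])[0, 1]
                   for j in range(2)] for i in range(6)])

print(np.round(R, 2))         # eyeball for clusters of high correlations
print(np.round(loadings, 2))  # the loading matrix
```

Plotting the two columns of `loadings` against each other gives the graphical view described in this card.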
6
Q
What are the four methods of combining factor scores?
A
-
Weighted Average; rarely used because it is too simplistic.
- Uses a linear model formula, sub in the individual’s scores
- Cannot compare results across different measurements
-
Regression Method; more sophisticated but there are limits imposed on how scores can relate to each other
- Generates coefficients adjusted for initial correlations between variables
- Recommended for most circumstances
- Bartlett Method; overcomes limits of regression by producing unbiased scores. Factor scores can still correlate with each other though.
-
Anderson-Rubin Method; modification of Bartlett method which produces uncorrelated and standardised factor scores
- recommended when requiring standardised scores
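The first two methods can be sketched as follows. This is a hedged illustration: the simulated data are hypothetical, and principal component loadings stand in for factor loadings, so the regression scores here happen to be uncorrelated, which is not generally true after a real factor extraction.

```python
import numpy as np

# Simulated data (an assumption for illustration): 6 variables, 2 latent factors.
rng = np.random.default_rng(1)
n = 300
true_loadings = np.array([[.8, 0], [.7, 0], [.6, 0],
                          [0, .8], [0, .7], [0, .6]])
unique_sd = np.sqrt(1 - (true_loadings ** 2).sum(axis=1))
X = rng.standard_normal((n, 2)) @ true_loadings.T + rng.standard_normal((n, 6)) * unique_sd

R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1][:2]
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])   # stand-in for factor loadings

# Weighted average: plug each person's raw scores into a linear model
# whose weights are the loadings. The result depends on the measurement scale.
weighted_scores = X @ loadings

# Regression method: adjust the loadings for the correlations between
# variables (B = R^-1 * loadings), then apply B to standardised scores.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
B = np.linalg.solve(R, loadings)        # factor score coefficients
regression_scores = Z @ B

print(np.round(B, 3))
print(np.round(regression_scores[:5], 2))
```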
7
Q
What is Communality?
A
-
Communality is the proportion of common variance present in a variable
- Impasse: FA requires knowledge of communality, but you need to conduct FA to discover communality.
-
Determining Shared Variance:
-
PCA: Assume a value of 1 (no unique variance) for all variables
- Problem; assumes no measurement error
-
Estimating Communality; many methods available, e.g.
- Squared Multiple Correlation (SMC); run a multiple regression using one measure as the outcome and all the others as predictors. R² is used as the communality estimate for that variable
-
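The SMC estimate can be sketched in Python with NumPy. The simulated data are an assumption for illustration. The shortcut used (one minus the reciprocal of the diagonal of the inverted R-matrix) is a standard identity, and the explicit regression is included only to show that it gives the same R².

```python
import numpy as np

# Simulated data (an assumption for illustration): 6 variables, 2 latent factors.
rng = np.random.default_rng(2)
n = 400
true_loadings = np.array([[.8, 0], [.7, 0], [.6, 0],
                          [0, .8], [0, .7], [0, .6]])
unique_sd = np.sqrt(1 - (true_loadings ** 2).sum(axis=1))
X = rng.standard_normal((n, 2)) @ true_loadings.T + rng.standard_normal((n, 6)) * unique_sd

R = np.corrcoef(X, rowvar=False)

# SMC for every variable at once: 1 - 1 / diag(R^-1).
smc = 1 - 1 / np.diag(np.linalg.inv(R))

# The same quantity for the first variable via an explicit multiple regression:
# outcome = variable 0, predictors = all other variables plus an intercept.
y = X[:, 0]
predictors = np.column_stack([np.ones(n), X[:, 1:]])
beta, *_ = np.linalg.lstsq(predictors, y, rcond=None)
residuals = y - predictors @ beta
r_squared = 1 - residuals.var() / y.var()

print(np.round(smc, 3))       # initial communality estimates, one per variable
print(round(r_squared, 3))    # matches smc[0]
```

PCA would instead start from a communality of 1 for every variable.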
8
Q
What are some different ways of extracting factors?
A
- Eigenvalues; represent the amount of variance accounted for by a factor (dividing by the number of variables gives the proportion)
-
Kaiser’s Criterion (Little Jiffy): retain all factors with eigenvalues > 1
- Overly simplistic; tends to over- or under-extract
- Jolliffe recommended a more lenient cutoff of .7
-
Cattell’s Scree Plot: eigenvalues on the y axis, factor number on the x axis; look for the ‘elbow’ of the curve
- Very subjective, up to the researcher’s discretion
-
Monte Carlo Parallel Analysis; similar process to bootstrapping. Compares the observed eigenvalues to the average eigenvalues of many simulated random data sets. Retain factors whose eigenvalues exceed those expected by chance
- Underutilised since requires SPSS syntax
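Kaiser’s criterion and parallel analysis can be sketched without SPSS syntax. This is a simplified illustration under stated assumptions: the data are simulated, the function names are hypothetical, and the comparison uses the mean of the simulated eigenvalues (some implementations use the 95th percentile instead).

```python
import numpy as np

def sorted_eigenvalues(data):
    """Eigenvalues of the correlation matrix, largest first."""
    R = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(R))[::-1]

def parallel_analysis(data, n_sims=200, seed=0):
    """Compare observed eigenvalues with the average from random normal data."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = sorted_eigenvalues(data)
    simulated = np.array([sorted_eigenvalues(rng.standard_normal((n, p)))
                          for _ in range(n_sims)])
    expected = simulated.mean(axis=0)
    exceeds = observed > expected
    # Count leading factors only: stop at the first one that fails to exceed chance.
    n_retain = p if exceeds.all() else int(np.argmin(exceeds))
    return observed, expected, n_retain

# Simulated data (an assumption for illustration): 6 variables, 2 latent factors.
rng = np.random.default_rng(3)
n = 500
true_loadings = np.array([[.8, 0], [.7, 0], [.6, 0],
                          [0, .8], [0, .7], [0, .6]])
unique_sd = np.sqrt(1 - (true_loadings ** 2).sum(axis=1))
X = rng.standard_normal((n, 2)) @ true_loadings.T + rng.standard_normal((n, 6)) * unique_sd

observed, expected, n_retain = parallel_analysis(X)
kaiser_count = int((observed > 1).sum())       # Kaiser's criterion
jolliffe_count = int((observed > 0.7).sum())   # Jolliffe's more lenient cutoff

print(np.round(observed, 2))
print(np.round(expected, 2))
print(kaiser_count, jolliffe_count, n_retain)
```

Plotting `observed` against factor number gives the scree plot, and overlaying `expected` shows where the observed curve drops below chance.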
9
Q
What is rotation and what are the different types?
A
- Rotation is a technique which maximises an item’s loading on its primary factor while minimising loading on other factors
-
Orthogonal Rotation; assumes factors to be uncorrelated. Often counter-intuitive in psychology, where constructs are usually related.
- Varimax Rotation; Good starting point. Maximises dispersion to aid interpretation.
-
Oblique Rotation; Assumes factors to be correlated. All techniques produce similar results
- Direct Quartimin; direct rotation
- Direct Oblimin; direct rotation + predetermining degree of correlation
- Promax; faster, uses orthogonal up to a point
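A varimax rotation can be sketched with NumPy. This follows a widely used SVD-based iteration for the raw varimax criterion (no Kaiser normalisation). It is an illustrative sketch, not the routine used by SPSS, and the loading matrix is a made-up example.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix to maximise the varimax criterion."""
    p, k = loadings.shape
    T = np.eye(k)          # rotation matrix, starts as "no rotation"
    d = 0.0
    for _ in range(max_iter):
        L = loadings @ T
        grad = loadings.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(grad)
        T = U @ Vt         # closest orthogonal matrix to the gradient
        d_new = s.sum()
        if d_new < d * (1 + tol):   # stop when the criterion no longer improves
            break
        d = d_new
    return loadings @ T, T

def varimax_criterion(L):
    """Sum over factors of the variance (dispersion) of the squared loadings."""
    return (L ** 2).var(axis=0).sum()

# Hypothetical unrotated solution: a clean two-factor structure turned by 30 degrees,
# so every variable loads on both factors before rotation.
simple = np.array([[.8, 0], [.7, 0], [.6, 0],
                   [0, .8], [0, .7], [0, .6]])
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ mix

rotated, T = varimax(unrotated)
print(np.round(unrotated, 2))
print(np.round(rotated, 2))
```

Because the rotation matrix is orthogonal, each variable’s communality (row sum of squared loadings) is unchanged; only the way the variance is spread across factors changes. Oblique methods relax the orthogonality constraint and are not shown here.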
10
Q
What assumptions need to be met when conducting factor analysis?
A
-
Data Cleaning; inaccurate entries and missing data must be identified and dealt with.
- Missing data; listwise deletion is impractical when participants are answering many (e.g. 120) questions, because too many cases would be lost. Remember to check for patterns in the missing data
- Normality; Not strictly required but enhances factor structure
- Linearity; Need to examine scatterplots for each pair of variables. Can be done more quickly using a scatterplot matrix.
-
Correlations; Create a correlation matrix for all the variables
- Too Low: Consider excluding variables with several low (< .3) correlations
- Bartlett’s test of sphericity; should be significant (otherwise the R-matrix is essentially an identity matrix = disaster)
- Too High; multicollinearity
- Determinant of R-Matrix should be > .00001
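The correlation checks above can be sketched with NumPy. The correlation matrix and the sample size of 500 are hypothetical values chosen for illustration. The statistic is the usual chi-square approximation for Bartlett’s test of sphericity; obtaining a p-value would need a chi-square distribution function, which is left out to keep the sketch dependency-free.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for Bartlett's test of sphericity."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)               # log of the determinant of R
    statistic = -((n - 1) - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return statistic, df

# Hypothetical R-matrix implied by a two-factor model with loadings .8, .7, .6 per factor.
loadings = np.array([[.8, 0], [.7, 0], [.6, 0],
                     [0, .8], [0, .7], [0, .6]])
R = loadings @ loadings.T
np.fill_diagonal(R, 1.0)
n = 500                                            # assumed sample size

determinant = np.linalg.det(R)
statistic, df = bartlett_sphericity(R, n)

# Variables with several correlations below .3 (ignoring the diagonal).
low_counts = ((np.abs(R) < .3) & ~np.eye(6, dtype=bool)).sum(axis=1)

print(determinant > 0.00001)    # multicollinearity check
print(round(statistic, 1), df)  # compare with the .05 critical value for df = 15, about 25.0
print(low_counts)

# An identity matrix gives a statistic of zero: nothing to factor.
identity_stat, _ = bartlett_sphericity(np.eye(6), n)
```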
11
Q
What sample size is required for factor analysis?
A
-
Sample Size; Aim for 300 - 1000+ participants
- Bare minimum is 5 participants per variable, more common recommendation is 10 - 15 per variable
- Sample size can be less for factors with stronger loadings; if a factor has 4+ loadings > .6 it is reliable regardless of sample size
-
Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy
- Measures ratio of squared correlations between variables to squared partial correlations
- Values below .5 = unacceptable, values > .9 = marvellous
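The overall KMO statistic can be sketched with NumPy as the sum of squared correlations divided by the sum of squared correlations plus squared partial correlations. The R-matrix below is a hypothetical example, and the function name is an assumption. Partial correlations are taken from the inverse of the R-matrix.

```python
import numpy as np

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(scale, scale)        # partial correlations
    off_diagonal = ~np.eye(R.shape[0], dtype=bool)
    sum_r2 = (R[off_diagonal] ** 2).sum()          # squared correlations
    sum_p2 = (partial[off_diagonal] ** 2).sum()    # squared partial correlations
    return sum_r2 / (sum_r2 + sum_p2)

# Hypothetical R-matrix implied by a two-factor model with loadings .8, .7, .6 per factor.
loadings = np.array([[.8, 0], [.7, 0], [.6, 0],
                     [0, .8], [0, .7], [0, .6]])
R = loadings @ loadings.T
np.fill_diagonal(R, 1.0)

kmo_value = kmo(R)
print(round(kmo_value, 3))
```

Values close to 1 mean the partial correlations are small relative to the raw correlations, so the correlation patterns are compact and factoring should give distinct, reliable factors.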
12
Q
How is reliability assessed in factor analysis?
A
- Internal consistency is measured using Cronbach’s alpha (conceptually, the average of all possible split-half reliabilities)
-
Cronbach’s α; the most commonly reported statistic, with values above .8 generally considered reliable. But:
- α is affected by the number of items (longer scales give higher values)
- α is affected by reverse phrasing (reverse-score such items first)
- α is not a measure of uni-dimensionality
- Assumes uncorrelated errors, multivariate normality, etc.
-
SPSS: check the overall α, the Corrected Item-Total Correlation column, and the Cronbach’s Alpha if Item Deleted column.
- Run separate analyses for each subscale
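Cronbach’s α and the “if item deleted” values can be sketched with NumPy using the standard variance-based formula. The simulated subscale (four items sharing one factor plus one pure-noise item) and the function names are assumptions for illustration.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_participants, n_items) array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the scale total
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

def alpha_if_item_deleted(items):
    """Alpha recomputed with each item removed in turn."""
    items = np.asarray(items, dtype=float)
    return [cronbach_alpha(np.delete(items, j, axis=1)) for j in range(items.shape[1])]

# Simulated subscale (an assumption): items 0-3 load .7 on one factor, item 4 is pure noise.
rng = np.random.default_rng(4)
n = 400
factor = rng.standard_normal(n)
good_items = .7 * factor[:, None] + np.sqrt(1 - .49) * rng.standard_normal((n, 4))
noise_item = rng.standard_normal((n, 1))
scale = np.hstack([good_items, noise_item])

alpha = cronbach_alpha(scale)
deleted = alpha_if_item_deleted(scale)

print(round(alpha, 3))
print(np.round(deleted, 3))   # dropping the noise item (last value) raises alpha
```

As the card notes, this should be run separately for each subscale, after reverse-scoring any reverse-phrased items.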