Factor analysis Flashcards
Scale development phases:
· Phases of investigating a new construct using scales (summary by Flake et al., 2017)
1. Substantive phase
- Construct conceptualisation and literature review
- Generating items
2. Structural phase
- Item analysis
- Determining dimensionality
- Reliability
3. External phase
- Convergent and discriminant validity
Latent variables - measuring the immeasurable:
· Psychological constructs are abstract: we can’t measure them directly
· Global self-esteem is the construct of interest, also called a latent variable or a factor
· The scale items are directly measured (i.e., directly observed) so these are observed variables
Factor analysis allows us to determine dimensionality:
· Understand the structure of a scale (or set of variables)
- When building a new scale, we create a set of observed variables (i.e., items)
- We want to see whether these items capture one construct or several related constructs (or factors) in meaningful ways
Pandemic-related worry (Brown et al., 2022):
· Factor 1: Infection severity
- Item 20: I often worry that if I get COVID-19 I will not recover from it
- Item 22: I often worry that I will get hospitalised due to COVID-19
· Factor 2: Risk to loved ones
- Item 5: I often worry about my family members getting COVID-19
- Item 23: I often worry about whether my close friend(s) or family member(s) will be hospitalised due to COVID-19
· Factor 3: Financial concerns
- Item 8: I often worry about the impact of COVID-19 on my financial situation
- Item 7: Hearing about job losses in the media or through people I know makes me worry about my job security
Exploratory vs confirmatory factor analysis:
· Exploratory factor analysis (EFA)
- No predictions about the number of factors (but you may have some tentative predictions) or which items belong to what factors
- You can explore the data
- Remove poor performing items and re-run the factor analysis
· Confirmatory factor analysis (CFA)
- You have clear expectations of what the factor structure will look like.
- You want to confirm these predictions with your data; you cannot explore the data
What does factor analysis do?:
· Factor analysis is about making sense of shared variance between your variables (scale items)
· You might notice that some items correlate more strongly with each other than with the rest; this can be evidence that your scale is capturing more than one factor
Communalities in factor analysis:
· Communality - The proportion of variance that an item shares with other items
· In general, larger communalities are desirable
- For instance, we assume all items within a questionnaire share a healthy amount of variance
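Because a communality is the sum of an item’s squared factor loadings, it can be computed by hand. A minimal sketch in Python, using made-up loadings (not from a real analysis):

```python
# Hypothetical factor loadings for a single item on three extracted
# factors (illustrative numbers only)
loadings = [0.70, 0.25, 0.10]

# Communality: proportion of the item's variance shared with the other
# items, i.e. the sum of its squared factor loadings
communality = sum(l ** 2 for l in loadings)

# Uniqueness: the variance the item does NOT share with the others
uniqueness = 1 - communality

print(communality)  # ≈ 0.5625
```

The item shares about 56% of its variance with the other items; the remaining 44% is unique (specific variance plus error).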
Factor analysis estimation methods:
· Minimum residual estimation
· Maximum likelihood estimation
· Principal axis factoring
Steps for running/interpreting an EFA:
· Preliminary analyses
- Check inter-item correlations; Bartlett’s test of sphericity
- Test whether an analysis is appropriate (Kaiser-Meyer-Olkin test of sampling adequacy)
· Main analysis
- Factor extraction
- Model fit—how well does our model fit the data?
- Factor rotation and factor loadings—under what factor does each item fit best?
- Interpreting the factors—what do factors represent conceptually?
Inter-item correlations:
· Items must not correlate too weakly (usually below .30 or closer to 0)
- Items are expected to correlate with each other (they measure the same construct!)
- Low correlations indicate an item doesn’t fit with the other items: it measures something else!
· Items must not correlate too highly (usually around .90 or closer to 1)
- Several items measure the exact same thing
- No need to measure the same thing multiple times within a scale
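Both screening rules can be applied programmatically. A sketch, using a made-up 4-item correlation matrix and the .30/.90 cut-offs above:

```python
# Screening inter-item correlations before an EFA.
# The matrix below is invented for illustration (4 items).
items = ["item1", "item2", "item3", "item4"]
corr = [
    [1.00, 0.55, 0.48, 0.12],
    [0.55, 1.00, 0.95, 0.20],
    [0.48, 0.95, 1.00, 0.25],
    [0.12, 0.20, 0.25, 1.00],
]

too_weak, too_strong = [], []
for i in range(len(items)):
    for j in range(i + 1, len(items)):  # upper triangle only
        r = corr[i][j]
        if abs(r) < 0.30:        # item may measure something else
            too_weak.append((items[i], items[j], r))
        elif abs(r) > 0.90:      # items are redundant
            too_strong.append((items[i], items[j], r))

print(too_weak)    # item4 correlates weakly with everything
print(too_strong)  # item2 and item3 are near-duplicates
```

Here item4 would be a candidate for removal (it measures something else), and one of item2/item3 could be dropped as redundant.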
Bartlett’s test of sphericity:
· Tests whether the correlation matrix is significantly different from an identity matrix
- In other words: do the observed correlations differ from zero?
· Not a great test…
- Test is too extreme: correlations are often different from zero but still very small
- p-values are sensitive to sample size; the larger the sample, the more likely they are to be significant
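The sample-size sensitivity is easy to demonstrate. Bartlett’s statistic is χ² = −(n − 1 − (2p + 5)/6) · ln|R| with p(p − 1)/2 degrees of freedom, so the same weak correlations produce a much larger statistic at a larger n. A sketch (the 3-item matrix is invented):

```python
import math
import numpy as np

def bartlett_sphericity(corr, n):
    """Bartlett's test statistic for sphericity (sketch).

    corr: item correlation matrix (p x p); n: sample size.
    Tests whether corr differs from an identity matrix.
    """
    R = np.asarray(corr, dtype=float)
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6) * math.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi2, df

# Same weak correlations (.10), two sample sizes: the statistic, and
# hence significance, scales with n -- the criticism noted above
R = [[1.0, 0.1, 0.1],
     [0.1, 1.0, 0.1],
     [0.1, 0.1, 1.0]]
print(bartlett_sphericity(R, 100))
print(bartlett_sphericity(R, 1000))
```

With n = 100 the statistic is small; with n = 1000 it is roughly ten times larger, even though the correlations are identical and trivially weak.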
Testing sampling adequacy (KMO test):
· Determining sample size for an EFA is tricky. The number of participants required to run a factor analysis depends on communalities, and how “strong” the factors are (items strongly related to their factors, several items per factor)
· The Kaiser-Meyer-Olkin (KMO) test examines the pattern of correlations in your dataset to determine whether your sample is sufficient; it ranges from 0 = diffuse pattern (bad) to 1 = compact pattern (good)
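The overall KMO statistic compares squared correlations to squared partial (anti-image) correlations: a compact pattern means items correlate through shared factors rather than pairwise quirks. A minimal sketch, assuming a toy 3-item correlation matrix:

```python
import numpy as np

def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy (sketch).

    Compares item correlations to partial correlations; values near 1
    indicate a compact pattern suited to factor analysis.
    """
    R = np.asarray(corr, dtype=float)
    inv = np.linalg.inv(R)
    # Partial (anti-image) correlations from the inverse matrix
    d = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / d
    off = ~np.eye(R.shape[0], dtype=bool)  # off-diagonal mask
    r2 = (R[off] ** 2).sum()
    q2 = (partial[off] ** 2).sum()
    return r2 / (r2 + q2)

# Invented correlation matrix (illustrative values only)
R = [[1.00, 0.60, 0.50],
     [0.60, 1.00, 0.55],
     [0.50, 0.55, 1.00]]
print(kmo(R))
```

A common rule of thumb treats KMO values above roughly .70 as adequate, though exact cut-offs vary by source.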
Factor extraction:
· Factor extraction = deciding how many factors are captured by our items
- We want to explain as much variance as we can with as few factors as possible
· Extraction methods are based on eigenvalues.
- Eigenvalues are the amount of total variance explained by each factor
- Eigenvalues indicate the importance of a factor: bigger eigenvalue = more important factor
· Extraction methods start by generating the maximum number of factors (as many factors as we have items) and inspecting their eigenvalues
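This can be seen numerically: the eigenvalues of a correlation matrix sum to the number of items, so each eigenvalue is the share of total variance a candidate factor explains. A sketch with an invented matrix containing two item clusters:

```python
import numpy as np

# Toy correlation matrix: items 1-2 form one cluster, items 3-4 another
# (values are illustrative)
R = np.array([
    [1.0, 0.7, 0.1, 0.1],
    [0.7, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])

# Eigenvalues, largest first
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]

print(eigvals)           # two eigenvalues clearly above the rest
print(eigvals.sum())     # eigenvalues sum to the number of items (4)
print(eigvals / len(R))  # proportion of total variance per factor
```

Two large eigenvalues stand out, matching the two item clusters; the remaining eigenvalues explain little variance and would not be retained.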
Factor extraction in R (scree plot):
· x-axis
- Shows the number of factors
- As many factors as there are items (here, 25)
· y-axis
- Shows the eigenvalues
- E.g., Factor 1 has an eigenvalue > 5
· … But we don’t want to end up extracting 25 factors each with a single item!
· We need to choose a smaller number of factors
Parallel analysis (Horn, 1965):
· Parallel analysis involves comparing your data to randomly generated data
· Step 1: Create several randomly generated datasets that have the same number of cases and variables as the actual dataset
· Step 2: Compare eigenvalues from the actual dataset to the eigenvalues obtained across the randomly generated datasets
· Step 3: Factors are retained if their eigenvalue is greater than the corresponding eigenvalues obtained from the random data.
· Rationale: Find out which factors meaningfully explain variance in your data, beyond random noise.
· The red dotted line shows the eigenvalues from the random data
· The blue line shows the eigenvalues from your actual data
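The three steps above can be sketched directly. This is an illustrative Python implementation, not the R routine used in class; the simulated one-factor dataset and the number of random datasets (`n_sims`) are made up for the example:

```python
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    """Horn's parallel analysis (sketch): retain factors whose eigenvalues
    exceed the average eigenvalues of same-shaped random data."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Eigenvalues from the actual dataset's correlation matrix
    real = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    # Step 1: random datasets with the same number of cases and variables
    random_eigs = np.empty((n_sims, p))
    for s in range(n_sims):
        noise = rng.standard_normal((n, p))
        random_eigs[s] = np.sort(
            np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    # Steps 2-3: retain leading factors that beat the random eigenvalues
    threshold = random_eigs.mean(axis=0)
    retained = 0
    for r, t in zip(real, threshold):
        if r <= t:
            break
        retained += 1
    return retained

# Simulated data: 300 cases, 6 items all driven by one latent factor
rng = np.random.default_rng(1)
latent = rng.standard_normal((300, 1))
items = 0.8 * latent + 0.6 * rng.standard_normal((300, 6))
print(parallel_analysis(items))
```

Because the six items share a single latent cause, only the first eigenvalue beats its random counterpart, so one factor is retained.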