Factor analysis Flashcards

1
Q

Scale development phases:

A

· Phases of investigating a new construct using scales (summary by Flake et al., 2017)
1. Substantive phase
- Construct conceptualisation and literature review
- Generating items
2. Structural phase
- Item analysis
- Determining dimensionality
- Reliability
3. External phase
- Convergent and discriminant validity

2
Q

Latent variables - measuring the immeasurable:

A

· Psychological constructs tend to be more abstract. We can’t directly measure them
· Global self-esteem is the construct of interest, also called a latent variable or a factor
· The scale items are directly measured (i.e., directly observed) so these are observed variables

3
Q

Factor analysis allows us to determine dimensionality:

A

· Understand the structure of a scale (or set of variables)
- When building a new scale, we create a set of observed variables (i.e., items)
- We want to see whether these items capture one construct or several related constructs (or factors) in meaningful ways

4
Q

Pandemic-related worry (Brown et al., 2022):

A

· Factor 1: Infection severity
- Item 20 I often worry that if I get COVID-19 I will not recover from it
- Item 22 I often worry that I will get hospitalised due to COVID-19
· Factor 2: Risk to loved ones
- Item 5 I often worry about my family members getting COVID-19
- Item 23 I often worry about whether my close friend(s) or family member(s) will be hospitalised due to COVID-19
· Factor 3: Financial concerns
- Item 8 I often worry about the impact of COVID-19 on my financial situation
- Item 7 Hearing about job losses in the media or through people I know makes me worry about my job security.

5
Q

Exploratory vs confirmatory factor analysis:

A

· Exploratory factor analysis (EFA)
- No predictions about the number of factors (but you may have some tentative predictions) or which items belong to what factors
- You can explore the data
- Remove poorly performing items and re-run the factor analysis
· Confirmatory factor analysis (CFA)
- You have clear expectations of what the factor structure will look like.
- You want to confirm these predictions with your data. You cannot explore the data

6
Q

What does factor analysis do?:

A

· Factor analysis is about making sense of shared variance between your variables (scale items)
· You might notice that some items correlate more strongly with each other than with the remaining items. This might be evidence that your scale is capturing more than one factor.

7
Q

Communalities in factor analysis:

A

· Communality - The proportion of variance that an item shares with other items
· In general, larger communalities are desirable
- For instance, we assume all items within a questionnaire share a healthy amount of variance
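
As a rough illustration of the idea on this card, a communality can be computed from an item's factor loadings. This is a minimal sketch that assumes uncorrelated factors (so squared loadings simply add up); the function name and the loadings are hypothetical, not taken from the deck.

```python
def communality(loadings):
    """Sum of an item's squared loadings across factors.

    Assumption: the factors are uncorrelated, so squared loadings add up.
    """
    return sum(loading ** 2 for loading in loadings)


# Hypothetical item that loads .70 on factor 1 and .30 on factor 2.
item_loadings = [0.70, 0.30]
h2 = communality(item_loadings)  # variance shared with the other items
uniqueness = 1 - h2              # variance specific to this item (plus error)
print(round(h2, 2), round(uniqueness, 2))
```

A larger value of `h2` means the item shares more variance with the rest of the scale, which is why larger communalities are generally desirable.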

8
Q

Factor analysis types:

A

· Minimum residual estimation
· Maximum likelihood estimation
· Principal axis factoring

9
Q

Steps for running/interpreting an EFA:

A

· Preliminary analyses
- Check inter-item correlations; Bartlett’s test of sphericity
- Test whether an analysis is appropriate (Kaiser-Meyer-Olkin test of sampling adequacy)
· Main analysis
- Factor extraction
- Model fit—how well does our model fit the data?
- Factor rotation and factor loadings—under what factor does each item fit best?
- Interpreting the factors—what do factors represent conceptually?

10
Q

Inter-item correlations:

A

· Items must not correlate too weakly (usually ≤ .30, i.e., closer to 0)
- Items are expected to correlate with each other (they measure the same construct!)
- Low correlations indicate an item doesn’t fit with the other items: it measures something else!
· Items must not correlate too highly (usually around .90 or closer to 1)
- Several items measure the exact same thing
- No need to measure the same thing multiple times within a scale
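
The two rules of thumb above can be sketched as a simple screen over a correlation matrix. This is an illustrative sketch only: the function name, the item names, and the correlation values are hypothetical, and flagging an item as "weak" when all of its correlations fall at or below the cut-off is one possible reading of the rule.

```python
from itertools import combinations


def screen_correlations(corr, names, low=0.30, high=0.90):
    """Flag weakly related items and redundant item pairs.

    corr is a square correlation matrix given as a list of lists.
    low and high are the rule-of-thumb cut-offs from the card.
    """
    weak_items = []
    for i, name in enumerate(names):
        others = [abs(corr[i][j]) for j in range(len(names)) if j != i]
        # Suspect item: it correlates weakly with every other item.
        if max(others) <= low:
            weak_items.append(name)
    # Redundant pair: two items that measure almost exactly the same thing.
    redundant_pairs = [
        (names[i], names[j])
        for i, j in combinations(range(len(names)), 2)
        if abs(corr[i][j]) >= high
    ]
    return weak_items, redundant_pairs


# Hypothetical correlations between four items.
names = ["q1", "q2", "q3", "q4"]
corr = [
    [1.00, 0.55, 0.93, 0.10],
    [0.55, 1.00, 0.50, 0.05],
    [0.93, 0.50, 1.00, 0.12],
    [0.10, 0.05, 0.12, 1.00],
]
weak_items, redundant_pairs = screen_correlations(corr, names)
print(weak_items, redundant_pairs)
```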

11
Q

Bartlett’s test of sphericity:

A

· Tests whether the correlation matrix is significantly different from an identity matrix
- In other words: Are the correlations close to zero?
· Not a great test…
- Test is too extreme: correlations are often different from zero but still very small
- p-values are sensitive to sample size; the larger the sample, the more likely they are to be significant
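
To see why the test is sensitive to sample size, here is a minimal sketch of the usual test statistic, -(n - 1 - (2p + 5)/6) · ln|R| with p(p - 1)/2 degrees of freedom, where R is the correlation matrix, p the number of items, and n the number of cases. The helper names and the example correlations are hypothetical.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_cases):
    """Return the chi-square statistic and its degrees of freedom."""
    p = len(corr)
    statistic = -(n_cases - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    df = p * (p - 1) // 2
    return statistic, df


# Hypothetical: three items that all correlate at a very small r = .10.
weak = [[1.0, 0.1, 0.1], [0.1, 1.0, 0.1], [0.1, 0.1, 1.0]]
small_sample = bartlett_sphericity(weak, n_cases=100)
large_sample = bartlett_sphericity(weak, n_cases=1000)
# The .05 critical value of chi-square with 3 df is 7.815.
print(small_sample, large_sample)
```

The correlations are identical in both calls; only the number of cases changes, yet the statistic grows with n and crosses the critical value in the larger sample.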

12
Q

Testing sampling adequacy (KMO test):

A

· Determining sample size for an EFA is tricky. The number of participants required to run a factor analysis depends on communalities, and how “strong” the factors are (items strongly related to their factors, several items per factor)
· The Kaiser-Meyer-Olkin (KMO) measure examines the pattern of correlations in your dataset to judge whether your sample is sufficient; it ranges from 0 = diffuse pattern (bad) to 1 = compact pattern (good)
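
For readers who want to see what the KMO measure compares, the sketch below uses the standard overall formula: the sum of squared correlations divided by the sum of squared correlations plus the sum of squared partial correlations (off-diagonal elements only). The helper names and the example matrix are hypothetical.

```python
def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(matrix)
    aug = [
        list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kmo(corr):
    """Overall KMO: squared correlations relative to squared partials."""
    p = len(corr)
    inv = invert(corr)
    sum_r2 = sum_q2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                # Partial correlation of items i and j given all other items.
                partial = -inv[i][j] / (inv[i][i] * inv[j][j]) ** 0.5
                sum_r2 += corr[i][j] ** 2
                sum_q2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_q2)


# Hypothetical: three items that all correlate at r = .50.
corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
kmo_value = kmo(corr)
print(round(kmo_value, 3))
```

Values closer to 1 arise when the partial correlations are small relative to the raw correlations, which is the "compact pattern" described above.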

13
Q

Factor extraction:

A

· Factor extraction = deciding how many factors are captured by our items
- We want to explain as much variance as we can with as few factors as possible
· Extraction methods are based on eigenvalues.
- Eigenvalues are the amount of total variance explained by each factor
- Eigenvalues indicate the importance of a factor: bigger eigenvalue = more important factor
· Extraction methods start by generating the maximum number of factors (as many factors as we have items) and inspecting their eigenvalues
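
A small worked sketch of how eigenvalues translate into variance explained. It assumes standardised items, so the total variance equals the number of items and the eigenvalues of the correlation matrix sum to that number; the eigenvalues themselves are hypothetical.

```python
def variance_explained(eigenvalues):
    """Proportion and cumulative proportion of variance per factor."""
    n_items = len(eigenvalues)  # one eigenvalue per item at the start
    proportions = [value / n_items for value in eigenvalues]
    cumulative = []
    running = 0.0
    for proportion in proportions:
        running += proportion
        cumulative.append(running)
    return proportions, cumulative


# Hypothetical eigenvalues for a 5-item scale (they sum to 5).
eigenvalues = [2.5, 1.2, 0.6, 0.4, 0.3]
proportions, cumulative = variance_explained(eigenvalues)
print([round(p, 2) for p in proportions])
print([round(c, 2) for c in cumulative])
```

Here the first two factors already account for most of the variance, which is the sense in which we want as much variance as possible from as few factors as possible.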

14
Q

Factor extraction in R:

A

· x-axis
- Shows the number of factors
- As many factors as there are items (here, 25)
· y-axis
- Shows the eigenvalues
- E.g., Factor 1 has an eigenvalue > 5
· … But we don’t want to end up extracting 25 factors each with a single item!
· We need to choose a smaller number of factors

15
Q

Parallel analysis (Horn, 1965):

A

· Parallel analysis involves comparing your data to randomly generated data
· Step 1: Create several randomly generated datasets that have the same number of cases and variables as the actual dataset
· Step 2: Compare eigenvalues from the actual dataset to the eigenvalues obtained across the randomly generated datasets
· Step 3: Factors are retained if their eigenvalue is greater than the corresponding eigenvalues obtained from the random data.
· Rationale: Find out which factors meaningfully explain variance in your data, beyond random noise.
· The red dotted line shows the eigenvalues from the random data
· The blue line shows the eigenvalues from your actual data
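
Steps 2 and 3 can be sketched as a simple comparison. This sketch assumes the eigenvalues have already been computed, and it summarises the random datasets by their mean eigenvalue at each position (one common choice; a high percentile is another). All names and numbers are hypothetical.

```python
def n_factors_to_retain(actual, random_sets):
    """Count leading factors whose eigenvalue beats the random benchmark."""
    n_sets = len(random_sets)
    # Mean random eigenvalue at each position (1st factor, 2nd factor, ...).
    random_mean = [
        sum(random_set[k] for random_set in random_sets) / n_sets
        for k in range(len(actual))
    ]
    retained = 0
    for observed, reference in zip(actual, random_mean):
        if observed <= reference:
            break  # this factor explains no more than random noise would
        retained += 1
    return retained


# Hypothetical eigenvalues for a 5-item scale and two random datasets.
actual = [2.2, 1.3, 0.8, 0.4, 0.3]
random_sets = [
    [1.30, 1.15, 1.00, 0.85, 0.70],
    [1.26, 1.13, 1.02, 0.87, 0.72],
]
retained = n_factors_to_retain(actual, random_sets)
print(retained)
```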

16
Q

Parallel analysis (Horn, 1965) 2:

A

· Sometimes it is unclear whether an eigenvalue from the actual data is higher or lower than the corresponding random eigenvalue
· In this example, it’s unclear what is happening with the 7th eigenvalue
· You can use other sources of information to decide how many factors to extract. For instance, the scree plot

17
Q

The scree plot:

A

· Inspecting the scree plot means looking at the shape of the line showing the eigenvalues in your actual dataset.
· We need to see where the slope changes; that point is called an inflexion point
· We extract however many factors are to the left of the inflexion point (not including the inflexion point itself)
- e.g., if the inflexion point is at Factor 5, extract 4 factors

18
Q

The scree plot is a load of rubble:

A

· Scree = loose rock/rubble at the base of a cliff
- The scree at the base of a cliff can form a very pronounced slope
· It’s not as easy to identify an inflexion point in actual data…
- Recommendation: Check parallel analysis first! If the parallel analysis is not clear, then look at the scree plot

19
Q

Running the factor analysis model:

A

· The number of factors: use the number suggested by the parallel analysis
· The type of factor analysis you want to run
- “minres” stands for Minimum Residual Estimation
· The rotation arguments control factor rotation
- In general, choose the options that apply an “oblique rotation”, which helps clarify the relationship between your items and your factors

20
Q

Model fit indices - RMSR (reported as SRMR):

A

· Smaller is better (0 is best)
· Good fit: SRMR < .08 (Hu & Bentler, 1999)

21
Q

Model fit indices - Chi-square

A

· Test how different your model is to “real” relationships between variables in the population
- If significant: your model is different from this ideal model (not good!)
· Problems:
- Sensitive to sample size
- A harsh test

22
Q

Model fit indices - TLI

A

· Based on Chi-square, but this time comparing it to a very bad model (no factor structure)
· Between 0 (horrible) and 1 (perfect)
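
A minimal sketch of the usual TLI formula, which compares the chi-square/df ratio of the fitted model with that of the baseline ("very bad") model. The function name and the input values are hypothetical; note that the raw formula can fall slightly outside the 0 to 1 range in practice.

```python
def tli(chi2_model, df_model, chi2_null, df_null):
    """Tucker-Lewis index from model and baseline chi-square values."""
    null_ratio = chi2_null / df_null     # misfit per df, no factor structure
    model_ratio = chi2_model / df_model  # misfit per df, fitted model
    return (null_ratio - model_ratio) / (null_ratio - 1)


# Hypothetical chi-square values for a fitted model and its baseline.
tli_value = tli(chi2_model=60, df_model=25, chi2_null=1000, df_null=45)
print(round(tli_value, 3))
```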

23
Q

Model fit indices - RMSEA

A

· Based on Chi-square, but compares the obtained model to the model implied by the data and adjusts for the complexity of the model
· Smaller is better (0 is best)
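
A minimal sketch of one common RMSEA formula, which divides the excess of chi-square over its degrees of freedom by df × (N - 1). Some software uses N instead of N - 1, so treat the exact denominator as an assumption; the input values are hypothetical.

```python
import math


def rmsea(chi2, df, n_cases):
    """Root mean square error of approximation (0 = best)."""
    excess = max(chi2 - df, 0)  # misfit beyond what df alone would predict
    return math.sqrt(excess / (df * (n_cases - 1)))


# Hypothetical model: chi-square of 60 on 25 df with 300 participants.
rmsea_value = rmsea(chi2=60, df=25, n_cases=300)
print(round(rmsea_value, 3))
```

Dividing by the degrees of freedom is what adjusts the index for model complexity.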

24
Q

Model fit criteria:

A

· A combination of model fit indices will tell you how well your model fits the data
· Hu & Bentler (1999) suggest:
- TLI > 0.96 and SRMR < 0.06
- OR
- RMSEA < 0.05 and SRMR < 0.09
· You should report the Chi-Square test, the RMSEA, the CFI and the SRMR
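
The combination rule on this card can be written as a small check. This sketch simply encodes the thresholds as listed above; the function name and the example values are hypothetical.

```python
def good_fit(tli, rmsea, srmr):
    """Apply the two combination rules listed on the card."""
    rule_a = tli > 0.96 and srmr < 0.06
    rule_b = rmsea < 0.05 and srmr < 0.09
    return rule_a or rule_b


# Hypothetical fit indices from three different models.
print(good_fit(tli=0.97, rmsea=0.07, srmr=0.05))  # passes via TLI + SRMR
print(good_fit(tli=0.93, rmsea=0.04, srmr=0.08))  # passes via RMSEA + SRMR
print(good_fit(tli=0.93, rmsea=0.07, srmr=0.05))  # meets neither combination
```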

25
Q

Factors and items:

A

· The relationship between a factor and an item can be thought of as the Pearson correlation between the two
· This relationship is called a factor loading

26
Q

Cross-loading items:

A

· Primary factor loadings
- An item’s highest factor loading is its primary factor loading
- e.g., raq_19 has a primary factor loading of .56
· Secondary factor loadings
- An item’s additional, lower factor loadings are its secondary factor loadings
- e.g., raq_19 has a secondary loading of .26
· When an item has high loadings on multiple factors, we say that it is a cross-loading item

27
Q

Understanding your factors:

A

1. Rules-of-thumb (Costello & Osborne, 2005):
- Acceptable: Primary factor loadings >.32 and cross-loadings <.32
- Note: Some papers use .30 as the threshold for judging primary factor loadings and cross-loadings
2. Read your items carefully!
- Do your items make conceptual sense as part of the same factor?
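
The .32 rule of thumb can be sketched as a small classifier over an item's loadings. The function name and labels are hypothetical, and comparing absolute values (so that negative loadings are handled) is an assumption. The raq_19 loadings are the two values quoted in the cross-loading card; the other two items are made up for illustration.

```python
def classify_item(loadings, threshold=0.32):
    """Label an item from its loadings on each factor."""
    ordered = sorted((abs(loading) for loading in loadings), reverse=True)
    primary, secondary = ordered[0], ordered[1:]
    if primary <= threshold:
        return "weak"           # no factor captures this item well
    if any(value >= threshold for value in secondary):
        return "cross-loading"  # loads highly on more than one factor
    return "acceptable"


items = {
    "raq_19": [0.56, 0.26],      # primary and secondary loading from the card
    "item_x": [0.45, 0.40],      # hypothetical cross-loading item
    "item_y": [0.25, 0.10],      # hypothetical weak item
}
labels = {name: classify_item(values) for name, values in items.items()}
print(labels)
```

A label from this check is only a starting point: the items still need to be read to confirm they make conceptual sense together.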