Exploratory Factor Analyses Flashcards

1
Q

When do you carry out a factor analysis?

A

When you have a continuous latent variable and continuous observed data

2
Q

How did we know that the observed data was categorical in IRT?

A

Answers were yes/no, on a Likert scale, etc.

3
Q

How do you know if the observed data is continuous?

A

In psychology we rarely have truly continuous data; one example is reaction time. As a rule, if an item has more than five scale points and forms a normal distribution, you can treat it as a continuous item and perform factor analysis on it.

4
Q

When is factor analysis often applied?

A

Sum scores on subtests (e.g., dimensions of intelligence)

5
Q

What is the exciting thing about factor analysis as compared to item response theory, according to Dylan?

A

The nice thing about IRT is that you really analyse the individual items. FA is more flexible: continuous data are easier to model, and the mathematics and formulas are simpler.

6
Q

Why did IRT form an S shaped curve?

A

Because you're modelling the probability of an outcome (a correct response), since you have categorical data

7
Q

Does a one-factor FA have an S-shaped curve? Explain

A

No, it's a linear function: the latent trait is on the x-axis and the expected item score on the y-axis. Since we're no longer estimating a probability and are working with continuous data, the scores can go higher than one and we can use a linear model. For this reason it is sometimes known as the linear factor model.

8
Q

Explain the linear factor analysis equation

A

E(X_pi | θ_p) = b_i + a_i θ_p
or X_pi = b_i + a_i θ_p + ε_pi
with residual variance VAR(X_pi | θ_p) = σ²_εi

b_i is commonly referred to as item attractiveness; it roughly translates to the IRT item difficulty/easiness parameter, and it is the intercept of the line.

e.g., "I think about suicide" has a low attractiveness
e.g., "I am satisfied with my life" has a higher attractiveness

a_i is the item discrimination, the same as in IRT, and it forms the slope of the function. We model the expected value of the continuous item, and if the model is good then the observations should lie around that line.

The variance is just how much the observed points vary around the modelled line.
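
As a minimal sketch in R (all parameter values here are made up for illustration), you can simulate one item under this model and see the observations scatter around the line:

set.seed(1)
theta <- rnorm(500)                        # latent trait scores for 500 people
b <- 3                                     # hypothetical item attractiveness (intercept)
a <- 0.8                                   # hypothetical item discrimination (slope)
x <- b + a * theta + rnorm(500, sd = 0.5)  # observed item scores: line plus residual
plot(theta, x)                             # points scatter around the modelled line
abline(b, a)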

9
Q

How can the notation for this model change? Explain

A
E(X_pi | θ_p) = b_i + a_i θ_p, or equivalently X_pi = b_i + a_i θ_p + ε_pi, is written as:
Y_pi = ν_i + λ_i η_p + ε_pi in the factor analysis literature
ν_i is an intercept
λ_i is a factor loading
η_p is the common factor
ε_pi is the residual

They mean the same things but are just written differently in IRT literature compared to FA literature.

10
Q

Conceptually, what is the goal of factor analysis?

A

It's a statistical approach that extracts the common variance from the items and separates it from the variable-specific effects.

11
Q

How do the parameters translate to the variance measured?

A

Y_pi = ν_i + λ_i η_p + ε_pi

The common factor variance (σ²_η) is the variance caused by the latent trait.

The factor loadings (λ_i) tune how much of each item's variability is common factor variability, i.e. how well each item measures the latent trait.

The intercepts (ν_i), in a single-group application, are simply the item means.

The residual variances (σ²_εi) tell you how much of the variance is unique to the item.

12
Q

What does this model imply about the data? How can this be used in calculations?

A

The model implies a certain structure in the data in terms of the variance. You can calculate how much variability the model predicts for, say, item 1: the factor loading squared times the factor variance, plus the residual variance (λ_i² σ²_η + σ²_εi). You can compare this to the observed variance.

It also implies some covariance between the items, since they measure the same thing: if you score high on one item, you are expected to score higher on a second item. You can calculate the expected covariance from the model as well.
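
As a small sketch (hypothetical parameter values, not from any real output):

lambda  <- 0.7    # hypothetical factor loading of the item
var_eta <- 1      # common factor variance
var_eps <- 0.51   # hypothetical residual variance
lambda^2 * var_eta + var_eps  # model-implied item variance, to compare with the observed variance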

13
Q

How do you calculate the expected covariance? What should you look for in this?

A

The expected covariance between two items is the first item's factor loading multiplied by the second item's factor loading, multiplied by the factor variance: Cov(Y_i, Y_j) = λ_i λ_j σ²_η. You should check whether this is close to the observed covariance, to assess whether the model describes the data well.
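
The same kind of sketch for the covariance (again with made-up values):

lambda1 <- 0.7; lambda2 <- 0.6  # hypothetical loadings of items 1 and 2
var_eta <- 1                    # common factor variance
lambda1 * lambda2 * var_eta     # model-implied covariance, to compare with the observed covariance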

14
Q

How is the proportion of the variance calculated?

A

ρ²(Y_i, η) = λ_i² σ²_η / (λ_i² σ²_η + σ²_εi)

i.e., the variance due to the latent variable divided by the total variance = the proportion of variance explained by the factor.

The variance not explained by the factor (the uniqueness) is calculated by:

1 − ρ²(Y_i, η) = σ²_εi / (λ_i² σ²_η + σ²_εi)

However, most of the time you can just read these off from the output in RStudio.
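
In R this is a one-liner; the values below are hypothetical:

lambda <- 0.7; var_eta <- 1; var_eps <- 0.51                      # hypothetical values
explained <- lambda^2 * var_eta / (lambda^2 * var_eta + var_eps)  # proportion explained: 0.49
1 - explained                                                     # uniqueness: 0.51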

15
Q

What is the equivalent of Cronbach's alpha from CTT here, and how is it calculated?

A

The reliability of the sum score: the ratio between the variability due to the latent trait and the total variability.

ρ_Y = σ²(T) / σ²(Y) = σ²(E(Y)) / σ²(Y)
= …
= σ²_η (Σ_{i=1}^n λ_i)² / ( σ²_η (Σ_{i=1}^n λ_i)² + Σ_{i=1}^n σ²_εi )

i.e., multiply the factor variance by the square of the summed loadings; divide this by the same quantity plus the sum of the residual variances.
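
A sketch for a three-item scale with made-up loadings and residual variances:

lambda  <- c(0.7, 0.6, 0.8)     # hypothetical factor loadings
var_eps <- c(0.51, 0.64, 0.36)  # hypothetical residual variances
var_eta <- 1                    # common factor variance
true_var <- var_eta * sum(lambda)^2   # variability due to the latent trait
true_var / (true_var + sum(var_eps))  # reliability of the sum score, about 0.74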

16
Q

How does identification change compared to IRT and why?

A

It is very similar to IRT: we need to identify the model because the latent variable doesn't have a scale or a unit, so we have to create one. In factor analysis, however, we play around with this more. In IRT we fixed the mean to 0 and the standard deviation to 1, because the R packages don't let you change this much and it's not very interesting in the unidimensional case. People like to change the identification to get a different scale for the parameters; this will not change the conclusions or the p-values, since the proportions between the parameters don't change.

17
Q

Give an example of how you can have two different options for identification in factor analysis

A

Option 1:
• μ_η = 0
• σ²_η = 1

Option 2:
• μ_η = 0
• Fix one factor loading to 1
Picking an arbitrary factor loading and fixing it to 1 is like saying that the scale of the latent variable is the same as the scale of that item.
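
In lavaan the two options can be requested as follows (a sketch, assuming the hypothetical data frame E and a three-item model; option 2 is lavaan's default behaviour):

library(lavaan)
model <- "eta =~ Y1 + Y2 + Y3"
fit1 <- cfa(model, data = E, std.lv = TRUE)  # option 1: factor mean 0, factor variance 1
fit2 <- cfa(model, data = E)                 # option 2: first factor loading fixed to 1 (the default)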

18
Q

What is the most dominant approach to parameter estimation in factor analysis?

A

Maximum likelihood

19
Q

What does using MLE in this instance assume about the data?

A

Normally distributed

20
Q

What can you do as opposed to fitting the model on the raw data with MLE for a factor analysis? Why might you want to do this?

A

You have the option to analyse only the observed covariance matrix, which is very useful for factor analysis since a covariance matrix already contains all the information about the structure of your data. From a covariance matrix you can already fit a one-factor model, since you obtain your factor loadings and residual variances.
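
In lavaan, for instance, you can pass a covariance matrix instead of raw data (a sketch; E and the model string are the hypothetical objects from the code card below):

library(lavaan)
model <- "eta =~ Y1 + Y2 + Y3"
S <- cov(E)                                            # observed covariance matrix of the items
fit <- cfa(model, sample.cov = S, sample.nobs = 1000)  # fit using only the covariance matrix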

21
Q

What is a downside to using the covariance matrix in estimating the MLE for a factor analysis?

A

There are no intercepts in the model, because for the intercepts you really need the means of the data, which are not contained in a covariance matrix.

22
Q

What two alternative methods for parameter estimation exist?

A

Weighted least squares and Bayesian estimation (also popular, but not discussed in this course)

23
Q

Given data with 1000 subjects answering 10 questions on a 5-point Likert scale, how would you write code to carry out a factor analysis? Explain the code

head(E)

A

library(lavaan)
model = "eta =~ Y1 + Y2 + Y3 + Y4 + Y5 + Y6 + Y7 + Y8 + Y9 + Y10"
fit = cfa(model = model, data = E, meanstructure = TRUE, std.lv = TRUE)

where eta is the common factor and can be given any name, like a variable; =~ indicates that it "is measured by" the items that follow; cfa() runs a confirmatory factor analysis; the data set is called E; meanstructure = TRUE means that you want the intercepts estimated; std.lv = TRUE means that you want to standardise the latent variable with μ_η = 0 and σ²_η = 1.
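
The output described in the next cards would then typically be requested with:

summary(fit, fit.measures = TRUE, standardized = TRUE)  # estimates, the Std.all column, and fit measures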

24
Q

There is a screenshot of CFA output in the docs. Describe the information that is presented under the factor name

A

The factor loadings are given under whatever you called the factor (e.g., eta). The estimates are the factor loading estimates (how well an item measures the latent trait); negative items are contra-indicative, in that the higher you score on the item, the lower you score on the latent trait. If you divide the estimate by its standard error you get the z-value, both of which are shown. If an estimate is non-significant, the item does not measure the latent trait. Std.all gives the standardised factor loadings; they are the correlations between the item/variable and the factor. If you square this value you get the shared variance between the item and the common factor.

25
Q

There is a screenshot of CFA output in the docs. Describe the information that is presented under intercepts

A

For the intercepts, the estimates are simply the means of the variables (the mean response to an item). The Std.Err, z-value and p-value are similar to before and not interesting. Std.all gives the standardised intercepts: the mean if the variable were standardised (we won't use this much).

26
Q

There is a screenshot of CFA output in the docs. Describe the information that is presented under variances

A

Under variances, the estimates are the estimated residual variances (how much the items deviate from the model). The variance of the latent trait (eta) is fixed to one here if you specified std.lv = TRUE. The Std.Err, z-value and p-value are also uninteresting here, especially since you cannot even use them to test whether the residual variances are larger than 0, due to a boundary constraint. Std.all gives the standardised residual variances, which are interesting: they are the variance unexplained by the factor, 1 − ρ²(Y_i, η), i.e. the 'uniqueness'.

27
Q

Contrast the two main types of factor analysis

A

Exploratory factor analysis is used when the factor structure is unknown (number of factors, loadings)
• The number of factors, 𝑚, is systematically altered by the researcher
• All items load on all factors

Confirmatory factor analysis is used when the number of factors, 𝑚, is derived from theory/expectations
• The loadings are derived from theory/expectations

28
Q

Give the model for an exploratory factor analysis

A

Y_pi = λ_1i η_1p + λ_2i η_2p + ⋯ + λ_mi η_mp + ε_pi

up until m factors, where η_1p is the first factor for person p and η_2p is the second factor for person p.

λ_1i is the loading of item i on that first factor.
λ_2i is the loading of item i on factor 2, η_2p, etc.

ε_pi is the residual
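
A minimal R sketch of one item generated under a two-factor version of this model (hypothetical loadings, uncorrelated factors for simplicity):

set.seed(2)
eta1 <- rnorm(500); eta2 <- rnorm(500)   # two common factors for 500 people
lambda1 <- 0.8; lambda2 <- 0.3           # hypothetical loadings of item i on each factor
y <- lambda1 * eta1 + lambda2 * eta2 + rnorm(500, sd = 0.5)  # item scores plus residual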

29
Q

Where is the intercept in an exploratory factor analysis model?

A

There is no intercept; the key is to reveal the factor structure. Everything you need for this factor structure is already in a covariance matrix, so the means (and hence an intercept) are not needed.

30
Q

What are the population parameters of an exploratory factor analysis?

A

σ²_η1 is the variance of factor 1, η_1p
σ²_η2 is the variance of factor 2, η_2p

etc.
σ_η1η2 is the covariance between factors 1 and 2
etc.

31
Q

What are the two most popular methods of parameter estimation in factor analysis?

A

Principal factoring

Maximum likelihood

32
Q

What does principal factoring consist of?

A
  • Based on the Eigenvalues of the principal factors.
  • Same tools as in principal component analysis (see IRT lecture)
    • Kaiser criterion
    • Scree plot
    • Parallel analysis
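
A sketch of these tools in R using the psych package (reusing the osbourne_cor correlation matrix that appears in a later card):

library(psych)
fa.parallel(osbourne_cor, n.obs = 477, fm = 'ml')  # scree plot plus parallel analysis
sum(eigen(osbourne_cor)$values > 1)                # Kaiser criterion: count eigenvalues above 1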
33
Q

Give two advantages of principal factoring

A

No distributional assumptions

No improper solutions (e.g., negative variances)

34
Q

Give a disadvantage of principal factoring

A

No explicit falsifiable model

- You can calculate all your parameters yet still not know whether your factors fit well or not

35
Q

What does maximum likelihood consist of in factor analysis?

A

As discussed in the IRT lecture, but with a normal distribution for the data

36
Q

Give two advantages of using maximum likelihood for parameter estimation in EFA, and two disadvantages

A

+ Explicitly model-based
+ Model falsification is possible
- Sometimes improper solutions (e.g., negative variances)
- Multivariate normal distribution assumption for the data

37
Q

What is meant by saying you can obtain improper solutions using MLE to estimate the parameters?

A

E.g.: sometimes with MLE you estimate your parameters and get a negative residual variance or common factor variance, which shouldn't be possible.

38
Q

How can you falsify your model with MLE? (2)

A

You can falsify your model by means of fit measures. There are many, two of which are:

  • χ²-goodness-of-fit measure
  • Root Mean Squared Error of Approximation (RMSEA)
39
Q

What is involved in the χ²-goodness-of-fit measure?

A

It just gives a significance test with the following hypotheses:
• H0: Model fits
• HA: Model does not fit

So in this case you want your test to be non-significant

40
Q

What is involved in the Root Mean Squared Error of Approximation?

A

You take the model's chi-square and subtract the model's degrees of freedom, then divide by df × (N − 1) and take the square root:

RMSEA = sqrt( (χ² − df) / (df × (N − 1)) )
If χ² < df → RMSEA = 0
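
A direct translation into R (the chi-square, df and N in the example call are made up):

rmsea <- function(chisq, df, N) {
  if (chisq < df) return(0)            # RMSEA is set to 0 when chi-square < df
  sqrt((chisq - df) / (df * (N - 1)))
}
rmsea(chisq = 85, df = 35, N = 477)    # about 0.055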

41
Q

How do you interpret the result of a RMSEA?

A

You can readily interpret the number produced with the following:
• < 0.05: good fit
• 0.05–0.08: reasonable fit
• > 0.08: poor fit

42
Q

What two identification issues are associated with factor analysis?

A

Scaling the latent variable: as in the one-factor model and in IRT, the scale of the latent variables (factors) needs to be identified

Statistical identification: The number of parameters should not exceed the number of observed (co)variances

43
Q

What is involved in scaling the latent variable?

A

For EFA this is relatively complex; it is carried out, but we don't need to know exactly how. The most important thing to know is that this constraint results in m² restrictions (the number of restrictions you impose on the model so that the latent variables have a scale/unit).

44
Q

Why wasn’t statistical identification an issue with the earlier models (IRT)?

A

Before, we were only considering simpler, unidimensional models, which hardly ever have a problem with statistical identification. But now that we're building bigger, more complex models, at some point the model becomes too big for the data; e.g., fitting a model with 10 items and 1,000 parameters on a dataset of 50 people would be trying to extract more from the data than you put into it.

45
Q

What is involved in statistical identification for EFA?

A

• The number of parameters should not exceed the number of observed (co)variances

• In EFA this can happen if the number of factors in the model is too large
- The observed covariances contain all the information about the factor structure, so otherwise the factor structure would be too complex for the information we provide

46
Q

How can you investigate whether a model is identified according to statistical identification in regards to EFA?

A

A model should always have degrees of freedom larger than or equal to 0

𝑑𝑓 = 𝑀– 𝑘
𝑀:number of independent pieces of observed information
𝑘:number of parameters

47
Q

In this formula for df:
𝑑𝑓 = 𝑀– 𝑘
where
𝑀:number of independent pieces of observed information

What does M mean in terms of an EFA?

A

The number of observed covariances and variances (since that contains all the information about the factor structure.)

48
Q

How do you calculate M for the df of an EFA if the EFA is conducted on a covariance matrix?

A

M = p * (p + 1)/2
where p = number of observed variables
E.g.,: 𝑝=4 →𝑀 =4 ∗ 5/2 = 10

This makes sense if you count the covariances and variances of a covariance matrix

49
Q
In this formula for df:
𝑑𝑓 = 𝑀– 𝑘
where
𝑀:number of independent pieces of observed information
𝑘:number of parameters

How do you get the number of parameters?

A

k = p×m + m×(m+1)/2 + p − m² (formula given in the exam)
p is the number of variables
m is the number of factors

E.g., p = 6 and m = 2 → 6×2 + (2×3)/2 + 6 − 2² = 17

p×m is the number of loadings (relations between each factor and each variable/item) = 12
m×(m+1)/2 is the number of factor variances and covariances = 3
p is the number of residual variances = 6
m² is the number of constraints from scaling the latent variables = 4
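
Both counts are easy to script, as a sketch:

efa_df <- function(p, m) {
  M <- p * (p + 1) / 2                    # observed variances and covariances
  k <- p * m + m * (m + 1) / 2 + p - m^2  # free parameters after the m^2 scaling constraints
  M - k                                   # degrees of freedom (should be >= 0)
}
efa_df(p = 6, m = 2)                      # 21 - 17 = 4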

50
Q

What is the role of m^2 in this formula?

𝑘 = 𝑝×𝑚 + 𝑚×(𝑚+1)/2 + 𝑝– 𝑚^2

A

This is what resulted from scaling the latent variables; these parameters are no longer free. They are fixed in order to identify the scale of the model, and so they are subtracted from the total number of free parameters.

51
Q

These formula are for models fit on ________

A

Strictly, these formulas are for models fit on covariance matrices, but you can also use them for correlation matrices (for the df it does not matter).

52
Q

Raw factor loadings in EFA are hard to interpret. Why is this and what solution is there to this?

A

Raw factor loadings in EFA are hard to interpret because values are given for every factor, and it can be unclear which factor a variable loads on. To fix this, a simple-structure representation is used, in which the values are replaced by + and − to indicate whether a variable loads on a factor or not; this is achieved through rotation.

53
Q

What is rotation?

A

A transformation of the raw factor loadings to enhance simple structure

54
Q

What allows us to carry out these rotations on the factor loadings?

A

The unit of the factors is arbitrary; we chose it ourselves! You can therefore transform the results without affecting your model statistically. We can change the scale a little after fitting the model, just to see when the interpretation is best.

55
Q

Name two types of rotation and a key feature of both

A

Orthogonal rotation:
• The factors remain uncorrelated

Oblique rotation:
• The factors are correlated after rotation

56
Q

Name the main R functions associated with each type of rotation

A

Orthogonal rotation:
- varimax in R

Oblique rotation:
- promax in R

57
Q

In psychology, what type of rotation is typically used and why?

A

In psychology we can mostly assume factors are correlated, therefore oblique rotation (promax) is typically used. It's hard to argue for uncorrelated factors in psychology.

58
Q

Write a line of code to carry out a factor analysis on the correlation matrix ‘osbourne_cor’. Describe what the arguments do

A
library(psych)
fa(r = osbourne_cor, nfactors = 1, n.obs = 477, fm = 'ml', rotate = 'none')
nfactors = 1 fits a one-factor model
fm = 'ml' requests maximum likelihood estimation
rotate = 'none' requests no rotation (never a good idea)

fa(r = osbourne_cor, nfactors = 2, n.obs = 477, fm = 'ml', rotate = 'varimax')
Fits a two-factor model with orthogonal rotation; this doesn't really make sense here, since it assumes the factors are uncorrelated.

You can then tweak nfactors and observe the RMSEA to see which model fits best.

fa(r = osbourne_cor, nfactors = 2, n.obs = 477, fm = 'ml', rotate = 'promax')
Fits a two-factor model with oblique rotation, which makes the most sense.
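
Assuming the fitted psych::fa object exposes its RMSEA as fit$RMSEA (it does in recent versions of psych), this tweak-and-compare step could be scripted as:

library(psych)
for (m in 1:3) {
  fit <- fa(r = osbourne_cor, nfactors = m, n.obs = 477, fm = 'ml', rotate = 'promax')
  cat(m, "factor(s): RMSEA =", round(fit$RMSEA[1], 3), "\n")  # lower is better
}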