Exploratory Factor Analyses Flashcards
When do you carry out a factor analysis?
When you have a continuous latent variable and continuous observed data
How did we know that the observed data was categorical in IRT?
Answers were yes/no, Likert scale, etc.
How do you know if the observed data is continuous?
In psychology we rarely have truly continuous data; an example is reaction time. As a rule, if an item has more than five scale points and forms a normal distribution, you can consider it a continuous item and perform factor analysis on it.
When is factor analysis often applied?
Sum scores on subtests (e.g. dimensions of intelligence)
What is the exciting thing about factor analysis as compared to item response theory, according to Dylan?
The nice thing about IRT is that you really analyse the individual items. FA is more flexible; continuous data is easier to model and the mathematics and formulas are simpler.
Why did IRT form an S shaped curve?
Because you're modelling the probability of an outcome (a correct score), since you have categorical data
Does one-factor FA have an S-shaped curve? Explain
No, it's a linear function. The latent trait is on the x-axis and the expected item score on the y-axis; since we're no longer estimating a probability and are working with continuous data, the scores can go higher than one and we can use a linear model. For this reason it is sometimes known as the linear factor model.
Explain the linear factor analysis equation
E(X_pi | η_p) = ν_i + λ_i·η_p
or X_pi = ν_i + λ_i·η_p + ε_pi
with residual variance VAR(X_pi | η_p)
ν_i is commonly referred to as item attractiveness; it roughly translates to the IRT item difficulty/easiness parameter and is the intercept of the line.
e.g., "I think about suicide" has a low attractiveness
e.g., "I am satisfied with my life" has a higher attractiveness
λ_i is the item discrimination, the same as in IRT, and it forms the slope of the function. We model the expected value of the continuous item, and if the model is good then the observations should lie around that line.
the variance is just how much the observed points vary around the modelled line
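As a rough illustration (not from the lecture), a minimal R sketch that simulates data from this linear factor model, using made-up intercepts and loadings:
# simulate X_pi = ν_i + λ_i·η_p + ε_pi for three hypothetical items
set.seed(1)
n <- 1000                          # persons
nu <- c(2, 3, 2.5)                 # hypothetical item intercepts (attractiveness)
lambda <- c(0.8, 1.2, 1.0)         # hypothetical factor loadings (discrimination)
eta <- rnorm(n, mean = 0, sd = 1)  # common factor scores
X <- sapply(1:3, function(i) nu[i] + lambda[i] * eta + rnorm(n, sd = 0.5))
round(colMeans(X), 2)              # item means end up close to the intercepts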
How can the notation for this model change? Explain
E(X_pi | η_p) = ν_i + λ_i·η_p is written as X_pi = ν_i + λ_i·η_p + ε_pi in the factor analysis literature, where ν_i is an intercept, λ_i is a factor loading, η_p is the common factor, and ε_pi is the residual.
They mean the same things but are just written differently in IRT literature compared to FA literature.
Conceptually, what is the goal of factor analysis?
It's a statistical approach to extract the common variance from the items and separate it from the variable-specific effects
How do the parameters translate to the variance measured?
X_pi = ν_i + λ_i·η_p + ε_pi
The common factor variance (σ²_η) is the variance caused by the latent trait
The factor loadings (λ_i) tune how much of each item's variability is common factor variability - how well each item measures the latent trait
The intercepts (ν_i), in a single-group application, are simply the item means
The residual variances (σ²_εi) tell you how much of the variance is unique to the item
What does this model imply about the data? How can this be used in calculations?
The model implies a certain structure in the data in terms of the variance. You can calculate how much variability the model predicts for item 1: the factor loading squared times the factor variance plus the residual variance. You can compare this to the observed variance of the item.
It also implies some covariance between the items, since they measure the same thing: if you score high on one item you are expected to score higher on a second item. You can calculate the expected covariance from the model.
How do you calculate the expected covariance? What should you look for in this?
The expected covariance between two items is the first factor loading multiplied by the second factor loading multiplied by the factor variance (λ_1·λ_2·σ²_η). You should check whether this is close to the observed covariance to assess whether the model is a good model for the data.
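A small numeric sketch of these checks, with made-up parameter values:
lambda <- c(0.8, 1.2)   # hypothetical loadings of items 1 and 2
var_eta <- 1            # factor variance
res_var <- c(0.5, 0.7)  # residual variances
lambda[1]^2 * var_eta + res_var[1]  # model-implied variance of item 1
lambda[1] * lambda[2] * var_eta     # model-implied covariance of items 1 and 2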
How is the proportion of the variance calculated?
ρ²_i = λ_i²·σ²_η / (λ_i²·σ²_η + σ²_εi)
i.e. the variance explained by the factor (the factor variance times the squared loading) divided by the total variance of the item = the proportion of variance explained by the factor
The variance not explained by the factor (the uniqueness) is calculated by:
1 - ρ²_i = σ²_εi / (λ_i²·σ²_η + σ²_εi)
However, most of the time you can just read these from the output in RStudio
What is the equivalent of Cronbach's alpha from CTT here and how is it calculated?
The reliability of the sum score: the ratio between the variability due to the latent trait and the total variability
reliability = σ²(T) / σ²(X) = σ²(E(X | η)) / σ²(X)
= …
= σ²_η·(λ_1 + ⋯ + λ_n)² / [σ²_η·(λ_1 + ⋯ + λ_n)² + (σ²_ε1 + ⋯ + σ²_εn)]
i.e. the factor variance times the squared sum of the loadings, divided by that same quantity plus the sum of the residual variances
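A small sketch of this calculation in R, using hypothetical loadings and residual variances:
lambda <- c(0.8, 1.2, 1.0)    # factor loadings
var_eta <- 1                  # factor variance
res_var <- c(0.5, 0.7, 0.6)   # residual variances
var_eta * sum(lambda)^2 / (var_eta * sum(lambda)^2 + sum(res_var))  # reliability of the sum score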
How does identification change compared to IRT and why?
It is very similar to IRT: we need to identify the model because the latent variable doesn't have a scale or a unit, so we have to create one. In factor analysis, however, we play around with this more. In IRT we fixed the mean to 0 and the standard deviation to 1 because the R packages don't allow you to change it much and it's not interesting when it's one-dimensional. People like to change the identification to get a different scale for the parameters; this will not change the conclusions or the p-value since the proportions between the parameters don't change.
Give an example of how you can have two different options for identification in factor analysis
Option 1:
• μ_η = 0
• σ²_η = 1
Option 2:
• μ_η = 0
β’ Fix one factor loading to 1
Picking an arbitrary factor loading and fixing it to 1 is like saying that the scale of the latent variable is the same as the scale of that item
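In lavaan the two options look roughly like this (the item names Y1-Y3 and the data frame E are placeholders):
library(lavaan)
model <- 'eta =~ Y1 + Y2 + Y3'
fit1 <- cfa(model, data = E, std.lv = TRUE)  # option 1: fix the factor variance to 1
fit2 <- cfa(model, data = E)                 # option 2 (lavaan default): fix the first loading to 1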
What is the most dominant approach to parameter estimation in factor analysis?
Maximum likelihood
What does using MLE in this instance assume about the data?
Normally distributed
What can you do as opposed to fitting the model on the raw data with MLE for a factor analysis? Why might you want to do this?
You have the option to analyse only the observed covariance matrix, which is very useful for factor analysis since a covariance matrix already contains all the information about the structure of your data. From a covariance matrix you can already fit a one-factor model, since you can obtain your factor loadings and residual variances.
What is a downside to using the covariance matrix in estimating the MLE for a factor analysis?
There are no intercepts in the model, because for the intercepts you really need the means of the data, which are not contained in a covariance matrix.
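A sketch of this in lavaan, which can fit directly on a covariance matrix (the model string, the matrix S and the sample size are placeholders):
S <- cov(E)                                                # observed covariance matrix
fit_cov <- cfa(model, sample.cov = S, sample.nobs = 1000)  # no raw data needed
# note: no meanstructure is possible here, since a covariance matrix contains no means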
What two alternative methods for parameter estimation exist?
Weighted least squares and Bayesian estimation (also popular, but not discussed in this course)
Given data with 1000 subjects answering 10 questions on a 5 point likert scale, how would you write up code to carry out a factor analysis? Explain the code
head(E)
library(lavaan)
model = 'eta =~ Y1 + Y2 + Y3 + Y4 + Y5 + Y6 + Y7 + Y8 + Y9 + Y10'
fit = cfa(model = model, data = E, meanstructure = TRUE, std.lv = TRUE)
where eta is the common factor and can be given any name akin to a variable, =~ indicates that it is "measured by…", cfa runs a confirmatory factor analysis, the data is called E, meanstructure = TRUE means that you want to estimate the intercepts, and std.lv = TRUE means that you want to standardise the latent variable with μ_η = 0 and σ²_η = 1
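To actually see the output described in the next cards, you would typically run something like:
summary(fit, standardized = TRUE, fit.measures = TRUE)  # loadings, intercepts, (residual) variances, fit indices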
There is a screenshot of CFA output in the docs. Describe the information that is presented under the factor name
The factor loadings are given under whatever you called the factor (e.g. eta). The estimates are the factor loading estimates (how well an item measures the latent trait); negative items are contra-indicative, in that the higher you score on the item, the lower you score on the latent trait. If you divide the estimate by the standard error you get the z value; both are shown. If an estimate is non-significant then the item does not measure the latent trait. Std.all gives the standardised factor loadings; they are the correlations between the item/variable and the factor. If you square this value you get the shared variance between the item and the common factor.
There is a screenshot of CFA output in the docs. Describe the information that is presented under intercepts
For the intercepts, the estimates are simply the means of the variables (the mean response to an item). The std. error, z value and p-value are similar to before and not interesting. Std.all gives the standardised intercepts, i.e. the mean if the variable is standardised (we won't use this much).
There is a screenshot of CFA output in the docs. Describe the information that is presented under variances
Under variances, the estimates are the estimated residual variances (how much the items deviate from the model). The variance of the latent trait (eta) is set to one here if you specified std.lv = TRUE. The std. error, z value and p-value are also uninteresting here, especially since you cannot even use them to test whether the residual variances are larger than 0, due to a boundary constraint. Std.all gives the standardised residual variances, which are interesting: they are the variance unexplained by the factor, 1 - ρ²_i, i.e. the "uniqueness"
Contrast the two main types of factor analysis
Exploratory factor analysis is used when the factor structure (number of factors, loadings) is unknown
• The number of factors, m, is systematically varied by the researcher
β’ All items load on all factors
Confirmatory factor analysis is used when the number of factors, m, is derived from theory/expectations
β’ The loadings are derived from theory/expectations
Give the model for an exploratory factor analysis
X_pi = λ_i1·η_1p + λ_i2·η_2p + ⋯ + λ_im·η_mp + ε_pi
up to m factors, where η_1p is the first factor for person p and η_2p is the second factor for person p.
λ_i1 is the loading of item i on that first factor.
λ_i2 is the loading of item i on factor 2, η_2p, etc.
ε_pi is the residual
Where is the intercept in an exploratory factor analysis model?
There is no intercept; the key is to reveal the factor structure. Everything you need for this factor structure is already in a covariance matrix, so the means/intercepts are not needed.
What are the population parameters of an exploratory factor analysis?
σ²_η1 is the variance of factor 1, η_1p
σ²_η2 is the variance of factor 2, η_2p
etc.
σ_η1η2 is the covariance between factors 1 and 2
etc.
What are the two most popular methods of parameter estimation in factor analysis?
Principal factoring
Maximum likelihood
What does principal factoring consist of?
- Based on the Eigenvalues of the principal factors.
- Same tools as in principal component analysis (see IRT lecture)
- Kaiser criterion
- Scree plot
- Parallel analysis (see the sketch after this list)
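A sketch of how these tools are usually obtained with the psych package (the data frame E is a placeholder):
library(psych)
fa.parallel(E, fm = 'ml', fa = 'fa')  # scree plot plus parallel analysis for factors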
Give two advantages of principal factoring
No distributional assumptions
No improper solutions (e.g., negative variances)
Give a disadvantage of principal factoring
No explicit falsifiable model
- You can calculate all your parameters yet still not know whether your factors are fitting well or not
What does maximum likelihood consist of in factor analysis?
As discussed in IRT lecture, but with a normal distribution for the data
Give two advantages and two disadvantages of using maximum likelihood as parameter estimation for EFA
+ Explicit model based
+ Model falsification
- Sometimes improper solutions (e.g., negative variances etc)
- Multivariate normal distribution assumption for the data
What is meant by saying you can obtain improper solutions using MLE to estimate the parameters?
e.g.: sometimes with MLE you estimate your parameters and get a negative residual or common factor variance, which shouldn't be possible
How can you falsify your model with MLE? (2)
You can falsify your model by means of fit measures. There are many, but two of them are:
- χ² goodness-of-fit measure
- Root Mean Squared Error of Approximation
What is involved in the χ² goodness-of-fit measure?
It just gives a significance test with the following hypotheses:
β’ H0: Model fits
β’ HA: Model does not fit
So in this case you want your test to be insignificant
What is involved in the Root Mean Squared Error of Approximation?
You take the χ² of the model, subtract the model's degrees of freedom, divide by df × (N − 1) (where N is the number of participants), and take the square root:
RMSEA = sqrt(χ² − df) / sqrt(df * (N − 1))
If χ² < df, then
RMSEA = 0
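A minimal numeric sketch of the formula (the χ², df and N values are made up):
chisq <- 54.3; df <- 35; N <- 477
if (chisq < df) 0 else sqrt(chisq - df) / sqrt(df * (N - 1))  # RMSEA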
How do you interpret the result of a RMSEA?
You can readily interpret the number produced with the following:
• < 0.05: good fit
• 0.05 - 0.08: reasonable fit
• > 0.08: poor fit
What two identification issues are associated with factor analysis?
Scaling the latent variable: Similar as in the one factor model and in IRT, the scale of the latent variables (factors) needs to be identified
Statistical identification: The number of parameters should not exceed the number of observed (co)variances
What is involved in scaling the latent variable?
For EFA this is relatively complex; it is carried out but we don't need to know exactly how. The most important thing to know is that this constraint results in m² restrictions (the number of restrictions you impose on the model so that the latent variables have a scale/unit).
Why wasnβt statistical identification an issue with the earlier models (IRT)?
Before, we were only considering simpler, unidimensional models, which hardly have a problem with statistical identification. But now, since we're going to build bigger, more complex models, at some point your model becomes too big for the data, e.g. a model with 10 items and 1,000 parameters fitted on a dataset of 50 people would be trying to extract more from your data than you put into it.
What is involved in statistical identification for EFA?
β’ The number of parameters should not exceed the number of observed (co)variances
β’ In EFA this can happen if the number of factors in the model is too large
- The observed (co)variances contain all the information about the factor structure; with too many factors the factor structure would be too complex for the information we provide
How can you investigate whether a model is identified according to statistical identification in regards to EFA?
A model should always have degrees of freedom larger than or equal to 0
df = M − P
M: number of independent pieces of observed information
P: number of parameters
In this formula for df:
df = M − P
where
M: number of independent pieces of observed information
What does M mean in terms of an EFA?
The number of observed covariances and variances (since that contains all the information about the factor structure.)
How do you calculate M for the df of an EFA if the EFA is conducted on a covariance matrix?
M = p * (p + 1)/2
where p = number of observed variables
E.g., p = 4 → M = 4 * 5/2 = 10
This makes sense if you count the covariances and variances of a covariance matrix
In this formula for df: df = M − P, where M: number of independent pieces of observed information, P: number of parameters
How do you get the number of parameters?
P = p*m + m*(m+1)/2 + p − m^2 (formula given in the exam)
p is the number of variables
m is the number of factors
E.g., p = 6 and m = 2 → 6*2 + (2*3)/2 + 6 − 2^2 = 17
p*m is the number of loadings (relations between each factor and each variable/item) = 12
m*(m+1)/2 is the number of factor variances and covariances = 3
p residual variances = 6
m^2 constraints from scaling the latent variables = 4
What is the role of m^2 in this formula?
P = p*m + m*(m+1)/2 + p − m^2
This is what results from scaling the latent variables: these parameters are no longer free parameters. They are fixed in order to identify the scale of the model, and so are subtracted from the total number of parameters.
These formulas are for models fit on ________
Strictly, these formulas are for models fit on a covariance matrix, but you can also use them for correlation matrices (for the df it does not matter)
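A small sketch of the df calculation as an R function (the function name is my own, not from the course):
efa_df <- function(p, m) {
  M <- p * (p + 1) / 2                    # independent observed (co)variances
  P <- p * m + m * (m + 1) / 2 + p - m^2  # loadings + factor (co)variances + residual variances - constraints
  M - P                                   # degrees of freedom
}
efa_df(p = 6, m = 2)  # example from the card above: 21 - 17 = 4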
Raw factor loadings in EFA are hard to interpret. Why is this and what solution is there to this?
Raw factor loadings in EFA are hard to interpret because every variable has a loading on every factor, so it can be unclear which factor a variable really loads on. The solution is rotation towards "simple structure", in which each variable loads clearly on one factor and close to zero on the others (often summarised with + and − signs to indicate whether a variable loads on a factor or not).
What is rotation?
A transformation of the raw factor loadings to enhance simple structure
What allows us to carry out these rotations on the factor loadings?
The unit of the factors is arbitrary; we chose it ourselves! You can therefore transform the results without affecting your model statistically. We can change the scale a little bit after we fit the model just to see which rotation gives the best interpretation.
Name two types of rotation and a key feature of both
Orthogonal rotation:
β’ The factors remain uncorrelated
Oblique rotation:
β’ The factors are correlated after rotation
Name the main R functions associated with each type of rotation
Orthogonal rotation:
- varimax in R
Oblique rotation:
- promax in R
In psychology, what type of rotation is typically used and why?
In psychology we can mostly assume factors are correlated, therefore oblique rotation (promax) is typically used. It's hard to argue for using uncorrelated factors in psychology.
Write a line of code to carry out a factor analysis on the correlation matrix βosbourne_corβ. Describe what the arguments do
library(psych)
fa(r = osbourne_cor, nfactors = 1, n.obs = 477, fm = 'ml', rotate = 'none')
nfactors = 1 fits a one-factor model, fm = 'ml' requests maximum likelihood estimation, and rotate = 'none' requests no rotation (never a good idea)
fa(r = osbourne_cor, nfactors = 2, n.obs = 477, fm = 'ml', rotate = 'varimax')
Fits a two-factor model with orthogonal rotation in R; this doesn't really make sense since it assumes the factors are uncorrelated
You can then tweak nfactors and observe the RMSEA to see which number of factors fits best (see the sketch after these examples)
fa(r = osbourne_cor, nfactors = 2, n.obs = 477, fm = 'ml', rotate = 'promax')
Fits a two-factor model with oblique rotation in R; this makes the most sense
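The nfactors/RMSEA comparison mentioned above could look roughly like this (assuming the fa output stores the RMSEA estimate in its $RMSEA element, as in recent psych versions):
for (m in 1:3) {
  f <- fa(r = osbourne_cor, nfactors = m, n.obs = 477, fm = 'ml', rotate = 'promax')
  cat(m, 'factor(s): RMSEA =', round(f$RMSEA[1], 3), '\n')
}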