FA; Lec 3 & 4; Lab 2 & 3 Flashcards
Give an example of CFA.
Given a set of data you could determine which factor theory of personality best represents the data
In the FA output, what should you be able to tell from the second column in the table ‘Total Variance Explained’?
How much of the variance is explained by the factors with an Eigenvalue above 1
What is a problem with the Kaiser Guttman criterion?
It is sensitive to the number of items. Therefore, an increase in items = increase in eigenvalue.
The Kaiser Guttman criterion needs to fulfil one of two criteria to be valid - what are they?
- Either there must be <30 variables and ALL communalities >.7
OR
- The sample size must be greater than 250 and AVERAGE communality must be >.6
How does the Kaiser Guttman criterion work?
Generated factors with eigenvalues above 1 are retained as real factors.
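As a rough illustration of the criterion, the sketch below counts the eigenvalues above 1 in a correlation matrix; those are the factors the criterion keeps. The 4-item matrix `R` is hypothetical (values invented for illustration), and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical 4-item correlation matrix (illustrative values only).
R = np.array([
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
])

# Eigenvalues of a symmetric matrix, sorted largest first.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Kaiser Guttman criterion: keep factors whose eigenvalue exceeds 1.
n_retained = int(np.sum(eigenvalues > 1.0))

print("Eigenvalues:", np.round(eigenvalues, 3))
print("Factors kept:", n_retained)
```

With a correlation matrix the eigenvalues sum to the number of items, which is why adding items tends to push more eigenvalues over the cut-off.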
One purpose of FA is to show how many distinct common factors are measured by a set of test items - give an example of this.
Are the supposed different constructs: neuroticism, anxiety, hysteria, ego strength, self-actualisation, and locus of control, 6 independent entities or would they be better described as only 2 factors? (Elements of pathology: neuroticism, anxiety and hysteria; Healthy mechanisms: ego-strength self-actualisation and locus of control)
Rotation has no impact on the overall variance explained - why do we do it?
Because we are searching for a simple structure and it helps us with this; it moves loadings around and cleans up the output. This aids our interpretation of the latent constructs.
What is the most common orthogonal rotation?
Varimax
One purpose of FA is to determine whether tests that purportedly measure the same thing in fact do so - give an example of this.
3 tests that claim to measure anxiety - FA may produce more than one factor indicating something in addition to anxiety is being measured.
How do you conduct PCA (a type of EFA) with a correlation matrix and Varimax rotation in SPSS?
Analyse –> Dimension Reduction –> Factor –> Move all variables to ‘Variable’ box –> Extraction –> Scree plot –> Deselect ‘Unrotated Factor Solution’ –> Continue –> Descriptives –> KMO and Bartlett’s test of sphericity –> Coefficients (for correlation matrix) –> Rotation –> Varimax –> Continue
How do you interpret factors?
You use the factor loadings - anything >.3/>.32
Why does the button ‘Eigenvalues over’ automatically become deselected when you indicate how many factors you want to be selected from the ‘Extraction’ dialogue box?
Because you aren’t using Eigenvalues anymore, you are forcing the result into a specific number of factors.
What is an identity matrix?
When the R-matrix has no correlations, i.e. all off-diagonal correlations are 0
If it is debatable whether, for example, a 2 or 3 factor solution makes more sense, what should you do?
Report the results of both interpretations and then follow one based on theoretical disposition. Since all the results will be reproduced, enough information is available for someone with a different theoretical disposition to interpret the data in an alternative fashion.
In the FA output, what does the table labelled ‘communalities’ tell us?
The first reads ‘Initial’ and indicates from a theoretical position that the communality of any item is potentially one.
The second, labelled ‘Extraction’ gives a different value for the communalities of the items after extraction has taken place.
For data to be suitable for FA, should Bartlett’s be significant or not?
Significant
What are the assumptions re variance of principal components analysis (PCA)?
- All variance explained by the factors
In these two questions:
1. What is the capital of Spain?
2. What is the capital of Italy?
What is the common factor
Geographical knowledge
How do you report Bartlett’s
χ²(df) = chi-square value, p < .05 (or p > .05 if not significant)
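The chi-square value and df in that report can be sketched as below, using the standard form of Bartlett's statistic, -(n - 1 - (2p + 5)/6) × ln|R| with df = p(p - 1)/2. The function name `bartlett_sphericity`, the example matrix and the sample size of 100 are all assumptions made for illustration. NumPy is assumed to be available.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Return (chi_square, df) for Bartlett's test of sphericity.

    R is a p x p correlation matrix, n is the number of participants.
    """
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, log_det = np.linalg.slogdet(R)          # natural log of |R|
    chi_square = -(n - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return chi_square, df

# Hypothetical 3-item correlation matrix and sample size (illustration only).
R_example = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
chi_square, df = bartlett_sphericity(R_example, n=100)
print(f"chi-square({df}) = {chi_square:.2f}")
```

An identity matrix has a determinant of 1, so ln|R| = 0 and the statistic is zero. The p value would come from a chi-square distribution with `df` degrees of freedom, for example via `scipy.stats.chi2.sf(chi_square, df)` if SciPy is installed.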
Are factor loadings with PAF going to seem less or more impressive than with PCA?
PAF = less impressive factor loadings, because it allows for specific variance
What must KMO value be for data to be suitable for FA?
Above 0.5
What are the 4 parametric assumptions?
- Must be continuous
- Variables must be normally distributed and outliers must have been appropriately dealt with
- Relationship between all variables appear to be linear, or at least not U-shaped or J shaped
- All variables must be independent
How do you estimate communality for PCA and PAF?
- PCA - it is assumed to be 100% and therefore there is no estimation required
- With PAF there is no agreed way to do this
Whilst there are 7 steps to conducting PCA, this can be simplified to 3, what are they?
- Determine the suitability of the data
- Factor extraction
- Rotation and interpretation
One purpose of FA is to check the psychometric properties of a questionnaire - give an example of this.
Would a different population, e.g. a Chinese sample, yield the constructs of extraversion-introversion and neuroticism which have been found in European cultures?
Note: this would need to be done through confirmatory factor analysis
When conducting FA you should look at the R-matrix for two potential problems, what are they?
- Correlations are too low - variables with lots of correlations below .3
- Correlations are too high - above .9 for two variables (above .6 among many variables is also possibly a problem)
specific variance
variance that cannot be explained by the factors - fluke knowledge (e.g. knowing the capital of Spain because you went there, but not actually having good geographical knowledge)
If you decide that the weakest factor is not worth retaining, how do you get rid of it in your SPSS output?
Go to the Factor Analysis ‘Extraction’ dialogue box and in the Box labelled ‘Number of Factors’ type in the number of factors you want to retain.
How do you read a scree plot?
Going from left to right draw the first straight line that shows the data leveling off (elbow).
No. of factors above the line = number of factors to be retained
In the FA output, what does the Rotated Component Matrix tell us?
It shows which items load heavily onto which factors based on the rotated solution
Aside from parametric assumptions, how do you know if data are suitable for EFA?
- There must be at least some correlations in the matrix that are above .3
- There must be at least 100 participants and more participants than items; although partly a function of the ‘strength’ of the data
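The first check above can be sketched as a count of item pairs whose correlation exceeds .3 in magnitude. The function name `count_correlations_above` and the small matrix are hypothetical, chosen only to illustrate the idea.

```python
def count_correlations_above(R, threshold=0.3):
    """Count distinct item pairs whose absolute correlation exceeds the threshold."""
    count = 0
    for i in range(len(R)):
        for j in range(i + 1, len(R)):   # upper triangle only, skipping the diagonal
            if abs(R[i][j]) > threshold:
                count += 1
    return count

# Hypothetical 3-item correlation matrix (illustrative values only).
R = [
    [1.00, 0.45, 0.10],
    [0.45, 1.00, -0.35],
    [0.10, -0.35, 1.00],
]
n_pairs = count_correlations_above(R)
print("Pairs above .3 in magnitude:", n_pairs)
```

Negative correlations count as well, since it is the size of the relationship that matters here, not its direction.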
What is an eigenvalue?
Therefore, what does a scree plot indicate?
An indication of the amount of variance explained by any one factor.
The scree plot graphically indicates how much variance is explained by the removal of successive possible factors.
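To make the link between eigenvalues and variance explained concrete: when the analysis is run on a correlation matrix, each item contributes one unit of variance, so a factor's share is its eigenvalue divided by the number of items. The eigenvalues below are hypothetical, chosen only so that they sum to the number of items.

```python
# Hypothetical eigenvalues for a 4-item analysis (they sum to 4).
eigenvalues = [2.0, 1.2, 0.5, 0.3]
n_items = len(eigenvalues)

# Percentage of total variance explained by each factor.
percent = [100 * ev / n_items for ev in eigenvalues]

# Running (cumulative) percentage, as in 'Total Variance Explained'.
cumulative = []
running = 0.0
for pct in percent:
    running += pct
    cumulative.append(running)

for number, (ev, pct, cum) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    print(f"Factor {number}: eigenvalue {ev:.2f}, {pct:.1f}% of variance, {cum:.1f}% cumulative")
```

Plotting the eigenvalues against factor number gives the scree plot itself.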
What is the simplest form of structural equation modelling?
CFA
What is the most common oblique rotation?
Direct Oblimin
What does factor rotation do?
It changes the position of the factors to ease interpretation. Each factor should have some large loadings and some small ones (simple structure). Large numbers of mediocre loadings should be avoided.
FA will produce a correlation matrix. However, if we want to run this in SPSS beforehand how would we do it?
Analyse –> Correlate –> Bivariate –> Move the variables we are interested in correlating to the ‘Variables’ box –> OK
What is the difference between exploratory factor analysis and confirmatory factor analysis?
EFA seeks to determine the number and nature of factors which underpin a set of data; while CFA allows you to choose between alternative hypotheses which purport to represent your data.
What are the 4 basic purposes of FA?
- To show how many distinct common factors are measured by a set of test items
- Shows which items relate to which common factors
- Determines whether tests that purportedly measure the same thing in fact do so
- Checks the psychometric properties of a questionnaire - with a different sample, do the same factors materialise? (CFA only)
In the FA output, what should you be able to tell from the third column (Rotation Sums of Squared loadings) in the table ‘Total Variance Explained’?
This column identifies the percentage of variance which each of the rotated factors now explains.
A person’s score on a factor can be calculated from what?
Their responses to items that load onto that factor. Items with greater loadings have higher weighting. You can then use factor scores for subsequent tests.
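A deliberately simplified sketch of this idea follows. It weights a person's standardised item responses by the item loadings for one factor. This is an assumption made for illustration: the weights a statistics package actually uses are score coefficients derived from the loadings, not the raw loadings themselves. The function name and all values are hypothetical.

```python
def weighted_factor_score(z_scores, loadings):
    """Weighted sum of one person's standardised item responses for one factor."""
    if len(z_scores) != len(loadings):
        raise ValueError("Need one loading per item response")
    return sum(w * z for w, z in zip(loadings, z_scores))

# Hypothetical loadings of three items on one factor.
loadings = [0.8, 0.7, 0.1]
# Hypothetical standardised responses from one person.
person = [1.0, 0.5, -2.0]

score = weighted_factor_score(person, loadings)
print(f"Factor score: {score:.2f}")
```

The third item has a small loading, so even an extreme response on it moves the score very little.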
If you expect your factors to be uncorrelated which kind of rotation should you use?
orthogonal - look at the rotated component matrix
Why do we usually use the Kaiser criterion to identify factors, and not the Jolliffe criterion?
Jolliffe retains too many factors
In the FA output where do you look for the rotated solution?
‘Rotated component matrix’
What is a scree test?
It is based on eigenvalues of an unrotated solution
What is a simple structure?
When a factor only has substantial loadings on a few items
How does Bartlett’s test of sphericity tell you if the R-matrix is an identity matrix?
If it is an identity matrix, all correlations will be zero.
A significant result indicates that the R-matrix is not an identity matrix.
What is the difference between principal axis factoring and principal components analysis?
Principal components analysis (PCA) does not discriminate between common and specific variance, while principal axis factoring (PAF) does.
SPSS sometimes produces strange numbers, if it produces a number with an E and a minus number after it e.g. 1.24E-02, what should you do?
Move the decimal place to the left the stipulated number of places. In this instance: 0.0124
Large correlations between factors should be considered as what?
suspect
Is there much difference between PCA and PAF?
They seem to produce very similar results; so much so that some researchers do not identify which one they are carrying out (although not good practice).
However, since PAF allows for specific variance then an item’s communality is necessarily going to be less than one; therefore, factor loadings for items are going to appear less impressive with PAF as opposed to PCA.
SPSS sometimes produces strange numbers, if it produces a number with an E and a positive number after it e.g. 1.24E02, what should you do?
Move the decimal place to the right the stipulated number of places. In this instance: 124.00
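The same conversions can be checked in Python, which reads E notation directly. This is only a convenience sketch using the two example values from these cards.

```python
# E followed by a minus number: decimal point moves left.
small = float("1.24E-02")
# E followed by a positive number: decimal point moves right.
large = float("1.24E02")

small_text = f"{small:.4f}"   # fixed-point with 4 decimal places
large_text = f"{large:.2f}"   # fixed-point with 2 decimal places

print(small_text)
print(large_text)
```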
Once you have your FA output, what 6 things should you look for to know that the data is suitable for FA?
- Correlation matrix - some of the variables must be above .3
- Bartlett’s test - if it is significant this means it is different from data selected at random
- KMO - The least you should accept is .5 (.9 is superb)
- Communalities - you want some above .4
- The relationship between variables should be largely linear. But FA is robust so as long as it isn’t J or U shaped you are okay
- Scree plot - interpret left to right, looking for the elbow, to get an initial idea of the no. of factors
What are two formal tests to assess data suitability for FA?
- Kaiser-Meyer-Olkin (KMO) is a measure of sampling adequacy and addresses if the sample is big enough
- Bartlett’s test of sphericity - measures if the R-matrix is an identity matrix
How would you clean up your component matrix if there were too many small values and it is hard to read?
Go to the Factor Analysis ‘Options’ and ‘suppress small coefficients’
This will get rid of values below .3
Correlations detected from Bartlett’s are unlikely to be…?
The product of chance
In the FA output, what does the Component Matrix tell us?
It shows initial correlations between the items and the proposed factors.
NOTE: Usually we don’t use this
When you report factor analysis what 5 things should you include?
- Table of factor loadings
- Extraction
- Rotation methods
- Rationale for the number of factors
- Bartlett’s and KMO
What are the two questions that need to be asked in determining the suitability of a set of data for FA?
- Is the sample big enough?
- Are there at least some correlations between items in the sample (above .3 in magnitude) that would suggest that there might be underlying factors that could summarise some of the items used?
1. What are the two types of exploratory factor analysis?
2. How are they different from one another?
- Principal components analysis and Principal axis factoring
- They make different assumptions regarding unexplained variance
When reporting FA what eight details should you include?
- packages used (e.g. SPSS)
- Presence of correlations with values .3 and above
- KMO
- Bartlett’s
- Inspection of scree plot (sometimes)
- Number of factors with Eigenvalues exceeding 1 and how much variance (%) each one explains
- Justification for removal of any factors and how much variance is explained by each of remaining factors
- What kind of rotation was performed and whether this yielded a simple structure
When factors are uncorrelated and communalities are moderate, what does PCA produce?
Inflated values of variance accounted for by the components
In the FA output, the table Total Variance Explained has three columns: ‘Initial Eigenvalues’, ‘Extraction Sums of Squared Loadings’ and ‘Rotation Sums of Squared Loadings’. What is the difference between the first two?
The second column ignores potential factors that have an eigenvalue below 1
According to PCA, what is ‘total variance’ equal to?
Total variance = common factor variance + measurement error
What does variables being independent mean?
That they cannot be calculated from other variables - e.g. if item A was height and B was weight, then it would be inappropriate for C to be a height to weight ratio since it would necessarily be correlated to both A and B
What does varimax do?
Tries to equalise variance across all the factors
What are 4 possible errors in factor analysis?
- Interpreting the unrotated solution (SPSS spits this out by default)
- Applying rigid rules to the extraction of factors (KG vs scree method)
- Replication is very important
- Factor validity is not attested to only by item content (face validity), it must also be compared with some other measure
When you rotate a solution what changes? What doesn’t?
The communality of each variable remains the same.
The eigenvalues of the individual factors do not remain the same.
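This can be checked numerically with a small sketch of an orthogonal rotation of a two-factor solution. The loadings and the 40 degree angle are hypothetical, chosen only for illustration. A real varimax rotation would choose the angle by an optimisation criterion, which this sketch does not attempt.

```python
import math

# Hypothetical unrotated loadings: one row per item, one column per factor.
loadings = [
    [0.7, 0.5],
    [0.6, 0.6],
    [0.6, -0.5],
    [0.7, -0.4],
]

def rotate(matrix, degrees):
    """Apply a rigid (orthogonal) rotation to a two-factor loading matrix."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[a * c + b * s, -a * s + b * c] for a, b in matrix]

def communalities(matrix):
    """Sum of squared loadings per item (row)."""
    return [sum(x * x for x in row) for row in matrix]

def factor_variances(matrix):
    """Sum of squared loadings per factor (column)."""
    return [sum(row[k] ** 2 for row in matrix) for k in range(len(matrix[0]))]

rotated = rotate(loadings, 40)

print("Communalities before:", [round(h, 3) for h in communalities(loadings)])
print("Communalities after: ", [round(h, 3) for h in communalities(rotated)])
print("Factor variances before:", [round(v, 3) for v in factor_variances(loadings)])
print("Factor variances after: ", [round(v, 3) for v in factor_variances(rotated)])
```

The row sums stay the same and the column sums shift, while their total is unchanged. This is why rotation has no impact on the overall variance explained.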
What is regression and what is it used for?
It is a weighted sum of the factor loadings, used to calculate the factor scores.
What is a varimax rotation?
An orthogonal rotation - keeps the factors independent (uncorrelated) by holding the axes at 90 degrees to each other
If you expect your factors to be correlated which kind of rotation should you use?
Oblique - look at the pattern matrix
Why does the sample size have to be big enough?
Because FA is easily influenced by error variance
Which is the simplest form of FA and suitable for undergraduate level?
Principal components analysis (PCA)
According to principal components analysis (PCA), all items have a communality of what?
1
Therefore the factors will, between them, account for 100% of the variation among the items.
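A quick numerical check of this point: if every component is kept, the squared loadings of each item sum to 1, and the eigenvalues sum to the number of items. The 3-item correlation matrix is hypothetical and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical 3-item correlation matrix (illustrative values only).
R = np.array([
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
])

# eigh returns eigenvalues in ascending order, with eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(R)

# Component loadings: each eigenvector scaled by the square root of its eigenvalue.
loadings = eigenvectors * np.sqrt(eigenvalues)

# Communality of each item when ALL components are retained.
communalities_all = (loadings ** 2).sum(axis=1)

# Communality of each item when only the largest component is retained.
communalities_one = loadings[:, -1] ** 2

print("All components kept:", np.round(communalities_all, 3))
print("Largest component only:", np.round(communalities_one, 3))
```

Once components are dropped at extraction the communalities fall below 1, which is why the ‘Extraction’ column differs from the ‘Initial’ column in the communalities table.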
KMO is a measure of sampling adequacy and addresses if the sample is big enough. What are its ranges?
.5 minimum; .5-.7 mediocre; .7-.8 good; .8-.9 great; above .9 superb
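These bands can be written as a small lookup. The function name `kmo_label` is hypothetical, and the treatment of values that fall exactly on a boundary (counted in the higher band) is an assumption, since the bands above share their end points.

```python
def kmo_label(kmo):
    """Return the verbal label for a KMO value, using the bands on this card."""
    if kmo < 0.5:
        return "below minimum"
    if kmo < 0.7:
        return "mediocre"
    if kmo < 0.8:
        return "good"
    if kmo < 0.9:
        return "great"
    return "superb"

for value in (0.40, 0.65, 0.75, 0.85, 0.95):
    print(f"KMO = {value:.2f}: {kmo_label(value)}")
```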
What are the 7 stages of carrying out exploratory factor analysis?
- Ensure that data are suitable
- Decide on the model - PAF or PCA
- Decide how many factors are required to represent the data
- When using PAF, estimate the communality of each item
- Factor extraction
- Rotate the factors ensuring that simple structure has been reached
- Compute factor scores
According to principal axis factoring (PAF), what is ‘total variance’ equal to?
Total variance = common factor variance + specific item variance + measurement error
Is the scree plot alone enough to identify latent constructs?
No. FA should always be conducted in light of psychological theory - how many factors does it make sense to extract from a psychological point of view.