Factor Analysis Flashcards

1
Q

What are you trying to do in a factor analysis?

A

Think of it as building a table: solid, dependable, something everyone uses and finds easy to understand. Look closer, though, and a table is hard to construct - you have to work out the legs, what it is made of, etc.

Your first attempt won’t be very good

2
Q

How do you make a questionnaire?

A

Items: the part people interact with - check that they are good quality

Factors: the structure that holds up the items - you want only a few, but they should provide the main support

3
Q

What is the main idea?

A

Reduce a large number of variables to a smaller set of representative, meaningful variables while keeping as much information as possible - identify factors from a large set of correlated items

4
Q

What do you get out of a FA?

A

A set of statistically identified factors - clusters of items that all measure the same characteristic - which you can use as variables in future analyses

5
Q

What are factors?

A

Clusters of items which all measure the same characteristic

6
Q

What is the goal?

A

Identify how many factors you have and what characteristic those factors each represent

7
Q

What are the stages of FA?

A
  1. Identify variables and design
  2. Check data and assumptions
  3. Rotation
  4. Interpret the results
8
Q

Identify variables and design: what are the initial checks?

A

Check the data
is there any missing data?
what scale is the data measured on? - what does each end mean?
how many items and participants are there?

remove ppts who are incomplete cases or give invalid answers

9
Q

Checking data and assumptions: what are the initial checks?

A

Normality and standard deviations
check that all of the items are normally distributed
check SDs are between 0.5 and 1.5
identify the worst offenders - if all the data are skewed you can’t throw every item out, so identify the worst ones and consider excluding them
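
A minimal sketch of this screening step, assuming responses are stored as a plain dict of item name to list of scores. The item names, the data, and the helper `screen_items` are hypothetical, and skewness is used here only as a rough stand-in for a normality check.

```python
import statistics

def screen_items(responses, sd_low=0.5, sd_high=1.5):
    """Return {item: (sd, skew, sd_ok)} for each item's list of scores."""
    report = {}
    for item, scores in responses.items():
        n = len(scores)
        mean = statistics.fmean(scores)
        sd = statistics.stdev(scores)  # sample SD (divides by n - 1)
        # Moment-based skewness: 0 means a symmetric distribution
        m2 = sum((x - mean) ** 2 for x in scores) / n
        m3 = sum((x - mean) ** 3 for x in scores) / n
        skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
        report[item] = (sd, skew, sd_low <= sd <= sd_high)
    return report

# Hypothetical 5-point Likert responses
responses = {
    "q1": [1, 2, 3, 4, 5, 3, 2, 4],
    "q2": [5, 5, 5, 5, 5, 5, 5, 4],  # almost no spread, strongly skewed
}
report = screen_items(responses)
for item, (sd, skew, sd_ok) in report.items():
    print(item, round(sd, 2), round(skew, 2), "ok" if sd_ok else "worst-offender candidate")
```

Items that fail the SD range or show strong skew are only candidates for exclusion; the decision still needs justifying.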

10
Q

What should the SDs be?

A

Between 0.5 and 1.5

11
Q

Checking data and assumptions: what are the second checks?

A

Correlations
Sphericity
Sampling adequacy

12
Q

Checking the correlations

A

You get a massive correlation matrix - in FA you expect the items to correlate, because we are looking for underlying factors that explain groups of items, so we want correlations

13
Q

What are the possible problems with correlations?

A
  1. items that don’t correlate with anything else - might indicate that the item doesn’t measure the construct, so it is not valid
    look for items with r < .3 or p > .05 (not just for one correlation, it has to be for many)
  2. items that correlate too highly - too much overlap, they are measuring the same thing, so not valid
    r > .9 (singularity is the extreme case, r = 1)
    causes problems with multicollinearity

Check the determinant - should be greater than 0.00001 - if so, no problems with multicollinearity
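
The three checks above can be sketched with NumPy on a correlation matrix. The matrix below is made up for illustration and `screen_correlations` is a hypothetical helper; the cut-offs (.3, .9, 0.00001) are the ones on this card.

```python
import numpy as np

def screen_correlations(R, low=0.3, high=0.9, min_det=1e-5):
    """Flag weak items, overly high pairs, and a too-small determinant."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    off = np.abs(R)
    np.fill_diagonal(off, 0.0)  # ignore each item's correlation with itself
    # Weak item: correlates below `low` with every other item
    weak_items = [i for i in range(p) if off[i].max() < low]
    # Overlapping pair: correlation above `high`
    high_pairs = [(i, j) for i in range(p) for j in range(i + 1, p) if off[i, j] > high]
    det = float(np.linalg.det(R))
    return {"weak_items": weak_items, "high_pairs": high_pairs,
            "determinant": det, "det_ok": bool(det > min_det)}

# Hypothetical correlation matrix for four items (indices 0-3)
R = [[1.00, 0.95, 0.50, 0.10],
     [0.95, 1.00, 0.50, 0.10],
     [0.50, 0.50, 1.00, 0.20],
     [0.10, 0.10, 0.20, 1.00]]
result = screen_correlations(R)
print(result)
```

Here item 3 correlates weakly with everything and items 0 and 1 overlap almost completely, yet the determinant still clears the cut-off, so the checks are best read together.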

14
Q

What should you do with correlations?

A

Identify the worst offenders - if there are lots of them you can’t discard them all, so pick the ones which fail to correlate with many other items; if an item fails to correlate with just one other item, that is fine. Report any exclusions with justification

At this point, run the analysis with the final set of items

15
Q

How do you check for suitability of the data?

A

Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy
Do you have a sufficient sample to extract the factors?
Values range between 0 (inappropriate) and 1 (go for it)
marvellous - bigger than .9
middling - bigger than .7
miserable - bigger than .5
if below .5, you should stop and collect more data or do something else

Report it and cite the data
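
As a rough sketch, the overall KMO can be computed from the correlation matrix by comparing squared correlations with squared partial correlations (taken from the inverse matrix). The example matrix is made up, `kmo` and `kmo_label` are hypothetical helpers, and the labels follow the bands on this card.

```python
import numpy as np

def kmo(R):
    """Overall KMO: sum r^2 / (sum r^2 + sum partial r^2), off-diagonal only."""
    R = np.asarray(R, dtype=float)
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)  # partial correlations between item pairs
    mask = ~np.eye(R.shape[0], dtype=bool)  # select off-diagonal cells
    r2 = (R[mask] ** 2).sum()
    q2 = (partial[mask] ** 2).sum()
    return float(r2 / (r2 + q2))

def kmo_label(value):
    """Bands as listed on this card."""
    if value > 0.9:
        return "marvellous"
    if value > 0.7:
        return "middling"
    if value > 0.5:
        return "miserable"
    return "stop"

# Hypothetical matrix: four items that all correlate .5 with each other
R = np.full((4, 4), 0.5)
np.fill_diagonal(R, 1.0)
value = kmo(R)
print(round(value, 3), kmo_label(value))
```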

16
Q

How do you check for the sphericity of the data?

A

Using Bartlett’s test - checks whether the correlations are too small for FA
if everything is okay, this test will be significant
Reporting: χ²(df) = chi-square value, p = p value
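
A minimal sketch of the test statistic, using the usual formula chi-square = -(n - 1 - (2p + 5)/6) * ln|R| with df = p(p - 1)/2, where p is the number of items and n the number of participants. The matrix and sample size are made up and `bartlett_sphericity` is a hypothetical helper; the p value would then come from a chi-square distribution with that df.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Return (chi_square, df) for Bartlett's test on correlation matrix R."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)  # ln|R|, computed stably
    chi_square = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return float(chi_square), df

# Hypothetical matrix: four items that all correlate .5, from 100 participants
R = np.full((4, 4), 0.5)
np.fill_diagonal(R, 1.0)
chi_square, df = bartlett_sphericity(R, n=100)
print(f"chi-square({df}) = {chi_square:.2f}")
```

With an identity matrix (no correlations at all) ln|R| is 0, so the statistic is 0 and the test cannot be significant, which is the situation the card warns about.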

17
Q

Interpreting the results: what do we do here?

A

Extraction - how many factors do we have, which items belong with each factor, and what do the factors represent?

18
Q

What is extraction?

A

Deciding how many factors best capture our data - we want parsimony, explaining as much variance as we can with as few factors as possible - don’t want to lose much data

19
Q

What is Kaiser’s criterion for extraction?

A

Automatically extracts factors with eigenvalues bigger than 1

20
Q

What is an eigenvalue?

A

The variance in all the variables accounted for by a particular factor
If it is low, the factor doesn’t explain much - it can be disregarded
A measure of how useful a factor is - each factor has its own eigenvalue - it measures how much of the table’s weight each leg holds up; if a leg isn’t holding much up, we can get rid of it and not lose much
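
To make this concrete, a short sketch that takes the eigenvalues of a made-up correlation matrix and counts how many exceed 1. It assumes eigenvalues of the correlation matrix, as in a principal components style extraction; the matrix is hypothetical.

```python
import numpy as np

# Hypothetical matrix: items 0-1 form one cluster, items 2-3 another
R = np.array([[1.0, 0.6, 0.1, 0.1],
              [0.6, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.6],
              [0.1, 0.1, 0.6, 1.0]])

# eigvalsh returns ascending order for a symmetric matrix; reverse to largest first
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Count the factors whose eigenvalue is bigger than 1
n_kaiser = int((eigenvalues > 1).sum())

print(np.round(eigenvalues, 2), "-> retain", n_kaiser)
```

There is one eigenvalue per item and together they sum to the number of items, so an eigenvalue above 1 means the factor holds up more than a single item's worth of variance.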

21
Q

What are the initial eigenvalues?

A

They tell you how many factors you start with - one factor for each variable

22
Q

Is Kaiser’s criterion always okay to use to extract factors?

A

No, it is only reliable under certain circumstances

23
Q

What are the circumstances in which you can use Kaiser’s criterion?

A

There are fewer than 30 variables and all communalities are bigger than 0.7
or
There are more than 250 participants and the average communality is bigger than or equal to 0.6
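
The two conditions can be written as a small check. This is a sketch with a hypothetical helper name and made-up inputs, using the thresholds exactly as stated on this card.

```python
import statistics

def kaiser_is_reliable(n_variables, n_participants, communalities):
    """True if either condition for trusting Kaiser's criterion holds."""
    # Condition 1: fewer than 30 variables and every communality above 0.7
    few_variables = n_variables < 30 and min(communalities) > 0.7
    # Condition 2: more than 250 participants and mean communality of at least 0.6
    large_sample = n_participants > 250 and statistics.fmean(communalities) >= 0.6
    return few_variables or large_sample

# Hypothetical communalities after extraction
print(kaiser_is_reliable(3, 50, [0.80, 0.75, 0.90]))   # condition 1 holds
print(kaiser_is_reliable(3, 300, [0.50, 0.70, 0.75]))  # condition 2 holds
print(kaiser_is_reliable(2, 100, [0.40, 0.50]))        # neither holds
```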

24
Q

What is a communality?

A

The proportion of variance in a variable explained by all of the factors together - after extraction, some information is lost
Communality after extraction - variance in each variable explained by the retained factors
Higher - the factor structure better explains the variance in the variables - bigger is better
e.g. a communality of .67 for item 3 means 67% of its variance is explained by all of the factors
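
As an illustration, with uncorrelated factors an item's communality is the sum of its squared loadings across the retained factors. The loadings below are made up and `communalities` is a hypothetical helper.

```python
def communalities(loadings):
    """Sum of squared loadings per item (one row per item, one column per factor)."""
    return [sum(value ** 2 for value in row) for row in loadings]

# Hypothetical loadings for three items on two retained factors
loadings = [
    [0.80, 0.20],
    [0.30, 0.70],
    [0.50, 0.50],
]
result = communalities(loadings)
for i, h2 in enumerate(result, start=1):
    print(f"item {i}: {h2:.2f} -> {h2:.0%} of variance explained")
```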

25
Q

What do we do if Kaiser’s criterion is unreliable?

A

If you have more than 200 ppts, you can use a scree plot to decide the number of factors

26
Q

What are we looking for in a scree plot?

A

The inflexion point: where the slope changes
It is the point where the steep drop levels off - count the factors to the left of the point of inflexion - don’t keep the factor at the point where it changes
Open to interpretation - fine as long as you explain your decision
Can rerun the analysis with a fixed number of factors

27
Q

How can you tell how much variance our factors explain?

A

Look at: total variance explained table
Extraction sums of squared loadings - in the row for the last retained factor (e.g. factor 6), there will be a cumulative percentage showing what all the factors explain together
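
The cumulative percentage can be reproduced by hand. This sketch assumes each retained factor's variance is its eigenvalue and that the total variance equals the number of items (standardised items); the eigenvalues are made up and `variance_explained` is a hypothetical helper.

```python
from itertools import accumulate

def variance_explained(eigenvalues, n_items):
    """Return (% of variance per factor, cumulative %) for the retained factors."""
    pct = [100 * ev / n_items for ev in eigenvalues]
    return pct, list(accumulate(pct))

# Hypothetical: two retained factors from a 4-item questionnaire
pct, cumulative = variance_explained([1.8, 1.4], n_items=4)
for i, (p, c) in enumerate(zip(pct, cumulative), start=1):
    print(f"factor {i}: {p:.1f}% (cumulative {c:.1f}%)")
```

The last cumulative value is the figure to report as the variance the factors explain together.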

28
Q

What do you do after deciding the number of factors?

A

Rotation

29
Q

What is rotation?

A

It optimises how the items load onto a factor - it should equalise the variance explained of each factor, so they all explain similar amounts

Aids and clarifies the interpretation - doesn’t change the number of factors or affect the method of extraction

30
Q

What are the two types of rotation?

A

Orthogonal - factors are uncorrelated, independent of each other
Oblique - factors correlate; use when there are theoretical grounds for thinking they will correlate

31
Q

What do you use if you believe the factors are independent of each other?

A

Orthogonal rotation

Varimax

32
Q

What do you use if you believe the factors are correlated?

A

Oblique rotation

Direct oblimin

33
Q

How do you decide which rotation to use?

A

Look at previous research - see what other questionnaires have done
Think about your factors - do you think they will correlate?

Make a choice based on your own research and judgement - and explain why you have made this decision

34
Q

What does rotation actually do?

A

Spreads the variance more evenly among the factors - the total variance explained is the same, but the eigenvalues have changed so that the variance is better distributed - each factor explains a similar amount of variance rather than one factor explaining loads
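
A sketch of an orthogonal (varimax) rotation in NumPy, to show that rotation redistributes variance between factors without changing the total. This is a common SVD-based iteration written from scratch, not the routine any particular stats package uses, and the unrotated loadings are made up.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a loading matrix (items x factors) with an orthogonal varimax rotation."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        # Gradient-like target of the varimax criterion
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(L.T @ target)
        rotation = u @ vt  # nearest orthogonal matrix, so factors stay uncorrelated
        if s.sum() < previous * (1 + tol):
            break  # criterion has stopped improving
        previous = s.sum()
    return L @ rotation

# Hypothetical unrotated loadings: six items, two factors
unrotated = np.array([[0.8, 0.3], [0.7, 0.4], [0.6, 0.5],
                      [0.5, -0.5], [0.6, -0.4], [0.7, -0.3]])
rotated = varimax(unrotated)

print("variance per factor before:", np.round((unrotated ** 2).sum(axis=0), 2))
print("variance per factor after: ", np.round((rotated ** 2).sum(axis=0), 2))
print("total before and after:    ", round((unrotated ** 2).sum(), 2), round((rotated ** 2).sum(), 2))
```

Because the rotation matrix is orthogonal, each item's communality and the total variance explained are unchanged; only the split between factors moves.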

35
Q

How do you identify and name the factors?

A

Look at the items that load onto each factor - the number is the loading; the higher the loading, the stronger the association with that factor
Name the factors yourself
Negative loadings - because some of the questions were worded negatively
Sometimes there are cross-loadings - items which load onto more than one factor - allocate the item either to the factor with the higher loading or to the one that makes the most sense
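
A minimal sketch of the allocation rule: assign each item to the factor with its highest absolute loading and flag cross-loadings for a judgement call. The loadings, the item names, the helper `assign_items`, and the 0.4 cut-off for counting a loading as notable are all assumptions for illustration.

```python
def assign_items(loadings, cross_threshold=0.4):
    """Return {item: (factor_index, is_cross_loaded)} from rotated loadings."""
    result = {}
    for item, row in loadings.items():
        strengths = [abs(value) for value in row]  # sign reflects wording, not strength
        best = strengths.index(max(strengths))
        cross = sum(s >= cross_threshold for s in strengths) > 1
        result[item] = (best, cross)
    return result

# Hypothetical rotated loadings on two factors
loadings = {
    "q1": [0.72, 0.10],
    "q2": [-0.65, 0.05],  # negatively worded item
    "q3": [0.45, 0.52],   # cross-loads on both factors
    "q4": [0.08, 0.80],
}
assignment = assign_items(loadings)
for item, (factor, cross) in assignment.items():
    note = " (cross-loading: check which factor makes most sense)" if cross else ""
    print(f"{item} -> factor {factor + 1}{note}")
```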