Week 4 Flashcards
What does PCA stand for?
Principal Component Analysis (PCA)
What essentially is Principal Component Analysis (PCA)?
Combining variables into weighted sums
What essentially is Factor Analysis
A technique for showing relationships among variables by relationships to hypothetical underlying factors aka LATENT VARIABLES
Principal Component Analysis (PCA) uses correlation as the underlying association among variables, but Factor Analysis does not, True/False?
FALSE
They both use correlations as the underlying thing
How would you characterise the difference between Principal Component Analysis (PCA) and Factor Analysis?
PCA looks for ‘optimal linear transformations’ - i.e. the weighted sums (linear combinations) of the variables that capture as much of the variance as possible.
Factor Analysis makes a theoretical assumption that there is an underlying latent variable that explains the observed variables
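Not part of the course (which uses SPSS), but here's a quick numpy sketch of the PCA side of this: the components really are just weighted sums of the standardised variables, with weights coming from the eigenvectors of the correlation matrix. All data and names here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))        # 200 cases, 3 observed variables
X[:, 1] = X[:, 1] + X[:, 0]          # make two of them correlate
R = np.corrcoef(X, rowvar=False)     # PCA here works on the correlation matrix

eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]           # biggest variance first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Component scores: each column is a weighted sum of the standardised variables
Z = (X - X.mean(axis=0)) / X.std(axis=0)
scores = Z @ eigenvectors
```

Note the eigenvalues sum to the number of variables (the trace of R) - which is why "eigenvalue > 1" rules even exist.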
What determines where you draw the line for the FIRST PRINCIPAL COMPONENT?
The line that minimises the perpendicular distances to each of the data points - equivalently, the line along which the data vary the most. Its direction is given by the first EIGENVECTOR of the correlation matrix.
What even is the SECOND PRINCIPAL COMPONENT?
Another weighted sum of the variables - one that is uncorrelated with the first component and captures as much as possible of the variance the first component left unexplained.
What is the name given to the VARIANCE of the FIRST PRINCIPAL COMPONENT?
The first EIGENVALUE
What do the SQUARED weights of the first principal component always add up to?
1 - the weights form a unit-length eigenvector, so the sum of their squares is 1
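A one-card demo of that constraint (a numpy sketch, not course material - the toy correlation matrix is made up): eigenvector routines return unit-length weight vectors, so the squared weights sum to 1.

```python
import numpy as np

# A toy correlation matrix, just to extract a first-component weight vector
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
vals, vecs = np.linalg.eigh(R)
w = vecs[:, np.argmax(vals)]            # weights of the FIRST principal component
sum_of_squares = float(np.sum(w**2))    # the quantity that always equals 1
```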
What is one of the rules for determining the SECOND PRINCIPAL COMPONENT?
It must have zero correlation with the FIRST principal component
What’s the limit to number of principal components you can have?
It is based on the number of variables you have.
Number of variables = max principal components
Is principal component analysis a MATHEMATICAL or a STATISTICAL technique? And why?
It’s mathematical
It’s based on matrix algebra
It doesn’t model error terms
According to Kaiser’s rule, when should you cut off the number of factors/components?
When the Eigenvalue drops below 1
Aka Kaiser-Guttman
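The rule itself is trivial to apply (probably why it became the default). A one-liner sketch with made-up eigenvalues:

```python
import numpy as np

# Hypothetical eigenvalues from some extraction
eigs = np.array([3.1, 1.4, 1.05, 0.80, 0.40, 0.25])
k_kaiser = int(np.sum(eigs > 1))   # Kaiser-Guttman: retain eigenvalues > 1
```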
Do we like the Kaiser-Guttman rule?
NO!
It tends to choose about a third of the number of variables, regardless of what the data look like.
Schmitt says it is “the most inaccurate” of all methods
What are the four methods for determining how many components/factors to use?
- Kaiser-Guttman (don’t use)
- Scree plot
- Parallel Test (the one that uses random data)
- MAP test (the Minimum Average Partial correlation test; Velicer, 1976)
What do you use the Parallel Test for?
Deciding how many factors/components to use.
When doing the Parallel Test, do you use the mean of the random data or the 95th percentile?
The 95th percentile - it’s a stricter cutoff than the mean, so you keep fewer spurious factors.
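SPSS doesn't do this one natively, so here's a sketch of the Parallel Test in numpy (function name and the simulated one-factor data are mine): keep a component only if its eigenvalue beats the 95th percentile of eigenvalues from same-sized random, uncorrelated data.

```python
import numpy as np

def parallel_analysis(X, n_iter=200, percentile=95, seed=0):
    """Count components whose eigenvalue beats the chosen percentile of
    eigenvalues obtained from random data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    real = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
    rand = np.empty((n_iter, p))
    for i in range(n_iter):
        noise = rng.normal(size=(n, p))
        rand[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    threshold = np.percentile(rand, percentile, axis=0)
    return int(np.sum(real > threshold))

# Simulated data with ONE common factor behind six items
rng = np.random.default_rng(1)
f = rng.normal(size=(300, 1))
X = f @ np.ones((1, 6)) + 0.5 * rng.normal(size=(300, 6))
k = parallel_analysis(X)   # should land on one retained component
```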
When examining your component loadings - I think it’s in a FACTOR LOADING MATRIX - it is common practice to remove any loadings that are less than 0.5, True/False?
FALSE
But close. You can remove any with a loading of less than 0.3.
This is called SUPPRESSING THE LOADINGS
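Suppression is purely cosmetic - the solution is untouched, small loadings are just blanked out of the displayed table. A tiny sketch with a made-up loading matrix:

```python
import numpy as np

loadings = np.array([[0.82,  0.10],
                     [0.75, -0.05],
                     [0.12,  0.68],
                     [0.28,  0.71]])
# Blank out (suppress) loadings below .3 in absolute value, for display only
display = np.where(np.abs(loadings) >= 0.3, loadings, np.nan)
```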
In factor analysis, after extracting your factors/components, you are going to want to do some ROTATION. What are the two types of rotation you can do
Orthogonal and Oblique
What assumption underpins ORTHOGONAL rotation?
The factors are UNcorrelated
What assumption underpins OBLIQUE rotation?
The factors ARE correlated
In psychology, are you more likely to do ORTHOGONAL or OBLIQUE rotation?
OBLIQUE
In other words, the one that allows the factors to correlate
When you do oblique rotation, you will get two matrices of loadings. What are they called?
The factor PATTERN matrix, and
the factor STRUCTURE matrix
What does the factor PATTERN matrix contain?
The regression coefficients (weights) for predicting each observed variable from the factors
What does the factor STRUCTURE matrix contain?
The correlations between each observed variable and each factor
Of the two matrices of leadings that we get upon completing a rotation, which one do we generally report?
The factor PATTERN matrix
If you do an OBLIQUE rotation, the PATTERN and the STRUCTURE matrix are the same thing, T/F?
FALSE
They are only the same thing if you do an ORTHOGONAL rotation
With FACTOR ANALYSIS, what is our aim in relation to PARTIAL CORRELATIONS between observed variables?
We want the PARTIAL CORRELATIONS to be as close to zero as possible
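A way to see this in code (a numpy sketch, not from the course - partial correlations here come from the inverse correlation matrix): if one common factor drives all the items, the raw correlations are big but the partial correlations shrink right down, because the shared variance explains the associations.

```python
import numpy as np

def partial_correlations(R):
    """Partial correlation of each pair, controlling for all the OTHER
    variables, via the inverse correlation (precision) matrix."""
    P = np.linalg.inv(R)
    d = np.sqrt(np.diag(P))
    Q = -P / np.outer(d, d)
    np.fill_diagonal(Q, 1.0)
    return Q

# Six items that all correlate .8, as if driven by one strong common factor
R = np.full((6, 6), 0.8)
np.fill_diagonal(R, 1.0)
Q = partial_correlations(R)
# Raw correlations are .8; the partials are far smaller
```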
In factor analysis, what is the name given to the VARIANCE that is die to the COMMON FACTORS?
The COMMUNALITY
In factor analysis, what is the opposite of COMMUNALITY?
The UNIQUE VARIANCE
… because communality refers to the variance in the observed factors that is due to the common factors.
Exploratory Factor Analysis works best with ordinal data, T/F?
FALSE
It assumes continuous data (ie Interval or ratio)
You can proceed, but you’ve just gotta note the limitations
Missing data is a big deal for Exploratory Factor Analysis, T/F
TRUE
Can’t have missing data. Either impute the data or delete the case.
Is Exploratory Factor Analysis sensitive to sample size?
Big time
If you have LOW CORRELATIONS between variables to begin with, is this a good or bad sign for Exploratory Factor Analysis?
Bad sign
Factor analysis is based on the assumption that there is some common thing going on, so low correlations means you might be on a hiding to nothing
How can you test whether your variables have a high enough correlation to run Exploratory Factor Analysis?
Yep, it’s called BARTLETT’S TEST (of sphericity), but we don’t like it
When doing Exploratory Factor Analysis, does linearity matter?
Totes
If you have low PARTIAL CORRELATIONS between variables to begin with, is this a good or bad sign for Exploratory Factor Analysis?
Good sign
Because low partial correlations mean the observed correlations are explained by what the variables share - i.e. common factors, which is exactly what EFA is looking for
When doing Exploratory Factor Analysis, do OUTLIERS matter?
Yup
Cos Pearson’s r isn’t robust to outliers
If you have CRAZY HIGH correlations (multicollinearity) between variables to begin with, is this a good or bad sign for Exploratory Factor Analysis?
Bad sign
You don’t want them too high or too low
What is Kaiser’s Measure of Sampling Adequacy about?
It’s an overall statistic that tells you how suitable your dataset is for factor analysis
It’s computed from the ANTI-IMAGE correlation matrix (which is based on partial correlations)
What is the minimum acceptable value for Kaiser’s Measure of Sampling Adequacy - that is, the score below which your dataset is unacceptable?
below .5
When you’re looking at an Anti-Image matrix following Kaiser’s Measure of Sampling Adequacy, what do we want the values ON the DIAGONAL to approach?
Approach 1
When you’re looking at an Anti-Image matrix following Kaiser’s Measure of Sampling Adequacy, what do we want the values OFF the DIAGONAL to approach?
Approach zero
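Tying the last few cards together, here's a sketch of the overall KMO/MSA statistic in numpy (function name mine; the standard form compares squared raw correlations to squared partial correlations off the diagonal):

```python
import numpy as np

def kmo_overall(R):
    """Overall Kaiser MSA/KMO from a correlation matrix: big when raw
    correlations dominate the partial (anti-image) correlations."""
    P = np.linalg.inv(R)
    d = np.sqrt(np.diag(P))
    Q = -P / np.outer(d, d)                 # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)
    r2 = float((R[off] ** 2).sum())
    q2 = float((Q[off] ** 2).sum())
    return r2 / (r2 + q2)

# Highly intercorrelated items -> very factorable -> KMO near 1
R = np.full((6, 6), 0.8)
np.fill_diagonal(R, 1.0)
msa = kmo_overall(R)
```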
What are the TWO major methods of finding factors in SPSS (the ones that Schmitt (2011) recommends)?
- Maximum likelihood (ML)
- Principal axis factoring (PA)
Under what circumstances would you use PRINCIPAL AXIS FACTORING (PA) when finding factors?
If your data is not normally distributed
Under what circumstances would you use MAXIMUM LIKELIHOOD (ML) when finding factors?
When you want to generalise beyond your sample to the population
Note: requires normal distribution
What does it mean when you see .999 in a factor loading table?
It’s a Heywood case
It signals an improper solution - an estimation error, not a real loading
What is the term given to when you weight each factor loading the same (hint: equivalently)?
Tau-equivalent
What are the three ways SPSS provides for estimating factor scores? And which is the one Geoff recommends, and why?
- a regression method
- the BARTLETT method (actually Maximum Likelihood or Weighted Least Squares)
- ANDERSON-RUBIN
Geoff recommends BARTLETT, because for oblique solutions ANDERSON-RUBIN is misleading: it forces the factor scores to be uncorrelated.
What are the SIX differences that Geoff highlights between PCA and FA?
- Principal Components = linear combinations of observed variables. Factors = theoretical entities.
- In FA, error is explicitly modelled. Not so for PCA.
- In FA, if factors are removed/added, the other factor loadings change. Not so for PCA
- FA is a theoretical modelling method (so can test model fit). Not so for PCA.
- FA ‘fragments’ variability into common and unique parts. Not so PCA
- PCA has a canonical algorithm that always works. FA has many, which need to be matched to the data
Do FA and PCA have the same general form?
Yes, and it goes like this:
Observed Variable = Loading x F + error
Where F is either a factor or a component
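That general form is easy to simulate (a numpy sketch, with a made-up loading of .7): if you scale the error so the observed variable stays standardised, the loading comes back as the correlation between the variable and F.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
F = rng.normal(size=n)                 # the latent factor (standardised)
loading = 0.7
# Observed variable = loading x F + error, with the error scaled so that
# the variable has variance ~1
x = loading * F + np.sqrt(1 - loading**2) * rng.normal(size=n)
est = float(np.corrcoef(x, F)[0, 1])   # recovers roughly the loading
```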
Do FA and PCA typically deliver similar results?
Yes, particularly if they’re applied to a large number of variables and a large sample size
What is the tip provided by a past tutor in the subject about when to use PCA and when to use FA
Use FA if you assume, or wish to test, a theoretical model of latent factors causing the observed variables
Use PCA if you simply want to reduce your correlated observed variables into a smaller set of important uncorrelated composite variables
With Varimax rotation, why would you not get both pattern and structure matrices?
Because Varimax is a form of ORTHOGONAL rotation, and orthogonal rotation keeps the factors perfectly uncorrelated, the pattern and structure matrices end up identical. Hence no need to produce both.
The Promax rotation technique is an OBLIQUE technique, which means factors are allowed to be correlated, thus it produces pattern AND structure matrices. But why?
The pattern matrix shows factor loadings
The structure matrix shows correlation between the variable and the factor.
The factor loadings in the pattern matrix PARTIAL OUT any common variation among factors
If you square and sum each of the factor loadings for a single VARIABLE, what do you get?
The COMMUNALITY for that variable
If you square and sum each of the factor loadings for a single FACTOR, what do you get?
The EIGENVALUE for that Factor
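These two cards are just row sums vs column sums of the same squared loading matrix. A sketch with a hypothetical (orthogonal, unrotated) solution:

```python
import numpy as np

# Hypothetical loading matrix: 4 variables (rows) x 2 factors (columns)
L = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.2, 0.6],
              [0.1, 0.7]])
communalities  = (L**2).sum(axis=1)   # square + sum along a ROW (per variable)
ssl_per_factor = (L**2).sum(axis=0)   # square + sum down a COLUMN (per factor)
```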
If your data is not normally distributed, should you be cautious about using EFA?
Nope, you just need to make sure you use the PRINCIPAL AXIS FACTORING extraction method
If you have as many items measured as you have participants, should you be cautious about using EFA
YES FFS
Some people say you need at least 20 participants per item
If your MSA is below 0.5 should you worry about using an EFA? Why tho
Yes
Because a low MSA score means the correlation between your variables is quite low.
This means there is little common variance for you to extract, so you probably won’t get any factors that explain multiple items well
If some of your communalities are equal to or greater than 1.0, should you be worried about doing an EFA?
Totes
A communality score of 1 means there’s basically no unique variance
If this happens, the software will cap the estimate at a very high value (like .999) as a kind of error signal - this is where HEYWOOD CASES come from.
What is an Eigenvalue anyway? No, really, what actually is it
It is the variance explained by a principal component
If it is about components, why are we thinking about it in relation to EFA? Because eigenvalues belong to the correlation matrix, which both PCA and EFA analyse - so EFA output still reports them to help you decide how many factors to extract.
Imagine you’re doing an EFA.
Using SPSS outputs, how do you tell which item is explained WORST by the extracted factors?
First, you look at the COMMUNALITIES table
Then you find the one with the LOWEST score
REASON: Communalities reflect the sum of squared factor loadings and thus, a low communality means low factor loadings
In EFA, one of the benefits of rotating the factors is to increase the total variance explained, T/F?
FALSE
Rotation will never affect the total explained variance, it will only affect how each factor contributes towards this total.
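You can see this directly in code (numpy sketch, made-up loadings): multiplying the loading matrix by any orthogonal rotation matrix reshuffles variance between the factors, but the total (and each variable's communality) stays exactly the same.

```python
import numpy as np

L = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.2, 0.6],
              [0.1, 0.7]])
theta = np.radians(30)                  # an arbitrary orthogonal rotation
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
L_rot = L @ T

total_before = float((L**2).sum())
total_after  = float((L_rot**2).sum())
per_factor_before = (L**2).sum(axis=0)       # these DO change under rotation
per_factor_after  = (L_rot**2).sum(axis=0)
```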