Exploratory Factor Analysis Flashcards

1
Q

What techniques are utilised for identifying clusters of variables?

A

Factor analysis and principal component analysis (PCA)

2
Q

What three uses do these techniques have?

A
  1. Understanding the structure of a set of variables
  2. Constructing a questionnaire to measure an underlying variable
  3. Reducing a dataset to a more manageable size while retaining as much of the original information as possible
3
Q

How does factor analysis attempt to achieve parsimony?

A

By explaining the maximum amount of common variance in a correlation matrix using the smallest number of explanatory constructs (latent variables)

4
Q

How does this differ from PCA?

A

PCA attempts to explain the maximum amount of total variance in a correlation matrix by transforming the original variables into linear components

5
Q

What is meant by the term factor loading?

A

A factor loading is the coordinate of a variable along a classification axis (a factor). It can be expressed, for example, as the Pearson correlation between the factor and the variable.

6
Q

What does a factor loading tell us?

A

It tells us something about the relative contribution that a variable makes to a factor.

7
Q

How are scores on the measured variables predicted in factor analysis?

A

From the means of those variables plus the person’s scores on the common factors multiplied by their factor loadings, plus scores on any unique factors within the data
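The prediction described above can be sketched numerically. This is a minimal illustration with made-up numbers, not output from a real analysis: the means, loadings, and scores are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_vars, n_factors = 5, 4, 2

# Hypothetical values, invented for illustration only.
means = np.array([3.0, 2.5, 4.0, 3.5])              # mean of each measured variable
loadings = np.array([[0.8, 0.1],
                     [0.7, 0.2],
                     [0.1, 0.9],
                     [0.2, 0.6]])                   # rows: variables, columns: common factors
factor_scores = rng.standard_normal((n_people, n_factors))     # scores on the common factors
unique_scores = 0.3 * rng.standard_normal((n_people, n_vars))  # scores on the unique factors

# Predicted score = variable mean + (common-factor scores x loadings) + unique-factor score
predicted = means + factor_scores @ loadings.T + unique_scores
```

Each row of `predicted` holds one person's scores on the four measured variables.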

8
Q

What is meant by common factors?

A

Factors that explain the correlations between variables

9
Q

How are components predicted in PCA?

A

In PCA, the components are predicted from the measured variables.
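As a rough sketch of what "predicted from the measured variables" means, the example below builds components as linear combinations of standardised variables. The data are randomly generated and the variable names are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((200, 3))      # made-up data: 200 people, 3 measured variables
X[:, 1] += 0.8 * X[:, 0]               # make two of the variables correlate

# Standardise so the analysis is based on the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)

# Eigen-decomposition of R; eigh returns eigenvalues in ascending order, so reorder.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Each component is a weighted sum of the measured (standardised) variables.
components = Z @ eigenvectors
```

In this sketch the variance of each component equals its eigenvalue, which is why eigenvalues are read as the amount of variance a component explains.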

10
Q

Name a major assumption of factor analysis

A

One major assumption of factor analysis is that the algebraic factors represent real-world dimensions.

11
Q

How is the weighted average calculated after PCA?

A

By multiplying a person’s score on each variable by the corresponding factor loading and adding the products together

12
Q

Why is the weighted average rarely used?

A

Because it is overly simplistic and is influenced by the measurement scales of the variables

13
Q

What is the simplest technique for calculating factor score coefficients?

A

The regression technique

14
Q

What do the mean and variance look like using the regression technique?

A

The resulting factor scores have a mean of 0 and a variance equal to the squared multiple correlation between the estimated factor scores and the true factor values.

15
Q

What is a downside to the regression model?

A

A downside is that the scores can correlate with factor scores from a different factor, even when the factors themselves are orthogonal.

16
Q

What can be done to overcome this problem of the regression model? (2)

A

The Bartlett method and the Anderson-Rubin method can be used to overcome this problem. The Bartlett method produces factor scores that are unbiased and that correlate only with their own factor. The Anderson-Rubin method produces factor scores that are uncorrelated and standardized.

17
Q

How is the matrix of factor score coefficients obtained?

A

By multiplying the matrix of factor loadings by the inverse of the original correlation matrix (the R matrix). Conceptually, this is like dividing the factor loadings by the correlation coefficients.
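A minimal sketch of this calculation, assuming a small made-up correlation matrix `R` and a hypothetical one-factor loading matrix `A` (neither comes from a real dataset):

```python
import numpy as np

# Hypothetical correlation matrix for three variables.
R = np.array([[1.0, 0.6, 0.2],
              [0.6, 1.0, 0.3],
              [0.2, 0.3, 1.0]])

# Hypothetical loadings of the three variables on a single factor.
A = np.array([[0.8],
              [0.7],
              [0.4]])

# Factor score coefficients: B = R^-1 A.
# solve(R, A) gives the same result as inv(R) @ A but is numerically safer.
B = np.linalg.solve(R, A)

# Estimated factor score for one person, from their standardised variable scores.
z = np.array([0.5, -1.0, 1.2])
factor_score = z @ B
```

Because the coefficients are adjusted for the correlations among the variables, they are generally not the same as the raw loadings.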

18
Q

When is the Anderson-Rubin method best?

A

When uncorrelated scores are required

19
Q

What does the method for discovering factors depend on?

A

(1) Whether the results should be generalized from the sample to the population, and (2) whether you are exploring your data or testing a specific hypothesis.

20
Q

What does random variance refer to?

A

Random variance refers to variance that is specific to one measure but not reliably so.

21
Q

What does communality refer to?

A

The proportion of common variance present in a variable.
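One way to see this numerically: for standardised variables and uncorrelated (orthogonal) factors, a variable's communality is the sum of its squared loadings. The loadings below are invented for illustration.

```python
import numpy as np

# Hypothetical loadings: rows are variables, columns are orthogonal factors.
loadings = np.array([[0.8, 0.1],
                     [0.7, 0.2],
                     [0.1, 0.9],
                     [0.2, 0.6]])

# Communality of each variable: sum of squared loadings across factors.
communalities = (loadings ** 2).sum(axis=1)

# What is left over is unique (specific plus random) variance.
unique_variance = 1.0 - communalities
```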

22
Q

What is meant by extraction?

A

Extraction refers to the process of deciding how many factors to keep.

23
Q

What do eigenvalues represent?

A

Eigenvalues associated with a variate indicate the substantive importance of that factor. Therefore, factors with large eigenvalues are retained. Eigenvalues represent the amount of variation explained by a factor.

24
Q

What is a scree plot?

A

A scree plot is a plot where each eigenvalue is plotted against the factor with which it is associated.

25
Q

What is meant by a point of inflection?

A

The point of inflection is where the slope of the line changes dramatically. This point can be used as a cut-off point to retain factors.

26
Q

What else can be used as a criterion?

A

The eigenvalues

27
Q

What is the difference between Kaiser’s and Jolliffe’s criterion?

A

Kaiser’s criterion is to retain factors with eigenvalues greater than 1. Jolliffe’s criterion is to retain factors with eigenvalues greater than 0.7.
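The two criteria can be compared on a small example. This is a sketch using a made-up correlation matrix chosen so that the two rules disagree; the same sorted eigenvalues are what a scree plot would display.

```python
import numpy as np

# Hypothetical correlation matrix for four variables.
R = np.array([[1.0, 0.7, 0.1, 0.1],
              [0.7, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.2],
              [0.1, 0.1, 0.2, 1.0]])

# Eigenvalues of a symmetric matrix, sorted from largest to smallest.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Number of factors retained under each rule.
kaiser = int(np.sum(eigenvalues > 1.0))     # Kaiser: eigenvalues greater than 1
jolliffe = int(np.sum(eigenvalues > 0.7))   # Jolliffe: eigenvalues greater than 0.7
```

For this matrix Kaiser's criterion retains two factors and Jolliffe's retains three, which shows that Jolliffe's rule is the more liberal of the two.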

28
Q

With what method should you restrict your conclusions to the sample?

A

PCA

29
Q

What is meant by unique variance?

A

Variance that can be reliably attributed to only one measure

30
Q

What is the most common way of estimating communality?

A

SMC (squared multiple correlation)
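The SMC of each variable with all the others can be read off the inverse of the correlation matrix. The sketch below uses a hypothetical helper named `squared_multiple_correlations` and a made-up matrix.

```python
import numpy as np

def squared_multiple_correlations(R):
    """SMC of each variable with all other variables: 1 - 1 / diag(R^-1)."""
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# Hypothetical correlation matrix for three variables.
R = np.array([[1.0, 0.6, 0.2],
              [0.6, 1.0, 0.3],
              [0.2, 0.3, 1.0]])

smc = squared_multiple_correlations(R)   # initial communality estimates
```

With only two variables the SMC reduces to the squared correlation between them, which is a quick sanity check on the formula.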

31
Q

When can Kaiser’s criterion be accurate?

A

When the number of variables is less than 30 and the resulting communalities are all greater than 0.7, or when the sample size exceeds 250 and the average communality is at least 0.6

32
Q

What do communalities mean for the validity of our results?

A

Communalities below 1 represent a loss of information. The closer they are to 1, the better our factors are at explaining the data.

33
Q

Once factors have been extracted, how do we calculate the degree to which variables load onto these factors?

A

To make interpretation easier, rotation can be used: the factor axes are rotated so that each variable loads maximally onto only one factor.

34
Q

What two different types of rotations are there?

A

Orthogonal rotation refers to rotation while keeping the factors uncorrelated. Oblique rotation allows factors to correlate.

35
Q

What does the choice of rotation depend on?

A

The choice of orthogonal or oblique rotation depends on whether there is a theoretical reason to suppose that the factors should correlate or be uncorrelated, and on how the variables cluster on the factors before rotation. Orthogonal rotation is used when the factors are expected to be unrelated, and arguably should not be used for human qualities.

36
Q

Describe the three varieties of orthogonal rotation

A

Quartimax: more variables loading onto fewer factors
Varimax: fewer variables loading highly onto each factor (most common)
Equamax: a hybrid of the two that behaves erratically

37
Q

What is meant by direct quartimin rotation?

A

An oblique rotation (direct oblimin) with delta set to 0, which prevents the factors from correlating too highly

38
Q

What should happen if variable correlations are very high or low?

A

Remove the variables

39
Q

What does the determinant tell us?

A

The determinant tells us whether the correlation matrix is singular (determinant of 0) or whether all variables are completely unrelated (determinant of 1).

40
Q

What score should the determinant have?

A

The determinant should be larger than 0.00001.
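A short sketch of the check, using made-up matrices to show the two extremes alongside a more typical case:

```python
import numpy as np

# Hypothetical correlation matrix for four variables.
R = np.array([[1.0, 0.7, 0.1, 0.1],
              [0.7, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.2],
              [0.1, 0.1, 0.2, 1.0]])

det_R = np.linalg.det(R)
passes_threshold = det_R > 0.00001          # the rule of thumb from the card above

# The two extremes: unrelated variables and perfectly correlated variables.
det_identity = np.linalg.det(np.eye(4))                     # all variables unrelated
det_singular = np.linalg.det(np.array([[1.0, 1.0],
                                       [1.0, 1.0]]))        # perfectly correlated pair
```

An identity matrix gives a determinant of 1 and a singular matrix gives 0, so values very close to 0 signal multicollinearity.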

41
Q

What is the most likely reason for a non-positive definite matrix?

A

Factor analysis of a non-positive definite matrix is not possible. The most likely reason for obtaining one is having too many variables and too few cases of data.

42
Q

What variables on the anti-image matrix should be removed? Why is this?

A

Variables whose value on the diagonal of the anti-image correlation matrix is less than 0.5 should be removed. These diagonal values are the Kaiser-Meyer-Olkin (KMO) measures of sampling adequacy for the individual variables. The off-diagonal values should be small.
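The KMO values can be sketched from the correlation matrix alone by comparing squared correlations with squared partial correlations. The function name `kmo` and the example matrix are assumptions made for this illustration, not part of any particular statistics package.

```python
import numpy as np

def kmo(R):
    """Return (overall KMO, per-variable KMO) for a correlation matrix R."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)            # partial correlations between variable pairs
    off_diag = ~np.eye(R.shape[0], dtype=bool)  # ignore the diagonal in all sums
    r2 = np.where(off_diag, R ** 2, 0.0)
    p2 = np.where(off_diag, partial ** 2, 0.0)
    per_variable = r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + p2.sum())
    return overall, per_variable

# Hypothetical correlation matrix for four variables.
R = np.array([[1.0, 0.7, 0.1, 0.1],
              [0.7, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.2],
              [0.1, 0.1, 0.2, 1.0]])

overall_kmo, variable_kmo = kmo(R)
low_adequacy = np.where(variable_kmo < 0.5)[0]   # indices of candidate variables to remove
```

Small partial correlations relative to the ordinary correlations push KMO towards 1, which is why the off-diagonal anti-image values should be small.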

43
Q

What is the first part of factor extraction?

A

The first part of factor extraction is to determine the linear components within the variables – the eigenvectors.

44
Q

_____ is a poor sample size
______ is good
______ is excellent

A

100; 300; 1,000

45
Q

What can decide what sample size is needed?

A

The higher the communalities, the smaller the sample size (n) needed; the more variables, the larger the n needed

46
Q

What can be calculated to find the sampling adequacy?

A

KMO

47
Q

What test do we use to see if a correlation matrix is significantly different to an identity matrix?

A

Bartlett’s test of sphericity
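The test statistic can be sketched as follows. The function name `bartlett_sphericity`, the example matrix, and the sample size are assumptions for illustration; the statistic is compared against a chi-square distribution with the returned degrees of freedom to obtain a p-value.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for a p x p correlation matrix R from n cases."""
    p = R.shape[0]
    _, log_det = np.linalg.slogdet(R)                      # log of the determinant of R
    statistic = -(n - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return statistic, df

# Hypothetical correlation matrix for four variables, assumed to come from 100 cases.
R = np.array([[1.0, 0.7, 0.1, 0.1],
              [0.7, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.2],
              [0.1, 0.1, 0.2, 1.0]])

statistic, df = bartlett_sphericity(R, n=100)
```

For an identity matrix the log determinant is 0, so the statistic is 0; the further the correlation matrix departs from identity, the larger the statistic becomes.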

48
Q

What is meant by multicollinearity and singularity?

A

Multicollinearity: variables are very highly correlated. Singularity: variables are perfectly correlated.

We try to avoid both.

49
Q

When is an assumption of normality important?

A

When you’re generalising to the population