FA & PCA Flashcards

Question

What is EXTRACTION?

Answer 1

Factor extraction is an iterative process that attempts to maximise the variance explained.

Answer 2

Extracted communalities represent the amount of variance in each variable explained by the factors (i.e. variables as DVs and factors as IVs)

Answer 3

Each item has a loading on each factor. You want loadings to be high on one factor and low on the others. If you square an item's loadings and sum them across factors, you get its communality after extraction.
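
A minimal sketch of that calculation, assuming a made-up loading table (the item names and loading values are hypothetical, not taken from any real output). It squares each item's loadings and sums them across factors to get the communality.

```python
# Hypothetical loadings for four items on two factors (illustrative values only).
loadings = {
    "item1": [0.80, 0.10],
    "item2": [0.70, 0.20],
    "item3": [0.15, 0.75],
    "item4": [0.50, 0.50],
}

def communality(item_loadings):
    """Sum of squared loadings for one item across all factors."""
    return sum(loading ** 2 for loading in item_loadings)

communalities = {item: communality(ls) for item, ls in loadings.items()}

for item, h2 in communalities.items():
    print(f"{item}: h2 = {h2:.3f}")
```

Here item4 loads equally on both factors (not clean), while item1 and item3 each load mainly on one factor.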

Answer 4

Eigenvalues are the amount of variance explained by each FACTOR. Cumulative eigenvalues then show how much of the overall variance is explained as each factor is added. You want to capture the most variance with the fewest factors: HOW many are worth retaining?
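
As a rough illustration, the sketch below takes the eigenvalues of a correlation matrix and turns them into proportions and cumulative proportions of variance. The matrix `R` is hypothetical, and the eigenvalues are those of the full (unreduced) correlation matrix, as in PCA; principal axis factoring would give somewhat different values.

```python
import numpy as np

# Hypothetical correlation matrix for four variables (illustrative values only).
R = np.array([
    [1.0, 0.6, 0.5, 0.1],
    [0.6, 1.0, 0.4, 0.1],
    [0.5, 0.4, 1.0, 0.2],
    [0.1, 0.1, 0.2, 1.0],
])

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order,
# so reverse them to list the largest factor first.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Eigenvalues of a correlation matrix sum to the number of variables,
# so dividing by that sum gives the proportion of total variance per factor.
proportion = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(proportion)

for i, (ev, prop, cum) in enumerate(zip(eigenvalues, proportion, cumulative), start=1):
    print(f"Factor {i}: eigenvalue = {ev:.3f}, variance = {prop:.1%}, cumulative = {cum:.1%}")
```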

Answer 5

The most common extraction method is Principal Axis Factoring.

Answer 6

Use a scree plot and look at the POINT OF INFLECTION: the ELBOW of the curve. SUBJECTIVE: DISCUSS IN RELATION TO THEORETICAL THRUST in exam. More complex: Parallel analysis
• Create a random dataset with the same number of cases and variables (repeated many times).
• Run FA/PCA on the random data and generate averaged eigenvalues.
• Compare real eigenvalues and generated eigenvalues, and retain the factors whose real eigenvalues are higher than the corresponding eigenvalues from the random data.
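
The parallel analysis steps can be sketched as follows. This is a simplified version under stated assumptions: random data are drawn from a standard normal distribution, eigenvalues come from the full correlation matrix (PCA-style), and the function name `parallel_analysis`, its parameters and the demo data are all hypothetical.

```python
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    """Compare real eigenvalues with averaged eigenvalues from random data."""
    rng = np.random.default_rng(seed)
    n_cases, n_vars = data.shape
    real = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    sims = np.empty((n_sims, n_vars))
    for i in range(n_sims):
        # Random dataset with the same number of cases and variables.
        random_data = rng.standard_normal((n_cases, n_vars))
        sims[i] = np.linalg.eigvalsh(np.corrcoef(random_data, rowvar=False))[::-1]
    random_mean = sims.mean(axis=0)

    # Count leading factors whose real eigenvalue beats the random average,
    # stopping at the first factor that fails.
    n_retain = 0
    for real_ev, random_ev in zip(real, random_mean):
        if real_ev > random_ev:
            n_retain += 1
        else:
            break
    return real, random_mean, n_retain

# Hypothetical demo data: six variables driven by two independent latent factors.
demo_rng = np.random.default_rng(42)
latent = demo_rng.standard_normal((300, 2))
noise = demo_rng.standard_normal((300, 6))
data = np.empty((300, 6))
data[:, :3] = 0.8 * latent[:, [0]] + 0.6 * noise[:, :3]
data[:, 3:] = 0.8 * latent[:, [1]] + 0.6 * noise[:, 3:]

real, random_mean, n_retain = parallel_analysis(data)
print("Real eigenvalues:      ", np.round(real, 2))
print("Mean random eigenvalues:", np.round(random_mean, 2))
print("Factors to retain:", n_retain)
```

Some versions compare against the 95th percentile of the random eigenvalues instead of the mean; the mean is used here because the card says "averaged eigenvalues".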

Answer 7

Rotates the axes of the factors to align better with the variables. If there are items that load on too many factors (not clean), rotation can enhance interpretability by redistributing the variance between the factors, with the aim of simplifying interpretation. Trying to get each item to load on just 1 factor.

Answer 8

Orthogonal or oblique

Answer 9

Orthogonal: keeps factors at 90 degrees (uncorrelated / independent factors), e.g. VARIMAX (maximises the variance of the squared loadings within each factor). Arguably, though, all factors in psych overlap.

Answer 10

Oblique – Factors are allowed to correlate. More realistic.

Answer 11

When an orthogonal rotation is used, a factor loading matrix is produced – use this for interpretation.

Answer 12

Pattern matrix (produced by an oblique rotation): factor loadings after partialling out overlap with other factors. Best for interpretation. Use the pattern of factor loadings to help label and define factors.
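
To see what "partialling out overlap" means in practice, the sketch below starts from a made-up pattern matrix and factor correlation matrix (both hypothetical) and derives the structure matrix, which holds the plain variable-factor correlations. It relies on the standard relation structure = pattern × factor correlations.

```python
import numpy as np

# Hypothetical pattern matrix: three items, two correlated factors.
pattern = np.array([
    [0.80, 0.00],
    [0.70, 0.10],
    [0.00, 0.75],
])

# Hypothetical factor correlation matrix (factors correlate at 0.4).
phi = np.array([
    [1.0, 0.4],
    [0.4, 1.0],
])

# Structure matrix: correlations between items and factors,
# which include the overlap between the factors.
structure = pattern @ phi

# With correlated factors, communality is the row-wise sum of pattern x structure.
communalities = np.sum(pattern * structure, axis=1)

print("Structure matrix:\n", np.round(structure, 3))
print("Communalities:", np.round(communalities, 4))
```

Item 1 has a pattern loading of 0 on factor 2 but still correlates 0.32 with it, purely because the two factors correlate. This is why the pattern matrix tends to give the cleaner picture.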

Answer 13

A scree plot is a method for graphically determining the number of factors to be retained in the analysis. It is produced by plotting the eigenvalues (which reflect the amount of total variance/covariance explained by the factor) for each factor in size order. The number of factors or components to be retained is determined by the elbow in the plot (where the size of the eigenvalues changes relatively little from one factor to another). Accurate to +/- a factor or so. Some debate about whether to include the factor at the elbow or not.

Answer 14

Varimax rotation is a form of orthogonal rotation of the solution (i.e. all factors are uncorrelated) which is designed to maximise the variance of the squared factor loadings over the variables (hence the name). This simplifies the factors by producing either high- or low-loading variables and avoiding variables with mid-range loadings. This makes factor interpretation/labelling easier (hence the popularity of the method).
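
A minimal sketch of varimax, using the commonly seen SVD-based iteration without Kaiser normalisation (so results can differ slightly from SPSS, which normalises by default). The function names and the starting loadings are hypothetical; the start is a clean two-factor structure deliberately rotated by 30 degrees so that it looks messy.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Rotate loadings orthogonally to maximise the varimax criterion."""
    L = np.asarray(loadings, dtype=float)
    n_vars, n_factors = L.shape
    rotation = np.eye(n_factors)
    criterion_old = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        # Gradient of the raw varimax criterion, projected back to an
        # orthogonal matrix via the SVD.
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / n_vars
        u, s, vt = np.linalg.svd(L.T @ target)
        rotation = u @ vt
        criterion_new = s.sum()
        if criterion_new < criterion_old * (1 + tol):
            break
        criterion_old = criterion_new
    return L @ rotation, rotation

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return float(np.sum(np.var(np.asarray(loadings) ** 2, axis=0)))

# Hypothetical clean structure, then rotated by 30 degrees to blur it.
clean = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                  [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
angle = np.deg2rad(30)
blur = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle), np.cos(angle)]])
unrotated = clean @ blur

rotated, rotation = varimax(unrotated)
print("Unrotated:\n", np.round(unrotated, 2))
print("Varimax-rotated:\n", np.round(rotated, 2))
```

Because the rotation is orthogonal, each variable's communality is unchanged; only the split of variance between the factors moves. Signs and column order of the rotated factors are arbitrary.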

Answer 15

KMO measure of sampling adequacy (MSA) and Bartlett’s test of sphericity are means of establishing the factorisability (or factorability) of a correlation matrix, in other words of checking whether there are meaningful relationships between subsets of the variables which can cluster into factors/components. KMO has to be >0.6 to indicate factorisability. A significant Bartlett’s test means that the hypothesis that there are no factors (i.e. that the correlation matrix is an identity matrix) can be rejected, but this test is overly sensitive, especially with large samples.
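
Both statistics can be computed directly from a correlation matrix. The sketch below assumes the usual textbook formulas: KMO as the ratio of summed squared correlations to summed squared correlations plus summed squared partial correlations, and Bartlett's chi-square from the log determinant of the matrix. The matrix, the sample size of 200 and the function name are hypothetical, and the p-value is left out (compare the statistic with a chi-square table instead).

```python
import numpy as np

def kmo_and_bartlett(R, n_cases):
    """Return overall KMO, Bartlett's chi-square statistic and its df."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]

    # Partial correlations (controlling for all other variables)
    # come from the scaled inverse of the correlation matrix.
    R_inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(R_inv), np.diag(R_inv)))
    partial = -R_inv / scale

    off_diagonal = ~np.eye(p, dtype=bool)
    sum_r2 = np.sum(R[off_diagonal] ** 2)
    sum_partial2 = np.sum(partial[off_diagonal] ** 2)
    kmo = sum_r2 / (sum_r2 + sum_partial2)

    # Bartlett's test of sphericity: H0 is that R is an identity matrix.
    _, log_det = np.linalg.slogdet(R)
    chi2 = -(n_cases - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return kmo, chi2, df

# Hypothetical correlation matrix and sample size (illustrative values only).
R = np.array([
    [1.0, 0.6, 0.5, 0.1],
    [0.6, 1.0, 0.4, 0.1],
    [0.5, 0.4, 1.0, 0.2],
    [0.1, 0.1, 0.2, 1.0],
])
kmo, chi2, df = kmo_and_bartlett(R, n_cases=200)
print(f"KMO = {kmo:.3f}")
print(f"Bartlett chi-square = {chi2:.1f} on {df} df")
```

For 6 degrees of freedom the 5% critical value of chi-square is about 12.59, so a statistic above that would count as significant.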

Answer 17

The anti-image correlation matrix (AICM) is another means of determining factorisability. To get the off-diagonal elements of the AICM, one calculates the partial correlation between variables X and Y, partialling out all the other variables, and then multiplies this correlation by -1. Even if X and Y are related, if other variables covary with X and Y (i.e. can form a factor with X and Y), the partial correlation between X and Y will be small. So the off-diagonal elements of the AICM should be close to zero. The KMO sampling adequacy measures for each variable are put on the diagonal of the AICM, and these values should be as close to 1 as possible.
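
A small sketch of how the AICM could be built from a correlation matrix, assuming the standard route via the inverse of the matrix. The three-variable matrix and the function name are hypothetical.

```python
import numpy as np

def anti_image_correlation(R):
    """Off-diagonals: negated partial correlations. Diagonal: per-variable MSA."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]

    # Partial correlation of each pair, controlling for all other variables.
    R_inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(R_inv))
    partial = -R_inv / np.outer(d, d)

    # Off-diagonal elements are the partial correlations multiplied by -1.
    aic = -partial

    # Per-variable KMO (MSA): squared correlations against squared partials,
    # summed over the other variables only.
    off_diagonal = ~np.eye(p, dtype=bool)
    sum_r2 = np.where(off_diagonal, R ** 2, 0.0).sum(axis=1)
    sum_partial2 = np.where(off_diagonal, partial ** 2, 0.0).sum(axis=1)
    aic[np.diag_indices(p)] = sum_r2 / (sum_r2 + sum_partial2)
    return aic

# Hypothetical correlation matrix for three variables (illustrative values only).
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
aic = anti_image_correlation(R)
print(np.round(aic, 3))
```

With only three variables there is just one variable to partial out of each pair, so the off-diagonals stay fairly large; with more variables sharing a factor they shrink towards zero.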

Answer 18

Should comment on the decision to retain x factors. Given the expected x factors, hypothesis testing could be justified; Cattell’s scree test, Kaiser’s eigenvalue criterion, interpretability (the no. of factors that produce a ‘meaningful’ solution) or other methods could also have been used. Hypothesis testing + at least 1 other should be mentioned. The consistency across these methods, with all seeming to suggest 3 factors, is also worth noting (although scree not shown, eigenvalues are listed).

Answer 19

Communalities. The communality for a variable is the variance accounted for by the factors. Extracted communalities (or h²) represent the proportion of each variable's variance that can be explained by the retained factors. Read their output and say: In this instance communality values are xx suggesting a xxx solution with xx factors retained.

Answer 20

Communality is the sum of squared loadings (SSL) for a variable across factors

Answer 21

The proportion of variance in the set of variables accounted for by a factor is the SSL for the factor divided by the number of variables (if rotation is orthogonal)

Answer 22

The proportion of variance in the solution accounted for by a factor—the proportion of covariance—is the SSL for the factor divided by the sum of communalities (or, equivalently, the sum of the SSLs).
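
A worked example of both proportions, assuming an orthogonal solution and a made-up loading matrix (four variables, two factors; the values are hypothetical).

```python
# Hypothetical orthogonal loadings: rows are variables, columns are factors.
loadings = [
    [0.80, 0.10],
    [0.70, 0.20],
    [0.15, 0.75],
    [0.50, 0.50],
]
n_variables = len(loadings)
n_factors = len(loadings[0])

# SSL for each factor: sum of squared loadings down a column.
ssl_per_factor = [sum(row[f] ** 2 for row in loadings) for f in range(n_factors)]

# Sum of communalities equals the sum of the factor SSLs.
sum_communalities = sum(sum(l ** 2 for l in row) for row in loadings)

# Proportion of variance: SSL divided by the number of variables.
prop_variance = [ssl / n_variables for ssl in ssl_per_factor]

# Proportion of covariance: SSL divided by the sum of communalities.
prop_covariance = [ssl / sum_communalities for ssl in ssl_per_factor]

for f in range(n_factors):
    print(f"Factor {f + 1}: SSL = {ssl_per_factor[f]:.4f}, "
          f"variance = {prop_variance[f]:.1%}, covariance = {prop_covariance[f]:.1%}")
```

The proportions of covariance always sum to 1 across factors, whereas the proportions of variance sum to the share of total variance that the solution explains.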

Answer 23

SPSS can use a variety of methods to calculate factor scores, e.g. regression, Bartlett, Anderson-Rubin. A simpler alternative is aggregating standardised scores on the items that define each factor.
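
A rough sketch of the regression method, assuming the textbook formula in which the score coefficients are the inverse correlation matrix times the loading matrix. The data are simulated, and the loadings come from a principal components extraction as a stand-in for whatever extraction was actually used; all names are hypothetical and this is not SPSS's own code.

```python
import numpy as np

# Hypothetical data: six variables driven by two independent latent factors.
rng = np.random.default_rng(0)
n_cases, n_factors = 200, 2
latent = rng.standard_normal((n_cases, 2))
noise = rng.standard_normal((n_cases, 6))
X = np.empty((n_cases, 6))
X[:, :3] = 0.8 * latent[:, [0]] + 0.6 * noise[:, :3]
X[:, 3:] = 0.8 * latent[:, [1]] + 0.6 * noise[:, 3:]

# Standardise the variables and get their correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)

# Stand-in loadings: principal components of R (eigenvector x sqrt(eigenvalue)).
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1][:n_factors]
A = eigenvectors[:, order] * np.sqrt(eigenvalues[order])

# Regression method: coefficients B = R^-1 A, scores = Z B.
B = np.linalg.solve(R, A)
scores = Z @ B

print("Score coefficients:\n", np.round(B, 3))
print("First three cases:\n", np.round(scores[:3], 3))
```

With component loadings as used here, the scores come out standardised and uncorrelated; with true factor loadings the regression scores generally have variance below 1.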

Answer 24

Factor loadings are the correlation of each variable with each factor. Factors are defined by high loadings.

Answer 25

1) Check whether variables are normally distributed (frequencies / graphs)
2) No univariate, bivariate or multivariate outliers (frequencies / scatterplots)
3) Normality not strictly required (but helps clarity)
4) Illegal values (out of range of given data, e.g. <1 on a 1-7 Likert scale)
5) Restriction of range in the data (biased questions like… do you like your psych course)
6) Collinearity & singularity (see correlation matrix / high SMC values)
7) Factorisability of the correlation matrix: as a rule of thumb you need bivariate correlations above 0.3; also look for low pairwise partial correlations after partialling out all other variables (can use KMO)
8) Distinction rest: FA can only produce what’s put in. Question whether a wide enough set of items was used; quality of ratings and checks (possibility of halo effects, perhaps assessed with the first unrotated factor)
9) Sample size: no single opinion on the matter, but mention it (e.g. Comrey & Lee = a minimum of 300 cases for a good factor analysis; or ratio of cases to variables: Nunnally 10:1, Guilford 2:1, Barrett & Kline find 2:1 replicates structure while 3:1 is better). Current data look……
10) Ratio of variables to factors (as above, e.g. Tabachnick & Fidell 5 or 6:1; Kline 3:1; Thurstone 3:1; Kim & Mueller 2:1). data is ….. as there are not lots of factors or items which don’t correlate with other items.
11) Missing data, and which type of handling to use: listwise, pairwise (to be avoided) or imputation. Always listwise if numbers allow. Good answer may discuss different forms of imputation (regression, mean).

(50 cards)