Test 3 Flashcards

1
Q

What does factor analysis do?

A

This analytic technique attempts to find groupings of items that constitute sub-factors within a single measure.
Be aware that these are groupings of ITEMS, not of individual participants.

2
Q

Deep dive: what is the difference between factor analysis and PCA anyway?

A
  1. Run factor analysis if you assume, or wish to test, a theoretical model of latent factors.
  2. Run principal component analysis if you simply want to reduce your correlated observed variables to a smaller set of important independent composite variables.
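As an illustration outside SPSS, here is a minimal Python sketch of PCA as pure dimensionality reduction, using scikit-learn on hypothetical item data (all names and values here are made up):

```python
# Sketch: PCA reduces correlated observed variables to composite components.
# Hypothetical data: 100 participants, 6 items built around two underlying clusters.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
base1 = rng.normal(size=(100, 1))
base2 = rng.normal(size=(100, 1))
items = np.hstack([
    base1 + 0.3 * rng.normal(size=(100, 3)),   # items 1-3 cluster together
    base2 + 0.3 * rng.normal(size=(100, 3)),   # items 4-6 cluster together
])
items = (items - items.mean(axis=0)) / items.std(axis=0)  # standardise

pca = PCA()
scores = pca.fit_transform(items)
# With two clusters built in, the first two components capture most of the variance
print(pca.explained_variance_ratio_[:2].sum())
```

Note that PCA here makes no claim about latent factors; it just finds the composites that capture the most variance.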
3
Q

What is an eigenvalue?

A

An eigenvalue is a mathematical index of the degree of clustering among the items in a grouping produced by the PCA. The larger the eigenvalue, the stronger the clustering of items for that particular grouping. Some groupings will be very poor (eigenvalue near zero), while others will be much larger.
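A minimal numpy sketch (with a hypothetical correlation matrix) showing that eigenvalues quantify clustering, and that they sum to the number of items:

```python
# Sketch: eigenvalues of a correlation matrix quantify how much standardised
# variance each principal component (item grouping) captures.
import numpy as np

# Hypothetical 4-item correlation matrix: items 1-2 cluster, items 3-4 cluster
R = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])
eigenvalues = np.linalg.eigvalsh(R)[::-1]   # sorted largest first
print(eigenvalues)
# The eigenvalues sum to the number of items (the total standardised variance);
# the strong clusters yield large eigenvalues, the leftovers near-zero ones.
```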

4
Q

Eigenvalue cut off

A

‘Eigenvalues greater than one’ is an arbitrary cut-off intended to catch the better clusterings of items, but it is a crude and insensitive criterion.

5
Q

What is the deal with rotation anyway?

A

Orthogonal rotation forces the factors to be uncorrelated, whereas oblique rotation allows the factors to correlate. Varimax is an orthogonal rotation and is often used; Oblimin is an oblique rotation.

6
Q

Where is the point where the mountain ends and the loose gravel (scree) begins?

A

One in from the point of inflection, i.e. one in from the elbow

7
Q

Steps for deciding on number of subfactors

A
  1. Where is the kink or elbow in the scree plot?
  2. Do we have a relatively small number of subfactors with a reasonable number of items in each (e.g., 4 or more items)?
  3. Do these items yield an adequate Cronbach’s alpha (i.e., greater than .70) for the separate subfactors?
  4. Does a parallel analysis (PA) support the other converging evidence?
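Step 3 can be illustrated with a small Python sketch; the formula is the standard Cronbach’s alpha, and the data are hypothetical:

```python
# Sketch: Cronbach's alpha for one subfactor's items.
# alpha = k/(k-1) * (1 - sum(item variances) / variance of the total score)
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: participants x items matrix for one subfactor."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 1))
# Four hypothetical items all driven by the same latent trait plus noise
responses = latent + 0.5 * rng.normal(size=(200, 4))
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Items that genuinely share a latent trait, as here, comfortably clear the .70 criterion.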
8
Q

what’s a parallel analysis?

A

It is a Monte Carlo-generated set of eigenvalues that would occur by chance, given the number of items and participants that you have.
‘Monte Carlo’ refers to a computer simulation that generates many random datasets within a given range; eigenvalues falling below the parallel-analysis line are at or below chance levels.
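A minimal Python sketch of the Monte Carlo idea, assuming random normal data of the same dimensions as the real dataset:

```python
# Sketch: parallel analysis baseline. Simulate many random datasets of the same
# size as the real one, and average the eigenvalues of their correlation matrices.
import numpy as np

def parallel_analysis_eigenvalues(n_participants, n_items, n_sims=200, seed=0):
    """Mean eigenvalues of correlation matrices of pure random-normal data."""
    rng = np.random.default_rng(seed)
    eigs = np.empty((n_sims, n_items))
    for i in range(n_sims):
        data = rng.normal(size=(n_participants, n_items))
        eigs[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(data.T)))[::-1]
    return eigs.mean(axis=0)

random_eigs = parallel_analysis_eigenvalues(n_participants=100, n_items=6)
print(random_eigs)
# Retain only factors whose observed eigenvalues exceed these chance values.
```

Note that even pure noise produces a first eigenvalue above 1, which is one reason the ‘eigenvalues greater than one’ rule is crude.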

9
Q

Crossover on scree plot with parallel analysis

A

It tells you how many subfactors you have: retain the factors whose observed eigenvalues lie above the parallel-analysis line, i.e., those before the crossover point.

10
Q

KMO and Bartlett’s Test

A

Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy: indicates whether the sample is adequate for factor analysis (.6 is a common minimum cut-off).

Bartlett’s Test of Sphericity: an approximate chi-square test of whether the correlations among the items are large enough to warrant a factor analysis.

11
Q

Common errors with factor analysis

A
  1. Using the ‘eigenvalues greater than one’ rule
  2. Misinterpreting the kink in the scree plot. The proper number is usually the number to the left of the kink.
  3. Ignoring the parallel analysis approach.
  4. ‘Forcing’ a particular solution based on your theoretical views (or biases). Let the data tell you what is going on there.
12
Q

What are the four criteria for determining factor structure of a PCA?

A

  1. Suppress factor loadings less than .3
  2. Look to see whether the item clusters make sense
  3. KMO of .6 as the minimum cut-off
  4. Check Cronbach’s alpha

13
Q

What is confirmatory factor analysis?

A
  1. It occurs AFTER someone has identified a factor structure in EFA
  2. It serves to determine whether the previously obtained factor structure is REPLICABLE in another sample
  3. It is NOT done in SPSS; instead it is done in structural equation modeling (SEM)
14
Q

Comparing EFA to CFA

A

EFA: All items will load onto all factors. Does NOT generate model fit indices.
CFA: Not all items load onto all factors. It generates model fit indices.
Point: CFA tests how well the proposed model ‘fits the data’

15
Q

model fit value for CFA

A

It needs to be lower than 7.0; if not, there is significant ‘misfit’ between the proposed model and the actual data.

16
Q

Classify by variable or case

A

Factor analysis classifies variables or items.
However, one can instead classify cases (individual participants) within the dataset.
What’s the advantage of this approach? It attempts to identify relatively homogeneous groupings of individuals who share one or several characteristics.
You can then use those groupings to compare and contrast on other variables: CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP.

17
Q

Other valid points

A

Moderation is about comparing groups of individuals

One can cluster items OR cases

18
Q

Nominal or categorical variable

A

Numbers refer to discrete categories, but the numbers do not signify higher or lower values.

19
Q

Ordinal data

A

From the word “order”: you rank the cases in order from 1st to last.

20
Q

Interval or continuous data

A

The most common type of data in psychology; the values are ordered and the intervals between adjacent values are equal.

21
Q

Ratio scale

A

Think of this type as an interval scale with a true zero point. For example, height, weight, and reaction time are on a ratio scale. (Temperature in kelvin is ratio, but Celsius is only interval, because its zero point is arbitrary.)

22
Q

How do we treat categorical data?

A

One can generate frequencies with these data,
use them as an IV in an analysis of variance (ANOVA or MANOVA),
or run a chi-square test.
One cannot compute a mean for categorical data, and cannot run a correlation with this type.
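For example, a chi-square test on a hypothetical 2×2 frequency table (a scipy sketch; the numbers are made up):

```python
# Sketch: analysing categorical data with a chi-square test of independence.
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical frequency table: condition (rows) x yes/no response (columns)
observed = np.array([
    [30, 10],
    [15, 25],
])
chi2, p, dof, expected = chi2_contingency(observed)
print(dof)   # (rows - 1) * (cols - 1) = 1
```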

23
Q

What do you do with ordinal data?

A

It’s a mixture of categorical and interval, so it presents a bit of a fuzzy picture.
Usually you just report the rank ordering that is obtained.
There are non-parametric statistics that are useful with ordinal data (e.g., Spearman’s rank-order correlation). Non-parametric tests also suit small or skewed samples.
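As an illustration, Spearman’s rank-order correlation on hypothetical rankings (a common non-parametric choice for ranked data):

```python
# Sketch: Spearman's rho between two raters' hypothetical rankings.
from scipy.stats import spearmanr

judge_a = [1, 2, 3, 4, 5, 6]   # rankings from one rater
judge_b = [2, 1, 3, 4, 6, 5]   # rankings from another rater
rho, p = spearmanr(judge_a, judge_b)
print(round(rho, 3))   # high agreement between the two rank orders
```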

24
Q

What do you do with interval (continuous) data?

A

This type is the most flexible in terms of quantitative analyses. Most data in psychology are interval data.
One can derive a mean and standard deviation. Can do correlations, can do t-tests, ANOVA, and many other statistical tests.
Most of the data in the PSYC 325 questionnaire would be of this type

25
Q

Difference between description and analysis

A

Description: means, standard deviations, correlations, etc.
Analysis: also known as “inferential statistics”, which includes t-tests, ANOVA, and regression. “Inferences” refer to testing hypotheses, as opposed to description.

26
Q

Descriptive statistics with interval data

A

Mean: the average (sum total divided by number of individuals)
Median: the score that divides the group in half
Mode: the most common score
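A quick sketch with Python’s statistics module on a hypothetical score list:

```python
# Sketch: the three central-tendency statistics on a small hypothetical sample.
from statistics import mean, median, mode

scores = [2, 3, 3, 4, 5, 7, 11]
print(mean(scores))    # sum / count
print(median(scores))  # middle score
print(mode(scores))    # most common score
```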

27
Q

You learned how to do an ANOVA in the first half of the course, let’s review the facts

A
The IV (or IVs) must be categorical and the DV must be continuous.
If you have categorical IVs, then you can run an ANOVA.
The reverse is true as well: experimental data can also be analysed with correlations and regressions.
28
Q

Mean group differences vs. correlation

A

Mean group differences: t-test; ANOVA
Associations: correlations, regressions, factor analysis
Categorical vars are IVs in ANOVA
Continuous vars can be used in both types of analyses

29
Q

Define correlation

A
A measure of the degree to which two variables covary.
The Pearson r correlation varies from:
–1.00 (high negative correlation), to
0.00 (no correlation), to
+1.00 (high positive correlation)
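A minimal numpy sketch computing Pearson r for two hypothetical variables:

```python
# Sketch: Pearson r between two hypothetical variables (made-up values).
import numpy as np

hours_studied = np.array([1, 2, 3, 4, 5])
exam_score = np.array([52, 55, 61, 64, 70])
r = np.corrcoef(hours_studied, exam_score)[0, 1]
print(round(r, 3))   # close to +1: a strong positive association
```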
30
Q

From correlation to multiple regression

A

Regression has multiple predictors associated with a single dependent variable.
For one variable regressed on another, the Pearson r statistic is the same as the standardised beta weight (β) generated in the regression analysis.

31
Q

Regression equation

A

y = constant + b1(x)
Constant = “intercept” (determines the vertical placement of the regression line in a graph)
b1 refers to the unstandardised regression coefficient, the slope of the regression line; in SPSS output it is labelled B
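A worked sketch of the equation, fitting the constant and b1 by least squares on hypothetical data (the course itself does this in SPSS):

```python
# Sketch: fitting y = constant + b1*x by least squares with numpy.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])
b1, constant = np.polyfit(x, y, 1)   # returns slope first, then intercept
predicted = constant + b1 * x
print(round(constant, 2), round(b1, 2))
```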

32
Q

But what does the regression equation mean?

A

X-values are the actual observed numerical values that participants generated in their survey for the predictor
Y-values are the actual observed numerical values that participants generated in their survey for the dependent variable

33
Q

obtaining the best fitting regression line

A

The computer programme estimates the constant and the unstandardised regression coefficient so as to minimise the sum of the squared residuals
If this is done correctly, then the sum of the residuals will be zero
Note that you use the unstandardised B, not the standardised beta (β), to graph the line
All we can really say is that the variables are associated
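This can be checked directly; a numpy sketch with hypothetical data (the residuals sum to zero, up to floating-point error, whenever the model includes a constant):

```python
# Sketch: least-squares residuals sum to zero when the model has an intercept.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.3, 9.7])
b1, constant = np.polyfit(x, y, 1)
residuals = y - (constant + b1 * x)
print(residuals.sum())   # ~0 up to floating-point error
```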

34
Q

what do we report from the regression?

A

The R-squared value (usually the “adjusted” one)
The beta value and the associated p-value
Report: β, adjusted R², p
The rest of the output is useful for graphing (the constant, the Bs, and the SEs) and is usually ignored

35
Q

A multiple correlation

A

The correlation of a group of predictor variables with a single dependent variable; this analysis is performed through a multiple regression.

36
Q

multiple R

A

The ‘multiple R’ statistic is like the Pearson r statistic, but it is the overall correlation of the whole set of predictors with the single DV.

37
Q

R square

A

R square indicates the proportion of variance in the dependent variable that is jointly explained by the set of predictors.

38
Q

Adjusted R Square

A

Adjusted R2 is the R2 adjusted for the number of predictors.
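The standard adjustment formula is adj R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of cases and p the number of predictors; a sketch:

```python
# Sketch: adjusted R^2 penalises R^2 for the number of predictors p, given n cases.
def adjusted_r_squared(r_squared: float, n: int, p: int) -> float:
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

# Same R^2, more predictors -> bigger downward adjustment
print(adjusted_r_squared(0.50, n=100, p=2))
print(adjusted_r_squared(0.50, n=100, p=20))
```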

39
Q

Why were conscientiousness, openness, and agreeableness non-significant predictors in the regression

A

These three variables were overshadowed by the strength of the other two predictors: extraversion and emotional stability

40
Q

When you have a set of predictors that you expect to predict a DV, one of the worries that you should have is whether any of the predictors are strongly correlated with each other

A

If they are excessively correlated, this is called a problem of multicollinearity: the overlapping predictors can mask each other’s unique relationships with the DV.
Cure? Check the predictor intercorrelations first, and exclude (or combine) any highly correlated variables; factor analysis can also be used to reduce the number of predictors.
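A minimal sketch of the ‘check correlations first’ step, using a hypothetical 0.8 screening threshold and made-up predictors:

```python
# Sketch: flagging multicollinearity by inspecting predictor intercorrelations.
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # nearly a copy of x1: collinear
x3 = rng.normal(size=n)              # independent predictor
predictors = np.column_stack([x1, x2, x3])

corr = np.corrcoef(predictors.T)
# Flag any predictor pair with |r| above the hypothetical 0.8 threshold
high = [(i, j) for i in range(3) for j in range(i + 1, 3) if abs(corr[i, j]) > 0.8]
print(high)   # only the (x1, x2) pair should be flagged
```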