Test 3 Flashcards

1
Q

What does factor analysis do?

A

This analytic technique attempts to find groupings of items that constitute sub-factors within a single measure.
Be aware that these are groupings of ITEMS, not of individual participants.

2
Q

Deep dive: what is the difference between factor analysis and PCA anyway?

A
  1. Run factor analysis if you assume, or wish to test, a theoretical model of latent factors.
  2. Run principal component analysis if you simply want to reduce your correlated observed variables to a smaller set of important independent composite variables.
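As an illustration outside SPSS, here is a minimal Python sketch of PCA as pure dimensionality reduction, using scikit-learn on hypothetical item data (all names and values here are made up):

```python
# Sketch: PCA reduces correlated observed variables to composite components.
# Hypothetical data: 100 participants, 6 items built around two underlying clusters.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
base1 = rng.normal(size=(100, 1))
base2 = rng.normal(size=(100, 1))
items = np.hstack([
    base1 + 0.3 * rng.normal(size=(100, 3)),   # items 1-3 cluster together
    base2 + 0.3 * rng.normal(size=(100, 3)),   # items 4-6 cluster together
])
items = (items - items.mean(axis=0)) / items.std(axis=0)  # standardise

pca = PCA()
scores = pca.fit_transform(items)
# With two clusters built in, the first two components capture most of the variance
print(pca.explained_variance_ratio_[:2].sum())
```

Note that PCA here makes no claim about latent factors; it just finds the composites that capture the most variance.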
3
Q

What is an eigenvalue?

A

An eigenvalue is a mathematical index of the degree of clustering among the items in a grouping produced by the PCA. The larger the eigenvalue, the stronger the clustering of items for that particular grouping. Some groupings will be very poor (eigenvalue near zero), while others will be much larger.
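A minimal numpy sketch (with a hypothetical correlation matrix) showing that eigenvalues quantify clustering, and that they sum to the number of items:

```python
# Sketch: eigenvalues of a correlation matrix quantify how much standardised
# variance each principal component (item grouping) captures.
import numpy as np

# Hypothetical 4-item correlation matrix: items 1-2 cluster, items 3-4 cluster
R = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])
eigenvalues = np.linalg.eigvalsh(R)[::-1]   # sorted largest first
print(eigenvalues)
# The eigenvalues sum to the number of items (the total standardised variance);
# the strong clusters yield large eigenvalues, the leftovers near-zero ones.
```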

4
Q

Eigenvalue cut off

A

‘Eigenvalues greater than one’ is an arbitrary cut-off intended to catch the better clusterings of items, but it is a crude and insensitive criterion.

5
Q

What is the deal with rotation anyway?

A

Orthogonal rotation forces the factors to be uncorrelated, whereas oblique rotation allows the factors to correlate. Varimax is an orthogonal rotation and is often used; Oblimin is an oblique rotation.

6
Q

Where is the point where the mountain ends and the loose gravel (scree) begins?

A

One in from the point of inflection, i.e. one in from the elbow

7
Q

Steps for deciding on number of subfactors

A
  1. Where is the kink or elbow in the scree plot?
  2. Do we have a relatively small number of subfactors with a reasonable number of items in each (e.g., 4 or more items)?
  3. Do these items yield an adequate Cronbach’s alpha (i.e., greater than .70) for the separate subfactors?
  4. Does a parallel analysis (PA) support the other converging evidence?
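Step 3 can be illustrated with a small Python sketch; the formula is the standard Cronbach’s alpha, and the data are hypothetical:

```python
# Sketch: Cronbach's alpha for one subfactor's items.
# alpha = k/(k-1) * (1 - sum(item variances) / variance of the total score)
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: participants x items matrix for one subfactor."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 1))
# Four hypothetical items all driven by the same latent trait plus noise
responses = latent + 0.5 * rng.normal(size=(200, 4))
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Items that genuinely share a latent trait, as here, comfortably clear the .70 criterion.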
8
Q

what’s a parallel analysis?

A

It is a Monte Carlo-generated set of eigenvalues that would occur by chance, given the number of items and participants that you have.
‘Monte Carlo’ refers to a computer simulation that generates many random datasets within a given range; eigenvalues falling below the parallel-analysis line are at or below chance levels.
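A minimal Python sketch of the Monte Carlo idea, assuming random normal data of the same dimensions as the real dataset:

```python
# Sketch: parallel analysis baseline. Simulate many random datasets of the same
# size as the real one, and average the eigenvalues of their correlation matrices.
import numpy as np

def parallel_analysis_eigenvalues(n_participants, n_items, n_sims=200, seed=0):
    """Mean eigenvalues of correlation matrices of pure random-normal data."""
    rng = np.random.default_rng(seed)
    eigs = np.empty((n_sims, n_items))
    for i in range(n_sims):
        data = rng.normal(size=(n_participants, n_items))
        eigs[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(data.T)))[::-1]
    return eigs.mean(axis=0)

random_eigs = parallel_analysis_eigenvalues(n_participants=100, n_items=6)
print(random_eigs)
# Retain only factors whose observed eigenvalues exceed these chance values.
```

Note that even pure noise produces a first eigenvalue above 1, which is one reason the ‘eigenvalues greater than one’ rule is crude.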

9
Q

Crossover on scree plot with parallel analysis

A

It tells you how many subfactors you have: retain the factors whose observed eigenvalues lie above the parallel-analysis line, i.e., those before the crossover point.

10
Q

KMO and Bartlett’s Test

A

Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy: indicates whether the sample is adequate for factor analysis (.6 is a common minimum cut-off).

Bartlett’s Test of Sphericity: an approximate chi-square test of whether the correlations among the items are large enough to warrant a factor analysis.

11
Q

Common errors with factor analysis

A
  1. Using the ‘eigenvalues greater than one’ rule
  2. Misinterpreting the kink in the scree plot. The proper number is usually the number to the left of the kink.
  3. Ignoring the parallel analysis approach.
  4. ‘Forcing’ a particular solution based on your theoretical views (or biases). Let the data tell you what is going on there.
12
Q

What are the four criteria for determining factor structure of a PCA?

A

  1. Suppress factor loadings less than .3
  2. Look to see whether the item clusters make sense
  3. KMO of .6 as the minimum cut-off
  4. Check Cronbach’s alpha

13
Q

What is confirmatory factor analysis?

A
  1. It occurs AFTER someone has identified a factor structure in EFA
  2. It serves to determine whether the previously obtained factor structure is REPLICABLE in another sample
  3. It is NOT done in SPSS; instead it is done in structural equation modeling (SEM)
14
Q

Comparing EFA to CFA

A

EFA: All items will load onto all factors. Does NOT generate model fit indices.
CFA: Not all items load onto all factors. It generates model fit indices.
Point: CFA tests how well the proposed model ‘fits the data’

15
Q

model fit value for CFA

A

It needs to be lower than 7.0; if not, there is significant ‘misfit’ between the proposed model and the actual data.

16
Q

Classify by variable or case

A

Factor analysis classifies variables or items.
However, one can instead classify cases (individual participants) within the dataset.
What’s the advantage of this approach? It attempts to identify relatively homogeneous groupings of individuals who share one or several characteristics.
You can then use those groupings to compare and contrast on other variables: CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP.

17
Q

Other valid points

A

Moderation is about comparing groups of individuals

One can cluster items OR cases

18
Q

Nominal or categorical variable

A

Numbers refer to discrete categories, but the numbers do not signify higher or lower values.

19
Q

Ordinal data

A

From the word “order”: you rank the cases in order from 1st to last.

20
Q

Interval or continuous data

A

The most common type of data in psychology; the values are ordered and the intervals between adjacent values are equal.

21
Q

Ratio scale

A

Think of this type as an interval scale with a true zero point. For example, height, weight, and reaction time are on a ratio scale. (Temperature in kelvin is ratio, but Celsius is only interval, because its zero point is arbitrary.)

22
Q

How do we treat categorical data?

A

One can generate frequencies with these data,
use them as an IV in an analysis of variance (ANOVA or MANOVA),
or run a chi-square test.
One cannot compute a mean for categorical data, and cannot run a correlation with this type.
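For example, a chi-square test on a hypothetical 2×2 frequency table (a scipy sketch; the numbers are made up):

```python
# Sketch: analysing categorical data with a chi-square test of independence.
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical frequency table: condition (rows) x yes/no response (columns)
observed = np.array([
    [30, 10],
    [15, 25],
])
chi2, p, dof, expected = chi2_contingency(observed)
print(dof)   # (rows - 1) * (cols - 1) = 1
```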

23
Q

What do you do with ordinal data?

A

It’s a mixture of categorical and interval, so it presents a bit of a fuzzy picture.
Usually you just report the rank ordering that is obtained.
There are non-parametric statistics that are useful with ordinal data (e.g., Spearman’s rank-order correlation). Non-parametric tests also suit small or skewed samples.
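As an illustration, Spearman’s rank-order correlation on hypothetical rankings (a common non-parametric choice for ranked data):

```python
# Sketch: Spearman's rho between two raters' hypothetical rankings.
from scipy.stats import spearmanr

judge_a = [1, 2, 3, 4, 5, 6]   # rankings from one rater
judge_b = [2, 1, 3, 4, 6, 5]   # rankings from another rater
rho, p = spearmanr(judge_a, judge_b)
print(round(rho, 3))   # high agreement between the two rank orders
```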

24
Q

What do you do with interval (continuous) data?

A

This type is the most flexible in terms of quantitative analyses. Most data in psychology are interval data.
One can derive a mean and standard deviation. Can do correlations, can do t-tests, ANOVA, and many other statistical tests.
Most of the data in the PSYC 325 questionnaire would be of this type

25
Q

Difference between description and analysis

A

Description: means, standard deviations, correlations, etc.
Analysis: also known as “inferential statistics”, which includes t-tests, ANOVA, and regression. “Inferences” refer to testing hypotheses, as opposed to description.

26
Q

Descriptive statistics with interval data

A

Mean: the average (sum total divided by number of individuals)
Median: the score that divides the group in half
Mode: the most common score
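A quick sketch with Python’s statistics module on a hypothetical score list:

```python
# Sketch: the three central-tendency statistics on a small hypothetical sample.
from statistics import mean, median, mode

scores = [2, 3, 3, 4, 5, 7, 11]
print(mean(scores))    # sum / count
print(median(scores))  # middle score
print(mode(scores))    # most common score
```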

27
Q

You learned how to do an ANOVA in the first half of the course, let’s review the facts

A
The IV (or IVs) must be categorical and the DV must be continuous.
If you have categorical IVs, then you can run an ANOVA.
The reverse is true as well: experimental data can also be analysed with correlations and regressions.
28
Q

Mean group differences vs. correlation

A

Mean group differences: t-test; ANOVA
Associations: correlations, regressions, factor analysis
Categorical vars are IVs in ANOVA
Continuous vars can be used in both types of analyses

29
Q

Define correlation

A
A measure of the degree to which two variables covary.
The Pearson r correlation varies from:
–1.00 (high negative correlation), to
0.00 (no correlation), to
+1.00 (high positive correlation)
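A minimal numpy sketch computing Pearson r for two hypothetical variables:

```python
# Sketch: Pearson r between two hypothetical variables (made-up values).
import numpy as np

hours_studied = np.array([1, 2, 3, 4, 5])
exam_score = np.array([52, 55, 61, 64, 70])
r = np.corrcoef(hours_studied, exam_score)[0, 1]
print(round(r, 3))   # close to +1: a strong positive association
```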
30
Q

From correlation to multiple regression

A

Regression has multiple predictors associated with a single dependent variable.
For one variable regressed on another, the Pearson r statistic is the same as the standardised beta weight (β) generated in the regression analysis.

31
Q

Regression equation

A

y = constant + b1(x)
Constant = “intercept” (determines the vertical placement of the regression line in a graph)
b1 refers to the unstandardised regression coefficient, the slope of the regression line; in SPSS output it is labelled B
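A worked sketch of the equation, fitting the constant and b1 by least squares on hypothetical data (the course itself does this in SPSS):

```python
# Sketch: fitting y = constant + b1*x by least squares with numpy.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])
b1, constant = np.polyfit(x, y, 1)   # returns slope first, then intercept
predicted = constant + b1 * x
print(round(constant, 2), round(b1, 2))
```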

32
Q

But what does the regression equation mean?

A

X-values are the actual observed numerical values that participants generated in their survey for the predictor
Y-values are the actual observed numerical values that participants generated in their survey for the dependent variable

33
Q

obtaining the best fitting regression line

A

The computer programme estimates the constant and the unstandardised regression coefficient so as to minimise the sum of the squared residuals
If this is done correctly, then the sum of the residuals will be zero
Note that you use the unstandardised B, not the standardised beta (β), to graph the line
All we can really say is that the variables are associated
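This can be checked directly; a numpy sketch with hypothetical data (the residuals sum to zero, up to floating-point error, whenever the model includes a constant):

```python
# Sketch: least-squares residuals sum to zero when the model has an intercept.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.3, 9.7])
b1, constant = np.polyfit(x, y, 1)
residuals = y - (constant + b1 * x)
print(residuals.sum())   # ~0 up to floating-point error
```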

34
Q

what do we report from the regression?

A

The R-squared value (usually the “adjusted” one)
The beta value and the associated p-value
Report: β, adjusted R², p
The rest of the output is useful for graphing (the constant, the Bs, and the SEs) and is usually ignored

35
Q

A multiple correlation

A

The correlation of a group of predictor variables with a single dependent variable; this analysis is performed through a multiple regression.

36
Q

multiple R

A

The ‘multiple R’ statistic is like the Pearson r statistic, but it is the overall correlation of the whole set of predictors with the single DV.

37
Q

R square

A

R square indicates the proportion of variance in the dependent variable that is jointly explained by the set of predictors.

38
Q

Adjusted R Square

A

Adjusted R2 is the R2 adjusted for the number of predictors.
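The standard adjustment formula is adj R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of cases and p the number of predictors; a sketch:

```python
# Sketch: adjusted R^2 penalises R^2 for the number of predictors p, given n cases.
def adjusted_r_squared(r_squared: float, n: int, p: int) -> float:
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

# Same R^2, more predictors -> bigger downward adjustment
print(adjusted_r_squared(0.50, n=100, p=2))
print(adjusted_r_squared(0.50, n=100, p=20))
```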

39
Q

Why were conscientiousness, openness, and agreeableness non-significant predictors in the regression

A

These three variables were overshadowed by the strength of the other two predictors: extraversion and emotional stability

40
Q

When you have a set of predictors that you expect to predict a DV, one of the worries that you should have is whether any of the predictors are strongly correlated with each other

A

If they are excessively correlated, this is called a problem of multicollinearity: the overlapping predictors can mask each other’s unique relationships with the DV.
Cure? Check the predictor intercorrelations first, and exclude (or combine) any highly correlated variables; factor analysis can also be used to reduce the number of predictors.
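A minimal sketch of the ‘check correlations first’ step, using a hypothetical 0.8 screening threshold and made-up predictors:

```python
# Sketch: flagging multicollinearity by inspecting predictor intercorrelations.
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # nearly a copy of x1: collinear
x3 = rng.normal(size=n)              # independent predictor
predictors = np.column_stack([x1, x2, x3])

corr = np.corrcoef(predictors.T)
# Flag any predictor pair with |r| above the hypothetical 0.8 threshold
high = [(i, j) for i in range(3) for j in range(i + 1, 3) if abs(corr[i, j]) > 0.8]
print(high)   # only the (x1, x2) pair should be flagged
```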