M18 - Factor & Cluster Analysis Flashcards
Factor and Cluster Analysis
- are … … techniques
- difference
- are DATA REDUCTION techniques
- difference:
FA groups variables
CA groups observations
Factor Analysis method
- classification of similar variables that measure the same thing into factors/groups
Cluster Analysis method
- classification
- principle
- classification of similar objects into groups
- principle: homogeneity within each group, heterogeneity between the groups
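A minimal sketch of the homogeneity/heterogeneity principle, assuming six hypothetical one-dimensional observations that are already assigned to two groups (the values and group labels are made up for illustration). Sums of squares are one common way to quantify the principle; the cards do not prescribe a specific measure.

```python
import numpy as np

# Hypothetical observations, already assigned to two groups.
groups = {
    "A": np.array([1.0, 1.2, 0.8]),
    "B": np.array([5.0, 5.3, 4.7]),
}

all_values = np.concatenate(list(groups.values()))
grand_mean = all_values.mean()

# Homogeneity within groups: squared distances to each group's own mean.
within_ss = sum(((v - v.mean()) ** 2).sum() for v in groups.values())

# Heterogeneity between groups: squared distances of the group means
# to the grand mean, weighted by group size.
between_ss = sum(len(v) * (v.mean() - grand_mean) ** 2 for v in groups.values())

print(f"within-group sum of squares:  {within_ss:.2f}")
print(f"between-group sum of squares: {between_ss:.2f}")
```

A good grouping keeps the within-group value small relative to the between-group value.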
Factor analysis
- what is a factor?
- Factor analysis tries to identify…
- always fewer … than …
- Factor analysis is an … technique in which all the variables are … considered, each … to all others.
- -> contrast to regression analysis
- factor = a construct, a hypothetical entity, a ‘latent’ variable that is assumed to underlie the tests
- Factor analysis tries to identify a set of common underlying factors, in a group of variables
- always fewer FACTORS than VARIABLES
- Factor analysis is an INTERDEPENDENCE technique in which all the variables are SIMULTANEOUSLY considered, each RELATED to all others.
- In contrast, regression is a dependence technique in which one variable is explicitly regarded as the dependent variable.
What is the factor loading?
the correlation coefficient a_js between factor and variable
Purpose of factor analysis
1-4
- test validity of a scale
- reveal interesting patterns
- solving problems of multicollinearity, if two variables are correlated and theoretically meaningful
- smaller number of variables to work with
two types of factor analysis
1.
2.
- Exploratory factor analysis
- used to identify complex interrelationships among items and group items that are part of unified concepts
- no a priori assumptions about relationships among factors
- Confirmatory factor analysis
- tests the hypothesis that the items are associated with specific factors
Exploratory Factor Analysis
- uncovers …
- a priori assumption:
- shows … and assesses …
- needs to be done …
- uncovers underlying structures in a large set of variables
- a priori assumption: any indicator may be associated with any factor
- shows (uni)dimensionality and assesses reliability of a scale
- ALWAYS needs to be done
Confirmatory Factor Analysis
- determines
- … … are selected on the basis of … … and factor analysis is used to see if they … as ….
- shows … …
- not … if the scale has been … …
- determines the number of factors based on what is expected from previous research
- INDICATOR VARIABLES are selected on the basis of PRIOR THEORY and factor analysis is used to see if they LOAD as PREDICTED
- shows GENERAL VALIDITY of a scale
- not NECESSARY if the scale has been used before
3 steps of exploratory factor analysis
1-3
Assumptions!
- Correlation matrix
- -> examine the correlations of the variables and the KMO criterion: should be >= 0.8, must be >= 0.5
- Factor extraction from variables
- Factor rotation
- -> to maximize the relationship between variables and factors
Assumptions:
- variables continuous
- variables normally distributed
- but also possible for ordinal variables
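A minimal sketch of the KMO check in step 1, assuming the usual overall Kaiser-Meyer-Olkin formula (squared correlations divided by squared correlations plus squared partial correlations, off-diagonal entries only). The function name `kmo` and the 6x6 correlation matrix are hypothetical and only serve as illustration.

```python
import numpy as np

def kmo(corr):
    """Overall KMO measure for a correlation matrix (hypothetical helper)."""
    inv = np.linalg.inv(corr)
    # Partial correlations follow from the scaled, sign-flipped inverse matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    # Only off-diagonal entries enter the measure.
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = (corr[off_diag] ** 2).sum()
    p2 = (partial[off_diag] ** 2).sum()
    return r2 / (r2 + p2)

# Hypothetical correlation matrix: two blocks of three variables,
# 0.6 within a block and 0.1 between blocks.
R = np.full((6, 6), 0.1)
R[:3, :3] = 0.6
R[3:, 3:] = 0.6
np.fill_diagonal(R, 1.0)

value = kmo(R)
print(f"KMO = {value:.3f}")
print("acceptable (>= 0.5):", value >= 0.5)
print("good (>= 0.8):", value >= 0.8)
```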
Factor extraction method
1-2
Key differences
- Principal component f.a.
- -> maximize explained variance among the underlying variables
- purpose: derive a small number of linear combinations (principal components) that retain as much information from the original variables as possible
- Common f.a.
- -> maximize the underlying correlations among the underlying variables
differences:
- assumption of principal component f.a.: all variance can be explained -> use: data reduction
- Common f.a.: aims at latent constructs
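A minimal sketch of principal component extraction, assuming a hypothetical 6x6 correlation matrix (values made up for illustration). Loadings are taken as eigenvectors scaled by the square root of their eigenvalues. With all components kept, every variable's variance is fully reproduced, which mirrors the "all variance can be explained" assumption; keeping only two components reproduces part of it.

```python
import numpy as np

# Hypothetical correlation matrix: two blocks of three variables.
R = np.full((6, 6), 0.1)
R[:3, :3] = 0.6
R[3:, 3:] = 0.6
np.fill_diagonal(R, 1.0)

# Eigen-decomposition of the symmetric correlation matrix,
# sorted from the largest to the smallest eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Component loadings: eigenvector columns scaled by sqrt(eigenvalue).
loadings = eigenvectors * np.sqrt(eigenvalues)

# Variance of each variable reproduced by all components vs. the first two.
communality_all = (loadings ** 2).sum(axis=1)
communality_two = (loadings[:, :2] ** 2).sum(axis=1)

print("all components:", np.round(communality_all, 3))
print("two components:", np.round(communality_two, 3))
```

Common factor analysis would instead model only the shared variance; that estimation step is not sketched here.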
How many factors?
3 criteria
- Kaiser criterion: Eigenvalue >1
- achieving a high specified cumulative % of total variance (usually 60%)
- Elbow criterion in the scree plot
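A minimal sketch of the first two criteria, assuming a hypothetical 6x6 correlation matrix (values made up for illustration) and the 60% threshold from the card. The sorted eigenvalues are also what a scree plot would display for the elbow criterion; no plot is drawn here.

```python
import numpy as np

# Hypothetical correlation matrix: two blocks of three variables.
R = np.full((6, 6), 0.1)
R[:3, :3] = 0.6
R[3:, 3:] = 0.6
np.fill_diagonal(R, 1.0)

# Eigenvalues of the correlation matrix, largest first.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Kaiser criterion: keep factors with eigenvalue > 1.
n_kaiser = int((eigenvalues > 1.0).sum())

# Cumulative share of total variance: keep the smallest number of
# factors that reaches the specified threshold (60% here).
cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
n_cumulative = int(np.argmax(cumulative >= 0.60)) + 1

print("eigenvalues:", np.round(eigenvalues, 2))
print("factors by Kaiser criterion:", n_kaiser)
print("factors by 60% cumulative variance:", n_cumulative)
```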
Kaiser criterion:
Eigenvalue
Communality
Factor loadings
Eigenvalue = sum of squared factor loadings of one factor over all variables / the amount of variance explained by a factor
Communality = sum of squared factor loadings of one variable / the proportion of common variance present in a variable (rest is random variance)
Factor loadings = correlation of factor and variable
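A minimal sketch of the two sums, assuming a hypothetical loading matrix with four variables (rows) and two factors (columns); the numbers are made up for illustration. Column sums of squared loadings give the eigenvalues, row sums give the communalities.

```python
import numpy as np

# Hypothetical factor loadings: rows = variables, columns = factors.
loadings = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
])

squared = loadings ** 2

# Eigenvalue of a factor: sum of its squared loadings over all variables.
eigenvalues = squared.sum(axis=0)

# Communality of a variable: sum of its squared loadings over all factors.
communalities = squared.sum(axis=1)

# Share of total variance per factor (standardized variables have variance 1,
# so the total variance equals the number of variables).
explained_share = eigenvalues / loadings.shape[0]

print("eigenvalues:    ", np.round(eigenvalues, 2))
print("communalities:  ", np.round(communalities, 2))
print("explained share:", np.round(explained_share, 3))
```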
Factor distinct
- why?
- what it does:
- so that variables only load on one factor, not many
- reference axes of the factors are turned about the origin
– Process of adjusting axes to achieve a more meaningful factor solution
Factor rotation
- why?
- how?
- result
- to discriminate between different factors
- rotate the axes such that variables load maximally on only one factor
- by rotating the axes we ensure that both clusters of variables are intersected by the factor to which they relate the most
- -> after rotation the loadings of the variables are maximized on one factor
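A minimal sketch of rotating two factor axes about the origin, assuming hypothetical loadings for four variables. The cards do not name a rotation method; here a varimax-style criterion (variance of the squared loadings per factor) is maximized by a simple grid search over angles, which is an illustrative shortcut and not how statistical packages implement it.

```python
import numpy as np

def rotation_matrix(angle_rad):
    """2x2 orthogonal matrix that turns the factor axes about the origin."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s], [s, c]])

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return np.var(loadings ** 2, axis=0).sum()

# Hypothetical unrotated loadings: a clean two-factor structure that has
# been turned by 30 degrees, so every variable loads on both factors.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.9], [0.0, 0.6]])
unrotated = simple @ rotation_matrix(np.deg2rad(30))

# Grid search: try angles from 0 to 90 degrees in 0.5 degree steps.
angles = np.deg2rad(np.linspace(0.0, 90.0, 181))
scores = [varimax_criterion(unrotated @ rotation_matrix(a)) for a in angles]
best_angle = angles[int(np.argmax(scores))]
rotated = unrotated @ rotation_matrix(best_angle)

print("unrotated loadings:\n", np.round(unrotated, 2))
print("rotated loadings:\n", np.round(rotated, 2))
```

The rotation changes how variance is distributed across the factors, while each variable's communality (row sum of squared loadings) stays the same.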