Week 4 Flashcards
Partial Correlation
- Useful to detect spuriousness
- Needed to understand
- Factor Analysis
- Multiple regression
- ANCOVA
- Introduces Venn diagrams
Factor Analysis
- A set of statistical procedures
- Determines the number of distinct unobservable constructs needed to account for the pattern of correlations among a set of measures
Correlation in SPSS
- Reported as r(N − 2) = coefficient, p < .001
- In this instance the coefficient equals .35
- The p-value is less than .001
- Therefore the correlation is significant
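As a sketch of the same check outside SPSS (with invented data), Pearson's r and its p-value can be computed with scipy; note that significance is judged from the p-value, not from the size of the coefficient:

```python
import numpy as np
from scipy import stats

# Invented data with a built-in positive relationship
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 0.4 * x + rng.normal(size=100)

r, p = stats.pearsonr(x, y)
df = len(x) - 2                      # degrees of freedom for Pearson's r
print(f"r({df}) = {r:.2f}, p = {p:.3f}")
```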
Bivariate (Zero Order) Correlation (r)
- Used to determine the existence of relationships between two different variables
- Can be represented with Venn Diagrams
Partial Correlation (PR)
- Describes the relationship between two variables whilst taking away the effects of another variable, or several other variables, on this relationship
Spurious Correlation
- Connection between two variables that appears to be causal but is not.
Venn Diagrams
- Overlapping circles or other shapes used to illustrate the logical relationships between two or more sets of items
Exploratory Factor Analysis
- Data Reduction Technique
- Reveals underlying structure of intercorrelations
- How scale items cluster together
- The goal is to summarise the relationships between variables by creating subsets of variables
- Subsets are known as Factors: constructs that cannot be observed directly but are inferred from correlations
Correlation Matrix
- A symmetrical square matrix that shows the degree of association between all possible pairs of variables contained in a set
Latent Variables
- These are our constructs or factors and cannot be observed
- Can be inferred from the way they affect observable variables
Manifest Variables
- Can be directly observed or measured, such as behaviour
- Do not need to be inferred
- Used to study latent variables.
- Correlations between them create super-variables or constructs
What is Factor Analysis Used For?
- Scale Development
- Scale Checking and Refining
- Data Reduction
Factor Analysis Uses - Scale Development
- Count how many sub-scales we have
- Which items belong to sub-scales
- Which items should be discarded
Factor Analysis Uses - Scale Checking and Refining
- When the scale is used in research, does the factor structure replicate previous research?
- Factors are not fixed to any scale: e.g. the Big Five with university students vs nursing home residents
- Should any changes be made?
- Should be conducted and reported whenever an existing scale is used in research
- Ensures factors are appropriate for the context
Factor Analysis Uses - Data Reduction
- To create new Factor Scores
- Can be used as predictors or new outcome variables
- We don’t tend to use this very often
Determinant
- The determinant of the correlation matrix summarises how much the variables overlap
- A determinant > .00001 suggests that multicollinearity is not a problem
Multicollinearity
- Very high correlation between variables
- If correlations are all small there is no point in running a factor analysis
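With numpy, the determinant check can be sketched as follows (invented data; the .00001 cutoff is the rule of thumb from the notes):

```python
import numpy as np

# Invented data: four items, two of them moderately correlated
rng = np.random.default_rng(2)
data = rng.normal(size=(100, 4))
data[:, 1] += data[:, 0]

R = np.corrcoef(data, rowvar=False)   # correlation matrix of the items
det = np.linalg.det(R)
print(det)
print("OK" if det > 0.00001 else "possible multicollinearity")
```

A determinant near zero would mean some items are almost linear combinations of others.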
Rosenberg Self-Esteem Scale
- 10-item self-report measure of global self-esteem
- Items rated on a 5-point scale from strongly agree to strongly disagree
Factor Analysis - Preliminary Checks 1 & 2
- Look for patterns of correlations between variables
- No point continuing if variables are not correlated
- If correlations are low then we could end up with as many factors as items
Zero Order Correlation Matrix
- Correlation between two variables without influence of any other variables.
- Same thing as a Pearson correlation.
Determinant
- Determinant > .00001 suggests that multicollinearity is not a problem.
Factor Analysis
- Interpret a factor of a measure
- Uses the correlation of observed variables to estimate latent variables known as factors
- Look for patterns of correlations between variables
- Use factor analysis to identify the hidden variables.
Asking in Factor Analysis
- Asking if intercorrelations amongst items support separate constructs
- How many constructs do we really need to summarise the items
- Which items belong to each construct
- There is no point in continuing if the variables are not correlated
Zero Order Correlation Matrix
- Looks at correlations between each pair of variables without considering the influence of any other variables
- Don’t run factor analysis if correlations are small; this results in too many factors
- Too much correlation indicates multicollinearity
Multicollinearity
- When too many items correlate in a Zero Order Correlation Matrix
- If determinant is >.00001 then multicollinearity is not a problem
- We need to have healthy correlations, but not too high
Preliminary Check 3 & 4
- Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy
- Bartlett’s Test of Sphericity
- Tell you if there are sufficient correlations to make factor analysis worthwhile
Kaiser-Meyer-Olkin Measure of Sampling Adequacy - (KMO)
- If the variables are all correlated, then the partial correlations should be small
- The proportion of variance in your variables that might be caused by underlying factors
- Venn Diagrams: when we remove the variance shared with all other variables, there is not much left of the original correlation
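A sketch of how KMO can be computed from a correlation matrix (the partial correlations come from the inverse of R; the data and threshold comment are illustrative):

```python
import numpy as np

def kmo(R):
    """Kaiser-Meyer-Olkin sampling adequacy from a correlation matrix (sketch)."""
    P = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(P), np.diag(P)))
    Q = -P / d                                   # partial (anti-image) correlations
    off = ~np.eye(R.shape[0], dtype=bool)        # off-diagonal mask
    r2, q2 = np.sum(R[off] ** 2), np.sum(Q[off] ** 2)
    return r2 / (r2 + q2)

# Invented items that all share one underlying factor
rng = np.random.default_rng(3)
f = rng.normal(size=300)
items = np.column_stack([f + 0.7 * rng.normal(size=300) for _ in range(5)])
R = np.corrcoef(items, rowvar=False)
print(round(kmo(R), 2))  # values above roughly .6 are conventionally acceptable
```

When the items share common factors, the partial correlations in Q are small, so KMO approaches 1.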
Bartlett’s Test of Sphericity
- H0: the correlation matrix does not depart significantly from an identity matrix
- i.e. correlations are all close to zero
- Compares our correlation matrix to an identity matrix
- We want Bartlett’s test to be significant (p < .05)
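Bartlett's statistic can be sketched directly from the determinant of the correlation matrix (using the commonly given chi-square formula; the data are invented):

```python
import numpy as np
from scipy import stats

def bartlett_sphericity(R, n):
    """Bartlett's test of sphericity: H0 says R is an identity matrix."""
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return chi2, stats.chi2.sf(chi2, df)

# Invented correlated items, so H0 should be rejected
rng = np.random.default_rng(4)
f = rng.normal(size=150)
items = np.column_stack([f + rng.normal(size=150) for _ in range(4)])
R = np.corrcoef(items, rowvar=False)
chi2, pval = bartlett_sphericity(R, n=150)
print(pval < 0.05)  # significant: enough correlation for factor analysis
```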
Identity Matrix
- A square matrix in which all elements of the principal diagonal are one, and all other elements are zero
- Represents each item correlating perfectly with itself
- But having no correlation with any other item
- We want to reject this null hypothesis
Estimated Initial Communalities
- Estimated proportion of common (shared) variance in each item
- Total variance of each item is always 1
- The initial value is the proportion of variance that is shared
- What is left over is unique to that item
Common Variance
- The proportion of overlap between two variables
- Factor analysis is only interested in this variance
- Communalities suggest single Factor
Specific Variance
- What is left over after the communalities with other variables are removed
- Cannot be due to other constructs, because it does not correlate with them
- Isn’t shared with other items
Linear Combination of Variables
- Expressions constructed from a set of variables, each multiplied by a constant and added together
- How much of the variance in the data set is common or unique
- How much of the variance can we explain with each factor
- How many factors do we need to extract
- First few factors take up most of the Variance
Kaiser Criterion
- How many factors should we retain
- Retain factors with eigenvalues > 1
- If a factor describes more than one item’s worth of variance, then we keep it
- The Scree Plot method draws a line through the plot; we are not interested in the factors at the bottom (the scree)
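A minimal illustration of the Kaiser criterion, using invented data with a built-in two-factor structure:

```python
import numpy as np

rng = np.random.default_rng(5)
f1, f2 = rng.normal(size=(2, 300))
# Six invented items: three load on factor 1, three on factor 2
items = np.column_stack(
    [f1 + 0.8 * rng.normal(size=300) for _ in range(3)]
    + [f2 + 0.8 * rng.normal(size=300) for _ in range(3)]
)
R = np.corrcoef(items, rowvar=False)
eigenvalues = np.linalg.eigvalsh(R)[::-1]  # sorted largest first
print(np.round(eigenvalues, 2))
print("factors retained:", int(np.sum(eigenvalues > 1)))
```

With this structure, two eigenvalues exceed 1 and the rest fall well below it, so the Kaiser criterion retains two factors.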
Extraction
- After extraction only use the variables that remain
- Those that have Common Variance
Extracted Communalities
- Proportion of common variance that can be accounted for by the retained factors
- Extraction value = sum of squared loadings on the extracted factors
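With hypothetical loadings (the numbers below are invented), the extracted communality of each item is the sum of its squared loadings on the retained factors:

```python
import numpy as np

# Hypothetical loadings for four items on two retained factors
loadings = np.array([
    [0.70, 0.10],
    [0.65, 0.20],
    [0.15, 0.80],
    [0.05, 0.75],
])
communalities = np.sum(loadings ** 2, axis=1)  # sum of squared loadings per item
print(np.round(communalities, 2))  # item 1: 0.70**2 + 0.10**2 = 0.50
```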
Previous Preliminary Checks (6)
- Bivariate correlations between variables.
- Determinant.
- Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy.
- Bartlett’s Test of Sphericity.
- Extraction
- Initial and Extracted communalities
Are these tapping into underlying constructs?
If items are too unique, then we end up with items as factors, without correlation
Is a Two-Factor Solution a Good Solution?
Several issues to consider:
* Overall proportion of variance accounted for by the retained factors.
* Proportion of common variance in each item accounted for by the retained factors.
* Proportion of non-redundant residuals.
* Coherence of factors.
* Overall parsimony.
Parsimony
- Focuses on using simplicity to understand complex situations and make difficult decisions with confidence
- Helps avoid ambiguity
Overall Proportion of Variance
- When factors with eigenvalues <1 are removed
- No hard & fast rules but higher is better
- Generally a proportion accounted for of > 50% is good
- > 70% is considered very good
Proportion of Common Variance
- General principle the higher the better
- Look out for low extracted communalities
- Anything below .3 is a concern
Non Redundant Residuals
- More complicated than just asking “is our two-factor solution good?”
- Use the original data to create a two-factor model
- If the model is good, it can predict the original correlations
- Reproduce the correlation matrix from our model
- If the model is good, then the original and reproduced correlation matrices should look similar
- Any errors are called residuals
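A sketch of the reproduced-correlation check with hypothetical loadings and an invented “observed” matrix: the model-implied correlations are the loadings multiplied by their transpose, and the off-diagonal differences are the residuals:

```python
import numpy as np

# Hypothetical loadings for six items on two retained factors
loadings = np.array([
    [0.80, 0.05],
    [0.75, 0.10],
    [0.78, 0.02],
    [0.04, 0.82],
    [0.08, 0.77],
    [0.03, 0.79],
])

reproduced = loadings @ loadings.T        # model-implied correlations
np.fill_diagonal(reproduced, 1.0)

# Invented "observed" matrix: the model plus a little symmetric noise
rng = np.random.default_rng(6)
noise = rng.normal(scale=0.03, size=(6, 6))
noise = (noise + noise.T) / 2
np.fill_diagonal(noise, 0.0)
observed = reproduced + noise

residuals = (observed - reproduced)[np.triu_indices(6, k=1)]  # unique pairs
share_large = np.mean(np.abs(residuals) > 0.05)
print(f"{share_large:.0%} of non-redundant residuals exceed .05")  # want < 50%
```

A small share of large residuals means the two-factor model reproduces the observed correlations well.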