Correlations and Regressions Flashcards
What is a correlation?
- Measure of relation between two variables, often continuous
- Association
- Linear, can't estimate other types of relations
- No causality
- Gives us a bivariate association
What is the coefficient used for correlation?
- Pearson (most commonly used) r
- Values outside -1.00 to +1.00 indicate an error
- 0.00 = no relation
- Positive and negative direction
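A minimal sketch of how Pearson r can be computed by hand, assuming two equal-length lists of continuous scores. The function name `pearson_r` and the shyness/anxiety numbers are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations: how the two variables move together.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Sums of squared deviations: how much each variable varies on its own.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores, made up for illustration.
shyness = [1, 2, 3, 4, 5]
anxiety = [2, 4, 5, 4, 5]
r = pearson_r(shyness, anxiety)
print(round(r, 2))  # positive sign = positive direction
```

Reversing the sign of one variable flips the direction of r but not its size.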
What can the magnitude of r tell us?
- How to infer the association
- .10 small
- .30 moderate
- .50 large
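The cutoffs above can be sketched as a small lookup. The function name `describe_magnitude` and the "negligible" label for values under .10 are assumptions added for illustration, not part of the card.

```python
def describe_magnitude(r):
    """Label the size of a correlation using the .10 / .30 / .50 cutoffs."""
    size = abs(r)  # the sign gives direction, not magnitude
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "moderate"
    if size >= 0.10:
        return "small"
    return "negligible"  # assumed label; the card does not name this range

labels = [describe_magnitude(r) for r in (0.05, -0.12, 0.35, -0.62)]
print(labels)
```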
Covariation
How much an increase or decrease in one variable depends on the second variable
- No covariation= r should be zero
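A small sketch of covariation, assuming the usual sample covariance (dividing by n - 1). The data are made up. The curved example shows that zero covariation means no linear relation, even when the variables are clearly related in another way.

```python
def covariance(x, y):
    """Sample covariance of two equal-length lists (divides by n - 1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

x = [-2, -1, 0, 1, 2]
y_linear = [2 * v for v in x]   # rises steadily with x
y_curved = [v ** 2 for v in x]  # U-shaped: related to x, but not linearly

cov_linear = covariance(x, y_linear)
cov_curved = covariance(x, y_curved)
print(cov_linear, cov_curved)
```

Because r is the covariance divided by the product of the two standard deviations, a covariance of zero gives r = 0.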
Variation
Variation within each variable
Example: variation in depression scores
Correlation - APA style
- r(df) = value, p = value, where df = N - 2
- Positive or negative association
What are other types of correlations?
- Spearman Rank-Order Correlation
One or both variables are measured on an ordinal scale
- Point-biserial Correlation
One continuous and one is dichotomous
- Phi-coefficient
Both are dichotomous
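As a rough illustration, Spearman's coefficient can be obtained by ranking each variable and then computing Pearson r on the ranks. The helper names and data below are made up; tied values get the mean of their ranks.

```python
import math

def pearson_r(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(round(rho, 2), round(r, 2))
```

The point-biserial and phi coefficients work the same way as Pearson r once the dichotomous variables are coded 0/1.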
What is a partial correlation?
- Looking at an association while controlling some other factors
Example: looking at shyness and social anxiety, controlling for gender
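A sketch of the first-order partial correlation formula, which removes the part of the association that both variables share with one control variable. The three input correlations are invented numbers, not results from any study.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical bivariate correlations, made up for illustration.
r_shy_anx = 0.50      # shyness with social anxiety
r_shy_gender = 0.30   # shyness with gender (coded 0/1)
r_anx_gender = 0.40   # social anxiety with gender (coded 0/1)

partial = partial_r(r_shy_anx, r_shy_gender, r_anx_gender)
print(round(partial, 2))
```

If the control variable is unrelated to both variables, the partial correlation equals the plain bivariate r.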
What can a regression analysis give us?
- Being able to make a prediction
Often a predetermined direction on DV
- Unique effect of predictors on the outcome variable
Still a linear association
Which predictor is stronger?
- A correlation
How do you decide the direction of the prediction?
- Based on theories and/or conceptual arguments
What equation is commonly used with regression analysis?
Y=a + bX + e
- Y = outcome variable
- X = predictor
- a = intercept
- b = the slope, how many points Y changes for one unit change in X
- e = error, refers to variation not explained by X
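A minimal sketch of estimating a and b by ordinary least squares for one predictor, assuming made-up data. The residuals play the role of e: what is left of Y after a + bX.

```python
def fit_simple_regression(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope: change in Y per one-unit change in X
    a = mean_y - b * mean_x  # intercept: predicted Y when X is 0
    return a, b

# Hypothetical data, made up for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_simple_regression(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]  # the e term
print(round(a, 2), round(b, 2))
```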
What is the relation between the slope and explained variance?
"In summary, while both models may have positive correlation coefficients, Model A would likely have a higher coefficient and explain more of the variation in the dependent variable compared to Model B, due to the tighter clustering of data points around the line."
- A has a stronger linear correlation
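One way to see the point about clustering is with two invented data sets that share the same slope but differ in scatter. `model_a` and `model_b` here are constructed for illustration and are not the models from the original figure.

```python
def slope_and_r_squared(x, y):
    """Return the least-squares slope and the proportion of variance explained."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sxx, sxy ** 2 / (sxx * syy)

x = [1, 2, 3, 4, 5, 6]
# Scatter pattern chosen to be uncorrelated with x, so it leaves the slope alone.
scatter = [1, -1, 0, 0, -1, 1]
model_a = [1 + 2 * xi + 0.5 * s for xi, s in zip(x, scatter)]  # tight clustering
model_b = [1 + 2 * xi + 3.0 * s for xi, s in zip(x, scatter)]  # wide clustering

slope_a, r2_a = slope_and_r_squared(x, model_a)
slope_b, r2_b = slope_and_r_squared(x, model_b)
print(round(slope_a, 2), round(r2_a, 2))
print(round(slope_b, 2), round(r2_b, 2))
```

Both lines have the same slope, but the tighter data set has the higher explained variance.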
What are the types of regressions?
- Simple regression model
- Multiple regression model
Simple Regression
- One predictor
- One outcome variable
Simple Regression - SPSS output
Table 1
- List predictors
Table 2
- R square = proportion of variance in the outcome that is explained by the predictor
Table 3
- Is the explained variance significant
Table 4
- B, first row = intercept, level of outcome variable when predictor variable is 0
- B, second row = slope, relation between predictor and outcome variable
- Beta = standardised slope, relation between outcome and predictor, is it significant? Can be compared across studies
Simple Regression - APA style
- Variance %
- F value
- Beta
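A sketch of where the reported variance and F value come from in a simple regression, assuming made-up data. The p value is left out because it needs the F distribution, which statistics software looks up.

```python
def simple_regression_summary(x, y):
    """Return R squared, F, and the two degrees of freedom for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    predicted = [a + b * xi for xi in x]
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_regression = ss_total - ss_residual
    df_regression = 1      # one predictor
    df_residual = n - 2    # n minus the two estimated values (a and b)
    r_squared = ss_regression / ss_total
    f_value = (ss_regression / df_regression) / (ss_residual / df_residual)
    return r_squared, f_value, df_regression, df_residual

# Hypothetical data, made up for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r_squared, f_value, df1, df2 = simple_regression_summary(x, y)
print(f"{r_squared * 100:.0f}% of variance, F({df1}, {df2}) = {f_value:.2f}")
```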
Multiple Regression
- 2 or more predictors
- One outcome variable
Continuous variable
Multiple Regression - SPSS output
Table 1
- Descriptives
Table 2
- Correlations
Table 3
- List of all predictors
Table 4
- R square, all predictors
Table 5
- Significant or not?
Table 6
- Beta, unique effect of predictor on outcome
Multiple Regression - APA style
- % of variance explained in the outcome
- F value
- Beta, each predictor
- Positive or negative prediction?
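For the special case of exactly two predictors, the standardized betas can be sketched directly from the three bivariate correlations. The correlations below are invented; with more predictors, software solves the same problem with matrix algebra.

```python
def two_predictor_betas(r_y1, r_y2, r_12):
    """Standardized betas and R squared for an outcome with two predictors.

    r_y1, r_y2: correlation of each predictor with the outcome.
    r_12: correlation between the two predictors.
    """
    shared = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / shared  # unique effect of predictor 1
    beta_2 = (r_y2 - r_y1 * r_12) / shared  # unique effect of predictor 2
    r_squared = beta_1 * r_y1 + beta_2 * r_y2
    return beta_1, beta_2, r_squared

# Hypothetical correlations, made up for illustration.
beta_1, beta_2, r_squared = two_predictor_betas(r_y1=0.50, r_y2=0.40, r_12=0.30)
print(round(beta_1, 2), round(beta_2, 2), round(r_squared, 2))
```

Each beta is smaller than the matching bivariate correlation here because the two predictors overlap.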
What are some assumptions and restrictions with regression models?
Outliers
- Extreme data?
- Distribution of data and histogram; skewness (up to 2 okay) and kurtosis (up to 7 okay)
- Boxplots
Residual outliers
- Is the standardized residual beyond +/- 3.00? Inspect those cases
- Difference between predicted and observed outcome values
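A rough sketch of flagging residual outliers, assuming residuals are standardized by dividing by the residual standard error, sqrt(SS_residual / (n - 2)). The data are invented, with one case deliberately pushed far off the line.

```python
import math

def standardized_residuals(x, y):
    """Residuals from a one-predictor regression, divided by their standard error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual = observed outcome minus predicted outcome.
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    standard_error = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return [e / standard_error for e in residuals]

# Hypothetical data: 20 cases close to a line, small alternating noise.
x = list(range(1, 21))
y = [2 * xi + 1 + (0.5 if i % 2 == 0 else -0.5) for i, xi in enumerate(x)]
y[9] += 10  # one case pushed far above the line

z = standardized_residuals(x, y)
flagged = [i for i, zi in enumerate(z) if abs(zi) > 3.0]
print(flagged)  # positions of the cases to inspect
```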
Linearity
- Test to see if the association is linear
Multicollinearity
What is multicollinearity?
- When the predictor variables are too similar to each other
- Inspecting bivariate associations among predictors
- High associations among predictors give biased results
How can you see if there is multicollinearity?
Collinearity diagnostics
- Tolerance: under .25 signals a problem
- VIF (Variance Inflation Factor): should not be above 5
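With only two predictors, tolerance and VIF follow directly from the correlation between them: tolerance is 1 minus that correlation squared, and VIF is 1 divided by tolerance. A sketch under that two-predictor assumption, with invented correlations:

```python
def tolerance_and_vif(r_between_predictors):
    """Tolerance and VIF for the two-predictor case."""
    tolerance = 1 - r_between_predictors ** 2  # variance not shared with the other predictor
    vif = 1 / tolerance
    return tolerance, vif

# Hypothetical correlations between two predictors.
for r in (0.50, 0.90):
    tolerance, vif = tolerance_and_vif(r)
    problem = tolerance < 0.25 or vif > 5  # cutoffs from the card above
    print(r, round(tolerance, 2), round(vif, 2), problem)
```

With more than two predictors, the squared correlation is replaced by the R square from regressing each predictor on all the others.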
What can we do if the predictors are highly correlated?
- Remove one of the predictors
- Combine the predictors, composite score
What is a standardized regression coefficient?
- Beta value
- How many standard deviations the dependent variable changes for a one standard deviation change in the independent variable
- Can be bigger than 1; the larger the absolute value, the greater the impact
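A sketch of converting an unstandardized slope into Beta by rescaling with the two standard deviations, using made-up data. With a single predictor, Beta comes out equal to Pearson r.

```python
import math
import statistics

# Hypothetical data, made up for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = statistics.mean(x)
mean_y = statistics.mean(y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

b = sxy / sxx                                        # unstandardized slope
beta = b * statistics.stdev(x) / statistics.stdev(y)  # standardized slope
r = sxy / math.sqrt(sxx * syy)                       # Pearson r, for comparison
print(round(b, 2), round(beta, 2), round(r, 2))
```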
Residual
The differences between observed and predicted values of dependent variable
What is the independence of residuals assumption?
- That residuals are not related to each other
- Sample randomly selected
- Durbin-Watson close to 2, i.e., they are independent
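A minimal sketch of the Durbin-Watson statistic: the sum of squared differences between neighbouring residuals, divided by the sum of squared residuals. Both residual lists are invented; the second is built so that neighbouring residuals resemble each other.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals in their original case order."""
    squared_steps = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    squared_total = sum(e ** 2 for e in residuals)
    return squared_steps / squared_total

# Hypothetical residuals with no obvious pattern from one case to the next.
unrelated = [-0.8, 0.6, 1.0, -0.6, -0.2]
# Hypothetical residuals that come in runs (each resembles its neighbour).
related = [1, 1, 1, -1, -1, -1]

dw_unrelated = durbin_watson(unrelated)
dw_related = durbin_watson(related)
print(round(dw_unrelated, 2), round(dw_related, 2))
```

Values well below 2 suggest neighbouring residuals are positively related; values well above 2 suggest they alternate.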
How can you tell if the regression analysis is reasonably linear?
A Pearson r between .30 and .80-.90.