Lecture 9 - Correlation And Regression Flashcards
What is the difference between an ANOVA or t-test and a correlation?
ANOVA and t-tests look at differences between groups, whereas a correlation looks at the relationship between two variables
What does correlation look at?
The co-variation between variables, e.g. the relationship between stress and illness
What does a correlation coefficient do?
Its function is to provide a compact numerical representation of the degree to which any two variables (two sets of data) co-vary
What are three types of correlations in psychology?
Perfect positive, no correlation, perfect inverse
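A minimal sketch of how the coefficient captures these three patterns. The function name `pearson_r` and all data values are made up for illustration; they do not come from the lecture.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables vary together around their means
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]                         # hypothetical scores
r_positive = pearson_r(x, [2, 4, 6, 8, 10])  # y rises exactly with x
r_inverse = pearson_r(x, [10, 8, 6, 4, 2])   # y falls exactly as x rises
r_none = pearson_r(x, [2, 1, 3, 1, 2])       # no linear trend

print(r_positive, r_inverse, r_none)
```

With these made-up values the three results are +1, -1 and (up to rounding) 0.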
What are the basic correlation statistics?
Pearson product-moment - interval or ratio data (assumes a bivariate normal distribution)
Spearman's rho - ranked data (performed only on ranks, not raw scores)
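One way to see the difference between the two statistics is to treat Spearman's rho as Pearson's r computed on ranks. The sketch below assumes that approach; the helper names `ranks`, `pearson_r` and `spearman_rho` and the data are illustrative only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(ss_x * ss_y)

def ranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the ranks of the data."""
    return pearson_r(ranks(xs), ranks(ys))

x = [1, 2, 3, 4, 5]          # hypothetical scores
y = [1, 4, 9, 16, 100]       # always increasing, but not in a straight line

pearson_value = pearson_r(x, y)
spearman_value = spearman_rho(x, y)
print(pearson_value, spearman_value)
```

Because y always increases with x, the ranks agree perfectly and rho is 1, while Pearson's r is lower because the raw relationship is not a straight line.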
What are some additional correlation statistics?
Kendall's tau - non-parametric (SPSS)
Point-biserial correlation - when one variable is dichotomous
Phi coefficient - when both variables are dichotomous
Partial correlation - controls for the effect of an additional variable
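As a sketch of what "controlling for" an additional variable means, the standard first-order partial correlation can be computed from the three pairwise correlations. The function name `partial_r` and the correlation values below are hypothetical.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with the effect of z removed from both."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical pairwise correlations: x = stress, y = illness, z = a third variable
r_stress_illness = 0.5
r_stress_third = 0.6
r_illness_third = 0.7

r_controlled = partial_r(r_stress_illness, r_stress_third, r_illness_third)
print(round(r_controlled, 2))
```

With these made-up values the correlation drops from 0.5 to about 0.14 once the third variable is controlled for.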
What is the equation of a straight line?
y = bx + a
What is the regression equation?
ŷ = bX + a
ŷ is the predicted value of y
b is the slope of the regression line
X is the value of the predictor variable
a is the intercept (the value of ŷ when X = 0)
How do you find the residual in a correlation?
The difference between the predicted y and the actual y
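A small sketch of residuals for the regression equation ŷ = bX + a. The slope, intercept and data are hypothetical, and the sign convention used here (actual minus predicted) is an assumption; the card only says "difference".

```python
def predict(x, b, a):
    """Predicted y for a given x under the line y-hat = b*x + a."""
    return b * x + a

# Hypothetical slope, intercept and observations
b, a = 0.6, 2.2
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

predictions = [predict(x, b, a) for x in xs]
# Residual = actual y minus predicted y (assumed convention)
residuals = [y - y_hat for y, y_hat in zip(ys, predictions)]

for x, y, y_hat, res in zip(xs, ys, predictions, residuals):
    print(x, y, round(y_hat, 2), round(res, 2))
```

Points above the line have positive residuals and points below it have negative ones.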
How do you create a model of correlation data?
Fit a straight line to the data
How can we make predictions of y from x?
Find the equation of the line (linear regression)
Correlations indicate (what) and cannot be used to infer (what)?
Correlations indicate co-variation and cannot be used to infer causality
What are the different r's used in correlations?
r is a correlation statistic and r² is a variance estimate
When do you use r?
To describe correlations on a scale from -1 through +1
It shows the strength of the relationship; it is not an indicator of significance
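When do you use r²?
As a variance estimate: the proportion of variance in one variable accounted for by the other
The correlation coefficient r is not on a ratio scale, e.g. 0.6 is not twice as strong as 0.3
r² is on a ratio scale
It is also called the coefficient of determination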
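A short worked example of why r values should not be compared as ratios but r² values can be. The two r values are the ones from the card above; the variable names are illustrative.

```python
r_small = 0.3
r_large = 0.6

# Squaring r gives the proportion of shared variance
shared_small = r_small ** 2
shared_large = r_large ** 2

# Ratio of explained variance between the two correlations
variance_ratio = shared_large / shared_small

print(round(shared_small, 2), round(shared_large, 2), round(variance_ratio, 2))
```

So r = 0.6 accounts for roughly four times as much variance as r = 0.3 (36% versus 9%), not twice as much.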
What are threats to correlation assumptions?
Range restrictions
Non-linear relationships
Heterogeneous samples
Outliers
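To illustrate one of these problems, the sketch below shows how a single outlier can create a strong correlation in data that otherwise has none. The data and the name `pearson_r` are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(ss_x * ss_y)

# Hypothetical data with no linear relationship
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 3, 1, 2]
r_without_outlier = pearson_r(xs, ys)

# Same data plus one extreme point at (20, 20)
r_with_outlier = pearson_r(xs + [20], ys + [20])

print(round(r_without_outlier, 2), round(r_with_outlier, 2))
```

One extreme case is enough to move r from about 0 to above 0.9 here, which is why a scatterplot is worth checking before interpreting r.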
For linear regression, do we use the least or most squares approach to find the line that best fits through the data points? And why?
We use the least squares approach because it minimizes the residual error (the sum of squared residuals). The result is not always the line you would draw by eye
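A minimal sketch of a least squares fit for a single predictor, assuming the usual closed-form solution for the slope and intercept. The function names and data are illustrative.

```python
def fit_line(xs, ys):
    """Return (b, a) for the least squares line y-hat = b*x + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation in x
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: the line passes through the point of means
    a = mean_y - b * mean_x
    return b, a

def sum_squared_residuals(xs, ys, b, a):
    """Total squared distance between actual and predicted y values."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))

# Hypothetical observations
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, a = fit_line(xs, ys)
best_error = sum_squared_residuals(xs, ys, b, a)

# A different line someone might draw by eye (slope 1, intercept 1)
by_eye_error = sum_squared_residuals(xs, ys, 1.0, 1.0)

print(round(b, 2), round(a, 2), round(best_error, 2), round(by_eye_error, 2))
```

The fitted line (slope 0.6, intercept 2.2) has a smaller sum of squared residuals than the alternative line, which is what "least squares" refers to.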
Cause and effect: a correlation can mean:
Variable X has a causal effect on variable Y
Variable Y has a causal effect on variable X
X and Y are both related to a third variable
Type I error (the correlation arose by chance)