Chapter 6 Flashcards
Correlational Study
Examines the relationship between 2 or more measured variables (not manipulated or controlled by the experimenter)
Correlation
Statistical technique used to measure and describe the relationship between 2 variables
You can correlate any two variables as long as they are numerical (meaning they can be represented by numbers)
Why do we use a correlation coefficient?
It is used to make a prediction (if two variables are related, we can use one variable to predict the other; example: SAT scores and college success)
To measure reliability (test-retest, alternate forms; example: will that dependable friend pick you up at the airport at 2 a.m.?)
To measure validity (are the two variables really related?; example: SAT and ACT scores related to college grades)
BUT IT IS NOT A MEASURE OF CAUSALITY
All correlations range from? And what does this number mean?
-1.00 to +1.00
The absolute value shows the strength of the relationship
The higher the absolute value, the stronger the relationship
What is perfect correlation?
+/- 1.00 is the strongest possible relationship
On a scatterplot of a perfect correlation, every point falls exactly on one straight line
What does the sign of the correlation tell you?
It tells us the directionality of the relationship of any two variables, X and Y
If the sign is positive: (the variables change in the same direction)
As X is increasing, Y is increasing
As X is decreasing, Y is decreasing
If the sign is negative: (the variables change in opposite directions)
As X is increasing, Y is decreasing
As X is decreasing, Y is increasing
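A minimal sketch of how the sign shows up in practice. The data are made up for illustration, and pearson_r is a hypothetical helper written for this example, not something from the cards.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r for paired scores (hypothetical helper for illustration)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much X and Y vary together
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much X and Y each vary on their own
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Made-up scores for five students
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 63, 71, 80]   # goes up as hours go up
errors_made = [9, 7, 6, 4, 1]       # goes down as hours go up

r_same = pearson_r(hours_studied, exam_score)
r_opposite = pearson_r(hours_studied, errors_made)

print(f"hours vs. score:  r = {r_same:+.2f}")      # positive sign
print(f"hours vs. errors: r = {r_opposite:+.2f}")  # negative sign
```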
What is the correlation coefficient?
r. On a scatterplot it is reflected by the spread of the points: the fatter the oval of points, the lower the correlation
What kind of line will r have if it equals zero?
The best-fitting line will be horizontal, because there is no linear relationship between X and Y
Pearson correlation coefficient
r= the Pearson coefficient
r measures the degree to which the 2 variables (X and Y) vary together, relative to how much they vary separately
It is a ratio
r= (degree to which X and Y vary together) / (degree to which X and Y vary separately)
Sum of Products of Deviations (SP)
Definitional formula: SP = Σ(X − X̄)(Y − Ȳ)
Computational formula: SP = ΣXY − (ΣX)(ΣY)/n
n is the number of (X, Y) pairs
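A short sketch showing that the two SP formulas give the same answer, and how SP feeds into r. The scores are made up. The names ss_x and ss_y (sums of squared deviations for X and for Y) are labels chosen for this example; the cards only describe them as how much X and Y "vary separately".

```python
from math import sqrt

# Made-up (X, Y) pairs
X = [1, 2, 4, 5]
Y = [2, 3, 6, 9]
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n

# Definitional formula: sum of (X - X bar)(Y - Y bar)
sp_definitional = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))

# Computational formula: sum of XY minus (sum of X times sum of Y) / n
sum_xy = sum(x * y for x, y in zip(X, Y))
sp_computational = sum_xy - (sum(X) * sum(Y)) / n

# Separate variability of X and of Y (assumed names, see lead-in)
ss_x = sum((x - mean_x) ** 2 for x in X)
ss_y = sum((y - mean_y) ** 2 for y in Y)

# r = (vary together) / (vary separately)
r = sp_definitional / sqrt(ss_x * ss_y)

print(sp_definitional, sp_computational)  # both are 17.0
print(round(r, 3))
```

The computational formula is usually quicker by hand because it avoids subtracting the mean from every score.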
r squared
Proportion of variance in Y accounted for by X (often reported as a percentage)
This ranges from 0 to 1 (POSITIVE ONLY)
It cannot be negative, because squaring any real number gives a result that is zero or positive
This number is a meaningful proportion (unlike Pearson’s r)
It is similar in idea to an effect size
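A small sketch of why r squared is read differently from r. The r values are arbitrary examples, and r_squared is a hypothetical helper name.

```python
def r_squared(r):
    """Proportion of variance in Y accounted for by X, given Pearson r."""
    return r ** 2

# Arbitrary example correlations
for r in (-0.9, -0.5, 0.0, 0.5, 0.9):
    print(f"r = {r:+.1f}  ->  r squared = {r_squared(r):.2f}")
```

Note that r = +0.5 and r = -0.5 both give 0.25: the sign drops out, and a correlation of 0.5 accounts for only a quarter of the variance, not half.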
What are the limitations of Pearson’s r?
- Correlation does not mean causation
- Strength of the relationship
(Pearson’s r is not directly interpretable as strength of relationship; use r squared, the coefficient of determination, for that)
- Outliers (extreme scores)
(scores with an extreme X and/or Y value can drastically affect Pearson’s r)
- Restriction of range
(a restricted range of measured values can lead to inaccurate conclusions about the data:
finding no correlation when there really is one, or
finding a correlation when there really isn’t one)
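A rough illustration of the outlier and restriction-of-range problems, using made-up data. pearson_r is a hypothetical helper written for this sketch.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r for paired scores (hypothetical helper for illustration)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Outliers: six weakly related pairs, then the same data plus one extreme pair
xs = [1, 2, 3, 4, 5, 6]
ys = [3, 1, 4, 2, 5, 3]
r_base = pearson_r(xs, ys)
r_with_outlier = pearson_r(xs + [20], ys + [20])

# Restriction of range: a strong relationship over the full range of X,
# then the same data keeping only the pairs where X is 7 or higher
full_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
full_y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
kept = [(x, y) for x, y in zip(full_x, full_y) if x >= 7]
r_full = pearson_r(full_x, full_y)
r_restricted = pearson_r([x for x, _ in kept], [y for _, y in kept])

print(f"without outlier: {r_base:.2f}   with outlier: {r_with_outlier:.2f}")
print(f"full range: {r_full:.2f}   restricted range: {r_restricted:.2f}")
```

In this made-up data, one extreme pair turns a weak correlation into a very strong one, and cutting the range of X makes a strong relationship look noticeably weaker.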
What is regression?
Fitting a line to the data using an equation in order to describe and predict data
Simple regression
Uses just two variables (x and y)
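A minimal sketch of simple regression on made-up data. It assumes the line is fit by least squares (slope = SP / SS_X, intercept = mean of Y minus slope times mean of X); the cards do not say which fitting method is used, so treat that as an assumption. The function name predict is also just for this example.

```python
# Made-up (X, Y) pairs
X = [1, 2, 4, 5]
Y = [2, 3, 6, 9]
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n

# Sum of products of deviations, and sum of squared deviations for X
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
ss_x = sum((x - mean_x) ** 2 for x in X)

# Least-squares line (assumed fitting method): Y hat = slope * X + intercept
slope = sp / ss_x
intercept = mean_y - slope * mean_x

def predict(x):
    """Predicted Y for a given X, using the fitted line."""
    return slope * x + intercept

print(f"Y hat = {slope:.2f} * X + {intercept:.2f}")
print(f"predicted Y when X = 3: {predict(3):.2f}")
```

The fitted line both describes the data (slope and intercept) and predicts Y for X values that were not measured.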
Multiple regression
One y and many x’s. You’re still predicting one outcome, but from multiple predictors (which are not necessarily causes)
Multiple regression tends to have more external validity, meaning that it is more comparable to the real world, where an outcome usually depends on several variables.