Relationship between Variables: Correlation and Regression Flashcards
We are interested in finding a way to represent association between scores.
association
The Regression Line
The first and most obvious way to summarize data when we are examining the relationship between two variables.
Scatterplot
The Regression Line
We put one variable on the x-axis and another on the y-axis, and we draw a point for each person showing their scores on the _____
two variables.
The Regression Line
When we want to tell people about our results, we don’t have to draw a lot of _____
scatterplots
Children were asked to listen to a word and repeat it. They were then asked which of 3 other words started with the same sound.
X
Initial phoneme detection
reading score, a standard measure of reading ability.
Y
British Ability Scale (BAS)
We usually summarize and represent the relationship between two variables with a single number, the _____
correlation coefficient
We also calculate the ______ for this number, and we want to be able to find out if the relationship is statistically significant.
Thus, we want to know what is the _______ of finding a relationship at least this strong if the null hypothesis that there is no relationship in the population is true.
Confidence Intervals
probability
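One way to make this probability concrete is a permutation test, sketched below. This is a swapped-in illustration, not the method the cards describe: shuffling the Y scores breaks any link with X, so the share of shuffles that give a correlation at least as strong as the observed one approximates the probability under the null hypothesis. The data, the function names, and the number of shuffles are all assumptions made for the example.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Share of shuffles whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between X and Y
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    return hits / n_shuffles


# Made-up scores, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 1, 4, 3, 6, 5, 8, 7]
p = permutation_p_value(xs, ys)
print(p)
```

A small value of p means a relationship this strong would rarely appear if there were no relationship in the population.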
a best fitting line used for prediction.
Line of best fit or Regression Line
Predicting the _____ in Y as a function of the _____ in X.
variation
how steep the line is
slope
the position or height of the line.
intercept
By ____ we give the height at the point where the line hits the y-axis.
convention
The height is called the _____ or often just the _____ (or sometimes the constant).
y-intercept or intercept
The intercept represents the expected score of a person who scored zero on the ______
x-axis variable.
It is often the case that the intercept doesn’t make any sense. After all, no one usually scores _____
0 or close to 0.
We can use the two values of _____ to calculate the expected value of any person’s score on Y, given their score on X.
slope and intercept
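The slope and intercept of a best-fitting line can be computed from the data. The sketch below assumes the line is fitted by ordinary least squares, which the cards do not state explicitly; the function name fit_line and the scores are made up for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    mx, my = mean(xs), mean(ys)
    # Slope: how much Y changes, on average, per one-unit change in X.
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    # Intercept: height of the line where X is zero.
    intercept = my - slope * mx
    return intercept, slope


# Made-up scores that lie exactly on the line Y = 1 + 2 * X.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```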
formula for Expected Y score
Expected Y score = intercept + slope × (score on X)
Where x is the x-axis variable. This equation is called the ______
regression equation.
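As a small sketch, the regression equation can be written as a function. The intercept and slope values used here are hypothetical, chosen only to show the arithmetic.

```python
def expected_y(intercept: float, slope: float, x_score: float) -> float:
    """Expected Y score = intercept + slope * (score on X)."""
    return intercept + slope * x_score


# Hypothetical values, for illustration only.
intercept = 2.0
slope = 0.5

# A person who scores 10 on X is expected to score 2.0 + 0.5 * 10 on Y.
print(expected_y(intercept, slope, 10.0))  # 7.0
```

A person who scores 0 on X gets an expected Y score equal to the intercept.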
Making Sense of Regression Lines
Thinking about the relationship between _____ can be very useful.
two variables
Making Sense of Regression Lines
We can make a _____ about one score from another score.
prediction
Problem: if we don’t understand the scale(s), regression lines and equations are _____
meaningless
When there is a relationship between two variables, we can _____ one from the other.
We cannot say that one _____ the other.
predict
explains
The correlation coefficient
We need some way of making the scales have some sort of meaning, and the way to do this is to convert the data into _____
standard deviation units.
Thus we could ask: “If the score on ___ is one SD higher, how many SDs higher would we expect the ____score to be?”
x
y
Talking in terms of SDs means that we are talking about _____
standardized scores
Because we are talking about standardized regression slopes, we call it the _____
standardized slope.
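A minimal sketch of this idea: convert both sets of scores to standard deviation units, then fit the slope on the converted scores. The data and the function names are assumptions made for the example, and the population SD (pstdev) is used for both variables.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw scores to standardized scores (SD units from the mean)."""
    m, sd = mean(scores), pstdev(scores)
    return [(s - m) / sd for s in scores]


def slope(xs, ys):
    """Least-squares slope of Y on X."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Made-up scores, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

zx, zy = standardize(xs), standardize(ys)
# How many SDs higher we expect Y to be when X is one SD higher.
standardized_slope = slope(zx, zy)
print(standardized_slope)
```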
Correlation coefficient – a more important name for the ______
standardized slope.
Where σx is the SD of the variable on the x-axis (the horizontal one) of the scatterplot, σy is the SD of the variable on the y-axis (the vertical one), and r is the correlation.
The letter r actually stands for ______, but most people ignore that because it is confusing.
regression
If we know the slope, we can calculate the correlation using the formula:
r = β × σx / σy
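The formula can be checked numerically. The sketch below assumes β is the unstandardized least-squares slope and uses made-up scores; it compares r obtained from the formula with r computed directly from its usual definition.

```python
from statistics import mean, pstdev

# Made-up scores, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

mx, my = mean(xs), mean(ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)

beta = sxy / sxx                               # unstandardized slope
r_from_slope = beta * pstdev(xs) / pstdev(ys)  # r = beta * sd_x / sd_y
r_direct = sxy / (sxx * syy) ** 0.5            # Pearson r from its definition

print(beta, r_from_slope, r_direct)
```

The two values of r agree, which is why the correlation can be read as the slope expressed in standard deviation units.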