[L8] Relationship between Variables Correlation and Regression Flashcards
We are interested in finding a way to represent ___
between scores.
association
Types of Correlation
Bivariate; Multivariate Correlation
Correlation does not prove __
Causality
Multivariate correlations have more ____ validity
Ecological
IGT & RMT = test of difference
Correlation = test of ___
correlation/association
___ – first and most obvious way to summarize
data where we are examining the relationship between
two variables
Scatterplot
We put one variable on the x-axis and another on the y-axis,
and we ___ for each person showing their
scores on the two variables.
draw a point
A test of correlation involves administering ___ tests to the same group of participants
2 or more different
When we want to tell people about our results, we ____
don’t have to draw a lot of scatterplots.
___: Children were asked to listen
to a word and repeat it. They were then asked which of
these 3 words started with the same sound.
Initial phoneme detection.
____ reading score, a standard
measure of reading ability.
British Ability Scale (BAS)
We usually summarize and represent the relationship
between two variables with a ___
number (correlation coefficient).
We also calculate the ____ for this
number, and we want to be able to find out if the
relationship is ___
Confidence Intervals; statistically significant
Thus, we want to know what is the probability of finding
a relationship at least this strong if the ____ that
there is no relationship in the population is true.
null hypothesis
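One way to make "the probability of finding a relationship at least this strong if the null hypothesis is true" concrete is a permutation test. This is a swapped-in illustration, not necessarily the method the course uses: shuffling one variable destroys any real link, so the shuffled correlations show what chance alone produces. The scores, the function names, the number of shuffles, and the seed are all made up for the sketch.

```python
import random
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation from sums of squares and cross-products."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Estimate how often chance alone gives |r| at least as large as observed."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # shuffling breaks any real x-y link
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # +1 keeps the estimate above zero

# Hypothetical scores, invented for illustration only
x_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_scores = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

p_value = permutation_p_value(x_scores, y_scores)
print(p_value)
```

A small value means a relationship this strong would rarely appear if there were no relationship in the population.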
___ – a best fitting line
used for prediction
Line of best fit or Regression Line
Predicting the variation in Y as a ___
function of the variation in X.
___ – how steep the line is
Slope
___ – the position or height of the line.
Intercept
By convention we give the height at the point where the
line ___
hits the y-axis.
The height is called the ____ or often just the intercept
y-intercept (or sometimes the constant)
The intercept represents the ___ of a person
who scored ___ on the x-axis variable.
expected score; zero
y = b0 + b1X
regression expression, predicting the behavior of y as a function of x;
useful for raw scores
It is often the case that the intercept ___. After all, no one usually scores ___
doesn’t make any sense; 0 or close to 0.
We can use the ___ of slope and ___ to
calculate the expected value of any person’s score on Y,
given their score on X.
two values; intercept
y = β0 + β1x (sometimes it is y = a + bx or y = mx + c)
Where x is the x-axis variable. This equation is called the
___
regression equation.
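A minimal sketch of using the regression equation for prediction, assuming an ordinary least-squares fit. The scores below are invented so that the line is exactly y = 1 + 2x, and the function names are illustrative.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = b0 + b1*x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / sxx        # slope: how steep the line is
    b0 = my - b1 * mx     # intercept: expected y for someone scoring 0 on x
    return b0, b1

def predict(b0, b1, x):
    """Expected score on Y for a person who scored x on X."""
    return b0 + b1 * x

# Hypothetical scores, made up so the relationship is exactly y = 1 + 2x
x_scores = [1, 2, 3, 4, 5]
y_scores = [3, 5, 7, 9, 11]

b0, b1 = fit_line(x_scores, y_scores)
predicted_for_6 = predict(b0, b1, 6)
print(b0, b1, predicted_for_6)
```

With real data the points will not sit exactly on the line, so the prediction is an expected value rather than a guarantee.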
We can make a ___ about one score from another score
prediction
Problem: if we don’t understand the ___, regression
lines and equations are ___.
scale(s); meaningless
Thinking about the relationship between two variables can
be very useful
Making Sense of Regression Lines
When there is a relationship between two variables, we
can ___ one from the other.
predict
We cannot say that one ___ the other.
explains
We need some way of making the scales have some sort
of meaning, and the way to do this is to ___ the data into ___
convert; standard deviation units.
Talking in terms of SDs means that we are talking about ___
standardized scores.
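Converting raw scores into standard deviation units can be sketched as follows. The example assumes the population SD (`statistics.pstdev`); a course that uses the sample SD would swap in `statistics.stdev`. The raw scores are invented.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Convert raw scores to standardized (z) scores: distance from the mean in SDs."""
    m = mean(scores)
    sd = pstdev(scores)   # assumption: population SD, not sample SD
    return [(s - m) / sd for s in scores]

# Hypothetical raw scores with mean 5 and population SD 2
raw = [2, 4, 4, 4, 5, 5, 7, 9]
z = standardize(raw)
print(z)
```

After standardizing, the scores have a mean of 0 and an SD of 1, whatever the original scale was.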
Because we are talking about standardized regression
slopes, we call it “___”
standardized slope.
___ – a more important name for the
standardized slope.
Correlation coefficient
In order to convert the units, we need to know the ___
SD of each of the measures.
If we know the ___, we can calculate the correlation
using the formula: r = β × σx / σy
slope
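A worked check of r = β × σx / σy, comparing it with the correlation computed directly from the cross-products. The data are hypothetical, and population SDs are assumed; the result is the same with sample SDs as long as both SDs are computed the same way.

```python
from statistics import mean, pstdev

# Hypothetical scores, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = mean(x), mean(y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
sxx = sum((xi - mx) ** 2 for xi in x)
syy = sum((yi - my) ** 2 for yi in y)

slope = sxy / sxx                               # unstandardized slope (beta)
r_from_slope = slope * pstdev(x) / pstdev(y)    # r = beta * sd_x / sd_y
r_direct = sxy / (sxx * syy) ** 0.5             # usual Pearson formula

print(slope, r_from_slope, r_direct)
```

Both routes give the same correlation, about 0.77 for these scores.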
The letter r actually stands for ___, but most people
ignore that because it is confusing
regression
Thus, if we know the ___ we can calculate the correlation
slope
3 ways to calculate the correlation coefficient “r”
- regression line
- standardized slope
- proportion of variance
In correlation, we want to know how well the regression
line ___
fits the data.
That is, how ___ the points are from the line.
far away
The __ the points are to the line, the stronger the
relationship between the two variables.
closer
When we had one variable and we wanted to know the
spread of the points around the mean, we calculated the ___
SD (σ).
The square of the SD is the ___.
variance
We can do the same thing with our regression data, but
instead of making d the difference between the mean and
the score, we can make it the difference between the value
that we would expect the person to have, given their score
on the x-variable, and the score they actually got. We can
calculate their ___
predicted scores
We can also calculate the difference between their
predicted score and their actual score. The difference is
called the ___.
Residual
Their ____ (the difference between the score they
got and the score we thought they would get based on
their initial phoneme score)
residual score
If we want to calculate the equivalent of the
variance, we need to ___ each person’s score
square
___ = d squared
Residual squared
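The residual cards can be worked through numerically. This sketch fits a least-squares line to invented scores, then computes each person's predicted score, residual, and squared residual; the variable names are illustrative.

```python
from statistics import mean

# Hypothetical scores, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = mean(x), mean(y)
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
intercept = my - slope * mx

predicted = [intercept + slope * xi for xi in x]        # score we expect, given x
residuals = [yi - pi for yi, pi in zip(y, predicted)]   # actual minus predicted (d)
squared = [d ** 2 for d in residuals]                   # residual squared (d squared)

residual_variance = sum(squared) / len(squared)         # equivalent of the variance
print(residuals, squared, residual_variance)
```

The residuals sum to zero, just as deviations from a mean do, which is why they are squared before being averaged.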
The value of the standardized slope and the value of the
square root of the proportion of variance explained will
___ be the same value.
always
We therefore have ___ of thinking about
correlation.
two equivalent ways
The first way is the ___
It is the expected
increase in one variable, when the other variable increases
by 1 SD.
standardized slope.
The second way is the ___ If you
square a correlation, you get the proportion of variance in
one variable that is explained by the other variable.
proportion of variance.
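The two ways of thinking about correlation can be compared side by side. This sketch uses invented scores and assumes population SDs for the standardizing step. Note that the square root is always positive, so for a negative relationship it matches the standardized slope in size only, not in sign.

```python
from statistics import mean, pstdev

def slope_and_intercept(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    mx, my = mean(xs), mean(ys)
    b1 = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
    return b1, my - b1 * mx

def z_scores(values):
    """Standardize using the population SD (an assumption of this sketch)."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]

# Hypothetical scores, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Way 1: the slope of the regression line fitted to standardized scores
std_slope, _ = slope_and_intercept(z_scores(x), z_scores(y))

# Way 2: the proportion of variance in y explained by x
b1, b0 = slope_and_intercept(x, y)
ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
ss_total = sum((yi - mean(y)) ** 2 for yi in y)
prop_explained = 1 - ss_resid / ss_total
r_from_variance = prop_explained ** 0.5

print(std_slope, prop_explained, r_from_variance)
```

For these scores the standardized slope is about 0.77 and the proportion of variance explained is 0.6, whose square root is again about 0.77.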
A correlation is both ___ statistics.
descriptive and inferential
We can find the ____ and we can also use
it to describe the ___
probability estimate; strength of the relationship
___ – strength of relationship
Magnitude
___ – positive, negative, curvilinear etc.
Direction