[L8] Relationship between Variables: Correlation and Regression Flashcards

1
Q

We are interested in finding a way to represent ___
between scores. For example

A

association

2
Q

Types of Correlation

A

Bivariate; Multivariate Correlation

3
Q

Correlation does not prove __

A

Causality

4
Q

Multivariate correlations have more ___ validity

A

Ecological

5
Q

IGT & RMT = test of difference
Correlation = test of ___

A

correlation/association

6
Q

___ – first and most obvious way to summarize
data where we are examining the relationship between
two variables

A

Scatterplot

7
Q

We put one variable on the x-axis and another on the y-axis,
and we ___ for each person showing their
scores on the two variables.

A

draw a point

8
Q

A test of correlation involves administering ___ tests to the same group of participants

A

2 or more different

9
Q

When we want to tell people about our results, we ____

A

don’t have to draw a lot of scatterplots.

10
Q

___: Children were asked to listen
to a word and repeat it. They were then asked which of
these 3 words started with the same sound.

A

Initial phoneme detection.

11
Q

___ reading score, a standard
measure of reading ability.

A

British Ability Scale (BAS)

12
Q

We usually summarize and represent the relationship
between two variables with a ___

A

number (correlation coefficient).

13
Q

We also calculate the ____ for this
number, and we want to be able to find out if the
relationship is ___

A

Confidence Intervals; statistically significant

14
Q

Thus, we want to know what is the probability of finding
a relationship at least this strong if the null hypothesis that
there is no relationship in the population is true.

A
15
Q

___ – a best-fitting line
used for prediction

A

Line of best fit or Regression Line

16
Q

Predicting the variation in Y as a ___

A

function of the variation in X.

17
Q

___ – how steep the line is

A

Slope

18
Q

___ – the position or height of the line.

A

Intercept

19
Q

By convention we give the height at the point where the
line ___

A

hits the y-axis.

20
Q

The ___ is called the y-intercept, or often just the
intercept

A

height; (or sometimes the constant)

21
Q

The intercept represents the ___ of a person
who scored ___ on the x-axis variable.

A

expected score ; zero

22
Q

y = b0 + b1x

A

regression expression, predicting behavior of y as function of x

useful for raw scores

23
Q

It is often the case that the intercept ___. After all, no one usually scores ___

A

doesn’t make any sense; 0 or close to 0.

24
Q

We can use the ___ of slope and ___ to
calculate the expected value of any person’s score on Y,
given their score on X.

A

two values; intercept

25
Q

y = β0 + β1x (sometimes it is y = a + bx or y = mx + c)
Where x is the x-axis variable. This equation is called the
___

A

regression equation.
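
A minimal sketch of how the regression equation can be used for prediction. The cards do not say how the line is fitted, so ordinary least squares is assumed here; the function names `fit_line` and `predict` and the scores are hypothetical illustrations.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line (assumed method)."""
    mx, my = mean(xs), mean(ys)
    # Sum of cross-products of deviations, and sum of squared x deviations.
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx                # how steep the line is
    intercept = my - slope * mx      # expected y for a person scoring 0 on x
    return intercept, slope

def predict(x, intercept, slope):
    """Apply the regression equation y = b0 + b1 * x."""
    return intercept + slope * x

# Made-up scores that lie exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
b0, b1 = fit_line(xs, ys)
predicted = predict(6, b0, b1)
print(b0, b1, predicted)
```

Because the example data fall exactly on a line, the fitted intercept and slope recover that line; with real data the points would scatter around it.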

26
Q

We can make a ___ about one score from
another score

A

prediction

27
Q

Problem: if we don’t understand the ___, regression
lines and equations are ___.

A

scale(s), meaningless

28
Q

thinking about the relationship between two variables can
be very useful

A

Making Sense of Regression Lines

29
Q

When there is a relationship between two variables, we
can ___ one from the other.

A

predict

30
Q

We cannot say that one ___ the other.

A

explains

31
Q

We need some way of making the scales have some sort
of meaning, and the way to do this is to ___ the data
into ___

A

convert; standard deviation units.
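
As a rough illustration of converting raw scores into standard deviation units, the sketch below subtracts the mean and divides by the SD. The function name `standardize` and the scores are made up; the sample SD (dividing by N - 1) is assumed, in line with the later cards.

```python
from statistics import mean, stdev

def standardize(scores):
    """Convert raw scores to standardized scores (SD units)."""
    m = mean(scores)
    sd = stdev(scores)  # sample SD, which divides by N - 1
    return [(s - m) / sd for s in scores]

raw = [10, 12, 14, 16, 18]   # hypothetical raw scores
z = standardize(raw)
print(z)
```

After standardizing, the scores have a mean of 0 and an SD of 1, so two measures on different scales become directly comparable.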

32
Q

Talking in terms of SDs means that we are talking about
___

A

standardized scores.

33
Q

Because we are talking about standardized regression
slopes, we call it the “___”.

A

standardized slope.

34
Q

___ – a more important name for the
standardized slope.

A

Correlation coefficient

35
Q

In order to convert the units, we need to know the ___

A

SD of each of the measures.

36
Q

If we know the ___, we can calculate the correlation
using the formula: r = β × σx / σy

A

slope
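
The formula on this card can be tried out with a few lines of code. This is only a sketch: the function name and the numbers are hypothetical, chosen so the arithmetic is easy to follow.

```python
def correlation_from_slope(slope, sd_x, sd_y):
    """r = slope * (SD of x) / (SD of y), as on the card."""
    return slope * sd_x / sd_y

# Hypothetical values: a raw slope of 0.5, SD of x = 4, SD of y = 5.
r = correlation_from_slope(0.5, 4.0, 5.0)
print(r)
```

Multiplying by the SD of x and dividing by the SD of y is what turns the raw slope into a slope expressed in SD units.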

37
Q

The letter r actually stands for ___, but most people
ignore that because it is confusing

A

regression

38
Q

Thus, if we know the ___ we can calculate the correlation

A

slope

39
Q

3 ways to calculate the correlation coefficient “r”

A
  1. regression line
  2. standardized slope
  3. proportion of variance
40
Q

In correlation, we want to know how well the regression
line ___

A

fits the data.

41
Q

That is, how ___ the points are from the line.

A

far away

42
Q

The __ the points are to the line, the stronger the
relationship between the two variables.

A

closer

43
Q

When we had one variable and we wanted to know the
spread of the points around the mean, we calculated the
___

A

SD (σ).

44
Q

The square of the SD is the ___.

A

variance

45
Q

The difference between their
predicted score and their actual score. The difference is
called the ___.

A

Residual

46
Q

Their ____ (the difference between the score they
got and the score we thought they would get based on
their initial phoneme score)

A

residual score

47
Q

if we want to calculate the equivalent of the
variance, we need to ___ each person’s score

A

square

48
Q

___ = d squared

A

Residual squared

49
Q

The value of the standardized slope and the value of the
square root of the proportion of variance explained will
___ be the same value.

A

always
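
The claim on this card can be checked numerically. The sketch below fits a least-squares line (an assumption, since the cards do not name the fitting method) to made-up scores, then compares the standardized slope with the square root of the proportion of variance explained, computed from the squared residuals.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical scores for five people.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

mx, my = mean(xs), mean(ys)
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

# Way 1: the slope re-expressed in SD units.
standardized_slope = slope * stdev(xs) / stdev(ys)

# Way 2: proportion of variance explained, from the squared residuals.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
ss_residual = sum(d ** 2 for d in residuals)
ss_total = sum((y - my) ** 2 for y in ys)
proportion_explained = 1 - ss_residual / ss_total

print(standardized_slope, sqrt(proportion_explained))
```

The two printed values agree for this positive relationship. For a negative relationship the standardized slope is negative while the square root is positive, so the two match in size but not in sign.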

50
Q

We therefore have ___ of thinking about
correlation.

A

two equivalent ways

51
Q

The first way is the ___. It is the expected
increase in one variable (in SD units) when the other variable increases
by 1 SD.

A

standardized slope.

52
Q

The second way is the ___. If you
square a correlation, you get the proportion of variance in
one variable that is explained by the other variable.

A

proportion of variance.

53
Q

A correlation is both a ___ statistic.

A

descriptive and inferential

54
Q

We can find the ___ and we can also use
it to describe the ___

A

probability estimate ; strength of the relationship

55
Q
___ – strength of relationship
A

Magnitude

56
Q

___ – positive, negative, curvilinear etc.

A

Direction

57
Q

Cohen’s effect size:

A
  • r = 0.1 = small correlation
  • r = 0.3 = medium correlation
  • r = 0.5 = large correlation
58
Q

Note that these only really apply in what Cohen, called
___

A

Social and Behavioral sciences.

59
Q

Common mistake

A
  • A correlation around 0.5 is a large correlation.
  • A correlation does not have to exceed 0.5 to be large.
  • If you have a correlation of r = 0.45, you have a
    correlation which is approximately equal to a large
    correlation.
  • It’s not a medium correlation just because it hasn’t quite
    reached 0.5
60
Q

Pearson Correlation Coefficient
* Also known as ___

A

Pearson Product moment correlation.

61
Q

Pearson Product moment correlation developed by

A

Karl Pearson

62
Q

Pearson Correlation Coefficient
is a _____ and makes the ___
made by other parametric tests.

A

Parametric correlation; same assumptions

63
Q

level of measurement for Pearson Correlation Coefficient

A

Continuous and normally distributed data

64
Q

to determine r

A
  1. standardized slope
  2. proportion of variance
  3. pearson product moment correlation
65
Q

Optional Extra: Product Moments
* ___: the moment is the __ from the fulcrum
multiplied by the weight on the lever.

A

Physics; length

66
Q

___ the total moment is equal to the length
from the center, multiplied by the weight. The same principle applies with ___.

A

Seesaw analogy: correlation

67
Q

The same principle applies with correlation: needs to be balanced (raw to standard score) to be
___

A

comparable

68
Q
We find the ___ for each of the
variables. In this case the center is the ___.
A

length from the center; mean

69
Q

So, we calculate the difference between the score and the
mean for each variable (these are the ___) and then
we multiply them together (this is the ___).

A

moments; product

70
Q

Because this value is dependent on the ___,
we need to divide it by N.

A

number of people

71
Q

And because it is related to the ___, we
actually divide by N-1.

A

standard deviation

72
Q

This is called ___, and if we call the two variables
x and y,

A

covariance

73
Q

Just as before, we need to __ this value by
dividing by the ___

A

standardize; standard deviations.

74
Q

Calculating the Correlation
Coefficient:

we need to divide by ___, so we
multiply them together

A

both SDs
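
The product-moment steps on the preceding cards can be put together as follows. This is a sketch with a hypothetical function name and made-up scores: deviations from the mean are the moments, they are multiplied to give products, the sum is divided by N - 1 to give the covariance, and the covariance is divided by both SDs multiplied together.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation built up from product moments."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Moments: distance of each score from its mean; products: moments multiplied.
    products = [(x - mx) * (y - my) for x, y in zip(xs, ys)]
    covariance = sum(products) / (n - 1)
    sd_x = sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    # Standardize the covariance by dividing by both SDs multiplied together.
    return covariance / (sd_x * sd_y)

r_example = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r_perfect = pearson_r([1, 2, 3], [10, 20, 30])
print(r_example, r_perfect)
```

The second call uses points that lie exactly on a rising line, so the result is a correlation of 1 (up to floating-point rounding).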

75
Q

Importance of a scattergraph or plot:

A

It will show us approximately what the correlation should
be.

It will help us detect any errors in our data, for example
data entry errors.

It will help us get a feel of our data.

76
Q

The confidence intervals for a statistic tell us the likely
___

A

range of a value in the population.

77
Q

The sampling distribution of a correlation is ___. It is not
___, which means we can’t add and
subtract CIs in the usual way.

A

tricky; symmetrical

78
Q

___ – the transformation used, which
makes the distribution symmetrical.

A

Fisher’s z transformation

79
Q

Used to calculate the CIs and then transform back to
correlations.

A

Fisher’s z transformation

80
Q

It is called a ____, because it makes the
distribution of the correlation into a z distribution which
is a normal distribution with a mean of 0 and SD of 1.

A

z transformation
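
A sketch of how a confidence interval might be obtained with Fisher’s z transformation: transform r, build a symmetric interval on the z scale, then transform back. The standard error 1 / sqrt(N - 3) and the 1.96 multiplier for a 95% interval are standard results but are not stated on the cards, so treat them as assumptions; the function name and example values are hypothetical.

```python
from math import atanh, sqrt, tanh

def correlation_ci(r, n, z_crit=1.96):
    """Approximate CI for a correlation via Fisher's z (assumes n > 3)."""
    z = atanh(r)                 # Fisher's z transformation of r
    se = 1 / sqrt(n - 3)         # assumed standard error on the z scale
    lower_z = z - z_crit * se
    upper_z = z + z_crit * se
    # Transform the limits back to the correlation scale.
    return tanh(lower_z), tanh(upper_z)

lower, upper = correlation_ci(0.5, 30)
print(lower, upper)
```

The resulting interval is not symmetrical around r = 0.5: the lower limit lies further from r than the upper limit, which is why the limits cannot simply be added and subtracted on the correlation scale.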

81
Q

There are ___ to find the p-value associated with a
correlation.

A

2 ways

82
Q

Calculating the p-value

A
  1. Use table in Appendix 3.
83
Q

If we really want to know the p-value, then we can
convert the value for r into a ___

A

value for t.

84
Q

We can use this t-value to obtain the __using
a __

A

exact p-value, computer program
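
The cards do not give the conversion itself. A commonly used version, shown here as an assumption rather than as the source’s own formula, is t = r × sqrt(N - 2) / sqrt(1 - r²) with N - 2 degrees of freedom. The function name and example values are hypothetical.

```python
from math import sqrt

def r_to_t(r, n):
    """Convert a correlation to a t-value with n - 2 degrees of freedom."""
    df = n - 2
    t = r * sqrt(df) / sqrt(1 - r ** 2)
    return t, df

t_value, df = r_to_t(0.5, 30)
print(t_value, df)
```

The t-value and its degrees of freedom can then be passed to a statistics program, or looked up in a t table, to obtain the p-value.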

85
Q

When we know the __ we can also calculate the
___ of the regression line

A

correlation; position

86
Q

We can use the two values ___ to create
a regression equation which will allow us to predict y
___ from x ___

A

(slope and intercept); (display behavior); (desirability).

87
Q

We can use the ___ to draw a graph with the line
of best fit on it

A

predictions

88
Q

we have extended the line to __ – we would not
normally do this.

A

zero
