Correlation and Regression Flashcards

1
Q

what is correlation a form of?

A

bivariate analysis
- relationship between 2 variables
focus on direction and degree

2
Q

what is a linear relationship?

A

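for every unit increase in x, y changes by a constant amount (an increase if the relationship is positive, a decrease if it is negative), so the points fall along a straight line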

3
Q

what are some examples of non linear relationship?

A

practice and performance: eg, when learning a musical instrument, you are likely to learn a lot in the first year, and your progress is likely to slow over time

4
Q

T or F? Even when there is a non linear relationship, it makes sense to use correlation measures? Why

A

False

Because you might get a correlation close to zero when in fact there is a strong U-shaped relationship in the data
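A minimal sketch of the U-shape problem, using made-up data (the x and y values and the helper name pearson_r are assumptions for illustration, not part of the cards). Here y is determined entirely by x, yet the Pearson correlation comes out at zero:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r = SP / sqrt(SSx * SSy)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical data: y depends perfectly on x, but in a U shape
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_u_shape = pearson_r(xs, ys)
print(r_u_shape)  # zero: the positive and negative halves cancel out
```

Plotting the data first is the usual way to catch this before trusting the number.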

5
Q

what are the rules of thumb on how big or small a correlation is?

A

small (.1 to .3)
medium (.3 to .5)
large (.5 and above)

6
Q

what is r squared? what is it used for?

A

the correlation coefficient, squared
squaring the correlation coefficient gives an estimate of the proportion of variance that is accounted for by your model - how much variance does your predictor account for?
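A tiny worked example, assuming a hypothetical correlation of .5 (the value is made up for illustration). It shows why a "large" r can still leave most of the variance unexplained:

```python
r = 0.5              # hypothetical correlation between predictor and outcome
r_squared = r ** 2   # proportion of variance accounted for

print(f"{r_squared:.0%} of the variance is accounted for")
```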

7
Q

if your predictor accounts for 50% of the variance, what does this mean?

A

that 50% of the variation across subjects can be accounted for by the predictor you have

8
Q

what is variability?

A

how much a given variable varies from observation to observation - eg how much height in the class varies

9
Q

what is covariability?

A
how much two variables vary together
eg, if we take the class height and weight, as height increases (or decreases) how does that impact weight? positively, negatively or no relationship? do two variables vary together or independently of each other?
10
Q

what is the sum of squares used for? how is it calculated?

A

it gives a rough estimate of variability
SS = Σ(X - X(mean))^2
take each individual’s height, subtract the mean height, and square the result; then sum these squared deviations across all observations
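The steps above can be sketched in a few lines. The heights are hypothetical values chosen so the arithmetic is easy to check by hand:

```python
heights = [150, 160, 170, 180, 190]  # hypothetical heights in cm

mean_height = sum(heights) / len(heights)         # 170.0
deviations = [h - mean_height for h in heights]   # -20, -10, 0, 10, 20
ss = sum(d ** 2 for d in deviations)              # 400 + 100 + 0 + 100 + 400

print(ss)  # 1000.0
```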

11
Q

to measure the variability you use ____

to measure covariability you use ____

A

sum of squares; sum of products

12
Q

how is the sum of products calculated?

A

SP = Σ (X - X(mean)) × (Y - Y(mean))
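A short sketch of the same calculation, with hypothetical heights and weights (the data and the helper name sum_of_products are assumptions for illustration). Passing the same variable twice gives back its sum of squares:

```python
def sum_of_products(xs, ys):
    """SP = sum of (X - mean of X) * (Y - mean of Y) over paired observations."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

heights = [150, 160, 170, 180, 190]  # hypothetical, cm
weights = [50, 55, 65, 70, 80]       # hypothetical, kg

sp = sum_of_products(heights, weights)
sp_with_itself = sum_of_products(heights, heights)

print(sp)              # 750.0
print(sp_with_itself)  # 1000.0, the sum of squares of heights
```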

13
Q

when will the sum of squares be identical to the sum of products?

A

when both variables are identical

14
Q

how do we calculate the pearson correlation coefficient?

A

r = SP / √(SSx × SSy)
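Putting SP and SS together, a minimal sketch of the formula on hypothetical height and weight data (the values and the function name pearson_r are made up for illustration):

```python
from math import sqrt

def pearson_r(xs, ys):
    """r = SP / sqrt(SSx * SSy)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

heights = [150, 160, 170, 180, 190]  # hypothetical, cm
weights = [50, 55, 65, 70, 80]       # hypothetical, kg

r = pearson_r(heights, weights)
print(round(r, 3))  # SP = 750, SSx = 1000, SSy = 570, so r is about 0.993
```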

15
Q

what is the worded formula for calculating the pearson correlation coefficient?

A

r = (covariability of X and Y) / (variability of X and Y separately)
ie, r is a ratio

16
Q

what happens if we have relatively low co-variability of X and Y compared to variability of X and Y separately?

A

we have a weak correlation

17
Q

what can drastically influence your correlation value?

A

extreme scores or outliers

18
Q

what is regression towards the mean?

A

when an extreme score on one measure tends to be followed by a less extreme score on the other measure. Because extreme scores are often partly due to chance, the other value is unlikely to be just as extreme. eg, after a really, really rainy day, the following day is likely to be less rainy

19
Q

what is an example of the regression towards the mean?

A

1 or 2 people might guess 10 coin flips correctly, and 1 or 2 people might correctly guess a number between 1 and 50, but they are highly unlikely to be the same people: an extreme score on the coin flips is most likely to be followed by a score closer to the mean on the next measure
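A rough simulation of the same idea, under stated assumptions: each person gets two scores that are pure chance (random normal draws stand in for the two guessing games), and the seed, sample size, and cut-off of 50 people are arbitrary choices for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 1000

# Two chance-only scores per person (stand-ins for two unrelated guessing games)
first = [rng.gauss(0, 1) for _ in range(n)]
second = [rng.gauss(0, 1) for _ in range(n)]

# Pick the 50 people with the most extreme (highest) scores on the first measure
top = sorted(range(n), key=lambda i: first[i], reverse=True)[:50]

top_first_mean = sum(first[i] for i in top) / len(top)
top_second_mean = sum(second[i] for i in top) / len(top)

# The same people are far above the mean on the first score,
# but close to the mean (0) on the second
print(round(top_first_mean, 2), round(top_second_mean, 2))
```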

20
Q

what is the null hypothesis for correlation?

A

that the correlation in the population is zero

21
Q

what is asked when determining if the null hypothesis is to be rejected or retained?

A

once the r value has been calculated, we ask: what is the probability of finding an r value at least this big if the real association in the population is zero? If this probability is small (conventionally below .05), we reject the null hypothesis

22
Q

what are the degrees of freedom for a correlation?

A

the number of participants (N) minus 2
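One common way to use these degrees of freedom (an assumption here, since the cards do not say which test is applied) is to convert r into a t statistic with N - 2 degrees of freedom and look that up in a t table. A sketch with a hypothetical r and sample size:

```python
from math import sqrt

def t_from_r(r, n):
    """Convert a correlation into a t statistic with df = n - 2."""
    df = n - 2
    t = r * sqrt(df) / sqrt(1 - r ** 2)
    return t, df

# Hypothetical result: r = .5 from 27 participants
t, df = t_from_r(r=0.5, n=27)
print(df, round(t, 2))  # df = 25; t = 2.5 / sqrt(0.75), about 2.89
```

The p-value itself would come from a t table or a statistics package; it is left out here on purpose.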

23
Q

how is spearman’s correlation used? when is it used?

A

convert the data to ranks before calculating correlations…

used when asking the question are values that are high on one variable also high on the other variable?

24
Q

why would you use spearman’s correlation?

A

when the relationship is non linear but still consistently increasing or decreasing (monotonic), or when there are outliers: ranking removes the pull of extreme values, as values such as 15, 20, 23232 become ranks of 1, 2, 3.
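A minimal sketch of the ranking idea, with made-up data. The helper names are hypothetical, and the simple ranks function assumes there are no tied values (real implementations average the ranks of ties):

```python
from math import sqrt

def ranks(values):
    """Rank from 1 (smallest) upward. Assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def pearson_r(xs, ys):
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

print(ranks([15, 20, 23232]))  # [1, 2, 3]: the outlier is just "the largest"

# Hypothetical data with one extreme x value; y always rises when x rises
xs = [15, 20, 23232, 18, 25]
ys = [1, 3, 9, 2, 4]

# Spearman's correlation is Pearson's r calculated on the ranks
spearman = pearson_r(ranks(xs), ranks(ys))
print(spearman)  # 1.0, because the rank orders match exactly
```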

25
Q

what is reliability?

A

the consistency of a measure…. does a measure or test return the same results each time?

26
Q

what does cronbach’s alpha measure?

A

reliability (the internal consistency of a set of items)

27
Q

what does cronbach’s alpha require?

A

at least 3 items or scales (mathematically, the formula works with 2 or more)

28
Q

how is cronbach’s alpha calculated?

A

by taking the average covariance between item pairs (scaled by the number of items squared) and dividing it by the total variance of the summed scale
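A sketch of one way to compute it, using the standard variance-based form alpha = (k / (k - 1)) × (1 - sum of item variances / variance of total scores), which is algebraically equivalent to the average-covariance description. The item scores and the function name cronbach_alpha are made up for illustration:

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's total
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical scores: 3 items answered by 5 respondents
items = [
    [3, 4, 5, 2, 4],
    [3, 5, 5, 1, 4],
    [2, 4, 4, 2, 5],
]

alpha = cronbach_alpha(items)
print(round(alpha, 2))  # item variances sum to 5.9, total variance is 15.3
```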

29
Q

T or F? the value of cronbach’s alpha directly represents the proportion of reliable variance? Eg, value of .7 means 70% reliable variance?

A

True

30
Q

what is the rough rule of thumb for cronbach’s alpha?

A
excellent: equal to or greater than 0.9 
Good: 0.8 to 0.9
acceptable: 0.7 to 0.8 
questionable: 0.6 to 0.7 
poor: 0.5 to 0.6 
unacceptable: less than 0.5
31
Q

what is regression about?

A

predicting one variable from another

32
Q

what is the general formula for a linear relationship (the regression model)? give example

A
Y = a + b X + e (in a perfect linear relationship, e is 0 for every observation)
eg, Y = a + b × (IQ) + e
33
Q

what does each component of the general linear formula (Y = a + b X + e) represent?

A
Y = outcome variable (what is being predicted) 
a = y intercept (value of y when x = 0)
b = slope (how much Y changes for a one-unit increase in X)
X = predictor variable 
e = error or residual term
34
Q

what are some assumptions about errors?

A
  • errors are independent of one another
  • normally distributed (if we were to plot all of our errors, it would roughly follow a normal distribution)
  • homoscedastic (equal error variance across all levels of predicted Y)
35
Q

how can we estimate a regression model?

A

least squares parameter estimates (choose the a and b that make the total squared error as small as possible)

36
Q

how do you calculate error or residual?

A

subtract the predicted value from the observed value

e = Y - Y(predicted)

37
Q

how do you get the total squared error?

what is the formula?

A

calculate the error for every value then square them and sum them up
Σ (Y-Y(predicted))^2

38
Q

what is the symbol for the slope? how do you calculate it?

A

b

b = SP / SSx

39
Q

what is the symbol for the intercept? how do you calculate it?

A

a

a = Y(mean) - b × X(mean)
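The slope, intercept, residual, and total squared error formulas can be sketched together. The data are hypothetical and chosen so that the numbers are easy to verify by hand:

```python
xs = [1, 2, 3, 4, 5]  # hypothetical predictor scores
ys = [2, 4, 5, 4, 5]  # hypothetical outcome scores

mean_x = sum(xs) / len(xs)  # 3.0
mean_y = sum(ys) / len(ys)  # 4.0

sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))  # 6.0
ss_x = sum((x - mean_x) ** 2 for x in xs)                      # 10.0

b = sp / ss_x            # slope: SP / SSx
a = mean_y - b * mean_x  # intercept: Y(mean) - b * X(mean)

predicted = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]  # Y - Y(predicted)
total_squared_error = sum(e ** 2 for e in residuals)

print(round(b, 2), round(a, 2), round(total_squared_error, 2))  # 0.6 2.2 2.4
```

Any other choice of a and b would give a larger total squared error on these data, which is what "least squares" refers to.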

40
Q

what are other names for the intercept?

which one is the one they use in SPSS?

A
a
constant (this is the one used in SPSS)
y-intercept
predicted value of Y when X = 0
41
Q

what are other names for the slope?

which one is the one they use in SPSS?

A

b (labelled B in SPSS output)
rise over run
effect on Y of a one-unit increase in the predictor

42
Q

how do you measure the variance accounted for by the model?

A

the correlation coefficient, squared

R^2

43
Q

what is r squared showing us?

A

the proportion of variance accounted for
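A sketch showing that, for simple regression with one predictor, squaring r gives the same number as 1 minus (total squared error / SS of Y). The data are hypothetical:

```python
from math import sqrt

xs = [1, 2, 3, 4, 5]  # hypothetical predictor scores
ys = [2, 4, 5, 4, 5]  # hypothetical outcome scores

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
ss_x = sum((x - mean_x) ** 2 for x in xs)
ss_y = sum((y - mean_y) ** 2 for y in ys)

# Route 1: square the correlation
r = sp / sqrt(ss_x * ss_y)
r_squared = r ** 2

# Route 2: compare unexplained variability with total variability in Y
b = sp / ss_x
a = mean_y - b * mean_x
squared_error = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
proportion_explained = 1 - squared_error / ss_y

print(round(r_squared, 3), round(proportion_explained, 3))  # both 0.6
```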

44
Q

when do you use the ANOVA test in regression?

A

to determine whether the variance explained is significantly different from zero

45
Q

what does b (slope) indicate?

A
  1. whether there is a relationship between X and Y
  2. whether that relationship is positive or negative
  3. an estimate of the expected change in Y when X increases by 1
    it DOES NOT indicate the strength of the correlation (b is not the correlation)
46
Q

what do you have to do to the slope so that it matches the correlation? how do you do this?
what is this new coefficient called?

A
  • translate it into a standardised form
  • convert X and Y into Z scores and then do the regression
  • standardised regression coefficient
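A sketch of the standardisation step with hypothetical data: the raw slope depends on the units of X and Y, but the slope from regressing z scores on z scores matches r. The helper names are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def z_scores(values):
    """Convert raw scores to z scores (mean 0, standard deviation 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def slope(xs, ys):
    """b = SP / SSx."""
    mx, my = mean(xs), mean(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sp / sum((x - mx) ** 2 for x in xs)

def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4, 5]  # hypothetical predictor scores
ys = [2, 4, 5, 4, 5]  # hypothetical outcome scores

b = slope(xs, ys)                         # raw slope, in the units of Y per X
beta = slope(z_scores(xs), z_scores(ys))  # standardised slope
r = pearson_r(xs, ys)

print(round(b, 3), round(beta, 3), round(r, 3))  # beta and r agree
```

This equality holds for simple regression with a single predictor.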
47
Q

what regression diagnostics can be run?

A

histogram of residuals

residual plot - homoscedasticity

48
Q

what do we hope or expect to find when we run a histogram of residuals to be in the clear?

A

an approximately normal distribution

49
Q

what is a residual plot? what are we looking for?

A
  • a scatterplot of residuals against predicted values to check for heteroscedasticity
  • the absence of any systematic pattern supports the assumption of homoscedasticity
50
Q

T or F? in simple regression (one predictor), the standardised coefficient is not the same thing as the correlation (r)

A

False

51
Q

T or F?
correlation tells us how tightly clustered the values are around the regression line but the line can have any sort of slope

A

True