5 - Correlation & Regression Flashcards

1
Q

Are correlation and causation the same thing?

A

No. The fact that two variables appear correlated does not mean that one causes the other

2
Q

How can you tell whether there is positive or negative correlation based on a graph?

A
  • Positive = line of best fit slopes upwards and the r value is positive
  • Negative = line of best fit slopes downwards and the r value is negative
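A minimal numpy sketch (made-up values) showing that the sign of r matches the slope direction of the best-fit line:

```python
import numpy as np

# Hypothetical data: upward-sloping vs downward-sloping best-fit lines
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_up = np.array([1.2, 2.1, 2.9, 4.2, 5.1])    # rises with x
y_down = np.array([5.0, 4.1, 3.2, 1.9, 1.1])  # falls as x rises

print(np.corrcoef(x, y_up)[0, 1])    # positive r (~ +1): upward slope
print(np.corrcoef(x, y_down)[0, 1])  # negative r (~ -1): downward slope
```
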
3
Q

What are the units of correlation?

A

Unit free

4
Q

What is r? What does it measure?

A
  • Sample correlation coefficient
  • Measures both the strength and direction of a linear relationship between two continuous variables
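A minimal numpy sketch (hypothetical values) computing r as the covariance of x and y divided by the product of their standard deviations; numpy's built-in corrcoef gives the same number:

```python
import numpy as np

# Hypothetical paired measurements for two continuous variables
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# r = covariance(x, y) / (sd(x) * sd(y))
r = np.cov(x, y, ddof=1)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))
print(round(r, 3))              # close to +1: strong positive linear relationship
print(np.corrcoef(x, y)[0, 1])  # same value from numpy's built-in correlation
```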

5
Q

What is r^2?

A
  • Coefficient of determination
  • Multiplied by 100, it gives the percentage of variance the two variables share in common (e.g. r^2 = 0.64 means the two variables share 64% of their variance)
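A short continuation of the same idea, with hypothetical data: squaring r gives the proportion of variance the two variables share.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

r = np.corrcoef(x, y)[0, 1]
r_squared = r ** 2
# e.g. r_squared = 0.99 -> "the two variables share 99% of their variance in common"
print(f"shared variance: {r_squared * 100:.0f}%")
```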

6
Q

When would you want to do a 2-tailed test instead of a 1-tailed test?

A
  • Use a 2-tailed test when you are not sure whether the correlation will be positive or negative
  • If you suspect a specific direction (positive or negative), use a 1-tailed test
  • *Be careful: in a 2-tailed test the rejection region on each side is alpha/2, which is smaller than the single rejection region of alpha in a 1-tailed test, so a 2-tailed test can lead you to fail to reject H0 when a 1-tailed test would have rejected it
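A small sketch of that caveat using scipy's t distribution (the sample size and alpha here are made up): the 2-tailed cutoff sits farther from zero than the 1-tailed cutoff, so a borderline test statistic can miss significance 2-tailed even though it would reach it 1-tailed.

```python
from scipy import stats

alpha = 0.05
n = 20                      # hypothetical sample size
df = n - 2                  # degrees of freedom for a correlation/slope t test

t_one_tailed = stats.t.ppf(1 - alpha, df)      # critical t, all of alpha in one tail
t_two_tailed = stats.t.ppf(1 - alpha / 2, df)  # critical t, alpha/2 in each tail

print(t_one_tailed, t_two_tailed)  # ~1.73 vs ~2.10
# A test statistic of, say, t = 1.9 rejects H0 one-tailed but not two-tailed.
```
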
7
Q

What is the difference between simple and multiple regression?

A
  • Simple = one predictor (independent variable) used to predict one outcome
  • Multiple = more than one predictor (independent variable), examined together to see how they collectively predict the dependent outcome measure
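A minimal numpy sketch of both, with made-up data: simple regression fits one predictor against the outcome, multiple regression fits several predictors at once.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)                                   # predictor 1
x2 = rng.normal(size=n)                                   # predictor 2
y = 2.0 * x1 + 0.5 * x2 + rng.normal(scale=0.3, size=n)   # hypothetical outcome

# Simple regression: 1 predictor -> 1 outcome (design matrix: intercept + x1)
X_simple = np.column_stack([np.ones(n), x1])
coef_simple, *_ = np.linalg.lstsq(X_simple, y, rcond=None)

# Multiple regression: >1 predictor -> same outcome (intercept + x1 + x2)
X_multi = np.column_stack([np.ones(n), x1, x2])
coef_multi, *_ = np.linalg.lstsq(X_multi, y, rcond=None)

print(coef_simple)  # [b, m1]
print(coef_multi)   # [b, m1, m2]
```
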
8
Q

What is the purpose of regression?

A
  • To see how well one variable predicts another
  • If you are unsure which variable predicts which, use correlation instead

9
Q

What do we need to know to describe the regression line?

A
  • The slope (m) – has a clear practical interpretation
  • The y-intercept (b) – may or may not have a practical interpretation
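A quick numpy sketch (hypothetical dose-response data) showing that the fitted line is fully described by these two numbers:

```python
import numpy as np

# Hypothetical data: dose (x) and response (y)
x = np.array([10, 20, 30, 40, 50], dtype=float)
y = np.array([12, 19, 33, 41, 48], dtype=float)

m, b = np.polyfit(x, y, deg=1)   # degree-1 fit returns [slope, intercept]
print(f"slope m = {m:.2f}")      # e.g. each 1-unit increase in x predicts ~0.94 more y
print(f"intercept b = {b:.2f}")  # predicted y at x = 0 (may not be meaningful)
```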

10
Q

What is residual value?

A
  • The difference between the observed y and the y predicted by the line (y = mx + b)
  • *We want to minimize this gap
11
Q

What is the principle of least squares?

A

Making the residuals as small as possible by choosing the values of m and b that minimize the sum of the squared residuals (SSE)
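A sketch of the principle with the closed-form least-squares estimates (made-up data): the fitted m and b give a smaller SSE than any other choice, illustrated here against one arbitrary alternative line.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])

# Closed-form least-squares slope and intercept
m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - m * x.mean()

residuals = y - (m * x + b)    # observed y minus predicted y
sse = np.sum(residuals ** 2)   # sum of squared residuals (SSE)

# Any other choice of m and b gives a larger SSE, e.g.:
sse_other = np.sum((y - (2.1 * x + 0.5)) ** 2)
print(sse, sse_other)          # sse <= sse_other
```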

12
Q

What does it mean when slope is 0?

A
  • No correlation between x and y
  • A sample slope of exactly 0 is unlikely to occur, even when there is no real relationship between the dependent and independent variables

13
Q

If Ha is that beta doesn’t equal 0, is this 1 or 2 tailed? How do you know?

A
  • 2-tailed: when Ha states that beta does not equal 0 (a "not equal" sign), the test is 2-tailed
  • When Ha uses > or <, it is 1-tailed

14
Q

What is SST?

A
  • Sum of squares total
  • Captures total variation in y
  • SST = SSR + SSE
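A numpy sketch (made-up data) verifying the decomposition: the total variation in y splits into the part the regression explains (SSR) and the part it does not (SSE).

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])

m, b = np.polyfit(x, y, deg=1)
y_hat = m * x + b

sst = np.sum((y - y.mean()) ** 2)      # total variation in y
ssr = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the regression
sse = np.sum((y - y_hat) ** 2)         # variation left unexplained

print(np.isclose(sst, ssr + sse))      # True: SST = SSR + SSE
```
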
15
Q

What is SSR?

A
  • Sum of squares due to regression
  • Captures the variation in y explained by the regression

16
Q

What is SSE?

A
  • Sum of squares errors of prediction
  • Captures the variation in y not explained by the regression

17
Q

How do you calculate r^2 from ANOVA?

A

r^2 = SSR / SST
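A numpy sketch (same kind of made-up data) showing that r^2 computed from the sums of squares matches the squared correlation coefficient:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])

m, b = np.polyfit(x, y, deg=1)
y_hat = m * x + b

sst = np.sum((y - y.mean()) ** 2)      # total variation in y
ssr = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the regression

r_squared_anova = ssr / sst
r_squared_direct = np.corrcoef(x, y)[0, 1] ** 2
print(np.isclose(r_squared_anova, r_squared_direct))  # True
```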

18
Q

What are some assumptions for ANOVA?

A
  • x and y must be paired (each patient must have a value for both variables)
  • At least one variable must consist of independent observations
  • At least one of the variables with independent observations must be approximately normally distributed
  • Let y be the variable satisfying the independence and normality assumptions; the variability of y shouldn't change as the other variable changes
  • The relationship between the 2 variables must be linear
19
Q

What are some requirements and assumptions for multiple regression?

A
  • Subjects-to-variables ratio (at least 10 subjects per predictor variable; fewer than that inflates error and increases the chance of a Type I error)
  • Normality
  • Multicollinearity must be checked (it occurs when some independent variables are so strongly related to each other that it distorts the model; a correlation of 0.7 or stronger between 2 independent variables)
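A small sketch (made-up predictors; the variable names are hypothetical) of screening for multicollinearity by checking pairwise correlations among the independent variables against the 0.7 cutoff mentioned above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
weight = rng.normal(70, 10, size=n)          # hypothetical predictor
bmi = weight / 2.8 + rng.normal(0, 1, n)     # deliberately near-duplicate of weight
age = rng.normal(40, 12, size=n)             # unrelated predictor

predictors = {"weight": weight, "bmi": bmi, "age": age}
names = list(predictors)
corr = np.corrcoef(np.vstack(list(predictors.values())))

# Flag predictor pairs correlated at 0.7 or stronger
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if abs(corr[i, j]) >= 0.7:
            print(f"possible multicollinearity: {names[i]} vs {names[j]} (r = {corr[i, j]:.2f})")
```
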
20
Q

How can you fix multicollinearity?

A

Combine the collinear variables into one, or remove one of them