Chapter 5: Regression Flashcards

1
Q

ŷ = a + bx

A

This is the equation of the least squares regression line, where "y-hat" gives the predicted response for any x, a is the y-intercept, and b is the slope.

2
Q

The slope b of the least squares line is calculated as:

A

The slope of the regression line is the correlation times the ratio of the standard deviation of y to the standard deviation of x: b = r × (s_y / s_x)

3
Q

a = ȳ − bx̄

A

This gives the y-intercept of the least squares regression line: a is the mean of y minus the slope times the mean of x.
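
The slope and intercept formulas on these cards can be checked with a short Python sketch. The data values and the helper names `pearson_r` and `least_squares_line` are made up for illustration; only the standard library is assumed.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation r: average product of standardized x and y (n - 1 divisor)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    n = len(xs)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)

def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x using b = r * s_y / s_x and a = y-bar - b * x-bar."""
    r = pearson_r(xs, ys)
    b = r * stdev(ys) / stdev(xs)
    a = mean(ys) - b * mean(xs)
    return a, b

# Hypothetical example data, not taken from the flashcards.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x")
```

For this made-up data the slope works out to about 0.6 and the intercept to about 2.2.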

4
Q

True or false: we should always report r² along with our line to show how well it fits the data

A

True, because a least squares regression line can always be computed, regardless of how well it fits the data

5
Q

To extrapolate

A

is to use the line to predict responses for x outside the range of our observed x-values, which we should avoid

6
Q

a residual

A

is the signed vertical distance between a data point and the least squares regression line (positive when the point lies above the line, negative when it lies below)

7
Q

influential points

A

Observations that affect the calculations more than the rest, or whose removal would drastically change the results. If the discrepancy is only in the x direction (the explanatory variable), the influential point may affect the regression line; if it is in either the x or y direction (either variable), the influential point may affect the correlation and/or the slope of the line.
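
One way to see influence is to fit the line with and without a suspect point and compare the slopes. The sketch below uses made-up data with one point far out in the x direction; the helper name `slope` is hypothetical.

```python
def slope(xs, ys):
    """Least squares slope: sum of cross-deviations over sum of squared x-deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: five ordinary points plus one point with an extreme x-value.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
extreme_x, extreme_y = 20, 0

slope_without = slope(xs, ys)
slope_with = slope(xs + [extreme_x], ys + [extreme_y])

print(f"slope without the extreme point: {slope_without:.3f}")
print(f"slope with the extreme point:    {slope_with:.3f}")
```

In this illustration the single extreme point is enough to flip the sign of the slope.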

8
Q

criteria to determine whether or not it's likely that there is a causal relationship

A
  • The association is strong.
  • The association is consistent across different datasets.
  • Higher values of explanatory variable are associated with higher values of response variable.
  • The explanatory variable precedes the response variable (in time).
  • The explanatory variable is plausible as a cause of the response variable.
9
Q

r²

A

gives the percentage of variation in y that is explained by the least squares regression line.
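
"Variation explained" can be made concrete by comparing the squared residuals around the line with the squared deviations of y around its mean. This is a minimal sketch on made-up data; the variable names `sse` and `sst` are labels chosen here, not terms from the flashcards.

```python
# Hypothetical example data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n

# Least squares slope and intercept.
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum((x - x_bar) ** 2 for x in xs)
a = y_bar - b * x_bar

# Variation left over after fitting the line (sum of squared residuals).
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
# Total variation in y around its mean.
sst = sum((y - y_bar) ** 2 for y in ys)

r_squared = 1 - sse / sst
print(f"r^2 = {r_squared:.2f}")
```

Here the line accounts for about 60% of the variation in y.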

10
Q

ecological correlation

A

correlation based on averages rather than on individuals (not to be trusted, because it tends to overstate the strength of the association)

11
Q

lurking variables

A

Lurking variables are always potential problems in observational studies. Experiments are necessary to exclude the effect of lurking variables so that we can draw conclusions about the explanatory variable causing changes in the response variable.

12
Q

residual formula

A

observed y - predicted y
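
A short sketch of the residual formula, assuming a line ŷ = 2.2 + 0.6x and a small made-up dataset (both are illustrative values, not from the flashcards):

```python
# Hypothetical fitted line y-hat = a + b*x.
a, b = 2.2, 0.6

# Hypothetical observed data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# residual = observed y - predicted y
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

for x, y, e in zip(xs, ys, residuals):
    print(f"x={x}: observed={y}, predicted={a + b * x:.1f}, residual={e:+.1f}")
```

Because this line happens to be the least squares line for the data, the residuals sum to zero (up to rounding).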

13
Q

the four statistics related to regression

A

slope, y-intercept, correlation, and r²

14
Q

Interchanging x and y always changes:

A

slope and y-intercept, but not correlation
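
This can be checked by fitting the line twice, once predicting y from x and once predicting x from y. The helper `fit` and the data are hypothetical.

```python
from math import sqrt

def fit(xs, ys):
    """Return (intercept, slope, correlation) for the regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r

# Hypothetical example data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a_yx, b_yx, r_yx = fit(xs, ys)  # predict y from x
a_xy, b_xy, r_xy = fit(ys, xs)  # predict x from y (variables interchanged)

print(f"y on x: intercept={a_yx:.2f}, slope={b_yx:.2f}, r={r_yx:.3f}")
print(f"x on y: intercept={a_xy:.2f}, slope={b_xy:.2f}, r={r_xy:.3f}")
```

For this data the slope and intercept both change after the swap, while r is identical in the two fits.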

15
Q

to calculate correlation…

A

take the square root of r² (the percentage of variation in y that x explains, written as a decimal) and give it the same sign as the slope
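
The square root alone only gives the size of r, so the sign has to come from the slope of the line. A minimal sketch, with a hypothetical helper name and made-up inputs:

```python
from math import copysign, sqrt

def correlation_from_r_squared(r_squared, slope):
    """Recover r from r^2 (as a decimal), taking the sign from the slope."""
    return copysign(sqrt(r_squared), slope)

# Hypothetical values: 60% of variation explained, with rising and falling lines.
r_positive = correlation_from_r_squared(0.60, slope=0.6)
r_negative = correlation_from_r_squared(0.60, slope=-0.6)

print(f"{r_positive:.3f}, {r_negative:.3f}")
```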
