Regression Flashcards

1
Q

The least-squares regression line is

A

the unique line for which the sum of the signed vertical distances (residuals) between the data points and the line is zero, and the sum of the squared vertical distances is as small as possible.
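A minimal sketch of both properties, using made-up data and hypothetical helper names (`fit_line`, `sse`): the fitted line's residuals sum to zero, and nudging the slope or intercept only makes the sum of squares larger.

```python
# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

def fit_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def sse(a, b, xs, ys):
    """Sum of squared vertical distances between the points and y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

a, b = fit_line(xs, ys)

# Signed vertical distances (residuals) cancel out for the fitted line.
residual_sum = sum(y - (a + b * x) for x, y in zip(xs, ys))

# Any nearby line has a larger sum of squares than the fitted one.
best = sse(a, b, xs, ys)
nudged = [
    sse(a + da, b + db, xs, ys)
    for da in (-0.1, 0.0, 0.1)
    for db in (-0.1, 0.0, 0.1)
    if (da, db) != (0.0, 0.0)
]

print(residual_sum, best, min(nudged))
```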

2
Q

In the least-squares regression line, ŷ (y hat) is the

A

predicted y value on the regression line

3
Q

The slope of the regression line describes

A
  • how much we expect y to change, on average, for every unit change in x.
4
Q

How do you calculate the slope of the regression line?

A
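
The standard textbook formula, in notation assumed here (line written as ŷ = a + bx, r the correlation coefficient, s_x and s_y the sample standard deviations of x and y):

```latex
b = r \, \frac{s_y}{s_x}
  = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
```
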
5
Q

How do you calculate the intercept of the regression line?

A
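
The standard textbook formula, in notation assumed here (line written as ŷ = a + bx, with b the slope and x̄, ȳ the sample means):

```latex
a = \bar{y} - b \, \bar{x}
```
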
6
Q

The intercept of the regression line is a

A

necessary mathematical descriptor of the regression line. It does not describe a specific property of the data.

7
Q

The regression line always passes through

A

the point (x̄, ȳ): the mean of x and the mean of y
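
A quick numeric check on made-up data (all values are illustrative assumptions): the fitted line evaluated at the mean of x returns the mean of y.

```python
# Hypothetical data, chosen only for illustration.
xs = [2.0, 4.0, 6.0, 8.0]
ys = [1.0, 3.0, 4.0, 8.0]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept for y = a + b*x.
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

# Prediction at the mean of x lands on the mean of y.
y_hat_at_mean = a + b * x_bar
print(y_hat_at_mean, y_bar)
```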

8
Q

Least-squares regression is only for

A

linear associations

Don’t compute the regression line until you have confirmed that the relationship between x and y is linear. ALWAYS PLOT THE RAW DATA first.

9
Q

SSE vs SSR vs SST

A

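SST (total sum of squares) = sum of (y - ȳ)^2
SSE (error sum of squares) = sum of (y - ŷ)^2
SSR (regression sum of squares) = sum of (ŷ - ȳ)^2

SST = SSR + SSE, so R^2 = SSR/SST = 1 - (SSE/SST)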
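
A minimal numeric sketch of the three sums of squares on made-up data. It assumes the convention that SSE is the error (residual) sum of squares and SSR is the regression sum of squares; some texts swap these names.

```python
# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.5, 3.1, 4.2, 6.3, 6.9, 9.0]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares line y_hat = a + b*x.
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar
y_hat = [a + b * x for x in xs]

sst = sum((y - y_bar) ** 2 for y in ys)                 # total variation in y
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))    # left unexplained by the line
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained by the line

r_squared = 1 - sse / sst
print(sst, sse, ssr, r_squared)
```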

10
Q

r^2 is the ____

A

the coefficient of determination: the square of the correlation coefficient r.

It represents the fraction of the variance in y that can be explained by the regression model.
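
A small check on made-up data that the two descriptions agree for a simple (one-predictor) least-squares line: squaring the correlation coefficient gives the same number as 1 - SSE/SST. Variable names are illustrative assumptions.

```python
import math

# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Correlation coefficient, then squared.
r = sxy / math.sqrt(sxx * syy)
r_squared_from_r = r ** 2

# Fraction of variance explained by the fitted line.
b = sxy / sxx
a = y_bar - b * x_bar
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
r_squared_from_fit = 1 - sse / syy

print(r_squared_from_r, r_squared_from_fit)
```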

11
Q

If all variability can be explained by the line, R^2 =

A

1

12
Q

Outlier vs. “influential individual”

A

Outlier: An observation that lies outside the overall pattern. It is unusually far from the regression line vertically, so it has a large residual.

“Influential individual”: An observation that markedly changes the regression if removed. This is often an isolated point.
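
A minimal sketch of the difference, on made-up points and with a hypothetical helper `slope`: a point far from the line in y but in the middle of the x range barely moves the slope, while an isolated point far out in x changes it markedly.

```python
def slope(points):
    """Least-squares slope for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    return sxy / sxx

# Hypothetical data: five points close to a straight line.
base = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]

outlier = (3.0, 6.0)       # large residual, but x is in the middle of the range
influential = (15.0, 2.0)  # isolated in x, pulls the line toward itself

slope_base = slope(base)
slope_with_outlier = slope(base + [outlier])
slope_with_influential = slope(base + [influential])

print(slope_base, slope_with_outlier, slope_with_influential)
```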

13
Q

residuals

A

The signed vertical distances from each point to the least-squares regression line are called residuals: residual = observed y - predicted ŷ.

For a least-squares line, the residuals always sum to 0.

Outliers have unusually large residuals (in absolute value).

14
Q

Positive vs. negative residual

A

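Positive: the line underestimates the observed y (the point lies above the line).
Negative: the line overestimates the observed y (the point lies below the line).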
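
A minimal sketch of the sign convention, assuming residual = observed y - predicted y. The function name `describe_residual` and the numbers are illustrative only.

```python
def describe_residual(y_observed, y_predicted):
    """Return (residual, label), where residual = observed - predicted."""
    residual = y_observed - y_predicted
    if residual > 0:
        return residual, "underestimate"  # line is below the point
    if residual < 0:
        return residual, "overestimate"   # line is above the point
    return residual, "exact"

print(describe_residual(10.0, 8.0))
print(describe_residual(5.0, 8.0))
```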

15
Q

Use the equation of the least-squares regression to predict

A

y for any value of x within the range studied.

16
Q

extrapolation

A

Prediction outside the range of x values studied.
Avoid extrapolation.
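
One way to enforce this in practice is to refuse predictions outside the observed x range. This is a sketch under assumed names (`predict`, `x_min`, `x_max`) and made-up coefficients, not a standard library function.

```python
def predict(x, a, b, x_min, x_max):
    """Predict y = a + b*x, but only for x inside the range that was studied."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x={x} is outside the studied range [{x_min}, {x_max}]; "
            "predicting here would be extrapolation"
        )
    return a + b * x

# Hypothetical fitted line y = 1 + 2x, fitted on x values between 0 and 5.
print(predict(3.0, 1.0, 2.0, 0.0, 5.0))

try:
    predict(9.0, 1.0, 2.0, 0.0, 5.0)
except ValueError as err:
    print("refused:", err)
```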

17
Q

An association described by the least-squares regression equation,

A

however strong, does NOT imply causation.

The observed association could have an external cause.

18
Q

A lurking variable is

A

a variable that is not among the explanatory or response variables in a study, and yet may influence the relationship between the variables studied.

19
Q

We say that two variables are confounded when

A

their effects on a response variable cannot be distinguished from each other.

20
Q

Establishing causation from an observed association can be done if:

A
  1. The association is strong.
  2. The association is consistent.
  3. Higher doses are associated with stronger responses.
  4. The alleged cause precedes the effect.
  5. The alleged cause is plausible.