Lecture 3: Regression Intro Flashcards

The Regression Line (Definition, Calculation, and Interpretation), Explaining Variance using regression, and significance testing

1
Q

What is the Centroid?

A

Point defined by plotting the means of 2 variables

• represents the center of the cloud of points
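
As a minimal sketch, the centroid can be computed as the pair of means. The data values below are hypothetical and chosen only for illustration; they do not come from the lecture.

```python
from statistics import mean

# Hypothetical illustration data (not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# The centroid is the point (mean of X, mean of Y).
centroid = (mean(x), mean(y))
print(centroid)
```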

2
Q

A regression line must …

A
• pass through the centroid

• come as close as possible to all points (that is, make the sum of the squared residuals as small as possible).

3
Q

What does the slope (b) of a bivariate regression line tell us?

A

the amount Y is expected to change when X changes by 1 unit.

4
Q

What does the intercept (a) of a bivariate regression line tell us?

A

the predicted value for Y when X = 0.

5
Q

What does the Method of Least Squares guarantee?

A

the line will come as close as possible to all the data points (the sum of the squared e values, ∑e², will be as small as possible).

6
Q

What is the calculation process for the Method of Least Squares?

A

1st - calculate the slope: b = SCP / SSx, where SCP = ∑(X − X̅)(Y − Y̅) is the sum of cross-products and SSx = ∑(X − X̅)²

2nd - calculate the intercept: a = Y̅ − bX̅
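
The two steps can be sketched in Python as follows. The function name `least_squares` and the data are hypothetical, used only to illustrate the calculation.

```python
def least_squares(x, y):
    """Return (a, b) for the bivariate regression line Y' = a + bX."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Step 1: slope = sum of cross-products / sum of squares of X.
    scp = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b = scp / ss_x
    # Step 2: intercept = mean of Y minus slope times mean of X.
    a = y_bar - b * x_bar
    return a, b


# Hypothetical illustration data (not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = least_squares(x, y)
print(a, b)
```

Because the intercept is computed from the two means, the fitted line passes through the centroid by construction.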

7
Q

Why doesn’t the calculation of the slope in Method of Least Squares use SSy in the denominator?

A

because the line uses scores on X to predict scores on Y, so the slope is scaled by the variation in the predictor (SSx)

8
Q

∑e = ? (hint: Method of Least Squares)

A

∑e = 0 (always!)
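
A small check of this property, as a sketch on hypothetical data: the residuals of the least-squares line sum to zero (up to floating-point rounding), and nudging the intercept or slope away from the fitted values only increases ∑e². The helper names `residuals` and `sse` are made up for this illustration.

```python
# Hypothetical illustration data (not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares slope and intercept.
scp = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_x = sum((xi - x_bar) ** 2 for xi in x)
b = scp / ss_x
a = y_bar - b * x_bar


def residuals(intercept, slope):
    """Observed Y minus predicted Y for each case."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def sse(intercept, slope):
    """Sum of squared residuals for a candidate line."""
    return sum(e ** 2 for e in residuals(intercept, slope))


sum_e = sum(residuals(a, b))  # approximately 0
best = sse(a, b)              # smallest achievable sum of squared residuals
# Any small change to a or b gives a larger sum of squared residuals.
worse = [sse(a + 0.1, b), sse(a - 0.1, b), sse(a, b + 0.1), sse(a, b - 0.1)]
print(sum_e, best, worse)
```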

9
Q

Because of the Method of Least Squares, ∑e² is always …

A

the smallest possible value

10
Q

Standard Error

A

SE = (∑e² / N)^(1/2)

• the amount, on average, that predicted Y values differ from observed Y scores (similar to the variance and standard deviation calculations)
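
A minimal sketch of this calculation on hypothetical data. It follows the card's formula with N in the denominator; many texts divide by N − 2 (the residual degrees of freedom) instead, so treat the choice of denominator here as an assumption.

```python
import math

# Hypothetical illustration data (not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares slope and intercept.
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar

# Residuals: observed Y minus predicted Y.
e = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Standard error as on the card: square root of the mean squared residual.
se = math.sqrt(sum(ei ** 2 for ei in e) / n)
print(se)
```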

11
Q

What are 3 ways to think about variance?

A

• the average squared deviation of the subjects’ scores from the mean of their scores.
• a measure (in squared units) of how much the subjects differ amongst themselves.
• for a variable X, the expectation of its square minus the square of its expectation: Var(X) = E[X²] − (E[X])².
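
The first and third descriptions give the same number, which can be checked with a short sketch. The data are hypothetical, and the variance here is the population form (dividing by N).

```python
# Hypothetical illustration data (not from the lecture).
x = [1, 2, 3, 4, 5]
n = len(x)
x_bar = sum(x) / n

# Average squared deviation from the mean.
var_deviation = sum((xi - x_bar) ** 2 for xi in x) / n

# Expectation of the square minus the square of the expectation.
var_expectation = sum(xi ** 2 for xi in x) / n - x_bar ** 2

print(var_deviation, var_expectation)
```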

12
Q

How does sample size (N) affect the estimation of population variance?

A

A larger N improves accuracy, with diminishing returns: the difference between N = 4 and N = 5 is large compared to the difference between N = 99 and N = 100.

13
Q

Is R² affected by sample size?

A

No.

R² = SS(regression) / SS(total)
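
As a sketch on hypothetical data, R² can be computed by partitioning SS(total) into SS(regression) and SS(residual).

```python
# Hypothetical illustration data (not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares slope and intercept.
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar
y_hat = [a + b * xi for xi in x]

ss_total = sum((yi - y_bar) ** 2 for yi in y)                  # variation in Y
ss_regression = sum((yh - y_bar) ** 2 for yh in y_hat)         # explained part
ss_residual = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained part

r_squared = ss_regression / ss_total
print(r_squared)
```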

14
Q

What is the Mean Square (MS)?

A

A variance estimate

MS = SS/df

15
Q

How do we interpret R²?

A

the proportion of variation in Y explained by variation in X.

16
Q

How do we test the Significance of the proportion of Variation explained (R²)?

A

With an F test

F = [SS(regression)/ df1 ] / [SS(residual)/ df2]
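
A minimal sketch of the F calculation on hypothetical data. The degrees of freedom are an assumption not stated on the card: for a bivariate regression, df1 = 1 (one predictor) and df2 = N − 2. Each bracketed term is a Mean Square (SS/df).

```python
# Hypothetical illustration data (not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares slope and intercept.
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar
y_hat = [a + b * xi for xi in x]

ss_regression = sum((yh - y_bar) ** 2 for yh in y_hat)
ss_residual = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

# Assumed degrees of freedom for one predictor.
df1 = 1
df2 = n - 2

ms_regression = ss_regression / df1  # Mean Square = SS / df
ms_residual = ss_residual / df2
f_stat = ms_regression / ms_residual
print(f_stat)
```

The resulting F would then be compared with a critical value from an F table at (df1, df2).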

17
Q

How do we test the Significance of a regression coefficient (b)?

A

With a t test

t = b / SE(b)
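
A sketch of the t calculation on hypothetical data. The card does not give a formula for SE(b); the one used here, SE(b) = (MS(residual) / SSx)^(1/2) with N − 2 residual degrees of freedom, is the usual bivariate form and should be read as an assumption.

```python
import math

# Hypothetical illustration data (not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares slope and intercept.
ss_x = sum((xi - x_bar) ** 2 for xi in x)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
a = y_bar - b * x_bar

# Residual variance estimate (Mean Square residual), assuming df = N - 2.
ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
ms_residual = ss_residual / (n - 2)

# Assumed standard error of the slope, then the t statistic.
se_b = math.sqrt(ms_residual / ss_x)
t_stat = b / se_b
print(t_stat)
```

With a single predictor, t² equals the F statistic for the same model, so the two tests agree.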

18
Q

Confidence Interval (CI)

A

CI = b ± [t(critical) × SE(b)]

• an estimated range of values with a given high probability of covering the true population value
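
A sketch of the interval on hypothetical data. Two things here are assumptions rather than card content: SE(b) is computed as (MS(residual) / SSx)^(1/2), and the critical value 3.182 is taken from a standard t table for a two-tailed 95% interval with df = N − 2 = 3.

```python
import math

# Hypothetical illustration data (not from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares slope and intercept.
ss_x = sum((xi - x_bar) ** 2 for xi in x)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
a = y_bar - b * x_bar

# Assumed standard error of the slope, using N - 2 residual df.
ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
se_b = math.sqrt(ss_residual / (n - 2) / ss_x)

# Assumed critical t for df = 3, two-tailed 95% (from a t table).
t_critical = 3.182

margin = t_critical * se_b
ci_lower, ci_upper = b - margin, b + margin
print(ci_lower, ci_upper)
```

In this example the interval includes 0, which matches a t statistic smaller than the critical value.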