Linear Regression Flashcards by Dusty Hackler

Regression analysis uses a _______ model to predict a _______ variable (DV) by using one or more _______ variables (IVs).

Statistical

Response

Predictor

In regression analysis, β₀ and β₁ are called _______

parameters

What are the four steps of hypothesis testing?

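Step 1: State the hypotheses.

one-sided: H₀: μ ≤ μ₀, H_a: μ > μ₀ (or H₀: μ ≥ μ₀, H_a: μ < μ₀)

two-sided: H₀: μ = μ₀, H_a: μ ≠ μ₀

For a regression slope, replace μ with β₁ and μ₀ with 0; H₀: β₁ = 0 means no linear association between x and y (x is not useful for predicting y).

Step 2: Compute the test statistic.

one-sample mean: t = (x̄ - μ₀)/(s/√n) with df = n-1

regression slope: t* = b₁/s{b₁} with df = n-2

Step 3: Find the critical value: t{1-α, df} (one-sided) OR t{1-α/2, df} (two-sided)

Step 4: For a two-sided test, reject H₀ if t ≥ +crit val or t ≤ -crit val. For a one-sided test, reject H₀ only in the direction of H_a.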
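Steps 2 to 4 for the slope test can be sketched in Python. This is a minimal illustration on made-up data, not part of the original deck; the standard error formula s{b₁} = s/√SS_xx is the usual SLR result rather than something stated on this card, and the critical value 3.182 is t{0.975, 3} read from a t-table.

```python
import math

# Made-up illustrative data (not from the flashcards).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Step 2: test statistic t* = b1 / s{b1}, with df = n - 2.
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
se_b1 = s / math.sqrt(ss_xx)  # assumed standard SLR formula for s{b1}
t_star = b1 / se_b1

# Step 3: two-sided critical value t{1 - 0.05/2, 3}, taken from a t-table.
t_crit = 3.182

# Step 4: decision for the two-sided test H0: beta1 = 0.
reject_h0 = abs(t_star) >= t_crit
print(f"t* = {t_star:.3f}, reject H0: {reject_h0}")
```

With only five made-up points the test statistic falls short of the critical value, so H₀ is not rejected here.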

What is the simple linear regression model?

Y=β₀+β₁X₁+ε

In linear regression, E(ε)=

0

In linear regression, σ² {ε}=

σ²

In linear regression, ε’s are/are not correlated and have covariance of ___.

ε’s are uncorrelated and have covariance of 0

Least squares estimates of the betas _____ the sum

∑_{i=1}^{n} [y_i - (β₀ + β₁x_i)]²

minimize
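As a rough numerical check of the "minimize" idea, the sketch below compares the sum of squared deviations at the least squares estimates with the same sum after nudging either coefficient. The data are made up, and the helper name `sum_sq` is only for illustration.

```python
def sum_sq(b0, b1, x, y):
    """Sum over i of [y_i - (b0 + b1 * x_i)]^2."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

best = sum_sq(b0, b1, x, y)

# Nudge the intercept and/or slope away from the least squares values.
nudged = [
    sum_sq(b0 + d0, b1 + d1, x, y)
    for d0 in (-0.1, 0.0, 0.1)
    for d1 in (-0.1, 0.0, 0.1)
    if (d0, d1) != (0.0, 0.0)
]
print(f"at least squares estimates: {best:.3f}; smallest nudged value: {min(nudged):.3f}")
```

Every nudged pair gives a larger sum than the least squares pair, which is what "minimize" means here.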

Interpretation of β₁

Y=β₀+β₁X₁+ε

For each one-unit increase in x, the mean of y increases (β₁ > 0) or decreases (β₁ < 0) by |β₁|.

(e.g., For each additional hour a student watches TV, he loses 0.2 GPA points on average)

Interpretation of β₀

Y=β₀+β₁X₁+ε

The mean of y when x = 0

(e.g., On average, first-year students who don’t watch TV have a GPA of 3.9)
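The two interpretations can be checked with a tiny sketch. It assumes the GPA examples on the β₁ and β₀ cards describe the same fitted line, ŷ = 3.9 - 0.2x; the deck does not say so explicitly, and `predicted_gpa` is a hypothetical helper.

```python
B0 = 3.9   # intercept from the card's example: mean GPA at 0 hours of TV
B1 = -0.2  # slope from the card's example: GPA change per additional hour

def predicted_gpa(hours):
    """Estimated mean GPA for a given number of TV hours (assumed line)."""
    return B0 + B1 * hours

at_zero = predicted_gpa(0)                         # interpretation of the intercept
per_hour = predicted_gpa(3) - predicted_gpa(2)     # interpretation of the slope
print(f"mean GPA at x = 0: {at_zero:.1f}; change per extra hour: {per_hour:.1f}")
```

The intercept is the prediction at x = 0, and the difference between predictions one unit apart is the slope, whatever starting value of x is used.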

ŷ is the ____ regression line.

estimated

b₁ and b₀ are estimates for

β₁ and β₀

What is the equation for b₁

SS_xy / SS_xx

What is the equation for b₀

ȳ - b₁x̄

What is the equation for SS_xx

All of the following equations are equal

∑(x_i - x̄)²

∑x_i² - n·x̄²

(n-1)·s_x²
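A quick way to convince yourself that the three forms agree is to compute each on a small made-up sample. The sketch relies on `statistics.variance`, which uses the n-1 divisor.

```python
import statistics

x = [1, 2, 3, 4, 5]  # made-up values
n = len(x)
x_bar = sum(x) / n

form_definition = sum((xi - x_bar) ** 2 for xi in x)       # sum of (x_i - x_bar)^2
form_shortcut = sum(xi ** 2 for xi in x) - n * x_bar ** 2   # sum of x_i^2 - n * x_bar^2
form_variance = (n - 1) * statistics.variance(x)            # (n - 1) * s_x^2

print(form_definition, form_shortcut, form_variance)
```

The shortcut form is the one that pairs naturally with a hand-built table of x_i² values.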

SS_xx must be positive/negative.

positive

What is the equation for SS_xy

∑(x_i - x̄)(y_i - ȳ)

∑x_i·y_i - n·x̄·ȳ

When creating a table for an estimated regression line, which 5 columns should you include?

x_i | y_i | x_i² | y_i² | x_iy_i
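The five columns are exactly what the shortcut formulas need. As a sketch on made-up data, the column totals below feed SS_xx, SS_yy, and SS_xy, and from those the estimated line.

```python
# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# One row per observation: x_i | y_i | x_i^2 | y_i^2 | x_i * y_i
rows = [(xi, yi, xi ** 2, yi ** 2, xi * yi) for xi, yi in zip(x, y)]
sum_x, sum_y, sum_x2, sum_y2, sum_xy = (sum(col) for col in zip(*rows))

x_bar, y_bar = sum_x / n, sum_y / n
ss_xx = sum_x2 - n * x_bar ** 2
ss_yy = sum_y2 - n * y_bar ** 2
ss_xy = sum_xy - n * x_bar * y_bar

b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
print(f"y-hat = {b0:.1f} + {b1:.1f}x")  # prints "y-hat = 2.2 + 0.6x"
```

The y_i² column is not needed for b₁ or b₀ themselves; it supplies SS_yy, which is used later for SSE.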

What is the equation for the error term ε_i

ε_i = y_i-E(y_i)

What is the equation for the residual e_i

e_i = y_i - ŷ_i

s² is the ________

sample variance

What is the equation for s²

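In regression, the equations below are equal

SSE/(n-2)

MSE

(This is not the sample variance of x, s_x² = ∑(x_i - x̄)²/(n-1), which divides by n-1 and measures spread in x rather than spread around the regression line.)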

s is the ____________

sample standard deviation

What is the equation for s

√MSE

√(SSE/(n-2))

SSE is

The sum of the squared errors

What is the equation for SSE

All of the equations below are equal

∑e_i²

∑(y_i - ŷ_i)²

SS_yy - b₁²·SS_xx
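The equivalent SSE forms, and the MSE and s that follow from them, can be checked on a small made-up data set. This is only a sketch; the variable names are illustrative.

```python
import math

# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Residuals e_i = y_i - y_hat_i
y_hat = [b0 + b1 * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

sse_from_residuals = sum(e ** 2 for e in residuals)  # sum of e_i^2
sse_shortcut = ss_yy - b1 ** 2 * ss_xx               # SS_yy - b1^2 * SS_xx

mse = sse_from_residuals / (n - 2)  # s^2
s = math.sqrt(mse)
print(f"SSE = {sse_from_residuals:.2f} = {sse_shortcut:.2f}, MSE = {mse:.2f}, s = {s:.3f}")
```

The shortcut form avoids computing each fitted value, which is why it is convenient for hand calculation.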

What does s²=.045 and s= .212 mean?

If the distribution of GPA for people who watch x hours of TV is approximately normal, then about 95% of them are expected to have GPAs within 2(0.212) = 0.424 points of the value predicted by the simple linear regression model.
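To make the "within 2s" reading concrete, here is a small sketch using the card's s² = 0.045. It assumes the fitted line is ŷ = 3.9 - 0.2x, pieced together from the intercept and slope examples elsewhere in the deck, and it picks x = 2 hours as a hypothetical student.

```python
import math

s_squared = 0.045         # MSE from the card
s = math.sqrt(s_squared)  # about 0.212, matching the card

def predicted_gpa(hours):
    """Assumed fitted line, y-hat = 3.9 - 0.2x (not stated on this card)."""
    return 3.9 - 0.2 * hours

hours = 2  # hypothetical number of TV hours
center = predicted_gpa(hours)
low, high = center - 2 * s, center + 2 * s
print(f"s = {s:.3f}; about 95% of GPAs at x = {hours} fall in ({low:.2f}, {high:.2f})")
```

The band has the same half-width, 2s, at every x; only its center moves along the line.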

You should assume ____ for hypothesis testing and confidence intervals

normality

b₁ and b₀ are _______ for β₁ and β₀

least squares estimators

Why do you want to have a large range of data?

The more variation you have in x, the better the estimate of the slope you can get: a wider spread of x values makes SS_xx larger and the standard error of b₁ smaller.
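One way to see this is through the standard error of the slope, s{b₁} = s/√SS_xx (the usual SLR formula, assumed here rather than taken from the card). The sketch holds s fixed at a hypothetical value and compares a narrow and a wide set of x values.

```python
import math

s = 1.0  # hypothetical residual standard deviation, held fixed for both designs

def se_slope(x, s):
    """Standard error of b1 under the assumed formula s / sqrt(SS_xx)."""
    x_bar = sum(x) / len(x)
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s / math.sqrt(ss_xx)

narrow = [4, 5, 6]  # x values bunched together: SS_xx = 2
wide = [1, 5, 9]    # same mean, spread out:     SS_xx = 32

se_narrow = se_slope(narrow, s)
se_wide = se_slope(wide, s)
print(f"narrow range: {se_narrow:.3f}; wide range: {se_wide:.3f}")
```

Spreading the same number of points four times as far apart cuts the standard error of the slope to a quarter.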

Sampling distribution of (b₁ - β₁)/s{b₁}?

It has a t-distribution with n-2 degrees of freedom, because two parameters (β₀ and β₁) are estimated.

What does it mean to have a 95% CI?

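If we took 100 samples of the same size and built a 95% CI from each, we would expect about 95 of the intervals to contain the true value of β₁. Interpretation: we are 95% confident that β₁ lies within this range (it is not a statement that 95% of all b₁’s fall within this one interval).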
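The repeated-sampling idea behind a 95% CI can be illustrated with a small simulation. Everything here is hypothetical: the "true" parameters, the x values, and the number of trials are made up, the interval uses the usual form b₁ ± t·s{b₁} with s{b₁} = s/√SS_xx, and 2.306 is t{0.975, 8} from a t-table.

```python
import math
import random

random.seed(0)

BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0  # hypothetical true parameters
x = list(range(1, 11))               # n = 10 fixed x values
n = len(x)
T_CRIT = 2.306                       # t{0.975, n - 2 = 8} from a t-table

x_bar = sum(x) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)

def slope_interval(y):
    """95% CI for the slope from one simulated sample."""
    y_bar = sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(ss_xx)
    return b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1

trials = 2000
hits = 0
for _ in range(trials):
    # Draw a fresh sample from the model Y = beta0 + beta1 * x + normal error.
    y = [BETA0 + BETA1 * xi + random.gauss(0, SIGMA) for xi in x]
    low, high = slope_interval(y)
    hits += low <= BETA1 <= high

coverage = hits / trials
print(f"share of intervals containing beta1: {coverage:.3f}")
```

The share should land near 0.95: it is the intervals that vary from sample to sample, while β₁ stays fixed.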

What is Interval Estimation?

CI for mean of Y when x=x_h

SSTo

The error/variation when not using any model at all; never changes when using a different model or adding new variables; total variation of y around ȳ

SSE

Error/variation remaining when using SLR; the variation in y not explained by using x; a large SSE means the model leaves a lot of error

SSR

The reduction in error from fitting the model (SSR = SSTo - SSE); the chunk of variation in y explained by using x (we want this to be large)

What are the components of the ANOVA table?

| Source of Variation | SS | df | MS |
|---|---|---|---|
| Regression | SSR | 1 | MSR = SSR ÷ 1 |
| Error | SSE | n-2 | MSE = SSE ÷ (n-2) |
| Total | SSTo | n-1 | |
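The ANOVA quantities can be computed directly from a fitted line. The sketch below uses made-up data and prints a plain-text version of the table; it also lets you confirm that SSTo = SSR + SSE.

```python
# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

ssto = sum((yi - y_bar) ** 2 for yi in y)              # total variation around y-bar
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # variation left unexplained
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # variation explained by x

msr = ssr / 1
mse = sse / (n - 2)

print(f"{'Source':<12}{'SS':>6}{'df':>4}{'MS':>6}")
print(f"{'Regression':<12}{ssr:>6.2f}{1:>4}{msr:>6.2f}")
print(f"{'Error':<12}{sse:>6.2f}{n - 2:>4}{mse:>6.2f}")
print(f"{'Total':<12}{ssto:>6.2f}{n - 1:>4}")
```

The degrees of freedom add up the same way the sums of squares do: 1 + (n-2) = n-1.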

What does an F-test for model usefulness tell us

Whether the model (and so R²) is statistically significant, but not whether it is practically useful

What are the four steps in conducting an F-test

**Step 1**: _two-sided_: H₀: β₁ = 0, H_a: β₁ ≠ 0

**Step 2**: F\* = MSR/MSE = SSR/MSE, since MSR = SSR ÷ 1 in SLR (all are always positive; want F\* to be \> 1)

**Step 3**: F{1-α, 1, n-2} (\*numerator df is always 1 in SLR)

**Step 4**: if F\* \> F{1-α, 1, n-2}, we reject H₀ and we have evidence that the SLR model is useful

\*In SLR (one predictor variable), the t-test for β₁ = 0 is the same as the F-test

\*In SLR only: F\* = t\*² and √(F_crit) = t_crit, where t_crit is the two-sided value t{1-α/2, n-2}
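The F* = t*² relationship is easy to verify numerically. This sketch uses made-up data, the usual formula s{b₁} = s/√SS_xx (an assumption, since the deck does not spell it out), and critical values read from standard tables for α = 0.05 and n-2 = 3.

```python
import math

# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
msr, mse = ssr / 1, sse / (n - 2)

f_star = msr / mse                                  # Step 2 of the F-test
t_star = b1 / (math.sqrt(mse) / math.sqrt(ss_xx))   # t* = b1 / s{b1}

T_CRIT = 3.182   # t{0.975, 3} from a t-table
F_CRIT = 10.13   # F{0.95, 1, 3} from an F-table

print(f"F* = {f_star:.3f}, t*^2 = {t_star ** 2:.3f}")
print(f"F_crit = {F_CRIT}, t_crit^2 = {T_CRIT ** 2:.2f}")
print(f"reject H0: {f_star > F_CRIT}")
```

Both tests give the same decision because squaring the t statistic and the two-sided t critical value reproduces the F statistic and the F critical value.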

Linear Regression Flashcards

(39 cards)