For the Midterm Flashcards
What is the variable Y called?
The response variable
What are the X variables called?
The predictor, or explanatory, variables.
What is multiple linear regression?
It relates one numerical characteristic, the response variable, to one or more predictor or explanatory variables.
In multiple linear regression, what do we typically assume about epsilon?
That the epsilon_i are independent, have mean zero, are homoscedastic (same variability), and, if the sample size is small to moderate, that they are approximately normal in distribution.
What are regression methods used for?
1) Identify and characterize the relationships between the response and predictor/explanatory variables.
2) Estimate or predict the value of the response variable for combinations of the predictor/explanatory variables.
What is the objective of time series analysis?
To identify patterns and trends, and to predict future observations.
What does UCLM mean?
Upper confidence limit for regression line
What does LCLM mean?
Lower confidence limit for regression line
What does UCL mean?
Upper prediction limit
What does LCL mean?
Lower prediction limit
What is the simple linear regression model?
Y_i = B_0 + B_1*x_i + epsilon_i
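Putting this card together with the error-assumption card above, one common way to write the full simple linear regression model is (the normality part matters most when the sample size is small to moderate):

\[
Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad \varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2), \qquad i = 1, \dots, n
\]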
How do we denote the ith residual?
e_i
How are residuals found?
y_i - y_hat_i
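As a minimal sketch of how the fit, the predicted values, and the residuals e_i = y_i - y_hat_i fit together, assuming Python with numpy and statsmodels (the toy data are made up for illustration):

import numpy as np
import statsmodels.api as sm

# toy data, made up for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

X = sm.add_constant(x)        # adds the column of 1s for the intercept beta_0
fit = sm.OLS(y, X).fit()      # least-squares fit of y = beta_0 + beta_1*x + epsilon

y_hat = fit.fittedvalues      # predicted values y_hat_i
e = y - y_hat                 # residuals e_i = y_i - y_hat_i
print(fit.params)             # [beta_0_hat, beta_1_hat]
print(e)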
What is the prediction interval?
The interval that a new observation of the response is likely to fall in, for given values of the predictor(s).
What is the confidence interval?
The interval we’re confident contains the mean (expected) response, i.e., the regression line, at given values of the predictor(s).
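A minimal sketch of getting both kinds of intervals from a fitted model, assuming Python with statsmodels (alpha = 0.05 is an assumed choice, and the column names follow statsmodels' summary_frame output; same made-up toy data as above):

import numpy as np
import statsmodels.api as sm

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

frame = fit.get_prediction(X).summary_frame(alpha=0.05)
# mean_ci_lower / mean_ci_upper: confidence interval for the mean response (LCLM / UCLM)
# obs_ci_lower  / obs_ci_upper : prediction interval for a new observation  (LCL / UCL)
print(frame[["mean", "mean_ci_lower", "mean_ci_upper", "obs_ci_lower", "obs_ci_upper"]])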
What sort of hypothesis test do we run on the slope in multiple linear regression?
We test whether it’s equal to 0 or not! If we reject H_0, then the slope is not 0.
What does the analysis of variance do?
It tests whether all the slopes are equal to zero (H_0) against the alternative that at least one slope is not zero.
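A minimal sketch of where the slope t-tests and the overall ANOVA F-test show up, assuming Python with statsmodels (simulated data, made up for illustration):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                   # two predictor variables
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=50)  # only the first slope is truly nonzero
fit = sm.OLS(y, sm.add_constant(X)).fit()

print(fit.tvalues, fit.pvalues)   # t-tests of H_0: each individual coefficient = 0
print(fit.fvalue, fit.f_pvalue)   # overall F-test of H_0: all slopes = 0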
What does the SSE tell us?
How much of the total variation in the response is left unexplained by the model, i.e., attributed to random error.
What does SSE denote?
Sum of Squares due to Error
What does SSR denote?
Sum of Squares due to the Model (or our regression)
What does the SSR tell us?
How much of the total variation in the response is explained by the model (the regression).
What does R-square tell us?
The proportion of the total variation in the response that is explained by the model: R-square = SSR/SST = 1 - SSE/SST, where SST = SSR + SSE is the total sum of squares. (The closer to 1, the more of the variation the model explains.)
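A minimal sketch of SSE, SSR, SST, and R-square computed by hand, assuming Python with numpy and statsmodels (same made-up toy data as above):

import numpy as np
import statsmodels.api as sm

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
fit = sm.OLS(y, sm.add_constant(x)).fit()

y_hat = fit.fittedvalues
sse = np.sum((y - y_hat) ** 2)          # Sum of Squares due to Error
ssr = np.sum((y_hat - y.mean()) ** 2)   # Sum of Squares due to the Regression
sst = np.sum((y - y.mean()) ** 2)       # Total Sum of Squares (= SSE + SSR)
r_square = ssr / sst                    # same as 1 - sse/sst and fit.rsquared
print(sse, ssr, sst, r_square)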
Why do we care about partial regressions?
They help us see the relationship between Y and a single X, after adjusting for the other predictor variables.
What do we use partial plots to determine?
If there is a linear relationship between Y and each individual X (predictor variable)
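A minimal sketch of producing partial regression (added-variable) plots, assuming Python with statsmodels and matplotlib; plot_partregress_grid draws one plot per predictor (simulated data, made up for illustration):

import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(size=50)
fit = sm.OLS(y, sm.add_constant(X)).fit()

# one added-variable (partial regression) plot per predictor
fig = sm.graphics.plot_partregress_grid(fit)
plt.show()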
What is collinearity?
When one or more predictor variables are close to being a linear combination of the other predictor variables.
What are symptoms of collinearity?
- The regression coefficients have illogical signs (- or +).
- The regression coefficients are huge in magnitude, and have even larger standard errors.
- The individual coefficients are not significant on their own, but are significant when tested as a group with other coefficients.
How can we diagnose collinearity?
- A correlation matrix (look for pairs of predictor variables that appear to have a strong linear relationship).
- Regress each predictor variable on all the other predictor variables:
  • Look for high values of R-square.
  • A Variance Inflation Factor ≥ 10 suggests collinearity (see the sketch after this list).
- Look at the condition indices (large values, and jumps in the values, indicate collinearity).
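A minimal sketch of the correlation-matrix and VIF diagnostics, assuming Python with pandas and statsmodels (the predictors are simulated so that x2 is nearly a linear combination of x1; the names are made up for illustration):

import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = x1 + 0.05 * rng.normal(size=100)        # nearly a linear combination of x1
x3 = rng.normal(size=100)
X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

print(X.corr())                               # correlation matrix: x1 and x2 look collinear
exog = np.column_stack([np.ones(len(X)), X])  # add an intercept column before computing VIFs
vifs = [variance_inflation_factor(exog, i) for i in range(1, exog.shape[1])]
print(vifs)                                   # VIF >= 10 suggests collinearity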
How can we fix collinearity?
- Backward elimination
- Forward Selection
- Stepwise selection (a mix of forward and backward steps)
How does backward elimination work?
It starts with all the predictor variables in the model, then repeatedly removes the variable with the largest p-value greater than alpha (usually .05) and refits, until every remaining variable has a p-value below alpha.
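A minimal sketch of backward elimination driven by p-values, assuming Python with pandas and statsmodels; this is one simple way to implement it, not necessarily how the course's software does it:

import numpy as np
import pandas as pd
import statsmodels.api as sm

def backward_elimination(X, y, alpha=0.05):
    """Drop the predictor with the largest p-value above alpha, refit, and repeat."""
    cols = list(X.columns)
    while cols:
        fit = sm.OLS(y, sm.add_constant(X[cols])).fit()
        pvals = fit.pvalues.drop("const")   # slope p-values only
        worst = pvals.idxmax()
        if pvals[worst] > alpha:
            cols.remove(worst)              # eliminate the least significant variable
        else:
            break                           # every remaining p-value <= alpha
    return cols

rng = np.random.default_rng(3)
X = pd.DataFrame(rng.normal(size=(100, 3)), columns=["x1", "x2", "x3"])
y = 1.0 + 2.0 * X["x1"] + rng.normal(size=100)
print(backward_elimination(X, y))   # typically keeps just ["x1"]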
How does forward selection work?
It starts from an empty model and repeatedly adds the variable that 1) has the highest (partial) correlation with the response, given the variables already in the model, and 2) has a p-value less than alpha.
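A minimal sketch of forward selection in the same spirit, assuming Python with pandas and statsmodels; here the candidate with the smallest p-value is added, which (adding one variable at a time) corresponds to the candidate with the highest partial correlation with the response:

import numpy as np
import pandas as pd
import statsmodels.api as sm

def forward_selection(X, y, alpha=0.05):
    """Add the most significant remaining candidate, as long as its p-value is below alpha."""
    chosen, remaining = [], list(X.columns)
    while remaining:
        # fit one candidate model per remaining variable and record the candidate's p-value
        pvals = {}
        for col in remaining:
            fit = sm.OLS(y, sm.add_constant(X[chosen + [col]])).fit()
            pvals[col] = fit.pvalues[col]
        best = min(pvals, key=pvals.get)
        if pvals[best] < alpha:
            chosen.append(best)
            remaining.remove(best)
        else:
            break                           # no remaining candidate is significant
    return chosen

rng = np.random.default_rng(4)
X = pd.DataFrame(rng.normal(size=(100, 3)), columns=["x1", "x2", "x3"])
y = 1.0 + 2.0 * X["x1"] - 1.5 * X["x3"] + rng.normal(size=100)
print(forward_selection(X, y))   # typically selects ["x1", "x3"]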