chapter 14: multiple regression and model building Flashcards
what are multiple regression models?
regression models that employ more than one independent variable
what can the multiple regression model formula look like?
y = B0 + B1x1 + B2x2 + E
why are there 2 xs?
because the mean level (Uy) now is B0 + B1x1 + B2x2
this means that there are two different independent variables that can correlate with or influence the dependent variable y
E still remains the error term that causes y to deviate from the mean level
what is the new name for the mean level:
Uy = B0 + B1x1 + B2x2
the plane of means
it is in a three-dimensional space
what is B0 in Uy = B0 + B1x1 + B2x2?
it is still the y intercept
what is B1 in Uy = B0 + B1x1 + B2x2?
the regression parameter for the variable x1
the slope of the plane of the x1 direction
what is B2 in Uy = B0 + B1x1 + B2x2?
the regression parameter for the variable x2
the slope of the plane of the x2 direction
what is the error term E in y = B0 + B1x1 + B2x2 + E?
the error term
it describes the effects on y other than x1 and x2
what is the formula for the point estimate or prediction of
y = B0 + B1x1 + B2x2 + E
what is this equation called?
y^ = b0 + b1x1 + b2x2
called the least squares plane, the estimate of the plane of means
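As a sketch of how the least squares plane can be computed, here is a minimal numpy example; the data values and variable names are hypothetical, made up purely for illustration:

```python
import numpy as np

# Hypothetical data: n = 6 observations of two independent variables and y.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([5.1, 6.9, 12.2, 13.8, 20.1, 21.0])

# Design matrix with a leading column of 1s for the intercept b0.
X = np.column_stack([np.ones_like(x1), x1, x2])

# b = (b0, b1, b2), the least squares estimates of B0, B1, B2.
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Point predictions y^ = b0 + b1*x1 + b2*x2 (no error term).
y_hat = X @ b
```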
why is there no error term when we use
y^ = b0 + b1x1 + b2x2
to predict a point of
y = B0 + B1x1 + B2x2 + E
because the error term has a mean of 0 (a 50% chance of being positive and a 50% chance of being negative), its predicted value is 0
what is the residual?
the difference between the observed and predicted values
what is SSE
the unexplained variation
the sum of the squared residuals
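A minimal sketch of the residuals and SSE in Python, using hypothetical observed and predicted values:

```python
import numpy as np

# Hypothetical observed values y and predicted values y^.
y = np.array([5.0, 7.0, 12.0, 14.0])
y_hat = np.array([5.5, 6.5, 12.5, 13.5])

residuals = y - y_hat          # observed minus predicted
sse = np.sum(residuals ** 2)   # SSE: the sum of the squared residuals
```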
what is the multiple coefficient of determination?
the proportion of the total variation in the n observed values of the dependent variable that is explained by the overall regression model
R^2
R^2 = explained variation / total variation
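The ratio R^2 = explained variation / total variation can be sketched as follows, again with hypothetical values:

```python
import numpy as np

# Hypothetical observed values and model predictions.
y = np.array([5.0, 7.0, 12.0, 14.0])
y_hat = np.array([5.5, 6.5, 12.5, 13.5])

total = np.sum((y - y.mean()) ** 2)   # total variation
sse = np.sum((y - y_hat) ** 2)        # unexplained variation (SSE)
explained = total - sse               # explained variation
r_squared = explained / total         # multiple coefficient of determination
```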
what is the multiple correlation coefficient
R
R = (R^2)^(1/2)
what is the adjusted R^2
the adjusted multiple coefficient of determination used to avoid overestimating the importance of independent variables
adjusted R^2 =
(R^2 - (k / (n - 1))) * ((n - 1) / (n - (k + 1)))
n is the number of observations
k the number of independent variables in the model
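The adjusted R^2 formula above translates directly into a small helper function (the example values passed to it are hypothetical):

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = (R^2 - k/(n - 1)) * ((n - 1)/(n - (k + 1))).

    n: number of observations, k: number of independent variables.
    """
    return (r2 - k / (n - 1)) * ((n - 1) / (n - (k + 1)))
```

Because the formula penalizes extra independent variables, the adjusted value is smaller than the raw R^2 (e.g. `adjusted_r_squared(0.9, 12, 2)` is below 0.9).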
what are the four assumptions of the error term values in the multiple regression model?
- at any given combination of x1, x2, …, xk, the population of potential error terms has a mean value of 0
- constant variance assumption
- normality assumption
- independence assumption
what is the error term constant variance assumption?
population of error term values has a variance that does not depend on the combination of values of x1, x2, …, xk
the different population of potential error terms corresponding to different combinations of values x1, x2, …, xk have equal variances
the constant variance is the population variance
what is the error term normality assumption?
at any given combination of x1, x2, …, xk, the population of potential error terms has a normal distribution
what is the error term independence assumption?
any one value of the error term E is statistically independent of any other value of E
an error term of a certain y has nothing to do with an error term of another y
what is the point estimate of the constant variance of the different populations of error terms?
formula too
the mean square error
s^2
s^2 = SSE / (n - (k + 1))
what is the point estimate of the standard deviation of the different populations of error terms?
formula too
the standard error
s
s = (SSE / (n - (k + 1)))^(1/2)
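The mean square error and standard error can be sketched with the stdlib alone; the SSE, n, and k values below are hypothetical:

```python
import math

# Hypothetical values: SSE from a fitted model, n observations, k variables.
sse = 1.0
n, k = 10, 2

s_squared = sse / (n - (k + 1))   # mean square error: point estimate of the constant variance
s = math.sqrt(s_squared)          # standard error: point estimate of the standard deviation
```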
in the mean square error and the standard error (the point estimate of the constant variance and the standard deviation of the different populations of error terms),
what is the meaning of the following
n - (k + 1)
degrees of freedom associated with SSE
is testing the significance of the relationship between y and x1, x2, …, xk a proper way of assessing the utility of the regression model?
yes
how do you test the significance of the relationship between y and x1, x2, …, xk?
with the F test
what is the null hypothesis (H0) of the significance of the relationship between y and x1, x2, …, xk?
H0: B1 = B2 = … = Bk = 0
none of the independent variables x1, x2, … xk are significantly related to y
the regression relationship is not significant
what is the alternative hypothesis (Ha) of the significance of the relationship between y and x1, x2, …, xk?
Ha: at least one of B1, B2 … Bk =/= 0
at least one of the independent variables x1, x2, … xk is significantly related to y
the regression relationship is significant
how do you calculate the F statistic?
F = ((explained variation) / k) / ((unexplained variation) / (n - (k + 1)))
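The overall F statistic is a direct translation of that ratio (the example inputs are hypothetical):

```python
def f_statistic(explained, unexplained, n, k):
    """Overall F = (explained variation / k) / (unexplained variation / (n - (k + 1)))."""
    return (explained / k) / (unexplained / (n - (k + 1)))
```

For example, with explained variation 52.0, unexplained variation 1.0, n = 10, and k = 2, the numerator is 52/2 = 26 and the denominator is 1/7, giving F = 182.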
how do the R^2 and adjusted R^2 differ?
R̅^2 differs from R^2 by taking into consideration the number of independent variables in the model
Using R̅^2 helps avoid overestimating the importance of the independent variables
why would we test the significance of a single independent variable?
to gain further information about which independent variables significantly affect y
when you test the significance of a single independent variable, how do you refer to it?
what else must you assume?
xj
you assume that in the model it is multiplied by the parameter Bj
what is the null hypothesis when you test xj?
Bj = 0
here, we say that xj is not significantly related to y
what is the alternate hypothesis when you test xj?
Bj =/= 0
here, we say that xj is significantly related to y in the regression model under consideration
what is sbj?
the standard error of the estimate bj
the point estimate of the population standard deviation of bj
what test do you use to test the significance of xj?
the t test
what is the formula of the t test to test the significance of xj?
t = bj / sbj
using the t test to test the significance of xj, when do we reject H0 in favor of Ha?
|t| > t(alpha/2), the critical value for a two-sided test
p value < significance level (alpha)
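The t test for a single xj can be sketched as follows; the estimate, standard error, and degrees of freedom are hypothetical, and 2.447 is the t table critical value t(alpha/2) for alpha = 0.05 with 6 degrees of freedom:

```python
def t_statistic(bj, sbj):
    """t = bj / sbj for testing H0: Bj = 0."""
    return bj / sbj

# Hypothetical estimate bj = 3.2 with standard error sbj = 1.1.
t = t_statistic(3.2, 1.1)

# Two-sided rejection rule: reject H0 when |t| exceeds t(alpha/2).
reject_h0 = abs(t) > 2.447
```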