Chapter 15 Flashcards
How many variables and what type of variables are involved in multiple regression
2 or more independent variables
what is multiple regression the study of
how a dependent variable y is related to 2 or more independent variables
What is the multiple regression model
y = B0 + B1x1 + B2x2 + … + Bpxp + E
what is the random variable in the regression model
E - the error term
what does the error term in multiple regression account for (SSE)
accounts for the variability in y that cannot be explained by the linear effect of the p independent variables
what are the assumptions in multiple regression
the mean or expected value of E is 0
What is the multiple regression equation
E(y) = B0 + B1x1 + B2x2 + … + Bpxp
What does E(y) stand for
mean or expected value of y
in multiple regression, when we use sample data to estimate the multiple regression equation, what is the formula
ŷ = b0 + b1x1 + b2x2 + … + bpxp (ŷ is read "y-hat")
what does ŷ (y-hat) stand for in multiple regression
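A minimal sketch of how b0, b1, …, bp might be estimated from sample data, assuming the usual least squares criterion (these cards do not name the estimation method). The function name fit_least_squares and the data are made up for illustration. The data are generated without an error term so the known coefficients can be recovered.

```python
def fit_least_squares(X, y):
    """Return [b0, b1, ..., bp] that minimize the sum of squared residuals.

    X is a list of rows, one per observation, each holding the p
    independent-variable values; y holds the observed dependent values.
    """
    # Prepend a 1 to every row so that b0 acts as the intercept.
    rows = [[1.0] + [float(v) for v in row] for row in X]
    k = len(rows[0])

    # Build the normal equations (A^T A) b = A^T y as an augmented matrix.
    aug = []
    for i in range(k):
        line = [sum(r[i] * r[j] for r in rows) for j in range(k)]
        line.append(sum(r[i] * yi for r, yi in zip(rows, y)))
        aug.append(line)

    # Solve with Gauss-Jordan elimination and partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("independent variables are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * c for a, c in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Made-up data built exactly from y = 1 + 2*x1 + 3*x2 (no error term),
# so the fit should come out very close to b0=1, b1=2, b2=3.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]
b = fit_least_squares(X, y)
print([round(v, 6) for v in b])
```

With real data the error term is not zero, so the estimates b0, b1, …, bp will generally differ from the true B0, B1, …, Bp.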
predicted value of the dependent variable
what does yi stand for
observed value of the dependent variable for the ith observation
what does ŷi (y-hat i) stand for
predicted value of the dependent variable for the ith observation
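To make the difference between yi and ŷi concrete, here is a small sketch that computes predicted values and the gaps yi − ŷi (commonly called residuals, a term these cards do not use). The coefficients and observations are hypothetical.

```python
def predict(b, x):
    """y_hat = b0 + b1*x1 + ... + bp*xp for one observation x = [x1, ..., xp]."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))


# Hypothetical estimated coefficients b0, b1, b2 and made-up observations.
b = [10.0, 2.0, -1.0]
X = [[1.0, 4.0], [3.0, 2.0], [5.0, 5.0]]
y = [9.0, 13.0, 16.0]          # observed values y_i

y_hat = [predict(b, x) for x in X]                 # predicted values
residuals = [yi - yh for yi, yh in zip(y, y_hat)]  # observed minus predicted

print(y_hat)      # [8.0, 14.0, 15.0]
print(residuals)  # [1.0, -1.0, 1.0]
```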
when adding more independent variables to a multiple regression, does it mean the regression will be “better off”? why or why not
no, it can make things worse; this is called overfitting
what is multicollinearity
the addition of more independent variables creates more relationships among them
- so not only are the I.V. potentially related to the Dependent variable, they are also potentially related to each other
If you have 4 I.V.s, how many relationships do you have
4 between each I.V. and the D.V., plus 6 among the I.V.s themselves
so in total there are 10 relationships to consider
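The count above generalizes: with p I.V.s there are p relationships with the D.V. and "p choose 2" pairs among the I.V.s. A short sketch (the function name is made up):

```python
from math import comb


def count_relationships(num_iv):
    """Return (I.V.-to-D.V. count, I.V.-to-I.V. pair count, total)."""
    iv_dv = num_iv           # each I.V. paired with the single D.V.
    iv_iv = comb(num_iv, 2)  # every unordered pair of I.V.s
    return iv_dv, iv_iv, iv_dv + iv_iv


print(count_relationships(4))  # (4, 6, 10)
```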
do all I.V. help at predicting the D.V?
no, some I.V. are better at predicting the D.V. than others, some contribute nothing
in multicollinearity, what is the ideal situation
for all of the I.V.s to be correlated with the D.V. but NOT with each other
in multiple regression, how is each coefficient interpreted?
the estimated change in y corresponding to a one unit change in a variable when all other variables are held constant
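A quick numeric check of this interpretation, using hypothetical coefficients: raise x1 by one unit, leave x2 alone, and the prediction moves by exactly b1.

```python
def predict(b, x):
    """y_hat = b0 + b1*x1 + ... + bp*xp for one observation x = [x1, ..., xp]."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))


b = [50.0, 3.0, -2.0]   # hypothetical b0, b1, b2
base = [10.0, 4.0]      # x1 = 10, x2 = 4
bumped = [11.0, 4.0]    # x1 up by one unit, x2 held constant

change = predict(b, bumped) - predict(b, base)
print(change)  # 3.0, which equals b1
```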
What are the 6 preps for multiple regression
- generate a list of potential variables, both independent and dependent
- collect data on the variables
- check the relationship between each I.V. and the D.V. using scatter plots and correlations
- (optional) conduct simple linear regression for each I.V./D.V. pair
- use the non-redundant I.V.s in the analysis to find the best fitting model
- use the best fitting model to make predictions about the D.V.
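The correlation check between each I.V. and the D.V. in the steps above could be sketched like this. The helper pearson_r and the data are made up for illustration; ranking by absolute correlation is one possible way to read the results, not a rule from these cards.

```python
from math import sqrt


def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    ss_a = sum((ai - mean_a) ** 2 for ai in a)
    ss_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(ss_a * ss_b)


# Made-up data: y follows x1 exactly, while x2 is nearly unrelated to y.
candidates = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [5, 3, 6, 2, 7, 4],
}
y = [3, 5, 7, 9, 11, 13]

# Correlation of each candidate I.V. with the D.V.
iv_dv_r = {name: pearson_r(values, y) for name, values in candidates.items()}

# Strongest linear relationship with the D.V. first.
ranked = sorted(iv_dv_r, key=lambda name: abs(iv_dv_r[name]), reverse=True)
print(ranked)  # ['x1', 'x2']
```

A correlation only measures linear association, which is why the cards pair it with scatter plots.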
what two problems can happen in multiple regression
1. overfitting and 2. multicollinearity
what is overfitting
caused by adding too many I.V.s; they account for more variance in the sample but add nothing useful to the model
What is multicollinearity
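One common way to see this effect, not covered in these cards, is adjusted R², which penalizes R² for the number of I.V.s. The sketch below uses the standard formula with hypothetical R² values: the larger model explains slightly more variance yet scores lower after the adjustment.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


n = 20  # hypothetical sample size

small_model = adjusted_r2(0.80, n, p=2)  # 2 I.V.s, R^2 = 0.80 (made up)
large_model = adjusted_r2(0.81, n, p=6)  # 6 I.V.s, R^2 = 0.81 (made up)

print(round(small_model, 4), round(large_model, 4))
print(large_model < small_model)  # True
```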
happens when some / all of the i.v.s are correlated with each other
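A simple way to look for this is to correlate every pair of I.V.s and flag the strong ones. In this sketch the data are made up and the 0.7 cutoff is an assumed rule of thumb, not a value given in these cards.

```python
from itertools import combinations
from math import sqrt


def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    ss_a = sum((ai - mean_a) ** 2 for ai in a)
    ss_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(ss_a * ss_b)


# Made-up I.V.s: x2 is roughly double x1, x3 is unrelated to both.
ivs = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2, 4, 5, 8, 10, 13],
    "x3": [5, 3, 6, 2, 7, 4],
}

THRESHOLD = 0.7  # assumed cutoff for "strongly correlated"

# Keep every pair of I.V.s whose absolute correlation exceeds the cutoff.
flagged = [
    (a, b)
    for a, b in combinations(ivs, 2)
    if abs(pearson_r(ivs[a], ivs[b])) > THRESHOLD
]
print(flagged)  # [('x1', 'x2')]
```

Here x1 and x2 carry nearly the same information, so one of them would be a candidate for removal as redundant.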
In Simple linear regression how do we interpret bi
as an estimate of the change in y for a one-unit change in the I.V.
In multiple linear regression how do we interpret bi
we interpret each regression coefficient bi as an estimate of the change in y corresponding to a one-unit change in xi when all other independent variables are held constant