Quiz 2 Flashcards
As X increases by one unit, Y increases by?
The slope, B1
Simple linear regression model in words
response = predictor + error
What is a signal
Predictor
What is noise
Error
Formal statistical model:
response = intercept + slope(p) + error, where p = predictor variable
Describe the linear model when pages is the response variable and words is the predictor
pages = words + error, or formally: pages = b0 + b1(words) + error
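The pages-on-words model above can be fit by ordinary least squares. A minimal numpy sketch with made-up word/page counts (the data values are illustrative, not from the course):

```python
import numpy as np

# Hypothetical data: word counts and page counts for five books
words = np.array([30000, 45000, 60000, 75000, 90000], dtype=float)
pages = np.array([120, 175, 240, 295, 360], dtype=float)

# Design matrix with an intercept column: pages = b0 + b1*words + error
X = np.column_stack([np.ones_like(words), words])
b0, b1 = np.linalg.lstsq(X, pages, rcond=None)[0]

print(b0, b1)  # b1 = extra pages per additional word
```

Here b1 plays the role of the slope from the flashcards: the change in the response for a one-unit increase in the predictor.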
Multiple linear regression model
response = predictor 1 + predictor 2 + error
Simple linear regression is:
Linear regression with one continuous response variable Y and ONE continuous predictor variable X
Multiple linear regression is:
Linear regression with one continuous response variable Y, and MORE THAN ONE continuous predictor.
What are the basic assumptions of linear regression
Linearity, normally distributed residuals, and homogeneous (equal) variances
How does B1 quantify something different in simple vs. multiple regression?
In multiple regression, the effect of X1 on Y controls for the effect of X2: b1 isolates the influence of X1 independent of X2 by estimating it while holding X2 constant.
It does not allow X2 to interfere when assessing the effect of X1.
Explain what B1 is in multiple regression model
For every additional unit of X1 (predictor), Y (response) increases by b1, holding X2 constant.
Main difference between b1 in linear regression and multiple regression
b1 in simple regression is the regression slope, while in multiple regression b1 and b2 are partial regression slopes
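The difference between a regression slope and a partial regression slope can be seen numerically: when X1 and X2 are correlated, the simple slope of Y on X1 absorbs part of X2's effect, while the partial slope does not. A sketch with simulated data (all coefficients and noise levels are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(scale=0.5, size=n)   # x2 correlated with x1
y = 2.0 * x1 + 3.0 * x2 + rng.normal(scale=0.1, size=n)

# Simple regression slope: y on x1 alone
b1_simple = np.polyfit(x1, y, 1)[0]

# Partial regression slope: y on x1 AND x2 together
X = np.column_stack([np.ones(n), x1, x2])
_, b1_partial, b2_partial = np.linalg.lstsq(X, y, rcond=None)[0]

print(b1_simple)   # larger than 2: absorbs part of x2's effect
print(b1_partial)  # near the true value 2: holds x2 constant
```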
What is the 2nd complication in multiple linear regression?
Multiple predictors can interact in their effect on the response variable.
What is the regression model for interaction? Multiplicative model
response = B0 + B1X1 + B2X2 + B3(X1 × X2) + error
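Fitting the multiplicative model just means adding the product X1·X2 as an extra column in the design matrix. A sketch with simulated data (the true coefficient values are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
# Simulate from the multiplicative model: y = 1 + 2*x1 + 3*x2 + 4*(x1*x2) + error
y = 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2 + rng.normal(scale=0.1, size=n)

# The interaction term is simply the elementwise product of the predictors
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]
print(b3)  # estimate of the interaction coefficient B3
```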
What is the third complication in multiple regression models?
Predictor variables can themselves be correlated
What are the assumptions of multiple regression models?
- Linear relationship between predictor and response variable
- Equal variance of residuals around regression line
- normally distributed residual
- Predictors should not be strongly correlated (i.e., no collinearity)
How do you detect collinearity?
- Think about which predictor variables are likely to be collinear before building model
- Plot predictor variables against each other
- Calculate the TOLERANCE associated with each predictor.
Tolerance
Lower tolerance is bad.
Tolerance < 0.1 is really bad
VIF
Variance inflation factor
VIF = 1/tolerance
Higher VIF is bad
VIF >10 is really bad.
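Tolerance and VIF can be computed directly from the definitions above: regress each predictor on the others, take tolerance = 1 − R², and VIF = 1/tolerance. A minimal numpy sketch (the helper function and simulated data are my own, not from the course):

```python
import numpy as np

def tolerance_vif(X):
    """Return (tolerance, VIF) per column of predictor matrix X.

    Tolerance_j = 1 - R^2 from regressing predictor j on the others;
    VIF_j = 1 / Tolerance_j.
    """
    n, p = X.shape
    out = []
    for j in range(p):
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        resid = X[:, j] - A @ coef
        r2 = 1 - resid.var() / X[:, j].var()
        tol = 1 - r2
        out.append((tol, 1 / tol))
    return out

rng = np.random.default_rng(2)
x1 = rng.normal(size=500)
x2 = x1 + rng.normal(scale=0.2, size=500)   # nearly collinear with x1
x3 = rng.normal(size=500)                   # independent of both
res = tolerance_vif(np.column_stack([x1, x2, x3]))
for tol, v in res:
    print(round(tol, 3), round(v, 1))
```

The collinear pair (x1, x2) should show tolerance well below 0.1 and VIF well above 10, while the independent predictor x3 should sit near tolerance 1, VIF 1.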
Method 1 of writing multiple linear regression
Method 2 of writing multiple linear regression
Types of linear models
Simple
Y = B0 + B1X1 + error
Multiple linear
Y = B0 + B1X1 + B2X2 + error
(More than one continuous predictor variable)
Anova model
Y = B0 + B1X1a + B2X1b + error
(One or more categorical predictor variables that have more than one level [eg. a and b])
Ancova model
Y = B0 +B1X1a + B2X1b +B3X2 + error
(One or more categorical predictor variables that have more than one level AND one or more continuous predictor variables)
Linear statistical model with one categorical predictor variable
Yij = u + B1Xaij + B2Xbij + B3Xcij + errorij
Where:
j represents a single observation from single organisms and i represents the level of the predictor
u = Mean of all observations across all levels of all factors
B1 = difference between the mean of level ‘a’ and ‘u’ (grand mean)
B2 = difference between the mean of level ‘b’ and ‘u’ (grand mean)
B3 = difference between the mean of level ‘c’ and ‘u’ (grand mean)
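The categorical-predictor parameterization above (each B = level mean minus grand mean) can be verified by hand. A sketch with made-up observations for three levels; note that most software instead reports differences from a reference level, whereas this follows the flashcard's deviation-from-u coding:

```python
import numpy as np

# Hypothetical data: one categorical predictor with levels a, b, c,
# four observations (j) per level (i)
levels = np.repeat(["a", "b", "c"], 4)
y = np.array([10.0, 11, 9, 10,    # level a
              14, 15, 13, 14,     # level b
              20, 21, 19, 20])    # level c

u = y.mean()  # grand mean across all observations and levels
for lev in ["a", "b", "c"]:
    diff = y[levels == lev].mean() - u
    print(lev, round(diff, 2))    # B for that level = level mean - grand mean
```

Because the B's are deviations from the grand mean, they sum to zero across levels.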