multiple regression Flashcards
what does a vif score of 5 or above mean
multicollinearity (intercorrelations between the independent variables)
what does a vif score of 10 or above mean
severe multicollinearity
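A minimal sketch of where a vif score comes from, assuming only two predictors (so the R2 of one predictor regressed on the other is just their squared Pearson correlation). The function names and the data are made up for illustration; SPSS reports vif for you.

```python
from statistics import mean

def pearson_r(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((y - mb) ** 2 for y in b)
    return cov / (ss_a * ss_b) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2); with only two predictors, R^2 is r squared."""
    r_squared = pearson_r(x1, x2) ** 2
    return 1.0 / (1.0 - r_squared)

# Made-up example data (not from the flashcards)
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
vif = vif_two_predictors(x1, x2)
print(round(vif, 2))  # compare against the 5 and 10 cut-offs
```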
multiple regression is an analysis of ___
dependence - one variable (the dependent variable) is examined by its dependence on two or more others (the independent variables)
what are independent variables referred to as?
predictors
the __ of each X variable describes its relationship with Y
coefficient
quadratic equation
y = ax^2 + bx + c
multi regression equation
y = b1x1 + b2x2 + … + bnxn + c
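As a rough illustration of the equation above, here is a sketch that fits y = b1x1 + b2x2 + c by least squares for exactly two predictors, solving the two normal equations directly. The function name and data are hypothetical; real analyses would normally use SPSS or a statistics library.

```python
from statistics import mean

def fit_two_predictors(x1, x2, y):
    """Least-squares estimates (b1, b2, c) for y = b1*x1 + b2*x2 + c."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    d1 = [v - m1 for v in x1]          # deviations from the mean
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * t for a, t in zip(d1, dy))
    s2y = sum(b * t for b, t in zip(d2, dy))
    det = s11 * s22 - s12 ** 2         # zero if x1 and x2 are perfectly collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    c = my - b1 * m1 - b2 * m2         # intercept
    return b1, b2, c

# Made-up data generated exactly from y = 2*x1 + 3*x2 + 1
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [2 * a + 3 * b + 1 for a, b in zip(x1, x2)]
b1, b2, c = fit_two_predictors(x1, x2, y)
print(b1, b2, c)
```

Because the data contain no noise, the fit should recover the coefficients used to generate them (up to floating-point rounding).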
in the multiple regression equation what does b stand for
the regression coefficients (one slope per predictor); the x terms are the predictor values
what will the results from SPSS come out as (eg what is the name of the variable)
the R value - this is the measure of association between the observed and predicted value of the criterion variable
what is r2
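the coefficient of determination - the proportion of the variance in y that the model explains (in simple linear regression it is the square of Pearson’s r)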
what does r2 adj account for
accounts for the number of predictor variables in multiple regression
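A small sketch of the usual adjusted r2 formula, 1 - (1 - r2)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. The function name and the example numbers are assumptions for illustration only.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical example: r2 = 0.8 from 30 observations
r2_adj_3 = adjusted_r2(0.8, n=30, k=3)
r2_adj_10 = adjusted_r2(0.8, n=30, k=10)
print(round(r2_adj_3, 3), round(r2_adj_10, 3))
```

With the same r2, the model with more predictors gets the lower adjusted value.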
should you assess the relative importance of the predictor by the size of the coefficient? if not then what should you do??
NO – STANDARDISE THE COEFFICIENTS WITH BETA WEIGHTS (looks at the response of y to each independent variable)
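One common way to get a beta weight from an unstandardised coefficient is beta = b * (sd of x / sd of y). The sketch below assumes that definition; the function name and data are made up.

```python
from statistics import stdev

def beta_weight(b, x, y):
    """Standardised coefficient: b rescaled by sd(x) / sd(y)."""
    return b * stdev(x) / stdev(y)

# Made-up data where y = 2 * x exactly, so b = 2
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
beta = beta_weight(2, x, y)
print(beta)
```

Beta weights are unit-free, so they can be compared across predictors measured on different scales.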
what is f?
the significance of the model being able to explain variance
what does f = 0 mean?
the model does not explain variance
what is B
the regression coefficient
what is t?
the significance of the coefficient in explaining the variance
what is assumed in multiple regression about the distribution?
the residuals (errors) are assumed to be normally distributed - needed for the significance tests to be valid
homoscedasticity is always assumed in multiple regression, what is this?
the variance of the residuals is constant across all levels of the predicted variable. eg the spread of points around the line of best fit is about the same along its whole length.
what should you look at to see if x values are correlating with each other?
vif (variance inflation factor)
is it better to have more or fewer values in your graphs and analysis
fewer - more values is not necessarily better; aim for around 2 or 3, as most regressors are likely to be significant
what is the equation showing: s^2y/s^2e
used to determine the r2 value (r2 = 1 - s^2e/s^2y) - simple linear regression
what does this represent: s^2y
variance of original simulation model output
what does this represent: s^2e
variance of regression residuals
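A sketch of how the two variances combine into r2, assuming r2 = 1 - s^2e/s^2y with both variances using the same n - 1 denominator. The observed and predicted values are invented for the example.

```python
from statistics import mean

def r_squared(observed, predicted):
    """R^2 = 1 - s2e / s2y, both variances using an n - 1 denominator."""
    n = len(observed)
    my = mean(observed)
    s2y = sum((y - my) ** 2 for y in observed) / (n - 1)   # variance of y
    s2e = sum((y - p) ** 2 for y, p in zip(observed, predicted)) / (n - 1)  # residual variance
    return 1 - s2e / s2y

# Made-up observed values and regression predictions
observed = [1, 2, 3, 4, 5]
predicted = [1.2, 1.9, 3.1, 3.9, 4.9]
r2 = r_squared(observed, predicted)
print(round(r2, 3))
```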
what is the difference between Homoscedasticity and Heteroscedasticity
homo = the residuals have the same variance at every level of the predictors (assumption met, okay). hetero = violation of homo: the variance of the residuals changes across levels of the predictors
what does it mean if the f ratio is greater than 1
explained variance is higher than unexplained variance
what is ‘tolerance’?
gives you the unique variance associated with each variable
what does a tolerance of 0.34 mean
means 34% of variance for that predictor is not accounted for by other predictors
a tolerance of less than 0.2 means?
that the predictor adds almost nothing new to the model (most of its variance is shared with the other predictors) - not good
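Tolerance and vif are two views of the same quantity. A short worked relation, where R_j^2 (notation assumed here) is the r2 from regressing predictor j on all the other predictors:

```latex
\text{tolerance}_j = 1 - R_j^2, \qquad \text{VIF}_j = \frac{1}{\text{tolerance}_j}
```

So a tolerance of 0.2 corresponds to a vif of 5, a tolerance of 0.1 to a vif of 10, and a tolerance of 0.34 to a vif of about 2.94.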
3 ways to identify homoscedasticity - common question
in a scatter plot there should be no discernable change in distribution (should be randomly scattered)
data in histograms and p-p plots should also remain normally distributed
why do you need to standardize coefficients - common question
to reduce multicollinearity (standardising the variables mainly helps when the model includes squared or interaction terms)
to compare the effects of predictors measured in different units on a common scale
what is stepwise multiple regression
it finds the independent variable that has the largest significant Pearson’s correlation with Y. It then returns to the matrix and finds the next most significant correlation and onwards
how does stepwise multiple regression deal with insignificant predictors?
Stepwise regression involves adding predictors to the model one by one until it finds a non-significant predictor, at which point it stops building the model.
what is hierarchical multiple regression
This uses the system of ‘blocks’ in the ‘linear regression’ window to allow the user to define the order in which variables should be regressed. This approach is very logical and selection of predictors is based on underlying theoretical considerations
how does hierarchical multiple regression deal with insignificant predictors?
the system uses a method of ‘blocks’ in the linear regression window to allow the user to choose what variables to regress. this allows certain predictors to be controlled and better understood
how does forward stepwise regression work?
model selects from the group of predictors based on the predictor which makes the largest contribution to r2. adds the rest of the predictors but stops once the remaining variables cannot make a significant contribution.
how does backward stepwise regression work?
opposite of forward. dependent variable is regressed against all predictors and the weakest is taken out (contributes the least). this continues until only the statistically significant variables remain.
what is the equation for standardising values
zx (variable) = (x (each observation) - x mean) / standard deviation of variable
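A minimal sketch of standardising a variable: subtract the mean from each observation, then divide by the standard deviation. It assumes the sample standard deviation (n - 1 denominator); the function name and data are made up.

```python
from statistics import mean, stdev

def standardise(values):
    """Return z-scores: (x - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(x - m) / s for x in values]

# Made-up data with mean 4 and standard deviation 2
z = standardise([2, 4, 6])
print(z)
```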
how to calculate f value
explained variance / unexplained variance
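A sketch of the f calculation, assuming the usual definition in which each variance is a sum of squares divided by its degrees of freedom (k for the model, n - k - 1 for the residuals). The function name and numbers are hypothetical.

```python
def f_ratio(ss_explained, ss_unexplained, n, k):
    """F = (explained SS / k) / (unexplained SS / (n - k - 1))."""
    ms_model = ss_explained / k                  # explained variance
    ms_residual = ss_unexplained / (n - k - 1)   # unexplained variance
    return ms_model / ms_residual

# Hypothetical sums of squares: 23 observations, 2 predictors
f_value = f_ratio(ss_explained=80, ss_unexplained=20, n=23, k=2)
print(f_value)
```

An f above 1 means the explained variance exceeds the unexplained variance; whether it is significant still depends on the degrees of freedom.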