Linear & Multiple Regression Flashcards
Who is touted as the ‘Father of Behavioural Statistics’?
Francis Galton
Inspiringly, he is also known for not being particularly strong at traditional statistics, and he had a severe breakdown while studying it at Cambridge.
Describe:
The relationship between regression models and variance.
Regression models involve predictor variables that account for at least some of the variance seen in the outcome variable.
Variance can be thought of as the extent to which values differ from the mean.
Regressions are a predictive tool for this.
When trying to identify a question suited for a regression model, what key word should you look out for?
(In most cases)
Predict(s)
(e.g. “what predicts romantic interest on a date?”)
Give the general equation for a regression model.
y’ = a + bx + e
y’ : outcome variable (what you are predicting).
a : the intercept (the mean value of y’ when all predictors are zero).
bx : predictor term(s), where b is the coefficient (slope) and x is the predictor variable intended to explain variance seen in y’.
e : error (accounting for the fact no model is perfect).
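As a minimal sketch, a and b for a single predictor could be estimated with ordinary least squares in plain Python. The function name fit_simple_regression and the hours/score data are made up for illustration; a real analysis would normally rely on statistical software.

```python
from statistics import mean

def fit_simple_regression(x, y):
    """Return (a, b) for the least-squares line y' = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope: how x and y vary together, divided by how much x varies.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    # Intercept: the predicted value of y' when x is zero.
    a = y_bar - b * x_bar
    return a, b

# Made-up illustrative data (not from any study).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

a, b = fit_simple_regression(hours, score)
predicted = [a + b * xi for xi in hours]
# e: the residual (error) left over for each observation.
residuals = [yi - pi for yi, pi in zip(score, predicted)]
```

The residuals list plays the role of e: whatever the line a + bx fails to capture for each observation.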
Why, in a regression model, is the intercept the value of the mean of the outcome variable?
When studying normally distributed populations, assuming a value will be close or equal to the mean should be correct more often due to chance alone.
The predictor variables should then ideally improve the accuracy of the prediction and account for variance in the outcome.
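The idea that the mean is the best guess before any predictors are added can be checked with a small sketch. It assumes "best" means the smallest sum of squared errors; the data and the helper sum_squared_error are made up for illustration.

```python
from statistics import mean

# Made-up outcome values (not from any study).
score = [52, 55, 61, 64, 68]

def sum_squared_error(guess, values):
    """Total squared distance between one constant guess and every value."""
    return sum((v - guess) ** 2 for v in values)

# Guessing the mean for everyone...
sse_mean = sum_squared_error(mean(score), score)
# ...compared with a few other constant guesses.
sse_other = {g: sum_squared_error(g, score) for g in (55, 58, 62, 65)}

# True when no alternative guess beats the mean.
mean_is_best = all(sse > sse_mean for sse in sse_other.values())
```

Under that assumption, every alternative guess produces a larger squared error than the mean does, which is why the mean acts as the baseline that predictors then try to improve on.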
List:
The FOUR assumptions of a regression model.
- Independence of measurements.
- Normal distribution of variables.
- Linearity of predictor-outcome relationship.
- Homoscedasticity of residuals.
Homoscedasticity refers to whether the residuals (the error of the model) have a roughly constant, random spread. There shouldn’t be a ‘systematic error pattern’.
Note this is what was tested in PSYC232, but some sources seem to contradict it.
Explain:
Residual plots should display random variance.
And how does this relate to regressions?
If a residual plot (indicator of error) showed a non-random pattern, this would mean the predictor variables are NOT sufficiently predicting or explaining the variance in the outcome variable.
It likely means there are stronger predictors not being considered.
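As a rough illustration of a non-random residual pattern, the sketch below fits a straight line to deliberately curved, made-up data. The helper fit_line is hypothetical, and here the pattern comes from a curved relationship that a straight line cannot capture.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept and slope for y' = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - b * x_bar, b

# Made-up data with a curved (quadratic) relationship.
x = [0, 1, 2, 3, 4, 5]
y = [xi ** 2 for xi in x]

a, b = fit_line(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Record the sign of each residual in order of x.
signs = "".join("+" if r > 0 else "-" for r in residuals)
```

The residuals are positive at both ends and negative in the middle, a systematic U shape rather than random scatter, which signals that the model is missing something.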
What is the purpose of the model coefficient box in a (multiple) regression output?
It provides important information on which exact predictor variables improve accuracy in predicting the outcome.
What is the purpose of the model fit box in a (multiple) regression output?
It gives holistic information on how well all of the variables in the model predict the outcome.
This is particularly denoted by ‘R’.
What does R² represent in a (multiple) regression output?
The amount of variance in the outcome explained by the regression model.
This is an overall assessment, and does not tell you which variables contribute more accuracy than others.
Having around or over 50% explained is considered impressive when studying human behaviour!
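A minimal sketch of how R² could be computed by hand, assuming the usual definition of one minus the ratio of unexplained to total variation. The observed and predicted values are made up, and the function name r_squared is hypothetical.

```python
from statistics import mean

def r_squared(observed, predicted):
    """Proportion of variance in the outcome explained by the model."""
    y_bar = mean(observed)
    # Variation left over after using the model's predictions.
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Variation around the mean, i.e. before using any predictors.
    ss_total = sum((o - y_bar) ** 2 for o in observed)
    return 1 - ss_residual / ss_total

# Made-up observed scores and the predictions of some fitted model.
observed = [52, 55, 61, 64, 68]
predicted = [51.8, 55.9, 60.0, 64.1, 68.2]

r2 = r_squared(observed, predicted)
percent_explained = round(r2 * 100, 1)
```

Multiplying by 100 gives the percentage of variance explained, which is how the value is usually read from the output.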
How does the adjusted R² differ from R² in a multiple regression output?
It accounts for the number of predictors used in the regression model.
Having a larger number of predictors may increase the chance of the model appearing more accurate by accident, so the adjusted R² lowers the value to penalise the number of predictors (relative to the sample size).
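The penalty can be sketched with the commonly used adjusted R² formula, 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the sample size and p the number of predictors. The function name and the example numbers are assumptions for illustration, and specific software may report the value slightly differently.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical model: the same R-squared of 0.50 from 50 participants,
# reached with different numbers of predictors.
r2, n = 0.50, 50
adjusted = {p: adjusted_r_squared(r2, n, p) for p in (2, 5, 10)}
```

With R² and n held fixed, the adjusted value falls as p grows, so a model that reaches the same R² with fewer predictors is rewarded.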
When would the adjusted R² value be the exact same as the R² value?
(In a regression model output)
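When there is only one predictor variable used in the model, the adjustment is at its smallest, so the two values are at their closest. Strictly, they are exactly equal only when the model fits perfectly (R² = 1).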
What is the general equation for calculating error in a (multiple) regression output?
1 - R²
In words, this means it accounts for the remaining variance not explained by the regression model.
Around 10 - 30% error is considered relatively normal for behavioural research.
What is the format for presenting the overall model test from a regression output?
F(df1, df2) = F, p-value
Note, the second F refers to the test statistic from the results.
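For reference, one standard way to obtain the F statistic and its degrees of freedom from R² is sketched below, assuming n observations and p predictors (so df1 = p and df2 = n - p - 1). The numbers are hypothetical, and the p-value is left to the statistical software because it needs the F distribution.

```python
def overall_f_test(r2, n, p):
    """Return (df1, df2, F) for the overall regression model test."""
    df1 = p            # number of predictors
    df2 = n - p - 1    # observations left over after fitting the model
    # Explained variance per predictor over unexplained variance per df.
    f_value = (r2 / df1) / ((1 - r2) / df2)
    return df1, df2, f_value

# Hypothetical output: R-squared of 0.40, 103 participants, 2 predictors.
df1, df2, f_value = overall_f_test(r2=0.40, n=103, p=2)

# Report in the flashcard's format, with the p-value taken from software.
report = f"F({df1}, {df2}) = {f_value:.2f}"
```

The report string follows the same layout as the card, with the p-value appended once it has been read from the output.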
What does having a statistically significant regression model imply?
That, on average, the model is more accurate at predicting the true value of a measurement than simply guessing the mean of the outcome variable.