Week 3- lecture notes (linear model) Flashcards
Terminology for plotting on the x and y axis
y - dependent variable (DV)
x - independent variable (IV)
slope formula
change in y / change in x
What is an intercept
The predicted y value when x is 0
How do we calculate what the regression line is
DV = intercept + slope * predictor
-can be used to predict the mean of y for a given value of x (deterministic)
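The deterministic prediction above can be sketched in a few lines; the intercept and slope values here are made up for illustration:

```python
# Minimal sketch of the deterministic regression prediction (hypothetical numbers).
intercept = 2.0   # predicted y when x = 0 (assumed value)
slope = 0.5       # change in y per unit change in x (assumed value)

def predict(x):
    """Predicted mean of y for a given x: intercept + slope * x."""
    return intercept + slope * x

print(predict(4))  # 2.0 + 0.5 * 4 = 4.0
```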
Residuals
-the regression model doesn’t fit every data point perfectly; the amount by which it is inaccurate is quantified by the residuals
-vertical differences seen coming off the regression line
-add an error term to the regression formula (DV=intercept + weight * predictor + error)
-stochastic: the error term reflects that predictions will never be perfect
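The residual bullets above can be sketched directly: each residual is the vertical difference between an observed y value and the fitted value from the regression line (the numbers here are made up):

```python
# Sketch: residuals are observed values minus fitted values (hypothetical data).
observed = [2.1, 3.0, 3.8, 5.2]           # actual y values
fitted   = [2.0, 3.0, 4.0, 5.0]           # intercept + slope * x for each case

# One residual per data point: the vertical gap off the regression line.
residuals = [obs - fit for obs, fit in zip(observed, fitted)]
print(residuals)
```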
What is linear regression
statistical method that’s used to create a linear model
What are the 4 different types
- simple linear regression- models using only one predictor
- multiple linear regression- models using multiple predictors
- logistic regression- models a categorical response variable
- multivariate linear regression- models for multiple response variables
what are the assumptions of the simple linear regression
-residuals are normally distributed
-residuals have constant variance (homoscedasticity)
-non-constant variance and non-normal residuals violate these assumptions
The null model: measuring model fit
-compare the full model with one that considers only the intercept
-the null model is calculated as intercept + error, NOT intercept + weight * predictor + error
how do we calculate residuals with word frequency model and null model
- quantify the residuals using the sum of squared errors (SSE)
-the null model provides a standardised baseline for comparing the two models
-R squared = 1 - SSE(model) / SSE(null)
-this tells us how much variation the predictor can account for and how much remains unexplained
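The SSE comparison above can be sketched with made-up data: the null model predicts the mean of y for every case, and R squared compares the two sums of squared errors:

```python
# Sketch: R squared = 1 - SSE(model) / SSE(null), with hypothetical data.
y      = [2.0, 3.0, 5.0, 6.0]             # observed values
fitted = [2.2, 3.1, 4.8, 5.9]             # predictions from the main model
mean_y = sum(y) / len(y)                  # the null model predicts the mean for every case

sse_model = sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))
sse_null  = sum((obs - mean_y) ** 2 for obs in y)

r_squared = 1 - sse_model / sse_null
print(round(r_squared, 3))  # → 0.99
```

A value near 1 means the predictor accounts for almost all the variation that the intercept-only null model leaves unexplained.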
what does R squared actually measure
-measure of effect size
-values range from 0 to 1; values closer to 1 indicate a better model fit and a stronger effect
-R squared uses the residuals of the null model to standardise the residuals of the main model. This provides an effect size and tells us what proportion of variation in the dependent variable can be accounted for by the predictor in the main model
The relationship between fitted values, observed values and residuals can be summarized as follows: residuals = observed values - fitted values
True
Statements about R squared that are true
R-squared values range from 0 to 1
R-squared tells us what proportion of variance in the outcome variable can be accounted for by the predictor variables
R-squared is a measure of effect size