Week 3: Regression Flashcards
What is needed for simple linear regression? - (3)
- What sort of measurement = DV
- How many predictors = 1
- What type of predictor variable = continuous
Regression is a way of predicting things you have not measured - (2)
Predicting an outcome variable from one predictor variable. OR
Predicting a dependent variable from one independent variable.
Regression predicts variable y from
variable x
In regression
when we know that x should
influence y (instead of y influencing x)
Regression used to create a linear model of relationship between
two variables
Regression differs from correlation in that it adds a
constant b0 (the intercept)
In regression we create a model to predict y - (2)
Regression equation: Yi = b0 + biXi + εi
In regression we test how good the model we created to predict y is
good at fitting the data
The straight line equation in regression model has 2 parameters - (2)
The gradient (describing how the outcome changes for a unit increment of the predictor)
The intercept (of the vertical axis), which tells us the value of the outcome variable when the predictor is zero
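A minimal sketch (pure Python, hypothetical advertising-vs-sales data) of how the two parameters of the straight line are estimated by least squares:

```python
# Simple linear regression: fit y = b0 + b1 * x by least squares.
# Hypothetical data: advertising spend (x) vs. album sales (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [120.0, 180.0, 260.0, 310.0, 400.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# b1 (gradient): how the outcome changes for a unit increment of the predictor
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
     sum((xi - mean_x) ** 2 for xi in x)

# b0 (intercept): value of the outcome when the predictor is zero
b0 = mean_y - b1 * mean_x

print(b0, b1)  # prints 47.0 69.0 for this data
```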
Yi in regression equation means
outcome variable e.g., album sales
b0 in regression equation means
intercept
bi in regression equation means - (2)
Regression coefficient for predictor
e.g., direction and strength of the relationship between advertising budget & album sales
εi in regression equation means - (2)
error
e.g., Error album sales not explained by advertising budget
Xi in regression equation means - (2)
Predictor variable
e.g., advertising budget
Example of using simple linear regression equation to predict values - (3)
- Imagine you want to spend £5 on advertising – substitute X = 5 into the regression equation
- Based on the model we can predict that if we spend £5 on advertising, we will sell 550 albums
- This prediction will not be perfect, as there is always a margin for error (the error term); the value the model produces is also known as the predicted value
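The substitution above can be sketched as follows (the coefficient values are hypothetical, chosen so that £5 of spend predicts 550 album sales as in the card):

```python
# Predicted value from the regression equation: Y_hat = b0 + b1 * X
# (hypothetical coefficients for illustration only)
b0 = 50.0   # intercept: sales with zero advertising
b1 = 100.0  # regression coefficient for advertising budget
X = 5.0     # spend £5 on advertising

Y_hat = b0 + b1 * X
print(Y_hat)  # prints 550.0; the observed outcome differs by the error term
```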
The closer the sum of squares of the model (SSM) is to the total sum of squares (SST),
the better the model accounts for the data, and the smaller the residual sum of squares (SSR) must be
Formula of Sum of Squares of Model (SSM)
SSM = SST (total) - SSR (residual)
SST (total) uses the
difference between the observed data and the mean value of Y
Sum of squares (residual) uses the difference between the
observed data and the model
Sum of squares model uses the difference between
mean value of Y and the model
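The three sums of squares defined above can be computed directly; with hypothetical data, the sketch below also confirms that SSM = SST - SSR:

```python
# Partitioning variance in simple linear regression (hypothetical data):
# SST: observed data vs. mean of Y
# SSR: observed data vs. model (residuals)
# SSM: model vs. mean of Y
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [120.0, 180.0, 260.0, 310.0, 400.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
     sum((xi - mean_x) ** 2 for xi in x)
b0 = mean_y - b1 * mean_x
y_hat = [b0 + b1 * xi for xi in x]  # model's predicted values

SST = sum((yi - mean_y) ** 2 for yi in y)
SSR = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
SSM = sum((yh - mean_y) ** 2 for yh in y_hat)

print(SST, SSR, SSM)  # SSM equals SST - SSR
```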
In simple linear regression, R^2 is the proportion of variance in the DV (outcome variable; Y) that is explained
by the IV (predictor variable X) in the regression model
The R squared (Pearson’s Correlation Coefficient squared) is the
coefficient of determination
R^2 gives you overall fit of model thus
model summary
Adjusted R squared tells
how well R squared generalises to population
Adjusted R squared indicates how well a
predictor variable explains the variance in the outcome variable, but adjusts the statistic based on the number of independent variables in the model.
Adjusted R squared will always be lower than or equal to the R^2 value because..
It penalises the model for each predictor added, making it a more conservative statistic for how much variance in the outcome variable the predictor variable explains
If you add more useful variables, adjusted R squared will
increase
If you add more and more useless variables in mode, what will happen to adjusted R squared?
adjusted R squared will decrease
How to calculate R squared in simple linear regression?
R^2 = SSM/SST, where SSM = SST - SSR
R squared gives ratio of
explained variance (SSM) to total variance (SST)
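This ratio is a one-liner once the sums of squares are known (hypothetical values for illustration):

```python
# R^2 as the ratio of explained variance (SSM) to total variance (SST).
# Hypothetical sums of squares:
SST = 47920.0    # total sum of squares
SSR = 310.0      # residual sum of squares
SSM = SST - SSR  # model sum of squares

R2 = SSM / SST
print(round(R2, 4))  # prints 0.9935 for these values
```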
The F ratio tests if the line is better than the mean, i.e. whether the
overall model (fitted regression line) is a good fit
What is the mean squared error? (3)
- Sum of squares (SS) are total values
- Can be expressed as averages
- These are called Mean squares MS
What does SSM divided by its degrees of freedom (df) give?
mean squares for the model (MSM)
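Putting the last few cards together: each sum of squares divided by its degrees of freedom gives a mean square, and the ratio of the two mean squares is the F ratio. A sketch with hypothetical values (1 predictor, n = 5 observations, so df_model = 1 and df_residual = n - 2):

```python
# Mean squares = sums of squares / degrees of freedom.
# Hypothetical sums of squares for a simple linear regression:
SSM, SSR = 47610.0, 310.0
n = 5
df_model = 1        # one predictor
df_residual = n - 2 # n observations minus two estimated parameters (b0, b1)

MSM = SSM / df_model     # mean squares for the model
MSR = SSR / df_residual  # mean squares for the residuals

F = MSM / MSR  # F ratio: is the fitted line better than the mean?
print(MSM, MSR, F)
```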