L6 - Linear Regression Flashcards
What is the purpose of a simple linear regression?
Simple linear regression involves fitting a line of best fit to a scatterplot of data points representing two linearly related variables, X and Y.
What are the 6 features of a simple linear regression?
- Line of best fit
- Method of least squares
- Variance partitioning and residuals
- Coefficient of determination (R-squared)
- Regression equations/interpretation
- Coefficients and beta values
How do we develop a ‘line of best fit’?
Find the slope and the Y-intercept
Y = mX + c
What is the difference between a multiple regression and a linear regression?
Multiple regression has multiple predictors
Linear regression has a single predictor
What is the total variance in a regression?
How far the actual scores are from Ybar (the mean of Y)
Total variance is based on Y - Ybar (Y = actual score, Ybar = the mean of Y, the original estimate)
- Y (actual score) = top*
- Y1 (regression line prediction) = middle*
- Ybar (original prediction, i.e. the class average)*
- Y1 is an improvement on the original predictor (Ybar)*
- That improvement is the explained variance, the “regression component”: the bit you have explained*
What is the difference between Y and Y1 in this example called?
The Residual (Error) of the model
How far off the model is from the actual number
If you take Y1 (predicted) away from Y (actual) for every person, square the differences and then add them up, what do you get?
The Error Sum of Squares
Amount of variance we have not been able to explain with the variables
“Error variance”
What is the total sum of squares?
Tot SS = difference between actual scores and the mean value of Y.
For every person in the study, we subtract the mean of Y from their actual score, square the difference, and then add the numbers up.
What is the Regression Sum of Squares (RSS)?
The Explained Variance.
The difference between the original estimate (the mean of Y) and the closer estimate obtained once the regression has been done
Calculation: sum of (predicted value - mean of Y), squared
What is the calculation of R2 (R-squared)?
RSS/TSS
(TSS = total sum of squares)
(RSS = regression sum of squares)
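As a rough sketch on made-up data, the three sums of squares and R-squared can be computed directly (the X/Y values here are invented for illustration):

```python
import numpy as np

# Hypothetical data: X = age, Y = some outcome score
X = np.array([20, 25, 30, 35, 40, 45, 50], dtype=float)
Y = np.array([58, 60, 65, 63, 70, 74, 78], dtype=float)

# Fit the least-squares line Y' = c + m*X
m, c = np.polyfit(X, Y, 1)
Y_pred = c + m * X

TSS = np.sum((Y - Y.mean()) ** 2)        # total sum of squares
ESS = np.sum((Y - Y_pred) ** 2)          # error (residual) sum of squares
RSS = np.sum((Y_pred - Y.mean()) ** 2)   # regression (explained) sum of squares

r_squared = RSS / TSS
# RSS + ESS equals TSS (up to floating-point error)
```

Note the partition: the explained and unexplained pieces add back up to the total, which is why RSS/TSS is a proportion between 0 and 1.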
What is the multiple coefficient of determination (R2)?
Coefficient of determination (R2) is the proportion of total variance explained by the model
- When it is high, the regression line is close to the actual points: the variables are capturing the variance around the rough estimate (the mean), so the model is doing well at explaining the variance. When it is low, it is not.*
- Tells us how strongly the Y variable is related to the X variable.*
How do you know if (R2) is meaningful (significant)? (e.g. is 20% of the variance meaningful?)
F test
F = ratio of systematic (explained) variance to error variance
i.e. the regression (between) component vs the error (within) component, each divided by its degrees of freedom
If the explained component is sufficiently larger than the error component, the model is significant
What does a significant F ratio mean?
Your model is significant in explaining a meaningful amount of variance in the results (Y)
When is a small F ratio still likely to be significant?
When the sample size is large (large degrees of freedom lower the critical F value)
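A sketch of why sample size matters, using scipy with hypothetical sample sizes: the critical F value needed for significance shrinks as the error degrees of freedom grow.

```python
from scipy.stats import f

# Critical F value (alpha = .05) for a model with 2 predictors (df_reg = 2).
# df_error = N - k - 1, so it grows with sample size N.
crit_small_n = f.ppf(0.95, 2, 20 - 2 - 1)     # N = 20 (hypothetical)
crit_large_n = f.ppf(0.95, 2, 1000 - 2 - 1)   # N = 1000 (hypothetical)

# crit_large_n is smaller, so a modest F ratio that falls short in a small
# sample can still be significant in a large one.
```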
How do you interpret a regression equation?
Example: Y = 45.67 + 0.67Age
(There is always an exam question on this.)
This means that each 1-unit increase in age is associated with a 0.67-unit increase in Y.
What does the coefficient (.67) mean?
Each unit increase in age is associated with a .67 increase in Y
What is a Beta Value?
Standardised values that can be compared.
What happens when you have a strong beta value?
The stronger the beta value, the greater the relative importance of that predictor
How do you calculate a Beta Value (standardise the coefficients)?
Multiply each coefficient by the SD of its predictor divided by the SD of the dependent measure (beta = b x Sx/Sy).
What is the main issue with a regression coefficient?
It doesn’t tell you which variable is most important, there’s no effect size
The size of the coefficient will differ depending on the measure
Comparing regression coefficients tells you nothing about the importance of the variable.
How do you understand its importance?
You have to standardise the coefficients.
Multiply each coefficient by the SD of its predictor divided by the SD of the dependent measure.
This is a standardised beta.
What are residuals in regression?
The variance you haven’t explained
The actual score minus the predicted score
The name for (R2) is…
Coefficient of determination
What are the two “principles” of multiple regression?
Explanation vs prediction
Prediction: Doesn’t care about theory, just what group you belong to
(e.g. gathering data online to predict behaviour, google)
More practical, which people are most likely to be e.g. problem gamblers
Explanation: Explaining what variable is the best predictor
E.g. Bronfenbrenner's model: what are the variables that are most likely to impact child behaviour?
Which of the levels of influence is MOST influential when I test them against each other, how much variance is attributed to each predictor variable
What is Multicollinearity?
Means that many of your predictor variables are correlated with each other
It means that when you explain variance, x1, x2 and x3 are all related, so each individually correlates with Y, but when you put them together they all eat up each other's variance since they are related.
What is the function for a simple linear regression?
Y = mX + c
Where m is the slope
and c= the Y-intercept
X is the independent variable
Y is the dependent variable.
How do we obtain a “line of best fit”?
method of least squares
involves the minimization of the squared deviations between the actual scores of Y vs. those predicted by the resultant regression equation (Y’).
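A minimal sketch of the closed-form least-squares estimates, on made-up data:

```python
import numpy as np

# Hypothetical data for Y' = c + m*X
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form least-squares estimates: the slope is the ratio of the
# X-Y co-deviation to the X deviation, and the line passes through the means.
m = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
c = Y.mean() - m * X.mean()
Y_pred = c + m * X

# These estimates minimise sum((Y - Y')**2); any other slope/intercept
# gives a larger error sum of squares.
```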
Regression involves calculating three sums of squares
What are they?
Total sum of squares, Error sum of squares, Regression sum of squares.
What is the total sum of squares?
Total SS = difference between actual scores and the mean value of Y.
What is the error sum of squares?
Error SS (unexplained) = difference between Y’ and the actual values of Y.
What is the regression sum of squares?
Reg SS (explained) = difference between Y’ (predicted) and the mean of Y.
In linear regression, what is the coefficient of determination?
What is the value equal to?
The R-squared value.
The value is equal to:
regression sum of squares (variance explained) / total sum of squares
What is the F-value in a linear regression?
This is the ratio of the Regression mean square to the Error mean square.
What is the formula for getting the F-value?
Regression mean square = RSS/ DFregg
ie. divide RSS by the degrees of freedom (of that component)
* RSS = regression sum of squares*
* DF = degrees of freedom*
How do we obtain degrees of freedom for regression mean square?
DFregg = No. coefficients estimated (including the constant) – 1.
What is the formula for the error mean square (for obtaining the F-value)?
Error mean square = ESS / DFerror
ie. the error sum of squares divided by its degrees of freedom, where DFerror = Total observations (N) – No. coefficients (k) – 1.
The F-value is then Regression mean square / Error mean square.
How do we obtain the degrees of freedom for the error sum of squares?
DFerror = Total observations – No. of predictor coefficients – 1
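Putting the pieces together on made-up data (a single predictor, so the degrees of freedom for the regression component is 1):

```python
import numpy as np

# Hypothetical single-predictor example: F = (RSS/df_reg) / (ESS/df_error)
X = np.array([20, 25, 30, 35, 40, 45, 50, 55], dtype=float)
Y = np.array([50, 54, 55, 60, 59, 66, 68, 71], dtype=float)

m, c = np.polyfit(X, Y, 1)
Y_pred = c + m * X

RSS = np.sum((Y_pred - Y.mean()) ** 2)   # regression sum of squares
ESS = np.sum((Y - Y_pred) ** 2)          # error sum of squares

k = 1                      # number of predictors
df_reg = k                 # (coefficients incl. constant) - 1
df_error = len(Y) - k - 1  # N - k - 1

reg_ms = RSS / df_reg      # regression mean square
err_ms = ESS / df_error    # error mean square
F = reg_ms / err_ms
```

For a simple regression this F equals (r-squared / (1 - r-squared)) x df_error, which is a handy sanity check.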
Interpret the following linear regression example
Y= 45.67 + 0.67Age
(Important: there is always an exam question on this.)
For each 1-unit increase in age, Y increases by 0.67 units.
When we are using a linear regression equation, do we use standardised or unstandardised values?
Unstandardised
We only standardise afterwards to obtain some measure of the relative importance of variables
In linear regression, we standardise coefficients to understand the relative importance of each variable.
How do we standardise coefficients?
What is this called?
By multiplying them by the ratio of the standard deviation of the predictor to the standard deviation of the dependent measure
This is called the Beta value.
ie. beta = b x (Sx / Sy)
Sy = SD of dependent measure
Sx = SD of particular predictor
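A sketch on made-up data; for a single-predictor regression the resulting beta equals the Pearson correlation, which is a handy check:

```python
import numpy as np

# Hypothetical predictor and dependent measure
X = np.array([20, 25, 30, 35, 40, 45, 50], dtype=float)  # predictor
Y = np.array([58, 60, 65, 63, 70, 74, 78], dtype=float)  # dependent measure

b, c = np.polyfit(X, Y, 1)                 # unstandardised coefficient b
beta = b * (X.std(ddof=1) / Y.std(ddof=1)) # standardised beta = b * Sx/Sy

# With only one predictor, beta is identical to the Pearson r between X and Y.
r = np.corrcoef(X, Y)[0, 1]
```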
Once we obtain beta values for the coefficients of a linear regression, what test do we use to determine the relative significance of a coefficient in comparison to others?
t-test
What is the difference between a simple linear regression and a multiple regression?
Simple linear regression: 1 predictor variable
Multiple linear regression: 2 or more predictor variables
Exactly the same coefficients, statistical tests etc. apply to a multiple regression as a simple linear regression. The main difference lies in the selection of different methods for entering variables into the equation.
What are the two classifications to consider in multiple regression?
hierarchical vs. statistical
What order do we add the variables in with hierarchical regression?
Variables are entered in the order that is theoretically important.
What order do we add the variables in with statistical regression?
Variables are entered according to a specific statistical criterion
e.g., the one with the next highest correlation with the dependent measure.
Which is considered more robust, hierarchical or statistical regression?
Hierarchical
You are controlling for what goes in and in which order, based on a systematic or theory-driven model.
There are two types of adding variables to a multiple regression analysis.
What are those two types?
Standard vs. Stepwise
Standard: In what is called ‘ordinary least squares regression’, all variables are entered at once.
Stepwise: In stepwise procedures, including hierarchical or theory-driven entry procedures, the variables go in step by step.
What are the 3 types of stepwise regression?
- Forwards method: variables go in according to the highest first-order and then partial correlation.
- Backwards method: all variables go in, then the one with the lowest partial correlation gets removed, until there is no significant change in R-squared.
- Stepwise: combination of backwards and forwards.
Stepwise is considered “dodgy”, why?
Requires a lot more power + atheoretical + open to wild goose chases if a correlation is a Type 1 error.
It is influenced by Type 1 errors.
- Things can be significant by chance.*
- This creates model inconsistency or unreliability.*
Why is hierarchical regression not considered “dodgy”?
It is based on theory and not on chance.
No chance of the model being based on type 1 errors.
What do squared semi-partial correlations tell us?
How much variation in the dependent measure that particular predictor uniquely explains.
As squared semi-partial correlations tell us how much variation in the dependent measure that particular predictor uniquely explains, adding up all the squared part correlations will add up to R-squared (variance explained)
True or False?
False
There might be small amounts of variation explained by non-included variables (not in the equation).
A lot of the variance might be shared by 2 or more predictors.
What are the two types of relationships that can be found in regression models?
Mediators
Moderators
What is a mediated relationship?
Where a variable mediates the relationship between two variables.
e.g. B mediates the relationship between A and C. Variable B carries the relationship between A and C.
A only correlates with C because A gives rise to B, which in turn, gives rise to C.
Is this example a mediated or a moderated relationship?
Example: No. of work hours (A) might correlate with decreases in work satisfaction (C). However, this may only occur because increases in work hours lead to increases in stress levels (B), which is what decreases work satisfaction.
Mediated.
A and C are only correlated when B exists
When can you test for mediation?
An analysis of mediation only makes sense if all 3 variables are correlated at least moderately.
If this weren’t the case, then the effect probably wasn’t there.
The Baron and Kenny (1986) Method is a test of…
Mediation
How would the Baron and Kenny (1986) method operate in this example?
We run 2 regressions
R1: Run a regression with the number of work hours as a predictor of Work satisfaction
R2: Run a regression with both variables in the equation.
Then, compare the beta coefficients for No. hours between R1 and R2.
Usually, the beta value for No. hours will be higher in R1; if it has gone down in R2, partial mediation has occurred. If it is fully reduced to 0, full mediation has occurred.
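The two-regression comparison can be sketched with simulated data (the variable names and the data-generating numbers here are made up):

```python
import numpy as np

# Simulated example: A = work hours, B = stress (mediator),
# C = work satisfaction. C depends on A only through B.
rng = np.random.default_rng(0)
A = rng.normal(40, 5, 200)
B = 0.8 * A + rng.normal(0, 2, 200)    # A drives stress
C = -0.6 * B + rng.normal(0, 2, 200)   # stress drives satisfaction down

def standardised_coefs(y, *xs):
    # z-score everything so the coefficients come out as betas
    z = lambda v: (v - v.mean()) / v.std()
    X = np.column_stack([z(x) for x in xs])
    coefs, *_ = np.linalg.lstsq(X, z(y), rcond=None)
    return coefs

beta_A_alone = standardised_coefs(C, A)[0]       # R1: A only
beta_A_with_B = standardised_coefs(C, A, B)[0]   # R2: A and B together
```

Because the data were generated so that A affects C only through B, the beta for A should collapse toward zero once B enters the equation, i.e. (near-)full mediation.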
What is the difference between a partial mediation and full mediation
When doing a mediation analysis (Baron and Kenny Method), if, in the second equation, the beta coefficient has reduced in size, it is partial mediation. If it has reduced to 0, it is full mediation.
If there has been a partial mediation, how can we tell if the mediation is meaningful?
Sobel Test
This test can be used to test differences in the magnitude of beta coefficients between the first and second equations.
When is using a Sobel Test appropriate?
When you have a very large sample and where you can assume normality in the product term used to capture the indirect effect.
(i.e., product term of the coefficients corresponding to pathways between the predictor-mediator- outcome variable).
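A sketch of the Sobel statistic with hypothetical path coefficients and standard errors (the a, b and se values below are invented for illustration):

```python
import math

# a  = coefficient for predictor -> mediator, with standard error se_a
# b  = coefficient for mediator -> outcome (controlling for the predictor),
#      with standard error se_b  (all values hypothetical)
a, se_a = 0.50, 0.10
b, se_b = -0.40, 0.12

# Sobel z for the indirect effect a*b
z = (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
# |z| > 1.96 would suggest the indirect (mediated) effect is significant
```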
What should you do if there is more than one mediator in your regression model?
A lot of people test these individually.
If you want to determine which one is the best mediator (based on the assumption that the two are correlated), then it is better to run them all in the same model.
What is a moderator in a regression model?
A moderator is a third variable which influences the nature or magnitude of the relationship between two other variables.
How do we test for moderation in a regression model?
By testing for a significant A x B interaction.
- We can obtain an interaction term simply by multiplying A and B’s scores together to give a product term.*
- We then conduct a hierarchical analysis. On Step 1, we enter the main effects (A) and (B), and then the product term is entered on Step 2.*
- The idea is to show how much additional variation can be explained, or whether the interaction term, shows anything above and beyond what is already explainable in terms of main effects.*
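These steps can be sketched with simulated data (the variables and effect sizes are made up):

```python
import numpy as np

# Simulated moderation: the slope of A on Y differs by group B.
rng = np.random.default_rng(1)
n = 300
A = rng.normal(0, 1, n)                          # e.g. a centred predictor
B = rng.integers(0, 2, n).astype(float)          # e.g. a 0/1 group variable
Y = 0.4 * A - 0.5 * A * B + rng.normal(0, 1, n)  # slope depends on B

def r_squared(y, predictors):
    # OLS R-squared with an intercept column
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coefs
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

r2_step1 = r_squared(Y, [A, B])             # Step 1: main effects only
r2_step2 = r_squared(Y, [A, B, A * B])      # Step 2: add the product term
r2_change = r2_step2 - r2_step1             # variance the interaction adds
```

The R-squared change at Step 2 is the additional variation explained by the interaction term beyond the main effects.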
What is a significant interaction in a regression?
A significant interaction means that the relationship between two variables, as expressed in the standardised slope coefficient (beta), is not consistent across the levels of the other factor.
- For example, the coefficient for Age might be 0.40 for females (i.e., increases in age increase predicted memory) and –0.10 for males (i.e., increases in age for males slightly decrease predicted memory function).*
- The main thing is that the two coefficients vary significantly.*
What is the best way to analyse a regression interaction?
- Explain how you would run a regression.*
- e.g. Memory as predicted by age comparing males and females*
Break it down into simple linear effects.
We select males only and then run a simple linear regression (Memory as predicted by age). This gives us an equation.
Then we do the same with females only.
We thus have two equations of the form Memory = Constant + b x Age.
By slotting in some made-up values for age (e.g., 20, 25, 30, up to 80), we can get predicted memory scores for males and females separately. We then plot these two functions, which gives a clear depiction of the nature of the interaction.
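A sketch of that final step with two hypothetical within-group equations (the constants and slopes here are invented):

```python
import numpy as np

# Two hypothetical within-group equations: Memory = constant + b * Age,
# fitted separately for females and males.
ages = np.arange(20, 81, 5, dtype=float)   # made-up ages 20, 25, ..., 80

memory_female = 60.0 + 0.40 * ages   # hypothetical female equation
memory_male = 75.0 - 0.10 * ages     # hypothetical male equation

# Plotting these two predicted lines (e.g. with matplotlib) depicts the
# interaction: the female line rises with age while the male line falls.
```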
What are the 4 assumptions of linear regression?
1. Homoscedasticity (The variance of residuals is the same for any value of X)
2. Normality (For any fixed value of X, Y is normally distributed)
3. Linearity (The relationship between X and the mean of Y is linear)
4. Independence or Non-serial error dependence (Observations are independent of each other)