Linear regression Flashcards
In linear regression, what is the notation used to represent the intercept and the slope (regression coefficient), respectively (based on sample data)
a and b
In a regression equation, what does Yi denote?
Observed scores on the DV
In a regression equation, what does Xi denote?
Observed scores on the IV
In a regression equation, what does Y(hat) denote?
Predicted scores on the DV
In a regression equation, what does ei denote?
Residual scores in the regression model (ie difference between observed and predicted scores)
In regression analysis, what does OLS stand for?
Ordinary Least Squares
Ordinary Least Square (OLS) estimates are biased T/F
FALSE
They are unbiased
In a regression equation, what does p denote?
Number of partial regression coefficients
this applies in multiple regression analysis, where you have multiple IVs
Ordinary Least Square (OLS) estimates are very efficient T/F
TRUE
What metric do we use to calculate the strength of prediction of our overall regression model?
R2
What does r2 actually tell us
The proportion of the variance accounted for by our regression model
What is the range of possible values for r squared?
0-1
To calculate the confidence interval on r squared, you need the upper and lower degrees of freedom associated with the F statistic, T/F?
TRUE
R squared is biased and consistent, T/F
TRUE
The bias means you need to get the adjusted R squared too
If you have lots of IVs and a small sample size, what should you do to that r2
Adjust it
So you want to compare the strength of two partial regression coefficients. What are your two options?
- STANDARDISE IT
2. Use a SEMI PARTIAL CORRELATION
How can you tell you are looking at an R output containing standardised regression coefficients?
There will be no intercept presented
When looking at standardised regression coefficients, what is the unit they are expressed in?
SDs
What are you looking at when you’re looking at (squared) semi partial correlation?
The proportion of the variance in the DV explained by the IV you have isolated, assuming all other IVs are held constant
‘This IV uniquely accounts for x % of variation in the DV’
When commenting on the proportion of the variance that a given IV accounts for in the DV, should you used semipartial correlation or SQUARED semipartial correlation?
SQUARED!
What are the four statistical assumptions underlying the linear regression model?
- Independence of scores
- Linearity (of what?)
- Homoscedasticity (constant variance of residuals)
- Normality of residual scores
In the context of assumptions for regression models… what does LINEARITY refer to
The assumption that scores on the DV are a linear function of scores on the IV
In the context of assumptions for regression models… what does HOMOSCEDASTICITY refer to
It means…
The variance of the residual scores… is the same for any score on each dependent variable
In the context of regression analysis, how do we check for LINEARITY …
and what are we looking for when we do the looking
- Scatterplot matrix of… ALL DVs and IVs
- Scatterplot matrix of… residual scores and observed IVs AND residual scores and predicted IVs
- Marginal model plots of… scores on DV and observed IVs AND predicted IV
We are looking for straight lines