Multiple Linear Regression (MLR) Flashcards
How can we detect variables that co-vary?
scatter plots; be cautious about inferring cause or effect; always think of a potential third variable
statistical analysis of covariation CAN'T distinguish between spurious, causal, and common-process explanations
Covariance doesn’t tell us about _________. So always use _________________
independence; scatter plots
Covariance is _______________ sensitive, so we use the _________________ values in place of the raw data. This makes all scales have mean _____ and standard deviation _____, and the result is called _________________________________
size; standardised; 0; 1; Pearson's product-moment correlation coefficient
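A minimal sketch in Python (the data are invented, not from the cards) of how standardising both variables turns covariance into Pearson's r:

```python
import numpy as np

# Standardise both variables to mean 0, sd 1, then average the products
# of the z-scores -- that average is Pearson's product-moment r.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

zx = (x - x.mean()) / x.std()   # standardised: mean 0, sd 1
zy = (y - y.mean()) / y.std()
r = np.mean(zx * zy)            # Pearson's r
print(r, r**2)                  # r and the effect size r^2
# cross-check: np.corrcoef(x, y)[0, 1] gives the same r
```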
What is effect size?
a quantitative measure that allows comparisons between studies; it is given by r squared
aka coefficient of determination
r squared is the proportion of variance that one variable explains in another
How do we find the slope of the regression line?
find the slope that gives the minimum error variance: the least squares approach to regression
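A rough illustration (invented data) of the least-squares slope, using the standard result that the error-minimising slope is cov(X, Y) / var(X):

```python
import numpy as np

# Least-squares fit: the slope that minimises the error variance.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])

b = np.cov(x, y)[0, 1] / np.var(x, ddof=1)  # slope = cov(x, y) / var(x)
a = y.mean() - b * x.mean()                 # intercept from the means
print(a, b)                                 # agrees with np.polyfit(x, y, 1)
```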
What is partial correlation?
the amount of variance in X3 that is related to X1 and X1 alone
‘How much of the X3 variance that is not explained by other variables is explained by X1’
‘partial out’ the variability explained by X2
look at how much remaining variability in X3 is explained by knowing the remaining variability in X1
take out the variability using simple linear regression
a pure measure uncontaminated by other variables
uniqueness of variance makes the interpretation theoretically simpler
reveal hidden relationships
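As a sketch of the idea (numpy-based, not from the cards), partial correlation can be computed by correlating the residuals left after regressing both X1 and X3 on X2:

```python
import numpy as np

def partial_corr(x1, x3, x2):
    """r(13.2): correlate what remains of X1 and of X3 after simple
    linear regression has 'partialled out' the variability due to X2."""
    def resid(target, predictor):
        coeffs = np.polyfit(predictor, target, 1)      # target = a + b*predictor
        return target - np.polyval(coeffs, predictor)  # keep the unexplained part
    return np.corrcoef(resid(x1, x2), resid(x3, x2))[0, 1]
```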
What is semipartial correlation?
how much of the total variance of X3 does X1 and X1 alone explain
a more intuitive baseline
allows easy comparison of coefficients because X3 is constant
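A companion sketch under the same assumptions as above: for the semipartial correlation only X1 is residualised, while X3 keeps its total variance:

```python
import numpy as np

def semipartial_corr(x1, x3, x2):
    """Semipartial correlation: only X1 is residualised on X2; X3 keeps
    its total variance, so every predictor shares the same baseline and
    coefficients are easy to compare."""
    coeffs = np.polyfit(x2, x1, 1)
    x1_resid = x1 - np.polyval(coeffs, x2)  # X1 with X2's share removed
    return np.corrcoef(x1_resid, x3)[0, 1]
```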
Give an example of a misleading schematic illustrating partial correlation
- No correlation (r=0) between desirability (X1) and frequency of buying (X3)
- Positive correlation (r=0.5, r^2=0.25) between amount of pocket money (X2) and buying frequency (X3)
- Negative correlation (r=-0.4, r^2=0.16) between desirability (X1) and pocket money (X2)
- from the Venn diagram, the partial correlation r13.2 should be zero
- but calculating it:
r13.2 = (r13 - r12*r32) / sqrt((1 - r12^2)(1 - r32^2)) = (0 - (-0.4)(0.5)) / sqrt((1 - 0.16)(1 - 0.25)) = 0.252
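A quick numerical check of the card's calculation:

```python
import numpy as np

# Plug the card's correlations into the partial-correlation formula.
r13, r12, r32 = 0.0, -0.4, 0.5
r13_2 = (r13 - r12 * r32) / np.sqrt((1 - r12**2) * (1 - r32**2))
print(round(r13_2, 3))   # 0.252 -- non-zero even though r13 = 0
```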
partial correlation uncovers hidden relationships when ____________________________
two components / factors influence a third in opposite ways
How can ANOVA be regarded as regression?
In ANOVA, looking to see if ‘knowing’ the level of the factor (IV) explains variability of the DV
In regression, looking to see if ‘knowing’ the score of the IV explains var of DV
ANOVA talks about a level mean, e.g., 3 different levels of coffee consumption (1 cup, 2 cups, ...); to do regression, you take a measure of how much coffee each person had, and now you have a continuous variable
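A toy illustration (hypothetical coffee data): the regression line's predictions at each level play the role of ANOVA's level means:

```python
import numpy as np

# ANOVA would compare the means of the 1/2/3-cup levels; regression fits
# a line to the same measurements treated as a continuous IV.
cups = np.array([1, 1, 2, 2, 3, 3], dtype=float)   # IV: cups of coffee
score = np.array([4.0, 5.0, 6.5, 7.0, 8.5, 9.5])   # DV

b, a = np.polyfit(cups, score, 1)                  # slope, intercept
print(a + b * np.array([1.0, 2.0, 3.0]))           # the line's predicted level means
```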
Why can ANCOVA be seen as a kind of partial correlation?
In ANOVA we have a DV and a factor; the factor explains some variance of the DV, and the unexplained variance is the SSerror
The idea of a covariate is to 'remove' unwanted variance in the DV: some DV variance is attributed to the CV and hence removed from the SSerror
The covariate is the X2 here
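A minimal sketch (invented data) of the "remove the covariate's share" idea:

```python
import numpy as np

# Regress the DV on the covariate first -- exactly like partialling out
# X2 -- and keep the residual DV variance for the ANOVA step.
dv = np.array([4.0, 5.5, 6.0, 7.5, 8.0, 9.5])
cv = np.array([1.0, 2.0, 2.5, 3.5, 4.0, 5.0])   # covariate, the 'X2'

coeffs = np.polyfit(cv, dv, 1)
dv_adj = dv - np.polyval(coeffs, cv)            # DV variance not due to CV
print(np.var(dv_adj) < np.var(dv))              # True: error variance has shrunk
```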
Multiple linear regression describes the relationship between __________ variables.
Have __________ predictors (X1, X2, ...) and __________ DV (Y).
Effect size (R^2) is the proportion of variance of Y (the DV) explained by knowing __________
multiple variables
2 or more predictors, 1 DV
all of the predictors
Multiple linear regression
with 1 predictor, describe data as _______
with 2 predictors, describe data as ___________
use ___________ method with k predictors
1 predictor, describe data as a line
2 predictors, describe data as a surface
use least squares method with k predictors
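A short numpy sketch (coefficients invented for the demo) of least squares with k predictors via the design-matrix form:

```python
import numpy as np

# The same least-squares machinery fits a line (k=1), a surface (k=2),
# or any k predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))                    # two predictors X1, X2
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=30)

A = np.column_stack([np.ones(len(y)), X])       # prepend an intercept column
betas, *_ = np.linalg.lstsq(A, y, rcond=None)
print(betas)                                    # ~ [1.0, 2.0, -0.5]
```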
Curvilinear regression
with a single predictor, __________________
with a single predictor, you get a restricted range of curves
Curvilinear regression
MLR allows for multiple predictors
e.g., can use X^2, X and a constant
same predictor (X) and X^2. In MLR you are fitting your data from a _________ with a _____________, but your predictors might not be independent; here you are fitting your data with a curve
fitting your data from a plane with a line, but your predictors might not be independent
What is polynomial regression
- you fit equations with increasing powers of X
- each increase in the order adds a point of inflection (bend) to the curve
- as the number of predictors increases, the fit will get better
- caution of overfitting
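A sketch of that caution (synthetic data): R^2 keeps rising with polynomial order even after the true order is passed:

```python
import numpy as np

# Fit increasing polynomial orders to the same data; the true order is 2,
# yet R^2 keeps creeping up -- hence the warning about overfitting.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 20)
y = 2 * x - 3 * x**2 + rng.normal(scale=0.05, size=20)

for order in range(1, 6):
    resid = y - np.polyval(np.polyfit(x, y, order), x)
    r2 = 1 - resid.var() / y.var()
    print(order, round(r2, 4))
```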
What is overfitting? Why not use the Lagrange polynomial?
1. When the number of betas is the same as the number of data points, the polynomial fit will be exact (when all the X's are different): it goes through every data point with R^2 = 1
2. The Lagrange polynomial fits the noise:
we get extremely small or large values between the data points
When does overfitting occur?
- when regression is fitting noise rather than the underlying trends
How to avoid overfitting
- build up the model step-by-step (Y = a + bX -> Y = a + bX + cX^2, etc.)
- each additional term (beta) costs a degree of freedom
- so we can ask if the additional term explains a significant amount of 'extra' variance: calculate Fchange = (change in SS) / MSerror
- Plot R^2 against polynomial order; stop when the extra term is no longer significant (p > 0.05, or another chosen significance level)
- Can get different results if you add several variables at once (several non-significant improvements can, overall, indicate a significant improvement)
- Use change in R^2 (Fchange) as a way of evaluating extra predictors
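A hedged sketch of the Fchange computation (numpy/scipy; the function name and interface are my own, not from the cards):

```python
import numpy as np
from scipy import stats

def f_change(y, X_small, X_big):
    """F for the extra predictor(s): (drop in SSerror per extra df)
    divided by the MSerror of the bigger model."""
    def fit_ss(X):
        A = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        return np.sum((y - A @ beta) ** 2), A.shape[1]
    ss_s, k_s = fit_ss(X_small)
    ss_b, k_b = fit_ss(X_big)
    df1, df2 = k_b - k_s, len(y) - k_b
    F = ((ss_s - ss_b) / df1) / (ss_b / df2)
    return F, stats.f.sf(F, df1, df2)   # F-change and its p-value
```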
Compression using regression
Number of ___________ is typically much less than the number of _______________
Number of parameters is typically much less than the number of data points
What is the disadvantage of regression?
compression of the data leads to loss of information (variation) in the data
which equation?
how many parameters?
What are the assumptions of regression?
- uses a pooled estimate of error variance
assumes errors are independently and identically distributed (iid) across the predictor variables - need to check the error deviations
look at plot of ‘residuals’ against the values of the different X’s
a sign of a good model: residuals should look random
Residual = distance from regression surface to the data point
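A quick residual-check sketch with invented data (matplotlib used for the plot):

```python
import numpy as np
import matplotlib.pyplot as plt

# Plot residuals against X and look for structure: a shapeless cloud
# around zero supports the iid assumption.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 50)
y = 3.0 + 0.5 * x + rng.normal(scale=0.3, size=50)

residuals = y - np.polyval(np.polyfit(x, y, 1), x)  # data minus fitted line

plt.scatter(x, residuals)
plt.axhline(0.0, linestyle="--")
plt.xlabel("X")
plt.ylabel("residual")
plt.show()
```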
How to report regression
reporting regression analysis is all about the coefficients
so a standardised slope might be useful
the beta values (annoyingly subtle name change) give standardised slopes
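A final sketch (my own helper, not a standard API): z-score the predictors and the DV, refit, and the slopes become the standardised betas:

```python
import numpy as np

def standardised_betas(X, y):
    """Beta (standardised) slopes: z-score predictors and DV, then refit.
    The slopes are now comparable across predictors on different scales."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    zy = (y - y.mean()) / y.std()
    betas, *_ = np.linalg.lstsq(Z, zy, rcond=None)  # intercept is 0 after z-scoring
    return betas
```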