Week 9 (Simple Linear Regression II) Flashcards
R squared
The proportion of variance in the outcome variable accounted for by the predictor
F-ratio
The ratio of model variance to error variance; tests whether the regression model is significant overall.
Intercept
The value of the outcome variable when the predictor = 0
Slope
The rate of change in the outcome variable relative to the change in the predictor
Unstandardised beta (b)
The change in Y for a one unit change in X
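The cards above can be seen in a fitted model. A minimal sketch with hypothetical data (not from the course materials), using scipy to show the intercept (b0), slope (b1), and R squared:

```python
# Fitting a simple linear regression to illustrate b0, b1, and R squared.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # predictor (hypothetical data)
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # outcome (hypothetical data)

res = stats.linregress(x, y)
b0, b1 = res.intercept, res.slope
r_squared = res.rvalue ** 2               # proportion of variance explained

print(f"intercept b0 = {b0:.2f}")   # value of Y when X = 0
print(f"slope b1     = {b1:.2f}")   # change in Y per one-unit change in X
print(f"R squared    = {r_squared:.3f}")
```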
Standardised beta (β)
The standardised change in Y for one standard deviation change in X
-Another measure of the slope
-β0 (the intercept) is always 0
-β1 (the slope): as X increases by one standard deviation, Y changes by β1 standard deviations
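A sketch of the point above, with hypothetical data: after z-scoring both variables, the fitted slope is the standardised beta (equal to Pearson's r with one predictor) and the intercept is 0.

```python
# Standardised beta = slope of z-scored Y regressed on z-scored X.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

zx = (x - x.mean()) / x.std(ddof=1)       # z-score the predictor
zy = (y - y.mean()) / y.std(ddof=1)       # z-score the outcome

res = stats.linregress(zx, zy)
print(f"standardised beta = {res.slope:.3f}")     # equals Pearson's r here
print(f"intercept         = {res.intercept:.3f}") # 0 after standardising
```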
What does an output of Standardised Coefficients Beta mean (e.g., -.344)?
In this example: as X increases by one standard deviation, Y decreases by .344 of a standard deviation.
Look at standard deviation readings to see what this means in context
When to use standardised OR unstandardised beta
Unstandardised b
-When you want coefficients to refer to meaningful units
-When you want a regression equation to predict values of Y
Standardised β (independent of units)
-When you want an effect size measure, e.g., a small/medium/large β is equivalent to a small/medium/large r (.1/.3/.5)
-When you want to compare the strength of relationship between the predictor and the outcome (across predictors measured in different ways)
Covariance
-The extent to which variables co-vary (change together)
-High covariance: means there is a large overlap between the patterns of change (variance) observed in each variable.
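A minimal sketch with hypothetical data: numpy's `cov` returns the variance-covariance matrix, with each variable's variance on the diagonal and the covariance off the diagonal.

```python
# Covariance: the extent to which two variables change together.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

cov_matrix = np.cov(x, y)     # 2x2 matrix: variances on the diagonal
cov_xy = cov_matrix[0, 1]     # covariance of x and y
print(f"cov(x, y) = {cov_xy:.2f}")
```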
Outliers' influence on regression
-Affect the model’s ability to predict all cases
How influential an outlier is depends on
-Distance between Yobs and Ypred (Residual)
-Leverage (unusual value on predictor)
Cases with standardised residuals or predictor values beyond ±3.29 (p < .001) have the potential to be influential outliers
How to check for outliers
Standardised residuals (Std. Residual) should have no extreme values (all smaller than 3.29 in absolute value)
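A minimal sketch with hypothetical data (not the course's SPSS output): compute residuals from a fitted line, standardise them simply (dividing by their sample SD, rather than SPSS's leverage-adjusted formula), and compare each case to the ±3.29 cutoff.

```python
# Flagging potential outliers via standardised residuals.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # hypothetical data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 30.0])  # last case looks extreme

res = stats.linregress(x, y)
residuals = y - (res.intercept + res.slope * x)   # Yobs - Ypred
std_resid = residuals / residuals.std(ddof=1)     # simple standardisation

for i, z in enumerate(std_resid):
    flag = "potential outlier" if abs(z) > 3.29 else "ok"
    print(f"case {i}: z = {z:+.2f} ({flag})")
```

Note that the extreme case still shows the largest standardised residual even though it has pulled the fitted line toward itself.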
Assumptions of linear regression
- Linearity
-The outcome (continuous variable) is linearly related to the predictors
- Independence
-Observations are randomly and independently chosen from the population
- Normality of residuals
-The residuals are normally distributed
- Homogeneity of variance (homoscedasticity)
-The variability of the residuals is the same at all levels of the predictors
Independence assumption (linear regression)
-This assumption means that the residuals in your model are not related to each other.
-The residuals are NOT independent of each other in cases such as repeated observations on the same subject, or observations from related subjects (e.g., twins, students in the same class)
-If this assumption is violated then the model standard errors (SEs) will be invalid, as will the confidence intervals (CIs) and significance tests based upon them
-Ensure independent sampling in your design - subjects randomly and independently chosen from the population.
Residual plot
X = ZPRED (standardised predicted value)
Y= ZRESID (standardised residual)
Normality assumption (Linear regression)
-The residuals (not the IV or DV) should be normally distributed; this can be checked using a histogram and a normal probability plot
-In small samples a lack of normality invalidates confidence intervals and significance tests, whereas in large samples it does not, because of the central limit theorem
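Beyond the plots named above, a sketch of a formal check (hypothetical simulated data, not part of the course's procedure): a Shapiro-Wilk test applied to the residuals of a fitted regression.

```python
# Checking normality of the residuals (not the raw variables).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=100)                 # simulated predictor
y = 2.0 * x + rng.normal(size=100)       # residuals normal by construction

res = stats.linregress(x, y)
residuals = y - (res.intercept + res.slope * x)

stat, p = stats.shapiro(residuals)
print(f"Shapiro-Wilk p = {p:.3f}")   # large p: no evidence of non-normality
```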