Simple Regression Flashcards
Explain what the regression line can be used for
- Prediction
- Estimating the magnitude of effects of the predictor on the outcome.
Define the regression line
A straight line drawn through a scatterplot of two variables that comes as close to the data points as possible
Line of best fit
Method of least squares
The method used to find the regression line: choose the line that minimises the sum of squared differences (residuals) between the observed values of Y and the values predicted by the line
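A minimal Python sketch of the least-squares estimates, using made-up practice-hours and test-score data (the variable names and numbers are illustrative only):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # hypothetical hours of practice
y = np.array([3.1, 4.0, 5.2, 5.9, 7.1, 7.8])  # hypothetical test scores

# Least squares: slope = covariance(x, y) / variance(x); intercept = mean(y) - slope * mean(x)
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
print(a, b)  # intercept and slope of the line of best fit
```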
What is the intercept in regression analysis?
The point at which the regression line cuts the Y-axis; a in the regression equation (the predicted value of Y when X = 0)
E.g. with no practice at all on a test, the predicted score would be 2.45
Slope
Another name for the regression coefficient or b in the regression equation
The number of units that the regression line moves on the Y-axis for each unit it moves along the x-axis.
What is the linear regression equation and what can it be used for?
y = a + b * x
The value of Y is equal to a (intercept) plus b (slope) multiplied by the value of X for the given case
Use it to predict how a case with a given score on X will score on Y
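A quick sketch of using the equation for prediction; the slope here is an illustrative value, and the intercept matches the 2.45 example on the intercept card:

```python
a, b = 2.45, 0.9           # illustrative intercept and slope
x_new = 4                  # e.g. 4 hours of practice
y_pred = a + b * x_new     # predicted score on Y for this case
print(y_pred)              # 2.45 + 0.9 * 4 = 6.05
```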
What do you compare the line of best fit with when assessing the significance of the effects of the predictor on the outcome?
- A regression line that is flat
- A line based on the mean value of the outcome
- A line indicating that the value of Y is always the same regardless of changes in the value of X
What is the value of the regression coefficient when the regression line is flat?
0
A flat line is equivalent to a line based on the mean of Y, which treats the two variables as having no relationship: the predicted value of Y does not change as X changes.
Define what is meant by model sum of square (SSm)
The portion of the total variance in Y that the regression line accounts for
The difference between the total variance in Y scores and the variance in Y scores not accounted for by the regression line (the residual variance)
Obtain by calculating the difference between the mean of Y and each value of Y as predicted by the regression line, then square each difference and finally calculate the sum of all squared differences
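A sketch of that arithmetic with hypothetical predicted values and a hypothetical mean of Y:

```python
import numpy as np

y_hat = np.array([3.2, 4.1, 5.0, 5.9, 6.8, 7.7])  # hypothetical Y values predicted by the line
y_mean = 5.45                                     # hypothetical mean of the observed Y scores
ss_m = np.sum((y_hat - y_mean) ** 2)              # square each difference, then sum
print(ss_m)
```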
When performing a simple regression, what does the F-value in the ANOVA table show?
The ratio between the portion of total variance accounted for by the regression line and the variance not accounted for by the regression line.
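A sketch of the F-ratio calculation with illustrative sums of squares (one predictor, so the model has 1 degree of freedom and the residuals have n - 2):

```python
ss_m, ss_r, n = 15.8, 0.4, 6          # hypothetical model SS, residual SS, sample size
df_m, df_r = 1, n - 2                 # one predictor; n - 2 residual degrees of freedom
f_value = (ss_m / df_m) / (ss_r / df_r)
print(f_value)  # a large F means the line explains far more variance than it leaves unexplained
```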
R-square
- Also known as coefficient of determination
- The proportion of variance in Y explained by X
- The variance explained by the regression line divided by the total variance in Y to be explained.
- Proportion of total variance in Y explained by the regression line/model (SSm), relative to how much variation there was to explain in the first place (SSt)
- Correlation coefficient squared
- SSm / SSt
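A one-line sketch of that ratio with hypothetical sums of squares:

```python
ss_m, ss_t = 15.8, 16.2   # hypothetical model and total sums of squares
r_square = ss_m / ss_t
print(r_square)           # proportion of variance in Y explained by X (about 0.975)
```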
Adjusted r square
An adjusted measure of R square that corrects for the overestimation that occurs when R square is calculated in a sample.
A reduced value of R square that estimates how much variance the model would explain in the population.
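A sketch of the commonly used adjustment formula, which shrinks R square for sample size n and number of predictors k (the numbers passed in are illustrative only):

```python
def adjusted_r_square(r_square, n, k=1):
    """Shrink R square toward its estimated population value for n cases and k predictors."""
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)

print(adjusted_r_square(0.975, n=6, k=1))  # about 0.969 with these illustrative values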
Total sum of squares (SSt)
The total variability of the observed Y scores around their mean; equivalently, the sum of squared residuals from a line based on the mean of Y
Calculate by taking the difference between each actual value of Y and the mean of Y, then square each difference and sum them
Explain the difference between SSR, SSM and SST
SSR (Sum of squared residuals) - Variance in Y that is not explained by the regression line; it represents the degree of inaccuracy that remains when the best model is fitted to the data, and so indicates how well (or poorly) the linear model fits. Uses the differences between the observed data and the model.
SSM (Model sum of squares) - Variance in Y that is explained by the regression line. Uses the differences between the mean value of Y and the model.
SST (Total sum of squares) - Total variance in Y to be explained, i.e. the variability of the observed Y scores around their mean; SST = SSM + SSR. Uses the differences between the observed data and the mean value of Y
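A sketch of the decomposition SST = SSM + SSR with made-up data: the total variation around the mean splits into the part the line explains and the residual part it leaves unexplained.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # hypothetical predictor scores
y = np.array([3.1, 4.0, 5.2, 5.9, 7.1, 7.8])  # hypothetical outcome scores

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
y_hat = a + b * x                        # Y values predicted by the model

ss_t = np.sum((y - y.mean()) ** 2)       # observed data vs. mean of Y
ss_m = np.sum((y_hat - y.mean()) ** 2)   # model predictions vs. mean of Y
ss_r = np.sum((y - y_hat) ** 2)          # observed data vs. model predictions
print(ss_t, ss_m + ss_r)                 # the two values match: SST = SSM + SSR
```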
When performing a simple regression, what does the coefficients table tell you?
Provides further information about the magnitude of the effects of X on Y
Beta = the standardised regression coefficient
B (Constant) = value of the intercept
B (predictor variable) = value of the slope (regression coefficient)
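A sketch of where those values come from, with made-up data; the standardised coefficient rescales the slope into standard-deviation units (Beta = B * SDx / SDy), and with a single predictor it equals the correlation r:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # hypothetical predictor
y = np.array([3.1, 4.0, 5.2, 5.9, 7.1, 7.8])  # hypothetical outcome

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)  # B (variable): slope
a = y.mean() - b * x.mean()                                                # B (Constant): intercept
beta = b * x.std(ddof=1) / y.std(ddof=1)   # Beta: slope in standard-deviation units
print(a, b, beta)
```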