ANOVA Flashcards
What is the regression error sum of squares (SSE)?
SSE (error sum of squares) is the variation in Y due to factors other than the relationship between X and Y.
What is the regression sum of squares (SSR)?
SSR (regression sum of squares) is the variation in Y that is explained by the relationship between X and Y.
How is the SST value calculated?
SST is the total sum of squares. It is determined by adding SSR and SSE, which gives a complete measure of the variation of the Yi values around their mean (Ȳ).
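The identity SST = SSR + SSE can be checked numerically. The sketch below is only an illustration: it assumes the line is fitted by ordinary least squares (the identity depends on that), and the helper name `fit_line` and the data set are made up for this example.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept (b0) and slope (b1) for one X variable."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data, not taken from the flashcards
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]  # predicted values

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
```

For this data the fitted line is Ŷ = 2.2 + 0.6X, and the two printed totals should agree up to floating-point rounding.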
What does the correlation coefficient (r) measure?
The correlation coefficient measures the strength and direction of the linear relationship between X and Y.
What is the coefficient of determination (r²)?
The coefficient of determination is the regression sum of squares divided by the total sum of squares.
r² = SSR/SST
What is the difference between the coefficient of determination and the correlation coefficient?
In simple linear regression, the correlation coefficient (r) is multiplied by itself to get the coefficient of determination (r²).
The coefficient of determination (r²) is the proportion of the variation in Y that is explained by the X variable(s) in the model. This value is always between 0 and 1.
The correlation coefficient (r) measures the strength and direction of the linear relationship between two variables, i.e. X and Y. It can fall between -1 and 1.
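A small numerical check can make the link between r and r² concrete. The sketch below assumes a simple linear regression with one X variable fitted by least squares; the data values and variable names are hypothetical.

```python
import math

# Hypothetical data, not taken from the flashcards
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)  # this is also SST

# Correlation coefficient: can be negative, always within [-1, 1]
r = sxy / math.sqrt(sxx * syy)

# Coefficient of determination from the sums of squares of the fitted line
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
r_squared = ssr / syy

print(f"r = {r:.4f}")
print(f"r * r = {r * r:.4f}")
print(f"SSR / SST = {r_squared:.4f}")
```

The last two printed values should match. The sign of r follows the sign of the slope, while r² carries no direction.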
What is another term for the concept of “Residual”?
Estimated error value.
What is the equation to calculate residuals (estimated error value) for any point in a regression model?
ei = Yi - Ŷi
Where Yi is the observed value and Ŷi is the predicted value.
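As a quick illustration of ei = Yi - Ŷi, the sketch below computes one residual per data point. The observed and predicted values are hypothetical; the predictions are assumed to come from an already fitted least-squares line.

```python
# Hypothetical observed values (Yi) and predicted values (Ŷi)
observed = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

# Residual for each point: observed minus predicted
residuals = [yi - yh for yi, yh in zip(observed, predicted)]

for yi, yh, ei in zip(observed, predicted, residuals):
    print(f"Y = {yi}, predicted = {yh:.1f}, residual = {ei:+.1f}")
```

A positive residual means the point lies above the fitted line and a negative one means it lies below. For a least-squares line with an intercept, the residuals sum to zero up to rounding.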
What table is displayed in the image? What is it for?
ANOVA table.
It is used to summarize the regression terms for the model being evaluated, especially to determine the F test statistic and p-value for hypothesis testing against a regression model.
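Because the image is not reproduced here, the sketch below shows one way the usual ANOVA quantities could be assembled for a simple linear regression. The function name `anova_table`, the dictionary layout, and the input numbers are assumptions for illustration. The p-value would come from the F distribution with (1, n - 2) degrees of freedom and is left out to keep the example free of third-party libraries.

```python
def anova_table(ssr, sse, n):
    """ANOVA quantities for a simple linear regression (one X variable)."""
    df_regression = 1          # one predictor
    df_error = n - 2           # n observations minus two estimated coefficients
    msr = ssr / df_regression  # mean square regression
    mse = sse / df_error       # mean square error
    return {
        "Regression": {"df": df_regression, "SS": ssr, "MS": msr},
        "Error": {"df": df_error, "SS": sse, "MS": mse},
        "Total": {"df": n - 1, "SS": ssr + sse},
        "F": msr / mse,        # F test statistic
    }

# Hypothetical sums of squares from a fit on n = 5 observations
table = anova_table(ssr=3.6, sse=2.4, n=5)
for source in ("Regression", "Error", "Total"):
    print(source, table[source])
print("F =", round(table["F"], 4))
```

A large F value relative to the F distribution suggests that the slope is not zero, which is the hypothesis the table is typically used to test.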
For a straight line probabilistic model, there are 6 possible relationship types. What are they?
(Hint: PL, NL, PC, NC, UC, NA)
Positive linear
Negative linear
Positive Curvilinear
Negative curvilinear
U-shaped Curvilinear
No relationship
What part of the simple linear regression model is the deterministic component?
Y = β0 + β1X + ε is the simple linear regression model.
E(Y) = β0 + β1X is the deterministic component of the regression model.
What is the model equation for a simple linear regression model?
Y = β0 + β1X + ε
Where Y = dependent variable (aka response variable)
β0 = y-intercept for the population
β1 = slope for the population
X = independent variable (aka predictor of Y)
ε = random error component
Reminder: without the ε component, the equation E(Y) = β0 + β1X is the deterministic component of the regression model.
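To see the deterministic and random parts side by side, the sketch below simulates a few observations from the model. The parameter values, the seed, and the choice of normally distributed errors with mean 0 are assumptions made only for this illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population parameters
beta0 = 2.0   # y-intercept
beta1 = 0.5   # slope
sigma = 1.0   # standard deviation of the random error (assumed normal)

def deterministic(x):
    """Deterministic component E(Y) = beta0 + beta1 * X."""
    return beta0 + beta1 * x

xs = [1, 2, 3, 4, 5]
errors = [random.gauss(0, sigma) for _ in xs]           # random error component
ys = [deterministic(x) + e for x, e in zip(xs, errors)]  # full model: Y = E(Y) + error

for x, e, y in zip(xs, errors, ys):
    print(f"X = {x}: E(Y) = {deterministic(x):.2f}, error = {e:+.2f}, Y = {y:.2f}")
```

Each simulated Y differs from the line E(Y) only by its random error term, which is the sense in which the line is the deterministic component.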
What are three measures of variation to observe when working with linear regression models?
SSR (regression sum of squares)
SSE (error sum of squares)
SST (total sum of squares)
How do we calculate the SST (total sum of squares)?
This is the total sum of squares.
∑(Yi - Ȳ)²
This reads as the sum of the squared differences between each observed Y and the average of Y.
How do we calculate the SSE (error sum of squares)?
This is the unexplained variation or error sum of squares.
∑(Yi - Ŷi)²
This reads as the sum of the squared differences between each observed Y and its predicted Y.
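The two formulas translate almost word for word into code. The sketch below uses hypothetical function names and a tiny made-up data set that is easy to check by hand; the predicted values are simply assumed, not fitted.

```python
def total_sum_of_squares(y):
    """SST: squared differences between each observed Y and the mean of Y, summed."""
    y_bar = sum(y) / len(y)
    return sum((yi - y_bar) ** 2 for yi in y)

def error_sum_of_squares(y, y_hat):
    """SSE: squared differences between each observed Y and its predicted Y, summed."""
    return sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

# Hypothetical observed and predicted values
y = [3, 5, 7]
y_hat = [2.5, 5.5, 6.5]

sst = total_sum_of_squares(y)         # mean is 5, so 4 + 0 + 4
sse = error_sum_of_squares(y, y_hat)  # 0.25 + 0.25 + 0.25

print("SST =", sst)  # SST = 8.0
print("SSE =", sse)  # SSE = 0.75
```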