Ch. 17 Flashcards
Regression
A method that predicts values of one numerical variable from values of another numerical variable
Difference between regression and correlation
Correlation measures the strength of association between two variables, which is reflected in the scatter of the data
Regression fits a line through the data to predict one variable from another and to measure how steeply one variable changes with changes in the other
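A minimal sketch of the distinction, using made-up data and a hypothetical helper named slope_and_r (neither is from the chapter). Multiplying every Y by 10 makes the line ten times steeper but leaves the correlation coefficient unchanged, because the scatter around the line is relatively the same.

```python
import math

def slope_and_r(x, y):
    """Return (regression slope of Y on X, correlation coefficient r)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    # Sum of products and sums of squares of deviations from the means.
    sp_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    slope = sp_xy / ss_x
    r = sp_xy / math.sqrt(ss_x * ss_y)
    return slope, r

# Illustrative data only (an assumption, not from the chapter).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_scaled = [10 * yi for yi in y]

slope_1, r_1 = slope_and_r(x, y)
slope_2, r_2 = slope_and_r(x, y_scaled)
print(slope_1, slope_2)  # the slope changes with the units of Y
print(r_1, r_2)          # the correlation does not
```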
Linear regression
Most common regression
Assumes a linear relationship between variables
Least-squares regression line
The line for which the sum of the squared deviations in Y (observed Y minus predicted Y) is smallest
Slope (what is it?)
The slope of a linear regression is the rate of change in Y per unit X
Represented by b (the sample estimate); the population slope is β (beta)
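A short sketch of how the least-squares slope b and intercept a can be computed from data. The function name least_squares_line and the data values are assumptions made for illustration. It uses the standard formulas: b is the sum of products of deviations divided by the sum of squared deviations in X, and a is chosen so the line passes through the point (mean X, mean Y).

```python
def least_squares_line(x, y):
    """Return (a, b): intercept and slope of the least-squares line Y = a + bX."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of products of deviations, and sum of squared deviations in X.
    sp_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b = sp_xy / ss_x        # slope: change in Y per unit change in X
    a = y_bar - b * x_bar   # intercept: forces the line through (x_bar, y_bar)
    return a, b

# Illustrative data only (an assumption, not from the chapter).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = least_squares_line(x, y)

# Y-hat: the predicted value of Y at a chosen X inside the data range.
y_hat_at_3 = a + b * 3
print(a, b, y_hat_at_3)
```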
What is “Y-hat”?
The predicted value of Y on the regression line for a given value of X
What do predicted values of Y tell you?
They estimate the mean value of Y for all individuals having that value of X
Residual
Observed value minus predicted value
MSresiduals
The residual mean square: gives the variance of the residuals (sum of squared residuals divided by n-2)
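A sketch of the residuals and MSresiduals for a small made-up data set (the data and variable names are assumptions for illustration). Each residual is observed Y minus predicted Y, and MSresiduals is the sum of squared residuals divided by n - 2.

```python
# Illustrative data only (an assumption, not from the chapter).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# Fit the least-squares line Y = a + bX.
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_x = sum((xi - x_bar) ** 2 for xi in x)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
a = y_bar - b * x_bar

# Predicted values and residuals (observed minus predicted).
y_hat = [a + b * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

# Residual mean square: SSresiduals divided by its degrees of freedom, n - 2.
ss_residuals = sum(r ** 2 for r in residuals)
ms_residuals = ss_residuals / (n - 2)
print(residuals)
print(ms_residuals)
```

The residuals of a least-squares line always sum to zero (up to rounding), which is a quick check on the arithmetic.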
Confidence bands
95% Confidence bands measure the precision of the predicted MEAN Y for each value of X
Prediction intervals
Measure the precision of the predicted SINGLE Y-values for each X (usually 95%)
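A sketch contrasting the two kinds of interval, using the standard textbook standard-error formulas. The function name interval_half_widths, the data, and the MSresiduals value are assumptions for illustration; the critical value 3.182 is the two-sided 5% t value for 3 degrees of freedom, taken from a t table. The prediction interval is always wider because it adds the variability of a single individual to the uncertainty in the mean.

```python
import math

def interval_half_widths(x, ms_residuals, x0, t_crit):
    """Return (confidence-band half-width, prediction-interval half-width) at X = x0."""
    n = len(x)
    x_bar = sum(x) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    # Uncertainty in the predicted MEAN Y grows as x0 moves away from x_bar.
    mean_term = 1 / n + (x0 - x_bar) ** 2 / ss_x
    se_mean = math.sqrt(ms_residuals * mean_term)
    # A SINGLE Y-value also carries the residual variance itself (the extra 1).
    se_single = math.sqrt(ms_residuals * (1 + mean_term))
    return t_crit * se_mean, t_crit * se_single

# Illustrative inputs only (assumptions, not from the chapter).
x = [1, 2, 3, 4, 5]
ms_residuals = 0.8
t_crit = 3.182  # two-sided 5% critical t for df = n - 2 = 3

band_mid, pred_mid = interval_half_widths(x, ms_residuals, 3, t_crit)
band_edge, pred_edge = interval_half_widths(x, ms_residuals, 5, t_crit)
print(band_mid, pred_mid)
print(band_edge, pred_edge)
```

Both intervals are narrowest at the mean of X and widen toward the edges of the data.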
Extrapolation
The prediction of the value of a response variable outside the range of X-values in the data
Why is extrapolation a bad idea?
There is no way to know whether the relationship between X and Y holds beyond the range of the data, so extrapolated predictions may be unreliable.
Degrees of Freedom for Regression?
n-2 (because two parameters, the slope and the intercept, are estimated from the data)
When can ANOVA be used in place of the t-test?
When the test is two-sided and the null hypothesized slope is ZERO
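A sketch of why the two tests agree in that case, using made-up data (the data and variable names are assumptions for illustration). With a null slope of zero, the ANOVA F-ratio (MSregression / MSresiduals, with 1 and n - 2 degrees of freedom) equals the square of the t-statistic (b divided by its standard error, with n - 2 degrees of freedom).

```python
import math

# Illustrative data only (an assumption, not from the chapter).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# Fit the least-squares line Y = a + bX.
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_x = sum((xi - x_bar) ** 2 for xi in x)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
a = y_bar - b * x_bar
y_hat = [a + b * xi for xi in x]

# Residual mean square, df = n - 2.
ms_residuals = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat)) / (n - 2)

# t-test of H0: slope = 0.
se_b = math.sqrt(ms_residuals / ss_x)
t_stat = (b - 0) / se_b

# ANOVA: regression mean square has 1 degree of freedom.
ms_regression = sum((yh - y_bar) ** 2 for yh in y_hat) / 1
f_ratio = ms_regression / ms_residuals

print(t_stat ** 2, f_ratio)  # the two values match
```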