4.3 residuals and outliers Flashcards
importance of avoiding extrapolation
the range of the x-values in the data determines whether the least-squares regression line should be used for a prediction, such as predicting selling price
never try predicting the y-value for an x-value that is outside the range of data
this is because we don’t know whether the relationship stays linear outside the given range
extrapolation
making predictions for x-values outside the range of the data; leads to unreliable predictions
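A minimal sketch of guarding against extrapolation, assuming made-up house data (size in hundreds of square feet, selling price in thousands of dollars). The helper names `fit_line` and `predict_within_range` are hypothetical and only illustrate the idea.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict_within_range(xs, ys, x_new):
    """Predict y-hat only when x_new lies inside the observed x-range."""
    if not (min(xs) <= x_new <= max(xs)):
        # Outside the data we cannot tell whether the relationship is still linear.
        raise ValueError(f"x = {x_new} is outside the data range (extrapolation)")
    slope, intercept = fit_line(xs, ys)
    return intercept + slope * x_new


# Made-up illustration data, not taken from any real listing.
sizes = [10, 12, 15, 18, 20]
prices = [150, 170, 200, 230, 250]

print(predict_within_range(sizes, prices, 16))  # inside the range 10 to 20: allowed
```

Asking for a size of 40 would raise `ValueError`, because 40 is beyond the largest observed size.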
residual (y-yhat)
the observed y-value of a point minus the y-value predicted by the least-squares regression line (y-hat) at the same x
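A small sketch of computing residuals as observed y minus predicted y, using made-up data. `fit_line` is a hypothetical helper written for this illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Made-up illustration data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)

# residual = y - yhat for each point
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

for x, r in zip(xs, residuals):
    print(x, round(r, 2))
```

A positive residual means the point lies above the line and a negative residual means it lies below. For a least-squares line the residuals sum to zero, up to rounding.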
residual plot (x, y-yhat)
a plot in which the residuals are plotted against the values of the explanatory variable x
when the residual plot shows a curve or any other pattern
don't use the least-squares regression line, because a pattern in the residuals means a linear model does not fit the relationship well
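A sketch of how a curved relationship shows up in the residuals, assuming made-up data that follow y = x squared. `fit_line` is a hypothetical helper, and the sign listing stands in for a real residual plot.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Made-up curved data: y = x^2, which is not a linear relationship.
xs = [1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Points of the residual plot are the pairs (x, y - yhat).
for x, r in zip(xs, residuals):
    sign = "+" if r > 0 else "-"
    print(x, sign, round(r, 2))
```

With this data the residuals are positive at both ends and negative in the middle, a U-shaped pattern. Residuals from a well-fitting line would instead scatter around zero with no visible pattern.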
influential point
an outlier that causes a big shift in the position of the least-squares regression line when it is included in the data
when influential points are present
it's good practice to show the least-squares regression lines both with and without the influential points
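A sketch of fitting the line both with and without a suspected influential point, using made-up data in which the last point sits far from the others. `fit_line` is a hypothetical helper for this illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Made-up data: the first five points follow y = 2x, the last one does not.
xs = [1, 2, 3, 4, 5, 20]
ys = [2, 4, 6, 8, 10, 5]

slope_with, intercept_with = fit_line(xs, ys)
slope_without, intercept_without = fit_line(xs[:-1], ys[:-1])

print("with point:   ", round(slope_with, 3), round(intercept_with, 3))
print("without point:", round(slope_without, 3), round(intercept_without, 3))
```

Here the slope is 2 without the last point and close to 0 with it, so that single point is influential. Reporting both lines lets a reader see how much the conclusion depends on it.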