Stats 1045 Vocab 4 Flashcards
Ecological Correlations
Correlations based on rates or averages, often used in political science or sociology, that tend to overstate the strength of an association.
Correlation Coefficient
A pure number, without units, that is not affected by changes of scale.
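A minimal sketch of why the correlation coefficient is unaffected by changes of scale. The data and the helper name `correlation` are made up for illustration. The helper computes r as the average product of the two variables in standard units, so converting inches to centimeters or pounds to kilograms should leave r unchanged, apart from floating-point rounding.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient r: average product of x and y in standard units."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                for x, y in zip(xs, ys)]
    return sum(products) / n

# Hypothetical data: heights in inches, weights in pounds.
heights_in = [60, 62, 65, 68, 70]
weights_lb = [110, 125, 140, 150, 175]

# Change of scale: convert to centimeters and kilograms.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]

r_original = correlation(heights_in, weights_lb)
r_rescaled = correlation(heights_cm, weights_kg)
print(r_original, r_rescaled)  # the two values agree up to rounding
```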
Association
What correlation ACTUALLY measures (contrast this with the next question before you answer!).
Causation
What correlation DOES NOT necessarily measure, but is often mistakenly assumed to measure.
Graph of Averages
A scatter plot that shows the average value of Y for each value of X.
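As a rough illustration, the points on a graph of averages can be computed by grouping the y-values by their x-value and averaging each group. The data below are invented, and the variable names are only for this sketch.

```python
from collections import defaultdict

# Hypothetical (x, y) points from a scatter diagram.
points = [(1, 2), (1, 4), (2, 3), (2, 5), (2, 7), (3, 6), (3, 8)]

# Collect every y-value that shares the same x-value.
ys_by_x = defaultdict(list)
for x, y in points:
    ys_by_x[x].append(y)

# One point per x-value: (x, average of the y-values at that x).
graph_of_averages = {x: sum(ys) / len(ys) for x, ys in sorted(ys_by_x.items())}
print(graph_of_averages)
```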
Regression Fallacy
The incorrect notion that the regression effect is due to some real-world cause or force rather than just the spread in the data.
Regression Line
The line formed by plotting the regression estimates: every estimate produced by the regression method lies on this same line.
Error (in regression prediction)
The name of the value found in a regression prediction by performing the following subtraction: (actual y-value) - (predicted y-value).
R.M.S. Error For Regression
This value indicates how far typical points are above or below the regression line.
Square root of (1 - r^2) × (SD of y)
The equation for finding the R.M.S. error for regression.
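A small sketch comparing two ways of getting the r.m.s. error for regression: directly, as the root-mean-square of the prediction errors, and through the shortcut sqrt(1 - r^2) × (SD of y). The data and helper names are assumptions made for this example, and SDs are computed by dividing by n.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sd(values):
    """Standard deviation, dividing by n (not n - 1)."""
    m = mean(values)
    return sqrt(mean([(v - m) ** 2 for v in values]))

def correlation(xs, ys):
    """Correlation coefficient r: average product in standard units."""
    mx, my, sx, sy = mean(xs), mean(ys), sd(xs), sd(ys)
    return mean([((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)])

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [1, 3, 2, 5, 4]

r = correlation(xs, ys)

# Regression line: slope = r * SD_y / SD_x, passing through the point of averages.
slope = r * sd(ys) / sd(xs)
intercept = mean(ys) - slope * mean(xs)

# Prediction errors: (actual y-value) - (predicted y-value).
errors = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

rms_direct = sqrt(mean([e ** 2 for e in errors]))
rms_shortcut = sqrt(1 - r ** 2) * sd(ys)
print(rms_direct, rms_shortcut)  # the two values agree up to rounding
```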
Residual
Another name for prediction errors in regression.
Residual Plot
The diagram that is made when we transfer each point on a scatter diagram to a new graph that replaces all the y-coordinates with their residual values.
Homoscedastic
Concerning a scatter plot, we use the phrase “football shaped” in place of this more technical term.
Heteroscedastic
The technical word that means all the vertical strips in a scatter plot do NOT have the same scatter.
The least-squares line
A name used for the regression line that emphasizes the fact that, among all lines, the regression line has the smallest r.m.s. error.
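A hedged sketch of the minimizing property: it fits the regression line to a small invented data set and checks that a handful of rival lines, with the slope or intercept nudged, all have a larger r.m.s. error. The slope is computed with the standard least-squares formula, which is equivalent to r × (SD of y) / (SD of x). The rival lines and all names here are illustrative assumptions, and a few rivals do not prove the property in general.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def rms_error(xs, ys, slope, intercept):
    """R.m.s. size of the prediction errors for the line y = intercept + slope * x."""
    errors = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return sqrt(mean([e ** 2 for e in errors]))

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [1, 3, 2, 5, 4]

# Regression (least-squares) line through the point of averages.
mean_x, mean_y = mean(xs), mean(ys)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

best = rms_error(xs, ys, slope, intercept)

# Rival lines: nudge the slope and/or intercept away from the regression line.
rivals = [(slope + ds, intercept + di)
          for ds in (-0.5, 0.0, 0.5)
          for di in (-1.0, 0.0, 1.0)
          if (ds, di) != (0.0, 0.0)]
rival_errors = [rms_error(xs, ys, s, i) for s, i in rivals]

print(best, min(rival_errors))  # every rival has a larger r.m.s. error
```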