Regression Flashcards
Regression
Using the correlation (r) between two variables, you can roughly predict the value of one variable from the other
Regression uses
Prediction (college admissions, car insurance rates, dating sites, health insurance), estimation, hypothesis testing, modeling causal relationships
Predictor Variable (X)
information you have
Criterion Variable (Y)
The to-be-predicted variable; it can be estimated with a certain degree of certainty given known values of X
X & Y Synonyms
X: Predictor Variable, Independent Variable, Explanatory Variable
Y: Criterion Variable, Dependent Variable, Response Variable, Outcome Variable
Best-fit line
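The straight line that best fits the data on a scatterplot: it minimizes the sum of the squared vertical distances between the data points and the line. It is generally not an equal distance from every point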
Regression Questions:
- Is a pattern evident in a set of data points?
- Does the equation of a straight line describe this pattern?
- Are the predictions made from this equation statistically significant?
Sum of Squares (SS) for regression
The sum of the squared distances of data points from a straight line. It gives more information than raw deviation scores, which do not capture the degree of variability of the data points around the line
Sum of deviations
The sum of the deviation scores from the best-fit line always equals zero, so it does not measure spread
Sum of Squares
Squaring each deviation before summing captures the spread of the data points around the regression line
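A small numeric check makes the contrast between summed deviations and summed squares concrete. This is only a sketch: the data are made up, the function name `deviation_summary` is illustrative, and the line Yhat = 0.6X + 2.2 is assumed because it is the least-squares line for these five points.

```python
def deviation_summary(xs, ys, b, a):
    """Return (sum of deviations, sum of squared deviations) from Yhat = b*X + a."""
    deviations = [y - (b * x + a) for x, y in zip(xs, ys)]
    return sum(deviations), sum(d ** 2 for d in deviations)


# Made-up example data (not from the flashcards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Assumed: b = 0.6 and a = 2.2 is the least-squares line for this data.
total_dev, total_sq = deviation_summary(xs, ys, b=0.6, a=2.2)

print(f"sum of deviations: {total_dev:.4f}")  # about 0: positive and negative deviations cancel
print(f"sum of squares:    {total_sq:.4f}")   # about 2.4: the spread around the line
```

The individual deviations here are -0.8, 0.6, 1.0, -0.6 and -0.2, so they cancel when added but not when squared.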
Equation of a line (slope-intercept form)
Y = bX + a (the same form as y = mx + b, with b here as the slope and a as the intercept)
Slope
b (measures the change in Y relative to the change in X)
Intercept
a (the value of Y when X equals 0)
Method of least squares
Method to compute slope and y-intercept of the best fitting straight line to a set of data
Formula for Least Squares
- Calculate SSxy, SSx, SSy
- b = SSxy / SSx
- a = My - (b)(Mx), where Mx and My are the means of X and Y
- Yhat = bX + a
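The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a full regression routine: the function name `least_squares` and the data are made up, and SSy is left out because the slope and intercept do not need it.

```python
def least_squares(xs, ys):
    """Return (b, a) for the line Yhat = b*X + a fitted by least squares."""
    n = len(xs)
    mx = sum(xs) / n  # Mx: mean of X
    my = sum(ys) / n  # My: mean of Y

    # SSxy: sum of cross-products of deviations; SSx: sum of squared X deviations.
    ss_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    if ss_x == 0:
        raise ValueError("all X values are identical, so the slope is undefined")

    b = ss_xy / ss_x   # slope
    a = my - b * mx    # intercept
    return b, a


# Made-up example data (not from the flashcards).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, a = least_squares(xs, ys)
y_hat = b * 6 + a  # predicted Y for a new X of 6

print(f"b = {b:.2f}, a = {a:.2f}, Yhat at X=6: {y_hat:.2f}")
```

For this data Mx = 3, My = 4, SSxy = 6 and SSx = 10, so b = 0.6 and a = 4 - (0.6)(3) = 2.2.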