simple linear regression Flashcards
bivariate data
data we get when we work with exactly two quantitative measurements on each individual
variables in bivariate data
response (y) -> measures an outcome
explanatory (x) -> explains changes in y
scatterplot characteristics
DONT FUCK SHIT UP
D - direction (positive or negative)
F - form (linear or clearly not linear)
S - strength (weak, moderate, strong)
U - unusual features (outliers)
correlation
r
quantifies the strength and direction of the linear relationship between variables x and y
always between -1 and 1
not resistant: strongly affected by outliers
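A minimal sketch of how r can be computed by hand, and of how much a single outlier can move it. The function name pearson_r and the data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: a perfectly linear pattern...
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r_clean = pearson_r(x, y)

# ...and the same data with one added outlier at (6, 0).
r_outlier = pearson_r(x + [6], y + [0])

print(r_clean)    # r for the clean data
print(r_outlier)  # r drops sharply once the outlier is included
```

Here one unusual point is enough to pull r from a perfect positive correlation down to a weak one, which is what "not resistant" means.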
regression line
y-hat = b0 + b1x (b0 = intercept, b1 = slope)
residuals
observed response - estimated response
e = y - y-hat
smaller residuals = line describes the linear relationship in data better
least squares regression line
produces the line that fits the data best
- minimizes the sum of the squared residuals
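A rough sketch of the least squares idea, using the standard formulas b1 = Sxy / Sxx and b0 = mean(y) - b1 * mean(x). The function names and the data are assumptions for illustration only.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1

def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of e^2, where e = observed y - estimated y."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sse_best = sum_squared_residuals(x, y, b0, b1)
# Any other line, for example y-hat = 2.0 + 0.7x, has a larger sum.
sse_other = sum_squared_residuals(x, y, 2.0, 0.7)

print(b0, b1)
print(sse_best, sse_other)
```

The residuals from the least squares line add up to (about) zero, and no other choice of b0 and b1 gives a smaller sum of squared residuals on the same data.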
R^2 value
coefficient of determination
describes the fraction of variation in the response variable that is explained by the least-squares regression line
ex: R^2 = 0.75
about 75% of the variation in y is explained by the least-squares regression line on x
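A small sketch of where R^2 comes from, assuming the usual definition R^2 = 1 - SSE/SST (SSE = sum of squared residuals, SST = total variation of y around its mean). In simple linear regression it should match the square of the correlation r. The data is made up.

```python
# Made-up data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

# Least squares slope and intercept.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# SSE: variation left over after fitting the line.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
# SST: total variation in y.
sst = syy

r_squared = 1 - sse / sst
r = sxy / (sxx * syy) ** 0.5

print(r_squared)  # fraction of variation in y explained by the line
print(r ** 2)     # same value, computed from the correlation
```

For this data both values come out at about 0.6, so roughly 60% of the variation in y is explained by the line.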