GLM 1 - Simple linear regression Flashcards
In a linear relationship, how is the value of an outcome variable Y approximated?
Y ≈ β0 + β1X.
Y = the dependent variable
β0 = the intercept
β1 = the slope coefficient of X
What is the intercept/β0 (often labelled the constant)?
The expected mean value of Y when X = 0.
What is β1?
The slope: how Y changes per unit increase in X.
When X is increased by one unit, the predicted Y increases by β1.
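For example, if the fitted line were ŷ = 2 + 3x, then increasing x from 4 to 5 changes the prediction from ŷ = 14 to ŷ = 17, an increase of exactly β1 = 3.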
What is the terminology of a linear regression?
- We say that Y is regressed on X.
- We are expressing Y in terms of X.
- The dependent variable, Y, depends on X.
- The independent variable, X, doesn’t depend on anything.
How are the coefficients or parameters β0 and β1 estimated?
Using the available data:
(x1, y1), (x2, y2), …, (xn, yn), a sample of n data points.
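As a minimal sketch (the data values below are made up for illustration), the estimates can be computed directly from the standard closed-form least-squares formulas:

```python
import numpy as np

# Hypothetical sample of n = 5 data points (x1, y1), ..., (xn, yn)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Standard closed-form least-squares estimates:
#   beta1_hat = sum((xi - x_bar) * (yi - y_bar)) / sum((xi - x_bar)^2)
#   beta0_hat = y_bar - beta1_hat * x_bar
x_bar, y_bar = x.mean(), y.mean()
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

print(beta0_hat, beta1_hat)  # estimated intercept and slope
```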
How are the estimates of parameters written?
The estimates of the parameters are written with a circumflex or hat: ^
We then write our linear equation with these estimated coefficients: ŷi = β̂0 + β̂1xi.
Of the variables, only the dependent variable gets a hat.
The independent variable, xi, does not have a hat because it is treated as fixed.
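A short sketch of this (the coefficient values are hypothetical, standing in for estimates already obtained): the predictions ŷi are just the fitted line evaluated at each fixed xi:

```python
import numpy as np

beta0_hat, beta1_hat = 0.1, 1.97  # hypothetical estimated coefficients

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # the xi's: fixed, so no hat
y_hat = beta0_hat + beta1_hat * x        # yi_hat = beta0_hat + beta1_hat * xi
print(y_hat)
```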
β0 and β1 are independent of each other.
True or false?
True
What does the circumflex allow us to differentiate between?
The true value and the estimated value.
What happens if we add a value to β0?
This would only shift ŷ but not affect β1xi; β0 can change independently of β1.
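A tiny demonstration of this point (all numbers are made up): adding a constant c to β0 shifts every prediction by c, while the slope term β1xi is untouched:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
beta0, beta1, c = 2.0, 3.0, 5.0  # illustrative values

y_hat = beta0 + beta1 * x                 # original predictions
y_hat_shifted = (beta0 + c) + beta1 * x   # add c to the intercept only
print(y_hat_shifted - y_hat)              # every prediction shifts by exactly c
```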
What are the ŷi's?
The predictions, or predicted values, of the outcomes yi, given the independent variables, the xi's.
What are the differences between the predicted values, ŷi's, and the observed values, yi's?
The residuals:
êi := yi − ŷi.
That is, these are the values that remain after we have removed the
predictions from the observations.
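A brief sketch (reusing the hypothetical data and estimates from above): the residuals are simply the observed values minus the predicted values:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
beta0_hat, beta1_hat = 0.1, 1.97  # hypothetical estimates

y_hat = beta0_hat + beta1_hat * x  # predicted values
residuals = y - y_hat              # residual: observed minus predicted
print(residuals)
```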
Why are the residuals, êi's, also equipped with a hat?
Because these are also estimated values.
Why are the black error bars vertical, and not perpendicular to the line in blue?
Because residuals are measured in the y-direction: each residual is the vertical difference yi − ŷi between an observed value and the fitted line, not the perpendicular distance to the line.
How can the optimal values of the parameters, β0 and β1, be found?
By minimising the sum of the squares of the residuals:
RSS := ê1² + ê2² + … + ên².
Why do we square residuals?
Residuals are defined as a subtraction of the predicted values from the observed values, so we can rewrite RSS as RSS = (y1 − ŷ1)² + … + (yn − ŷn)². Some of these differences are negative and some are positive, so we square them to ensure that each one makes a positive contribution to RSS; otherwise positive and negative residuals would cancel each other out.
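A quick demonstration with hypothetical residual values: the raw residuals sum to zero, but the squared residuals all contribute positively to RSS:

```python
import numpy as np

residuals = np.array([1.5, -1.5, 0.8, -0.8])  # hypothetical residuals

print(residuals.sum())         # 0.0: positive and negative residuals cancel
print(np.sum(residuals ** 2))  # 5.78: every residual contributes positively
```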