Part 1 - Regression Flashcards
What is regression?
Describing and evaluating the relationship between one variable and one or more other variables.
More specifically, regression is an attempt to explain movements in one variable by reference to movements in other variables.
What is important to remember regarding correlation?
Correlation says nothing about causality. It measures the degree of linear association between two variables and treats them symmetrically, so we cannot say whether movement in one causes movement in the other.
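A minimal sketch of why correlation cannot indicate direction: the coefficient is the same whichever variable is placed first. The data values and the helper name pearson_corr are made up for illustration.

```python
import math

def pearson_corr(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data, invented for this example only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r_xy = pearson_corr(x, y)  # correlation of x with y
r_yx = pearson_corr(y, x)  # correlation of y with x: the same number
print(r_xy, r_yx)
```

Because swapping the arguments changes nothing, the number alone carries no information about which variable drives the other.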
What are the assumptions about the dependent vs. independent variables?
The dependent variable is assumed to be stochastic in some way, i.e. it has a probability distribution associated with it.
On the other hand, we typically assume that the independent variables are non-stochastic, i.e. fixed in repeated samples.
Why can't we just plot the data and draw regression lines by hand? It is extremely fast and simple to do so.
We are interested in determining the extent to which the movement in some variable can be described by an equation. Once we have a number for it, we can use it in further calculations such as risk analysis.
what should we think about when looking at this:
y = a + bx
The function is exact: it is a statement, not a prediction.
y = a + b x
is exact, and exact relationships rarely occur in real life. What do we do?
We modify the equation to reflect that we have a number of sample points, which will not all lie exactly on the line:
y_t = a + b x_t + u_t
where u_t is the random disturbance term.
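As a rough illustration of the disturbance term, the sketch below generates data from y_t = a + b x_t + u_t. The parameter values, the sample size, and the choice of normally distributed disturbances are all assumptions made for the example, not something the notes specify.

```python
import random

random.seed(0)  # fixed seed so the example is reproducible

a, b = 1.0, 2.0   # assumed "true" intercept and slope
sigma = 0.5       # assumed standard deviation of the disturbance

x = [float(t) for t in range(1, 11)]            # x_t for t = 1..10
u = [random.gauss(0.0, sigma) for _ in x]       # random disturbances u_t

y_exact = [a + b * x_t for x_t in x]                           # exact line
y_observed = [a + b * x_t + u_t for x_t, u_t in zip(x, u)]     # line plus noise

for x_t, ye, yo in zip(x, y_exact, y_observed):
    print(f"x={x_t:4.1f}  exact={ye:6.2f}  observed={yo:6.2f}")
```

The observed points scatter around the exact line, and the gap at each point is exactly u_t.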
Why add the disturbance term?
Many relationships are simply impossible to model exactly, for a great number of reasons.
The disturbance term accounts for measurement error, unknown events, etc.
When fitting the regression line, what do we minimize?
Vertical distances. These are the distances between the fitted line and the sample points. More precisely, we minimize the sum of the squared vertical distances.
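A small sketch of what "vertical distance" means in practice: for a candidate line, each residual is the observed y minus the fitted y at the same x, and least squares compares lines by the sum of the squared residuals. The data and the two candidate lines are invented for illustration.

```python
def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared vertical distances between the points and the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, invented for this example only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

ssr_good = sum_squared_residuals(0.0, 2.0, x, y)  # candidate line y = 2x
ssr_bad = sum_squared_residuals(0.0, 1.0, x, y)   # candidate line y = x

print(ssr_good, ssr_bad)  # the better-fitting line has the smaller value
```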
Why minimize vertical distances and not horizontal ones?
We assume that the x values of the sample points are fixed in repeated samples, so there is no uncertainty there and no gap to minimize.
What does it mean that something is “fixed in repeated samples”?
If we conducted the experiment again, we would get the same x values.
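One way to picture this, as a sketch under assumed parameter values and normally distributed disturbances: repeat the "experiment" twice with the same x values, and only the y values change, because only the disturbance is redrawn. The function name draw_sample is hypothetical.

```python
import random

def draw_sample(x, a, b, sigma, rng):
    """Draw one sample of y values for the given fixed x values."""
    return [a + b * x_t + rng.gauss(0.0, sigma) for x_t in x]

rng = random.Random(42)              # seeded generator for reproducibility
x_fixed = [1.0, 2.0, 3.0, 4.0, 5.0]  # the same x values in every sample

sample_1 = draw_sample(x_fixed, 1.0, 2.0, 0.5, rng)
sample_2 = draw_sample(x_fixed, 1.0, 2.0, 0.5, rng)

print(x_fixed)   # identical across repetitions
print(sample_1)  # y values from the first repetition
print(sample_2)  # different y values from the second repetition
```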
What method is used for linear regression?
OLS (ordinary least squares). OLS is the main workhorse.
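For a single regressor, the OLS estimates have a closed form: the slope is the sample covariance of x and y divided by the sample variance of x, and the intercept makes the line pass through the point of means. The sketch below applies those formulas to made-up data; the function name ols_fit is hypothetical.

```python
def ols_fit(xs, ys):
    """Return (a_hat, b_hat) for the least-squares line y = a_hat + b_hat * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b_hat = s_xy / s_xx                # slope
    a_hat = mean_y - b_hat * mean_x    # intercept
    return a_hat, b_hat

# Hypothetical data, invented for this example only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

a_hat, b_hat = ols_fit(x, y)
residuals = [y_t - (a_hat + b_hat * x_t) for x_t, y_t in zip(x, y)]

print(a_hat, b_hat)    # approximately 0.14 and 1.96
print(sum(residuals))  # approximately zero when an intercept is included
```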