General Flashcards
If the independent variable is divided or multiplied by some nonzero constant c,
then the OLS slope coefficient is multiplied or divided by c, respectively.
In general, changing the units of measurement of only the independent variable
does not affect the intercept.
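The rescaling rule above can be checked numerically. This is a minimal sketch on made-up data; the helper `ols`, the data values, and the constant `c` are all illustrative assumptions, not taken from the cards.

```python
def ols(x, y):
    """Return (intercept, slope) of the simple OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up data: x measured in dollars.
x = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]
c = 1000.0  # hypothetical constant: dollars -> thousands of dollars

b0, b1 = ols(x, y)
b0_scaled, b1_scaled = ols([xi / c for xi in x], y)  # x divided by c

print("original:", b0, b1)
print("rescaled:", b0_scaled, b1_scaled)  # slope is c times larger, intercept unchanged
```

Dividing x by c multiplies the slope by c, and the intercept stays the same up to floating-point rounding.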
the goodness of fit of the model should not depend on the
units of measurement of our variables.
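A quick numerical check of this invariance, as a sketch on made-up data. The function `r_squared` and the rescaling factors (12 and 100) are illustrative assumptions.

```python
def r_squared(x, y):
    """R-squared = 1 - SSR/SST for the simple OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ssr / sst

# Made-up data.
x = [8.0, 10.0, 12.0, 14.0, 16.0, 18.0]
y = [5.5, 6.1, 9.0, 9.8, 13.2, 14.1]

r2_original = r_squared(x, y)
# Change the units of both variables (hypothetical factors).
r2_rescaled = r_squared([12 * xi for xi in x], [100 * yi for yi in y])

print(r2_original, r2_rescaled)  # the two values agree up to rounding
```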
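for wage, it is often more reasonable to assume that each extra year of education increases/decreases wage by
a constant percentage (rather than a constant amount)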
log(wage) = b0 + b1*edu + u
%Δwage ≈ (100*b1)*Δedu
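The (100*b1) rule is an approximation that works well when b1 times the change in education is small. The sketch below compares it with the exact percentage change in predicted wage, 100*[exp(b1*Δedu) - 1], holding u fixed. The value `b1 = 0.08` is a hypothetical coefficient chosen only for illustration.

```python
import math

b1 = 0.08        # hypothetical slope estimate, not from the cards
delta_edu = 1    # one more year of education

# Approximate percentage change in wage from the log-level rule.
approx_pct = 100 * b1 * delta_edu

# Exact percentage change implied by log(wage) = b0 + b1*edu + u, with u held fixed.
exact_pct = 100 * (math.exp(b1 * delta_edu) - 1)

print(approx_pct, exact_pct)
```

The exact figure is slightly larger than the approximation, and the gap grows as b1*Δedu grows.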
Why take the log of wage?
To impose a constant percentage effect of education on wage.
Another important use of the natural log is
in obtaining a constant elasticity model
log(salary) = 4.822 + 0.257 log(sales): the coefficient of log(sales) is
the estimated elasticity of salary with respect to sales. It implies that a 1% increase in firm sales increases CEO salary by about 0.257%.
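The elasticity reading can be verified directly from the fitted equation. The sketch below uses the two coefficients quoted in the card; the starting sales level is a hypothetical value, and the result does not depend on it.

```python
import math

B0, B1 = 4.822, 0.257  # estimates quoted in the card

def predicted_log_salary(sales):
    """Fitted value of log(salary) for a given sales level."""
    return B0 + B1 * math.log(sales)

sales = 5000.0  # hypothetical starting level of sales

before = math.exp(predicted_log_salary(sales))
after = math.exp(predicted_log_salary(1.01 * sales))  # sales up by 1%

pct_change = 100 * (after - before) / before
print(pct_change)  # close to the coefficient on log(sales)
```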
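the log of a product equals the sum of the logs: log(c*y) = log(c) + log(y)
therefore, when y is rescaled by a constant c before taking the log, the slope is still b1, but the intercept becomes log(c) + b0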
if independent variable is log(x) and we change the units of measurement of x before taking the log
the slope remains the same, but the intercept changes.
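A sketch of why this holds: since log(c*x) = log(c) + log(x), rescaling x by c shifts the regressor by a constant, so the slope is unchanged and the intercept moves to b0 - b1*log(c). The data, the helper `ols`, and the constant `c` below are illustrative assumptions.

```python
import math

def ols(x, y):
    """Return (intercept, slope) of the simple OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up data.
x = [10.0, 20.0, 40.0, 80.0, 160.0]
y = [1.2, 1.9, 2.4, 3.3, 3.8]
c = 1000.0  # hypothetical change of units applied to x before taking logs

b0, b1 = ols([math.log(xi) for xi in x], y)
b0_new, b1_new = ols([math.log(c * xi) for xi in x], y)

print("slopes:", b1, b1_new)            # equal up to rounding
print("intercepts:", b0, b0_new)        # differ by b1 * log(c)
```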
linear regression: The key is that this equation is linear in
the parameters b0 and b1. There are no restrictions on how y and x relate to the original explained and explanatory variables of interest.
models that cannot be cast as linear regression models are those that are not linear in their
PARAMETERS
Linear regression estimates the
conditional mean of the response variable. This means that, for a given value of the predictor variable X, linear regression will give you the mean value of the response variable Y.
In (simple) linear regression, we are looking for
a line of best fit to model the relationship between our predictor X and our response variable Y.
To find the intercept and slope coefficients of the line of best fit, linear regression uses
the least squares method, which seeks to minimise the sum of squared deviations between the n observed data points y1, …, yn and the predicted values, which we’ll call y-hat.
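The least squares idea can be sketched in a few lines using the standard closed-form solution for one predictor: slope = Sxy/Sxx and intercept = y-bar minus slope times x-bar. The data and the function names `fit_line` and `ssr` are illustrative assumptions. The last step nudges the coefficients away from the OLS values to show that the sum of squared residuals only goes up.

```python
def fit_line(x, y):
    """Closed-form simple OLS: return (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def ssr(b0, b1, x, y):
    """Sum of squared deviations between observed y and predicted y-hat."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Made-up data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0_hat, b1_hat = fit_line(x, y)
best = ssr(b0_hat, b1_hat, x, y)

# Any other line on this grid of nudged coefficients has a larger SSR.
nudged = [
    ssr(b0_hat + d0, b1_hat + d1, x, y)
    for d0 in (-0.5, 0.0, 0.5)
    for d1 in (-0.2, 0.0, 0.2)
    if (d0, d1) != (0.0, 0.0)
]

print(b0_hat, b1_hat, best)
print(min(nudged) > best)
```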
the OLS slope and intercept estimates are not defined unless
we have sample variation in the explanatory variable.
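The reason is visible in the slope formula: the denominator is the sum of squared deviations of x from its mean, which is zero when every x is the same. A minimal sketch, with a hypothetical function `ols_slope` and made-up data:

```python
def ols_slope(x, y):
    """Simple OLS slope; raises if x has no sample variation."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        # Division by zero: the slope (and hence the intercept) is undefined.
        raise ValueError("no sample variation in x: OLS estimates are undefined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx

# Every observation has the same x (for example, everyone has 12 years of education).
same_x = [12.0, 12.0, 12.0, 12.0]
y = [5.0, 6.5, 7.0, 8.2]

try:
    ols_slope(same_x, y)
    failed = False
except ValueError as err:
    failed = True
    print(err)
```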
Random sampling
where individuals are chosen randomly and their wage and education are both recorded.
unbiasedness is a feature of
the sampling distributions of b1-hat and b0-hat, which says nothing about the estimate that we obtain for a given sample. We hope that, if the sample we obtain is somehow “typical,” then our estimate should be “near” the population value.
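A small simulation can make this distinction concrete. The sketch below assumes a population model with hypothetical parameters (b0 = 1, b1 = 2, standard normal errors) and draws many samples. Individual slope estimates land above or below 2, while their average across samples is close to 2.

```python
import random

def ols(x, y):
    """Return (intercept, slope) of the simple OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

rng = random.Random(0)         # fixed seed so the run is repeatable
TRUE_B0, TRUE_B1 = 1.0, 2.0    # hypothetical population parameters
n, reps = 30, 2000

x = [rng.uniform(0, 10) for _ in range(n)]  # x values held fixed across samples

slopes = []
for _ in range(reps):
    # New random errors each replication give a new sample and a new estimate.
    y = [TRUE_B0 + TRUE_B1 * xi + rng.gauss(0, 1) for xi in x]
    slopes.append(ols(x, y)[1])

mean_slope = sum(slopes) / reps
print("range of estimates:", min(slopes), max(slopes))
print("average estimate:", mean_slope)
```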
the error term u in the equation is correlated with lnchprg
because u contains factors such as the poverty rate of children attending school, which affects student performance and is highly correlated with eligibility for the lunch program
correlation + anova
linear regression
regression=
the value of one variable is a function of another variable
linear:
the variable is NOT under a radical and NOT raised to a power (other than 1)
m = beta = slope
(Y2 - Y1)/(X2 - X1), rise/run
f(x)=
Y
E(y)=
the expected value (average value) of y for the given value of x; it is where we expect the regression line to pass on the graph at that x.
the y value is not a POINT in regression
it is a distribution of y for a given x, because the relationship between x and y is not exact. The regression value of y is the mean of the distribution of y values at each given x.
when the slope is zero, the independent variable
has no linear effect on y; it effectively drops out of the model.
goal:
find the line that minimizes the residual sum of squares
y hat:
the fitted value, computed from parameters estimated on the sample, NOT the population. E(y) is for the population.