Reg review Flashcards

1
Q

What is the standard error?

A

The estimated standard deviation of the sampling distribution of the slope parameter, which tells us how precise our estimate is.

2
Q

What is the p-value?

A

The smallest significance level at which we would reject the null hypothesis.

3
Q

What is a confidence interval?

A

An interval estimate built so that, over repeated sampling, we would expect 95% of confidence intervals constructed in this manner to contain the true population parameter (for a 95% confidence level).
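The repeated-sampling idea can be checked with a small simulation. This is only a sketch: the population (normal, mean 5, standard deviation 1), the sample size, the number of repetitions, and the helper name `ci_covers` are all made up for illustration, and the interval uses the normal critical value 1.96.

```python
import random
from statistics import mean, stdev

def ci_covers(true_mu, n, rng, z=1.96):
    """Draw one sample and report whether its 95% CI contains true_mu."""
    sample = [rng.gauss(true_mu, 1.0) for _ in range(n)]
    se = stdev(sample) / n ** 0.5           # estimated standard error of the mean
    lo, hi = mean(sample) - z * se, mean(sample) + z * se
    return lo <= true_mu <= hi

rng = random.Random(0)                       # fixed seed so the run is repeatable
reps = 2000
coverage = sum(ci_covers(5.0, 50, rng) for _ in range(reps)) / reps
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95. Any single interval either contains the true mean or it does not.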

4
Q

What does it mean for a coefficient to be unbiased?

A

If an estimator is unbiased, its sampling distribution is centered on the true population parameter: E(beta1_hat) = beta1_population.

5
Q

Definition of beta1 coefficient?

A

Beta1_hat is the slope of y with respect to x when all other regressors are held constant, or fixed. A one-unit change in X is associated with a beta1_hat change in Y, holding all else constant.

6
Q

What is an endogenous regressor?

A

A regressor that is correlated with the error term, or correlated with Y through the error term.

7
Q

What is an exogenous regressor?

A

A regressor that is uncorrelated with the error term. If it has a direct impact on Y, it should be included to avoid omitted variable bias (OVB).

8
Q

What is consistency?

A

Betahat is a consistent estimator of betapop if, as N approaches infinity, betahat converges in probability to betapop. This is a large-sample property, often loosely described as asymptotic unbiasedness.

9
Q

What is the law of large numbers?

A

As the sample size grows, our sample estimates of the population mean and variance converge in probability to the true population parameters.

10
Q

What is the CLT?

A

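As N approaches infinity, the sampling distribution of the (standardized) sample mean approaches a normal distribution, whatever the shape of the population distribution (provided it has finite variance).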
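A quick simulation can make this concrete. The sketch below assumes a uniform(0, 1) population (clearly not normal), a sample size of 30, and 5000 repeated samples; these numbers and all variable names are illustrative choices, not part of the card.

```python
import random
from statistics import mean, pstdev

rng = random.Random(1)                       # fixed seed so the run is repeatable
n, reps = 30, 5000

# Each entry is the mean of n draws from a non-normal uniform(0, 1) population.
sample_means = [mean([rng.random() for _ in range(n)]) for _ in range(reps)]

center = mean(sample_means)                  # should sit near the population mean 0.5
spread = pstdev(sample_means)                # should sit near sigma / sqrt(n)
theory_sd = (1 / 12) ** 0.5 / n ** 0.5       # sd of uniform(0, 1) is sqrt(1/12)

# For a normal distribution, about 95% of values fall within 1.96 sd of the center.
within = sum(abs(m - center) <= 1.96 * spread for m in sample_means) / reps
print(center, spread, theory_sd, within)
```

If the sample means are roughly normal, `within` should come out near 0.95 even though the underlying draws are uniform.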

11
Q

What are the Gauss-Markov assumptions for MLR, and which are needed for unbiasedness and consistency? Which are needed to be BLUE?

A

Gauss-Markov assumptions for MLR (1 – 4 for unbiasedness and consistency, 1 – 5 for BLUE (best linear unbiased estimator)):

  1. Linear in parameters
  2. Random sampling (independent and identically distributed random variables)
  3. No perfect collinearity (no regressor is an exact linear combination of the other regressors)
  4. Zero conditional mean (the expected value of U conditional on all Xs is equal to zero)
  5. Homoskedasticity (the variance of U conditional on all Xs is constant and equal to sigma-squared)
12
Q

Probability of type 1 error?

A

alpha

13
Q

Probability of type 2 error?

A

beta

14
Q

What is r-squared?

A

Proportion of the sample variation in Y that is explained by X

15
Q

Factors affecting sampling variances of OLS slope estimators

A

1) Error variance: as the error variance in the population (σ^2) decreases, Var(Bj_hat) gets smaller. Take more out of the error term by adding explanatory variables.
2) Total sample variation: it is easier to estimate how xj affects y if we see more variation in xj (a larger SST_j). Increase SST_j by increasing the sample size.
3) Linear relationships among the regressors: as R_j^2 (the R-squared from regressing xj on the other regressors) gets bigger, so does Var(Bj_hat). If xj is unrelated to all other independent variables, it is easier to estimate its ceteris paribus effect on y.

16
Q

What happens when you have heteroskedasticity?

A

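The usual OLS variance formulas are invalid, so the usual t- and F-tests cannot be relied on. The beta coefficient estimates themselves are not affected. Use heteroskedasticity-robust standard errors for inference.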

17
Q

In small samples ^B1 = B1 + ^cov(x,u)/^var(x)

What can we do to show that the zero conditional mean assumption holds here?

A

Show that the covariance of x and u is equal to 0 in our sample.
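The decomposition of the slope estimate into the true slope plus (sample covariance of x and u) / (sample variance of x) can be verified numerically. In this sketch the true coefficients, the x values, and the error draws are invented for illustration; in real data u is unobserved, so this check is only possible in a constructed example.

```python
from statistics import mean

b0_true, b1_true = 1.0, 2.0                  # hypothetical population coefficients
x = [1, 2, 3, 4, 5]
u = [0.5, -1.0, 0.2, 0.8, -0.5]              # hypothetical error terms
y = [b0_true + b1_true * xi + ui for xi, ui in zip(x, u)]

x_bar, y_bar, u_bar = mean(x), mean(y), mean(u)
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxu = sum((xi - x_bar) * (ui - u_bar) for xi, ui in zip(x, u))

b1_hat = sxy / sxx                           # OLS slope estimate
decomposed = b1_true + sxu / sxx             # true slope + cov(x,u)/var(x)
print(b1_hat, decomposed)
```

Both numbers agree, and the estimate equals the true slope only when the sample covariance of x and u is exactly zero.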

18
Q

Formula for B1 coefficient?

A

cov(x,y)/var(x)

19
Q

Formula for B0 coefficient?

A

E(y)-B1*E(x)
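The slope and intercept formulas can be applied directly to a sample. This is a minimal sketch with made-up data; `ols_slr` is a hypothetical helper name, and sample means stand in for E(y) and E(x).

```python
from statistics import mean

def ols_slr(x, y):
    """Return (b0_hat, b1_hat) for a simple regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sample covariance and variance share the same divisor, so it cancels.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx                           # cov(x, y) / var(x)
    b0 = y_bar - b1 * x_bar                  # mean(y) - b1 * mean(x)
    return b0, b1

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = ols_slr(x, y)
print(b0, b1)
```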

20
Q

Formula for SST_j (total sample variation in xj)?

A

∑(actual xj – avg xj)^2 (a sum of squared deviations, not divided by n)

21
Q

Formula for var(Bj), SLR?

A

σ^2/SSTx

22
Q

Formula for var(Bj), MLR? (expressed in terms of sigma squared)

A

σ^2/(SST_j *(1-R_j^2))
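A worked example shows how R_j^2 inflates the variance. The numbers below (σ^2 = 4, SST_j = 100) and the function name `var_bj` are assumptions chosen only to make the arithmetic easy to follow.

```python
def var_bj(sigma2, sst_j, r2_j):
    """Sampling variance of the OLS slope on x_j in a multiple regression."""
    if not 0 <= r2_j < 1:
        raise ValueError("R_j^2 must be in [0, 1); 1 means perfect collinearity")
    return sigma2 / (sst_j * (1 - r2_j))

v_unrelated = var_bj(4.0, 100.0, 0.0)        # x_j unrelated to the other regressors
v_collinear = var_bj(4.0, 100.0, 0.75)       # x_j strongly related to the others
inflation = v_collinear / v_unrelated        # equals 1 / (1 - R_j^2)
print(v_unrelated, v_collinear, inflation)
```

With R_j^2 = 0.75 the variance is four times larger than with R_j^2 = 0. The ratio 1 / (1 - R_j^2) is commonly called the variance inflation factor.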

23
Q

Formula for standard error of Bj, SLR?

A

√(SSR/(n-2))/√SSTx

24
Q

Formula for standard error of Bj, MLR?

A

√(SSR/df)/√(SST_j (1-R_j^2))

25
Q

Variance of regression?

A

SSR/(df)

26
Q

RMSE (standard error of regression)?

A

√(SSR/df)
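The variance of the regression, the RMSE, and the SLR standard error of the slope build on one another. The sketch below computes all three for a small made-up sample, assuming a simple regression so that df = n - 2.

```python
from math import sqrt
from statistics import mean

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = mean(x), mean(y)
sst_x = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
ssr = sum(e ** 2 for e in residuals)         # sum of squared residuals

df = n - 2                                   # two estimated parameters in SLR
sigma2_hat = ssr / df                        # variance of the regression
rmse = sqrt(sigma2_hat)                      # standard error of the regression
se_b1 = rmse / sqrt(sst_x)                   # standard error of the slope
print(sigma2_hat, rmse, se_b1)
```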

27
Q

How to calculate residual?

A

yi - ^yi (actual y – predicted y)

28
Q

How to calculate SST?

A

∑(actual y – avg y)^2

29
Q

How to calculate SSE?

A

∑(predicted y – avg y)^2

30
Q

How to calculate SSR?

A

∑(actual y – predicted y)^2, or ∑u_hat^2

31
Q

How to calculate R squared?

A

SSE/SST or 1 – SSR/SST
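The three sums of squares and both R-squared formulas can be checked against each other on a small sample. The data here are made up, and the fit uses the simple-regression slope and intercept formulas.

```python
from statistics import mean

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]           # predicted values

sst = sum((yi - y_bar) ** 2 for yi in y)                 # total
sse = sum((fi - y_bar) ** 2 for fi in y_hat)             # explained
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))    # residual

r2_explained = sse / sst
r2_residual = 1 - ssr / sst
print(sst, sse, ssr, r2_explained, r2_residual)
```

For an OLS fit with an intercept, SST = SSE + SSR, which is why the two R-squared formulas give the same number.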

32
Q

How to calculate correlation given covariance and variances?

A

(Cov(X,Y))/√(Var(X)Var(Y))

33
Q

Formula for F statistic (SS form)?

A

[(SSRr - SSRur)/q] / [SSRur/(n-k-1)]

34
Q

Formula for F statistic (R-squared form)?

A

[(R^2ur - R^2r)/q] / [(1 - R^2ur)/(n-k-1)]
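The sum-of-squares form and the R-squared form of the F statistic give the same value when both models have the same dependent variable. The sketch below checks this with invented numbers (SST = 100, q = 2 restrictions, n - k - 1 = 20); the function names are hypothetical.

```python
def f_stat_ssr(ssr_r, ssr_ur, q, df_ur):
    """F statistic from restricted and unrestricted sums of squared residuals."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

def f_stat_r2(r2_r, r2_ur, q, df_ur):
    """F statistic from restricted and unrestricted R-squared values."""
    return ((r2_ur - r2_r) / q) / ((1 - r2_ur) / df_ur)

sst = 100.0
ssr_r, ssr_ur = 60.0, 50.0                   # restricted model fits worse
q, df_ur = 2, 20                             # df_ur is n - k - 1

f_from_ssr = f_stat_ssr(ssr_r, ssr_ur, q, df_ur)
f_from_r2 = f_stat_r2(1 - ssr_r / sst, 1 - ssr_ur / sst, q, df_ur)
print(f_from_ssr, f_from_r2)
```

Note the order of subtraction: the restricted SSR is the larger one, while the unrestricted R-squared is the larger one, so both numerators are non-negative.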

35
Q

Change in y with respect to change in x when you have a quadratic? (y = B0 + B1x + B2x^2 + u)

A

change in y/change in x ≈ B1 + 2*B2*x
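A short numeric example shows that the slope depends on where x is evaluated. The coefficients below (B1 = 0.3, B2 = -0.005) are hypothetical, and the comparison with an exact one-unit change illustrates that the derivative formula is an approximation.

```python
def marginal_effect(b1, b2, x):
    """Approximate change in y for a one-unit change in x, evaluated at x."""
    return b1 + 2 * b2 * x

def predicted(b0, b1, b2, x):
    """Fitted value from the quadratic, ignoring the error term."""
    return b0 + b1 * x + b2 * x ** 2

b0, b1, b2 = 1.0, 0.3, -0.005                # hypothetical estimates

approx_at_10 = marginal_effect(b1, b2, 10)                            # derivative at x = 10
exact_at_10 = predicted(b0, b1, b2, 11) - predicted(b0, b1, b2, 10)   # actual one-unit change
turning_point = -b1 / (2 * b2)               # x where the effect switches sign
print(approx_at_10, exact_at_10, turning_point)
```

With a positive B1 and a negative B2, the effect of x shrinks as x grows and turns negative past the turning point.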

36
Q

What are advantages of using LPM (OLS) for a binary DV?

A

Easy estimation and interpretation of coefficients that are reasonably good.

37
Q

Disadvantages of using LPM (OLS) for a binary dv?

A

Predicted probabilities may be greater than 1 or less than zero (marginal probability effects are sometimes logically impossible).

Partial effects of explanatory variables are constant.

LPM is heteroskedastic, so you have to use OLS with robust standard errors.

Non-normality of errors (they are binomial).
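The first disadvantage is easy to reproduce. This sketch fits a linear probability model by OLS to a tiny invented data set with a binary outcome and then looks at the fitted values at the smallest and largest x.

```python
from statistics import mean

x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 0, 1, 1, 1]                       # binary dependent variable

x_bar, y_bar = mean(x), mean(y)
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Fitted values are read as predicted probabilities in an LPM.
preds = [b0 + b1 * xi for xi in x]
out_of_range = [p for p in preds if p < 0 or p > 1]
print(preds)
print(out_of_range)
```

Here the fitted value at the smallest x is below zero and the one at the largest x is above one, while the slope `b1` is the same constant partial effect at every x.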

38
Q

True or false: Partial effects from logit models are nonlinear and depend on the level of x

A

True
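For a logit with one regressor, the partial effect of x on the predicted probability is B1 * p * (1 - p), where p is the predicted probability at that x. The sketch below uses hypothetical coefficients (B0 = 0, B1 = 1) to show the effect changing with the level of x.

```python
import math

def logistic(z):
    """Logistic CDF, mapping any real index into (0, 1)."""
    return 1 / (1 + math.exp(-z))

def partial_effect(b0, b1, x):
    """Change in P(y = 1) for a small change in x, evaluated at x."""
    p = logistic(b0 + b1 * x)
    return b1 * p * (1 - p)

b0, b1 = 0.0, 1.0                            # hypothetical logit coefficients

effect_at_0 = partial_effect(b0, b1, 0.0)    # p = 0.5, where the curve is steepest
effect_at_2 = partial_effect(b0, b1, 2.0)    # p is closer to 1, so the curve is flatter
print(effect_at_0, effect_at_2)
```

The same coefficient produces a smaller change in probability at x = 2 than at x = 0, which is the sense in which the partial effects are nonlinear.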

39
Q

What values do researchers typically hold other covariates at when generating marginal effects?

A

They hold all other covariates at their means (which might not make sense for dummy variables).

40
Q

What does MLE do?

A

Uses a likelihood function, which gives us the likelihood of the data given a set of proposed parameters, and chooses the parameters that make that likelihood as large as possible.

The computer keeps running iterations of the above until the improvements in the likelihood are small; this process is called “convergence”.
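The iterate-until-convergence idea can be sketched for the simplest possible case: a logit with only an intercept. The data, the starting value, the tolerance, and the use of Newton's method as the update rule are all assumptions for illustration; statistical software may use different optimizers.

```python
import math

y = [1, 1, 1, 0]                             # made-up binary outcomes

def log_likelihood(b0):
    """Log-likelihood of the data for an intercept-only logit."""
    p = 1 / (1 + math.exp(-b0))
    return sum(yi * math.log(p) + (1 - yi) * math.log(1 - p) for yi in y)

b0 = 0.0                                     # starting guess
ll = log_likelihood(b0)
for iteration in range(50):
    p = 1 / (1 + math.exp(-b0))
    score = sum(yi - p for yi in y)          # first derivative of the log-likelihood
    info = len(y) * p * (1 - p)              # minus the second derivative
    b0_new = b0 + score / info               # Newton step
    ll_new = log_likelihood(b0_new)
    improvement = ll_new - ll
    b0, ll = b0_new, ll_new
    if improvement < 1e-12:                  # convergence: likelihood stops improving
        break

p_hat = 1 / (1 + math.exp(-b0))
print(iteration + 1, b0, p_hat)
```

In this case the answer is known in advance: the fitted probability should equal the sample share of ones (0.75), so the loop can be checked against it.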

41
Q

What distribution does logit model use for hypothesis testing? What kind of stat will you get?

A

Use normal distribution; report z-statistics

42
Q

How can you measure goodness of fit from logit model?

A

Percent correctly predicted

Pseudo R-squared

Chi-square test (like the F-test, tests the null that all coefficients are zero)