Prelim Exam Prep Deck Flashcards

1
Q

Sandwich (robust) estimator for variance is given by ….

A
2
Q

Gibbs sampler is given by

A
3
Q

What’s consistency?

A
4
Q

What’s the Fisher information matrix?

A
5
Q

M-estimator

A

Any solution $\hat\theta$ to $\sum_{i=1}^{n} \psi(Y_i, \theta) = 0$.

$\psi$ does not depend on $i$ or $n$.

The true parameter value $\theta_0$ is characterized by $E\,\psi(Y_i, \theta_0) = 0$.

The asymptotic distribution (derived through a Taylor expansion) is of the form

$\hat\theta \sim AN(\theta_0, V(\theta_0)/n)$, where $V = A^{-1} B (A^{-1})^{\top}$, with $A = E[-\dot\psi(Y_i, \theta_0)]$ and $B = E[\psi(Y_i, \theta_0)\,\psi(Y_i, \theta_0)^{\top}]$.

See the sandwich variance estimator card for more details.
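
A minimal numerical sketch, assuming the simplest case $\psi(y, \theta) = y - \theta$ (so $\hat\theta$ is the sample mean); the data and variable names are illustrative, not from the card:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.exponential(scale=2.0, size=500)  # any i.i.d. sample works

theta_hat = y.mean()               # solves sum_i psi(y_i, theta) = 0

psi = y - theta_hat                # psi evaluated at theta_hat
A = 1.0                            # A = E[-dpsi/dtheta] = 1 for this psi
B = np.mean(psi ** 2)              # B = E[psi^2], estimated empirically

V = B / (A * A)                    # V = A^{-1} B A^{-1} (scalar case)
se = np.sqrt(V / len(y))           # asymptotic standard error of theta_hat
print(theta_hat, se)               # se matches y.std() / sqrt(n) here
```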

6
Q

What’s the gamma-exponential model?

A

prior: $p(\theta) = \mathrm{Gamma}(\alpha, \beta)$

&

likelihood: $p(y \mid \theta) = \mathrm{Exponential}(\theta)$

$\Rightarrow$

posterior: $p(\theta \mid y) = \mathrm{Gamma}(\alpha + 1, \beta + y)$
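
A quick conjugacy check (standard algebra; the $n$-observation case extends what the card states):

```latex
p(\theta \mid y) \propto p(\theta)\, p(y \mid \theta)
  \propto \theta^{\alpha - 1} e^{-\beta\theta} \cdot \theta e^{-\theta y}
  = \theta^{(\alpha + 1) - 1} e^{-(\beta + y)\theta},
% the kernel of Gamma(alpha + 1, beta + y). With n observations,
% the posterior is Gamma(alpha + n, beta + sum_i y_i).
```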

7
Q

What’s the Poisson-Gamma Model?

A
8
Q

What are Jeffreys priors?

A
9
Q

What’s the normal-normal model?

A
10
Q

What’s the beta-binomial model?

A
11
Q

What’s the Metropolis Algorithm?

A
12
Q

What’s the Metropolis-Hastings algorithm?

A
13
Q

What’s importance sampling?

A
14
Q

What’s the normal pdf?

A
15
Q

What’s the gamma pdf?

A
16
Q

What’s the gamma mean?

A
17
Q

What’s the weak law of large numbers?

A
18
Q

What’s the CLT?

A
19
Q

What’s asymptotic normality of the MLE?

A
20
Q

What’s the Gauss-Markov Theorem?

A

The Gauss-Markov theorem says that, under certain conditions, the ordinary least squares (OLS) estimator of the coefficients of a linear regression model is the best linear unbiased estimator (BLUE): among all estimators that are unbiased and linear in the observed responses, it has the smallest variance.
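
A formal statement in standard notation (the notation is assumed, not from the card):

```latex
y = X\beta + \varepsilon, \quad E[\varepsilon \mid X] = 0, \quad
\operatorname{Var}(\varepsilon \mid X) = \sigma^2 I_n, \quad X \text{ of full column rank}
\;\Longrightarrow\;
\hat\beta_{\mathrm{OLS}} = (X^\top X)^{-1} X^\top y \text{ is BLUE, i.e. }
\operatorname{Var}(\tilde\beta) - \operatorname{Var}(\hat\beta_{\mathrm{OLS}}) \succeq 0
% (positive semidefinite) for every linear unbiased estimator \tilde\beta.
```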

21
Q

What are the conditions required for the Gauss-Markov Theorem to apply?

A

Independent/uncorrelated errors are the key. The full set of conditions: linearity in parameters, full column rank of X, zero conditional mean of the errors, homoscedasticity, and uncorrelated errors (independence is sufficient but not necessary).

22
Q

What’s the OLS estimator?

A
23
Q

What’s the sum of an infinite power series?

A
24
Q

What’s the sum of an infinite geometric series?

A
25
Q

What’s Small o: convergence in probability?

A
26
Q

What’s Big O: stochastic boundedness?

A
27
Q

What’s the EM algorithm?

A
28
Q

What’s an orthogonal projection?

A

The orthogonal projection of a vector $s$ onto a given subspace $R$ is the vector $r \in R$ that is closest to $s$.
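
A small numpy sketch, assuming $R$ is the column space of a basis matrix $X$ (names and data are illustrative, not from the card):

```python
import numpy as np

# basis for the subspace R as the columns of X
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
s = np.array([1.0, 3.0, 2.0])

# r = X (X'X)^{-1} X' s is the point of R closest to s
P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection matrix (idempotent: P @ P = P)
r = P @ s

print(r)
print(np.allclose(X.T @ (s - r), 0))   # the residual s - r is orthogonal to R
```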

29
Q

What does it mean for a projection to be idempotent?

A
30
Q

What’s Newton’s Method?

A
31
Q

When/how might Monte Carlo integration be useful for likelihood estimation?

A
32
Q

What are M-estimators and how are they different from MLE?

A
33
Q

What’s the Poisson PMF?

A
34
Q

When might consistency or the asymptotic approximation of the MLE break down?

A

If the number of parameters increases with the sample size, the usual theorems about consistency and asymptotic normality of MLEs don't apply (see the classic counterexample below).
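
The standard counterexample, supplied here as a supplement (the Neyman-Scott problem; not from the card itself):

```latex
X_{ij} \sim N(\mu_i, \sigma^2), \quad j = 1, 2, \; i = 1, \dots, n,
% has n + 1 parameters, growing with the sample size; the MLE of the variance satisfies
\hat\sigma^2_{\mathrm{MLE}} \xrightarrow{\;p\;} \frac{\sigma^2}{2},
% so it is inconsistent even as n -> infinity.
```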

35
Q

When might we use the delta method?

A

The delta method is a theorem that can be used to derive the distribution of a function of an asymptotically normal variable.

It is often used to derive standard errors and confidence intervals for functions of parameters whose estimators are asymptotically normal.

It can be useful in contexts where asymptotic properties of the MLE break down, e.g. when the number of parameters increases with the sample size and the Fisher information consequently doesn't approximate the asymptotic variance.
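
The univariate statement, with a worked log-odds example (standard material, added for concreteness):

```latex
\sqrt{n}(\hat\theta - \theta) \xrightarrow{\;d\;} N(0, \sigma^2)
\;\Longrightarrow\;
\sqrt{n}\bigl(g(\hat\theta) - g(\theta)\bigr) \xrightarrow{\;d\;}
N\bigl(0, [g'(\theta)]^2 \sigma^2\bigr), \qquad g'(\theta) \neq 0.
% Example: \sqrt{n}(\hat p - p) -> N(0, p(1 - p)) for a binomial proportion, so for
% the log-odds g(p) = log(p / (1 - p)), g'(p) = 1 / (p(1 - p)) and
% \sqrt{n}(g(\hat p) - g(p)) -> N(0, 1 / (p(1 - p))).
```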

36
Q

What are the assumptions for OLS regression?

A
  1. Linearity in parameters (variable transformation can be used to meet this assumption but may make interpretation difficult).
  2. Full column rank of X: if X is rank deficient (not full rank), X′X is singular (zero determinant) and its inverse does not exist (see the numerical sketch after this list).
    - If perfect multicollinearity exists between any two or more regressors, X does not have k linearly independent columns and is therefore not full rank.
    - If n < k, the rank of X is less than k and X does not have full column rank. With k regressors, we therefore need at least k observations to estimate β by OLS.
    - For simple linear regression, if all values of the independent variable x equal the same constant c, then X has two columns: one of all 1's (for the intercept term) and one of all c's. In this case too, the matrix is not full rank, the inverse does not exist, and β cannot be determined.
  3. Zero conditional mean of the errors.

Note: Conditions 1-3 make OLS unbiased.

For OLS to be BLUE by the Gauss-Markov theorem, add:

  4. Homoscedasticity: the regressors should carry no useful information about the magnitude of the errors.
  5. Nonautocorrelation: errors are unrelated to other errors.

Additional assumption:

  6. Normality
  - When all the other assumptions are met along with the normality assumption, OLS estimates coincide with the maximum likelihood estimates, giving us certain useful properties to work with.
  - The coefficient estimates βᵢ can be shown to be linear functions of the errors ϵᵢ. A very cool property to keep in mind: a linear function of a normally distributed random variable is also normally distributed. Assuming the ϵᵢ are normally distributed therefore gives normally distributed βᵢ estimates as well. This makes it easier to calculate confidence intervals and p-values for the estimated β coefficients (commonly seen in R and Python model summaries). If the error normality condition is not satisfied, then all the confidence intervals and p-values of individual t-tests for the β coefficients are unreliable.

Source: https://towardsdatascience.com/assumptions-in-ols-regression-why-do-they-matter-9501c800787d#:~:text=What%20does%20it%20mean%20for,linearly%20independent%20of%20each%20other.
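
A short numerical sketch of the rank condition and the OLS normal equations (data and names are illustrative, not from the article):

```python
import numpy as np

# A constant-valued regressor makes the [1, c] columns linearly dependent:
n = 5
x = np.full(n, 3.0)                   # regressor stuck at the constant c = 3
X = np.column_stack([np.ones(n), x])

print(np.linalg.matrix_rank(X))       # 1 < k = 2: not full column rank
print(np.linalg.det(X.T @ X))         # 0.0, so (X'X)^{-1} does not exist

# With a non-degenerate regressor, OLS via the normal equations works:
x2 = np.arange(n, dtype=float)
X2 = np.column_stack([np.ones(n), x2])
y = 1.0 + 2.0 * x2 + np.random.default_rng(0).normal(size=n)
beta_hat = np.linalg.solve(X2.T @ X2, X2.T @ y)
print(beta_hat)                       # close to (1, 2)
```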

37
Q

What is iteratively reweighted least squares (IRLS)?

A

The method of iteratively reweighted least squares (IRLS) is used to solve certain optimization problems with objective functions of the form of a p-norm, and can be used to estimate the β coefficients of a generalized linear model (see the sketch below).
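
A minimal sketch of IRLS for one common GLM case, logistic regression via Fisher scoring (function and variable names are illustrative, not from the card):

```python
import numpy as np

def irls_logistic(X, y, n_iter=25):
    """Estimate logistic-regression coefficients by IRLS."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))           # mean under the logit link
        W = np.clip(mu * (1.0 - mu), 1e-10, None) # working weights
        z = eta + (y - mu) / W                    # working response
        XtW = X.T * W                             # X' diag(W)
        beta = np.linalg.solve(XtW @ X, XtW @ z)  # weighted least squares step
    return beta

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
true_beta = np.array([-0.5, 1.5])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))
print(irls_logistic(X, y))                        # roughly recovers true_beta
```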

38
Q

What’s the invariance property of the MLE?

A
39
Q

What’s Bayes Rule?

A
40
Q

What’s the definition of correlation?

A
41
Q

What’s standard Brownian motion?

A
42
Q

What’s the variance of the sum of two random variables?

A
43
Q

What’s a Poisson process?

A
44
Q

What’s the exponential distribution?

A

Note: the exponential distribution is memoryless.
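
The card's answer is otherwise blank; as a supplement, the standard pdf (rate parameterization, assumed here) and a one-line check of memorylessness:

```latex
f(x) = \lambda e^{-\lambda x}, \; x \ge 0; \qquad
P(X > s + t \mid X > s)
  = \frac{P(X > s + t)}{P(X > s)}
  = \frac{e^{-\lambda(s + t)}}{e^{-\lambda s}}
  = e^{-\lambda t} = P(X > t).
```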

45
Q

What’s the connection between the exponential distribution and the gamma?

A

The sum of $n$ i.i.d. $\mathrm{Exponential}(\lambda)$ random variables is $\mathrm{Gamma}(n, \lambda)$ (rate parameterization).
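
A one-line MGF justification (standard algebra, added for completeness):

```latex
M_{X_1}(t) = \frac{\lambda}{\lambda - t} \; (t < \lambda)
\;\Longrightarrow\;
M_{\sum_{i=1}^{n} X_i}(t) = \Bigl(\frac{\lambda}{\lambda - t}\Bigr)^{n},
% which is the MGF of Gamma(n, lambda), so \sum_i X_i ~ Gamma(n, lambda).
```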

46
Q

What’s the general form for the F-statistic?

A
47
Q

What’s the Kronecker product of A (an m × n matrix) and B (a p × q matrix)?

A
48
Q

What’s the Bernoulli pmf?

A

$f(x) = p^{x}(1 - p)^{1 - x}$ for $x \in \{0, 1\}$.

$E[X] = p$ and $\mathrm{Var}[X] = p(1 - p)$.

49
Q

What’s the binomial pdf?

A

$f(x) = \binom{n}{x} p^{x}(1 - p)^{n - x}$ for $x = 0, 1, \dots, n$; $E[X] = np$ and $\mathrm{Var}[X] = np(1 - p)$.

50
Q

What’s variance in terms of expectation?

A

$\mathrm{Var}(X) = E[X^2] - (E[X])^2$. Note that when $E(X) = 0$, the variance reduces to $E[X^2]$.

51
Q

What’s the law of total expectation?

A
52
Q

What is E( Beta(a, b) )?

A
53
Q

What’s a useful gamma function identity?

A
54
Q

What’s the Bayes Factor for model comparison?

A
55
Q

What’s the beta-binomial model’s marginal?

A
58
Q

a) time to state change?

A
59
Q

b) generator matrix?

A
60
Q

c) how might we use the generator matrix to get the stationary probabilities?

A
61
Q

What’s the difference between Bayesian and frequentist residuals in the context of Linear Models?

A
62
Q

What’s the equivalent Bayesian setup of a frequentist linear regression?

A
63
Q

What’s the equivalent Bayesian setup of a frequentist penalized regression?

A
64
Q

What’s the beta function in terms of the gamma function?

A