F4 Mathematical Foundations / Probability Theory III Flashcards
What is the difference between the Bernoulli and the binomial distribution?
Bernoulli represents the outcome of a single trial.
Binomial represents the total number of successes in a fixed number of independent Bernoulli trials.
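The relationship can be sketched in a few lines of Python: a binomial draw is just the sum of independent Bernoulli draws. The function names, the success probability 0.3, the trial count 10 and the seed are all illustrative assumptions.

```python
import random

def bernoulli(p, rng):
    """One trial: returns 1 (success) with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0

def binomial(n, p, rng):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(bernoulli(p, rng) for _ in range(n))

rng = random.Random(0)           # fixed seed, chosen arbitrarily
single = bernoulli(0.3, rng)     # outcome of a single trial: 0 or 1
total = binomial(10, 0.3, rng)   # count of successes: between 0 and 10
print(single, total)
```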
How do you pronounce this: e^x
E raised to the power of x
ln(e^x) = ?
ln(e^1) = ?
ln(2.71) ≈ ?
ln(e^x) = x
ln(e^1) = 1
ln(2.71) ≈ 1 (because e ≈ 2.718)
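These identities can be checked numerically with Python's standard `math` module, where `math.log` is the natural logarithm. The value x = 3.5 is an arbitrary choice for illustration.

```python
import math

x = 3.5                                # arbitrary example value
log_of_exp = math.log(math.exp(x))     # ln(e^x), should give x back
log_of_e = math.log(math.e)            # ln(e^1), should be 1
log_of_271 = math.log(2.71)            # ln(2.71), slightly below 1

print(log_of_exp, log_of_e, log_of_271)
```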
Why can the log transformation be very helpful?
A log transformation compresses large values, so it is a way of dealing with extreme outliers in right-skewed variables such as income.
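A small sketch of the effect, using made-up income figures (the numbers are hypothetical, not real data). It compares how far the largest value sits from the median before and after taking logs.

```python
import math
import statistics

# Hypothetical incomes with one extreme outlier
incomes = [20_000, 30_000, 40_000, 50_000, 5_000_000]
log_incomes = [math.log(v) for v in incomes]

# How many times larger than the median is the largest value?
raw_gap = max(incomes) / statistics.median(incomes)
log_gap = max(log_incomes) / statistics.median(log_incomes)

print(raw_gap)   # the outlier dominates on the raw scale
print(log_gap)   # on the log scale it is much closer to the rest
```

On the raw scale the outlier is over a hundred times the median, while on the log scale it is less than twice the median.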
How does the exponential function look?
It is always positive, never zero or negative. It approaches zero as x goes to -∞ but never reaches it. The y-intercept is 1, because e^0 = 1.
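The shape can be checked by evaluating the function at a few points. The grid of x values is an arbitrary choice for illustration.

```python
import math

xs = [-50, -5, -1, 0, 1, 5]          # arbitrary grid of x values
ys = [math.exp(x) for x in xs]

for x, y in zip(xs, ys):
    print(x, y)

intercept = math.exp(0)              # value where the curve crosses the y-axis
```

Even at x = -50 the value is tiny but still above zero, and the values grow as x grows.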
What is the difference between a logit and probit model?
Probit model: Assumes the error term follows a standard normal distribution. The link function is the inverse of the standard normal CDF.
Logit model: Assumes the error term follows a standard logistic distribution. The link function is the logit (log-odds), which is the inverse of the standard logistic CDF.
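A minimal sketch of how the two models turn the same linear index into a probability. The coefficients b0 and b1 and the value of x are hypothetical, and the helper names are introduced here only for illustration.

```python
import math

def logistic_cdf(z):
    """Standard logistic CDF: used by the logit model."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Standard normal CDF: used by the probit model."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def logit(p):
    """Log-odds: the inverse of the logistic CDF."""
    return math.log(p / (1.0 - p))

b0, b1, x = -1.0, 0.5, 4.0      # hypothetical coefficients and data point
index = b0 + b1 * x             # linear index, here 1.0

p_logit = logistic_cdf(index)   # P(y = 1 | x) under the logit model
p_probit = normal_cdf(index)    # P(y = 1 | x) under the probit model
print(p_logit, p_probit)
```

Both functions map any real number into the interval (0, 1), which is why they suit binary outcomes.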
What is the standard logistic distribution used for? How does it compare to the normal distribution?
To model binary outcomes (it underlies the logit model).
It has slightly heavier tails than the normal distribution, which means it assigns more probability to extreme values.
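The heavier tails can be illustrated by comparing the probability of landing more than 4 units from zero. The cutoff of 4 is arbitrary. Because the standard logistic distribution has variance π^2/3 rather than 1, the sketch also includes a logistic rescaled to unit variance, so that the comparison is not driven by the larger spread alone.

```python
import math

def normal_two_sided_tail(c):
    """P(|Z| > c) for a standard normal Z."""
    return math.erfc(c / math.sqrt(2))

def logistic_two_sided_tail(c, scale=1.0):
    """P(|L| > c) for a logistic variable with mean 0 and the given scale."""
    return 2.0 / (1.0 + math.exp(c / scale))

cutoff = 4.0                                  # arbitrary cutoff
unit_var_scale = math.sqrt(3) / math.pi       # scale that gives variance 1

normal_tail = normal_two_sided_tail(cutoff)
logistic_tail_std = logistic_two_sided_tail(cutoff)
logistic_tail_unit = logistic_two_sided_tail(cutoff, unit_var_scale)

print(normal_tail, logistic_tail_std, logistic_tail_unit)
```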
Write up the simple (bivariate) linear regression and name its components
y_i=β_0+β_1 x_i+u_i
y_i: dependent variable. x_i: independent variable. β_0: intercept. β_1: slope coefficient. u_i: error term.
What do we call a coefficient when it is an estimate?
Beta-hat
What do we assume about the error term? Two things.
The expected value of the error term is zero (no systematic error). E(u)=0
u is mean-independent of x: we cannot predict the value of u from x. E(u|x)=E(u)=0. This must hold for every independent variable.
A common additional assumption is normality: u_i ∼ N(0,σ^2)
What is a synonym for the regression line?
The conditional expectation function
What are the estimates for beta-1 and beta-0?
Beta-1-hat: Cov(x,y)/Var(x)
Beta-0-hat: y-bar - beta-1-hat · x-bar
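The two formulas can be applied directly to a small data set. The x and y values below are made up for illustration, and the helper functions are written out by hand rather than taken from a library.

```python
def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (divides by n - 1)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

# Hypothetical data
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

beta1_hat = cov(xs, ys) / cov(xs, xs)          # Cov(x, y) / Var(x)
beta0_hat = mean(ys) - beta1_hat * mean(xs)    # y-bar - beta-1-hat * x-bar

print(beta1_hat, beta0_hat)
```

For this data the slope is about 1.97 and the intercept about 0.09.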
What does the following indicate? Draw it
Cov(x,y)>0:
Cov(x,y)<0:
Cov(x,y)=0:
Cov(x,y)>0: Indicates a positive linear relationship. As X increases, Y tends to increase.
Cov(x,y)<0: Indicates a negative linear relationship. As X increases, Y tends to decrease.
Cov(x,y)=0: Suggests no linear relationship between X and Y.
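The three cases can be reproduced with tiny made-up data sets. The last one is U-shaped: y depends perfectly on x, yet the covariance is zero, which is why zero covariance only rules out a linear relationship.

```python
def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (divides by n - 1)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]      # rises with x
y_down = [10, 8, 6, 4, 2]    # falls as x rises
y_u = [4, 1, 0, 1, 4]        # (x - 3)^2: related to x, but not linearly

cov_up = cov(x, y_up)
cov_down = cov(x, y_down)
cov_u = cov(x, y_u)
print(cov_up, cov_down, cov_u)
```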
What is the support of covariance and variance?
Covariance: -∞ to ∞
Variance: 0 to ∞
Which of the two is a linear operator: covariance or variance?
Covariance: Yes (it is linear in each argument)
Variance: No (Var(aX) = a^2 Var(X))
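A numerical check of both claims, using made-up vectors and constants: Cov(aX + bZ, Y) equals a·Cov(X,Y) + b·Cov(Z,Y), while Var(aX) equals a^2·Var(X) rather than a·Var(X).

```python
import math

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (divides by n - 1)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def var(a):
    return cov(a, a)

# Hypothetical data and constants
x = [1.0, 2.0, 3.0, 4.0]
z = [2.0, 0.0, 1.0, 5.0]
y = [1.0, 3.0, 2.0, 6.0]
a, b = 2.0, 3.0

combo = [a * xi + b * zi for xi, zi in zip(x, z)]
cov_of_combo = cov(combo, y)                  # Cov(aX + bZ, Y)
combo_of_covs = a * cov(x, y) + b * cov(z, y) # a Cov(X,Y) + b Cov(Z,Y)

var_scaled = var([a * xi for xi in x])        # Var(aX)
print(cov_of_combo, combo_of_covs, var_scaled, a ** 2 * var(x), a * var(x))
```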
What are the three sums of squares in a regression? And which of them are equal to each other when x and y are independent?
TSS (all variation in y) = ESS (explained variation) + SSR (unexplained variation)
Independent: TSS=SSR, because the model explains nothing (ESS=0)
What is the total sum of squares (TSS)? Draw it.
Total variation in y around the mean y-bar.
TSS = Σ (y_i - y-bar)^2
How much error is there if I guess the mean every time?
What is explained sum of squares (ESS)? Draw it.
What the model explains beyond guessing the mean (we want it to be high).
ESS = Σ (y_i-hat - y-bar)^2
What is sum of squared residuals (SSR)? Draw it.
Variation not explained by the model (the residuals).
SSR = Σ (y_i - y_i-hat)^2 = Σ u_i-hat^2
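The three sums of squares can be computed for a fitted line to confirm that they add up. The data are made up, and the slope and intercept are estimated with the usual Cov(x,y)/Var(x) formula.

```python
import math

def mean(values):
    return sum(values) / len(values)

# Hypothetical data
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

x_bar, y_bar = mean(xs), mean(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * x for x in xs]       # the y_i-hat values

tss = sum((y - y_bar) ** 2 for y in ys)            # total variation
ess = sum((f - y_bar) ** 2 for f in fitted)        # explained variation
ssr = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # squared residuals

print(tss, ess, ssr)
```

For this data TSS is about 38.9, of which about 38.81 is explained and about 0.09 is left in the residuals.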
What do we call the proportion of variation explained by the model? What is the equation?
R^2 = ESS/TSS
So, how much variation the model explains as a proportion of the total amount of variation.
It also tells us how closely the fitted values y_i-hat are correlated with y_i (R^2 is their squared correlation).
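A short sketch that computes R^2 as ESS/TSS and compares it with the squared correlation between y_i and the fitted values. The data are made up, and the equality shown holds for a least-squares line that includes an intercept.

```python
import math

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (divides by n - 1)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

# Hypothetical data
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

slope = cov(xs, ys) / cov(xs, xs)
intercept = mean(ys) - slope * mean(xs)
fitted = [intercept + slope * x for x in xs]

tss = sum((y - mean(ys)) ** 2 for y in ys)
ess = sum((f - mean(ys)) ** 2 for f in fitted)
r_squared = ess / tss

# Correlation between the observed and the fitted values
corr = cov(ys, fitted) / math.sqrt(cov(ys, ys) * cov(fitted, fitted))
print(r_squared, corr ** 2)
```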
What do we call the estimate of the ‘true’ theoretical error term?
The estimated error term, or residual (u_i-hat)