WWWW Flashcards
F-Distribution + F-Stat
Distribution:
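- Ratio of two independent Chi Squared variables, each divided by its degrees of freedom
- Want to test whether one group is significantly different from another
Group A, B and C put on 10 mg, 5 mg and placebo.
- Mean Square Between (MSB) = variation of the group means around the overall mean, divided by (number of groups - 1)
- Mean Square Error (MSE) = variation within the groups, pooled over all groups and divided by (number of observations - number of groups)
F-stat = MSB/MSE - A large F-stat might indicate that the population means are not equal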
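A minimal sketch of the F-stat calculation for the three-group example. The outcome numbers are made up for illustration, and `f_statistic` is a hypothetical helper name, not a library function.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F-statistic for a list of groups (lists of numbers)."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-group sum of squares: group means vs. the grand mean
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations vs. their own group mean
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)

    msb = ssb / (k - 1)   # Mean Square Between
    mse = ssw / (n - k)   # Mean Square Error (within groups)
    return msb / mse

# Hypothetical outcome data (assumed, not from the notes)
group_a = [8.0, 9.0, 10.0]   # 10 mg
group_b = [6.0, 7.0, 8.0]    # 5 mg
group_c = [4.0, 5.0, 6.0]    # placebo

f_stat = f_statistic([group_a, group_b, group_c])
print(f_stat)  # MSB = 12, MSE = 1, so F = 12.0
```

The F-stat is then compared with a critical value from the F-distribution with (k - 1) and (n - k) degrees of freedom.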
goodness of fit
Explained Sum of Squares (ESS)
- Sum of squared differences between the predicted values and the mean of the dependent variable
Sum of Squared Residuals (SSR)
- Sum of squared differences between the observed and predicted values
Total Sum of Squares (TSS)
- Sum of squared differences between the observed dependent variable and its mean
- TSS = SSR + ESS
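A small sketch that checks TSS = SSR + ESS for a one-regressor OLS fit with an intercept. The data points are invented, and `sums_of_squares` is a hypothetical helper name.

```python
def sums_of_squares(x, y):
    """Fit y = a + b*x by OLS and return (ESS, SSR, TSS)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # OLS slope and intercept for a single regressor
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    y_hat = [intercept + slope * xi for xi in x]

    ess = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained
    ssr = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # residual
    tss = sum((yi - y_bar) ** 2 for yi in y)                # total
    return ess, ssr, tss

# Hypothetical data (assumed for illustration)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ess, ssr, tss = sums_of_squares(x, y)
r_squared = ess / tss   # share of the total variation that is explained
print(abs(tss - (ess + ssr)) < 1e-9)
```

The decomposition relies on the regression including an intercept. The ratio ESS/TSS is the usual R-squared measure of goodness of fit.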
unbiased estimators conditions
linear in parameters,
random sampling,
sample variation in explanatory variable,
zero conditional mean
significance level
The significance level is the probability of rejecting the null hypothesis when it is in fact true. A 5% significance level means that we have a 5% probability of rejecting the null when it is true.
significance probability
The probability of drawing a statistic at least as adverse to the null hypothesis as the one you computed in your sample, assuming that the null hypothesis is true.
What is meant by the size of a test?
In hypothesis testing, the size of a test is the probability of committing a Type I error, that is, of incorrectly rejecting the null hypothesis when it is true.
p-value
- The p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed. It is not the probability of a Type 1 error; that is the significance level.
- We usually use a 5% significance level, meaning that if a medicine is tested and it doesn’t have a real effect, the test will still tell us it has an effect about 1 in 20 times.
critical value
What are degrees of freedom?
- The number of independent values that are free to vary in a data set
Intuitively:
A data set of four numbers. Three of the values are 4, 4, 4, and the average of the data is 4.
This must mean that the last number also has to be 4; it is not free to vary. Given the mean, only three of the four values can vary, so there are 3 degrees of freedom.
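The arithmetic behind the intuition, as a tiny sketch (`last_value` is a hypothetical helper name): once the mean and all but one value are fixed, the remaining value is pinned down.

```python
def last_value(known_values, mean):
    """Return the one remaining value implied by the mean and the other values."""
    n = len(known_values) + 1          # total number of values in the data set
    return n * mean - sum(known_values)

print(last_value([4, 4, 4], 4))  # 4 * 4 - 12 = 4
```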
What happens to a confidence interval when the sample gets bigger and bigger?
It becomes narrower and narrower: as the sample size increases, the standard error falls and the estimate becomes more precise.
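A sketch of the effect of sample size, assuming a 95% interval for a mean with a known standard deviation (normal approximation, z = 1.96). The value sigma = 10 is made up.

```python
import math

def ci_halfwidth(sigma, n, z=1.96):
    """Half-width of a confidence interval for a mean: z * sigma / sqrt(n)."""
    return z * sigma / math.sqrt(n)

# Hypothetical standard deviation; the sample size quadruples at each step
widths = [2 * ci_halfwidth(sigma=10.0, n=n) for n in (25, 100, 400)]
print(widths)  # each quadrupling of n halves the interval width
```

Because the width shrinks with the square root of n, four times as many observations are needed to halve the interval.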
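How do we interpret Binary dependent variable regression?
interpreted as a conditional probability function
depends on which model
- LPM: easy, the coefficient is the change in the probability that Y = 1 for a one-unit change in the regressor
- Probit: standard normal cumulative distribution function
- Logit: logistic cumulative distribution function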
interpret coeff probit logit
The error term in LPM is always
Heteroskedastic, so heteroskedasticity-robust standard errors should be used
Why is it called LPM
Because the probability that Y = 1 is a linear function of the regressors
What is a cumulative distribution function
It is the probability that the variable takes a value less than or equal to x
What do Probit and Logit regression allow for that LPM doesn't
Probit and Logit models allow for a non-linear relationship between the regressors and the probability that Y = 1.
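A sketch comparing the three models' predicted probabilities. All coefficients are hypothetical and chosen only to show that the LPM line can leave the 0 to 1 range, while Probit and Logit pass a linear index through a cumulative distribution function and stay inside it.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, used by Probit."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    """Logistic CDF, used by Logit."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients (assumed for illustration only)
LPM_B0, LPM_B1 = 0.1, 0.2     # LPM: P(Y=1|x) = b0 + b1*x
IDX_B0, IDX_B1 = -2.0, 0.5    # Probit/Logit index: z = b0 + b1*x

def p_lpm(x):
    return LPM_B0 + LPM_B1 * x

def p_probit(x):
    return normal_cdf(IDX_B0 + IDX_B1 * x)

def p_logit(x):
    return logistic_cdf(IDX_B0 + IDX_B1 * x)

for x in (0, 4, 6):
    print(x, round(p_lpm(x), 3), round(p_probit(x), 3), round(p_logit(x), 3))
```

At x = 6 the LPM prediction exceeds 1, which is not a valid probability. In Probit and Logit the effect of a one-unit change in x depends on where on the curve you start.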
What is the z value
The estimated coefficient (e.g. the intercept) divided by its standard error
- the number of standard errors the estimate is away from 0 on a standard normal curve (Wald test)
Rule of thumb: |z| should be over 2 and p under 0.05 for H0 to be rejected
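A sketch of the z value and its two-sided p-value under a standard normal distribution. The estimate and standard error are made-up numbers, and `z_and_p` is a hypothetical helper name; the same calculation applies to any estimated coefficient.

```python
import math

def z_and_p(estimate, std_error):
    """Return the z value and the two-sided p-value under N(0, 1)."""
    z = estimate / std_error
    # P(|Z| >= |z|) for a standard normal Z
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical estimate and standard error (assumed for illustration)
z, p = z_and_p(estimate=0.50, std_error=0.20)
print(z, round(p, 4))  # z = 2.5, p is roughly 0.012
```

Here |z| is above 2 and p is below 0.05, so H0 (the true value is 0) would be rejected at the 5% level.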
Maximum Likelihood: intuition for the mean
- Imagine that you have a line of observed values.
- Then imagine that you try every point on that line as a candidate for the mean and compute the likelihood of observing the data
- When all candidates are checked, you pick the one that maximizes the likelihood
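The intuition above can be sketched as a grid search, assuming the data are normally distributed with a known standard deviation of 1. The observations and the grid are made up; real software uses numerical optimization rather than checking every point.

```python
import math

def log_likelihood(mu, data, sigma=1.0):
    """Log-likelihood of the data under a normal distribution N(mu, sigma^2)."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * sigma ** 2)
        - (x - mu) ** 2 / (2.0 * sigma ** 2)
        for x in data
    )

# Hypothetical observations (assumed for illustration)
data = [4.0, 5.0, 6.0, 5.5, 4.5]

# Candidate means along the line: 0.00, 0.01, ..., 10.00
candidates = [i / 100 for i in range(0, 1001)]

# Pick the candidate under which the observed data are most likely
mu_hat = max(candidates, key=lambda mu: log_likelihood(mu, data))
print(mu_hat)  # 5.0, which equals the sample mean
```

For the normal distribution, the maximum likelihood estimate of the mean coincides with the sample average.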
logistic distribution is what
continuous distribution
likelihood in statistics means
how probable the observed data are under given parameter values of a distribution (e.g. the mean or std); maximizing it finds the optimal values
How do we find the best regression line
maximum likelihood
if p-value is < 0.05
probit logit
there is a statistically significant association between the regressor and the response (dependent) variable
Unbiasedness
The expected value of the estimator equals the true population value, so on average across samples it gives the real result