Statistics Flashcards
Probability vs Likelihood
Probability:
How likely an outcome is, given a distribution whose parameters are fixed and known
ex. P(height > 170 | mu = 170, sigma = 3.5)
Likelihood:
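How well a given set of distribution parameters explains data that has already been observed
ex. L(mu = 170, sigma = 3.5 | height > 170). This is a function of the parameters, not a probability of them.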
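A minimal sketch of the two viewpoints, using the standard library's NormalDist. The observed height of 172 and the alternative mean of 180 are made-up values for illustration only.

```python
from statistics import NormalDist

# Probability: the parameters are fixed and we ask about the data.
heights = NormalDist(mu=170, sigma=3.5)
p_taller = 1 - heights.cdf(170)  # P(height > 170 | mu = 170, sigma = 3.5)

# Likelihood: the observation is fixed and we compare candidate parameters.
observed = 172  # hypothetical observed height
lik_170 = NormalDist(mu=170, sigma=3.5).pdf(observed)
lik_180 = NormalDist(mu=180, sigma=3.5).pdf(observed)

print(p_taller)           # 0.5, because 170 is the mean
print(lik_170 > lik_180)  # True: mu = 170 explains 172 better than mu = 180
```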
What are some examples of non-normal distributions?
Uniform: All values equally likely
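Exponential: Waiting time between events that occur at a constant average rate (the number of coin flips before the first heads is the discrete analogue, the Geometric distribution)
Skewed: Asymmetric, with a longer tail on one side (ex. log-normal, incomes)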
What is a p-value?
P-value:
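The probability of observing a result at least as extreme as the one you observed, assuming the null hypothesis is true
Statistical significance:
When the p-value falls below the chosen threshold (alpha), so the result would be unlikely if only random chance were at work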
How many ways can you split 10 players into two teams to play 5-a-side football? (Order of teams does not matter)
= (10 choose 5) / 2
= 252 / 2
= 126
252 if order matters
126 if order does not matter
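The count can be checked by brute force. This sketch enumerates every choice of five players and stores each split as an unordered pair of teams, so swapping the two teams does not create a new entry.

```python
from itertools import combinations
from math import comb

players = range(10)

# Ordered count: choosing team A also fixes team B.
ordered = comb(10, 5)

# Unordered count: store each split as a set of two teams,
# so (A, B) and (B, A) collapse into one entry.
splits = set()
for team_a in combinations(players, 5):
    team_b = tuple(p for p in players if p not in team_a)
    splits.add(frozenset([team_a, team_b]))

print(ordered)      # 252
print(len(splits))  # 126
```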
What is Bayes' theorem? Show formula.
Formula for computing the conditional probability P(A|B) from the reverse conditional P(B|A) and the individual probabilities P(A) and P(B).
P(A|B) = P(B|A)P(A) / P(B)
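A worked example of the formula, with A = "has the disease" and B = "tests positive". The prevalence, sensitivity and false positive rate are made-up numbers chosen only to illustrate the arithmetic.

```python
# Hypothetical inputs
p_disease = 0.01             # P(A): prior probability of disease
p_pos_given_disease = 0.95   # P(B|A): test sensitivity
p_pos_given_healthy = 0.05   # false positive rate

# P(B): total probability of a positive test
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)
posterior = p_pos_given_disease * p_disease / p_pos

print(round(posterior, 3))  # 0.161
```

Even with a fairly accurate test, the posterior stays low because the disease is rare to begin with.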
Steps of significance testing
- Define Null and Alternative Hypothesis
- Set threshold (alpha)
- Calculate p-value
- If p-value < threshold, reject the null (the result is statistically significant); otherwise, fail to reject it
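The steps above can be sketched with an exact two-sided binomial test for a coin. The observation (65 heads in 100 flips) is hypothetical, and the test itself is just one simple choice among many.

```python
from math import comb

# Step 1: H0: the coin is fair (p = 0.5). H1: the coin is not fair.
n_flips, n_heads = 100, 65  # hypothetical observation

# Step 2: set the threshold.
alpha = 0.05

# Step 3: p-value = P(result at least as extreme as observed | H0).
# Under H0 the distribution is symmetric around n/2, so "at least as
# extreme" means at least as far from n/2 as the observed count.
distance = abs(n_heads - n_flips / 2)
p_value = sum(
    comb(n_flips, k) * 0.5**n_flips
    for k in range(n_flips + 1)
    if abs(k - n_flips / 2) >= distance
)

# Step 4: compare with the threshold.
reject_null = p_value < alpha
print(reject_null)  # True
```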
Combination vs. Permutation? Show formulas.
Combination. Choose m items from a set of n values, order does not matter.
nCm = n! / (m!(n-m)!)
Permutation. Choose m items from a set of n values, order matters.
nPm = n! / (n-m)!
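A quick check of both formulas against Python's built-in math.comb and math.perm, using n = 5 and m = 2 as an arbitrary example.

```python
from math import comb, factorial, perm

n, m = 5, 2  # arbitrary example values

# Formulas written out with factorials
combinations_by_formula = factorial(n) // (factorial(m) * factorial(n - m))
permutations_by_formula = factorial(n) // factorial(n - m)

print(combinations_by_formula, comb(n, m))  # 10 10
print(permutations_by_formula, perm(n, m))  # 20 20
```

Each combination of m items can be ordered in m! ways, which is why the permutation count is m! times larger.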
What is the central limit theorem?
If you repeatedly draw sufficiently large random samples from a population (with finite variance) and compute the sample mean each time, the distribution of those sample means approaches a normal distribution, whatever the shape of the population.
ML relationship:
- Confidence intervals, comparing models to one another (t-tests, ANOVA)
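A small simulation sketch of the theorem. The population is exponential (strongly skewed), and the sample size, number of repeats and seed are arbitrary choices. If the theorem holds, the sample means should center on the population mean of 1, have a spread near 1/sqrt(50), and put roughly 68% of their mass within one standard deviation.

```python
import random
from statistics import fmean, stdev

random.seed(0)  # fixed seed so the run is repeatable

sample_size = 50
n_repeats = 2000

# Population: exponential with mean 1 and standard deviation 1.
sample_means = [
    fmean([random.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(n_repeats)
]

center = fmean(sample_means)
spread = stdev(sample_means)

# Share of sample means within one standard deviation of the center;
# for a normal distribution this is about 0.68.
within_one_sd = sum(
    abs(m - center) <= spread for m in sample_means
) / n_repeats

print(center, spread, within_one_sd)
```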
What are Z-tests, T-tests and ANOVA tests? When would you use each one?
Tests for statistical significance.
T-test (almost always this):
- Roughly normal data where we don't know the population standard deviation and must estimate it from the sample
- Small random sample (rule of thumb: < 30 points)
Z-test:
- If you know the population standard deviation
- Large sample (> 100 points; a common rule of thumb is > 30)
- This assumes every sample taken from the same population shares that known variance
ANOVA (analysis of variance)
- Compares the means of more than 2 groups
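A sketch of how the one-sample t and z statistics differ in practice, using only the standard library. The sample, the null mean and the "known" sigma are all hypothetical. The two statistics differ only in which standard deviation goes in the denominator.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

sample = [171, 168, 174, 169, 173, 172, 170, 175]  # hypothetical heights (cm)
mu0 = 170          # mean claimed by the null hypothesis
known_sigma = 3.5  # assumed known; only available in the z-test setting

n = len(sample)
mean = fmean(sample)

# t-test: standard deviation estimated from the sample itself
t_stat = (mean - mu0) / (stdev(sample) / sqrt(n))

# z-test: population standard deviation assumed known
z_stat = (mean - mu0) / (known_sigma / sqrt(n))
z_p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))  # two-sided

print(t_stat, z_stat, z_p_value)
```

The standard library has no t distribution, so the t statistic is left without a p-value here. In practice a library such as SciPy (for example scipy.stats.ttest_1samp, and scipy.stats.f_oneway for ANOVA) would usually be used instead.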
What are the Harmonic and Geometric means?
Harmonic: Average of rates (speeds).
= n / ((1/x1) + (1/x2) + ... + (1/xn))
Geometric: Average of growth rates (bacteria growth, return on investment).
= nthroot(x1 * x2 * ... * xn)
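Both means are available in Python's statistics module (3.8+). The speeds and growth factors below are made-up examples.

```python
from statistics import geometric_mean, harmonic_mean

# Harmonic: drive the same distance at 40 km/h and then at 60 km/h.
speeds = [40, 60]
average_speed = harmonic_mean(speeds)
print(average_speed)  # 48.0, not the arithmetic mean of 50

# Geometric: an investment grows by a factor of 2 one year and 8 the next.
growth_factors = [2, 8]
average_growth = geometric_mean(growth_factors)
print(round(average_growth, 6))  # 4.0, since 4 * 4 == 2 * 8
```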
What is statistical significance? What is a p-value?
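Statistical Significance:
- When a result would be unlikely under the null hypothesis (p-value below the chosen threshold)
P-value
- The probability of observing a result at least as extreme as the one seen, assuming the null hypothesis is true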
What is the difference between PDF and CDF?
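PDF - The probability density of a continuous random variable at x. The probability of exactly x is 0; probabilities come from the area under the PDF over an interval. (For a discrete variable, the PMF gives the probability of exactly x.)
CDF - The probability that a random variable is less than or equal to x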
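A short illustration with the standard library's NormalDist, reusing a normal distribution with mean 170 and standard deviation 3.5 as an assumed example. It shows that a density can exceed 1, while interval probabilities come from differences of the CDF.

```python
from statistics import NormalDist

heights = NormalDist(mu=170, sigma=3.5)

density_at_mean = heights.pdf(170)  # a density, not a probability
p_at_most_170 = heights.cdf(170)    # P(X <= 170)

# P(166.5 < X <= 173.5): one standard deviation either side of the mean
p_between = heights.cdf(173.5) - heights.cdf(166.5)

# A very narrow distribution has a density above 1 at its peak,
# which would be impossible for a probability.
narrow = NormalDist(mu=0, sigma=0.1)
print(narrow.pdf(0) > 1)  # True
```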
What is entropy? How does this relate to ML?
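Entropy is the amount of uncertainty in a random process. Lower entropy means the outcome is more predictable.
Log-loss, categorical cross-entropy and binary cross-entropy all measure the cross-entropy between the true labels and the predicted probabilities. We can use these to measure how well our model makes predictions (lower is better).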
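A minimal sketch of both quantities. The function names and the example distributions are illustrative, not taken from any particular ML library.

```python
from math import log, log2

def entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute 0."""
    return -sum(pi * log2(pi) for pi in p if pi > 0)

def cross_entropy(true_dist, predicted):
    """Cross-entropy in nats (natural log), the form usually called log-loss."""
    return -sum(t * log(q) for t, q in zip(true_dist, predicted) if t > 0)

fair_coin = entropy([0.5, 0.5])    # maximum uncertainty for two outcomes
biased_coin = entropy([0.9, 0.1])  # more predictable, so lower entropy
print(fair_coin)                   # 1.0

label = [1, 0, 0]  # one-hot true class
good_loss = cross_entropy(label, [0.9, 0.05, 0.05])
bad_loss = cross_entropy(label, [0.2, 0.4, 0.4])
print(good_loss < bad_loss)        # True: better predictions give lower loss
```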
MLE vs MAP
Maximum Likelihood Estimate:
- Find the parameters that maximize the likelihood of the observed data: argmax P(data | params)
Maximum A Posteriori:
- Optimize params, but weight the likelihood by some prior knowledge about them: argmax P(data | params) P(params)
- ie. We have a prior belief about the parameters from the real world. Given this prior and the data, what are the most probable parameters
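A sketch of the difference for a coin's heads probability. The data (7 heads in 10 flips) and the Beta(5, 5) prior are hypothetical. With a Beta prior the posterior is also a Beta distribution, so the MAP estimate is simply its mode.

```python
heads, flips = 7, 10     # hypothetical data
prior_a, prior_b = 5, 5  # hypothetical Beta(5, 5) prior, centered on 0.5

# MLE: maximizes P(data | p), which is the observed frequency.
mle = heads / flips

# MAP: mode of the posterior Beta(heads + a, tails + b).
map_estimate = (heads + prior_a - 1) / (flips + prior_a + prior_b - 2)

print(mle)                     # 0.7
print(round(map_estimate, 3))  # 0.611, pulled toward the prior's 0.5
```

With more data the two estimates converge, because the likelihood increasingly outweighs the prior.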