QM LM8 Hypothesis testing Flashcards
What is statistical inference?
The process of making judgements about a larger group (population) based on a smaller group (sample)
What is hypothesis testing?
Testing to see whether a sample statistic is likely to come from a population with the hypothesised value of the population parameter
Hypothesis testing is a form of statistical inference
What is a hypothesis?
A statement about one or more populations that is tested using sample statistics
What is the process of hypothesis testing, in 6 steps?
- State the hypothesis
- Identify the appropriate test statistic
- Specify the level of significance
- State the decision rule
- Collect data and calculate the test statistic
- Make a decision
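The six steps can be sketched in code. This is a minimal illustration, not a prescribed method: the return figures are made up, and the population standard deviation is assumed to be known so that a z-test applies.

```python
from math import sqrt
from statistics import NormalDist, mean

# Hypothetical returns in percent (illustrative data, not from the flashcards).
returns = [6.8, 7.4, 5.9, 8.1, 7.0, 6.5, 7.7, 6.2, 7.9, 7.3, 6.6, 7.8]
sigma = 1.0  # ASSUMED known population standard deviation, so a z-test applies

# Step 1: state the hypotheses. H0: mu = 6, Ha: mu != 6 (two-tailed).
mu_0 = 6.0

# Step 2: identify the test statistic: z = (x_bar - mu_0) / (sigma / sqrt(n)).

# Step 3: specify the level of significance.
alpha = 0.05

# Step 4: state the decision rule: reject H0 if |z| exceeds the critical value.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: collect data and calculate the test statistic.
n = len(returns)
x_bar = mean(returns)
z_stat = (x_bar - mu_0) / (sigma / sqrt(n))

# Step 6: make a decision.
reject_null = abs(z_stat) > z_crit
print(f"z = {z_stat:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_null}")
```

With these made-up numbers the sample mean is far enough from 6 that the null is rejected at the 5% level.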
What is the null hypothesis?
Something we assume to be true unless we can reject it
- Typically, we WANT to reject the null
- However, in some cases in multivariate testing we want to fail to reject the null.
What is a two-tailed test?
- For example, the null hypothesis might be that mu = 6%
- The alternative is that mu is not equal to 6%: the true value might be greater or less than 6%
- The null can therefore be rejected in either tail of the distribution
What is a one-tailed test?
- For example, where the null hypothesis is that mu is less than or equal to 6%
- Here, we reject the null only when the sample mean is sufficiently far above 6%
- Therefore there is a rejection outcome on only one side of the distribution
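The difference between the two kinds of test shows up in the critical values. The sketch below assumes a z-test and a 5% significance level; the helper names `z_critical`, `reject_two_tailed` and `reject_right_tailed` are made up for illustration.

```python
from statistics import NormalDist


def z_critical(alpha, tails):
    """Return the positive z critical value for a one- or two-tailed test."""
    if tails == 2:
        # alpha is split equally between the two tails
        return NormalDist().inv_cdf(1 - alpha / 2)
    if tails == 1:
        # all of alpha sits in a single tail
        return NormalDist().inv_cdf(1 - alpha)
    raise ValueError("tails must be 1 or 2")


def reject_two_tailed(z_stat, alpha=0.05):
    """H0: mu = mu_0. Reject if z_stat is extreme in either direction."""
    return abs(z_stat) > z_critical(alpha, 2)


def reject_right_tailed(z_stat, alpha=0.05):
    """H0: mu <= mu_0. Reject only if z_stat is far enough above zero."""
    return z_stat > z_critical(alpha, 1)


for z in (1.8, -2.5):
    print(z, reject_two_tailed(z), reject_right_tailed(z))
```

A statistic of 1.8 is rejected by the right-tailed test but not by the two-tailed one, while -2.5 is rejected only by the two-tailed test, because the right-tailed test has no rejection region on the left.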
If the population variance is unknown, what kind of test statistic do we use?
- t-test by default, because it is stricter than a z-test: its critical values are larger at the same level of significance
- If the t-test allows us to reject the null, we will CERTAINLY be able to reject the null under a z-test as well
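To see why the t-test is the stricter of the two, compare critical values. The t values below are copied from a standard t-table (two-tailed, 5%) rather than computed, since the Python standard library has no t-distribution; the chosen degrees of freedom are arbitrary.

```python
from statistics import NormalDist

# Two-tailed 5% t critical values from a standard t-table, keyed by degrees of freedom.
T_CRIT_5PCT_TWO_TAILED = {5: 2.571, 10: 2.228, 30: 2.042, 120: 1.980}

# The matching z critical value, computed from the standard normal distribution.
z_crit = NormalDist().inv_cdf(0.975)

for df, t_crit in T_CRIT_5PCT_TWO_TAILED.items():
    print(f"df = {df:>3}: t critical = {t_crit:.3f}, z critical = {z_crit:.3f}")

# Every t critical value exceeds the z critical value, so any test statistic
# that clears the t hurdle also clears the z hurdle.
t_always_stricter = all(t > z_crit for t in T_CRIT_5PCT_TWO_TAILED.values())
```

The gap shrinks as the degrees of freedom grow, which is why the choice matters most for small samples.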
What level of significance should we use?
- Depends on the seriousness of making a mistake
- In social sciences we might use 10%
- In finance we use 5% as standard, maybe 1%
- The Greek shorthand is alpha
What are the two types of mistake we could make when doing hypothesis testing?
- Type I error: the null is rejected but happens to be true (false positive). E.g., telling someone they’re pregnant when they’re not.
- Type II error: the null is not rejected but happens to be false (false negative). E.g., telling someone they’re not pregnant when they are.
How does decreasing the level of significance affect type II errors?
- As alpha (the level of significance) decreases, beta increases
- Beta is the likelihood of type II error. This is a false negative: failing to reject the null hypothesis when the null should be rejected (telling someone they’re not pregnant when they are)
- The only way to decrease both alpha and beta (decrease false positives AND false negatives) is to increase n (the number of observations)
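The trade-off can be checked numerically. This sketch assumes a right-tailed z-test with a known population standard deviation; the function name `type_ii_error` and the default values (hypothesised mean 6, true mean 6.5, sigma 2) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, n, mu_0=6.0, mu_true=6.5, sigma=2.0):
    """Beta for a right-tailed z-test of H0: mu <= mu_0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the true mean, the z statistic is normal with this mean and unit variance.
    shift = (mu_true - mu_0) / (sigma / sqrt(n))
    # We fail to reject whenever the z statistic is at or below the critical value.
    return std_normal.cdf(z_crit - shift)


# Lowering alpha with n fixed raises beta.
for alpha in (0.10, 0.05, 0.01):
    print(f"n = 30, alpha = {alpha:.2f}: beta = {type_ii_error(alpha, 30):.3f}")

# Raising n lets both alpha and beta come down together.
print(f"n = 200, alpha = 0.01: beta = {type_ii_error(0.01, 200):.3f}")
```

The power of the test in each case is one minus the value returned.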
What is 1 - beta?
The power of the test
The ability to reject the null hypothesis when it should be rejected
This is contrasted with alpha, the significance level (1 - alpha is the confidence level)
Do we test standard deviation or variance?
- We always test variance, not SD
- We use chi squared test to test variance
- Since variance cannot be negative, we can’t use a z-test; the chi-squared distribution is bounded below by zero
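A single-variance test can be sketched as follows. The sample size, sample standard deviation and hypothesised standard deviation are made-up numbers, and the critical value is copied from a standard chi-squared table (upper tail, 5%, 24 degrees of freedom) rather than computed.

```python
def chi_squared_statistic(n, sample_sd, sigma_0):
    """Test statistic (n - 1) * s^2 / sigma_0^2, with n - 1 degrees of freedom."""
    return (n - 1) * sample_sd**2 / sigma_0**2


# Hypothetical inputs, in percent. H0: variance <= 4^2, Ha: variance > 4^2.
n = 25
sample_sd = 5.0
sigma_0 = 4.0

# Upper-tail 5% critical value for 24 degrees of freedom, from a chi-squared table.
CHI2_CRIT_5PCT_DF24 = 36.415

chi2_stat = chi_squared_statistic(n, sample_sd, sigma_0)
reject_null = chi2_stat > CHI2_CRIT_5PCT_DF24
print(f"chi-squared = {chi2_stat:.2f}, reject H0: {reject_null}")
```

Note that the hypothesis is stated about the variance, and the statistic itself can never be negative.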
When would you use parametric testing?
- When sample statistics are being used to test population parameters
- However our data has to meet distributional assumptions, for example follow an approximately normal distribution
When would you use non-parametric testing?
- When there are no parameters tested
- When there are no distributional assumptions
- E.g., if n < 30 and the population is non-normally distributed
- When there are outliers, we might test for median rather than mean (so use non-parametric)
- When data are given in ranks, or use an ordinal scale (ordered categories)
- When the hypothesis does not concern a parameter (e.g., testing if a sample is random)
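As one example of a non-parametric test for the median, here is a minimal sketch of an exact sign test. The sign test is chosen for illustration only and is not named in the cards; the function name `sign_test_p_value` and the data (which include a deliberate outlier) are made up.

```python
from math import comb


def sign_test_p_value(data, median_0):
    """Two-sided exact sign test p-value for H0: population median = median_0."""
    above = sum(1 for x in data if x > median_0)
    below = sum(1 for x in data if x < median_0)
    n = above + below  # observations equal to median_0 are dropped
    k = min(above, below)
    # Under H0 each observation is above or below the median with probability 0.5,
    # so the smaller count follows a Binomial(n, 0.5) distribution.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Hypothetical data with one extreme outlier (45.0). Only the sign of each
# deviation from the hypothesised median is used, so the outlier has no extra pull.
sample = [6.4, 7.1, 6.9, 7.5, 6.8, 7.2, 6.6, 7.0, 7.3, 45.0]
p_value = sign_test_p_value(sample, median_0=6.0)
print(f"p-value = {p_value:.5f}")
```

No assumption about the shape of the population distribution is needed here, which is the point of going non-parametric.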