! Hypothesis Testing Flashcards
Hypothesis Space
- set of all hypotheses that can be produced by a learning algorithm
- e.g. for decision trees: set of all finite discrete functions (each is representable by some decision tree)
Hypothesis
= statement about a population parameter
Null Hypothesis H0
= statement about a population parameter, assumed to be true unless there is convincing evidence to the contrary
Alternative Hypothesis Ha
= statement about a population parameter that contradicts H0; accepted only if there is convincing evidence against H0
Hypothesis Testing
= statistical procedure to choose between H0 & Ha based on the information in a sample
Result Options of Hypothesis Testing
- Reject H0 (= accept Ha)
- Fail to reject H0 (= insufficient evidence for Ha) -> this does not prove that H0 is true
Statistically significant
= result unlikely to be due to chance alone if H0 were true (p-value <= chosen significance level)
p-value
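- probability, assuming H0 is true, of observing a result at least as extreme as the sample result
- compared with the significance level to decide when sample results are strong enough to reject H0
- Low p-value = data unlikely under H0, i.e. strong evidence against H0 (not the probability that H0 is true)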
Error Types
- Type 1: H0 is true but rejected (α, the significance level = probability of this error)
- Type 2: H0 is false but not rejected (β = probability of this error)
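The two error types can be made concrete with a small simulation. This is a minimal sketch, not a definitive method: it assumes a two-sided z-test with known population sd, and the names `rejects_h0` and `rejection_rate`, the sample size, the true means and the seed are all hypothetical choices made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def rejects_h0(sample, mu0, sigma, alpha):
    """Two-sided z-test of H0: mean == mu0, with sigma assumed known."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return p <= alpha


def rejection_rate(true_mu, mu0=0.0, sigma=1.0, n=25, alpha=0.05,
                   trials=2000, seed=0):
    """Fraction of simulated samples in which H0 is rejected."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        if rejects_h0(sample, mu0, sigma, alpha):
            rejected += 1
    return rejected / trials


# H0 true (true mean equals mu0): every rejection is a Type 1 error.
type1_rate = rejection_rate(true_mu=0.0)
# H0 false (true mean is 0.5, a made-up value): every non-rejection is a Type 2 error.
type2_rate = 1 - rejection_rate(true_mu=0.5)
print(f"Type 1 rate: {type1_rate:.3f}, Type 2 rate: {type2_rate:.3f}")
```

When H0 is true, the estimated Type 1 rate should land close to the chosen significance level (0.05 here). The Type 2 rate depends on how far the true mean is from the H0 value and on the sample size.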
Hypothesis Testing Procedure (p-value approach)
- Identify H0 & Ha
- Identify test statistic & its distribution
- Compute value of test statistic from data
- Compute p-value
- Compare p-value with significance level α: reject H0 if p <= α
- Formulate decision
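The steps of the procedure can be sketched in code. This is one possible illustration, assuming a two-sided one-sample z-test with a known population sd; the function name `z_test`, the sample values, `mu0` and `sigma` are made up for the example.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test; sigma (population sd) is assumed known."""
    # Hypotheses: H0 is mean == mu0, Ha is mean != mu0.
    # Test statistic: z, which follows a standard normal distribution under H0.
    n = len(sample)
    # Compute the value of the test statistic from the data.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Compute the two-sided p-value.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    # Compare the p-value with alpha and formulate the decision.
    reject = p <= alpha
    return z, p, reject


sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.7, 5.4, 5.5]  # hypothetical measurements
z, p, reject = z_test(sample, mu0=5.0, sigma=0.5)
print(f"z = {z:.3f}, p = {p:.4f}, reject H0: {reject}")
```

With real data the population sd is rarely known, in which case a t-test would be the usual substitute for the z-test used here.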
Extrapolation
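= drawing a conclusion about something beyond the range of the data, e.g. predicting for values outside the observed range, or generalizing from a biased sample to a population it does not represent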
Sampling distribution
- distribution of a sample statistic (e.g. the sample mean) over repeated samples of the same size
- for the sample mean: mean approx. equal to the mean of the original distribution & sd known as the standard error
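A sampling distribution can be approximated by drawing many samples and recording the statistic each time. The sketch below does this for the sample mean, assuming a normal population; the values of `mu`, `sigma`, `n`, the number of repetitions and the seed are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(1)
mu, sigma, n = 10.0, 2.0, 16  # hypothetical population mean, sd and sample size

# Draw 5000 samples of size n and keep the mean of each one.
sample_means = [
    mean([rng.gauss(mu, sigma) for _ in range(n)])
    for _ in range(5000)
]

center = mean(sample_means)       # should be close to mu
spread = stdev(sample_means)      # empirical standard error
theoretical_se = sigma / sqrt(n)  # sigma / sqrt(n) = 0.5 here

print(f"mean of sample means: {center:.3f}")
print(f"sd of sample means: {spread:.3f} (theory: {theoretical_se:.3f})")
```

The sd of the simulated sample means should sit near sigma / sqrt(n), which is the standard error of the mean.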
Major Error Sources
- Sampling Error
- Sampling Bias
Sampling Error
- portion of overall error attributable to observing a sample rather than the whole population
- reflects how much sample estimates of variables differ between samples
Sampling Bias
- Systematic favoring of certain outcomes due to the methods employed to obtain the sample (e.g. self-selection bias)
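The difference between the two error sources can be illustrated with a simulation. This is a rough sketch under stated assumptions: the population is made up, and the biased scheme (units with larger values are more likely to be selected, a stand-in for self-selection) is only one hypothetical way bias can arise.

```python
import random
from statistics import mean

rng = random.Random(2)
population = [rng.gauss(50, 10) for _ in range(2000)]  # made-up population
true_mean = mean(population)

# Biased scheme: selection probability grows with the unit's value.
weights = [max(v, 0.0) for v in population]

random_means, biased_means = [], []
for _ in range(500):
    # Simple random sample (without replacement).
    random_means.append(mean(rng.sample(population, 100)))
    # Biased sample (weighted, with replacement).
    biased_means.append(mean(rng.choices(population, weights=weights, k=100)))

random_avg = mean(random_means)
biased_avg = mean(biased_means)

print(f"true mean: {true_mean:.2f}")
print(f"random samples: average estimate {random_avg:.2f}, "
      f"range {min(random_means):.2f} to {max(random_means):.2f}")
print(f"biased samples: average estimate {biased_avg:.2f}")
```

Estimates from random samples scatter around the true mean (sampling error) and average out close to it. Estimates from the biased scheme stay shifted in one direction no matter how many samples are drawn (sampling bias).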