Module 6, Probability and Introduction to Hypothesis Testing Flashcards
Hypothesis Testing (5 steps)
- state the null and alternative hypotheses (H0 and H1)
- choose the level of significance
- find critical values
- find test statistic
- draw a conclusion
Research Hypotheses
a statement, in words, of the relationship between variables that researchers expect to find based on past theory or research
Statistical Hypotheses
- focuses on numerical expressions of relationships between variables
- null and alternative hypotheses
- what we actually test when doing hypothesis testing
Null and Alternative Hypotheses
a set of hypotheses that present two mutually exclusive conclusions about data
- we make statistical hypotheses about population parameters, not sample statistics, because we want to know whether the effect exists in the population and not just in the sample
null hypothesis (H0):
- proposes that the hypothesized change, difference, or relationship does not exist in the population
- example: the null hypothesis would be the conclusion that playing at home does not increase the chances of winning in the NHL
alternative hypothesis (H1):
- proposes that the hypothesized change, difference, or relationship does exist in the population
- example: the alternative hypothesis would be the conclusion that playing at home does increase the chances of winning in the NHL
H0: μ = 6 (what type of hypothesis is this)
null hypothesis
- when researchers conduct hypothesis testing they are testing the null hypothesis and not the alternative
H1: μ ≠ 6 (what type of hypothesis is this)
alternative hypothesis
- playing at home makes the probability of winning a game something other than 50% (.50)
- the number of home wins would not be equal to 6 out of 12 games, so we can reject the null hypothesis if the value of the statistic falls far enough above or below 6
- the “does not equal” sign allows for the possibility of a home team advantage or disadvantage (the home team wins more than 6 OR loses more than 6 games out of 12 in the population)
- ≠ indicates a two-tailed test; < or > indicates a one-tailed test
Choose the level of significance: Why care about probability in research?
- it is usually not feasible to study an entire population, which is why researchers collect data from samples to represent the population
- research is typically conducted with samples
- we estimate population parameters based on sample statistics (we want to draw conclusions about the broader population); this process is called inferential statistics
- because the data do not come from the entire population, we must evaluate the data using probability
◦ evaluate sample data
Sampling Error
- the difference between a statistic calculated from a sample and the corresponding population parameter
- samples are imperfect representations of the population
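Sampling error can be illustrated with a small simulation. This is a minimal sketch, assuming the true probability of a home win is exactly .50 and that games are independent; the function name `simulate_home_wins` and the fixed seed are illustrative choices, not part of the course material.

```python
import random

def simulate_home_wins(n_games=12, p_home_win=0.5, n_samples=5, seed=0):
    """Count home wins in several random samples of n_games each."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        sum(rng.random() < p_home_win for _ in range(n_games))
        for _ in range(n_samples)
    ]

wins_per_sample = simulate_home_wins()
expected_wins = 12 * 0.5  # population value assumed here: 6 wins out of 12

# Sampling error: how far each sample's count falls from the population value
sampling_errors = [wins - expected_wins for wins in wins_per_sample]

print(wins_per_sample)
print(sampling_errors)
```

Even though every sample is drawn from the same population, the win counts can differ from 6 and from each other, which is why a sample value that merely differs from 6 is not enough to reject H0.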
What data would convince us that H0 is false?
- a calculated statistic that simply differs from the value stated in H0 is not, by itself, strong enough evidence
◦ recall the discussion of sampling error
- we look for evidence that has a low probability of occurring if H0 is true (to reject the null hypothesis and show there is a statistically significant effect or relationship)
Alpha (α): defining “low probability”
- the probability of a statistic occurring if H0 is true that is used as the cutoff for deciding to reject H0
- conventionally, α = .05, or a 5% chance of a statistic occurring if H0 is true
Find Critical Values
we need to identify values of statistics that are associated with low probability:
- how many games out of 12 would the home team have to win to convince us that there is a home team advantage or disadvantage, taking into account the possibility of sampling error?
- set our alternative hypothesis in a way that allows for a home team advantage or disadvantage
- plot the number of wins on the x-axis and the probability of each number of wins on the y-axis; if H0 is true, the probability of the home team winning exactly 6 games out of a random sample of 12 games is .2256 (22.56%)
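The distribution described above can be reproduced with the binomial formula. This is a sketch under the assumption that the 12 games are independent and that each has a .50 chance of a home win when H0 is true; the helper name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n=12, p=0.5):
    """Probability of exactly k home wins in n games if each win has probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of each possible number of home wins (x-axis: wins, y-axis: probability)
win_distribution = {k: binomial_pmf(k) for k in range(13)}

for wins, prob in win_distribution.items():
    print(f"{wins:2d} wins: {prob:.4f}")
```

The entry for 6 wins rounds to .2256, matching the value on the card, and the probabilities shrink toward 0 wins and 12 wins, which is where the low-probability outcomes sit.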
Find Critical Values: Region of Rejection
- represents the values of a statistic whose combined probability is low enough that we could reject H0
- values in this region indicate that there is an effect or relationship in the population, which in our example would be a home team advantage or disadvantage
Find Critical Values: Region of Non-rejection
- represents the values whose probability is not low enough to allow us to reject H0
- the chance that we would be rejecting the null hypothesis because of sampling error is too great
Critical Value
value of a statistic that separates the two regions (rejection and non-rejection)
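One way to locate the two regions for the 12-game example is to add up the most extreme outcomes from both tails until their combined probability would exceed α. This is a sketch that assumes a two-tailed test, α = .05, and a binomial model with a .50 win probability under H0 (so the two tails are symmetric); `rejection_region` is a hypothetical helper name.

```python
from math import comb

def rejection_region(n=12, p=0.5, alpha=0.05):
    """Return the two-tailed rejection region and its combined probability.

    Assumes p = 0.5, so that k wins and n - k wins are equally likely.
    """
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    region = []
    tail_prob = 0.0
    # Work inward from both ends: (0, n), then (1, n - 1), and so on
    for k in range(n // 2):
        pair_prob = pmf[k] + pmf[n - k]
        if tail_prob + pair_prob > alpha:
            break  # adding this pair would push the region past alpha
        tail_prob += pair_prob
        region += [k, n - k]
    return sorted(region), tail_prob

region, combined_prob = rejection_region()
print(region)
print(round(combined_prob, 4))
```

Under these assumptions the region of rejection is 2 or fewer wins or 10 or more wins, with a combined probability of about .0386; 3 through 9 wins fall in the region of non-rejection.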
Find Test Statistic
- the test statistic that is calculated depends on the statistical analysis required to answer the research question and type of data
◦ example: t, r, F, χ², z
- if the value of the statistic calculated for the sample lies beyond the critical values, reject the null hypothesis; otherwise, do not reject the null hypothesis
◦ rejecting implies that a value as large as the one we found is unlikely to be due to chance (sampling error) alone and more likely reflects an actual effect
◦ if the value does not lie beyond the critical values, then we do not reject the null hypothesis, and we conclude that there is not enough evidence that the effect exists in the population
◦ example: the home team won 10 of the 12 games; 10 exceeds the critical value of 9, so the probability of getting a statistic this extreme if H0 is true is less than 5%, and our analysis supports the theory that there is a home team advantage
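The decision for the 10-win example can be checked numerically. This is a sketch, again assuming a binomial model with a .50 win probability under H0 and a two-tailed test at α = .05; the probability computed here is often called a p-value, a term the cards do not use, and `two_tailed_p_value` is an illustrative name.

```python
from math import comb

def two_tailed_p_value(observed, n=12, p=0.5):
    """Probability, if H0 is true, of a win count at least as far from n*p as observed."""
    expected = n * p
    distance = abs(observed - expected)
    return sum(
        comb(n, k) * p**k * (1 - p)**(n - k)
        for k in range(n + 1)
        if abs(k - expected) >= distance
    )

alpha = 0.05
p_value = two_tailed_p_value(10)  # home team won 10 of 12 games
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"p = {p_value:.4f}: {decision}")
```

With 10 wins the probability is about .0386, which is below .05, so H0 is rejected. Calling `two_tailed_p_value(9)` gives about .146, which is above .05, so 9 wins would not be enough to reject H0.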