Chapter 7: Hypothesis Testing Flashcards
hypothesis
a statement or proposed explanation for an observation, a phenomenon, or a scientific problem that can be tested using the research method. A hypothesis is often a statement about the value for a parameter in a population.
Hypothesis testing or significance testing
a method for testing a claim or hypothesis about a parameter in a population, using data measured in a sample. In this method, we test a hypothesis by determining the likelihood that a sample statistic would be selected if the hypothesis regarding the population parameter were true.
FOUR STEPS TO HYPOTHESIS TESTING
Step 1: State the hypotheses.
Step 2: Set the criteria for a decision.
Step 3: Compute the test statistic.
Step 4: Make a decision.
null hypothesis (H0)
a statement about a population parameter, such as the population mean, that is assumed to be true; a hypothesis test is structured to decide whether or not to reject this assumption.
alternative hypothesis (H1)
a statement that directly contradicts a null hypothesis by stating that the actual value of a population parameter is less than, greater than, or not equal to the value stated in the null hypothesis.
Level of significance, or significance level
a criterion of judgment upon which a decision is made regarding the value stated in a null hypothesis. The criterion is based on the probability of obtaining a statistic measured in a sample if the value stated in the null hypothesis were true.
test statistic
a mathematical formula that identifies how far or how many standard deviations a sample outcome is from the value stated in a null hypothesis. It allows researchers to determine the likelihood of obtaining sample outcomes if the null hypothesis were true. The value of the test statistic is used to make a decision regarding a null hypothesis.
p value
the probability of obtaining a sample outcome, given that the value stated in the null hypothesis is true. The p value for obtaining a sample outcome is compared to the level of significance or criterion for making a decision.
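The definition above can be sketched in code. This is a minimal illustration (not from the chapter) of converting a z statistic into a two-tailed p value, using only the Python standard library; `normal_cdf` approximates the standard normal CDF via `math.erf`.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_tailed_p(z: float) -> float:
    """p value: probability of a sample outcome at least this extreme,
    given that the value stated in the null hypothesis is true."""
    return 2.0 * (1.0 - normal_cdf(abs(z)))

print(round(two_tailed_p(1.96), 3))  # ~0.05: z = +/-1.96 sits at the 5% criterion
```

If this p value is at or below the level of significance (e.g., .05), the decision is to reject the null hypothesis.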
Significance, or statistical significance
describes a decision made concerning a value stated in the null hypothesis. When the null hypothesis is rejected, we reach significance. When the null hypothesis is retained, we fail to reach significance.
Type II error, or beta (β) error
the probability of retaining a null hypothesis that is actually false.
Type I error
the probability of rejecting a null hypothesis that is actually true. Researchers directly control the probability of committing this type of error by stating an alpha level.
alpha (α) level
the level of significance or criterion for a hypothesis test. It is the largest probability of committing a Type I error that we will allow and still decide to reject the null hypothesis.
The power in hypothesis testing
the probability of rejecting a false null hypothesis. Specifically, it is the probability that a randomly selected sample will show that the null hypothesis is false when the null hypothesis is indeed false.
one-sample z test
a statistical procedure used to test hypotheses concerning the mean in a single population with a known variance.
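The four steps to hypothesis testing can be sketched as a one-sample z test. The numbers here (μ = 100, σ = 15, M = 105, n = 36, α = .05) are hypothetical, chosen only to make each step concrete.

```python
import math

mu0, sigma = 100.0, 15.0   # value stated in H0, and the known population SD
M, n = 105.0, 36           # sample mean and sample size (hypothetical data)

# Step 1: State the hypotheses. H0: mu = 100; H1: mu != 100 (two-tailed).
# Step 2: Set the criteria. At alpha = .05, two-tailed, the critical values
#         are +/-1.96 in the standard normal distribution.
z_crit = 1.96

# Step 3: Compute the test statistic.
se = sigma / math.sqrt(n)   # standard error, sigma_M
z_obt = (M - mu0) / se      # obtained value of the z statistic

# Step 4: Make a decision by comparing the obtained value to the critical value.
decision = "reject H0" if abs(z_obt) > z_crit else "retain H0"
print(z_obt, decision)  # 2.0 reject H0
```

Because the obtained value (2.0) exceeds the critical value (1.96), it falls in the rejection region and the decision is to reject the null hypothesis.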
Nondirectional tests, or two-tailed tests
hypothesis tests in which the alternative hypothesis is stated as not equal to (≠) a value stated in the null hypothesis. Hence, the researcher is interested in any alternative to the null hypothesis.
critical value
a cutoff value that defines the boundaries beyond which less than 5% of sample means can be obtained if the null hypothesis is true. Sample means obtained beyond a critical value will result in a decision to reject the null hypothesis.
rejection region
the region beyond a critical value in a hypothesis test. When the value of a test statistic is in the rejection region, we decide to reject the null hypothesis; otherwise, we retain the null hypothesis.
z statistic
an inferential statistic used to determine the number of standard deviations in a standard normal distribution that a sample mean deviates from the population mean stated in the null hypothesis.
obtained value
the value of a test statistic. This value is compared to the critical value(s) of a hypothesis test to make a decision. When the obtained value exceeds a critical value, we decide to reject the null hypothesis; otherwise, we retain the null hypothesis.
Directional tests, or one-tailed tests
hypothesis tests in which the alternative hypothesis is stated as greater than (>) or less than (<) a value stated in the null hypothesis. Hence, the researcher is interested in a specific alternative to the null hypothesis.
Type III error
a type of error possible with one-tailed tests in which a decision would have been to reject the null hypothesis, but the researcher decides to retain the null hypothesis because the rejection region was located in the wrong tail.
effect
For a single sample, an effect is the difference between a sample mean and the population mean stated in the null hypothesis. In hypothesis testing, an effect is not significant when we retain the null hypothesis; an effect is significant when we reject the null hypothesis.
Effect size
a statistical measure of the size of an effect in a population, which allows researchers to describe how far scores shifted in the population, or the percent of variance that can be explained by a given variable.
Cohen’s d
a measure of effect size in terms of the number of standard deviations that mean scores shifted above or below the population mean stated by the null hypothesis. The larger the value of d, the larger the effect in the population.
Cohen’s effect size conventions
standard rules for identifying small, medium, and large effects based on typical findings in behavioral research.
Define Type I error
the probability of rejecting a null hypothesis that is actually true. The probability of this type of error is determined by the researcher and stated as the level of significance or alpha level for a hypothesis test.
Effect size calculation
For z tests, Cohen's d is the mean difference divided by the population standard deviation σ (rather than the standard error):
d = (M − μ) / σ
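A short sketch of this calculation, with hypothetical numbers (M = 105, μ = 100, σ = 15), including a rough interpretation per Cohen's effect size conventions (the cutoffs are guidelines, not strict rules):

```python
M, mu, sigma = 105.0, 100.0, 15.0

d = (M - mu) / sigma   # mean difference divided by sigma, not the standard error

def cohen_label(d: float) -> str:
    """Rough labels based on Cohen's conventions for d."""
    d = abs(d)
    if d < 0.2:
        return "trivial"
    if d < 0.5:
        return "small"
    if d < 0.8:
        return "medium"
    return "large"

print(round(d, 3), cohen_label(d))  # 0.333 small
```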
What values make a small, medium, and large effect?
an effect size of 0.2 to 0.3 is often considered a “small” effect, around 0.5 a “medium” effect, and 0.8+ a “large” effect.
What affects the width of a confidence interval?
- n (as n increases, the SE decreases, so the CI becomes narrower)
- σ (as σ decreases, the SE decreases, so the CI becomes narrower)
- α, or our level of confidence (lowering α from 5% to 1% widens the CI: ±1.96 becomes ±2.58)
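The three factors above can be demonstrated with the margin of error z·σ/√n. The baseline numbers (σ = 15, n = 36) are hypothetical; the z values 1.96 (95%) and 2.58 (99%) are the standard two-tailed cutoffs.

```python
import math

def ci_halfwidth(sigma: float, n: int, z: float) -> float:
    """Half-width (margin of error) of a CI for the mean: z * sigma / sqrt(n)."""
    return z * sigma / math.sqrt(n)

print(round(ci_halfwidth(sigma=15, n=36, z=1.96), 2))   # baseline: 4.9
print(round(ci_halfwidth(15, 144, 1.96), 2))            # larger n -> narrower (2.45)
print(round(ci_halfwidth(5, 36, 1.96), 2))              # smaller sigma -> narrower (1.63)
print(round(ci_halfwidth(15, 36, 2.58), 2))             # 99% confidence -> wider (6.45)
```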
Why do we have more control over Type I error rates compared to Type II error rates?
Significance level determination: In hypothesis testing, the significance level (often denoted as alpha, α) is predetermined by the researcher. It represents the probability of making a Type I error, i.e., rejecting the null hypothesis when it is actually true. By choosing a lower significance level, such as 0.05, researchers can reduce the likelihood of committing a Type I error. This control allows researchers to set a strict threshold for rejecting the null hypothesis and control the false positive rate.
Statistical power: The statistical power of a test is the probability of correctly rejecting the null hypothesis when it is false (i.e., avoiding a Type II error). By increasing the sample size or using more sensitive statistical tests, researchers can enhance the statistical power, thereby reducing the chances of committing a Type II error. Researchers have some control over the power of their study design, which indirectly influences the Type II error rate.
Importance of avoiding false positives: In many fields, such as medicine and criminal justice, avoiding false positives (Type I errors) is crucial. False positives can lead to incorrect conclusions or actions, potentially causing harm or unnecessary costs. Therefore, researchers and practitioners tend to be more cautious about accepting alternative hypotheses without strong evidence, resulting in more stringent control over Type I error rates.
The Relationship Between Sample Size and Power
Increasing sample size decreases standard error, thereby increasing power.
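A sketch of this relationship for a two-tailed one-sample z test, assuming a hypothetical true effect of d = 0.5 (the true mean sits half a standard deviation from the H0 value) at α = .05:

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def power_z(d: float, n: int, z_crit: float = 1.96) -> float:
    """P(reject H0) when the true mean is shifted d population SDs from mu0.
    The shift in standard-error units grows with sqrt(n)."""
    shift = d * math.sqrt(n)
    return (1.0 - normal_cdf(z_crit - shift)) + normal_cdf(-z_crit - shift)

for n in (10, 25, 50):
    print(n, round(power_z(0.5, n), 2))   # power climbs with n (roughly .35 -> .71 -> .94)
```

The same function also illustrates the list above: holding n fixed, a larger d or a smaller z_crit (larger α) each increase power.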
How to increase power
Increase effect size (d), sample size (n), and alpha (α).
Decrease beta error (β), population standard deviation (σ), and standard error (σM).