chapter 10: hypothesis testing Flashcards
Procedure for the scientific method (hint: 5 steps)
- observing nature
- asking questions
- formulating hypotheses
- conducting experiments
- developing theories and laws
two types of hypotheses
scientific hypothesis and statistical hypothesis
definition of scientific hypothesis
a scientific hypothesis is a testable supposition that is tentatively adopted to account for certain facts and to guide in the investigation of others; a statement that requires verification
what are the 3 characteristics common to all scientific hypotheses?
- intelligent informed guesses about phenomena of interest
- can be stated in the if-then form
- their truth, or falsity, can be determined by observation and experimentation
give a good example of scientific hypothesis
students who study the material in spaced intervals perform better on the exams compared to students who cram
definition of a statistical hypothesis
a statistical hypothesis is a statement about one or more parameters of a population distribution that requires verification
good example of a statistical hypothesis
a new class registration procedure at BU will reduce time required for students to register
what is the process of choosing between H0 and H1 (null hypothesis and alternative hypothesis) called?
the process of choosing between H0 and H1 is called HYPOTHESIS TESTING
this chapter on hypothesis testing does not focus on “proving” anything. what does it actually do?
through hypothesis testing we are simply saying that the occurrence of an event is improbable, not impossible
what does it mean to not reject Ho (the null hypothesis)?
-hint: 3 different possibilities
non-rejection is an unwelcome outcome because we want to be able to reject the null hypothesis
- Ho is true and should not be rejected
- Ho is false and should be rejected, but the particular sample that was used to estimate µ and σ is not representative of the population
- Ho is false and should be rejected, but the experimental methodology is not sufficiently sensitive to detect the true situation
what are the two options if we fail to reject the null hypothesis (Ho)?
- state that we fail to reject Ho, so Ho remains credible
- suspend judgment about Ho and the scientific hypothesis until the completion of a new, improved experiment
statistical test is….
the process of deciding whether to reject Ho.
the decision of whether to reject Ho is based on what 3 things?
- a test statistic computed for a random sample from the population
- hypothesis testing conventions
- a decision rule
5 step research process
- state the null and alternative hypothesis
- specify the test statistic based on the hypothesis being tested, info known about the population, and assumptions about the population that appear to be tenable
- specify the size of the sample (n) and make assumptions that permit specification of the sampling distribution of the test statistic, given that Ho is true
- specify significance level (⍺)
- obtain a random sample of size n from the population, compute the test statistic, and make a decision
what is significance level?
significance level (α) is the risk a researcher is willing to accept of rejecting the null hypothesis when it is true (a type I error)
what is the critical region?
the critical region is the region under the t-distribution curve (the sampling distribution of the test statistic) that leads to rejecting Ho
in general we use a significance level of ⍺=.05
(if Ho is true, only 5 times out of 100 will we observe a discrepancy between x̄ and the hypothesized µ large enough to fall in the critical region)
by convention a probability of .05 is the largest risk a researcher should be willing to take of rejecting a true hypothesis
decision rule
reject Ho if test statistic falls in critical region; otherwise do not reject Ho
what is critical value
critical value is the value of t that cuts off the critical region of the sampling distribution of t
step 1: stating the statistical hypotheses using the example: a new class registration procedure at BU will reduce time required for students to register
mean registration time under the current procedure: µ0 = 3.10 hours
the statistical hypothesis postulates: µ< 3.10 hours
-we can also postulate that µ ≥ 3.10 hours
(these two statistical hypotheses are mutually exclusive and exhaustive)
µ< 3.10 hours is the alternative hypothesis (H1)
µ ≥ 3.10 hours is the null hypothesis (Ho)
-we want to test the tenability of this
-we want to reject it
if we reject Ho then H1 is the only tenable hypothesis
reminder: the process of choosing between Ho and H1 is called hypothesis testing
* we are only saying that the occurrence of an event is improbable
µ and σ are estimated during hypothesis testing using…
µ and σ are estimated during hypothesis testing using the sample statistics from the experiment (x̄ and σ̂)
step 2: specify the test statistic that will be used to test the population mean
we can either use
t-statistic (sampling distr. is the t distribution) or
z-statistic (sampling distr. is the standard normal distr.)
choice of which statistic is based on what 3 things?
- the hypothesis being tested
- info known about the population
- assumptions about the population that appear to be tenable
step 2: specify the test statistic used to test Ho: µ ≥ 3.10 hours
- this hypothesis contains the mean of a single population
- the population standard deviation is unknown
- the population is assumed to be normally distributed
because of this we use the t-statistic
*appropriate if the population of registration times is normally distributed
(see notes for formula)
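The standard one-sample t statistic is t = (x̄ − µ0) / (σ̂ / √n), with n − 1 degrees of freedom. A minimal Python sketch of the arithmetic follows. The function name `one_sample_t` and the eight registration times are hypothetical, invented only for illustration.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t test of the mean against mu0."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd_hat = statistics.stdev(sample)        # sample SD, n - 1 in the denominator
    standard_error = sd_hat / math.sqrt(n)   # estimated standard error of the mean
    return (mean - mu0) / standard_error, n - 1

# Hypothetical registration times in hours (not data from the chapter).
times = [2.9, 3.0, 2.7, 3.2, 2.8, 2.6, 3.1, 2.9]
t, df = one_sample_t(times, mu0=3.10)
print(round(t, 3), df)  # -2.828 7
```

A negative t here means the sample mean is below 3.10, which is the direction predicted by H1.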
alternative hypothesis
This is the hypothesis that is potentially inferred given a rejection of the null hypothesis.
t-statistic is used…
when dealing with a single population and unknown standard deviation
*assume that population is normally distributed
z-statistic is used…
when dealing with a single population and KNOWN standard deviation
*assume that population is normally distributed
difference between t and z statistic
t = (random variable − constant) / random variable
z = (random variable − constant) / constant
*z and t look alike, but in the z-statistic the denominator is a constant, while in the t-statistic the denominator is a random variable
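Written out for a test of a single mean, the two statistics are shown below. The notation is standard and assumed here rather than taken from the cards: µ0 is the hypothesized mean, σ the known population standard deviation, and σ̂ its sample estimate.

```latex
z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}
\qquad
t = \frac{\bar{X} - \mu_0}{\hat{\sigma} / \sqrt{n}}
```

σ is a fixed population value, whereas σ̂ changes from sample to sample, which is why the denominator of t is a random variable.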
step 3: specifying n and making assumptions that permit specification of the sampling distribution of the test statistic
specifying n and making assumptions that permit specification of the sampling distribution of the test statistic
Student's t-distribution
symmetrical with a mean of 0
the t-distribution is a family of distributions whose shapes depend on the degrees of freedom
what does it mean that the dispersion of the t distribution (standard deviation) depends on the sample size?
the standard deviation of the t-distribution depends on the degrees of freedom, which are set by the sample size (df = n − 1 for the one-sample t-statistic)
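For df > 2 the standard deviation of the t distribution is √(df / (df − 2)), a standard result that is not stated on the card. The short Python sketch below (the function name is hypothetical) shows it shrinking toward 1, the standard deviation of the standard normal distribution, as df grows.

```python
import math

def t_distribution_sd(df):
    """Standard deviation of Student's t distribution; defined only for df > 2."""
    if df <= 2:
        raise ValueError("the standard deviation is finite only for df > 2")
    return math.sqrt(df / (df - 2))

for df in (3, 5, 10, 30, 120):
    print(df, round(t_distribution_sd(df), 3))
```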
degrees of freedom
degrees of freedom refers to the number of scores whose values are free to vary
smaller df vs. greater df.
smaller degrees of freedom= more leptokurtic
greater degrees of freedom= more normally distributed
(central limit theorem)
- n of 30 is often taken as the dividing point between large and small samples
2 purposes served by the normality assumption
the numerator of the t-statistic is normally distributed regardless of sample size
and the numerator and denominator of the t-statistic are statistically independent (don’t affect each other)
what determines the location and size of the critical region, and why is this important?
the location of the critical region is determined by H1 and its size by α
-this is important for step 4 of the process, which is specifying the significance level
what is step 5?
obtain a random sample from the population of interest, compute the test statistic, and make a decision using the decision rule
what is the decision rule?
reject the null hypothesis if the test statistic falls in the critical region; otherwise do not reject the null hypothesis
what is critical value?
critical value is the value of t that cuts off the critical region of the sampling distribution of t
this helps determine if we can reject the null hypothesis because we reject it if the test statistic falls in the critical region
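The decision rule for a lower-tail test (H1: µ < µ0) can be sketched in a few lines of Python. The function name is hypothetical, and the critical value 1.895 is the tabled one-tailed t value for α = .05 with 7 degrees of freedom. The df of 7 is an assumption chosen for illustration, not a value from the chapter.

```python
def reject_h0_lower_tail(t_observed, t_critical):
    """Lower-tail rule: reject H0 when t falls at or below -t_critical."""
    return t_observed <= -abs(t_critical)

T_CRITICAL = 1.895  # one-tailed, alpha = .05, df = 7 (from a standard t table)

print(reject_h0_lower_tail(-2.83, T_CRITICAL))  # True: t is in the critical region
print(reject_h0_lower_tail(-1.20, T_CRITICAL))  # False: do not reject H0
```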
faced with non rejection of the null hypothesis what can a researcher conclude
a researcher can either conclude that the evidence does not support the original scientific hypothesis or suspend judgment pending the completion of a new, improved experiment
what can a researcher conclude if the null hypothesis is rejected?
researcher can conclude that the scientific hypothesis is probably true.
why is the inclusion of a control group (participants who do not receive treatment) an experimental design consideration?
control groups provide data on the effects of extraneous variables.
there is a possibility that participants show a “John Henry effect” and over-perform because they know they are being observed; control groups are used with this in mind
when is a one-tailed test used?
when the researcher makes a directional prediction (one-sided hypothesis)
*the critical region will be located in either the upper or lower tail of the sampling distribution
when is a two-tailed test used?
when the researcher just wants to show a difference, but is nondirectional (two-sided hypothesis)
* the critical region will be located in both the upper and lower tails of the sampling distribution
half the significance level is assigned to the upper tail and the other half to the lower tail
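The way α is assigned to one or both tails can be illustrated with critical values. Python's standard library has no t-distribution quantile function, so this sketch uses the standard normal (z) distribution instead. The function name `critical_z` is hypothetical.

```python
from statistics import NormalDist

def critical_z(alpha, tails):
    """Upper critical z value; a two-tailed test puts alpha / 2 in each tail."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)

print(round(critical_z(0.05, tails=1), 3))  # 1.645
print(round(critical_z(0.05, tails=2), 3))  # 1.96
```

For the same α, the two-tailed critical value is farther from 0 because only half of α sits in each tail.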
why are most significance tests in the behavioral sciences two tailed?
because we lack the information necessary to formulate directional hypotheses
type 1 error
rejecting Ho when Ho is true and should not be rejected
-probability = α
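A small simulation can make "probability = α" concrete: when Ho is true, about α of all tests reject it anyway. The sketch below is an assumption-laden illustration, not part of the chapter. It uses a lower-tail z test with a known σ, and the population values, sample size, and seed are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

def type_one_error_rate(trials=4000, n=20, alpha=0.05, seed=1):
    """Simulate a lower-tail z test when H0 is true; return the rejection rate."""
    rng = random.Random(seed)
    mu0, sigma = 3.10, 0.50                    # hypothetical population values
    z_critical = NormalDist().inv_cdf(alpha)   # lower-tail cutoff, about -1.645
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]   # H0 is true here
        z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
        if z <= z_critical:
            rejections += 1                    # each rejection is a type I error
    return rejections / trials

rate = type_one_error_rate()
print(rate)  # should be close to alpha
```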
type II error
not rejecting Ho when Ho is false and should be rejected
-probability = β
correct rejection (1 − β) is power
to compute power, you must know µ (true population mean) and σ (population standard deviation)
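As a sketch of how power follows from µ and σ, the Python below computes 1 − β for a lower-tail z test (H1: µ < µ0) with σ treated as known. The function name and all numeric inputs are hypothetical. A t test would need the noncentral t distribution, which is outside the standard library.

```python
import math
from statistics import NormalDist

def power_lower_tail_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a lower-tail z test: P(reject H0) when the true mean is mu_true."""
    se = sigma / math.sqrt(n)                  # standard error of the mean
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    cutoff = mu0 - z_alpha * se                # reject H0 when the sample mean <= cutoff
    return NormalDist(mu_true, se).cdf(cutoff)

power = power_lower_tail_z(mu0=3.10, mu_true=2.85, sigma=0.50, n=25)
print(round(power, 2))  # 0.8
```

When `mu_true` equals `mu0` the function returns α, which is the type I error rate rather than power in the useful sense.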
type I vs. type II errors
for a given sample size, as the probability of one increases the probability of the other decreases (inverse relationship)
which error is worse?
it depends on what you're studying, but usually type I errors are worse
indicate type of error: a false null hypothesis was rejected
correct rejection
indicate type of error: the researcher did not reject a true null hypothesis
correct acceptance
indicate type of error: the null hypothesis is false and the researcher failed to reject it
type II error
indicate type of error: the researcher rejected a true null hypothesis
type I error
indicate type of error: a false null hypothesis was not rejected
type II error
Cohen's d
effect size that expresses the magnitude of the absolute mean difference (µ − µ0) one wants to detect in units of the population standard deviation
*see notes for formula
small effect= .2
medium effect= .5
large effect= .8
-Cohen's d helps specify the sample size, n
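The formula follows directly from the definition on this card (absolute mean difference in units of the population standard deviation):

```latex
d = \frac{\lvert \mu - \mu_0 \rvert}{\sigma}
```

For example, a true mean of 2.85 hours against µ0 = 3.10 hours with σ = 0.50 hours (hypothetical numbers) gives d = 0.25 / 0.50 = 0.5, a medium effect.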
what are the 5 pieces of information needed to estimate sample size?
- effect size d = 0.2, 0.5, or 0.8
- significance level α = .05 or .01
- acceptable power 1 − β = .80, .90, or .95
- type of statistical hypothesis: one or two tailed
- type of test: one or two sample test
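The way these pieces combine can be sketched with the common normal-approximation formula for a one-sample test, n ≈ ((z_α + z_β) / d)². This is an assumption for illustration: the chapter's own tables are not reproduced here, t-based tables give slightly larger n, and the two-sample case is not covered. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def approximate_n(d, alpha=0.05, power=0.80, tails=1):
    """Approximate n for a one-sample test using the normal approximation."""
    if d <= 0:
        raise ValueError("effect size d must be positive")
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)               # z value matching 1 - beta
    return math.ceil(((z_alpha + z_beta) / d) ** 2)    # round up to a whole person

print(approximate_n(0.5, alpha=0.05, power=0.80, tails=1))  # 25
print(approximate_n(0.5, alpha=0.05, power=0.80, tails=2))  # 32
```

Smaller effects, smaller α, higher power, and two-tailed hypotheses all push the required n upward.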
what is statistical significance concerned with?
statistical significance is concerned with whether a result is due to chance or sampling variability
what is practical significance concerned with?
practical significance is concerned with whether the result is useful in the real world
what is p-value?
probability value is the probability of obtaining a value of the test statistic equal to or more extreme than that observed, given that Ho is true.
*for a two-tailed test, the p value covers both tails, so it is double the one-tailed p value
how to not confuse p value with significance level
significance level is the probability of falsely rejecting a true null hypothesis (type I error) that the researcher has specified in advance as acceptable; the p value is computed from the sample data
-it is commonly set at α = .05 or .01
the decision rule can be formulated in terms of the p value and significance level. explain
reject the null hypothesis if the p value is less than or equal to the preselected significance level
that is, if p ≤ α
otherwise, do not reject the null hypothesis
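The p-value form of the decision rule can be sketched in Python. Because the standard library has no t distribution, the sketch assumes a z statistic. The function names and the `alternative` labels are hypothetical.

```python
from statistics import NormalDist

def p_value_from_z(z_observed, alternative):
    """p value for an observed z under H0, for the stated alternative hypothesis."""
    lower = NormalDist().cdf(z_observed)   # area at or below the observed z
    if alternative == "less":
        return lower
    if alternative == "greater":
        return 1 - lower
    if alternative == "two-sided":
        return 2 * min(lower, 1 - lower)   # both tails count as "more extreme"
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

def reject_h0(p_value, alpha=0.05):
    """Decision rule: reject H0 if p <= alpha."""
    return p_value <= alpha

p = p_value_from_z(-1.96, "two-sided")
print(round(p, 3), reject_h0(p))  # 0.05 True
```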