Hypothesis Testing Flashcards

1
Q

hypothesis tests

A

A hypothesis test is a formal procedure for comparing observed data with a claim (also called a hypothesis) whose truth we want to assess, against a contradictory claim (the competing hypothesis)

  • Confidence intervals are appropriate when our goal is to estimate a population parameter
  • When our goal is instead to assess the evidence that data provide about some claim concerning a population, hypothesis tests (or tests of significance) are the appropriate statistical method
  • A statistical hypothesis is a claim about the value(s) of a single parameter or several parameters or about the form of an entire probability distribution
2
Q

null hypothesis and alternative hypothesis

A

The alternative hypothesis (HA) is usually the hypothesis that the researcher would like to show is true. It can be “two-sided” or “one-sided”.

The null hypothesis (H0) is the opposite of the alternative hypothesis and is the hypothesis of no change (from current opinion), no difference, no improvement, etc.

– The null hypothesis, denoted by H0, is the claim that is initially assumed to be true, and the alternative hypothesis, denoted by HA, is the assertion that contradicts H0

– If sample evidence suggests H0 is false, we reject H0

– If the sample evidence does not strongly contradict H0 , then we fail to reject H0

3
Q

general procedure for hypothesis tests

A

The basic steps for hypothesis testing are:

  1. State a null and alternative hypothesis, H0 vs. HA
  2. Collect data and calculate the test statistic
  3. Determine the P-value associated with the test statistic
  4. Reach a decision/conclusion based on the P-value: reject or fail to reject H0
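
The four steps above can be sketched in code. This is only an illustration, assuming a two-sided one-sample z-test with a known population standard deviation; the function name, the data, mu0, sigma, and the 0.05 cutoff are all hypothetical and not part of the card.

```python
from math import sqrt
from statistics import NormalDist, mean

def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 vs. HA: mu != mu0.

    Assumes sigma (population standard deviation) is known.
    """
    n = len(sample)
    # Step 2: test statistic = standardized distance of the sample mean from mu0
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Step 3: two-sided P-value, computed assuming H0 is true
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 4: decision based on the P-value
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return z, p_value, decision

# Step 1 (hypothetical): H0: mu = 5.0 vs. HA: mu != 5.0
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.3, 5.7]  # made-up measurements
z, p_value, decision = one_sample_z_test(sample, mu0=5.0, sigma=0.4)
print(f"z = {z:.2f}, P-value = {p_value:.4f}, decision: {decision}")
```

With real data the test (z, t, proportion, and so on) would be chosen to match the study design; the z-test is used here only because it needs nothing beyond the standard library.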
4
Q

test statistic

A

a test statistic is a standardized score of our sample statistic that we use to conduct the hypothesis test

example: assuming a normal probability distribution, the test statistic answers the question: how many standard deviations away is the statistic from the mean if H0 is true?
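
For the normal example, the standardization can be written out as a short sketch. It assumes a sample mean tested against a hypothesized mean with a known population standard deviation; the function name and all numbers are hypothetical.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sigma, n):
    """Number of standard errors between the sample mean and mu0 (the mean under H0)."""
    standard_error = sigma / sqrt(n)  # standard deviation of the sample mean
    return (sample_mean - mu0) / standard_error

# Hypothetical numbers: sample mean 52 from n = 25, H0 mean 50, sigma = 10
z = z_statistic(sample_mean=52, mu0=50, sigma=10, n=25)
print(z)  # the sample mean sits one standard error above the H0 mean
```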

5
Q

P-value

A

A P-value is the probability (computed assuming that H0 is true) of obtaining a value of the sample statistic that is at least as extreme (as defined by the alternative hypothesis) as the value actually observed

use the magnitude of the P-value as a measure of the strength of evidence against the null hypothesis

  • large P-values fail to give convincing evidence against H0, because they say that the observed result could have occurred by chance if H0 were true
  • small P-values are evidence against H0, because they say that the observed result is unlikely to occur when H0 is true (i.e., we observed something rare by chance or the null hypothesis is not correct)
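
How “at least as extreme” depends on the alternative hypothesis can be illustrated with a small sketch. It assumes the test statistic is a z-score that follows a standard normal distribution under H0; the function name and the `alternative` labels are hypothetical.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative="two-sided"):
    """P-value for an observed z-score, assuming a standard normal null distribution."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":    # HA: parameter is larger than the null value
        return 1 - std_normal.cdf(z)
    if alternative == "less":       # HA: parameter is smaller than the null value
        return std_normal.cdf(z)
    if alternative == "two-sided":  # HA: parameter differs in either direction
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value_from_z(1.96, alt), 4))
```

The same observed statistic gives different P-values under different alternatives, which is why HA has to be stated before looking at the data.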
6
Q

statistically significant

A

“Statistically significant” is an adjective used to describe a sample result that seems too unlikely to have occurred by chance alone

example: researcher compares mean weight loss for a diet treatment to that for an exercise treatment, and reports a P-value of 0.036. She concludes these sample data are “statistically significant”.

But, we never know whether the null hypothesis is true or not, nor does the P-value tell us why we observed the sample we did

in any single test, at most one type of error is possible: a Type I error (rejecting a true H0) or a Type II error (failing to reject a false H0)

7
Q

power of the test

A

The power of the test is the probability of rejecting H0 , when H0 is false; it measures the ability of a hypothesis test to find evidence against a null hypothesis that is actually incorrect

power is influenced by:

  • the number of observations in the sample
  • the magnitude of the effect size to be detected

just because we fail to find strong evidence against the null hypothesis doesn’t mean it’s true
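
The two influences on power listed above can be seen in a rough sketch. It assumes a one-sided z-test (HA: mean greater than the null value) with a known standard deviation; the function name and every number are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(effect, sigma, n, alpha=0.05):
    """Probability of rejecting H0 when the true mean exceeds the null value by `effect`.

    Assumes a one-sided z-test with known sigma and significance level alpha.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # reject H0 when z > z_crit
    shift = effect * sqrt(n) / sigma        # mean of z when the effect is real
    return 1 - std_normal.cdf(z_crit - shift)

# Power grows with the sample size (hypothetical effect 0.5, sigma 2) ...
for n in (10, 50, 200):
    print("n =", n, "power =", round(power_one_sided_z(0.5, 2.0, n), 3))

# ... and with the size of the effect to be detected (n fixed at 50)
for effect in (0.2, 0.5, 1.0):
    print("effect =", effect, "power =", round(power_one_sided_z(effect, 2.0, 50), 3))
```

A low-power test will often fail to reject a false H0, which is one reason a non-significant result is not evidence that H0 is true.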

8
Q

effect size

A

The effect size (magnitude of effect) is the magnitude of the difference between groups, or of the deviation from the value expected under the null hypothesis

example: a completely randomized experiment compares a current insomnia treatment to a newly developed treatment. researchers observe a statistically significant increase in mean hours slept for the new treatment (P-value = 0.002)

“statistically significant” does not necessarily imply “practical” significance
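
A quick sketch shows why a small P-value says little about practical importance. It assumes a two-sided z-test with known standard deviation and holds a tiny, made-up observed effect fixed while the sample size grows; none of these numbers come from the insomnia example.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(effect, sigma, n):
    """Two-sided P-value for an observed mean difference `effect` from n observations."""
    z = effect / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical: 0.05 extra hours of sleep (3 minutes), sigma = 1 hour
for n in (100, 1_000, 10_000):
    print("n =", n, "P-value =", two_sided_p(0.05, 1.0, n))
```

The effect is the same three minutes in every row; only the sample size changes, yet the P-value moves from unremarkable to very small.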

9
Q

multiple comparisons

A

Multiple comparisons: Conducting multiple hypothesis tests increases the likelihood of a Type I error

– Rare statistics are unlikely to occur in a single sample, but more likely to occur in repeated sampling

– Conducting multiple tests is analogous to repeated sampling

researchers conducting multiple comparisons should control the overall Type I error rate
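
The growth in overall error rate can be made concrete with a short sketch. It assumes the tests are independent and that every null hypothesis is true; the Bonferroni correction shown is one common way to control the overall rate, not necessarily the method the card has in mind, and the function names are hypothetical.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one Type I error across m independent tests, all H0 true."""
    return 1 - (1 - alpha) ** m

def bonferroni_alpha(alpha, m):
    """Per-test significance level that keeps the overall rate at or below alpha."""
    return alpha / m

for m in (1, 5, 20):
    uncorrected = familywise_error_rate(0.05, m)
    corrected = familywise_error_rate(bonferroni_alpha(0.05, m), m)
    print("tests =", m, "uncorrected =", round(uncorrected, 3), "Bonferroni =", round(corrected, 3))
```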

10
Q

take home message

A

don’t fall victim to (nor contribute to) the misunderstanding of P-values and “significance”

we never know if a hypothesis is true or not

the results of a hypothesis test depend on:

  • study design
  • sample size
  • effect size (magnitude of effect)
  • power
  • number of comparisons