week 9 - t-tests, Comparing means Flashcards

1
Q

hypothesis testing (formal definition of p value)

A

p value = P(D given not-H)
OR
p value = P(D given H-null)

D = “our observed data or more extreme data”
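
A small worked example may make "our observed data or more extreme data" concrete. The sketch below is not from the course: it assumes a made-up coin-flip study (16 heads in 20 flips, H-null = the coin is fair) and a hypothetical helper name, and it computes a one-sided p value as an exact binomial tail.

```python
from math import comb

def binomial_upper_tail(n, k_observed, p_null=0.5):
    """P(X >= k_observed) when X ~ Binomial(n, p_null).

    This is P(D given H-null), where D = "k_observed successes or more".
    """
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(k_observed, n + 1)
    )

# Hypothetical data: 16 heads in 20 flips; H-null says the coin is fair.
p_value = binomial_upper_tail(20, 16)
print(f"one-sided p value = {p_value:.4f}")
```

The result is about 0.006: if the coin were fair, 16 or more heads would be a rare outcome.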

2
Q

Low p value?

A
  • We have observed something that would be VERY
    IMPROBABLE if the null hypothesis were true
  • so we decrease our belief in the null hypothesis
  • (and increase our belief in the alternative)
3
Q

null hypothesis

A
no change
(e.g., H = our drug enhances IQ, null-H = the drug doesn't change IQ)
4
Q

alpha

Compute alpha (α): α = 1 - (confidence level / 100)

A

an acceptable significance value

5
Q

comparing means

the population model parameter of interest is the difference between the two means:
μ1 - μ2

A

We are working with means and estimating the standard error of their difference from the data,
so the sampling model is a Student’s t.

6
Q

z or t?

A

If you know σ, use z. (That’s rare!)
Whenever you use s to estimate σ,
use t.

7
Q

The confidence interval we build is called a two-sample t-interval (for the difference
in means).

A

The corresponding hypothesis test is called a two-sample t-test.

The interval looks just like all the others we’ve seen: the statistic plus or minus an estimated margin
of error: (ȳ1 - ȳ2) ± ME
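
To show how the pieces of the interval fit together, here is a minimal sketch. The function name and the data are made up for illustration, the standard error is the unpooled estimate sqrt(s1²/n1 + s2²/n2), and the critical value is passed in by hand from a t table rather than computed.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t_interval(sample1, sample2, t_crit):
    """Return (low, high) for mu1 - mu2: (ybar1 - ybar2) +/- t_crit * SE."""
    diff = mean(sample1) - mean(sample2)
    # Standard error of the difference, estimated from the data (unpooled).
    se = sqrt(variance(sample1) / len(sample1) + variance(sample2) / len(sample2))
    margin = t_crit * se  # ME = critical value x standard error
    return diff - margin, diff + margin

# Made-up samples; 2.306 is the 95% two-sided table value for 8 df.
group1 = [12.0, 15.0, 11.0, 14.0, 13.0]
group2 = [10.0, 9.0, 12.0, 8.0, 11.0]
low, high = two_sample_t_interval(group1, group2, t_crit=2.306)
print(f"95% interval for mu1 - mu2: ({low:.3f}, {high:.3f})")
```

For these samples the difference in means is 3 and the standard error is 1, so the interval is 3 ± 2.306.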

8
Q

REMEMBER: if you know the standard deviation of the population then use the ……… distribution

A

the Normal distribution.

If you do not know the standard deviation of the population then you MUST use the Student-t distribution.

9
Q

Hypothesis Testing

A

Hypothesis testing is the use of statistics to assess how strongly the observed data count as evidence against a given hypothesis. The usual process of hypothesis testing consists of four steps.

  1. Formulate the null hypothesis H_0 (commonly, that the observations are the result of pure chance) and the alternative hypothesis H_a (commonly, that the observations show a real effect combined with a component of chance variation).
  2. Identify a test statistic that can be used to assess the truth of the null hypothesis.
  3. Compute the P-value, which is the probability that a test statistic at least as significant as the one observed would be obtained assuming that the null hypothesis were true. The smaller the P-value, the stronger the evidence against the null hypothesis.
  4. Compare the p-value to an acceptable significance value alpha (sometimes called an alpha value). If p
10
Q

Hypothesis testing, alpha value

A
  • P(concluding H-0 is false given that H-0 is true)
  • the probability that we will conclude we have a real finding when we actually don’t
  • False positive rate
  • Type I error rate
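
One way to see alpha as a false positive rate is to simulate many studies in which H-0 really is true and count how often the test rejects it anyway. The sketch below is an illustration only: it assumes a z-test with known σ = 1 (so that the standard library suffices), and the function name, sample size, number of trials, and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def false_positive_rate(alpha=0.05, n=20, trials=2000, seed=1):
    """Share of simulated studies that reject H-0 even though H-0 is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided cutoff
    rejections = 0
    for _ in range(trials):
        # H-0 is true here: population mean 0, known sigma 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate = false_positive_rate()
print(f"simulated Type I error rate: {rate:.3f}")
```

The simulated rate should land close to the chosen alpha of 0.05, with some run-to-run noise.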
11
Q

What can we conclude if the p value is large?

A

Unfortunately, not much.
  • If p is large, it might be that H0 is true…
  • … but it might also be that:
    – Our sample is too small
    – The population is too varied
    – The real effect in the population is too small

12
Q

2-sample t-test

A

difference between the means of two samples

df = n1 + n2 - 2 (for the pooled, equal-variance version of the test)
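
As a rough illustration of where df = n1 + n2 - 2 comes from, the sketch below computes the pooled-variance t statistic for two made-up samples of different sizes. It assumes the equal-variance (pooled) form of the test; the function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(sample1, sample2):
    """Return (t, df) for the pooled two-sample t-test of mu1 - mu2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their own df.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / se
    return t, df

# Made-up samples with different sizes (n1 = 5, n2 = 4).
group1 = [12.0, 15.0, 11.0, 14.0, 13.0]
group2 = [10.0, 9.0, 12.0, 9.0]
t_stat, df = pooled_t_statistic(group1, group2)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The t statistic would then be compared with a t distribution on df degrees of freedom to get the p value.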

13
Q

Power

A

– P(concluding that H0 is false | H0 is false)
– The probability that we will conclude we have a real
finding, when we really do.
– True positive rate
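
Power depends on the effect size, the population spread, and the sample size. A minimal sketch, assuming a two-sided z-test with known σ so that the formula is closed-form (the function name and the numbers are illustrative, not from the course):

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """P(reject H0 | true mean differs from the H0 value by `effect`)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # When H0 is false, the z statistic is centred here instead of at 0.
    shift = effect * sqrt(n) / sigma
    upper = 1 - std_normal.cdf(z_crit - shift)  # rejections in the upper tail
    lower = std_normal.cdf(-z_crit - shift)     # rejections in the lower tail
    return upper + lower

power_small = z_test_power(effect=0.5, sigma=1.0, n=10)
power_large = z_test_power(effect=0.5, sigma=1.0, n=32)
print(f"n = 10: power = {power_small:.2f}; n = 32: power = {power_large:.2f}")
```

With the same effect and spread, the larger sample gives noticeably higher power, which is one reason a small sample can produce a large p value even when the effect is real.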

14
Q

margin of error

the amount of random sampling error in a survey’s results

The margin of error is usually defined as the “radius” (or half the width) of a confidence interval for a particular statistic from a survey.

A

In a confidence interval, the range of values above and below the sample statistic is called the margin of error.

We could devise a sample design to ensure that our sample estimate will not differ from the true population value by more than, say, 5 percent (the margin of error) 90 percent of the time (the confidence level).

The margin of error can be defined by either of the following equations.
ME = Critical value x Standard deviation of the statistic
ME = Critical value x Standard error of the statistic

15
Q

standard error

A

A Bayesian interpretation of the standard error is that although we do not know the “true” percentage, it is highly likely to be located within two standard errors of the estimated percentage. The standard error can be used to create a confidence interval within which the “true” percentage should be to a certain level of confidence.

16
Q

what does a Confidence Interval mean?

A

For a 90% confidence interval, for example: 90% of all random samples of this size will produce intervals that contain the true
value of the mean difference between the times of the two groups.
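
The "90% of all random samples" reading can be checked by simulation. The sketch below is a simplified stand-in: it uses a single mean with known σ (a z-interval) rather than a difference of two means, and the population values, sample size, number of trials, and seed are arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def coverage(confidence=0.90, true_mean=50.0, sigma=10.0, n=25, trials=2000, seed=7):
    """Fraction of simulated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sigma / sqrt(n)  # ME = critical value x standard error
    hits = 0
    for _ in range(trials):
        ybar = mean(rng.gauss(true_mean, sigma) for _ in range(n))
        if ybar - margin <= true_mean <= ybar + margin:
            hits += 1
    return hits / trials

covered = coverage()
print(f"share of intervals containing the true mean: {covered:.3f}")
```

The share should come out near 0.90. Each individual interval either contains the true value or it does not; the 90% describes the procedure over many samples.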

17
Q

SD vs SE

A

SD describes the variability of individual values; SE does not.
- A new value has about a 95% probability of being within 2 SDs of the sample mean.

SE describes the accuracy of the sample mean; SD does not.
- The sample mean has about a 95% probability of being within 2 SEs of the population mean.
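
A quick numeric illustration of the two quantities, using made-up values and the usual formula SE = SD / sqrt(n):

```python
from math import sqrt
from statistics import mean, stdev

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data

sd = stdev(values)            # spread of the individual values (n - 1 divisor)
se = sd / sqrt(len(values))   # uncertainty of the sample mean

print(f"mean = {mean(values)}, SD = {sd:.3f}, SE = {se:.3f}")
```

The SE is smaller than the SD by a factor of sqrt(n), so it shrinks as the sample grows while the SD does not.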

18
Q

When do you use t-distribution?

A

When the variance is not known and has to be estimated from sample data.

The t distribution is leptokurtic: it has relatively more scores in its tails than does the normal distribution.

As a result, you have to extend farther from the mean to contain a given proportion of the area.

Normal distribution: 95% of the distribution is within 1.96 SDs of the mean.

t distribution (with a small sample, e.g. 4 degrees of freedom): 95% of the area is within 2.78 SDs of the mean.

Therefore, the standard error of the mean would be multiplied by 2.78 rather than 1.96.
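
The two multipliers can be reproduced numerically. The 2.78 figure matches the table value for 4 degrees of freedom (a sample of 5); that reading is an inference, since the card does not state the sample size. The sketch below uses only the standard library: it integrates the Student's t density with Simpson's rule and finds the cutoff by bisection. The helper names are hypothetical, and in practice a t table or statistics package would be used instead.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_central_area(x, df, steps=2000):
    """Area under the t density between -x and +x (Simpson's rule)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # doubled because the density is symmetric

def t_critical(df, confidence=0.95):
    """Cutoff c such that the area between -c and +c equals `confidence`."""
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection
        mid = (lo + hi) / 2
        if t_central_area(mid, df) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z_star = NormalDist().inv_cdf(0.975)
t_star = t_critical(df=4)
print(f"z* = {z_star:.2f}, t* (4 df) = {t_star:.2f}")
```

As df grows, the t cutoff moves down toward the Normal value of 1.96.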

19
Q

In a two-sample problem, what is the null hypothesis for comparing two means?

A

In a two-sample problem the null hypothesis is usually that there is no difference between the two means: H0: μ1 - μ2 = 0.

20
Q

In a two-sample problem, must/should the two sample sizes be equal?

A

No. As long as you are doing a two-sample t-test for the difference between two means, it is OK to have different-sized samples, because you are comparing the means and not the individual data. Pages 475-476 work through a sample problem about cameras with two different sample sizes.