Exam 2 Flashcards
process of drawing conclusions about the entire population based on information in a sample
statistical inference
a number that describes some aspect of a population
parameter
a number that is computed from the data in a sample
statistic
the sample statistic, used as the estimate of the true value of the population parameter when we have only one sample and don't know the value of the population parameter
best estimate
if μ = the mean commute time for workers in a particular city, what statistic would you use to estimate it?
x-bar
if ρ = the correlation between the size of dinner bills and the size of tips at a restaurant, what statistic would you use to estimate it?
r
distribution of sample statistics computed for different samples of the same size from the same population
sampling distribution
if samples are randomly selected and the sample size is large enough, the sampling distribution will be ______ and centered at the _____
symmetrical and bell-shaped
value of the population parameter
the standard deviation of the sample statistic
standard error
as sample size increases, variability of the sample statistic _____
decreases
how do you give a plausible range of values when you are given the margin of error and the sample statistic?
sample statistic +/- margin of error
what is the margin of error?
a number that reflects the precision of the sample statistic as an estimate for a parameter
an interval computed from sample data by a method that would capture the parameter for a specified proportion of all samples
confidence interval
success rate or the proportion of all samples whose intervals contain the parameter
confidence level
how do you determine a 95% confidence interval from the standard error?
statistic +/- 2(SE), when the distribution of the statistic is approximately bell-shaped
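A minimal sketch of the statistic +/- 2(SE) rule from the card above. The function name and the example numbers (a sample proportion of 0.52 and a standard error of 0.05) are made up for illustration, and the rule assumes the statistic's distribution is roughly bell-shaped.

```python
def confidence_interval_95(statistic, standard_error):
    """Return (low, high) for an approximate 95% interval: statistic +/- 2 * SE."""
    margin_of_error = 2 * standard_error  # the amount added and subtracted
    return statistic - margin_of_error, statistic + margin_of_error


# Hypothetical numbers, not from the flashcards.
low, high = confidence_interval_95(statistic=0.52, standard_error=0.05)
print(f"approximate 95% interval: {low:.2f} to {high:.2f}")
```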
difference between standard deviation and standard error?
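standard deviation measures the spread of the individual values within a single sample
standard error measures the spread of a sample statistic (such as the mean) if it were computed over and over again from many samples of the same size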
amount added and subtracted in a confidence interval
margin of error
standard deviation of the sample statistic if we could take many samples of the same size
standard error
how do you interpret the confidence interval?
we are 95% sure (or whatever the confidence level is) that our interval contains the population parameter
“we are 95% sure the parameter falls within these values”
sampling with replacement from the original sample using the same sample size
bootstrap sample
computing the statistic of interest for each of the bootstrap samples
bootstrap statistic
the collection of bootstrap statistics from many bootstrap samples
bootstrap distribution
how would you conduct a bootstrap sample from a jar of 100 nuts with 52 peanuts in it to find the proportion of peanuts in the jar?
shake the jar, pull a nut out, record if it is a peanut, put the nut back, and repeat 99 more times
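The jar procedure above can be sketched in code, roughly as follows. This is an illustration under assumptions: the jar is coded as 1 for a peanut and 0 for any other nut, the number of bootstrap samples (5000) and the random seed are arbitrary choices, and the function name is hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Original sample: 100 nuts, 52 of them peanuts (1 = peanut, 0 = other nut).
jar = [1] * 52 + [0] * 48


def bootstrap_proportion(sample):
    """Draw one bootstrap sample (with replacement, same size) and return its proportion."""
    resample = random.choices(sample, k=len(sample))  # pull a nut, record it, put it back
    return sum(resample) / len(resample)


# Bootstrap distribution: the bootstrap statistic from many bootstrap samples.
bootstrap_stats = [bootstrap_proportion(jar) for _ in range(5000)]

center = statistics.mean(bootstrap_stats)           # should sit near the sample proportion 0.52
standard_error = statistics.stdev(bootstrap_stats)  # spread of the bootstrap statistics
print(f"center = {center:.3f}, standard error = {standard_error:.3f}")
```

The standard deviation of the bootstrap statistics serves as the estimate of the standard error, which can then go into statistic +/- 2(SE).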
if sample size is increased, precision of estimate is _____
increased
if sample size is increased, standard error _____
decreases
if sample size is increased, width of confidence interval _____
decreases
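A small simulation can illustrate the three sample-size cards above. It is only a sketch: the population proportion of 0.5, the sample sizes, the number of simulated samples, and the function name are all assumptions chosen for the example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable


def simulated_standard_error(n, population_proportion=0.5, num_samples=2000):
    """Estimate the standard error of a sample proportion by simulating many samples of size n."""
    sample_proportions = []
    for _ in range(num_samples):
        successes = sum(random.random() < population_proportion for _ in range(n))
        sample_proportions.append(successes / n)
    return statistics.stdev(sample_proportions)


# Larger samples should give a smaller standard error and so a narrower interval.
se_by_size = {n: simulated_standard_error(n) for n in (25, 100, 400)}
for n, se in se_by_size.items():
    print(f"n = {n:3d}: SE = {se:.3f}, 95% interval width = {4 * se:.3f}")
```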
used to determine whether results from a sample are convincing enough to allow us to conclude something about the population
statistical tests
Ho - claim that there is no effect or no difference
null hypothesis
Ha - claim for which we seek significant evidence
alternative hypothesis
in a hypothesis test, we want to refute the _____ and support the _____
null hypothesis
alternative hypothesis
Ho and Ha describe the sample or population?
population
what must always be present in a null hypothesis?
=
for a hypothesis test, if there is one categorical variable, what is the parameter?
proportion
p
for a hypothesis test, if there is one quantitative variable, what is the parameter?
mean
mu
for a hypothesis test, if there is one categorical variable and one quantitative variable, what is the parameter?
difference in means
mu1-mu2
for a hypothesis test, if there are two categorical variables, what is the parameter?
difference in proportions
p1-p2
for a hypothesis test, if there are two quantitative variables, what is the parameter?
correlation
rho
simulate many samples assuming the null hypothesis is true and collect the values of a sample statistic for each simulated sample
randomization distribution
the randomization distribution will be centered at the ______
value indicated by the null hypothesis
the farther out the observed sample statistic is in the tail of the randomization distribution, the ______ the evidence is against the null hypothesis
stronger
proportion of samples, when the null hypothesis is true, that would give a statistic as extreme as, or more extreme than, the observed statistic
p-value
how do you find the p-value from the randomization distribution?
find the observed statistic in the randomization distribution
find the proportion of the simulated samples that have statistics as extreme as the statistic observed in the original sample
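The two steps above can be sketched as follows. Everything specific here is a hypothetical example, not from the flashcards: a test of Ho: p = 0.5 against Ha: p > 0.5, with 60 successes observed in a sample of 100, 5000 simulated samples, and an arbitrary seed.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

null_proportion = 0.5            # value of the parameter under Ho (hypothetical)
n = 100                          # sample size (hypothetical)
observed_statistic = 60 / n      # observed sample proportion (hypothetical)


def simulate_null_statistic():
    """Simulate one sample assuming Ho is true and return its sample proportion."""
    successes = sum(random.random() < null_proportion for _ in range(n))
    return successes / n


# Randomization distribution: statistics from many samples simulated under Ho.
randomization_stats = [simulate_null_statistic() for _ in range(5000)]

# Right-tailed p-value: proportion of simulated statistics at least as extreme as the observed one.
as_extreme = sum(stat >= observed_statistic for stat in randomization_stats)
p_value = as_extreme / len(randomization_stats)
print(f"p-value = {p_value:.4f}")
```

For a left-tailed alternative the comparison would flip to `<=`, and a two-tailed test would count both tails.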
sample statistics farther out in the tail give ____ p-values
smaller
the smaller the p-value, the _____ the evidence is against the null hypothesis and in support of the alternative
greater
how to write out the p-value explanation
there is a (p-value) chance of seeing a difference in proportions of (variable) as extreme as (observed statistic) or more extreme if there was no difference in (variable being tested)
you generate a randomization distribution by _______
assuming the null is true
(putting null in the center)
shows how extreme the observed statistic is, assuming the null hypothesis is true
p-value
if the p-value is small enough, then results as extreme as the observed statistic are ____ to occur by random chance alone and we say that the sample _____ statistically significant
unlikely
is
if our sample is statistically significant, we have convincing evidence _________
against Ho and in favor of Ha
the significance level α for a test of hypotheses is a boundary below which we conclude that a p-value shows ______
statistically significant evidence against the null
if the significance level is not specified, we use α =
0.05
if p < α,
reject Ho
results are statistically significant and we have convincing evidence that Ha is true
if p ≥ α,
do not reject Ho
results are not statistically significant and we do not have convincing evidence that Ha is true
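The decision rule in the cards above amounts to one comparison. A minimal sketch, with a hypothetical function name and the default significance level of 0.05:

```python
def decide(p_value, alpha=0.05):
    """Formal decision of a hypothesis test at significance level alpha."""
    if p_value < alpha:
        return "reject Ho"        # statistically significant
    return "do not reject Ho"     # not statistically significant


print(decide(0.03))        # p-value below the default 0.05
print(decide(0.05))        # a p-value equal to alpha does not reject
print(decide(0.03, 0.01))  # same p-value, stricter significance level
```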
note: if the p-value is small enough, then a result as extreme as the observed one is not likely to happen by random chance alone if the null is true
does a smaller p-value give stronger or weaker evidence against Ho?
stronger!
what is a type 1 error?
rejecting a true Ho
what is a type II error?
failing to reject a false Ho
if you reject a true Ho, what kind of error is that?
type I
if you fail to reject a false Ho, what kind of error is that?
type II
represents a tolerable probability of making a type I error
significance level
with a larger sample size, it is _____ to find a significant result
easier
what is 1-β
power
chance/probability of not making a type II error
how can you reduce the probability of a type I error?
decreasing the significance level
how can you reduce the probability of a type II error?
increase the significance level or sample size
when you run many tests on the same data, the chance of a type I error _____ overall
increases
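To see why the overall chance goes up, here is a sketch under a simplifying assumption that is not in the flashcards: the tests are independent, every null hypothesis is true, and each test uses the same significance level. In that case the chance of at least one type I error is 1 - (1 - α)^k for k tests. The function name is hypothetical.

```python
def chance_of_any_type_i_error(num_tests, alpha=0.05):
    """Chance of at least one type I error across independent tests when every Ho is true."""
    return 1 - (1 - alpha) ** num_tests


for k in (1, 5, 20):
    print(f"{k:2d} tests: {chance_of_any_type_i_error(k):.3f}")
```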
shows the distribution of sample statistics from a population, and is generally centered at the true value of the population parameter
sampling distribution
simulates a distribution of sample statistics for the population, but is generally centered at the value of the original sample statistic
bootstrap distribution
simulates a distribution of sample statistics for a population in which the null hypothesis is true, and is generally centered at the value of the null parameter
randomization distribution
the formal decision for a two-tailed hypothesis test is related to
whether the null parameter falls within a confidence interval
when the parameter value given in Ho falls _____ of a 95% confidence interval, then it is not a plausible value for the parameter and we should reject Ho at a 5% level in a two-tailed test
outside
when the parameter value given in Ho falls _____ of the 95% confidence interval, then it is a plausible value for the parameter and we should not reject Ho at a 5% level in a two-tailed test
inside
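The link between a 95% confidence interval and a two-tailed test at the 5% level can be written as a single check. This is a sketch with a hypothetical function name and made-up interval endpoints.

```python
def two_tailed_decision(null_value, ci_low, ci_high):
    """Decide a two-tailed test at the 5% level from a 95% confidence interval."""
    if ci_low <= null_value <= ci_high:
        return "do not reject Ho"  # the null value is a plausible value for the parameter
    return "reject Ho"             # the null value falls outside the interval


# Hypothetical 95% interval of 0.42 to 0.62 for a proportion.
print(two_tailed_decision(0.50, 0.42, 0.62))  # 0.50 is inside the interval
print(two_tailed_decision(0.70, 0.42, 0.62))  # 0.70 is outside the interval
```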
bootstrap distribution (simulates sampling from the population, centered at the sample statistic, bell-shaped) is used for a
confidence interval
randomization distribution (simulates samples assuming the null hypothesis is true, centered at the null value, bell-shaped) is used for a
hypothesis test
if null is not in the confidence interval, then we
reject Ho
a probability needs to be between
0 and 1