Stats Exam 3 Flashcards
every individual in the population, and every possible group of n individuals, has an equal chance to be selected
Simple Random Sample (SRS)
Sample in which every individual in the population has a chance (greater than zero) to be selected in the sample.
probability sample
sample used when the population contains several subgroups that make up different proportions of the population
stratified random sample
another word for subgroups
strata
people in each stratum should be _______
similar
sampling that uses probability sampling in a series of stages
multistage random sample
Variable that takes numerical values that describe the outcomes of some random process
Random Variable
the ___________ ______________ of a random variable gives its possible values and their probabilities
probability distribution
the two main types of random variables
discrete and continuous
characteristics of discrete random variables
able to list all possible outcomes; assign probabilities to each outcome
in general a discrete random variable X takes a fixed set of ___________ ___________
possible outcomes
the ____________ ___________ lists the outcomes xi, and their probabilities pi
probability distribution
the probabilities pi must satisfy two requirements:
- every probability pi is a number between 0 and 1
- the sum of the probabilities is 1
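Worked example for the two requirements above: a minimal Python sketch that checks whether a list of probabilities forms a legitimate discrete probability distribution. The function name `is_valid_distribution` and the coin-toss distribution are illustrative assumptions, not part of the card set.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Return True if every p_i is between 0 and 1 and the p_i sum to 1."""
    probs = list(probs)  # allow any iterable, including dict values
    each_in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Compare with a small tolerance because floating-point sums are inexact.
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one

# Hypothetical distribution: X = number of heads in two fair coin tosses.
coin_heads = {0: 0.25, 1: 0.50, 2: 0.25}
print(is_valid_distribution(coin_heads.values()))
```

A list such as [0.5, 0.6] fails the check because its probabilities sum to more than 1.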
a ___________ __________ _________ takes on all values in an interval
continuous random variable
the probability distribution of a continuous random variable X is described by a _________ _________
density curve
the probability of any event is the area under the _____ _______
density curve
a continuous random variable has _____ many possible values
infinitely
all continuous probability models assign probability __ to every ________ outcome
0; individual
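Worked example for the density-curve cards above: a short Python sketch that treats probability as area under a density curve. It assumes, purely for illustration, that X follows the standard Normal distribution; the helper `prob_between` is a hypothetical name.

```python
from statistics import NormalDist

# Assumption for illustration: X is standard Normal (mean 0, standard deviation 1).
X = NormalDist(mu=0.0, sigma=1.0)

def prob_between(dist, a, b):
    """Area under the density curve of dist between a and b."""
    return dist.cdf(b) - dist.cdf(a)

p_interval = prob_between(X, -1.0, 1.0)  # area over an interval of values
p_point = prob_between(X, 0.5, 0.5)      # area over a single value
print(round(p_interval, 4), p_point)
```

The interval has positive area, while the single value 0.5 has area 0, which is why a continuous model assigns probability 0 to every individual outcome.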
statistical inference involves two prominent techniques:
confidence intervals, hypothesis tests
What are the three Simple Conditions for Inference About a Mean
- SRS from the population of interest. No nonresponse or other practical difficulty.
- The variable is exactly normally distributed, N(μ, σ)
- The population mean μ is unknown, but the population standard deviation σ is known
the entire group of individuals that we want information about
population
what we actually examine in order to gather information
sample
sample of size n that consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected
simple random sample
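Worked example for the card above: one way to draw a simple random sample in Python. The population of 100 numbered individuals and the helper name `simple_random_sample` are assumptions made for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n individuals without replacement; every set of n is equally likely."""
    rng = random.Random(seed)  # seeded generator so the draw can be repeated
    return rng.sample(list(population), n)

# Hypothetical population: individuals labeled 1 through 100.
population = list(range(1, 101))
sample = simple_random_sample(population, n=10, seed=1)
print(sample)
```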
sample chosen by chance. we must know what samples are possible and what chance each possible sample has
probability sample
to select this sample, first divide the population into groups of similar individuals, called strata. then choose a separate SRS in each stratum and combine these SRSs to form the full sample
stratified random sample
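Worked example for the card above: a minimal Python sketch of stratified sampling, drawing a separate SRS in each stratum and combining the results. The two strata, their sizes, and the function name `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Take a separate SRS of sizes[name] from each stratum and combine them."""
    rng = random.Random(seed)
    full_sample = []
    for name, members in strata.items():
        full_sample.extend(rng.sample(members, sizes[name]))
    return full_sample

# Hypothetical population split into two strata of similar individuals.
strata = {
    "freshman": [f"F{i}" for i in range(60)],
    "senior": [f"S{i}" for i in range(40)],
}
# Sample sizes chosen in proportion to stratum size (an assumption, not a rule).
sizes = {"freshman": 6, "senior": 4}
sample = stratified_sample(strata, sizes, seed=1)
print(sample)
```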
occurs when some groups in the population are left out of the process of choosing the sample
undercoverage
occurs when an individual chosen for the sample can’t be contacted or does not cooperate
nonresponse
variable whose value is a numerical outcome of a random phenomenon
random variable
this variable has a finite number of possible values
discrete random variable
the ___________ ___________ of a discrete random variable lists the values and their probabilities
probability distribution
this variable takes all values in an interval of numbers
continuous random variable
the probability distribution of continuous random variables is described by a
density curve
the probability of any event is the area under
the density curve and above the values of x that make up the event
a number that describes the population
parameter
a fixed number, but in practice we do not know its value
parameter
a number that describes a sample
statistic
the value of this can change from sample to sample
statistic
we often use a statistic to estimate an unknown
parameter
the distribution of values taken by the statistic in all possible samples of the same size from the same population or randomized experiment
sampling distribution
this concerns the center of the sampling distribution
bias
a statistic used to estimate a parameter is unbiased if
the mean of its sampling distribution is equal to the true value of the parameter being estimated
the variability of a statistic is described by the
spread of its sampling distribution
this spread of sampling distribution is determined by
the sampling design and the sample size n
statistics from larger probability samples have
smaller spreads
to reduce bias, use
random sampling
when we start with a list of the entire population, simple random sampling produces unbiased estimates, meaning
the values of a statistic computed from an SRS neither consistently overestimate nor consistently underestimate the value of the population parameter
to reduce the variability of a statistic from an SRS,
use a larger sample
you can make the variability as small as you want by
taking a large enough sample
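Worked example for the bias and variability cards above: a small simulation sketch in Python. It assumes a Normal population with mean 50 and standard deviation 10 (made-up values) and approximates the sampling distribution of the sample mean by drawing many SRSs. The function name `sample_means` is hypothetical.

```python
import random
import statistics

def sample_means(mu, sigma, n, reps, seed=0):
    """Simulate reps samples of size n and return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]

# Assumed population: Normal with mean 50 and standard deviation 10.
means_10 = sample_means(50, 10, n=10, reps=2000)
means_100 = sample_means(50, 10, n=100, reps=2000)

center_10 = statistics.fmean(means_10)    # center of the simulated sampling distribution
spread_10 = statistics.stdev(means_10)    # spread when n = 10
spread_100 = statistics.stdev(means_100)  # spread when n = 100
print(center_10, spread_10, spread_100)
```

The simulated center sits near the population mean (no bias), and the spread for n = 100 is smaller than for n = 10.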
the purpose of a confidence interval
to estimate an unknown parameter with an indication of how accurate the estimate is and how confident we are that the result is correct
any confidence interval has two parts:
an interval computed from the data and a confidence level
the interval often has
the form estimate ± margin of error
the margin of error is obtained from
the sampling distribution
what does the margin of error indicate
how much error can be expected because of chance variation
the confidence level states the
probability that the method will give a correct answer
if you use 95% confidence intervals, in the long run
95% of your intervals will contain the true parameter value
the margin of error of a confidence interval decreases as
the confidence level C decreases, the sample size n increases, and the population standard deviation decreases.
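Worked example for the confidence interval cards above: a Python sketch under the simple conditions (SRS, Normal population, σ known), where the margin of error is z* σ / √n. The function names and the numbers used are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, confidence):
    """Margin of error z* sigma / sqrt(n) for a mean when sigma is known."""
    # z* is the standard Normal critical value with central area = confidence.
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return z_star * sigma / sqrt(n)

def confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (estimate - margin of error, estimate + margin of error)."""
    m = margin_of_error(sigma, n, confidence)
    return (x_bar - m, x_bar + m)

# Hypothetical data summary: sample mean 52 from n = 100, with sigma = 10 known.
print(confidence_interval(52, sigma=10, n=100, confidence=0.95))
```

Lowering the confidence level, raising n, or lowering σ each shrinks the value returned by `margin_of_error`.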
intended to assess the evidence provided by data against a null hypothesis in favor of an alternative hypothesis
test of significance
the hypotheses are stated in terms of
population parameters
the test of significance is based on
a test statistic
the p-value is
the probability, computed assuming H0 is true, that the test statistic will take a value at least as extreme as that actually observed
small p-values indicate
strong evidence against H0
calculating p-values requires
knowledge of the sampling distribution of the test statistic
if the p-value is as small as or smaller than a specified value α, the data are statistically
significant at significance level α
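Worked example for the significance test cards above: a Python sketch of a one-sample z test under the simple conditions (σ known), using the test statistic z = (x-bar − μ0) / (σ / √n). The function name `z_test_p_value` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_value(x_bar, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p-value) for H0: mu = mu0 when sigma is known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    std = NormalDist()  # sampling distribution of z when H0 is true
    if alternative == "greater":
        p = 1 - std.cdf(z)
    elif alternative == "less":
        p = std.cdf(z)
    elif alternative == "two-sided":
        p = 2 * (1 - std.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p

# Hypothetical data summary: sample mean 52, n = 100, sigma = 10, H0: mu = 50.
z, p = z_test_p_value(52, mu0=50, sigma=10, n=100)
alpha = 0.05
print(z, p, p <= alpha)  # significant at level alpha when p <= alpha
```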
the power of a significance test measures
its ability to detect an alternative hypothesis
increasing the size of the sample _______ the power when the significance level remains fixed
increases
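Worked example for the power cards above: a Python sketch of the power of a one-sided z test (H0: μ = μ0 against Ha: μ > μ0) with σ known. The function name `power_one_sided` and the chosen alternative value are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided(mu0, mu_alt, sigma, n, alpha=0.05):
    """Probability of rejecting H0: mu = mu0 (vs mu > mu0) when mu = mu_alt."""
    std = NormalDist()
    z_alpha = std.inv_cdf(1 - alpha)  # reject H0 when z >= z_alpha
    # Under the alternative, z is Normal with this mean and standard deviation 1.
    shift = (mu_alt - mu0) / (sigma / sqrt(n))
    return 1 - std.cdf(z_alpha - shift)

# Hypothetical setting: mu0 = 50, alternative mu = 52, sigma = 10, alpha = 0.05.
power_25 = power_one_sided(50, 52, sigma=10, n=25)
power_100 = power_one_sided(50, 52, sigma=10, n=100)
print(round(power_25, 3), round(power_100, 3))
```

With the significance level held fixed, the larger sample gives the higher power.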