Stats Exam 3 Flashcards
Central Limit Theorem
- Sample means from a normally distributed population are normally distributed
- As n increases (> 30), sample means from skewed distributions become approximately normally distributed
- The mean of all sample means is the population mean (also true for proportions)
- The standard deviations of normally distributed sample means and proportions are: ….
What can Z scores tell you about sample means
Z scores can also tell us how far a sample mean is from the population mean, and therefore how likely or unlikely a given sample mean is
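A minimal sketch of this idea, assuming the usual textbook standard error for a mean (population sd divided by the square root of n). The function name and the numbers are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_for_sample_mean(sample_mean, pop_mean, pop_sd, n):
    """Standardize a sample mean using the standard error pop_sd / sqrt(n)."""
    standard_error = pop_sd / sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# Hypothetical numbers: population mean 100, population sd 15, sample of 36.
z = z_for_sample_mean(sample_mean=105, pop_mean=100, pop_sd=15, n=36)

# How likely is a sample mean at least this far above the population mean?
upper_tail = 1 - NormalDist().cdf(z)

print(round(z, 2), round(upper_tail, 4))  # prints: 2.0 0.0228
```

Here the sample mean sits 2 standard errors above the population mean, so a sample mean this high or higher would be fairly unlikely.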
What happens to Z as n increases
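For a fixed difference between the sample mean and the population mean, Z moves farther from zero, because the standard error shrinks toward zero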
Central Limit Theorem also applies to…
CLT also applies to proportions which are used for categorical data
3 forms of inference
- Point estimation
- Confidence intervals
- Hypothesis Testing
Point Estimation
using a single value from a sample to estimate a population parameter
Confidence Intervals
(Interval Estimation)
- Using a range of values to estimate a parameter
- Stating our confidence that an interval captures a parameter
- smaller interval/range, less confidence
Hypothesis Testing
using samples and probability to support or reject assumptions about population parameters
sampling error
random sampling produces samples that aren’t exactly like the population
interval estimation
incorporates the likely size of the sampling error associated with the point estimate
weakness of point estimation
without quantifying the likely amount of estimation error (sampling error), point estimates are of limited use
Confidence Interval (CI)
a range of plausible values for a parameter in addition to the level of confidence that the parameter is included within the interval
2 components of CI
- Interval
- Confidence
a range of values that is likely to include μ (the population mean)
principle of “confidence” is the same in both scenarios
margin of error for means
- an absolute quantity
- size of an interval (precision)
- partly chosen --> z score
- partly natural --> sd
- partly experimentally determined --> n
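To see how each of the three ingredients moves the margin of error, here is a small sketch. It assumes the standard formula (z score times sd divided by the square root of n); the function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(confidence, sd, n):
    """Margin of error for a mean: z * sd / sqrt(n) (assumed textbook formula)."""
    # Two-sided z score for the chosen confidence level, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / sqrt(n)

base = margin_of_error(0.95, sd=10, n=25)
more_confident = margin_of_error(0.99, sd=10, n=25)  # chosen: larger z score
more_spread = margin_of_error(0.95, sd=20, n=25)     # natural: larger sd
bigger_sample = margin_of_error(0.95, sd=10, n=100)  # experimental: larger n

print(round(base, 2), round(more_confident, 2),
      round(more_spread, 2), round(bigger_sample, 2))
```

More confidence or more spread widens the interval. Quadrupling n only halves it, because n enters through a square root.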
confidence intervals downside
- created when we do NOT know the population mean
- Establish a range of values that “probably” includes the population mean
- how probable depends entirely on the choice of a z score
steps to find a confidence interval
- Find Z score for a … CI
- Calculate Standard Error
- Calculate Margin of Error
- Apply Margin of Error to Point Estimate
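The four steps above can be sketched as a z-based confidence interval for a mean. This is an illustration only: it assumes the sd is known, the function name is hypothetical, and the numbers are made up.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(point_estimate, sd, n, confidence=0.95):
    """Follow the four steps to build a confidence interval for a mean."""
    # Step 1: find the z score for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Step 2: calculate the standard error.
    standard_error = sd / sqrt(n)
    # Step 3: calculate the margin of error.
    margin = z * standard_error
    # Step 4: apply the margin of error to the point estimate.
    return point_estimate - margin, point_estimate + margin

# Hypothetical sample: mean 50, sd 8, n = 64, 95% confidence.
low, high = z_confidence_interval(point_estimate=50, sd=8, n=64)
print(round(low, 2), round(high, 2))  # prints: 48.04 51.96
```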
hypotheses
claims or statements about population parameters (never about samples)
null hypothesis
the "no effect, no difference, nothing special" hypothesis (Ho)
generally does not reflect the researcher's belief
alternative hypotheses
Ha
3 possible forms:
1. a parameter is greater than some value: Ha: μ > #
2. a parameter is less than some value: Ha: μ < #
3. a parameter is different from some value: Ha: μ ≠ #
one tailed Ha
can be supported by sample statistics from only one tail of a distribution (less/greater)
two tailed Ha
can be supported by sample statistics from both tails (different)
critical regions
tail regions of sampling distributions that contain unlikely values, that when observed lead us to reject Ho
critical values
specific standardized scores (like Z scores) that separate critical regions from the rest of the curve
alpha
=significance level
- a= the area under a normal curve with unlikely (extreme) observations, such that when observed, we reject the null hypothesis and support the Ha
- a= the acceptable rate of a type 1 error, mistakenly rejecting the null
- a & Ha determine the critical values
One tailed Ha & alpha
1 C.V. that puts all of alpha in one tail
Two tailed Ha & alpha
2 C.V.s that split alpha into 2 tails
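As an illustration of how alpha and the form of Ha set the critical values, here is a sketch for z scores. The function name and the "greater" / "less" / "two" labels are hypothetical conventions, not part of the course material.

```python
from statistics import NormalDist

def critical_values(alpha, tails):
    """Return z critical values for a given alpha and form of Ha."""
    std_normal = NormalDist()
    if tails == "two":
        # Two-tailed Ha: alpha is split in half between the two tails.
        cv = std_normal.inv_cdf(1 - alpha / 2)
        return (-cv, cv)
    if tails == "greater":
        # One-tailed Ha (greater than): all of alpha goes in the upper tail.
        return (std_normal.inv_cdf(1 - alpha),)
    if tails == "less":
        # One-tailed Ha (less than): all of alpha goes in the lower tail.
        return (std_normal.inv_cdf(alpha),)
    raise ValueError("tails must be 'two', 'greater', or 'less'")

print([round(cv, 3) for cv in critical_values(0.05, "greater")])  # [1.645]
print([round(cv, 3) for cv in critical_values(0.05, "two")])      # [-1.96, 1.96]
```

With the same alpha, the two-tailed critical values sit farther from zero, since each tail only holds half of alpha.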
Hypothesis Testing Steps
- Write the Ha and Ho in statistical terms
- choose alpha and determine critical values
- calculate a test statistic. For tests with one sample, 3 choices
- Compare test statistic to a CV or calculate a p value
- State conclusions in context
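The steps above can be sketched for one common case: a two-tailed z test for a single sample mean with a known population sd. The function name and the numbers are hypothetical, and other one-sample tests would swap in a different test statistic.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, null_mean, pop_sd, n, alpha=0.05):
    """Two-tailed z test of Ho: mu = null_mean against Ha: mu != null_mean."""
    std_normal = NormalDist()
    # Choose alpha and determine the (positive) critical value.
    critical_value = std_normal.inv_cdf(1 - alpha / 2)
    # Calculate the test statistic.
    z = (sample_mean - null_mean) / (pop_sd / sqrt(n))
    # p value: probability, if Ho is true, of a z at least this extreme.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    # Reject Ho when the test statistic falls in a critical region.
    reject = abs(z) > critical_value
    return z, critical_value, p_value, reject

# Hypothetical data: sample mean 103 from n = 64, Ho: mu = 100, sd = 12.
z, cv, p, reject = one_sample_z_test(sample_mean=103, null_mean=100, pop_sd=12, n=64)
print(round(z, 2), round(cv, 2), round(p, 4), reject)
```

Comparing z to the critical value and comparing the p value to alpha always lead to the same decision; the conclusion still has to be stated in the context of the problem.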
p value
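probability, assuming the null hypothesis is true, of getting a test statistic at least as extreme as the one observed (a small p value means the observed difference between the sample mean and the hypothesized population mean would be unlikely by chance alone)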
T scores for sample means
- almost identical to Z scores (same assumptions)
- used when we don’t know the population standard deviation
- substitute the sample standard deviation into the standard error expression
degrees of freedom
df=n-1
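A short sketch of a one-sample t score, where the sample standard deviation replaces the unknown population standard deviation in the standard error. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def t_score(sample, null_mean):
    """Return the t score for a sample mean and its degrees of freedom."""
    n = len(sample)
    # stdev() is the sample standard deviation (divides by n - 1).
    standard_error = stdev(sample) / sqrt(n)
    t = (mean(sample) - null_mean) / standard_error
    return t, n - 1

# Hypothetical sample of 9 measurements, tested against a mean of 18.
sample = [16, 16, 18, 20, 20, 20, 22, 24, 24]
t, df = t_score(sample, null_mean=18)
print(t, df)  # prints: 2.0 8
```

The critical value then comes from a t table at df = 8. For a two-tailed test at alpha = 0.05 that is about 2.306, a little farther out than the z value of 1.96.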
choose between t score and z score?
use the z score when the population standard deviation is known (it will be more accurate); otherwise use the t score
Type 1 error
rejecting the null, when the null is actually true
- occurs when we get an extreme test statistic by chance alone
- p(type 1 error) = alpha
- alpha is chosen in advance
Type 2 error
failing to reject the null, when the null is false
- must be calculated
- p(type 2 error)= beta
Power
the ability to reject a false null hypothesis (power = 1 - beta)
Power calculations
are used to determine the sample size needed to reveal the smallest difference that is actually interesting between two hypothesized values of a parameter
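A rough sketch of such a calculation, assuming an upper-tailed one-sample z test with a known population sd. The function names, the 80% power target, and the numbers are all hypothetical choices for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

def power_one_tailed_z(true_diff, pop_sd, n, alpha=0.05):
    """Power of an upper-tailed z test when the true mean is true_diff above the null."""
    std_normal = NormalDist()
    critical_value = std_normal.inv_cdf(1 - alpha)
    # How many standard errors the true mean sits above the null mean.
    shift = true_diff / (pop_sd / sqrt(n))
    # Probability the test statistic lands in the critical region.
    return 1 - std_normal.cdf(critical_value - shift)

def sample_size_one_tailed_z(smallest_diff, pop_sd, alpha=0.05, power=0.80):
    """Smallest n that reaches the target power for the smallest interesting difference."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    z_beta = std_normal.inv_cdf(power)
    return ceil(((z_alpha + z_beta) * pop_sd / smallest_diff) ** 2)

# Hypothetical: smallest interesting difference of 5 units, sd of 15.
n_needed = sample_size_one_tailed_z(smallest_diff=5, pop_sd=15)
achieved = power_one_tailed_z(true_diff=5, pop_sd=15, n=n_needed)
beta = 1 - achieved  # probability of a type 2 error at this sample size
print(n_needed, round(achieved, 3), round(beta, 3))
```

A smaller interesting difference, a smaller alpha, or a higher power target all push the required sample size up.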