MIXED VOCAB FOR WINTER Flashcards
Minimum sample size for means?
If the population is roughly normal, there is no minimum sample size. If it is skewed, bimodal, or otherwise non-normal, then n > 30.
What is a sampling distribution?
A pile of statistics taken from many many many samples
What are the sample requirements for inference for both means and proportions?
1. [BOTH] You need a random sample. 2. [BOTH] The sample is less than 10% of the population. 3. [DIFFER] PROPS: np ≥ 10 and nq ≥ 10; MEANS: n > 30 unless from a normal population, then no minimum.
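A quick sketch of the checkable conditions in Python. The function names and the example numbers are made up for illustration, and "at least 10" is read as 10 or more. The random-sample condition cannot be checked by code, so it is left out.

```python
def proportion_conditions_ok(n: int, p: float) -> bool:
    """Success/failure check: at least 10 expected successes and 10 expected failures."""
    q = 1 - p
    return n * p >= 10 and n * q >= 10


def ten_percent_ok(n: int, population_size: int) -> bool:
    """The sample should be less than 10% of the population."""
    return n < 0.10 * population_size


print(proportion_conditions_ok(40, 0.2))   # False: np = 8
print(proportion_conditions_ok(100, 0.2))  # True: np = 20, nq = 80
print(ten_percent_ok(100, 5000))           # True: 100 < 500
```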
What is formula for nCr ?
n! / ( r! (n-r)! )
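To see the formula in action, here is a small Python sketch. The helper name n_choose_r is hypothetical; math.comb is Python's built-in version of the same count.

```python
from math import comb, factorial


def n_choose_r(n: int, r: int) -> int:
    """Number of ways to choose r items from n: n! / (r! (n-r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))


# 5 choose 2: 5! / (2! * 3!) = 120 / (2 * 6) = 10
print(n_choose_r(5, 2))  # 10
print(comb(5, 2))        # 10, the built-in equivalent
```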
Describe independence and association with categorical examples.
Grade and pizza preference are independent; gender and gaming status are associated.
How do you describe an association between two quantitative variables? (scatter plot)
DIRECTION (pos/neg) FORM (linear,curved) STRENGTH (strong, moderate, report “r” value)
Describe independence and association with quantitative examples.
Height and IQ are independent. Height and weight are associated.
function to find a percentile in normal model?
INVNORM
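The card names a calculator function. As a stand-in, Python's statistics.NormalDist has an inv_cdf method that does the same job: give it the area to the left and it returns the cutoff value. The helper name percentile_cutoff is an assumption for this sketch.

```python
from statistics import NormalDist


def percentile_cutoff(area_to_left: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Value in a normal model that has the given area to its left."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(area_to_left)


# 90th percentile of the standard normal model
z_90 = percentile_cutoff(0.90)
print(round(z_90, 3))  # 1.282
```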
What is a p-value?
The probability of getting your statistic, or one more extreme, just by chance if the null were actually true.
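One way to make "just by chance if the null were true" concrete is to simulate it. This sketch assumes a made-up scenario (60 heads in 100 flips, null: the coin is fair) and counts how often chance alone gives a result at least that extreme. The function name and the numbers are illustrative only.

```python
import random


def simulated_p_value(observed_heads: int, flips: int, trials: int, seed: int = 0) -> float:
    """Share of fair-coin simulations with at least as many heads as observed."""
    rng = random.Random(seed)
    as_extreme = 0
    for _ in range(trials):
        # One simulated experiment under the null (p = 0.5)
        heads = sum(rng.random() < 0.5 for _ in range(flips))
        if heads >= observed_heads:
            as_extreme += 1
    return as_extreme / trials


p_value = simulated_p_value(observed_heads=60, flips=100, trials=5000)
print(p_value)  # a one-sided p-value, expected to land somewhere near 0.03
```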
What is probability?
Long run relative frequency. (the long run percent)
Minimum sample size for proportions?
You need at least 10 successes, np ≥ 10, and at least 10 failures, nq ≥ 10
Interpret r^2 ?
The percent of variability in Y explained by the model with X
What is error?
Distance from a statistic to the parameter. How far off your stat is from the truth.
What is a Z score?
the number of SD a data value is away from the mean
What graphs for CATEGORICAL data?
segmented bar, bar, pie, mosaic
What does SD of residuals tell us?
Typical residual. Average distance to the model. About how far off we expect model to be.
What is variance?
A measure of spread- the average squared distance to the mean. SD^2
What points are outliers in regression?
Those that don’t follow the flow.
function to find area under normal curve?
normcdf
How do you describe the distribution of a single data set? (a histogram)?
SHAPE (#modes, skewness), CENTER (measure of center), SPREAD (measure of spread), STRANGE (outliers or gaps)
Interpret SLOPE EQUATION: rSy/Sx
For every 1 unit of x, there is a change of SLOPE units of y
Diff between standard deviation and standard error?
Standard deviation is typical distance to mean for a data point, Standard error is typical distance to parameter for a statistic in a sampling distribution.
What is the Law of Large Numbers?
In the long run, after many many trials, the % of successes approaches the true probability. Think: if you flip a coin twice, you may get 0% heads, 50% heads or 100% heads. If you flip 10,000 times, you probably will have about 50% heads (def not 0 or 100)
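The coin example can be tried out with a short simulation. This is only a sketch: the function name and the seed are arbitrary choices, and the exact percentages will differ from run to run if the seed changes.

```python
import random


def proportion_heads(flips: int, rng: random.Random) -> float:
    """Flip a fair coin `flips` times and return the share of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return heads / flips


rng = random.Random(1)
for flips in (2, 10, 100, 10_000):
    # Small runs bounce around; long runs settle near 0.5
    print(flips, proportion_heads(flips, rng))
```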
Suppose p value = 0.003. How would you interpret?
With a p-value this low (0.003 < 0.05), I reject the Ho, there is enough evidence to say [Ha in context]
What are the measures of spread we use?
standard deviation, variance, range, interquartile range, standard error
What points have influence in regression?
Those that would change the slope if removed (they are outliers that have leverage)
What is margin of error?
Distance you reach up and down when making CI. It is CRIT * SE
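A worked sketch of CRIT * SE for a proportion, using made-up numbers (p-hat = 0.60, n = 100). The critical value here comes from the normal model (about 1.96 for 95%), which rounds to the familiar 2.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(crit: float, se: float) -> float:
    """Margin of error = critical value * standard error."""
    return crit * se


p_hat, n = 0.60, 100                    # hypothetical sample result
se = sqrt(p_hat * (1 - p_hat) / n)      # standard error of p-hat
crit = NormalDist().inv_cdf(0.975)      # about 1.96 for 95% confidence
me = margin_of_error(crit, se)

print(round(me, 3))                                  # 0.096
print(round(p_hat - me, 3), round(p_hat + me, 3))    # 0.504 0.696
```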
What is alpha?
It is the rejection threshold. Reject Ho when p-value is below alpha.
What points have leverage in regression?
Those far to the left and right from x-bar
Interpret y-intercept?
When X=0, the model predicts this much Y.
What graphs for QUANTITATIVE data?
histogram, box/whisker, stemplot, dot plot, ogive, time plot, line graph
What are the measures of center we use?
mean, median, mode
What is a confidence interval?
A parameter catcher. It tries to catch the truth.
What does “95% confident” mean?
If you took 100 samples and made 100 confidence intervals, about 95 would contain the parameter and about 5 would not.
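This idea can be checked by simulation. The sketch below assumes a normal population with a known mean and SD (both made up: mu = 50, sigma = 10), builds a 95% interval from each of many samples, and reports what share of the intervals caught mu.

```python
import random
from statistics import NormalDist, mean


def coverage_rate(trials: int, n: int, mu: float, sigma: float, seed: int = 0) -> float:
    """Share of 95% confidence intervals (known sigma) that contain mu."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(0.975)   # about 1.96
    se = sigma / n ** 0.5                # standard error of the sample mean
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - crit * se <= mu <= x_bar + crit * se:
            hits += 1
    return hits / trials


print(coverage_rate(trials=2000, n=25, mu=50, sigma=10))  # expected to be close to 0.95
```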
What is a test statistic?
The number of SE a statistic is away from the hypothesized parameter.
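As a small sketch with made-up numbers: a sample proportion of 0.56 from n = 100, tested against a hypothesized p of 0.50. The function name is hypothetical.

```python
from math import sqrt


def z_test_statistic(statistic: float, hypothesized: float, se: float) -> float:
    """How many standard errors the statistic sits from the hypothesized parameter."""
    return (statistic - hypothesized) / se


p0, n = 0.50, 100
se = sqrt(p0 * (1 - p0) / n)           # 0.05, using the hypothesized p
z = z_test_statistic(0.56, p0, se)
print(round(z, 2))                     # 1.2
```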
What is the golden sentence?
I was curious about a population parameter, but a census was too costly, so instead I took a sample, used the data to calculate a statistic, and then made an inference about the parameter with that statistic.
Where are outliers located in a data set?
Outside the fences. Lower fence: Q1 - 1.5*IQR. Upper fence: Q3 + 1.5*IQR.
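A minimal sketch of the fence rule. The quartiles are passed in by hand (hypothetical Q1 = 10 and Q3 = 20) because calculators and software do not all compute quartiles the same way.

```python
def fences(q1: float, q3: float) -> tuple[float, float]:
    """Lower and upper outlier fences from the quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outside_fences(values: list[float], q1: float, q3: float) -> list[float]:
    """Values that fall below the lower fence or above the upper fence."""
    lower, upper = fences(q1, q3)
    return [x for x in values if x < lower or x > upper]


print(fences(10, 20))                            # (-5.0, 35.0)
print(outside_fences([-8, 3, 12, 36], 10, 20))   # [-8, 36]
```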
When we combine random variables, what do we add?
Add means and add variances (variances add only when the variables are independent). DO NOT ADD ST DEV. You add variances and take the square root of the sum to find the combined SD.
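A tiny check with made-up numbers, assuming the two variables are independent: SDs of 3 and 4 combine to 5, not 7.

```python
from math import sqrt


def combined_sd(sd_x: float, sd_y: float) -> float:
    """SD of X + Y (or X - Y) for independent X and Y: add variances, then take the root."""
    return sqrt(sd_x ** 2 + sd_y ** 2)


mean_x, mean_y = 10, 20
print(mean_x + mean_y)      # 30, means simply add
print(combined_sd(3, 4))    # 5.0, not 3 + 4 = 7
```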
What does rSy/Sx mean?
slope formula. For each SD in X, you go r SD in y
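Here is a sketch that builds the slope from r, Sy, and Sx on a small made-up data set. The intercept line uses the standard least-squares fact that the line passes through (x-bar, y-bar); that step and the helper name are additions for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def correlation(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation r, computed from deviations about the means."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
slope = r * stdev(ys) / stdev(xs)           # r * Sy / Sx
intercept = mean(ys) - slope * mean(xs)     # line passes through (x-bar, y-bar)

print(round(slope, 2), round(intercept, 2))  # 0.6 2.2
```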
Find P( Z > 1.5) ?
normcdf( 1.5, 9999)
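The same area can be found in Python as a cross-check. This sketch uses statistics.NormalDist as a stand-in for the calculator function; the 9999 plays the same "practically infinity" role as on the card.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal model: mean 0, SD 1

# Area to the right of 1.5 = 1 - area to the left
upper_tail = 1 - z.cdf(1.5)
print(round(upper_tail, 4))  # 0.0668

# Same idea as a lower and upper bound
area_between = z.cdf(9999) - z.cdf(1.5)
print(round(area_between, 4))  # 0.0668
```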
What is a critical value?
About 1 for 68% confidence, 2 for 95%, and 3 for 99.7%. It is the number of SE you want to reach out in a confidence interval.
What are the two sampling distributions we have discussed?
MEANS: N ( mu, sigma/root n) and PROPORTIONS: N ( p, root (pq/n) )
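Both models can be written out as a short sketch. The function names and the example numbers (mu = 100, sigma = 15, n = 25; p = 0.5, n = 100) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def sampling_dist_mean(mu: float, sigma: float, n: int) -> NormalDist:
    """Model for sample means: N(mu, sigma / root n)."""
    return NormalDist(mu, sigma / sqrt(n))


def sampling_dist_proportion(p: float, n: int) -> NormalDist:
    """Model for sample proportions: N(p, root(pq / n))."""
    q = 1 - p
    return NormalDist(p, sqrt(p * q / n))


xbar_model = sampling_dist_mean(100, 15, 25)
print(xbar_model.stdev)             # 3.0

phat_model = sampling_dist_proportion(0.5, 100)
print(round(phat_model.stdev, 4))   # 0.05
```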