AP Final Exam Flashcards
SOCS - fairly symmetric
- Shape
- Center - mean
- Spread - SD
- Outliers - too small if below Q1 - 1.5(IQR)
too big if above Q3 + 1.5(IQR)
SOCS - slightly/strongly skewed
- Shape
- Center - median
- Spread - IQR
- Outliers - too small if below Q1 - 1.5(IQR)
too big if above Q3 + 1.5(IQR)
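The 1.5 × IQR outlier rule can be sketched in a few lines of Python. This is only an illustration: the function names and the quartile values are made up, and the quartiles are assumed to come from 1 variable stats on the calculator.

```python
def outlier_fences(q1, q3):
    """Return the (low, high) cutoffs from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_outlier(value, q1, q3):
    """True if value falls below the low cutoff or above the high cutoff."""
    low, high = outlier_fences(q1, q3)
    return value < low or value > high


# Made-up quartiles: Q1 = 10 and Q3 = 20, so IQR = 10
print(outlier_fences(10, 20))  # (-5.0, 35.0)
print(is_outlier(40, 10, 20))  # True
```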
How to find SOCS easily
Make list –> stats –> stat calc –> 1 variable stats
True or false: you can determine the shape with a boxplot
False
Interpret SD
The (context) typically varies by (SD) from the mean of (mean).
Interpret percentile
(Percentile) % of (context) are less than or equal to (value).
Interpret z-score
(Specific value w/ context) is (z-score) standard deviations (above/below) the mean.
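As a quick check on the z-score card, here is a minimal sketch of the calculation. The function name and the numbers (a score of 85 with mean 70 and SD 10) are made up for illustration.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value sits above (+) or below (-) the mean."""
    return (value - mean) / sd


# Made-up example: a score of 85 when the mean is 70 and the SD is 10
z = z_score(85, 70, 10)
direction = "above" if z > 0 else "below"
print(f"85 is {abs(z)} standard deviations {direction} the mean.")
```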
Describing the relationship in a scatterplot
DUFS:
- Direction - (+) or (-)
- Unusual features - outliers or clusters
- Form - linear or nonlinear
- Strength - weak, moderate, or strong
Calculate slope
use b in y(hat) = a + bx –> b/increment (ex. years)
Interpret slope
The predicted (y-context) (increases/decreases) by (slope) for each additional (x-context).
Interpret coefficient of determination (r^2)
(r^2 as percentage) of the variation in (y-context) can be explained by the linear relationship with (x-context).
Calculate residual (r)
r = (actual) - (predicted)
predicted = y(hat) = a + bx
Interpret residual (r)
The actual (y-context) was (r) less/greater than the predicted (y-context & predicted value).
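The residual calculation can be sketched as below. The intercept, slope, and data point are made-up values, and the function names are only for illustration.

```python
def predict(a, b, x):
    """Predicted value from the regression line y(hat) = a + bx."""
    return a + b * x


def residual(actual, a, b, x):
    """Residual = actual - predicted."""
    return actual - predict(a, b, x)


# Made-up line y(hat) = 2 + 0.5x and an observed point (x = 10, y = 8)
print(predict(2, 0.5, 10))      # 7.0
print(residual(8, 2, 0.5, 10))  # 1.0, so the actual y was 1 greater than predicted
```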
Convenience sample
Selected for inclusion because they are easy to access
(ex. first 30 people to walk through the door)
- Underestimates or overestimates true proportion
- Not representative of population
Voluntary response
Choose to participate in a survey or experiment
Simple random sample
Random sample chosen so that every group of n individuals in the population has an equal chance of being selected.
Stratified sampling
Takes population and splits it into groups (strata) based on a characteristic that we think has some effect.
(ex. SRS within grades)
- Homogeneous groups
- SRS within each group
Cluster sample
Population is split into groups (clusters) that each look similar to the population; an SRS of clusters is taken and everyone in the chosen clusters is sampled.
(ex. SRS of classrooms)
- Heterogeneous groups
- SRS of groups
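The difference between an SRS, a stratified sample, and a cluster sample can be seen in a short sketch using Python's random module. The function names and the grade/classroom data are hypothetical; this is one way to mimic the three methods, not a required procedure.

```python
import random


def simple_random_sample(population, n, rng):
    """SRS: every group of n individuals is equally likely to be chosen."""
    return rng.sample(population, n)


def stratified_sample(strata, n_per_stratum, rng):
    """Take an SRS within each stratum (dict: stratum name -> list of individuals)."""
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, n_per_stratum))
    return chosen


def cluster_sample(clusters, n_clusters, rng):
    """Take an SRS of whole clusters and keep everyone in the chosen clusters."""
    picked = rng.sample(clusters, n_clusters)
    return [person for cluster in picked for person in cluster]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
grades = {g: [f"grade{g}_student{i}" for i in range(50)] for g in (9, 10, 11, 12)}
classrooms = [[f"room{r}_student{i}" for i in range(25)] for r in range(8)]
everyone = [s for members in grades.values() for s in members]

print(len(simple_random_sample(everyone, 20, rng)))  # 20 students overall
print(len(stratified_sample(grades, 5, rng)))        # 5 per grade = 20
print(len(cluster_sample(classrooms, 2, rng)))       # 2 whole rooms = 50
```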
Undercoverage bias
Some members of the population cannot be chosen or are less likely to be chosen (ex. they don’t have access to the survey)
Nonresponse bias
Selected individuals can’t be contacted or don’t reply
Response bias
Participants who are untruthful
Confounding variable
Variable related to both the explanatory and response variables that can create a misleading association between them
Observational study
Variables are observed to determine if there’s a correlation
Experimental study
Treatments are deliberately imposed (with random assignment) to determine if there’s causation
Use blocking for…
Experimental designs
Use stratifying and clustering for…
Observational studies
Mutually exclusive
Probability for A or B (A U B); cannot occur together
- think ADDITION
- If there’s an overlap (not mutually exclusive), subtract it: P(A or B) = P(A) + P(B) - P(A and B)
Independent
Probability A and B; one thing does not affect the other
- think MULTIPLY
- If A already happened, the probability of B stays the same: P(B|A) = P(B)
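The ADDITION and MULTIPLY reminders can be written out as two small functions. This is a sketch with made-up probabilities; the function names are only for illustration, and the multiplication shortcut assumes A and B really are independent.

```python
def p_a_or_b(p_a, p_b, p_a_and_b=0.0):
    """Addition rule. Leave p_a_and_b at 0 when A and B are mutually exclusive."""
    return p_a + p_b - p_a_and_b


def p_a_and_b_independent(p_a, p_b):
    """Multiplication rule, valid only when A and B are independent."""
    return p_a * p_b


# Made-up probabilities
print(p_a_or_b(0.3, 0.4))               # mutually exclusive events
print(p_a_or_b(0.3, 0.4, 0.1))          # overlap of 0.1 is subtracted
print(p_a_and_b_independent(0.5, 0.2))  # independent events
```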
How to find probability when given percentages
Tree diagram
Interpret probability (A) (mutually exclusive)
After many, many (context), the proportion of times that (context A) will occur is about (P(A)).
Describing binomial distribution
Conditions: BINS
1) Binary - success or failure
2) Independent
3) Number of trials fixed - n = ___
4) Same probability - p = ___
- Shape - approximately normal if np ≥ 10 and n(1-p) ≥ 10
- Center - mean: μx = np
- Spread - SD: σx = √(np(1-p))
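The binomial center, spread, and shape check can be sketched as follows. The values n = 100 and p = 0.5 are made up, and the function names are only for illustration.

```python
import math


def binomial_mean_sd(n, p):
    """Mean np and standard deviation sqrt(np(1-p)) of a binomial count."""
    return n * p, math.sqrt(n * p * (1 - p))


def large_counts_ok(n, p):
    """Shape check: approximately normal when np >= 10 and n(1-p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10


# Made-up setting: n = 100 trials with p = 0.5
print(binomial_mean_sd(100, 0.5))  # (50.0, 5.0)
print(large_counts_ok(100, 0.5))   # True
print(large_counts_ok(20, 0.1))    # False, since np = 2
```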
Calculate binomial probability (exactly)
binompdf
(clearly identify parameters)
Calculate binomial (at least)
1 - binomcdf, using x - 1 as the x-value
(clearly identify parameters)
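To see what the calculator is doing for "exactly" and "at least", here is a minimal sketch built from the binomial formula. The function names are hypothetical stand-ins for the calculator commands, not the calculator's own code, and n = 10, p = 0.5 are made-up parameters.

```python
import math


def binom_pmf(n, p, x):
    """P(X = x): exactly x successes in n trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(n, p, x):
    """P(X <= x): at most x successes in n trials."""
    return sum(binom_pmf(n, p, k) for k in range(x + 1))


def binom_at_least(n, p, x):
    """P(X >= x) = 1 - P(X <= x - 1)."""
    return 1 - binom_cdf(n, p, x - 1)


# Made-up parameters: n = 10, p = 0.5
print(binom_pmf(10, 0.5, 5))       # exactly 5 successes
print(binom_at_least(10, 0.5, 1))  # at least 1 success
```

Note that "at least x" subtracts the probability of x - 1 or fewer, not x or fewer.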
Interpret conditional probability (independence)
Given (context B), there is a P(A|B) probability of (context A)
Interpret expected value (mean, μ)
If the random process of (context) is repeated many times, the average number of (x context) we can expect is (expected value).
Interpret binomial mean (μx)
After many, many trials, the average number of (success context) out of (n) is (μx).
Interpret binomial SD (σx)
The number of (success context) out of (n) typically varies by (σx) from the mean of (μx).
Transforming random variables (multiply/divide by A)
- Mean - multiply or divide by A
- SD - multiply or divide by A
- Variance - multiply or divide by A^2
Transforming random variables (add/subtract A)
- Mean - add or subtract A
- SD - no change
- Variance - no change
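The two transformation cards can be checked with a short sketch. The function names and starting values (mean 10, SD 2) are made up; each function returns the new mean, SD, and variance.

```python
def multiply_by(mean, sd, a):
    """Multiplying by a scales the mean by a, the SD by |a|, and the variance by a^2."""
    new_sd = abs(a) * sd
    return a * mean, new_sd, new_sd**2


def add_constant(mean, sd, a):
    """Adding a shifts the mean by a and leaves the SD and variance unchanged."""
    return mean + a, sd, sd**2


# Made-up random variable with mean 10 and SD 2 (variance 4)
print(multiply_by(10, 2, 3))   # (30, 6, 36)
print(add_constant(10, 2, 5))  # (15, 2, 4)
```

Dividing by A is the same as multiplying by 1/A, and subtracting A is the same as adding -A.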
Combining random variables (S = X + Y)
- μ = μx + μy
- σ = √(σx^2 + σy^2) (X and Y independent)
- σ^2 = σx^2 + σy^2 (variance)
Combining random variables (D = X - Y)
- μ = μx - μy
- σ = √(σx^2 + σy^2) (X and Y independent)
- σ^2 = σx^2 + σy^2 (variance; still ADD, even for a difference)
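A small sketch of combining two random variables, assuming X and Y are independent. The function name and the numbers are made up; the key point it shows is that variances add for both the sum and the difference.

```python
import math


def combine_independent(mean_x, sd_x, mean_y, sd_y, difference=False):
    """Return (mean, SD, variance) of X + Y, or of X - Y when difference=True.

    Assumes X and Y are independent, so the variances add in both cases.
    """
    mean = mean_x - mean_y if difference else mean_x + mean_y
    variance = sd_x**2 + sd_y**2
    return mean, math.sqrt(variance), variance


# Made-up values: X has mean 10 and SD 3, Y has mean 4 and SD 4
print(combine_independent(10, 3, 4, 4))                   # (14, 5.0, 25)
print(combine_independent(10, 3, 4, 4, difference=True))  # (6, 5.0, 25)
```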
Calculate normal distribution probability (more/less than)
normalcdf(lower bound, upper bound, μ, σ)
Calculate normal distribution value (given % / area to the left on curve)
invNorm(area, μ, σ)
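For checking calculator answers, the same two normal calculations can be sketched with Python's statistics.NormalDist. The wrapper names normal_prob and normal_value are hypothetical, and the mean of 70 and SD of 10 are made up.

```python
from statistics import NormalDist


def normal_prob(lower, upper, mu, sigma):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)


def normal_value(area_left, mu, sigma):
    """Value that has the given area (proportion) to its left."""
    return NormalDist(mu, sigma).inv_cdf(area_left)


# Made-up distribution: mean 70, SD 10
print(round(normal_prob(60, 80, 70, 10), 4))            # within 1 SD of the mean
print(round(normal_prob(85, float("inf"), 70, 10), 4))  # "more than 85"
print(round(normal_value(0.90, 70, 10), 2))             # 90th percentile
```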
Identifying when to use geometric distributions
“On any given ___, there is a ___% probability…”
Describing a geometric distribution
Conditions: BIFS
- Binary - success or failure
- Independence
- First success
- Same probability
- Shape - skewed right
- Center - μx = 1/p
- Variability - σx = √(1-p) / p
Find the probability of a geometric distribution (until)
geometricpdf(p, x)
Find the probability of a geometric distribution (within)
geometriccdf(p, lower, upper)
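The "until" and "within" geometric probabilities can be sketched directly from the formulas. The function names are hypothetical stand-ins for the calculator commands, and p = 0.5 is a made-up value.

```python
def geom_pmf(p, x):
    """P(X = x): the first success happens on trial x."""
    return (1 - p) ** (x - 1) * p


def geom_cdf(p, x):
    """P(X <= x): the first success happens within the first x trials."""
    return 1 - (1 - p) ** x


# Made-up probability of success: p = 0.5
print(geom_pmf(0.5, 3))  # 0.125
print(geom_cdf(0.5, 3))  # 0.875
```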
Sampling distribution
Many, many samples and a statistic calculated for each of those samples
What makes a good statistic?
- No bias
- Low variability
Z score for one sample proportion
z = (p(hat) - p) / √(p(1-p)/n)
Z score for one sample mean
z = (x̄ - μ) / (σ/√n)
Calculate z score into p-value
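normcdf(z score, 1E99, 0, 1) for a right-tail p-value
(for a left tail, use -1E99 as the lower bound and the z score as the upper bound)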
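The two z-score formulas and the conversion to a right-tail p-value can be sketched as below. The function names and sample values are made up, and the p-value function covers only the right-tail case.

```python
import math
from statistics import NormalDist


def z_one_proportion(p_hat, p, n):
    """z = (p_hat - p) / sqrt(p(1-p)/n)."""
    return (p_hat - p) / math.sqrt(p * (1 - p) / n)


def z_one_mean(x_bar, mu, sigma, n):
    """z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / math.sqrt(n))


def right_tail_p_value(z):
    """Area to the right of z under the standard normal curve."""
    return 1 - NormalDist().cdf(z)


# Made-up samples
z_p = z_one_proportion(0.6, 0.5, 100)  # about 2
z_m = z_one_mean(52, 50, 10, 25)       # 1.0
print(round(z_p, 2), round(right_tail_p_value(z_p), 4))
print(round(z_m, 2), round(right_tail_p_value(z_m), 4))
```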
Interpret standard deviation of sample proportions (σp(hat))
The sample proportion of (success context) typically varies by (σp(hat)) from the true proportion of (p).
Interpret standard deviation of sample means (σx̄)
The sample mean amount of (x-context) typically varies by (σx̄) from the true mean of (μx).
One sample confidence interval for mean (μ)
1) State: μ = true mean (context)
CL = ___
2) Plan: name: one sample t interval for μ
Conditions: 1) random
2) 10% rule
3) normal - pop. distribution is normal, or
CLT (n ≥ 30), or graph shows no strong
skew or outliers
3) Do: x̄ +/- t* (s/√n)
4) Conclude: We are (CL)% confident that the interval
from ___ to ___ captures the true mean of (context).
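The Do step for a one sample t interval can be sketched as follows. Python's standard library has no t distribution, so this sketch assumes t* is looked up in a table or on the calculator and passed in. The function name and sample values are made up; 2.064 is the table value for 95% confidence with df = 24.

```python
import math


def one_sample_t_interval(x_bar, s, n, t_star):
    """x_bar +/- t* (s / sqrt(n)); t_star must be supplied for df = n - 1."""
    margin = t_star * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Made-up sample: x_bar = 50, s = 10, n = 25, so df = 24 and t* = 2.064 for 95%
low, high = one_sample_t_interval(50, 10, 25, 2.064)
print(round(low, 3), round(high, 3))  # 45.872 54.128
```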
One sample confidence interval for proportions (p)
1) State: p = true proportion (context)
CL = ___
2) Plan: name: one sample z interval for p
Conditions: 1) random
2) 10% rule
3) normal - Large Counts: n·p(hat) ≥ 10 and n(1-p(hat)) ≥ 10
3) Do: p(hat) +/- z* √(p(hat)(1-p(hat))/n)
4) Conclude: We are (CL)% confident that the interval
from ___ to ___ captures the true proportion of
(context).
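The Do step for a one sample z interval can be sketched with statistics.NormalDist supplying z*. The function name and sample values (p(hat) = 0.6, n = 100, 95% confidence) are made up for illustration.

```python
import math
from statistics import NormalDist


def one_sample_z_interval(p_hat, n, confidence):
    """p_hat +/- z* sqrt(p_hat(1 - p_hat)/n), with confidence given as a decimal."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Made-up sample: p_hat = 0.6 from n = 100, 95% confidence
low, high = one_sample_z_interval(0.6, 100, 0.95)
print(round(low, 3), round(high, 3))
```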