Midterm Review Flashcards
Area under density curve
Always = 1
How to tell if something is an outlier
Lower fence: Q1 - 1.5(IQR)
Upper fence: Q3 + 1.5(IQR)
Any value outside this range is an outlier
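A minimal sketch of the 1.5(IQR) rule. The data are made up, and the quartile method ("inclusive") is an assumption; calculators and textbooks may compute Q1 and Q3 slightly differently.

```python
import statistics

def fences(q1, q3):
    """Return the (lower, upper) outlier fences from the two quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data):
    """Return the values that fall outside the 1.5*IQR fences."""
    # Assumption: quartiles via the "inclusive" method; other conventions exist.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    low, high = fences(q1, q3)
    return [x for x in data if x < low or x > high]

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]  # made-up data
print(find_outliers(scores))
```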
Density Curve
A smooth curve which approximates the shape of a histogram and describes the overall pattern of a distribution
Residual
The difference between the observed value and the value predicted by the model: observed minus predicted (y - yhat)
Complementary events
When two events cannot both happen and together make up the entire sample space (e.g., rolling an even number and rolling an odd number)
Ogive graph
Graph of a relative cumulative frequency distribution; percentiles can be read from it
Variance calculation
s^2 = Σ(x - xbar)^2 / (n - 1)
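A short sketch of the sample variance formula, checked against Python's built-in `statistics.variance`. The data values are made up for illustration.

```python
import math
import statistics

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data
s2 = sample_variance(data)
print(s2, math.sqrt(s2))  # variance, then standard deviation
```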
Independent events
The outcome of one event does not change the probability of the other (e.g., selecting with replacement)
Dependent events
The outcome of one event does change the probability of the other (e.g., selecting without replacement)
Circular permutations
n objects arranged in a circle have (n-1)! distinct permutations
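A small check of the (n-1)! rule, assuming two arrangements count as the same when one is a rotation of the other. The brute-force counter is only there to confirm the formula for small n.

```python
import math
from itertools import permutations

def circular_permutations(n):
    """Number of distinct ways to seat n objects around a circle."""
    return math.factorial(n - 1)

def count_by_brute_force(n):
    """Count seatings, treating rotations of the same seating as identical."""
    seen = set()
    for perm in permutations(range(n)):
        k = perm.index(0)
        seen.add(perm[k:] + perm[:k])  # rotate so object 0 sits first
    return len(seen)

print(circular_permutations(5), count_by_brute_force(5))
```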
Calculating correlation
r = Σ(zx)(zy) / (n-1), where zx = (x - xbar)/sx and zy = (y - ybar)/sy
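A sketch of the correlation formula: standardize each x and y, multiply the pairs, add them up, and divide by n - 1. The data are made up.

```python
import statistics

def correlation(xs, ys):
    """r = sum of (standardized x)(standardized y), divided by n - 1."""
    n = len(xs)
    xbar, ybar = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)  # sample standard deviations
    products = [((x - xbar) / sx) * ((y - ybar) / sy) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]
print(correlation(xs, ys))
```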
Least squares regression
Line that minimizes the sum of squared residuals; used to predict the response from the explanatory variable
Line of “response” on “explanatory”
LinReg(a+bx)
Yhat= a + bx
LSRL “b”
Slope:
r(sy/sx)
LSRL “a”
Intercept:
ybar - b(xbar)
Coefficient of determination
r^2
Percent of the variation in y that can be explained by the LSRL of y on x
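The LSRL cards above can be sketched together: slope b = r(sy/sx), intercept a = ybar - b(xbar), and r^2 compared against 1 - SSE/SST. The data are made up, and `lsrl` is just an illustrative helper name.

```python
import statistics

def lsrl(xs, ys):
    """Return (a, b, r) for yhat = a + b*x."""
    n = len(xs)
    xbar, ybar = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    r = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)
    b = r * sy / sx        # slope
    a = ybar - b * xbar    # intercept
    return a, b, r

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]
a, b, r = lsrl(xs, ys)

# Residual = observed y minus predicted yhat.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
sse = sum(e ** 2 for e in residuals)
sst = sum((y - statistics.mean(ys)) ** 2 for y in ys)
print(a, b, r ** 2, 1 - sse / sst)  # the last two values should agree
```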
Influential point
A point that, when removed, dramatically changes the slope of the LSRL; often an outlier in the x direction
Power law model
Y=ax^p
Log y = log a + p log x
Both variables are transformed
Exponential growth model
Y=ab^x
Log y = log a + x log b
Only response is transformed
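A sketch of why the two transformations work: taking logs turns each model into a straight line, so a linear fit on the transformed data recovers the constants. The data are generated from made-up values (a = 3, p = 2 for the power law; a = 2, b = 1.5 for the exponential), and `fit_line` is a hypothetical helper.

```python
import math

def fit_line(xs, ys):
    """Least squares fit; returns (intercept, slope)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

xs = [1, 2, 3, 4, 5]

# Power law y = 3x^2: transform BOTH variables, fit log y on log x.
power_ys = [3 * x ** 2 for x in xs]
log_a, p = fit_line([math.log10(x) for x in xs],
                    [math.log10(y) for y in power_ys])

# Exponential y = 2(1.5)^x: transform ONLY the response, fit log y on x.
exp_ys = [2 * 1.5 ** x for x in xs]
log_a2, log_b = fit_line(xs, [math.log10(y) for y in exp_ys])

print(10 ** log_a, p)              # recovers a and p, up to rounding
print(10 ** log_a2, 10 ** log_b)   # recovers a and b, up to rounding
```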
Common response
A lurking variable z acts on both x and y, so x and y appear associated even if neither causes the other
Confounding
A lurking variable which also affects the response, making it unclear how much effect the explanatory actually has
Conditional distribution
Table cell/row or column total
Marginal distribution
Row or column total / table total
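A sketch of both distributions on a two-way table. The table, its labels, and its counts are hypothetical.

```python
from fractions import Fraction

# Hypothetical counts: rows are class year, columns are lunch choice.
table = {
    "junior": {"pizza": 30, "salad": 20},
    "senior": {"pizza": 10, "salad": 40},
}

table_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of class year: row total / table total.
marginal_rows = {
    year: Fraction(sum(row.values()), table_total)
    for year, row in table.items()
}

# Conditional distribution of lunch choice given class year: cell / row total.
conditional_given_row = {
    year: {choice: Fraction(count, sum(row.values()))
           for choice, count in row.items()}
    for year, row in table.items()
}

print(marginal_rows)
print(conditional_given_row)
```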
Important in experiment design
Control (reduce the effects of lurking variables)
Randomization (randomly assign subjects to treatments)
Replication (use enough subjects to reduce chance variation)
Observational study
Observes individuals and measures variables without imposing a treatment
Multiplication rule
If events A and B are independent, then P(A and B) = P(A)P(B)
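A quick check of the multiplication rule by listing all 36 rolls of two dice. The events are chosen only for illustration; the last one shows the rule failing when the events are not independent.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely rolls

def prob(event):
    """Probability of an event as (favorable outcomes) / (all outcomes)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def first_even(o):
    return o[0] % 2 == 0

def second_above_4(o):
    return o[1] > 4

def sum_is_12(o):
    return o[0] + o[1] == 12

# Independent: the first die tells us nothing about the second.
p_a, p_b = prob(first_even), prob(second_above_4)
p_a_and_b = prob(lambda o: first_even(o) and second_above_4(o))
print(p_a * p_b, p_a_and_b)  # equal

# Not independent: a sum of 12 forces the first die to be 6.
p_c = prob(sum_is_12)
p_a_and_c = prob(lambda o: first_even(o) and sum_is_12(o))
print(p_a * p_c, p_a_and_c)  # not equal
```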
Random variable
Variable whose numerical value is determined by the outcome of a random phenomenon
Continuous random variable
Takes any value in an interval; each individual outcome has p=0, so probabilities are areas under a density curve (e.g., a normal distribution)
Discrete random variable
Has a countable list of possible values (usually finite); each value has a probability, and the probabilities sum to 1
Variance of discrete random variable
σ^2x = Σ(x - μx)^2(p)
Mean of discrete random variable
μx = Σ(xp)
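A sketch of both formulas on a small made-up distribution. Fractions are used so the arithmetic is exact.

```python
from fractions import Fraction

def mean_and_variance(dist):
    """dist maps each value x to its probability p."""
    mu = sum(x * p for x, p in dist.items())               # mean = sum of x*p
    var = sum((x - mu) ** 2 * p for x, p in dist.items())  # variance = sum of (x-mu)^2 * p
    return mu, var

# Made-up distribution; the probabilities must sum to 1.
dist = {
    0: Fraction(1, 10),
    1: Fraction(2, 10),
    2: Fraction(3, 10),
    3: Fraction(4, 10),
}
mu, var = mean_and_variance(dist)
print(mu, var)
```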
Means of random variables
μ(a+bx) = a + b μx
μ(x+y) = μx + μy
Variance rules
σ^2(x+y) = σ^2x + σ^2y + 2ρσxσy
σ^2(x-y) = σ^2x + σ^2y - 2ρσxσy
Where ρ is the correlation between x and y (ρ = 0 if they are independent)
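A sketch of the variance rules with made-up standard deviations. Note that variances add, not standard deviations, and that with ρ = 0 the sum and the difference have the same variance.

```python
import math

def var_sum(sd_x, sd_y, rho=0.0):
    """Variance of X + Y given the two standard deviations and correlation rho."""
    return sd_x ** 2 + sd_y ** 2 + 2 * rho * sd_x * sd_y

def var_diff(sd_x, sd_y, rho=0.0):
    """Variance of X - Y: same as the sum, but the correlation term is subtracted."""
    return sd_x ** 2 + sd_y ** 2 - 2 * rho * sd_x * sd_y

# Made-up values: sd_x = 3, sd_y = 4.
print(math.sqrt(var_sum(3, 4)))        # standard deviation of X + Y when rho = 0
print(var_sum(3, 4, rho=0.5), var_diff(3, 4, rho=0.5))
```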
Rule of thumb for a binomial distribution
Use a normal approximation when np is greater than or equal to ten, and n(1-p) is greater than or equal to ten
Mean and standard deviation of a binomial distribution
Mean=np
Standard deviation = root(np(1-p))
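A sketch combining the binomial mean, standard deviation, and the rule-of-thumb check for the normal approximation. The values of n and p are made up.

```python
import math

def binomial_summary(n, p):
    """Return (mean, standard deviation, whether the normal approximation is reasonable)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    # Rule of thumb: np >= 10 and n(1-p) >= 10.
    normal_ok = n * p >= 10 and n * (1 - p) >= 10
    return mean, sd, normal_ok

print(binomial_summary(100, 0.3))  # large enough for the normal approximation
print(binomial_summary(20, 0.1))   # np is too small
```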
Stratified random sample
Divide the population into strata, take an SRS from each stratum, and combine them to form the whole sample
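A minimal sketch of a stratified random sample, assuming the same number is drawn from every stratum (other allocations are possible). The strata and member labels are hypothetical.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Take an SRS of size per_stratum from each stratum and combine them."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))  # SRS without replacement
    return sample

# Hypothetical population split into two strata.
strata = {
    "freshmen": [f"F{i}" for i in range(40)],
    "sophomores": [f"S{i}" for i in range(30)],
}
chosen = stratified_sample(strata, per_stratum=5, seed=1)
print(chosen)
```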
Blocking
In an experiment, group together subjects known to be similar, and apply every treatment within each block (assigned at random) so that the blocking variable is not confounded with the treatments
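A sketch of a randomized block design, assuming each block's size is a multiple of the number of treatments so the groups come out even. The block names, subjects, and treatments are hypothetical.

```python
import random

def randomized_block_assignment(blocks, treatments, seed=None):
    """Within each block, shuffle the subjects and deal out the treatments in turn."""
    rng = random.Random(seed)
    assignment = {}
    for subjects in blocks.values():
        shuffled = list(subjects)
        rng.shuffle(shuffled)  # random order within the block
        for i, subject in enumerate(shuffled):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# Hypothetical blocks of similar subjects.
blocks = {
    "athletes": ["A1", "A2", "A3", "A4"],
    "non_athletes": ["N1", "N2", "N3", "N4"],
}
plan = randomized_block_assignment(blocks, ["drug", "placebo"], seed=7)
print(plan)
```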