Midterm Flashcards
What is the total area under the density curve?
100% or 1
What percent of observations will fall within 1 standard deviation of the mean of an approx. Normal distribution?
68%
What percent of observations will fall within 2 standard deviations of the mean of an approx. Normal distribution?
95%
What percent of observations will fall within 3 standard deviations of the mean of an approx. Normal distribution?
99.7%
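The 68-95-99.7 figures above can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `within` is made up for illustration.

```python
from statistics import NormalDist

# Standard Normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0.0, sigma=1.0)

def within(k: float) -> float:
    """Probability of landing within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    # Prints roughly 0.6827, 0.9545, 0.9973
    print(f"within {k} SD: {within(k):.4f}")
```

The rule's 68%, 95%, and 99.7% are rounded versions of these values.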
What is “r”?
the correlation r measures the direction and strength of the linear relationship between two quantitative variables
What values can r take? What does it mean when r is less than 0?
r can take any value from -1 to 1. If it’s less than zero, the association is negative: y tends to decrease as x increases.
What does it mean when r = 1? r = -1?
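r = 1 means perfect positive correlation: the points on a scatterplot lie exactly on a straight line with positive slope; r = -1 means perfect negative correlation: the points lie exactly on a straight line with negative slope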
What is the slope b of a regression line y-hat = a + bx?
the predicted change in y-hat when x increases by 1 unit
What does the standard deviation of the residuals measure?
typical size of prediction errors when using the regression line
What does the coefficient of determination r^2 measure?
fraction of the variation in the response variable that is accounted for by the least-squares regression on the explanatory variable
Define influential observations
Individual points that substantially change the correlation or the regression line; outliers are often influential for the regression line
The least squares regression line of Y on X is the line with slope b = ? and intercept a = ?
b = r(Sy / Sx)
a = YMean - b(XMean)
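As a worked illustration of the slope and intercept formulas, here is a small sketch. The five data points are invented for the example, and `correlation_r` is a hypothetical helper that computes r from standardized values.

```python
from statistics import mean, stdev

# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

def correlation_r(xs, ys):
    """Correlation r: average product of standardized x and y values (divide by n - 1)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
               for xi, yi in zip(xs, ys)) / (n - 1)

r = correlation_r(x, y)
b = r * stdev(y) / stdev(x)   # slope: b = r(Sy / Sx)
a = mean(y) - b * mean(x)     # intercept: a = YMean - b(XMean)

print(f"r = {r:.4f}, r^2 = {r**2:.4f}")
print(f"y-hat = {a:.2f} + {b:.2f}x")
```

For this data the line works out to y-hat = 2.20 + 0.60x, and r^2 is 0.60, so about 60% of the variation in y is accounted for by the regression on x.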
What are the 4 basic principles of experimental design?
- Comparison: use a design that compares 2 or more treatments
- Random assignment: use chance to assign experimental units to treatments. This helps create roughly equivalent groups before treatments are imposed.
- Control: keep as many other variables as possible the same for all groups. Control helps avoid confounding and reduces the variation in responses, making it easier to decide whether a treatment is effective.
- Replication: impose each treatment on enough experimental units so that the effects of the treatments can be distinguished from chance differences between the groups.
Describe a randomized block design
A randomized block design forms groups of experimental units that are similar with respect to a variable that is expected to affect the response. Treatments are assigned at random within each block. Responses are then compared within each block and combined with the responses of other blocks after accounting for the differences between the blocks.
Describe a matched pairs design
A matched pairs design is a common form of blocking for comparing just two treatments. In some matched pairs designs, each subject receives both treatments in a random order. In others, two very similar subjects are paired and the two treatments are randomly assigned within each pair.
What makes something a simple random sample?
It gives every possible sample of the same size an equal chance to be selected
How should you organize a matched pairs experiment?
Deliberately (not randomly) divide the subjects into pairs so that the two members of each pair are as similar as possible, then randomly assign the two treatments within each pair
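The random-assignment step for matched pairs can be sketched in code. This assumes the pairs have already been formed by hand; the subject labels, treatment names "A" and "B", and the function `assign_matched_pairs` are all hypothetical.

```python
import random

def assign_matched_pairs(pairs, treatments=("A", "B"), seed=None):
    """Within each pair, randomly decide which member gets which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)            # chance decides the order within this pair
        assignment[first] = order[0]
        assignment[second] = order[1]
    return assignment

# Pairs formed in advance from similar subjects (labels are made up).
pairs = [("s1", "s2"), ("s3", "s4"), ("s5", "s6")]
result = assign_matched_pairs(pairs, seed=0)
print(result)
```

Every pair ends up with exactly one subject on each treatment, so the comparison is always made between similar subjects.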
What is the law of large numbers?
The law of large numbers says that if we observe more and more repetitions of a chance process, the proportion of times that a particular outcome occurs approaches a single number: the probability of that outcome.
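A quick simulation can illustrate the law of large numbers. This sketch flips a simulated fair coin and records the proportion of heads at a few checkpoints; the function name, seed, and checkpoints are arbitrary choices for the example.

```python
import random

def running_proportion(num_flips, seed=0):
    """Proportion of heads after 10, 100, ..., flips of a simulated fair coin."""
    rng = random.Random(seed)
    heads = 0
    checkpoints = {}
    for i in range(1, num_flips + 1):
        heads += rng.random() < 0.5   # True counts as 1 head
        if i in (10, 100, 1_000, 10_000, 100_000):
            checkpoints[i] = heads / i
    return checkpoints

props = running_proportion(100_000)
for n, prop in props.items():
    print(f"{n:>7} flips: proportion of heads = {prop:.4f}")
```

The early proportions can wander noticeably, but the later ones should settle close to 0.5.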
What will the probability of the sample space always equal?
1
What is the addition rule for mutually exclusive events?
P(A or B) = P(A) + P(B)
What probability does the union of A and B describe?
P(A or B)
What probability does the intersection of A and B provide?
P(A and B)
What is the general addition rule to find P(A or B)?
P(A or B) = P(A) + P(B) - P(A and B)
What does the general multiplication rule state?
P(A and B) = P(A) * P(B | A)
If two events are mutually exclusive, they can/cannot be independent
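Cannot (as long as both events have nonzero probability): if A occurs, B cannot occur, so P(B | A) = 0, which is not equal to P(B)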
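The probability rules above can be verified by listing outcomes. This sketch uses one roll of a fair six-sided die; the events A, B, C, and D are made up for the example, and exact fractions avoid rounding.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die

def prob(event):
    """Probability of an event (a set of outcomes) with equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}   # roll is even
B = {4, 5, 6}   # roll is at least 4

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

# General multiplication rule: P(A and B) = P(A) * P(B | A)
p_b_given_a = Fraction(len(A & B), len(A))
assert prob(A & B) == prob(A) * p_b_given_a

# Mutually exclusive events with nonzero probability are not independent.
C, D = {1}, {2}
assert prob(C & D) == 0
assert prob(C & D) != prob(C) * prob(D)

print(prob(A | B), prob(A & B), p_b_given_a)
```

Here P(A or B) = 2/3, P(A and B) = 1/3, and P(B | A) = 2/3.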
What is the mean of X + Y?
X mean + Y mean
What is the variance of X + Y?
variance X + variance Y (if X and Y are independent)
What is the variance of X - Y?
variance X + variance Y (if X and Y are independent; variances add even when subtracting)
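To see why variances add for both X + Y and X - Y, here is an exact check, assuming X and Y are two independent fair dice. The helper names `mean_of` and `var_of` are hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 36)   # each (x, y) pair is equally likely for independent fair dice

def mean_of(f):
    """Expected value of f(X, Y) over all 36 outcomes."""
    return sum(p * f(x, y) for x, y in product(faces, faces))

def var_of(f):
    """Variance of f(X, Y): expected squared deviation from its mean."""
    mu = mean_of(f)
    return sum(p * (f(x, y) - mu) ** 2 for x, y in product(faces, faces))

var_x = var_of(lambda x, y: x)
var_y = var_of(lambda x, y: y)
var_sum = var_of(lambda x, y: x + y)
var_diff = var_of(lambda x, y: x - y)

print(var_x, var_y, var_sum, var_diff)
```

Each die has variance 35/12, and both the sum and the difference have variance 35/6.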
What are the qualifications of a binomial setting?
Binary: the possible outcomes of each trial can be classified as a success or failure
Independent: trials must be independent; that is, knowing the result of one trial must not tell us anything about the result of any other trial
Number: the number of trials of the chance process must be fixed in advance
Success: there is the same probability of success on each trial
What is the 10% condition?
When sampling without replacement, the binomial distribution gives a good approximation to the count of successes in a simple random sample from a large population containing proportion p of successes. This is true as long as the sample size is no more than 10% of the population size.
What is the large counts condition?
You can use a Normal approximation for a binomial distribution when the sample size times the probability of success is at least 10 and the sample size times the complement of the probability of success is also at least 10: np ≥ 10 and n(1 - p) ≥ 10
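Both conditions are simple arithmetic checks. The sketch below uses the common "at least 10" convention for large counts (an assumption here) and compares an exact binomial probability with its Normal approximation. The function names and the values n = 100, p = 0.3 are chosen only for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def ten_percent_ok(n, population_size):
    """10% condition: sample is no more than 10% of the population."""
    return n <= 0.10 * population_size

def large_counts_ok(n, p):
    """Large counts condition, assuming the 'at least 10' convention."""
    return n * p >= 10 and n * (1 - p) >= 10

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for a binomial count X with n trials and success probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p = 100, 0.3
print(ten_percent_ok(n, 5_000), large_counts_ok(n, p))

# Normal approximation uses mean np and standard deviation sqrt(np(1 - p)).
exact = binomial_cdf(35, n, p)
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p))).cdf(35)
print(f"exact = {exact:.4f}, Normal approx = {approx:.4f}")
```

With a small sample such as n = 20 and p = 0.3, `large_counts_ok` returns False, which signals that the Normal approximation should not be trusted.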
How do you find the geometric probability?
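P(Y = k) = (1 - p)^(k - 1) p, where p is the probability of success on each trial and k is the trial on which the first success occurs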
How do you find a mean or expected value of a geometric random variable?
The mean is equal to one divided by the probability of success
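The geometric formula and its mean can be checked numerically. This is a small sketch with p = 0.2 chosen arbitrarily; `geometric_pmf` is a hypothetical helper name.

```python
def geometric_pmf(k, p):
    """P(Y = k): first success on trial k, i.e. k - 1 failures followed by a success."""
    return (1 - p) ** (k - 1) * p

p = 0.2

# Probability that the first success comes on the 3rd trial: 0.8 * 0.8 * 0.2
print(geometric_pmf(3, p))

# Sum of k * P(Y = k) over many k approximates the mean, which should be near 1/p.
approx_mean = sum(k * geometric_pmf(k, p) for k in range(1, 1000))
print(approx_mean, 1 / p)
```

The partial sum lands very close to 1/p = 5, matching the mean formula.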
The central limit theorem states that when n is ?, the sampling distribution of XBar will be approx. Normal in most cases
large: n ≥ 30
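A simulation can illustrate the central limit theorem. This sketch draws many samples of size n = 30 from a strongly skewed population (exponential with mean 1, an arbitrary choice) and looks at the distribution of the sample means. The seed and number of samples are also arbitrary.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(1)
n = 30                # sample size
num_samples = 5_000   # number of repeated samples

# Each entry is XBar for one sample of size n from an exponential population (mean 1, SD 1).
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

center = mean(sample_means)
spread = stdev(sample_means)

# If XBar is approx. Normal, about 68% of sample means fall within 1 SD of the center.
share_within_1sd = sum(abs(m - center) <= spread for m in sample_means) / num_samples

print(f"mean of XBar = {center:.3f} (population mean is 1)")
print(f"SD of XBar = {spread:.3f} (theory: 1/sqrt(30) = {1 / sqrt(n):.3f})")
print(f"share within 1 SD = {share_within_1sd:.3f}")
```

Even though the population is far from Normal, the sample means should cluster symmetrically around 1 with a spread near 1/sqrt(30).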