Reading Quiz 8 Flashcards
binomial setting conditions
- each observation has two possible categories/outcomes, success or failure
- the observations are independent
- the probability of success, called p, is the same for each observation
- there is a fixed number of observations, n
binomial random variable
if data are produced in a binomial setting, then the random variable X = number of successes is called a binomial random variable
binomial distribution
if data are produced in a binomial setting, and the random variable X = number of successes is a binomial random variable, then the probability distribution of X is called a binomial distribution
binomial distribution
the distribution of the count X of successes in the binomial setting
parameters n and p
parameter n
number of observations
parameter p
probability of a success on any one observation
possible values of X
whole numbers from 0 to n
X is
B(n,p)
always be careful
to check when binomial distributions apply
binomial coefficient
the number of ways of arranging k successes among n observations, given by
(n choose k) = n! / (k! (n-k)!)
formula for binomial coefficients uses
factorial notation
0!
1
(n choose k)
binomial coefficient n choose k
counts number of ways in which k successes can be distributed among n observations
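As a quick check of the factorial formula, the sketch below computes a binomial coefficient two ways: directly from factorials and with Python's built-in `math.comb`. The helper name `n_choose_k` is made up for illustration.

```python
from math import comb, factorial

def n_choose_k(n: int, k: int) -> int:
    """Binomial coefficient from the factorial formula n! / (k! * (n-k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))

# 0! = 1 keeps the formula valid at the endpoints k = 0 and k = n.
print(factorial(0))       # 1
print(n_choose_k(5, 2))   # 10 ways to place 2 successes among 5 observations
print(comb(5, 2))         # 10, the standard-library equivalent
```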
binomial probability
if X has binomial distribution with n observations and probability p of success on each observation, the possible values of X are 0, 1, 2,…, n
if k is any one of these values:
P(X=k) = (n choose k) (p^k) (1-p)^(n-k) or
(n choose k) (p^k) (q)^(n-k), where q = 1-p
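The binomial probability formula can be sketched in a few lines of Python. The function name `binom_pmf` and the example numbers (10 observations, p = 0.5) are assumptions chosen for illustration, not values from these notes.

```python
from math import comb

def binom_pmf(n: int, p: float, k: int) -> float:
    """P(X = k) for X ~ B(n, p): (n choose k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Illustrative numbers: 10 observations, p = 0.5, exactly 3 successes.
print(binom_pmf(10, 0.5, 3))   # 0.1171875, i.e. 120/1024

# The probabilities over k = 0..n add up to 1 (up to rounding).
print(sum(binom_pmf(10, 0.5, k) for k in range(11)))
```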
probability distribution function (pdf)
given a discrete random variable X
assigns a probability to each value of X
probabilities must satisfy the rules for probabilities given in chapter 6
cumulative distribution function (cdf)
given a random variable X
cdf of X calculates the sum of the probabilities for 0, 1, 2, up to the value X
calculates the probability of obtaining at most X successes in n trials
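To see how the pdf and cdf relate, the sketch below lists both for a small binomial case. The values n = 4 and p = 0.5 are assumed only to keep the table short; the cdf column is just the running total of the pdf column.

```python
from itertools import accumulate
from math import comb

n, p = 4, 0.5   # small illustrative case
pdf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
cdf = list(accumulate(pdf))   # running totals: cdf[k] = P(X <= k)

for k in range(n + 1):
    print(k, pdf[k], cdf[k])
# 0 0.0625 0.0625
# 1 0.25 0.3125
# 2 0.375 0.6875
# 3 0.25 0.9375
# 4 0.0625 1.0
```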
mean and standard deviation of a binomial random variable
if count X has binomial distribution with number of observations n and probability of success p,
mean = np
standard deviation = square root of ((np)(1-p)) or
square root of (npq)
BUT ONLY FOR BINOMIAL DISTRIBUTIONS; THESE FORMULAS CAN’T BE USED FOR OTHER DISCRETE RANDOM VARIABLES
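One way to convince yourself of the shortcut formulas is to compare them with the general definitions of mean and variance for a discrete random variable. The sketch below does this for assumed values n = 20 and p = 0.3, which are not from these notes.

```python
from math import comb, sqrt

n, p = 20, 0.3   # illustrative values
q = 1 - p

# Probability of each possible count k = 0..n
pmf = [comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)]

# Mean and variance straight from the definitions for a discrete random variable
mean_def = sum(k * pk for k, pk in enumerate(pmf))
var_def = sum((k - mean_def) ** 2 * pk for k, pk in enumerate(pmf))

# Shortcut formulas, valid for binomial distributions only
mean_formula = n * p
sd_formula = sqrt(n * p * q)

print(mean_def, mean_formula)      # the two means should agree up to rounding
print(sqrt(var_def), sd_formula)   # and so should the standard deviations
```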
normal approximation for binomial distributions
count X has binomial distribution with n trials and success probability p
when n is large, distribution of X is approximately normal, N(np, square root of (np(1-p)))
use normal approximation when n and p satisfy np greater than or equal to 10 and n(1-p) (that is, nq) greater than or equal to 10
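The rule of thumb and the approximation itself can be sketched as follows. The helper name `normal_approx_ok` and the numbers n = 100, p = 0.7, k = 65 are assumptions for illustration. The normal value here uses no continuity correction, so it will only roughly match the exact binomial sum.

```python
from math import comb, sqrt
from statistics import NormalDist

def normal_approx_ok(n: int, p: float) -> bool:
    """Rule of thumb: np >= 10 and n(1-p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

n, p, k = 100, 0.7, 65   # illustrative values

# Exact P(X <= k) from the binomial formula
exact = sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

# N(np, sqrt(np(1-p))) evaluated at k, with no continuity correction
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p))).cdf(k)

print(normal_approx_ok(n, p))      # True: np = 70, n(1-p) = 30
print(normal_approx_ok(20, 0.1))   # False: np = 2
print(exact, approx)
```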
geometric setting conditions
- each observation has two possible outcomes/categories, success or failure
- the observations are all independent
- the probability of a success, called p, is the same for each observation
- the variable of interest is the number of trials required to obtain the first success
how does geometric variable differ from binomial variable
in the geometric setting the number of trials varies and the desired number of defined successes (1) is fixed in advance
if X has geometric distribution with probability p of success and (1-p) failure on each observation
possible values of X are 1, 2, 3, etc. if n is any one of these values, the probability that the first success occurs on the nth trial is
P(X=n) = (1-p)^(n-1) (p) or
P(X=n) = q^(n-1) (p)
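The geometric probability formula is short enough to write directly. The function name `geom_pmf` and the example values are assumptions for illustration.

```python
def geom_pmf(p: float, n: int) -> float:
    """P(X = n): n - 1 failures followed by the first success."""
    return (1 - p) ** (n - 1) * p

# Illustrative case: p = 0.5, first success on the 3rd trial
print(geom_pmf(0.5, 3))   # 0.125 = 0.5^2 * 0.5
```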
mean and standard deviation of a geometric random variable
if X is geometric random variable with probability of success p on each trial,
mean/expected value = 1/p
variance = (1-p)/p^2 or
q/(p^2)
standard deviation is square root of variance formula
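A small worked case for these formulas, using an assumed success probability of p = 1/5 (not a value from these notes). Exact fractions are used so the mean and variance come out as whole numbers.

```python
from fractions import Fraction
from math import sqrt

p = Fraction(1, 5)   # illustrative success probability
q = 1 - p

mean = 1 / p            # 1/p
variance = q / p**2     # (1-p)/p^2
sd = sqrt(variance)     # square root of the variance

print(mean)       # 5
print(variance)   # 20
print(sd)         # square root of 20
```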
probability that it takes more than n trials to see first success is
P(X>n) = (1-p)^n or q^n
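The tail formula can be checked against the complement of a term-by-term sum. The function names `geom_tail` and `geom_cdf` and the values p = 0.25, n = 4 are assumptions for illustration.

```python
def geom_tail(p: float, n: int) -> float:
    """P(X > n) = (1-p)^n: the first n trials are all failures."""
    return (1 - p) ** n

def geom_cdf(p: float, n: int) -> float:
    """P(X <= n), summed term by term from P(X = j) = (1-p)^(j-1) * p."""
    return sum((1 - p) ** (j - 1) * p for j in range(1, n + 1))

p, n = 0.25, 4   # illustrative values
print(geom_tail(p, n))      # 0.31640625 = 0.75^4
print(1 - geom_cdf(p, n))   # the same probability via the complement
```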
Suppose someone looks at the numbers of 1’s, 2’s, 3’s, 4’s, 5’s, and 6’s that result from 600 die rolls. Is this situation an example of the “binomial setting”? Why or why not?
A. Almost, but not quite. For there to be a binomial setting, you have to have each observation fall into only two categories, rather than the 6 categories described here. However, if you defined a 1 as a “success” and anything else as a “failure,” then you would have a binomial setting, and you could then do the same thing separately with 2, 3, 4, 5, and 6.
What are the four requirements for the binomial setting?
A. 1. Two categories: success or failure 2. Fixed # of observations 3. Independence 4. Probability of success same for all observations.
The distribution of the number of successes out of n trials (with probability of success p on each trial) is the ______ _______.
A. binomial distribution
If someone has 51 socks in a drawer, with 1/3 red and 2/3 black, and the person grabs a handful of 5 of them, and counts the number of black, will the results of such a trial follow the binomial distribution? Why or why not?
A. Not quite, because grabbing a handful of 5 is equivalent to sampling without replacement. The probability of a black sock being included in the handful is altered some depending on what other socks are also in the handful. If you picked a sock one at a time, replaced the sock and mixed them thoroughly, and then picked again, the binomial distribution would apply.
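To see how much the handful differs from the binomial model, the sketch below compares the two sets of probabilities for the sock drawer (34 black, 17 red, handful of 5). The without-replacement count follows what is usually called the hypergeometric distribution, a name not used in these notes; the function names are made up for illustration.

```python
from fractions import Fraction
from math import comb

BLACK, RED, HANDFUL = 34, 17, 5   # 51 socks: 2/3 black, 1/3 red
TOTAL = BLACK + RED

def without_replacement(k: int) -> Fraction:
    """P(k black socks) when grabbing a handful of 5 at once."""
    return Fraction(comb(BLACK, k) * comb(RED, HANDFUL - k), comb(TOTAL, HANDFUL))

def with_replacement(k: int) -> Fraction:
    """P(k black socks) if each sock is put back before the next draw: B(5, 2/3)."""
    p = Fraction(BLACK, TOTAL)
    return comb(HANDFUL, k) * p**k * (1 - p) ** (HANDFUL - k)

# Side-by-side probabilities for k = 0..5 black socks
for k in range(HANDFUL + 1):
    print(k, float(without_replacement(k)), float(with_replacement(k)))
```

The two columns are close but not equal, which is the sense in which the answer says "not quite".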
When we say a certain random variable has a B(100, 0.7) distribution, what do we mean?
A. That the variable has a binomial distribution with 100 observations and a probability of success of 0.7 on each observation.
If there is a discrete random variable (such as a binomial), and you want to find the probability of any given value of X, what function do you use – the cumulative distribution function or the probability distribution function? (cdf or pdf?)
A. The probability distribution function (pdf); the cdf gives the probability of at most a given value.
What are the formulas for the mean, variance, and standard deviation of a binomial random variable in terms of n and p, and, if you want, q (or 1-p)?
A. mean: np, variance: npq, and standard deviation: square root of (npq)
As a rule of thumb, the normal distribution may be used as an approximation to the binomial when both np and nq (expected successes, expected failures) equal or exceed what number?
10
For a binomial setting, the number of trials is fixed, and the random variable is the number of successes. For a geometric setting, the random variable is the number of ____ necessary to achieve the first ____.
trials, success
What are the four requirements of the geometric setting?
A. 1. Two categories: “success” or “failure.” 2. Independent 3. Probability of success is the same for each observation. 4. Variable of interest is the number of trials required to obtain the first success.
In a geometric setting, with probability of success p, what is the probability that the first success will occur on the nth trial?
A. P(X=n) = (1-p)^(n-1) (p) or q^(n-1) (p)
True or False: the probabilities that the first success occurs on the first, second, third, etc. trial in a geometric setting, when arranged in order, form a geometric series where p is the first term and each successive term is (1-p) (or q) times the previous one?
true
True or False: if you apply the formula a/(1-r) for the sum of the terms of an infinite geometric series, where a is the first term and r is the ratio of each term to the previous one, for the geometric setting p is the first term and (1-p) is the ratio, so the sum becomes p/(1-(1-p)) or 1. Thus even though there are infinitely many possibilities for the outcome of the experiment in the geometric setting, the probabilities of each outcome sum to 1.
true
If your chance of getting a success on each trial in the geometric setting is p, what is the average or expected number of trials that you would have to conduct to get the first success?
1/p trials
If your chances of rolling a 1 on a die roll are one in 6, what is the expected or average number of times that you would have to roll the die to get a 1?
6 times
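A simulation can make the "6 times" answer feel less abstract. The sketch below rolls a simulated fair die until a 1 appears and averages the number of rolls over many repetitions. The seed, the number of repetitions, and the helper name `rolls_until_one` are arbitrary choices for illustration.

```python
import random

def rolls_until_one(rng: random.Random) -> int:
    """Roll a fair die until a 1 appears; return the number of rolls used."""
    rolls = 1
    while rng.randint(1, 6) != 1:
        rolls += 1
    return rolls

rng = random.Random(0)   # fixed seed so the run is repeatable
trials = 100_000
average = sum(rolls_until_one(rng) for _ in range(trials)) / trials
print(average)   # should land close to 1/p = 6
```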
What is the variance of a geometric random variable?
A. (1-p)/p^2 or q/(p^2)
In the geometric setting, if q = 1 - p, what is the probability that it takes more than n trials to see the first success?
A. P(X>n) = (1-p)^n or q^n
For a geometric distribution, would you say that it is approximately true that 34% of the observations would fall between the mean and 1 standard deviation above the mean, and 34% would fall between the mean and 1 standard deviation below the mean? Why or why not?
A. No, because the geometric distribution is always strongly skewed to the right, and its shape doesn’t resemble the normal distribution (for which the above statement is true).
how to get on calc
2nd vars, then choose the function (binompdf, binomcdf, etc.) and enter (number of trials, probability, value of X)
equal sign
pdf
inequality
cdf
cdf counts
less than or equal to
P(X=k)
binompdf(n, p, k)
P(X less than or equal to k)
binomcdf(n, p, k)
P(X less than k)
P(X less than or equal to k-1)
binomcdf(n, p, k-1)
P(X greater than k)
1 - P(X less than or equal to k)
1 - binomcdf(n, p, k)
P(X greater than or equal to k)
1 - P(X less than k)
1 - P( X less than or equal to k-1)
1 - binomcdf(n, p, k-1)
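The four inequality translations above can be mirrored in Python. The functions `binompdf` and `binomcdf` below are stand-ins written for this sketch that copy the calculator's argument order; they are not a calculator or library API, and the values n = 10, p = 0.5, k = 3 are assumed for illustration.

```python
from math import comb

def binompdf(n: int, p: float, k: int) -> float:
    """Stand-in for the calculator's binompdf(n, p, k): P(X = k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomcdf(n: int, p: float, k: int) -> float:
    """Stand-in for the calculator's binomcdf(n, p, k): P(X <= k)."""
    return sum(binompdf(n, p, j) for j in range(k + 1))

n, p, k = 10, 0.5, 3   # illustrative values

at_most = binomcdf(n, p, k)             # P(X <= k)
less_than = binomcdf(n, p, k - 1)       # P(X < k)  = P(X <= k-1)
more_than = 1 - binomcdf(n, p, k)       # P(X > k)  = 1 - P(X <= k)
at_least = 1 - binomcdf(n, p, k - 1)    # P(X >= k) = 1 - P(X <= k-1)

print(at_most, less_than, more_than, at_least)
```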
rule check
np greater than or equal to 10
nq greater than or equal to 10
geometric P(X=n)
2nd vars geometpdf(p, n)
geometric P(X less than or equal to n)
2nd vars geometcdf(p, n)
binomial mean and standard deviation
mean = np, standard deviation = square root of (npq)
geometric mean and standard deviation
mean = 1/p, standard deviation = square root of (q/(p^2))