proportions Flashcards
What are the four conditions of a Binomial random variable?
Fixed number of trials (๐)
Independent trials
Two possible outcomes per trial
Same probability of success for each trial
How is the sample proportion (๐ฬ) estimated?
๐ฬ = (Number of successes) / (Total number of trials)
What is the formula for the standard error of the sample proportion?
๐ ๐(๐ฬ) = sqrt[๐ฬ(1โ๐ฬ)/๐]
What is the formula for a 95% confidence interval for ๐?
๐ฬ ยฑ (1.96 ร ๐ ๐(๐ฬ))
Why do we assume a Normal approximation for the sample proportion?
Because the sample sizes are large, allowing the Binomial distribution to be approximated by a Normal distribution.
What R function is used to compute a confidence interval for proportions?
binconf(x, n, alpha=0.05, method=โasymptoticโ) from the Hmisc package.
What happens when the sample size is too small for a Normal approximation in confidence intervals?
The confidence interval may extend below zero, which is not possible for probabilities. A different method, such as Wilsonโs interval, should be used.
What is Wilsonโs confidence interval formula used for?
It is used for small sample sizes to ensure the confidence interval does not go below zero or above one.
What is the formula for the standard error of the difference in two proportions?
se(p^1โ p^2)= sqrt[((p^1(1 - p^1) / n1) + (p^2 (1 - p^2) / n2)]
What are the null and alternative hypotheses for comparing two proportions?
Null Hypothesis (๐ป0): No difference between proportions (๐๐ด - ๐๐ถ = 0)
Alternative Hypothesis (๐ป1): There is a difference (๐๐ด - ๐๐ถ โ 0)
Why is a hypothesis test used to compare two proportions?
It determines whether the observed difference between two proportions is due to chance or represents a true difference.
Why is Wilsonโs interval preferred for small sample sizes?
It avoids impossible probability values (e.g., negative probabilities) and provides more accurate confidence intervals when ๐ฬ is close to 0 or 1.
What does the test statistic measure in a two-proportion z-test?
It measures the observed difference between sample proportions as a ratio of the standard error, helping determine statistical significance
How is the test statistic for comparing two proportions calculated?
dataestimateโhypothesizedvalue / standard error
What does a test statistic of 4.82 indicate in a z-test?
It means the observed difference is 4.82 standard errors away from the null hypothesis (zero difference), suggesting strong evidence against ๐ป0
What is the p-value for a test statistic of 4.82 in a z-test?
The probability of obtaining such an extreme value (or more) under ๐ป0 is very small, around10โ6, leading to rejection of ๐ป0
How is the confidence interval for the difference in two proportions calculated?
(p^1โp^2)ยฑ(zรse(p^1โp^2))
What are the three sampling situations for comparing proportions?
Situation A: Independent samples (e.g., comparing two countries).
Situation B: One sample, mutually exclusive categories (e.g., voting choices).
Situation C: One sample, multiple response options (e.g., survey with multiple answers).
How is the standard error calculated for Situation A (independent samples)?
sqrt [((P^1(1-P^1))/ n1)+ ((p^2(1-p^2) / n2)]
When comparing survey responses from two countries, which sampling situation applies?
Situation A (independent samples), since each person belongs to only one countryโs sample.
How does the choice of standard error formula impact results?
If the wrong formula is used, confidence intervals and hypothesis tests may be incorrect, leading to misleading conclusions.
What is the formula for the standard error of the difference between two proportions?
se(p^1โ p^2)= sqrt [(P^1 + P^2 - ( P^1-P^2)^2) /n]
When should Situation B be used in sampling?
Situation B is used when one sample is asked a single question with mutually exclusive response options, such as โagree,โ โdisagree,โ or โdonโt know.โ
How are statistical odds calculated?
Odds= p(success) / p(failure) = p / 1โp
What does an odds ratio (OR) greater than 1 indicate?
It indicates that the intervention group has higher odds of success compared to the control group.
How is an odds ratio (OR) calculated?
ฮธ= oddsingroup1/ oddsingroup2โ
What does an odds ratio (OR) less than 1 indicate?
It indicates that the control group has higher odds of success compared to the intervention group.
Why do we use the log of the odds ratio to construct confidence intervals?
Because the distribution of the odds ratio is highly skewed, and taking the log makes it approximately normal.
What is the formula for the standard error of the log odds ratio?
seOR= sqrt [ 1/n11 + 1/n12 + 1/n21 + 1/n22 ]
How do you obtain a confidence interval for an odds ratio?
Compute log(ฮธฬ)
use: log(ฮธ^)ยฑz1โฮฑ/2รseOR
โExponentiate the lower and upper limits to return to the odds ratio scale.
What does an odds ratio of exactly 1 indicate?
It indicates no difference between the two groups.
What is the formula for the pooled sample proportion when testing the difference between two proportions?
p^ = (x1 + x2) / (n1 + n2)
where ๐ฅ1 and ๐ฅ2 are the number of successes in each sample.
How do you interpret a confidence interval for the difference between two proportions?
If 0 is in the interval, there is no significant difference.
If the interval is entirely positive, ๐1>๐2
โ
If the interval is entirely negative, ๐1<๐2
Why are odds used instead of probabilities in logistic regression?
Because odds have mathematical properties that allow for a linear relationship with predictor variables on the log scale.
What does a log odds ratio of 0 mean?
It means the odds ratio is 1, indicating no difference between the two groups.
What transformation is used to make the odds ratioโs distribution approximately normal?
The natural logarithm (log transformation) is applied to the odds ratio.