Statistics Flashcards
Nominal data
Named categories with no inherent order, non-parametric
Study Power?
The power of a study is the probability of detecting a significant difference between treatments or study groups when there really is one.
Low power increases the likelihood of failing to identify a statistically significant difference when a real difference does exist.
High power (80% or more) is desirable.
Power is affected by sample size, effect size, significance level (alpha), and variability of the data.
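As a rough illustration of how sample size drives power, here is a minimal sketch using only the Python standard library. It assumes a two-sided, two-sample z-test with equal group sizes and a standardized effect size; the function name `approx_power` and the example numbers are hypothetical, not taken from any particular study.

```python
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test (equal group sizes).

    effect_size is the standardized difference: (mean1 - mean2) / SD.
    This is a normal approximation, not an exact t-test calculation.
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for alpha = 0.05
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Power grows with sample size for a fixed (assumed) effect size of 0.5 SD.
for n in (20, 64, 100):
    print(n, round(approx_power(0.5, n), 2))
```

Under these assumptions, a medium effect (0.5 SD) reaches roughly 80% power at about 64 subjects per group; smaller groups fall short of that target.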
Ordinal data?
In order, with unequal intervals, non-parametric
Interval data?
Equal interval
No absolute zero
Cannot compute ratio
parametric
E.g. temperature in Celsius or Fahrenheit
Ratio data?
Equal interval
with absolute zero or true zero
Can calculate ratio
parametric
E.g. weight, height, temperature in Kelvin
Mnemonic “NOIR”: Nominal, Ordinal, Interval, Ratio
Measurement of central tendency?
Mean
Median
Mode
Mean = Median = Mode, what distribution?
Normal (symmetric, unimodal) distribution
Relationship of mean, median and mode in a right (positively) skewed distribution?
Right skewed: tail on the right
Mean > Median > Mode
(Rule of thumb: the mean is pulled toward the tail)
The relationship of mean, median and mode in left skewed distribution?
Tail is on the left of the distribution
Mean < Median < Mode
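The ordering of mean, median and mode under skew can be checked on a small made-up data set. The values below are hypothetical and chosen only to have a long right tail; the left-skewed set is its mirror image.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: most values are small, a few are large.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror image of the sample above, giving a long left tail instead.
left_skewed = [16 - x for x in right_skewed]

for name, data in (("right", right_skewed), ("left", left_skewed)):
    print(name, "mean =", mean(data), "median =", median(data), "mode =", mode(data))
```

In the right-skewed sample the mean is the largest of the three and the mode the smallest; in the mirrored sample the order reverses.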
For normal distribution, select statistic method?
Select Parametric statistics test
E.g. Student's t-test, ANOVA, ANCOVA, regression analysis (the chi-square test is usually classed as non-parametric)
For non-normal distribution, eg. Bimodal, skewed, etc. test methods selection?
Non-parametric tests, e.g. Fisher’s exact test, McNemar test, Mann-Whitney U test, Wilcoxon’s rank sum test, Kruskal-Wallis test
Ways of obtaining random sample?
- Simple random sampling
- Systematic random sampling
- Stratified random sampling
- Cluster sampling
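The four sampling approaches listed above can be sketched with the standard `random` module. The population of 100 subject IDs, the two age strata, and the ten "clinic" clusters are all hypothetical, introduced only to make the example runnable.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical subject IDs 1..100

def simple_random(pop, n):
    # Every subject has the same chance of selection.
    return rng.sample(pop, n)

def systematic(pop, n):
    # Random start, then every k-th subject.
    step = len(pop) // n
    start = rng.randrange(step)
    return pop[start::step][:n]

def stratified(strata, n_per_stratum):
    # Draw a simple random sample within each stratum.
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, n_per_stratum))
    return chosen

def cluster(clusters, n_clusters):
    # Randomly pick whole clusters and take everyone in them.
    picked = rng.sample(list(clusters), n_clusters)
    return [m for name in picked for m in clusters[name]]

# Hypothetical strata and clusters built from the same population.
strata = {"younger": population[:50], "older": population[50:]}
clusters = {f"clinic_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

print(simple_random(population, 10))
print(systematic(population, 10))
print(stratified(strata, 5))
print(cluster(clusters, 2))
```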
Bias?
Systematic error
Impacts internal validity
Chance
Random error
Confounder?
Associated with exposure (risk) and outcome
An independent risk factor for the outcome
Not in the causal pathway between the risk factor and disease
Power
The chance of finding an effect in your sample if it truly exists in the population.
Power is not in question in a study that shows a significant effect.
If a study's results fail to show a significant difference (p>0.05) between the two groups, one may wonder whether the study had sufficient power.
When applied to a population,
Given sensitivity and prevalence,
True positive =?
False negative =?
True Positive = Sensitivity x Prevalence
False negative = (1- Sensitivity) x Prevalence
When applied to a population, given Specificity and Prevalence,
True negative =?
False positive =?
True Negative = Specificity x (1- Prevalence)
False positive = (1- Specificity) x (1-Prevalence)
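The four formulas above give each cell of the 2x2 table as a fraction of the whole population. A minimal sketch follows; the function name `population_fractions` and the example values (sensitivity 0.90, specificity 0.80, prevalence 0.10) are hypothetical.

```python
def population_fractions(sensitivity, specificity, prevalence):
    """Return each 2x2 cell as a fraction of the total population."""
    tp = sensitivity * prevalence              # diseased, test positive
    fn = (1 - sensitivity) * prevalence        # diseased, test negative
    tn = specificity * (1 - prevalence)        # healthy, test negative
    fp = (1 - specificity) * (1 - prevalence)  # healthy, test positive
    return {"TP": tp, "FN": fn, "TN": tn, "FP": fp}

# Assumed example values, not from any real test.
cells = population_fractions(sensitivity=0.90, specificity=0.80, prevalence=0.10)
for name, fraction in cells.items():
    print(name, round(fraction, 3))
```

The four fractions always add up to 1; multiplying each by the population size gives expected counts.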
Regression toward the mean
In any group selected for an extreme value of a characteristic with substantial day-to-day variation, many will have values closer to the population mean when the measurement is repeated, so the worst patients will appear to improve.
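A small simulation can make regression toward the mean concrete. The sketch below assumes each subject has a stable true value plus independent day-to-day noise on every measurement; the distributions (mean 100, SD 10 for both) and the top-10% selection rule are hypothetical choices.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical model: stable true value plus independent noise per measurement.
true_values = [rng.gauss(100, 10) for _ in range(10_000)]
first = [t + rng.gauss(0, 10) for t in true_values]
second = [t + rng.gauss(0, 10) for t in true_values]

# Select the "worst" 10% based on the first measurement only.
cutoff = sorted(first)[int(0.9 * len(first))]
selected = [i for i, x in enumerate(first) if x >= cutoff]

mean_first = mean(first[i] for i in selected)
mean_second = mean(second[i] for i in selected)
print("selected group, first measurement: ", round(mean_first, 1))
print("selected group, second measurement:", round(mean_second, 1))
print("whole population, first measurement:", round(mean(first), 1))
```

With no treatment at all, the selected group's second measurement sits between its first measurement and the population mean.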
Baseline drift
Occurs with measurements on certain machines that require frequent calibration.
Hawthorne effect
A tendency among study subjects to change their behavior simply because they are being studied or watched.
1SD =? %
2SD =? %
3SD =? %
Mean ± 1 SD = 68% (Z score = ±1)
Mean ± 2 SD = 95% (Z score = ±2)
Mean ± 3 SD = 99.7% (Z score = ±3)
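The percentages above apply to a normal distribution and can be reproduced with `statistics.NormalDist` from the Python standard library. This is a quick check, assuming a standard normal curve; the helper name `within_k_sd` is hypothetical.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

def within_k_sd(k):
    # Area under the normal curve between -k and +k standard deviations.
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {within_k_sd(k):.4f}")
```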
When two events are independent (not mutually exclusive), the probability that either will occur?
Is the sum of their probabilities, minus the probability that both will occur.
P (A or B) = P (A) + P (B) - P (A and B), where P (A and B) = P (A) x P (B) for independent events
When two conditions are mutually exclusive, the probability that either one will occur is
The sum of their probabilities: P (A or B) = P (A) + P (B)
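Both addition rules can be verified by counting outcomes. The sketch below uses a hypothetical experiment, one coin flip plus one roll of a fair die, and exact fractions so no rounding is involved.

```python
from fractions import Fraction
from itertools import product

# Hypothetical experiment: flip a coin and roll a die (12 equally likely outcomes).
outcomes = list(product("HT", range(1, 7)))

def prob(event):
    # Probability = favourable outcomes / all outcomes.
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def heads(o):
    return o[0] == "H"

def six(o):
    return o[1] == 6

def one(o):
    return o[1] == 1

# Independent events (heads, six): subtract the overlap.
p_both = prob(lambda o: heads(o) and six(o))
p_either_indep = prob(heads) + prob(six) - p_both

# Mutually exclusive events (one, six): no overlap, so just add.
p_either_excl = prob(one) + prob(six)

print(p_either_indep, p_either_excl)
```

Counting the outcomes directly (`prob(lambda o: heads(o) or six(o))`) gives the same answer as the formula, which is a useful sanity check.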
Randomization
Assignment occurs by chance
ROC curve - Receiver operating characteristic curve
X axis: 1 - specificity, or the false-positive rate
Y axis: Sensitivity
ROC curve is used to determine
Optimal Cut-off point for the respective test.
In general, the point closest to the upper-left corner, where sensitivity is highest and the false-positive rate is lowest, is chosen as the cut-off.
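The "closest to the upper-left corner" rule can be sketched as follows. The test scores for diseased and healthy subjects are made up, a score at or above the cut-off is assumed to count as a positive test, and the function names are hypothetical.

```python
from math import hypot

# Hypothetical test scores; higher scores are assumed to suggest disease.
diseased = [0.9, 0.8, 0.7, 0.6, 0.45, 0.35]
healthy = [0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

def roc_points(pos, neg):
    """Return (cutoff, false-positive rate, sensitivity) for each candidate cut-off."""
    points = []
    for t in sorted(set(pos + neg)):
        sens = sum(s >= t for s in pos) / len(pos)  # true-positive rate
        fpr = sum(s >= t for s in neg) / len(neg)   # 1 - specificity
        points.append((t, fpr, sens))
    return points

def closest_to_top_left(points):
    # Distance from each ROC point to the ideal corner (FPR = 0, sensitivity = 1).
    return min(points, key=lambda p: hypot(p[1], 1 - p[2]))

best_cutoff, best_fpr, best_sens = closest_to_top_left(roc_points(diseased, healthy))
print(best_cutoff, round(best_fpr, 3), round(best_sens, 3))
```

Distance to the corner is only one possible criterion; in practice the chosen cut-off may also weigh the relative costs of false positives and false negatives.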