14. Stats Flashcards
What type of data is categorical?
Qualitative
What type of data is numerical?
Quantitative
What are the 2 types of categorical data?
- Nominal
- Ordinal
Data that can be in categories, but have no particular order or magnitude differences?
Nominal
Data that can be allocated to an ordered set of categories?
Ordinal
What type of data includes discrete data (only certain whole numbers) and continuous data (any numerical value)?
Numerical
What type of data is blood groups?
Nominal
What type of data is AHA class?
Ordinal
What type of data is # of surgical procedures?
Discrete
What type of data is cardiac index?
Continuous
Case-control advantages?
- Can study rare disease
- Can study disease with long latency between exposure/manifestation
- Can be launched/conducted over short time periods
- Inexpensive (compared to cohort)
- Can study multiple causes of disease
Case-control disadvantages?
- Recall bias (information on exposure/past history based on interview)
- Validation of info on exposure is difficult
- Concerned with one disease only
- Can’t provide information on incidence rates of disease
- Incomplete control of extraneous variables
- Choice of appropriate control group can be challenging
- Methodology can be hard to comprehend for non-epidemiologist
- Correct interpretation of results can be hard
Cohort advantages?
- Complete information on subjects exposure (quality control of data)
- Clear temporal sequence of exposure/disease
- Study multiple outcomes related to a specific exposure
- Calculation of incidence rates (absolute risk and relative risk)
- Methodology/results easily understood by non-epidemiologists
- Study relatively rare exposures
Cohort disadvantages?
- Not suited for rare disease (need large # subjects)
- Not suited when time between exposure and disease manifestation is very long (can be overcome in historical cohort studies)
- Exposure patterns may change during course of study and make results irrelevant
- Maintaining high rates of follow-up can be difficult
- Expensive to carry out (need large # subjects)
- Baseline data sparse… large # of subjects doesn’t allow for long interviews
Research involving administration of a test regimen to humans to evaluate both efficacy and safety
Clinical trial
Phases of a clinical trial?
1- Safety and pharmacologic profiles
2- Pilot efficacy studies
3- Extensive clinical trial
4- Studies after FDA approval for distribution
Administration of a single subtherapeutic dose of the drug to a small group (10-15) to gather preliminary data on pharmacokinetics and pharmacodynamics
Phase 0
A small group (20-80) of volunteers to assess the safety and pharmacokinetic profile of medication
Phase 1
A large group (20-300) to assess safety in a larger group of patients as well as effectiveness of the drug
Phase 2
Randomized controlled multicenter trial on a relatively large group (300-3000+, depending on the medical condition) to assess the effectiveness of the drug in comparison with an accepted therapy
Phase 3
Safety surveillance and ongoing technical support of a drug after permission for it to be distributed
Phase 4
What type of test is used when numbers in contingency table of categorical variables are relatively small?
Fisher exact
What test is used for two groups with paired data?
McNemar
What test is used to measure the difference between actual/expected frequencies of categorical variables?
Chi2
What test is an extension of the Chi2 test used when comparing several 2-way tables (meta-analysis)?
Mantel-Haenszel
What tests are used to compare samples of normally distributed data?
Parametric
What type of tests are used when data are not normally distributed?
Non-parametric
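What types of tests are Student t-test, ANOVA, ANCOVA, Kolmogorov-Smirnov?
Parametric
*Kolmogorov-Smirnov is itself distribution-free; it is grouped here because it checks the normality assumption that parametric tests rely on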
What types of tests are Wilcoxon signed-rank, Mann-Whitney U-test, Wilcoxon rank sum, Kruskal-Wallis?
Non-parametric
What test is used to compare 2 samples to test probability that samples come from population with same mean value?
Student T-test
What test is used to compare the means of 2+ samples to see whether they are derived from the same population?
ANOVA
What test is used to compare the means of 2+ samples to see whether they are derived from the same population and accommodates continuous variables (covariates)?
ANCOVA
What test is used to test hypothesis that the collected data are from a normal distribution so that the parametric stats can be used?
Kolmogorov-Smirnov
What test compares the difference between paired groups?
Wilcoxon signed-rank
What non-parametric test is like the t-test for parametric data?
Wilcoxon signed-rank
What tests compare 2 sets of data that are derived from 2 different sets of subjects?
Mann-Whitney U-test or Wilcoxon rank sum
What test compares 2+ independent groups (like the ANOVA for parametric)?
Kruskal-Wallis
What describes the frequency of occurrence of new cases during a time period?
Incidence
What measure is useful to explore causal theories or evaluate effects of preventive measures?
Incidence
What is the equation for incidence?
# of new cases in a population in a period of time / Sum, over every individual in the population, of the length of time at risk of getting the disease
What is the equation for cumulative incidence?
# of individuals who get the disease in a certain period / # of individuals in the population at the beginning of the period
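The two incidence equations above can be sketched in a few lines of Python. This is only an illustration: the function names and the five-person cohort are made up for the example and do not come from the cards.

```python
def incidence_rate(new_cases, person_time_at_risk):
    """New cases divided by the summed time at risk (e.g. person-years)."""
    return new_cases / person_time_at_risk


def cumulative_incidence(new_cases, population_at_start):
    """New cases divided by the number of people at risk at the start."""
    return new_cases / population_at_start


# Hypothetical cohort: (years at risk, developed disease?) for each person.
follow_up = [(2.0, False), (1.5, True), (2.0, False), (0.5, True), (2.0, False)]

cases = sum(1 for _, sick in follow_up if sick)      # 2 new cases
person_years = sum(years for years, _ in follow_up)  # 8.0 person-years

print(incidence_rate(cases, person_years))           # cases per person-year
print(cumulative_incidence(cases, len(follow_up)))   # proportion over the period
```

Note that the incidence rate has units (per person-year here), while cumulative incidence is a plain proportion.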
What describes what proportion of the population has a disease at a specific point in time?
Prevalence
What does prevalence depend on?
Incidence and duration
P = I * D
What measure is relevant to planning of health services or assessing need for medical care in a population?
Prevalence
What is the equation for prevalence?
Existing # of individuals having disease at a specific time/Number of individuals in the population at that point in time
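As a rough sketch of the prevalence equation and the P = I * D relationship above (function names and numbers are hypothetical; the I * D form is an approximation that assumes a stable population with low prevalence):

```python
def point_prevalence(existing_cases, population):
    """Existing cases at one point in time divided by the population at that time."""
    return existing_cases / population


def prevalence_from_incidence(incidence_rate, mean_duration):
    """Steady-state approximation P = I * D (assumes low, stable prevalence)."""
    return incidence_rate * mean_duration


# Hypothetical numbers, for illustration only.
print(point_prevalence(50, 1000))            # 50 existing cases among 1000 people
print(prevalence_from_incidence(0.01, 5.0))  # 0.01 cases/person-year lasting 5 years
```

Both calls give about 5%, which shows how a low-incidence disease with a long duration can still have an appreciable prevalence.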
Does chronic disease have lower prevalence or incidence?
Lower incidence
Do acute illnesses have lower prevalence or incidence?
Lower prevalence
What is used to delineate how one set of data relates to another through a best fit line?
Regression analysis
*Regression coefficient is the slope of a line
Name examples of different types of regression analysis?
- Simple linear
- Logistic
- Poisson
- Cox proportional hazards
What is the most common survival curve method?
Kaplan-Meier curve
What does a Kaplan-Meier curve do?
Displays survival of a cohort with calculation of survival estimates upon each death or event
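A minimal sketch of how the survival estimate is recalculated at each event, assuming the usual product-limit form of the Kaplan-Meier estimator. The function name and the five-patient data set are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival estimate)] with one entry per event time.

    times  : follow-up time for each subject
    events : 1 if the subject died/had the event at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)  # deaths + censored at t
        if deaths:
            # Multiply by the fraction of those at risk who survive past t.
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        at_risk -= leaving
        i += leaving
    return curve


# Hypothetical cohort: events at t = 1, 3, 5; censored at t = 2, 4.
print(kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1]))
```

Censored subjects do not drop the curve, but they shrink the number at risk for every later step.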
What is a nonparametric test to compare the survival between 2 potential Kaplan-Meier curves?
Log rank test
What is a well-recognized curve that reflects a continuous probability distribution that is bell-shaped (unimodal) and symmetrical about the mean with 2 parameters, mean and variance?
Gaussian distribution
Which continuous probability distribution most closely resembles the normal or Gaussian distribution?
T-distribution
What is the measure of dispersion or variability in a sample?
Standard deviation
What % of cases fall within 1 SD in normal distribution?
68.2%
What % of cases fall within 2 SD in normal distribution?
95.4%
What % of cases fall within 3 SD in normal distribution?
99.7%
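The three percentages above can be checked from the normal distribution itself. A small sketch using the standard error function (the helper name is made up for the example):

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3, 1.96):
    print(k, round(100 * fraction_within(k), 2))
```

The 1.96 case is included because it is the multiplier that captures 95% of the distribution.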
True or False: Mean and median of a normal distribution are equal
True
What is the difference between T-distribution and normal distribution?
T-distribution is more spread out with longer tails
What distribution is right skewed and characterized by degrees of freedom?
Chi2
What distribution is right skewed used for comparing 2 variances?
F distribution
What distribution is highly skewed to the right (it is the probability distribution of a random variable whose log follows the normal distribution)?
Log normal distribution
Name 2 discrete probability distributions
- Binomial
- Poisson
What is a confidence interval?
Range that is likely to contain the true population mean value
What does a 95% confidence interval mean?
If the study were repeated many times, about 95% of the intervals calculated this way would contain the true population value
What indicates variability in a sample?
Standard deviation
In a normal distribution, 95% of the distribution of the sample means is within how many standard errors (SD of the sample means) of the population mean?
1.96
The size of the CI is related to what?
Sample size of study
*Larger the sample, narrower the CI
How is the 95% confidence interval for the mean calculated?
Sample mean – 1.96 x SEM to sample mean + 1.96 x SEM
*SEM = SD / square root of n (the SD of the distribution of sample means)
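The calculation above, sketched in Python. The function name and the sample are hypothetical; the sketch takes SEM as the sample SD divided by the square root of n and uses the 1.96 multiplier from the card (for small samples a t-distribution multiplier is usually preferred).

```python
import math
import statistics


def ci95_mean(sample):
    """95% CI for the mean: sample mean +/- 1.96 x SEM."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)          # sample SD (n - 1 denominator)
    sem = sd / math.sqrt(len(sample))      # standard error of the mean
    return mean - 1.96 * sem, mean + 1.96 * sem


low, high = ci95_mean([4, 6, 8, 10, 12])   # hypothetical measurements
print(round(low, 3), round(high, 3))
```

Because SEM shrinks with the square root of n, quadrupling the sample size roughly halves the width of the interval.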
Name the 2 types of applied statistics
- Descriptive
- Inferential
What do descriptive statistics do?
Describe data in a sample
What do inferential statistics do?
Estimate whether results suggest a real difference between populations
Examples of descriptive statistics?
- Mean
- Median
- Mode
- SD
- Quartiles
- Histograms
Examples of inferential statistics?
- Student T-test
- ANOVA
- Chi2
What is a type I or alpha error?
When null hypothesis that is correct is rejected (stating a difference when there isn’t one)
What is the chance of making a type I error?
Alpha (the significance level); a result is called significant when the p-value falls below it
What is a type II or beta error?
When null hypothesis that is incorrect is accepted (stating no difference when there is one)
What is a type III error?
Study design that produces the right answer to the wrong question
What is the p-value?
Probability of obtaining a result at least as extreme as the one observed if the null hypothesis (usually no difference between groups) is true
What is the probability of an observed difference occurring solely by chance?
P-value
What is the usual p-value level of significance?
0.01 to 0.05
What is the method used to adjust P-value for multiple testing?
Bonferroni adjustment
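A short sketch of the adjustment, assuming its simplest form: each p-value is multiplied by the number of tests (capped at 1) before being compared with the significance level. The function name and p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-values, significance flags) for a family of tests."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p_adj < alpha for p_adj in adjusted]
    return adjusted, significant


# Three hypothetical comparisons from the same study.
adjusted, significant = bonferroni([0.01, 0.02, 0.04])
print(adjusted)
print(significant)
```

All three raw p-values are below 0.05, but only the first survives the adjustment.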
What is the power of a study?
Probability that it would detect a statistically significant difference
What is β (beta) in statistics?
Probability of accepting a null hypothesis that is false (type II error)
What is the equation for the power of a study?
1 - β
*Probability of rejecting the null hypothesis when it is false
What is the minimum Power a study should have?
80%
What things can increase the power of a study?
- Larger significance level
- Larger effects
- Decreased variability of the observations
- Larger sample size
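The four factors above can be seen in a rough power calculation. This sketch assumes a two-sided, two-sample z-test with equal group sizes and a known SD, which is a simplification chosen for the example; the function name and inputs are hypothetical.

```python
from statistics import NormalDist


def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test."""
    z = NormalDist()                     # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)    # about 1.96 when alpha = 0.05
    se = sd * (2 / n_per_group) ** 0.5   # SE of the difference in means
    # Ignores the negligible chance of rejecting in the wrong direction.
    return z.cdf(abs(effect) / se - z_crit)


print(round(approx_power(0.5, 1.0, 64), 3))               # baseline
print(round(approx_power(0.5, 1.0, 64, alpha=0.10), 3))   # larger significance level
print(round(approx_power(0.8, 1.0, 64), 3))               # larger effect
print(round(approx_power(0.5, 0.8, 64), 3))               # less variability
print(round(approx_power(0.5, 1.0, 100), 3))              # larger sample
```

The baseline lands close to the 80% minimum, and each of the four changes pushes it higher.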
What assessment tool is this? An economic assessment method in which the costs and consequences of alternative cardiac interventions are expressed as cost per unit of health outcome. It applies to health programs as well as health services, to determine the preferred action that requires the least cost to produce a given level of effectiveness.
CEA: Cost-effectiveness analysis
What is CUA?
- Cost-utility analysis: uses quality-of-life measurements expressed as utilities (quality-adjusted life years, QALY) in the value equation
- Disability-adjusted life year (DALY) is also a measure, but of the overall “burden of disease”
- DALY quantifies the impact of premature death (like QALY), but also of disability, on a population by combining them into a single, comparable metric
What is CBA?
Cost-benefit analysis: seeks to translate all relevant healthcare considerations into monetary terms by analyzing economic and social costs of medical care and benefits of reduced loss of net earnings due to preventing premature death or disability
What is a technique where results from a number of studies that are similar in nature are gathered to give one overall estimate of the effect?
Meta-analysis
List the formal steps for a meta-analysis
- Decide on effect of interest
- Check for statistical homogeneity
- Estimate average effect of interest with CIs
- Interpret the results and present the findings (forest plot)
List advantages of meta-analysis
- Refinement and reduction
- Efficiency
- Generalizability and consistency
- Reliability
- Power/precision
List disadvantages of meta-analysis
- Publication bias
- Clinical heterogeneity
- Quality differences
- Lack of independence of study subjects
What is a systematic review?
- Uses meta-analysis to render well-informed clinical decisions… essential part of evidence-based medicine
- Major disease categories often have enough randomized clinical trials to support, at minimum, a meta-analysis to determine the value of an intervention
When is the risk ratio or relative risk used?
Prospective cohort studies
How is RR calculated?
Divide risk in treated/exposed group by risk in control/unexposed group
How is RR reported?
Given with a 95% CI
- Can be <1, 1 or >1
- If the CI includes 1, not statistically significant
What is RR similar to?
Odds ratio
What is the relative risk reduction?
Proportion by which the intervention reduces the event rate
*(Control group risk - Intervention group risk) / Control group risk
What is the absolute risk reduction?
Difference between the event rates in the intervention versus control groups
*Control group risk-Intervention group risk
What is the number needed to treat?
Number of patients who need to be treated for one to get benefit
Relationship of NNT and ARR?
NNT is the reciprocal of ARR
NNT = 1/ARR when ARR is a proportion, or NNT = 100/ARR when ARR is a percentage
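The RR, RRR, ARR and NNT cards above all come from the same two event rates. A small worked sketch (the function name and trial numbers are hypothetical):

```python
def risk_measures(events_treated, n_treated, events_control, n_control):
    """Relative risk, relative/absolute risk reduction and number needed to treat."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    arr = risk_control - risk_treated        # absolute risk reduction (proportion)
    return {
        "RR": risk_treated / risk_control,   # risk in treated / risk in control
        "RRR": arr / risk_control,           # proportional reduction in event rate
        "ARR": arr,
        "NNT": 1 / arr,                      # usually rounded up in practice
    }


# Hypothetical trial: 10/100 events on treatment vs 20/100 on control.
print(risk_measures(10, 100, 20, 100))
```

Here the treatment halves the risk (RR 0.5, RRR 50%), yet ten patients must be treated to prevent one event, which is why ARR and NNT are reported alongside the relative measures.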
When is odds ratio used?
Retrospective case-control studies
How is the odds ratio calculated?
By comparing odds of the exposed versus control groups
*Odds are calculated by dividing the number of times the event occurs by the number of times it doesn’t; the odds ratio divides the odds in the exposed group by the odds in the control group
How is the odds ratio reported?
Given with a 95% CI
- Odds ratio can be <1, 1 or >1
- If the CI includes 1, it isn’t statistically significant
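A sketch of an odds ratio with its 95% CI from a 2 x 2 table. The cards do not say how the interval is obtained, so the log (Woolf) method used here is an assumption; the function name and counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d):
    """Odds ratio and 95% CI (log method) for a 2 x 2 table.

    a = exposed with event      b = exposed without event
    c = unexposed with event    d = unexposed without event
    """
    odds_ratio = (a / b) / (c / d)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of ln(OR)
    low = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    high = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, low, high


# Hypothetical case-control counts.
odds_ratio, low, high = odds_ratio_ci(30, 70, 10, 90)
print(round(odds_ratio, 2), round(low, 2), round(high, 2))
print("includes 1:", low <= 1 <= high)
```

In this made-up example the whole interval sits above 1, so the association would be called statistically significant.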
In a typical receiver operating characteristic (ROC) curve, what is the significance of the upper left corner or coordinate (0,1)?
100% sensitivity and specificity
- Perfect classification
- No false negatives and no false positives
What is a ROC curve?
A 2-way plot of the sensitivity (true + rate) against 1 minus the specificity (false + rate) for different cutoff values for a continuous variable in a diagnostic test
What shape do you want an ROC curve to have?
Sharp upslope then taper off (versus just a straight diagonal line)
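To make the ROC cards concrete, this sketch builds the curve's points by sweeping the cutoff over a set of test values. The scores, labels and function names are hypothetical, and the area-under-curve helper is an extra summary that the cards do not mention.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per cutoff value."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]                       # cutoff above every score
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC points (0.5 = diagonal, 1.0 = perfect)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical test values; label 1 = diseased, 0 = healthy.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
print(points)
print(round(area_under_curve(points), 3))
```

A curve that climbs steeply toward (0, 1) before flattening gives an area near 1, while the straight diagonal gives 0.5.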
What is the measure of precision of the sample mean or how close the sample mean is likely to be to the population mean?
Standard error of the mean (SEM)
What is variance?
Square of the standard deviation
What is the coefficient of variation?
Ratio of the SD to the mean
What is a measure of spread away from the mean?
Standard deviation
What is the square root of variance?
SD
What is a measure of precision of the sample mean or how close the sample mean is likely to be to the population mean?
SEM
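The spread measures on the cards above are all derived from one another. A short sketch using Python's statistics module (the function name and sample are hypothetical; the sample n - 1 form of the variance is assumed):

```python
import math
import statistics


def spread_measures(sample):
    """Variance, SD, coefficient of variation and SEM for one sample."""
    mean = statistics.mean(sample)
    variance = statistics.variance(sample)     # sample variance (n - 1 denominator)
    sd = math.sqrt(variance)                   # SD is the square root of variance
    return {
        "mean": mean,
        "variance": variance,
        "sd": sd,
        "cv": sd / mean,                       # ratio of SD to mean
        "sem": sd / math.sqrt(len(sample)),    # precision of the sample mean
    }


print(spread_measures([2, 4, 4, 4, 5, 5, 7, 9]))
```

SD describes the scatter of individual values, whereas SEM describes how precisely the sample mean estimates the population mean, so SEM is always the smaller of the two for n > 1.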
What is the degree of closeness of measurements to quantity’s true value?
Accuracy
What is the reproducibility of a study result when the study is repeated under the same circumstances?
Precision
How is precision measured?
Standard error of measurement
What is a Chi squared test?
Measure of the difference between actual and expected frequencies with categorical variables
What needs to be set up to calculate a Chi2 value?
Contingency Table
If there is no difference between the actual and expected values, what is the Chi2 value?
0
*Larger the difference, bigger the X2 value (and p-value accompanies X2 value)
What is the number of independent comparisons that can be made between members of a sample and is used with X2 to calculate the p-value?
Degrees of freedom
In the example of some kids with SVT being treated with digoxin v. propranolol, how many degrees of freedom are needed to calculate a p-value?
1
*Number of independent comparisons that can be made between members of sample
What is sometimes used with a Chi2 test to improve the accuracy of the p-value?
Yates continuity correction
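A sketch tying together the contingency table, expected frequencies, degrees of freedom and Yates correction for a 2 x 2 table. The cards give no counts for the digoxin v. propranolol example, so the numbers below are invented purely for illustration, and the function name is hypothetical. The p-value shortcut via erfc is only valid for 1 degree of freedom.

```python
import math


def chi_square_2x2(table, yates=False):
    """Chi2 statistic, degrees of freedom and p-value for a 2 x 2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(0.0, diff - 0.5)     # Yates continuity correction
            chi2 += diff ** 2 / expected
    df = (2 - 1) * (2 - 1)                      # (rows - 1) x (columns - 1)
    p_value = math.erfc(math.sqrt(chi2 / 2))    # valid for df = 1 only
    return chi2, df, p_value


# Invented counts: rows = digoxin, propranolol; columns = responded, did not.
table = [[15, 5], [8, 12]]
print(chi_square_2x2(table))
print(chi_square_2x2(table, yates=True))
```

With these invented counts the uncorrected test falls below 0.05 while the Yates-corrected one does not, showing how the correction makes the test more conservative.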
When is a Fisher exact test used?
When numbers in a contingency table of categorical variables are small
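For small tables, the exact test can be computed directly from the hypergeometric distribution. This is a sketch of one common two-sided convention (summing the probabilities of all tables no more likely than the observed one); the function name and counts are hypothetical.

```python
from math import comb


def fisher_exact_2x2(table):
    """Two-sided Fisher exact p-value for a 2 x 2 table with fixed margins."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Probability of seeing k in the top-left cell given the margins.
        return comb(row1, k) * comb(row2, col1 - k) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small tolerance so ties with the observed table are counted.
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= p_observed * (1 + 1e-9))


print(round(fisher_exact_2x2([[3, 1], [1, 3]]), 4))   # hypothetical small table
```

No expected-frequency approximation is involved, which is why the test remains valid when cell counts are small.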
When is a McNemar test used?
For 2 groups with paired data
What is an extension of the Chi2 test that is used when comparing several 2-way tables (like a meta-analysis)?
Mantel-Haenszel test
What is a correlation coefficient?
The strength of the linear relationship between 2 variables
What is the range of a correlation coefficient?
Denoted by r and ranges from -1 to +1
What is sometimes used with a correlation coefficient to correct for negatively correlated relationships?
R2
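A sketch of r and R2 computed by hand, assuming the Pearson product-moment form of the correlation coefficient. The function name and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
falling = [10, 8, 6, 4, 2]     # perfectly linear, negative slope
noisy = [2, 4, 5, 4, 5]        # positive but imperfect

for y in (falling, noisy):
    r = pearson_r(x, y)
    print(round(r, 4), round(r ** 2, 4))   # r, then R2
```

Squaring removes the sign, so the perfectly negative relationship gives r = -1 but R2 = 1.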
When can a correlation coefficient not be calculated?
- Non-linear relationship
- Outliers