CM- Biostatistics Flashcards
What are the 6 steps in the process of statistics?
- formulate a hypothesis
- design a study to test the hypothesis
- collect valid and accurate data
- manage the data effectively
- analyze the data creatively
- interpret the findings validly to answer the question
What are the 4 practical questions for analyzing data?
- What are the variables?
  - predictor [independent]
  - outcome [dependent]
- What types of variables are they?
  - what is the scale?
  - what is the category [dichotomous, continuous, nominal, ordinal]?
- If it is continuous, does it meet the normality assumption?
- Are the groups independent or paired?
What are the 4 categories of variables?
- dichotomous
  - sex, death [yes or no]
- multi-category nominal
  - name, hospital, service
  - infection rates by specialty
- multi-category ordinal
  - class rank quartiles
  - age categories [15-24, 25-40, 40-60, etc.]
- continuous [interval]
  - height, weight, age
How can you tell if a continuous outcome variable is normally distributed ?
When charted, it forms a symmetric bell curve in which the mean approximately equals the median
When the data are skewed right, which is the better measure of central tendency, the median or the mean?
The median, because the mean is disproportionately affected by the outliers
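A small sketch of why the median holds up better under right skew. The lengths of stay below are made-up numbers, used only for illustration.

```python
from statistics import mean, median

# Made-up hospital lengths of stay in days; the single 40-day stay skews the data right.
stays = [2, 3, 3, 4, 5, 6, 40]

stay_mean = mean(stays)      # 9: pulled upward by the one outlier
stay_median = median(stays)  # 4: stays with the bulk of the data

print(stay_mean, stay_median)
```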
What is the difference between independent and paired groups?
Independent is when the groups have no effect on each other.
Ex. ALL UTSW faculty vs. ALL medical students
Paired [dependent] is when the groups are linked, so that one group has an effect on the other
Ex. Medical students vs. their advisors
You are performing a study and the independent variable is dichotomous. The dependent variable is also dichotomous.
What should you use to analyze the data?
What measures of association are used?
What tests of statistical significance apply?
Analyze with a 2x2 contingency table.
Measures of association:
- RR
- AR
- contingency coefficient [phi]
Tests:
- chi-square
- Fisher’s exact test
- 95% confidence interval of RR
- McNemar’s test [maintains pairing]
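The three measures of association can be sketched from the four cells of a 2x2 table. This assumes AR means attributable risk (the risk difference) and that the table is laid out as exposed/unexposed rows by outcome yes/no columns; the function name and the counts are illustrative only.

```python
from math import sqrt

def measures_2x2(a, b, c, d):
    """a, b = exposed with/without outcome; c, d = unexposed with/without outcome."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed   # relative risk
    ar = risk_exposed - risk_unexposed   # attributable risk (risk difference)
    phi = (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return rr, ar, phi

# Made-up cohort: 20 of 100 exposed and 10 of 100 unexposed develop the outcome.
rr, ar, phi = measures_2x2(20, 80, 10, 90)
print(rr, ar, round(phi, 3))
```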
What is the difference between relative risk and odds ratio?
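Both are calculated from a 2x2 contingency table, but with different formulas.
RR = ratio of risks [incidence in exposed / incidence in unexposed]; used in cohort studies
Odds Ratio = ratio of odds [AD/BC]; used in case control studies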
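To make the distinction concrete, here is a sketch that computes both from the same 2x2 table, with cells A, B (exposed with/without outcome) and C, D (unexposed with/without outcome). The function names and counts are made up for illustration.

```python
def relative_risk(a, b, c, d):
    # Ratio of the risk in the exposed to the risk in the unexposed.
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    # Cross-product ratio: odds of the outcome in exposed vs. unexposed.
    return (a * d) / (b * c)

# Made-up counts: 20/100 exposed and 10/100 unexposed have the outcome.
print(relative_risk(20, 80, 10, 90))  # 2.0
print(odds_ratio(20, 80, 10, 90))     # 2.25
```

The two values differ here because the outcome is fairly common; they move closer together as the outcome becomes rare.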
What is the difference between chi-square and Fisher’s exact test when evaluating statistical significance?
Chi-square is an approximation that allows you to calculate a p-value.
- an estimate, but accurate with large numbers
Fisher’s exact test gives an exact p-value [not an estimate], but it uses factorials, so it is cumbersome with large numbers and is mainly used for small samples
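A rough sketch of both calculations on a small made-up table, to show how far the approximation can drift when counts are small. It assumes a two-sided Fisher test (summing all tables no more likely than the observed one) and an uncorrected chi-square with 1 degree of freedom; the function names are illustrative.

```python
from math import comb, erfc, sqrt

def fisher_exact_two_sided(a, b, c, d):
    """Exact two-sided p-value for a 2x2 table with fixed margins."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))

def chi_square_p_2x2(a, b, c, d):
    """Uncorrected chi-square statistic and its 1-df p-value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, erfc(sqrt(chi2 / 2))

print(fisher_exact_two_sided(3, 1, 1, 3))
print(chi_square_p_2x2(3, 1, 1, 3))
```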
How does the 95% confidence interval of the relative risk change with sample size?
As the sample size increases, the confidence interval becomes narrower [the estimate is more precise]
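The narrowing can be sketched with the usual log-based approximation for the confidence interval of RR. The formula choice, function name, and counts are assumptions for illustration; the two made-up studies have the same RR but a tenfold difference in size.

```python
from math import exp, log, sqrt

def rr_confidence_interval(a, b, c, d, z=1.96):
    """Approximate 95% CI for RR, computed on the log scale."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return rr, exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)

small = rr_confidence_interval(20, 80, 10, 90)      # 200 subjects
large = rr_confidence_interval(200, 800, 100, 900)  # 2000 subjects, same risks

print(small)
print(large)
```

In this made-up example the smaller study's interval includes 1 while the larger study's does not, even though both estimate RR = 2.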
When is McNemar’s test used?
With paired observations, when the independent and dependent variables are both dichotomous
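A minimal sketch of the McNemar statistic, assuming the version without continuity correction. Only the discordant pairs (pairs whose two members disagree) enter the calculation; the function name and counts are made up.

```python
from math import erfc, sqrt

def mcnemar(b, c):
    """b, c = counts of the two kinds of discordant pairs."""
    stat = (b - c) ** 2 / (b + c)   # chi-square with 1 degree of freedom
    p_value = erfc(sqrt(stat / 2))  # upper tail of chi-square, 1 df
    return stat, p_value

# Made-up matched pairs: 15 pairs discordant one way, 5 the other way.
print(mcnemar(15, 5))
```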
When you are conducting a test with:
Independent variable being multi-category nominal
Dependent variable being dichotomous.
What is used for analysis?
What is the measure of association?
What is the test of statistical significance?
Analysis: nX2 contingency table
Measure of association: uncertainty coefficient
Test of Stat Sig: Chi-square
*this will not help you see trends, just differences between the nominal groups
You are conducting a test with:
Independent variable being multi-category ordered
(ex. no, light, medium, heavy smokers)
Dependent variable being dichotomous
What is used for analysis?
What is the measure of association?
What is the test of statistical significance?
Analysis: nX2 contingency table
Measure of association: Goodman-Kruskal gamma
Test of Stat Sig:
- Chi-square
- Chi-square test for trend
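Goodman-Kruskal gamma can be sketched by counting concordant and discordant pairs in the ordered table. The function name and the smoking counts below are made up for illustration; rows run from no to heavy smoking, columns are [no disease, disease].

```python
def goodman_kruskal_gamma(table):
    """Gamma = (concordant - discordant) / (concordant + discordant)."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):       # rows with a higher exposure level
                for l in range(n_cols):
                    if l > j:                    # higher on both variables
                        concordant += table[i][j] * table[k][l]
                    elif l < j:                  # higher on one, lower on the other
                        discordant += table[i][j] * table[k][l]
    return (concordant - discordant) / (concordant + discordant)

smoking = [[90, 10], [80, 20], [70, 30], [60, 40]]
print(round(goodman_kruskal_gamma(smoking), 3))
```

A positive gamma here reflects that disease becomes more frequent as the smoking category rises.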
You are conducting a study with:
Independent variable being continuous
Dependent variable being dichotomous.
What is used for analysis?
- recode the independent variable into categories and analyze like multi-category ordinal or nominal
- nX2 contingency table
You are performing a study where the independent variable is dichotomous and the dependent variable is continuous.
When you chart the data, the normality assumption is satisfied.
How do you analyze the data?
What is the measure of association?
What is the test of stat sig?
Analyze: compare the means
Measure of association = difference between the means
Test of stat sig:
- Student’s t-test [independent groups]
- paired t-test [paired groups]
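Both t statistics can be sketched with the standard library. This assumes the equal-variance (pooled) form of Student's t-test; the function names and data are made up, and the p-value lookup against the t distribution is left out.

```python
from math import sqrt
from statistics import mean, variance

def student_t(x, y):
    """Difference in means, t statistic, and df for two independent groups."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    diff = mean(x) - mean(y)
    return diff, diff / sqrt(pooled_var * (1 / nx + 1 / ny)), nx + ny - 2

def paired_t(before, after):
    """Mean within-pair change, t statistic, and df for paired observations."""
    changes = [a - b for a, b in zip(after, before)]
    n = len(changes)
    return mean(changes), mean(changes) / sqrt(variance(changes) / n), n - 1

print(student_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5]))
print(paired_t([10, 12, 14, 16], [12, 13, 17, 18]))
```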
You are performing a study where the independent variable is dichotomous and the dependent variable is continuous.
When you chart the data, the data is skewed to the right.
How do you analyze the data?
What is the measure of association?
What is the test of stat sig?
Analyze:
- transform the data [log, square root, etc.] and then compare means
- use non-parametric tests
  - Mann-Whitney U test [equivalent to the Wilcoxon rank-sum test], median test, Wilcoxon signed-rank test [paired]
Measure of association:
- difference between transformed means
Test of stat sig [on the transformed data]:
- Student’s t-test
- paired t-test
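As an illustration of the non-parametric route, here is a sketch of the Mann-Whitney U statistic, which works on ranks and so is not thrown off by a skewed tail. The function name and data are made up; tied values get their average rank, and the significance lookup is omitted.

```python
def mann_whitney_u(x, y):
    """Return the smaller of the two U statistics for independent groups."""
    combined = sorted(x + y)

    def rank(value):
        # 1-based rank; ties share the average of the ranks they occupy.
        first = combined.index(value) + 1
        return first + (combined.count(value) - 1) / 2

    rank_sum_x = sum(rank(v) for v in x)
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return min(u_x, u_y)

# The 50 is an extreme value, but it only counts as "the highest rank".
print(mann_whitney_u([1, 2, 3, 50], [4, 5, 6, 7]))
```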
You are performing a study where the independent variable is multi-category. The dependent variable is continuous.
When you chart the data, the normality assumption is satisfied.
How do you analyze the data?
What is the measure of association?
What is the test of stat sig?
Analyze:
1. compare means
Measure of association:
1. difference between means
Test of stat sig:
- ANOVA [t-test but for more than 2 groups]
- F test
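The F statistic behind one-way ANOVA can be sketched as the ratio of between-group to within-group variability. The function name and the three made-up groups are for illustration only; comparing F against the F distribution is omitted.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return F, between-group df, and within-group df."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```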
You are performing a study where the independent variable is multi-category. The dependent variable is continuous.
When you chart the data, the normality assumption is NOT satisfied.
What are the 2 ways you can analyze the data?
- transform the data [log, etc.], then compare means, measure association by the difference between means, and use ANOVA as the test of stat sig
- use a nonparametric test
  - Kruskal-Wallis [nonparametric ANOVA]
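A sketch of the Kruskal-Wallis H statistic, which replaces the values with ranks before comparing groups. It assumes the basic formula without a tie correction; the function name and data are made up, and H would then be compared against a chi-square distribution with (number of groups - 1) degrees of freedom.

```python
def kruskal_wallis_h(groups):
    """H statistic from rank sums (no tie correction applied)."""
    combined = sorted(v for g in groups for v in g)
    n = len(combined)

    def rank(value):
        # 1-based rank; ties share the average of the ranks they occupy.
        first = combined.index(value) + 1
        return first + (combined.count(value) - 1) / 2

    rank_term = sum(sum(rank(v) for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```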
You are performing a study where the independent and dependent variables are both continuous.
You chart the data and the normality assumption is satisfied.
How do you analyze the data?
What is the measure of association?
What is the test of stat sig?
Analysis:
1. regression analysis [linear or multivariable]
Measure of association:
1. regression coefficient [B] and intercept [A]
Stat significance:
1. regression F test
When making a scatterplot for regression analysis of 2 continuous variables, what is on the Y and X axes?
What are B and a?
Y is the dependent variable
X is the independent variable
B is the strength of the effect of X on Y [slope: the change in Y per one-unit change in X]
a is the intercept [the value of Y when X = 0]
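A least-squares sketch of how B and a come out of the data, for the line Y = a + B*X. The function name and the points are made up for illustration.

```python
from statistics import mean

def fit_line(x, y):
    """Return (a, b): intercept and slope of the least-squares line."""
    mean_x, mean_y = mean(x), mean(y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx            # slope: change in Y per unit change in X
    a = mean_y - b * mean_x    # intercept: predicted Y when X = 0
    return a, b

# Points that lie exactly on Y = 1 + 2X.
print(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))
```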
What is the p-value?
What value must it be for a study to be statistically significant?
When can a higher or lower value be used?
Probability of observing a result at least as extreme as the one seen if chance alone were operating [i.e., if there were no real association].
The higher the p-value, the weaker the evidence that your association is real.
For most studies, the p-value must be below 0.05 to be “statistically significant”
A higher p-value cutoff is used with Bayesian stat when the severity of the situation is great
A lower p-value cutoff is utilized for small sample sizes
How is chi-square calculated?
What increases the value?
For a 2x2 table with cells A, B, C, D and total N:
$\chi^2 = \dfrac{N(AD - BC)^2}{(A+B)(C+D)(A+C)(B+D)}$
Increases with:
- sample size
- degree of difference between the groups
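The 2x2 shortcut, N(AD - BC)^2 divided by the product of the four marginal totals (A+B)(C+D)(A+C)(B+D), can be sketched directly, which also shows both effects listed above. The function name and counts are made up for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected chi-square for a 2x2 table using the shortcut formula."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

baseline = chi_square_2x2(20, 80, 10, 90)       # 20% vs. 10%, 200 subjects
bigger_n = chi_square_2x2(200, 800, 100, 900)   # same risks, 10x the subjects
bigger_gap = chi_square_2x2(30, 70, 10, 90)     # 30% vs. 10%, 200 subjects

print(baseline, bigger_n, bigger_gap)
```

Multiplying every cell by ten multiplies the statistic by ten, so the same proportions become far more significant in a larger study.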
How do you determine the degrees of freedom to find a chi2 value?
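DF = (number of rows - 1) x (number of columns - 1); for an nX2 table this is the number of rows minus one
The more degrees of freedom, the larger the chi2 value needed to reach significance, so the more difficult it is to get a significant result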
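A short sketch of the degrees-of-freedom rule, (rows - 1) x (columns - 1), alongside the standard chi-square critical values at p = 0.05. The names are illustrative; the critical values are the usual table entries for 1 to 4 degrees of freedom.

```python
# Chi-square value that must be exceeded for p < 0.05, by degrees of freedom.
CRITICAL_VALUE_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def degrees_of_freedom(n_rows, n_cols):
    return (n_rows - 1) * (n_cols - 1)

df_2x2 = degrees_of_freedom(2, 2)   # 1
df_4x2 = degrees_of_freedom(4, 2)   # 3

# The bar for significance rises with the degrees of freedom.
print(df_2x2, CRITICAL_VALUE_05[df_2x2])
print(df_4x2, CRITICAL_VALUE_05[df_4x2])
```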
What is the standardized mortality ratio?
(# cases observed / # cases expected) x 100
It is used to test whether the incidence of disease is greater than expected.
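A sketch of the calculation, assuming the expected count comes from applying reference rates to each stratum of the study population (indirect standardization). The function name, strata, and rates are made up for illustration.

```python
def standardized_mortality_ratio(observed, stratum_sizes, reference_rates):
    """SMR = observed / expected x 100, with expected built from reference rates."""
    expected = sum(size * rate for size, rate in zip(stratum_sizes, reference_rates))
    return observed / expected * 100

# Two made-up age strata: 1000 people at a 1% reference rate, 2000 at 2%.
smr = standardized_mortality_ratio(60, [1000, 2000], [0.01, 0.02])
print(round(smr))  # above 100 means more cases were observed than expected
```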