Lecture 24-32 - Intro To Biostats In Epidemiology Flashcards
What are the 3 primary levels of variable?
L24 S5
- nominal
- ordinal
- interval/ratio
What are the 3 key attributes of variables?
L24 S5
- order/magnitude
- consistency of scale
- rational absolute zero
What is nominal data?
What are its characteristics?
L24 S6
- consists of labeled variables without quantitative characteristics
- can be dichotomous or binary in nature
- no order/magnitude
- no consistency of scale
What is ordinal data?
What are its attributes?
- contains rankable categories that are not evenly spaced
- yes order/magnitude
- no consistency of scale
What is interval data?
What are its attributes?
L24 S8
- rankable categories that are evenly spaced
- arbitrary 0 value that does not mean absence of measured value
- yes order/magnitude
- yes consistency of scale
- no rational absolute zero
What is ratio data?
What are its attributes?
L24 S8
- rankable categories that are evenly spaced
- absolute 0 value that indicates absence of measured value
- yes order/magnitude
- yes consistency of scale
- yes rational absolute zero
What is the order of specificity of data types?
In which direction(s) can you convert data types?
L25 S12
Nominal < Ordinal < Interval < Ratio
Data can only be converted down in specificity, not up
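A minimal sketch of converting "down" in specificity, assuming a hypothetical age variable and made-up cut points (neither comes from the lecture). Ratio data (age in years) can be collapsed into ordinal bands and then into a nominal yes/no label, but the original ages cannot be recovered from the labels.

```python
def age_to_band(age_years: float) -> str:
    """Ratio -> ordinal: collapse an age in years into a ranked band.
    The cut points (18, 65) are illustrative assumptions only."""
    if age_years < 18:
        return "child"
    if age_years < 65:
        return "adult"
    return "older adult"


def band_to_flag(band: str) -> str:
    """Ordinal -> nominal: collapse the ranked bands into a dichotomous label."""
    return "older adult" if band == "older adult" else "not older adult"


for age in (12, 40, 70):
    band = age_to_band(age)
    print(age, "->", band, "->", band_to_flag(band))
```

Going the other way is not possible: the label "adult" alone does not say whether the person was 18 or 64.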
What percentage of data is within one, two, and three standard deviations of the mean in a normally distributed data set?
L26 S23
One deviation (-1 to +1): ~68%
Two deviations (-2 to +2): ~95%
Three deviations (-3 to +3): ~99.7%
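These percentages can be checked against the standard normal curve. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `share_within` is an assumption made for illustration.

```python
from statistics import NormalDist


def share_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```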
What is the name given to the types of tests that are used on normally distributed data sets?
L26 S23
- parametric test
- or-
- interval test
What determines if a data set is skewed?
What makes a data set positively skewed?
Negatively skewed?
L26 S24-25
-mean and median differ from one another
Positively skewed:
- mean is higher than median
- tail goes to the right/positive direction
Negatively skewed:
- mean is lower than median
- tail goes to the left/negative direction
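A small sketch of the mean-versus-median rule above, using made-up data sets. The function name `skew_direction` is hypothetical, and comparing the two values is only the rule of thumb from this card, not a formal skewness statistic.

```python
from statistics import mean, median


def skew_direction(values) -> str:
    """Apply the card's rule: compare the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "positive"  # mean pulled up by a right-hand tail
    if m < med:
        return "negative"  # mean pulled down by a left-hand tail
    return "none"


print(skew_direction([1, 2, 2, 3, 3, 3, 20]))      # one large value on the right
print(skew_direction([1, 18, 19, 19, 20, 20, 20]))  # one small value on the left
print(skew_direction([1, 2, 3, 4, 5]))              # symmetric
```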
What does skewness represent?
L26 S35
-the measure of asymmetry of a distribution
What is kurtosis?
What do a negative, zero, and positive kurtosis represent?
L26 S37
-measure of the extent to which data clusters around the mean
Negative kurtosis:
-less cluster
Zero kurtosis:
-normal distribution
Positive kurtosis:
-more cluster
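As a rough illustration, kurtosis can be computed from the fourth moment of the data. The sketch below assumes the "excess" form (fourth standardized moment minus 3), which is the version in which a normal distribution scores zero; the data sets are made up.

```python
from statistics import fmean


def excess_kurtosis(values) -> float:
    """Population excess kurtosis: fourth standardized moment minus 3."""
    m = fmean(values)
    m2 = fmean([(x - m) ** 2 for x in values])  # variance (second moment)
    m4 = fmean([(x - m) ** 4 for x in values])  # fourth moment
    return m4 / m2 ** 2 - 3


flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]     # spread evenly, no clustering
peaked = [0, 5, 5, 5, 5, 5, 5, 5, 5, 10]   # clustered at the mean with two outliers

print(excess_kurtosis(flat))    # negative
print(excess_kurtosis(peaked))  # positive
```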
Calculating the mean on nominal and ordinal data can be done but it can’t be interpreted, why is this?
L26 S40-43
The numbers assigned to the data are arbitrary and can be changed. (There is no consistency of scale and there are no units.)
What is the name of the test that can be used to assess equality of variance between groups?
L26 S44
-Levene’s test
How do you assess data sets that are not evenly distributed?
L26 S45
- use tests that do not require normal distribution (non-parametric tests)
- transform the data to a standard value (z-score or log transformation) to make it normally distributed
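A sketch of the two transformations named above, on a made-up, strongly right-skewed data set. The helper names are assumptions. The z-score rescales values to a mean of 0 and a standard deviation of 1, while the log transformation compresses the long right tail.

```python
import math
from statistics import mean, median, pstdev


def z_scores(values):
    """Standardize: subtract the mean, divide by the (population) standard deviation."""
    m, sd = mean(values), pstdev(values)
    return [(x - m) / sd for x in values]


def log_transform(values):
    """Base-10 log of each value; only valid for values greater than zero."""
    return [math.log10(x) for x in values]


skewed = [1, 10, 100, 1000, 10000]
logged = log_transform(skewed)
standardized = z_scores(skewed)

print("raw:    mean", mean(skewed), "median", median(skewed))
print("logged: mean", mean(logged), "median", median(logged))
print("z-scores:", [round(z, 2) for z in standardized])
```

In this example the raw mean sits far above the median, while after the log transformation the two coincide.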
What are type 1 and type 2 errors?
L26 S47-49
Type 1 error:
- when the null hypothesis is true and should have been accepted, but wasn’t
- there is no true difference between groups but it was said that there is
Type 2 error:
- when the null hypothesis is false and should have been rejected but wasn’t
- there is a true difference between groups but it was said that there isn’t
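The two error types can be laid out as a simple decision table. This is only a sketch; the function name `classify_decision` is hypothetical.

```python
def classify_decision(null_is_true: bool, null_rejected: bool) -> str:
    """Label the outcome of a hypothesis test given the (unknown) truth."""
    if null_is_true and null_rejected:
        return "type 1 error"   # claimed a difference that does not exist
    if not null_is_true and not null_rejected:
        return "type 2 error"   # missed a difference that does exist
    return "correct decision"


for truth in (True, False):
    for rejected in (True, False):
        print(f"null true={truth}, rejected={rejected}: "
              f"{classify_decision(truth, rejected)}")
```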
What factors should be looked at to determine if a study’s results are statistically significant?
L27 S50
- power: the ability of a test to detect if there are true differences between groups
- sample size: the greater the sample size, the greater the study’s ability to detect if there is a difference between groups
- p value
- confidence interval
What are the typical accepted type 1 and type 2 error rates?
L27 S51
Type 1:
-5%
Type 2:
-20%
What are some ways that a p value can be interpreted?
L27 S56
- probability of making a type 1 error if the null hypothesis is rejected
- probability of erroneously claiming a difference between groups when one does not really exist
- probability of obtaining group differences as great or greater if the groups were actually the same or equal
- probability of obtaining test statistic as high/higher if the groups were actually the same/equal
Where is it desired to see that there is no statistical difference between groups?
L27
- baseline data
- Levene’s test
What does power mean with respect to statistical significance?
L27 S50
-the ability of a study to determine if there is a true difference between groups
1 - (type 2 error rate)
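A one-line worked example of the formula above, using the typical 20% type 2 error rate from the earlier card. The function name is an assumption.

```python
def power_from_type2_rate(type2_error_rate: float) -> float:
    """Power = 1 - (type 2 error rate)."""
    return 1 - type2_error_rate


# With the typical accepted type 2 error rate of 20%:
print(f"power = {power_from_type2_rate(0.20):.0%}")
```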
What is a confidence interval?
L28 S63
- a range of values that, at a stated percentage of confidence, statistically includes the real relationship being compared
If the confidence interval of a ratio contains the number ____________ it is statistically insignificant.
If the confidence interval of an absolute difference contains the number __________ it is statistically insignificant.
L28 S65-66
1; 0
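The rule above can be sketched as a check of whether the interval contains the "no effect" value. The function name and the example intervals are made up for illustration.

```python
def ci_is_significant(lower: float, upper: float, measure: str) -> bool:
    """Return True if the interval excludes the 'no effect' value.

    measure: "ratio" (no effect = 1) or "difference" (no effect = 0).
    """
    no_effect = {"ratio": 1.0, "difference": 0.0}[measure]
    return not (lower <= no_effect <= upper)


print(ci_is_significant(0.80, 1.30, "ratio"))       # contains 1 -> not significant
print(ci_is_significant(1.10, 1.90, "ratio"))       # excludes 1 -> significant
print(ci_is_significant(-2.0, 3.5, "difference"))   # contains 0 -> not significant
```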
What factors should be included in the interpretation of a confidence interval?
L29 S65
- level of confidence
- interpretation of range
- statement of statistical significance
- statement of the groups being compared
What questions should be asked when selecting a statistical test?
L29 S91
- what is the level of data being collected (nominal/ordinal/interval)?
- what type of comparison/assessment is desired (frequencies/counts/proportion)?
- how many groups are being compared (2 or ≥3)?
- is the data independent or related (from the same person or not)?
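The three selection questions (data level, number of groups, independent or related) can be sketched as a lookup table. The test names are the ones given in the group-comparison cards of this set; the table layout and the function name `select_test` are assumptions, and "Cochran's Q test" is the full name assumed for the test listed as "Cochran".

```python
# (data level, related?) -> (test for 2 groups, test for 3 or more groups)
TESTS = {
    ("nominal", False): ("Pearson's chi-square test", "chi-square test of independence"),
    ("nominal", True): ("McNemar test", "Cochran's Q test"),
    ("ordinal", False): ("Mann-Whitney test", "Kruskal-Wallis test"),
    ("ordinal", True): ("Wilcoxon signed rank test", "Friedman test"),
    ("interval", False): ("Student t-test", "Analysis of variance (ANOVA)"),
    ("interval", True): ("Paired t-test", "Repeated measures ANOVA"),
}


def select_test(level: str, n_groups: int, related: bool) -> str:
    """Pick a group-comparison test from the level, group count, and relatedness."""
    if n_groups < 2:
        raise ValueError("need at least 2 groups to compare")
    two_groups, three_or_more = TESTS[(level, related)]
    return two_groups if n_groups == 2 else three_or_more


print(select_test("ordinal", 2, related=False))
print(select_test("interval", 4, related=True))
```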
What is a correlation test?
What are the correlation tests for each data level?
L29 S75
-provides a quantitative measure of the strength and direction of the relationship between variables
Nominal:
-contingency coefficient
Ordinal:
-Spearman correlation
Interval:
-Pearson correlation
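A sketch of the Pearson and Spearman coefficients written out by hand, with made-up data. The Spearman version here is simply Pearson applied to ranks, and the ranking helper assumes there are no tied values.

```python
import math


def pearson(x, y) -> float:
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Rank from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # always increasing, but not in a straight line

print("Pearson: ", round(pearson(x, y), 3))
print("Spearman:", round(spearman(x, y), 3))
```

Because `y` rises with `x` at every step, the rank-based Spearman coefficient is 1, while Pearson falls slightly short of 1 because the relationship is not linear.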
What is a survival test?
What are the survival tests for each level of data?
L30 S81-83
- compares proportion of event occurrence over time between groups
- “changes over time”
- “time to event”
- can be graphed as a Kaplan-Meier curve (regardless of data level)
Nominal:
-Log-Rank test
Ordinal:
-Cox-Proportional Hazards test
Interval:
-Kaplan-Meier test
What is a regression test?
What are the regression tests for each level of data?
L30 S84-86
- measure of relationship between variables to predict an outcome
- able to calculate an odds ratio
- “predict”
Nominal:
-logistic regression
Ordinal:
-multinomial logistic regression
Interval:
-linear regression
What test is used to evaluate NOMINAL data of 2 INDEPENDENT groups and ≥3 INDEPENDENT groups?
L31 S93
2 groups:
-Pearson’s Chi-square test
≥ 3 groups:
-chi-square test of independence
When there are fewer than 5 observations of an occurrence, Fisher’s exact test is used instead of the two listed above.
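A sketch of the chi-square statistic for a table of counts, with made-up numbers. It also reports the smallest expected cell count, on the assumption that the "fewer than 5" rule is applied to expected counts, which is the common convention. The function name is hypothetical.

```python
def chi_square(table):
    """Return (chi-square statistic, smallest expected cell count) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and outcome were unrelated
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            statistic += (observed - expected) ** 2 / expected
    return statistic, min_expected


# Rows: group A, group B. Columns: event, no event.
stat, smallest = chi_square([[10, 20], [20, 10]])
print("chi-square:", round(stat, 3))
print("smallest expected count:", smallest)
# For a 2x2 table (1 degree of freedom) the 5% critical value is 3.841.
print("significant at 5%:", stat > 3.841)
```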
What must be done with 3 or more groups of NOMINAL data when a statistically significant difference between groups is found?
L31 S95
Post-hoc testing must be done to determine between which groups the statistically significant difference occurs.
ex. Bonferroni test of inequality (Bonferroni correction)
What test is used to evaluate NOMINAL data of 2 RELATED groups and ≥3 RELATED groups?
What words should indicate that data is related?
L31 S96
2 groups:
-McNemar test
≥ 3:
-Cochran’s Q test
Indicators of related data:
- pre- vs. post-
- before vs. after
- baseline vs. end
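A sketch of the McNemar statistic for paired yes/no data, in its basic form without a continuity correction. Only the discordant pairs (subjects whose answer changed) enter the calculation; the counts and the function name are made up.

```python
def mcnemar_statistic(yes_then_no: int, no_then_yes: int) -> float:
    """McNemar chi-square (no continuity correction) from the two discordant counts."""
    b, c = yes_then_no, no_then_yes
    return (b - c) ** 2 / (b + c)


# 15 subjects changed from yes (before) to no (after); 5 changed from no to yes.
stat = mcnemar_statistic(15, 5)
print("McNemar statistic:", stat)
# 1 degree of freedom, so the 5% critical value is 3.841.
print("significant at 5%:", stat > 3.841)
```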
What test is used to evaluate ORDINAL data of 2 INDEPENDENT groups and ≥3 INDEPENDENT groups?
L32 S97
2 groups:
-Mann-Whitney test
≥ 3 groups:
-Kruskal-Wallis test
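A sketch of the Mann-Whitney U statistic computed by direct pair counting, with made-up scores. It returns only the statistic (the smaller of the two U values), not a p value; the function name is an assumption.

```python
def mann_whitney_u(group_a, group_b) -> float:
    """U statistic: count pairs where a value in A exceeds a value in B (ties count 0.5)."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)  # a smaller U means less overlap between the groups


print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # no overlap at all
print(mann_whitney_u([1, 4, 5], [2, 3, 6]))  # heavily interleaved
```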
What must be done with 3 or more groups of ORDINAL data when a statistically significant difference between groups is found?
L32 S99
Post-hoc testing must be done to determine between which groups the statistically significant difference occurs.
Student-Newman-Keuls test:
- makes all possible pairwise comparisons
- groups must be equal in size
Dunnett test:
- compares each group against a single control
- groups must be the same size
Dunn test:
- makes all possible pairwise comparisons
- can be used when groups are not equal in size
What test is used to evaluate ORDINAL data of 2 RELATED groups and ≥3 RELATED groups?
What words should indicate that data is related?
L32 S98
2 groups:
-Wilcoxon Signed Rank test
≥ 3:
-Friedman test
Indicators of related data:
- pre- vs. post-
- before vs. after
- baseline vs. end
What test is used to evaluate INTERVAL data of 2 INDEPENDENT groups and ≥3 INDEPENDENT groups?
L32 S100
2 groups:
-Student t-test
≥ 3 groups:
- Analysis of variance (ANOVA)
- Analysis of covariance (ANCOVA) (used to control for confounding)
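A sketch of the Student t statistic for two independent groups, assuming equal variances (the pooled-variance form). The data and the function name are made up, and only the statistic is computed, not the p value.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b) -> float:
    """Independent-samples t statistic using a pooled variance."""
    na, nb = len(group_a), len(group_b)
    # Pool the two sample variances, weighted by their degrees of freedom
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    standard_error = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / standard_error


t = student_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print("t =", round(t, 3), "with", 5 + 5 - 2, "degrees of freedom")
```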
What test is used to evaluate INTERVAL data of 2 RELATED groups and ≥3 RELATED groups?
What words should indicate that data is related?
L32 S102
2 groups:
-Paired t-test
≥ 3:
- Repeated measures ANOVA
- Repeated measures ANCOVA (used to control for confounding)
Indicators of related data:
- pre- vs. post-
- before vs. after
- baseline vs. end
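A sketch of the paired t statistic, which works on the before/after difference within each subject rather than on the two groups separately. The measurements and the function name are made up.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after) -> float:
    """Paired t statistic: mean of the within-subject differences over its standard error."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


before = [10, 12, 14, 16]
after = [11, 14, 15, 18]
t = paired_t(before, after)
print("t =", round(t, 3), "with", len(before) - 1, "degrees of freedom")
```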
What must be done with 3 or more groups of INTERVAL data when a statistically significant difference between groups is found?
L32 S104-105
Post-hoc testing must be done to determine between which groups the statistically significant difference occurs.
Student-Newman-Keuls test -or- Tukey test -or- Scheffé test:
- makes all possible pairwise comparisons
- groups must be equal in size
Dunnett test:
- compares each group against a single control
- groups must be the same size
Dunn test:
- makes all possible pairwise comparisons
- can be used when groups are not equal in size
Bonferroni Correction:
-adjusts p value for # of comparisons (very conservative)
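A sketch of the Bonferroni adjustment, assuming every pair of groups is compared: the significance threshold is divided by the number of comparisons. The function names are hypothetical.

```python
from math import comb


def pairwise_comparisons(n_groups: int) -> int:
    """Number of distinct pairs among n_groups groups."""
    return comb(n_groups, 2)


def bonferroni_alpha(alpha: float, n_groups: int) -> float:
    """Per-comparison threshold after dividing alpha by the number of comparisons."""
    return alpha / pairwise_comparisons(n_groups)


for k in (3, 4, 5):
    print(f"{k} groups: {pairwise_comparisons(k)} comparisons, "
          f"per-comparison alpha = {bonferroni_alpha(0.05, k):.4f}")
```

The threshold shrinks quickly as groups are added, which is why the correction is described as very conservative.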
What does a Kappa statistic show?
L32 S106
- correlation test showing the level of consistency or agreement between different evaluators
- ranges from +1 to -1
- value of “+1” shows observers’ decisions perfectly agree with each other
- value of “0” shows there is no relationship between observers’ decisions
- value of “-1” shows observers’ decisions perfectly oppose each other
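A sketch of the kappa calculation for two evaluators making yes/no decisions (the form usually called Cohen's kappa, which is an assumption about which kappa the lecture means). It compares observed agreement with the agreement expected by chance; the counts are made up.

```python
def cohen_kappa(both_yes: int, a_yes_b_no: int, a_no_b_yes: int, both_no: int) -> float:
    """Kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_yes_b_no) / n  # how often evaluator A says yes
    b_yes = (both_yes + a_no_b_yes) / n  # how often evaluator B says yes
    chance = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (observed - chance) / (1 - chance)


print(round(cohen_kappa(20, 5, 10, 15), 3))  # partial agreement
print(round(cohen_kappa(25, 0, 0, 25), 3))   # perfect agreement
print(round(cohen_kappa(0, 25, 25, 0), 3))   # perfect disagreement
```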