Lecture 12: Biostatistics Flashcards
biostatistics
interpreting the results from dental research studies and publications
Sources of data
- scientific journals
- clinical study reports
- product manufacturers/representatives
- presentations at dental conferences
statistics allow us:
to understand information and make clinical decisions based on data.
How to describe data
use quantitative data
Quantitative data:
mean
median
mode
SD
mean
Average of the data. Sensitive to extreme values.
Median
Middle point of the data. Less sensitive to extreme values.
Mode
Most frequently occurring value in the data.
Standard Deviation (SD)
Measure of how much the individual data points vary around the mean.
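A minimal sketch of the four summaries above, using Python's standard `statistics` module. The data are made up for illustration; note how the single extreme value moves the mean but not the median.

```python
import statistics

# Hypothetical data: number of decayed teeth in 7 patients (made up).
decayed_teeth = [1, 2, 2, 3, 4, 5, 18]

mean = statistics.mean(decayed_teeth)      # pulled upward by the extreme value 18
median = statistics.median(decayed_teeth)  # middle value, barely affected by 18
mode = statistics.mode(decayed_teeth)      # most frequently occurring value
sd = statistics.stdev(decayed_teeth)       # sample SD (divides by n - 1)

print(f"mean={mean:.2f} median={median} mode={mode} SD={sd:.2f}")
```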
frequency
count of a given outcome, or of the observations in each category
percentage
count of a given outcome per hundred, showing the proportion of each category out of the total
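Frequency and percentage can be sketched in a few lines. The category labels and counts here are hypothetical.

```python
from collections import Counter

# Hypothetical categorical outcomes for 8 patients (made up).
outcomes = ["caries", "healthy", "caries", "gingivitis",
            "healthy", "caries", "healthy", "healthy"]

frequency = Counter(outcomes)      # count in each category
total = sum(frequency.values())    # total number of observations

# Percentage: count per hundred, i.e. the share of each category in the total.
percentage = {category: 100 * count / total
              for category, count in frequency.items()}

for category, count in frequency.items():
    print(f"{category}: n={count} ({percentage[category]:.1f}%)")
```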
bar chart
(you know what it looks like)
shows categorical data
histogram
normal curve
shows quantitative data
X
independent variable
Y
dependent variable
correlation coefficient
r - can lie between -1 and +1
(+) r value
as X increases, Y increases
(-) r value
as X increases, Y decreases
the closer (r) is to +1 or -1 …
the stronger the relationship
Square of Correlation
r^2
is the fraction of variation in Y explained by X
the higher r^2 …
the better the fit of the regression line
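A short sketch of computing r and r^2 by hand with the usual Pearson formula. The function name `pearson_r` and the data are assumptions for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between paired lists x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: X = sugary snacks per day, Y = number of carious lesions.
snacks_per_day = [1, 2, 3, 4, 5]
carious_lesions = [2, 4, 5, 4, 5]

r = pearson_r(snacks_per_day, carious_lesions)
r_squared = r ** 2  # fraction of the variation in Y explained by X

print(f"r = {r:.3f}, r^2 = {r_squared:.2f}")
```

Here r is positive (Y tends to rise with X), and r^2 says that roughly 60% of the variation in Y is explained by X in this made-up sample.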
hypothesis
an explanation for certain observations
Ho:
the null hypothesis; the hypothesis that is actually tested
Null hypothesis states:
there is no difference between two groups being compared or no effect of a product or intervention
result of hypothesis testing
data will either “fail to reject” or will “reject” the null hypothesis
Ha:
often the one researcher thinks is the “truth”
Ha states:
there is a difference between two groups being compared or an effect of a product or intervention
directional
μ1 > μ2
non-directional
μ1 ≠ μ2
Ho: μ1 = μ2
interpretation? (example)
the population mean for group 1 (men) is the same as the population mean for group 2 (women)
Ha: μ1 ≠ μ2
interpretation? (example)
the population mean for group 1 (men) is different from the population mean for group 2 (women)
Type I Error:
rejecting a null hypothesis that is actually true in the population
alpha
level of statistical significance; the probability of a Type I Error
alpha is commonly set to:
0.05
a maximum 5% chance of incorrectly rejecting the null hypothesis when it is actually true
Type II Error:
failing to reject (accepting) a null hypothesis that is actually false in the population
beta
probability of a Type II Error
Power
the probability of correctly rejecting a null hypothesis that is false; calculated as (1 - beta) and related to the sample size used in the study
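To see how power depends on sample size, here is a rough sketch. It assumes a two-sided comparison of two group means with a known, common SD and equal group sizes (a normal approximation, not an exact t-test calculation). The function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def approximate_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Assumes a known common SD and equal group sizes (normal approximation);
    the tiny probability of rejecting in the wrong direction is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)                  # about 1.96 for alpha = 0.05
    standard_error = sd * math.sqrt(2 / n_per_group)   # SE of the difference in means
    return z.cdf(abs(effect) / standard_error - z_crit)

# Hypothetical true difference of 0.5 units with SD = 1.0.
power_small = approximate_power(0.5, 1.0, n_per_group=16)
power_large = approximate_power(0.5, 1.0, n_per_group=64)
beta_large = 1 - power_large  # probability of a Type II Error

print(f"n=16 per group: power = {power_small:.2f}")
print(f"n=64 per group: power = {power_large:.2f}, beta = {beta_large:.2f}")
```

With everything else held fixed, the larger sample gives the higher power (lower beta).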
P-value
the probability, assuming that the null hypothesis is true, of seeing an effect as extreme or more extreme than that in the study by chance.
reject the null hypothesis if:
P-value is less than or equal to alpha
fail to reject the null hypothesis if:
P-value is greater than alpha
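The decision rule on these two cards can be written as a tiny function. The name `decide` is hypothetical.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject Ho when the P-value is <= alpha."""
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

print(decide(0.007))  # P-value below alpha
print(decide(0.20))   # P-value above alpha
```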
interpretation:
if a p-value is 0.007, this means that:
if the null hypothesis were true, the probability of obtaining data at least as extreme as those obtained in the experiment is 0.007.
Confidence intervals
a range of values around a sample statistic within which we are confident the true population parameter lies.
(Commonly 95%)
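A sketch of a 95% confidence interval for a mean, using the large-sample normal approximation (mean ± about 1.96 standard errors). This is an assumption made to keep the example in the standard library; for small samples a t-based interval would be somewhat wider. The data are made up.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(data, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(data)
    mean = statistics.mean(data)
    standard_error = statistics.stdev(data) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return mean - z * standard_error, mean + z * standard_error

# Hypothetical probing depths (mm) from 10 sites (made up).
probing_depths_mm = [2.5, 3.0, 3.5, 3.0, 4.0, 2.0, 3.5, 3.0, 2.5, 3.0]

lower, upper = mean_confidence_interval(probing_depths_mm)
print(f"sample mean = {statistics.mean(probing_depths_mm):.2f} mm")
print(f"approximate 95% CI: {lower:.2f} to {upper:.2f} mm")
```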
t-test
Statistical test that can be used to determine whether the mean value of a continuous outcome variable differs significantly between two independent groups.
t-test assumes:
approximate normal distribution of the variable of interest in the groups being compared.
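A sketch of the test statistic for two independent groups. It uses the pooled-variance (Student's) form, which additionally assumes similar variances in the two groups; that choice is an assumption here, not something the cards specify. Only the t statistic and degrees of freedom are computed; the P-value would come from a t table or a statistics package. The data are hypothetical.

```python
import math
import statistics

def pooled_t_statistic(group1, group2):
    """Two-sample t statistic with pooled variance, plus degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)  # sample variance (n - 1 denominator)
    var2 = statistics.variance(group2)
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / standard_error
    return t, n1 + n2 - 2

# Hypothetical plaque scores in two independent groups (made up).
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]

t_stat, df = pooled_t_statistic(group_a, group_b)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```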
one-sample t-test
can be used when the outcome variable of interest is only being examined in one group.
(testing difference from 0 or some given value)
matched-pair t-test
can be used when subjects are matched pairs and their outcomes are compared within each matched pair
(including where observations are taken on the same subjects before and after a given intervention)
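The one-sample and matched-pair tests share the same arithmetic: a matched-pair t-test is a one-sample t-test on the within-pair differences, tested against 0. A minimal sketch, with hypothetical function names and made-up before/after scores:

```python
import math
import statistics

def one_sample_t(data, mu0=0.0):
    """t statistic for testing whether the mean of data differs from mu0."""
    n = len(data)
    standard_error = statistics.stdev(data) / math.sqrt(n)
    return (statistics.mean(data) - mu0) / standard_error, n - 1

def matched_pair_t(before, after):
    """Matched-pair t statistic: one-sample test on the paired differences."""
    differences = [a - b for b, a in zip(before, after)]
    return one_sample_t(differences, mu0=0.0)

# Hypothetical gingival index for 4 subjects before and after an intervention.
before = [5, 6, 7, 8]
after = [4, 4, 5, 7]

t_stat, df = matched_pair_t(before, after)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```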
Chi-squared test
can be used to compare the proportion of subjects in each of two groups who have a dichotomous outcome
chi-squared example
comparing the presence of periodontitis in diabetics vs. non-diabetics
Ho (for chi-squared):
there is no association between row and column variables in a two-way table
(i.e. no association between having diabetes and periodontitis)
Ha (for chi-squared):
there is an association between row and column variables in a two-way table
(i.e. there is an association between having diabetes and having periodontitis)
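A sketch of the chi-squared calculation for a 2x2 table, comparing observed counts with the counts expected if Ho were true. The counts are hypothetical, no continuity correction is applied, and the P-value shortcut used here is valid only for 1 degree of freedom (a 2x2 table).

```python
import math

def chi_squared_2x2(table):
    """Pearson chi-squared statistic and P-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, chi-squared is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts. Rows: diabetic, non-diabetic.
# Columns: periodontitis present, periodontitis absent.
counts = [[30, 20],
          [20, 30]]

chi2, p_value = chi_squared_2x2(counts)
print(f"chi-squared = {chi2:.2f}, P-value = {p_value:.4f}")
```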
Analysis of Variance (ANOVA)
a statistical method that allows for comparison of several population means
ANOVA uses:
F-statistic
F-statistic
reject null hypothesis that the population means of all groups are equal if P-value of F-statistic is less than or equal to alpha (0.05)
F-statistic example
want to compare the strengths of composite A, composite B, and composite C to see if they are significantly different
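A sketch of how the F-statistic is built for the three-composite example: variation between the group means divided by variation within the groups. The strength values are made up, and only F and its degrees of freedom are computed; the P-value would come from an F table or a statistics package.

```python
import statistics

def one_way_anova_f(*groups):
    """F statistic for one-way ANOVA, with between and within degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical strength measurements (arbitrary units) for three composites.
composite_a = [81, 82, 83]
composite_b = [82, 83, 84]
composite_c = [85, 86, 87]

f_stat, df_between, df_within = one_way_anova_f(composite_a, composite_b, composite_c)
print(f"F = {f_stat:.2f} on ({df_between}, {df_within}) degrees of freedom")
```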
clinical significance
are findings important from a clinical standpoint?
statistical significance
how likely it is that chance alone is responsible for an observed difference
- p-values and/or confidence intervals
- sample size is important
(p-value says nothing about clinical relevance or quality of the study)
fundamental issue
quantifying our confidence in how well the findings reflect the truth (given that chance always plays a role)
Two main approaches to the Fundamental Issue
- Hypothesis testing and p-values.
- Confidence Interval Estimation
Limitations of statistical inference
it only tells you about the role of chance or random error in making inferences from your study population to the source population.
Statistical inference does/does not tell you about the role of bias or confounding?
does not
statistics do/do not tell you about causality?
do not
Bias
systematic error in the design, conduct, or analysis of a study that results in a mistaken estimate of an exposure’s effect on disease
Selection Bias
Systematic error in selecting subjects into one or more of the study groups, such as cases and controls, or exposed and unexposed.
Information Bias
Errors in procedures for gathering relevant information
Examples of information bias
bias in recall
in collecting data
in interview
in reporting
Confounding
Situation in which a non-causal association between a given exposure and an outcome is observed as a result of the influence of a third variable, usually designated a confounding variable or confounder.
A variable is a confounder if:
1. It is a known risk factor for the outcome
2. It is associated with the exposure but is not the result of the exposure
When evaluating confounding:
Is a covariate a confounder?
(ask what two questions?)
1. is it associated with exposure?
2. is it causally associated with outcome?
if answered YES to evaluation of confounding questions:
step 1: calculate the crude association
step 2: calculate the stratum-specific associations
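The two steps can be sketched with made-up counts. The odds ratio is used as the measure of association here, which is an assumption (the cards do not name one), and smoking is a hypothetical confounder. A crude association that differs from the stratum-specific ones suggests confounding.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio from the four cells of a 2x2 exposure-by-outcome table."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)

# Hypothetical counts, stratified by a suspected confounder (smoking).
# Each tuple: (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases)
strata = {
    "smokers": (80, 20, 40, 10),
    "non-smokers": (10, 40, 20, 80),
}

# Step 1: crude association, ignoring the confounder (strata added together).
crude_table = [sum(cells) for cells in zip(*strata.values())]
crude_or = odds_ratio(*crude_table)

# Step 2: stratum-specific associations.
stratum_or = {name: odds_ratio(*cells) for name, cells in strata.items()}

print(f"crude OR = {crude_or:.2f}")
for name, value in stratum_or.items():
    print(f"OR among {name} = {value:.2f}")
```

In this made-up example the crude odds ratio is well above 1 while both stratum-specific odds ratios equal 1, so the crude association is explained by the third variable.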
confounding is/is not an “all or none” phenomenon?
is not