Statistics/Epidemiology Flashcards
Statistical inference
- process of inferring features of the population from observation of a sample
Biases
selection bias: study groups differ with respect to determinants of outcome other than those studied
- best overcome with randomization
measurement bias: methods of measurement consistently different between groups
- e.g. recall bias
confounding bias: two variables travel together and the effect of one is confused with that of the other
Standard error of the mean (SEM)
definition: measure of the spread of sample means around the population mean
i.e. indicates how precisely the sample mean estimates the population mean
formula: SE = SD of sample / square root of sample size
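A minimal Python sketch of the SE formula above. It assumes the sample SD is calculated with the n - 1 denominator; the function name and the blood pressure readings are hypothetical, made up for illustration.

```python
import math
import statistics


def standard_error(sample):
    """SE = sample SD / sqrt(n), using the n - 1 (sample) standard deviation."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(sample) / math.sqrt(n)


# Hypothetical systolic blood pressure readings (mmHg), illustration only.
readings = [118, 122, 125, 130, 135]
print(round(standard_error(readings), 2))
```

Because the sample size sits under a square root, quadrupling the sample size only halves the SE.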
Confidence Interval
definition: range of values within which the true population parameter (e.g. the mean) is believed to lie
e.g. a 99% CI means we are 99% confident that the interval contains the population mean
formula: 99% CI = sample mean +/- 2.58 x SE (a 95% CI uses 1.96 x SE)
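A rough sketch of the CI calculation in Python, assuming a normal (z) multiplier taken from the standard library rather than a rounded table value (about 2.576 for 99%, 1.960 for 95%). For small samples a t multiplier would usually be more appropriate; the function name and data are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def confidence_interval(sample, level=0.99):
    """Return (lower, upper) for the mean using a normal (z) multiplier."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided multiplier: leaves (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return mean - z * se, mean + z * se


# Hypothetical readings, illustration only.
readings = [118, 122, 125, 130, 135]
lower, upper = confidence_interval(readings, level=0.99)
print(round(lower, 1), round(upper, 1))
```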
Z scores
definition: compares a sample mean with a known population mean by expressing the difference between the means as a multiple of the SE
formula: Z = (sample mean - pop. mean) / SEM
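The Z formula can be sketched as below. This assumes the population SD is known and is used to form the SEM; the helper names and the numbers are hypothetical. The two-sided p value is included only to show how Z links to the p values covered later.

```python
import math
from statistics import NormalDist


def z_score(sample_mean, pop_mean, pop_sd, n):
    """Z = (sample mean - population mean) / SEM, with SEM = pop_sd / sqrt(n)."""
    sem = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / sem


def two_sided_p(z):
    """Probability of a Z at least this extreme in either direction under H0."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical example: sample of 25 with mean 126, population mean 120, SD 15.
z = z_score(126, 120, 15, 25)
print(round(z, 2), round(two_sided_p(z), 4))
```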
Null hypothesis
H0: states there is no difference between the samples or populations being compared
i.e. P1 - P2 = 0, or P1 = P2
Statistical significance
purpose: indicates how strong the evidence for a difference between 2 groups is and whether that difference could have arisen by chance alone
significance level = alpha
- conventional levels are 5%, 1% and 0.1%
- the smaller the value the less likely the difference is due to chance
P values
- the probability of observing a difference at least as large as the one seen in the study sample when there is no difference in the population
- strength of the evidence in terms of probabilities
- common thresholds: p < 0.05 (5%), p < 0.01 (1%), p < 0.001 (0.1%)
- conventionally significant if p < 0.05
Type I and type II errors
type I (alpha) error: false positive
- the probability of detecting a difference when there is none
- usually set at 0.05
type II (beta) error: false negative
- the probability of not detecting a difference when one exists
- usually set at 0.2 (i.e. power of 80%)
power: probability of detecting a difference that truly exists; depends on significance level, size of difference, sample size
- power= (1- beta)
- the larger the power the smaller the type II error
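To illustrate how power depends on significance level, size of difference and sample size, here is an approximate sketch for a two-sided one-sample z test. The choice of test, the function name and the numbers are assumptions for illustration; real studies normally use dedicated power software.

```python
import math
from statistics import NormalDist


def z_test_power(difference, sd, n, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided one-sample z test."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)          # critical value for alpha
    shift = abs(difference) / (sd / math.sqrt(n))  # true difference in SE units
    # Probability that the test statistic lands beyond either critical value.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Hypothetical example: true difference of 5 units, SD 15.
for n in (20, 71, 200):
    power = z_test_power(5, 15, n)
    print(n, round(power, 2), "beta =", round(1 - power, 2))
```

In this example a sample of about 71 gives a power close to 0.8; larger samples raise the power and shrink beta.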
Student's t test
use: to compare the means of two small samples
t value = observed difference in means / SE of the difference in means
paired t-test: used to compare two small sets of paired observations (e.g. the same subjects before and after treatment)
degrees of freedom: no. of values that are free to vary once the sample statistics are fixed (e.g. n - 1 for one sample, n1 + n2 - 2 for two independent samples)
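A small sketch of the t value for two independent samples. It assumes the classic pooled-variance (equal variance) form of Student's test with n1 + n2 - 2 degrees of freedom; the function name and data are hypothetical.

```python
import math
import statistics


def students_t(sample_a, sample_b):
    """Return (t, df) for two independent samples using a pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * statistics.variance(sample_a)
        + (n_b - 1) * statistics.variance(sample_b)
    ) / df
    # SE of the difference in means.
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return diff / se_diff, df


# Hypothetical measurements from two small groups.
t, df = students_t([5, 6, 7, 8, 9], [3, 4, 5, 6, 7])
print(round(t, 2), df)
```

The t value is then compared with the t distribution for the given degrees of freedom to obtain a p value.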
Chi square
use: non-parametric test for differences in proportions (frequencies of categorical data) between two or more groups, based on the chi-square distribution
Chi^2 = Sum [ (observed - expected)^2 / expected ]
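The chi-square sum can be sketched for a table of counts as follows. Expected counts are assumed to come from the row and column totals (row total x column total / grand total), and no continuity correction is applied; the function name and the counts are hypothetical.

```python
def chi_square(observed):
    """Return (chi2, df) for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical 2x2 table: rows = groups, columns = outcome yes / no.
table = [[20, 30],
         [30, 20]]
chi2, df = chi_square(table)
print(round(chi2, 2), df)
```

With 1 degree of freedom, a value above 3.84 corresponds to p < 0.05.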
Correlation
correlation coefficient (r): describes the strength of the linear relationship between variables
- can range from -1 to +1
degree of association (by absolute value of r)
- 0.8-1.0 strong
- 0.5-0.8 moderate
- 0.2-0.5 weak
- 0-0.2 negligible
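A short sketch of Pearson's r with the strength bands listed above applied to its absolute value. Which band a boundary value (e.g. exactly 0.5) falls into is an assumption here, since the bands overlap at their edges; the function names and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe_strength(r):
    """Map |r| to the bands in the card (boundaries go to the higher band)."""
    size = abs(r)
    if size >= 0.8:
        return "strong"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "negligible"


# Hypothetical paired data, e.g. age vs. blood pressure.
r = pearson_r([30, 40, 50, 60, 70], [115, 122, 121, 130, 138])
print(round(r, 2), describe_strength(r))
```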
Regression
definition: describes the relationship between 2 variables and how one (dependent, y) changes as the other (independent, x) changes
formula: y = a + bx (a = intercept, b = slope)
values: slope b can range from -infinity to +infinity
- slope of 0 represents no relationship
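The intercept a and slope b in the formula above can be estimated by ordinary least squares, sketched here in plain Python. Least squares is an assumption (the card does not name a fitting method), and the function name and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope: change in y per unit change in x
    a = mean_y - b * mean_x  # intercept: predicted y when x = 0
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```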
Rates
incidence= no. of new cases in a given period/population at risk during this period
prevalence= total no. of cases in a population at one time/total population at risk at the time
mortality rate= no. of deaths in 1 yr/total population mid-year x 1000
proportionate mortality rate= no. deaths due to cause in period of time/total no. of deaths in same time x 100
standardised mortality ratio (SMR) = observed no. of deaths in population / expected no. of deaths in population x 100
- if >100 then more deaths are occurring than expected
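The rate formulas above, sketched as small Python helpers. The SMR is expressed as a percentage (x 100) so that 100 means observed equals expected, in line with the ">100" rule; all function names and counts are hypothetical.

```python
def incidence(new_cases, population_at_risk):
    """New cases in a period / population at risk during that period."""
    return new_cases / population_at_risk


def prevalence(existing_cases, population_at_risk):
    """All cases at one time / population at risk at that time."""
    return existing_cases / population_at_risk


def mortality_rate_per_1000(deaths_in_year, mid_year_population):
    return 1000 * deaths_in_year / mid_year_population


def proportionate_mortality(deaths_from_cause, total_deaths):
    """Percentage of all deaths in the period that are due to one cause."""
    return 100 * deaths_from_cause / total_deaths


def smr(observed_deaths, expected_deaths):
    """Standardised mortality ratio as a percentage (100 = as expected)."""
    return 100 * observed_deaths / expected_deaths


# Hypothetical town of 10,000 people.
print(mortality_rate_per_1000(50, 10_000))
print(proportionate_mortality(25, 100))
print(smr(30, 20))
```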
Meta analysis
definition: combined analysis of data from two or more similar studies to reach an overall (pooled) conclusion
- results expressed as odds ratio or relative risk
Measure of effect
absolute risk: incidence of the outcome in the exposed group
relative risk (RR): incidence rate in exposed / incidence rate in non-exposed
- measures strength of association between exposure and outcome
attributable risk: incidence exposed- incidence non-exposed
absolute risk reduction (ARR): incidence in control group - incidence in treated group
relative risk reduction (RRR): (1 - RR) x 100%
- i.e. percentage of the baseline risk removed by the treatment
number needed to treat (NNT): 1/ARR (ARR as a proportion, not a percentage)
- number of patients who need to be treated to prevent one event
odds ratio (OR): odds of the event in exposed / odds of the event in non-exposed, where odds = prob of an event / (1 - prob of an event)
- used for case-control studies
hazard ratio (HR): ratio of the event (hazard) rates in two groups over time; the survival-study analogue of RR
- HR>1 suggests the first (numerator) group is more likely to experience the event at any given time
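A sketch that works the main measures of effect from one 2x2 table. It takes NNT as 1/ARR with ARR as a proportion, and OR as the ratio of the odds in the two groups; the function name and the trial counts are hypothetical.

```python
def effect_measures(treated_events, treated_total, control_events, control_total):
    """Return RR, ARR, RRR (%), NNT and OR for a treated vs. control comparison."""
    risk_treated = treated_events / treated_total
    risk_control = control_events / control_total
    rr = risk_treated / risk_control
    arr = risk_control - risk_treated
    rrr_percent = (1 - rr) * 100
    # NNT is undefined when there is no risk difference.
    nnt = 1 / arr if arr != 0 else float("inf")
    odds_treated = treated_events / (treated_total - treated_events)
    odds_control = control_events / (control_total - control_events)
    return {
        "RR": rr,
        "ARR": arr,
        "RRR_percent": rrr_percent,
        "NNT": nnt,
        "OR": odds_treated / odds_control,
    }


# Hypothetical trial: 10/100 events on treatment, 20/100 on control.
for name, value in effect_measures(10, 100, 20, 100).items():
    print(name, round(value, 3))
```

In this example the treatment halves the risk (RR 0.5, RRR 50%), yet the ARR is only 0.1, so about 10 patients need to be treated to prevent one event.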