Statistics/Epidemiology Flashcards

1
Q

Statistical inference

A
  • process of inferring features of the population from observation of a sample
2
Q

Biases

A

selection bias: study groups differ with respect to determinants of outcome other than those studied

  • best overcome with randomization

measurement bias: methods of measurement consistently different between groups

  • e.g. recall bias

confounding bias: two variables travel together and the effect of one is confused with that of the other

3
Q

Standard error of the mean (SEM)

A

definition: measure of the spread of sample means around the population mean

i.e. indicates how precisely the sample mean estimates the population mean

Formula: SE = SD of sample / square root of sample size
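
A minimal sketch of the SE formula above, using only the Python standard library. The function name `standard_error` and the blood pressure readings are made up for illustration.

```python
import math
import statistics

def standard_error(sample):
    """SE of the mean = sample SD / square root of sample size."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to estimate the SD")
    # statistics.stdev is the sample SD (n - 1 in the denominator)
    return statistics.stdev(sample) / math.sqrt(n)

# Hypothetical systolic blood pressure readings (mmHg)
readings = [118, 122, 130, 125, 121, 127, 119, 124]
print(f"mean = {statistics.mean(readings):.1f}, SE = {standard_error(readings):.2f}")
```

A larger sample gives a smaller SE for the same SD, which is why bigger samples estimate the population mean more precisely.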

4
Q

Confidence Interval

A

definition: range of values within which the true population parameter is believed to lie

e.g. a 99% CI means we are 99% confident that the interval contains the population mean

formula: 99% CI = sample mean +/- 2.58 x SE
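
A rough sketch of the CI formula, assuming the normal (z) multiplier is appropriate; for small samples a t multiplier would give a wider interval. The function name `confidence_interval` and the data are illustrative only. The multiplier is looked up from the standard normal distribution rather than hard-coded, and comes out at about 2.58 for a 99% CI.

```python
import math
import statistics
from statistics import NormalDist

def confidence_interval(sample, level=0.99):
    """Return (lower, upper) = sample mean +/- z x SE for the given confidence level."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    # Two-sided multiplier: roughly 1.96 for 95%, roughly 2.58 for 99%
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * se, mean + z * se

# Hypothetical readings, made up for illustration
readings = [118, 122, 130, 125, 121, 127, 119, 124]
low, high = confidence_interval(readings, level=0.99)
print(f"99% CI: {low:.1f} to {high:.1f}")
```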

5
Q

Z scores

A

definition: compares a sample mean with a known population mean by expressing the difference between the means as a multiple of the SE

formula: Z= (sample mean- pop. mean)/SEM
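
The Z formula can be sketched as below. The function name `z_score` and the example figures are assumptions for illustration; the SEM is computed from the sample SD and sample size.

```python
import math

def z_score(sample_mean, pop_mean, sample_sd, n):
    """Z = (sample mean - population mean) / SEM, with SEM = SD / sqrt(n)."""
    sem = sample_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / sem

# Hypothetical example: sample of 36 with mean 105 and SD 15,
# compared with a known population mean of 100
z = z_score(sample_mean=105, pop_mean=100, sample_sd=15, n=36)
print(f"Z = {z:.2f}")
```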

6
Q

Null hypothesis

A

H0: states there is no difference between the samples or populations being compared

ie. P1-P2= 0 or P1=P2

7
Q

Statistical significance

A

purpose: indicates how strong the evidence for a difference between 2 groups is, and whether the difference could have arisen by chance alone

significance level = alpha

  • commonly used levels are 5%, 1% and 0.1%
  • the smaller the value, the less likely the difference is due to chance

P values

  • the probability of observing a difference at least as large as the one in the study sample when there is no difference in the population
  • strength of the evidence in terms of probabilities
  • p = 0.05 (5%), p = 0.01 (1%), p = 0.001 (0.1%)
  • normally significant if <0.05
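
As an illustration of how a P value relates to the significance level, the sketch below converts a Z statistic into a two-sided P value under the standard normal distribution and compares it with alpha. The function names `two_sided_p` and `is_significant` are hypothetical, and a two-sided test is an assumption.

```python
from statistics import NormalDist

def two_sided_p(z):
    """Probability of a |Z| at least this large when the null hypothesis is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def is_significant(p, alpha=0.05):
    """Significant when the P value falls below the chosen significance level."""
    return p < alpha

for z in (1.0, 2.0, 3.0):
    p = two_sided_p(z)
    print(f"Z = {z}: p = {p:.4f}, significant at 5%: {is_significant(p)}")
```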
8
Q

Type I and type II errors

A

type I (alpha) error: false positive

  • the probability of detecting a difference when there is none
  • usually set at 0.05

type II (beta) error: false negative

  • the probability of not detecting a difference when one exists
  • usually set at 0.2 (giving 80% power)

power: depends on sig. level, size of difference, sample size

  • power= (1- beta)
  • the larger the power the smaller the type II error
9
Q

Student's t test

A

use: to compare the means of two small samples

t value= observed difference in means/SE of the difference in means

paired data t-tests: used to compare two small sets of paired observations

degrees of freedom: no. of independently varying quantities that can be assigned to a distribution
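
A sketch of the t value for two independent samples. It assumes the unpooled (Welch) form of the SE of the difference, sqrt(s1^2/n1 + s2^2/n2); the classic Student's test pools the variances instead. The function name `t_value` and the data are illustrative.

```python
import math
import statistics

def t_value(sample_a, sample_b):
    """t = observed difference in means / SE of the difference in means."""
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    # Unpooled SE of the difference (an assumption; see lead-in)
    se_diff = math.sqrt(
        statistics.variance(sample_a) / len(sample_a)
        + statistics.variance(sample_b) / len(sample_b)
    )
    return diff / se_diff

# Hypothetical measurements in two small groups
group_a = [5.1, 4.8, 5.6, 5.0, 5.3]
group_b = [5.9, 6.1, 5.7, 6.4, 6.0]
print(f"t = {t_value(group_a, group_b):.2f}")
```

The resulting t is then compared against the t distribution with the appropriate degrees of freedom.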

10
Q

Chi square

A

use: non-parametric test of differences in proportions (frequencies of categorical data) between two or more groups, based on the chi-square distribution

Chi^2 = Sum [(observed - expected)^2 / expected]
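
The chi-square sum can be sketched for a contingency table of counts. Expected counts are taken as (row total x column total) / grand total, the usual choice when testing independence; the function name `chi_square` and the counts are made up for illustration.

```python
def chi_square(table):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical 2x2 table: rows = exposed / non-exposed, columns = disease / no disease
counts = [[10, 20],
          [20, 10]]
print(f"chi-square = {chi_square(counts):.2f}")
```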

11
Q

Correlation

A

correlation coefficient (r): describes the strength of the linear relationship between variables

  • can range from -1 to +1

degree of association (by absolute value of r)

  • 0.8-1.0 strong
  • 0.5-0.8 moderate
  • 0.2-0.5 weak
  • 0-0.2 negligible
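
A small sketch of Pearson's r together with the strength bands above. The bands in the card overlap at their edges, so assigning a boundary value to the stronger band is an assumption; the function names are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def degree_of_association(r):
    """Map |r| onto the bands from the card (boundaries go to the stronger band)."""
    size = abs(r)
    if size >= 0.8:
        return "strong"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "negligible"

# Hypothetical paired measurements
heights = [150, 160, 165, 172, 180]
weights = [52, 58, 63, 70, 79]
r = pearson_r(heights, weights)
print(f"r = {r:.2f} ({degree_of_association(r)})")
```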
12
Q

Regression

A

definition: describes the relationship between 2 variables and how the value of one varies with the other

formula: y = a + bx (a = intercept, b = slope)

slope values: -infinity to +infinity

  • slope of 0 represents no relationship
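
The line y = a + bx can be estimated by ordinary least squares, sketched below. Least squares is an assumption (the card does not name a fitting method), and the function name `fit_line` and the data are made up.

```python
def fit_line(xs, ys):
    """Least-squares estimates of intercept a and slope b in y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope: change in y per unit change in x
    a = mean_y - b * mean_x  # intercept: predicted y when x = 0
    return a, b

# Hypothetical dose (x) and response (y) values
dose = [0, 1, 2, 3, 4]
response = [1.1, 2.9, 5.2, 6.8, 9.1]
a, b = fit_line(dose, response)
print(f"y = {a:.2f} + {b:.2f}x")
```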
13
Q

Rates

A

incidence= no. of new cases in a given period/population at risk during this period

prevalence= total no. of cases in a population at one time/total population at risk at the time

mortality rate= no. of deaths in 1 yr/total population mid-year x 1000

proportionate mortality rate= no. deaths due to cause in period of time/total no. of deaths in same time x 100

standardised mortality ratio= observed no. of deaths in population/expected no. of deaths in population x 100

  • if >100 then more events are occurring than expected
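
The rate formulas reduce to simple ratios, sketched below. The SMR is multiplied by 100 so that it matches the ">100" rule; all function names and figures are hypothetical.

```python
def incidence(new_cases, population_at_risk):
    """New cases in a period / population at risk during that period."""
    return new_cases / population_at_risk

def prevalence(existing_cases, population_at_risk):
    """All cases at one point in time / population at risk at that time."""
    return existing_cases / population_at_risk

def mortality_rate_per_1000(deaths_in_year, mid_year_population):
    """Deaths in 1 yr / mid-year population x 1000."""
    return deaths_in_year / mid_year_population * 1000

def standardised_mortality_ratio(observed_deaths, expected_deaths):
    """Observed / expected deaths x 100; above 100 means more deaths than expected."""
    return observed_deaths / expected_deaths * 100

# Hypothetical town of 10,000 people
print(mortality_rate_per_1000(50, 10_000))    # deaths per 1000 per year
print(standardised_mortality_ratio(120, 100)) # SMR on the x100 scale
```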
14
Q

Meta analysis

A

definition: combined analysis of data from two or more similar studies to reach an overall conclusion

  • results expressed as odds ratio or relative risk
15
Q

Measure of effect

A

absolute risk: incidence of the outcome in the exposed group

relative risk: incidence rate in exposed/incidence rate in non-exposed

  • measures strength of association between exposure and outcome

attributable risk: incidence exposed- incidence non-exposed

absolute risk reduction (ARR): incidence rate in control- incidence in exposed

relative risk reduction (RRR): (1-RR) x100%

  • i.e. percentage of the baseline risk removed by the intervention

number needed to treat (NNT): 1/ARR (ARR expressed as a proportion)

  • number needed to treat to prevent one event

odds ratio (OR): odds of an event in exposed/odds of an event in non-exposed, where odds= prob of an event/(1- prob of an event)

  • used for case-control study

hazard ratio (HR): measure of RR in survival studies

  • HR>1 suggests one group is more likely to experience event
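
A worked sketch of these measures from the counts of a two-arm trial. NNT is computed as 1/ARR (the standard definition, with ARR as a proportion), and the OR as the ratio of the odds in the two groups. The function name `effect_measures` and the trial counts are hypothetical.

```python
import math

def effect_measures(events_treated, n_treated, events_control, n_control):
    """Return RR, ARR, RRR (%), NNT and OR for a treated vs control comparison."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    rr = risk_treated / risk_control
    arr = risk_control - risk_treated  # incidence in control - incidence in treated
    odds_treated = risk_treated / (1 - risk_treated)
    odds_control = risk_control / (1 - risk_control)
    return {
        "RR": rr,
        "ARR": arr,
        "RRR_percent": (1 - rr) * 100,
        # NNT is undefined when there is no risk difference
        "NNT": 1 / arr if arr != 0 else math.inf,
        "OR": odds_treated / odds_control,
    }

# Hypothetical trial: 10/100 events on treatment vs 20/100 on control
for name, value in effect_measures(10, 100, 20, 100).items():
    print(f"{name}: {value:.2f}")
```

In this made-up trial the treatment halves the risk (RR 0.5, RRR 50%), the ARR is 0.1, and about 10 patients need treatment to prevent one event.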
16
Q

Strength of studies for cause and effect

A

From strongest to weakest:

  • clinical trial
  • cohort study
  • case-control study
  • cross sectional
  • case series
  • case report
17
Q

Validity

A

sensitivity= TP/ (TP+FN) x100

  • ability to correctly detect people with disease
  • SnOUT: with high sensitivity, a negative test rules out the diagnosis

specificity= TN/ (TN+FP) x100

  • ability to correctly detect people without disease
  • SpIN: with high specificity, a positive result rules in the diagnosis

positive predictive value= TP/ (TP+FP) x100

  • probability that a person with a positive test actually has the disease
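
The validity formulas can be checked against a 2x2 table of test results, as in this sketch. The function name `diagnostic_accuracy` and the counts are made up; the negative predictive value is included for completeness and follows the same pattern as the positive one.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": tp / (tp + fn) * 100,  # diseased people who test positive
        "specificity": tn / (tn + fp) * 100,  # healthy people who test negative
        "PPV": tp / (tp + fp) * 100,          # positive tests that are true positives
        "NPV": tn / (tn + fn) * 100,          # negative tests that are true negatives
    }

# Hypothetical screening study: 100 diseased, 100 healthy
results = diagnostic_accuracy(tp=90, fp=20, fn=10, tn=80)
for name, value in results.items():
    print(f"{name}: {value:.1f}%")
```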

18
Q

Normal distribution

A

Normal distribution

  • Mode=median=mean

Data within each SD of the mean

+/- 1 SD: 68%

+/- 2 SD: 95%

+/- 3 SD: 99.7%
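
These coverage figures can be reproduced from the standard normal distribution with the sketch below (function name `coverage_within` is illustrative). The exact values are close to 68.3%, 95.4% and 99.7%; exactly 95% corresponds to about 1.96 SD.

```python
from statistics import NormalDist

def coverage_within(k):
    """Proportion of a normal distribution lying within +/- k SD of the mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"+/- {k} SD: {coverage_within(k) * 100:.1f}%")
```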

19
Q

Study design

A

cross-sectional study

  • freq of a disease or risk factor (RF) in a population at a given time
  • cannot determine causation

cohort study

  • observational study of a group for development of disease
  • good for common diseases
  • can be prospective/retrospective

case-control study

  • comparison of cases with controls to determine difference in groups
  • good for rare diseases
20
Q

Clinical trials

A

phase 1: pharmacology/toxicity

phase 2: treatment efficacy

phase 3: compare with gold standard

phase 4: post-marketing surveillance

21
Q

Formulas

A

PPV: a/(a+b)

NPV: d/(c+d)

NNT: 1/((a/(a+b)) - (c/(c+d)))

22
Q

Bioavailability

A