Epidemiology & Biostatistics Flashcards

1
Q

Define Bias

A

Systematic error in the design, conduct or analysis of a study that causes a mistaken estimate of the exposure’s effect on the outcome.

2
Q

Explain design bias

A

Wrongly chosen sampling strategy or study design.

3
Q

Explain conduct bias

A

Case enrollment, follow-up or data collection is not carried out properly.

4
Q

Explain analysis bias

A

The chosen statistical methods are wrong, variables are miscategorized, or modelling assumptions are violated.

5
Q

What three types of bias are there?

A

Selection bias, confounding, and information bias (also called measurement or misclassification bias).

6
Q

Define confounding

A

A confounder is a variable that influences both the independent variable (exposure) and the dependent variable (outcome), causing a spurious association between them.

7
Q

Define sensitivity

A

The proportion of actual positives that are correctly identified as such. (Also called the true positive rate, recall, or probability of detection in some fields.)

8
Q

Define specificity

A

The proportion of actual negatives that are correctly identified as such. (Also called the true negative rate.)
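
Both sensitivity and specificity reduce to simple ratios from a 2x2 table of test results against true disease status. The sketch below shows the arithmetic; the function names and the counts are hypothetical and chosen only for illustration.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: diseased people who test positive / all diseased."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: healthy people who test negative / all healthy."""
    return tn / (tn + fp)


# Hypothetical 2x2 table: 100 diseased and 200 healthy people.
tp, fn = 90, 10    # diseased: test positive / test negative
tn, fp = 160, 40   # healthy:  test negative / test positive

print(sensitivity(tp, fn))  # 0.9
print(specificity(tn, fp))  # 0.8
```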

9
Q

What are the four main aspects of infection control?

A

Surveillance (passive/active), patient contact, hygiene, education/awareness

10
Q

How does the normal distribution relate to the standard deviation?

A

About 68% of values lie within 1 SD of the mean and about 95.5% within 2 SDs, leaving roughly 2.3% in each tail beyond 2 SDs.
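
These percentages can be checked from the standard normal distribution, since P(|Z| <= k) = erf(k / sqrt(2)). A minimal sketch using only the Python standard library; the helper name `within_k_sd` is an illustrative assumption.

```python
import math


def within_k_sd(k: float) -> float:
    """Probability that a normally distributed value lies within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2):
    inside = within_k_sd(k)
    each_tail = (1 - inside) / 2  # the remainder is split evenly between the two tails
    print(f"{k} SD: {inside:.1%} inside, {each_tail:.1%} in each tail")
```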

11
Q

Range of P-value and the usual significance level

A

Ranges from 0 to 1; the significance level is commonly 0.05.

12
Q

What is Type II error?

A

Concluding there is no difference when in truth there is one, i.e. failing to reject a false null hypothesis.

13
Q

What is Type I error?

A

Concluding that there is a difference when in fact there is none, i.e. rejecting a true null hypothesis.

14
Q

Explain power and what affects it

A

The ability of a test to find a difference when there really is one, i.e. the probability of a true positive (1 - beta). Power is higher when the outcome difference is large, the significance level (alpha) is higher, sampling variability is low and the sample size is large.
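
How each factor moves power can be seen in a rough calculation. The sketch below assumes a two-sided comparison of two group means with a known common SD and uses a normal approximation that ignores the negligible opposite tail; `approx_power` and all input values are hypothetical.

```python
from statistics import NormalDist


def approx_power(delta: float, sd: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided two-sample z-test (normal approximation)."""
    z = NormalDist()
    se = sd * (2 / n_per_group) ** 0.5   # standard error of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)    # critical value for the chosen alpha
    return z.cdf(abs(delta) / se - z_crit)


base = approx_power(delta=5, sd=10, n_per_group=64)
print(f"baseline:        {base:.2f}")
print(f"larger effect:   {approx_power(10, 10, 64):.2f}")
print(f"higher alpha:    {approx_power(5, 10, 64, alpha=0.10):.2f}")
print(f"less variability:{approx_power(5, 5, 64):.2f}")
print(f"larger sample:   {approx_power(5, 10, 128):.2f}")
```

Each of the four variations should print a power above the baseline, matching the directions listed in the card.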

15
Q

What can linear regression be used for? How do you assess significance and fit?

A

To explore the linear relationship between two continuous variables, assuming normally distributed residuals with equal variance. Use the p-value of the slope for significance and R-squared (between 0 and 1, often expressed as a percentage) to judge how well the model fits the data.
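
As an illustration of the slope and R-squared, here is a minimal least-squares sketch in plain Python. The function name `fit_line` and the data are hypothetical, and the p-value of the slope is not computed here because it needs a t distribution, which a statistics package would normally supply.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_resid / ss_total  # share of the variance in y explained by x
    return slope, intercept, r_squared


# Hypothetical data: age in years (x) and systolic blood pressure in mmHg (y).
age = [30, 40, 50, 60, 70]
sbp = [118, 124, 131, 135, 144]
slope, intercept, r2 = fit_line(age, sbp)
print(f"slope={slope:.2f}, intercept={intercept:.1f}, R^2={r2:.3f}")
```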

16
Q

Uses of logistic regression and type of curve, results and significance?

A

To model the log of the odds of the binary outcome we are interested in as a linear function of one or more predictors, X. The predicted probability follows a sigmoid curve. Results are reported either as coefficients (betas) or as odds ratios calculated from them (exp(coefficient)), together with confidence intervals or p-values.
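
The link between a coefficient, its odds ratio and the sigmoid curve can be sketched as follows. The function names, the coefficient and the standard error are hypothetical values, and the interval uses the usual normal approximation on the log-odds scale.

```python
import math


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96):
    """Odds ratio exp(beta) with an approximate 95% CI computed on the log-odds scale."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def predicted_probability(intercept: float, beta: float, x: float) -> float:
    """Sigmoid (inverse logit) of the linear predictor intercept + beta * x."""
    return 1 / (1 + math.exp(-(intercept + beta * x)))


# Hypothetical fitted model: log-odds = -2.0 + 0.7 * x, with SE(beta) = 0.25.
or_, lower, upper = odds_ratio_with_ci(beta=0.7, se=0.25)
print(f"OR = {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
for x in (0, 2, 4, 6):
    print(x, round(predicted_probability(-2.0, 0.7, x), 3))
```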

17
Q

What tests can you use for testing or modeling means of continuous data?

A

t-test, ANOVA and linear regression

18
Q

What should you know to estimate the correct sample size?

A

Significance level (alpha = 0.05; remember the type I versus type II trade-off: the lower the level, the larger the sample), Power (1 - beta, sensitivity), Variance (i.e. how precise are your measurements), Effect size (smaller effect, larger sample)
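
These four inputs combine in a standard normal-approximation formula for comparing two means, n per group = 2 (z_(1-alpha/2) + z_(1-beta))^2 * SD^2 / delta^2. A rough sketch under that assumption; `n_per_group` and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group for a two-sided comparison of two means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # grows as the significance level shrinks
    z_beta = z.inv_cdf(power)           # grows as the desired power rises
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


print(n_per_group(delta=5, sd=10))               # baseline
print(n_per_group(delta=2.5, sd=10))             # smaller effect, larger sample
print(n_per_group(delta=5, sd=10, alpha=0.01))   # lower alpha, larger sample
print(n_per_group(delta=5, sd=10, power=0.90))   # higher power, larger sample
```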

19
Q

Ten Steps of an Outbreak Investigation

A
  1. Determine the existence of the outbreak
  2. Confirm the diagnosis
  3. Define a case and count cases
  4. Orient the data in terms of time, place, and person
  5. Determine who is at risk of becoming ill
  6. Develop a hypothesis that explains the exposure that caused disease and test this hypothesis
  7. Compare the hypothesis with the established facts
  8. Plan a more systematic study
  9. Prepare a written report
  10. Execute control and prevention measures
20
Q

Suppose we want to estimate the survival of premature infants born at 25 weeks of gestation and we compute a 95% CI that ranges from 64.3% to 89.5%. How can we interpret this interval?

A

“We are 95% sure that this interval 64.3% to 89.5% contains the overall proportion of surviving infants in the population.” Or another way, “We are 95% sure that the true proportion of survival for infants born at 25 weeks of gestation is between 64.3% and 89.5%”
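
For illustration, an interval of this kind can be computed with the simple Wald (normal approximation) method. The counts below are hypothetical and are not meant to reproduce the interval quoted in the card, which may have been calculated with a different method; `proportion_ci` is an illustrative name.

```python
from statistics import NormalDist


def proportion_ci(successes: int, n: int, confidence: float = 0.95):
    """Wald confidence interval for a proportion: p +/- z * sqrt(p * (1 - p) / n)."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * (p * (1 - p) / n) ** 0.5
    return p, p - margin, p + margin


# Hypothetical sample: 31 of 39 infants survived.
p, lower, upper = proportion_ci(31, 39)
print(f"estimate {p:.1%}, 95% CI {lower:.1%} to {upper:.1%}")
```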

21
Q

Describe nominal data

A

No natural order - gender, race, blood type

22
Q

Describe ordinal data

A

Natural order - tumor scales, social class

23
Q

Describe binary data

A

Two possible values (0/1) - disease status, diagnostic test result

24
Q

Describe categorical data and the tests to use with it

A

Either nominal, ordinal or binary. Summarise with counts and proportions, plot with pie and bar charts, analyse with confidence intervals, chi-square test, McNemar's test (paired), logistic regression (multiple)

25
Q

What classes fall into quantitative data?

A

Discrete (integer counts) and continuous (can take any real value)

26
Q

What methods can you use for discrete count data?

A

Summarise: rates
Plot: trend plots, histograms
Analyse: confidence intervals, Poisson regression

27
Q

What methods can you use for continuous data?

A

Summarise: mean, SD, median, interquartile range.
Plot: histogram, scatter plot, box plot, dot plot
Analyse: confidence interval, t-test, ANOVA, correlation, simple and multiple linear regression

28
Q

Time-to-Event methods?

A

Summary: median survival time, five-year survival, hazard ratio (HR)
Plot: Kaplan-Meier curves
Analyse: confidence interval of HR, log-rank test, proportional hazards regression (PHR = Cox regression)
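
The Kaplan-Meier curve listed above is a running product: at each event time the survival estimate is multiplied by (1 - deaths / number at risk), and censored subjects simply leave the risk set. A minimal sketch with hypothetical follow-up data; `kaplan_meier` is an illustrative name, not a library function.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if the subject was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return curve


# Hypothetical follow-up in months; the subject at month 2 was censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```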

29
Q

Factors that affect estimate precision

A

Variability of the outcome, sample size and desired confidence level
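
All three factors appear in the margin of error of a mean, z * SD / sqrt(n). The sketch below assumes a normal approximation with a known SD; `margin_of_error` and the numbers are hypothetical.

```python
from statistics import NormalDist


def margin_of_error(sd: float, n: int, confidence: float = 0.95) -> float:
    """Half-width of a normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / n ** 0.5


print(round(margin_of_error(sd=10, n=100), 2))                   # baseline
print(round(margin_of_error(sd=20, n=100), 2))                   # more variability, wider
print(round(margin_of_error(sd=10, n=400), 2))                   # larger sample, narrower
print(round(margin_of_error(sd=10, n=100, confidence=0.99), 2))  # higher confidence, wider
```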