Biostats All Flashcards

1
Q

observation

A

also known as a record: a row in a table of data. It represents one person

2
Q

variable

A

is a column in a table of data. It contains information about one characteristic of the person (race/gender/DOB)

3
Q

quantitative/continuous variables examples and definitions

A
  • ratio-scale: an interval variable with a true zero point (height, BP, duration of illness, # of children)
  • interval: value on a scale of equally spaced units with no true zero point (DOB, temperature in °C or °F)
4
Q

qualitative/categorical variables examples and definitions

A
  • nominal: values with no numerical ranking (residence). These can be dichotomous variables (alive/dead, smoker/non-smoker)
  • ordinal: has values that can be ranked but are not evenly spaced (stage of cancer, education level, BMI category)
5
Q

properties of frequency distributions are

A
  • central location (where the distribution has its peak)
  • spread (how widely it is dispersed on both sides of the peak)
  • shape (is it symmetrical on both sides of the peak)
6
Q

how do you describe the central location

A

mean
median
mode
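
A minimal sketch of the three measures, using Python's standard `statistics` module. The `durations` sample is hypothetical and deliberately skewed to the right, so the single large value pulls the mean above the median.

```python
import statistics

# Hypothetical, positively skewed sample (e.g. days of illness).
durations = [2, 3, 3, 5, 12]

mean_value = statistics.mean(durations)      # arithmetic average
median_value = statistics.median(durations)  # middle value of the sorted data
mode_value = statistics.mode(durations)      # most frequent value

print(mean_value, median_value, mode_value)
```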

7
Q

how do you describe spread

A

range
interquartile range
standard deviation

8
Q

when is a graph positively skewed

A

when its central location is to the left and its tail is to the right (aka graph is skewed to the right)

9
Q

what is the IQR

A

it represents the central portion of the distribution, from the 25th to the 75th percentile
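
A small sketch of computing the IQR with the standard `statistics` module. The `ages` sample is made up, and the quartile convention (`method="inclusive"`) is an assumption: other software may use slightly different rules and give slightly different quartiles.

```python
import statistics

# Hypothetical sample of ages.
ages = [21, 22, 23, 24, 25, 26, 27, 28, 29]

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = statistics.quantiles(ages, n=4, method="inclusive")

iqr = q3 - q1  # width of the middle 50% of the data
print(q1, q3, iqr)
```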

10
Q

how to calculate standard deviation

A

Calculate the arithmetic mean.
Subtract the mean from each observation.
Square the difference. Sum the squared differences.
Divide the sum of the squared differences by n–1.
Take the square root of the value obtained.
The result is the standard deviation.
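
The steps above can be sketched as a short function. The function name and the sample values are illustrative only; the calculation follows the card (divide by n-1, i.e. the sample standard deviation).

```python
import math


def sample_standard_deviation(observations):
    """Sample standard deviation, following the steps on the card."""
    n = len(observations)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(observations) / n                        # arithmetic mean
    squared_diffs = [(x - mean) ** 2 for x in observations]
    variance = sum(squared_diffs) / (n - 1)             # divide by n - 1
    return math.sqrt(variance)                          # square root


# Hypothetical observations.
values = [2, 4, 4, 4, 5, 5, 7, 9]
sd = sample_standard_deviation(values)
print(round(sd, 3))
```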

11
Q

define range

A

The range of a set of data is the difference between its largest (maximum) value and its smallest (minimum) value.

12
Q

define probability

A

a measure of how likely it is that an event occurs

13
Q

define odds

A

ratio of the probability of having an event to the probability of not having an event: P / (1 - P)

14
Q

relationship between probability and odds

A

the lower the absolute probability P (risk), the closer the odds are to the probability
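
To see this numerically, here is a small sketch that converts a few probabilities to odds using odds = P / (1 - P). The function name and the chosen probabilities are illustrative.

```python
def odds_from_probability(p):
    """Convert a probability p (0 <= p < 1) to odds p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


# Odds stay close to the probability when p is small and drift away as p grows.
for p in (0.01, 0.10, 0.50, 0.90):
    print(f"P = {p:.2f}  odds = {odds_from_probability(p):.3f}")
```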

15
Q

how to calculate risk and odds from a table

A

risk: events / total (events + non-events)
odds: events / non-events
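
A minimal sketch of both calculations from counts, assuming the denominator for risk is everyone in the group (events plus non-events). The counts are hypothetical.

```python
def risk_and_odds(events, non_events):
    """Return (risk, odds) for one group of a table."""
    total = events + non_events
    risk = events / total       # events out of everyone in the group
    odds = events / non_events  # events relative to non-events
    return risk, odds


# Hypothetical group: 20 people had the event, 80 did not.
risk, odds = risk_and_odds(events=20, non_events=80)
print(risk, odds)
```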

16
Q

proportion

A

a ratio in which the denominator includes the numerator

17
Q

ratio

A

is a number that expresses the relative size of two other numbers. Numerator is not in the denominator

18
Q

rate

A

occurrence of events over a specific time interval. Or the measure of frequency of some phenomenon of interest

19
Q

prevalence

A

existing cases of a disease in a given pop at a specific time (usually expressed as a proportion of that pop)

20
Q

incidence

A
# of new cases of a disease during a period / healthy pop (at risk) at the beginning of the period
- proportion of a pop that acquires the disease in a period of time
21
Q

incidence rate

A

new cases / total person time of observation
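
A small sketch contrasting incidence (a proportion) with the incidence rate (per unit of person-time). The follow-up records are invented for illustration: each pair is (years observed, whether the person became a case).

```python
# Hypothetical follow-up data: (years at risk, became a case?)
follow_up = [(2.0, True), (5.0, False), (3.5, True), (5.0, False), (4.5, False)]

new_cases = sum(1 for _, became_case in follow_up if became_case)
person_years = sum(years for years, _ in follow_up)

# Incidence: new cases / people at risk at the start of the period.
cumulative_incidence = new_cases / len(follow_up)

# Incidence rate: new cases / total person-time of observation.
incidence_rate = new_cases / person_years

print(cumulative_incidence, incidence_rate)
```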

22
Q

prevalence tells you

A

probability of having the disease –> burden

23
Q

incidence tells you

A

probability of developing the disease –> risk

24
Q

risk ratio

A

risk in group 1 (group of interest) / risk in group 2 (comparison group)
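
As a quick illustration, the ratio can be computed from case counts and group sizes. The function name and the numbers are hypothetical.

```python
def risk_ratio(cases_1, total_1, cases_2, total_2):
    """Risk in group 1 (group of interest) divided by risk in group 2 (comparison)."""
    risk_1 = cases_1 / total_1
    risk_2 = cases_2 / total_2
    return risk_1 / risk_2


# Hypothetical: 30 of 100 in the group of interest vs 10 of 100 in the comparison group.
rr = risk_ratio(30, 100, 10, 100)
print(round(rr, 2))
```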

25
Q

rate ratio

A

compares the incidence rates or mortality rates of 2 groups.

26
Q

in a case-control study what can you measure

A

the odds ratio

27
Q

in a prospective study (like cohort or randomized) what can you calculate

A

risk ratio, rate ratio, odds ratio

28
Q

with IQR use

A

median

29
Q

with standard deviation use

A

mean

30
Q

standard deviation and variance

A

SD is the square root of variance

31
Q

standard error of the mean is used to

A

calculate the confidence interval
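
A rough sketch of how the standard error feeds into a confidence interval. It assumes the normal approximation (multiplier 1.96 for a 95% CI); for small samples a t-distribution multiplier would normally be used instead. The sample values are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Approximate 95% CI for the mean: mean +/- z * SEM (normal approximation)."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)   # sample SD (n - 1 in the denominator)
    sem = sd / math.sqrt(n)         # standard error of the mean
    return mean - z * sem, mean + z * sem


values = [2, 4, 4, 4, 5, 5, 7, 9]
lower, upper = mean_confidence_interval(values)
print(round(lower, 2), round(upper, 2))
```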

32
Q

list the hierarchy of evidence, from weakest to strongest

A
case report
case series
ecological studies
cross sectional studies
case-control studies
cohort studies
randomised controlled trials
33
Q

define clinical trial

A

a prospective study comparing the value of an intervention against a control. An investigator ASSIGNS which people get the drug (treatment group) and which get the placebo (comparison group)

34
Q

simple randomized trial

A

patients are randomised to two treatments without considering their characteristics. It is simple, and useful when prognostic factors are unknown

35
Q

stratified randomised design

A

used when prognostic factors are known: patients are grouped into prognostic categories, and within these groups patients are randomly assigned treatments

36
Q

crossover design. Advantages? When to use? Disadvantages?

A

here patients serve as their own controls: give them treatment for 6 weeks, then no treatment for 6 weeks, and compare the two periods.

  • Use only for chronic diseases.
  • ADV: each patient is compared with themselves, which makes results easier to compare
  • DIS: potential carryover effects of the drug
37
Q

factorial design

A

used to ask two or more questions in the same clinical trial.
eg: 2 treatments are studied for their relationship to response and each is given at 2 levels.

38
Q

what do you consider when choosing a sample size

A

funding, ethics, eligibility criteria.
- must include an adequate number of individuals
- consider the anticipated difference between the groups, the background rate of the outcome, and the probability of making statistical errors
- type 1 and type 2 errors
- what is the smallest difference between treatments
- what is the variance
- smaller anticipated differences between treatment and comparison groups require larger sample sizes
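
To illustrate the last point, here is a sketch using a common textbook approximation for comparing two means: n per group = 2 (z_alpha/2 + z_beta)^2 * SD^2 / difference^2. The formula choice, the function name and the numbers are assumptions for illustration, not something stated on the card.

```python
import math
from statistics import NormalDist


def sample_size_per_group(difference, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # type 1 error (two-sided)
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / difference ** 2
    return math.ceil(n)


# Halving the anticipated difference roughly quadruples the required sample.
n_large_difference = sample_size_per_group(difference=5.0, sd=10.0)
n_small_difference = sample_size_per_group(difference=2.5, sd=10.0)
print(n_large_difference, n_small_difference)
```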

39
Q

define randomization. Why would you do it? How can you randomise? Benefit of masking and blinding? Issues with follow up?

A

assigning or ordering things via a random process, to remove or reduce bias.

  • how: coin toss, table of random numbers, stratified block randomization.
  • follow-up issues: dropouts, loss to follow-up
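
A minimal sketch of block randomization with two arms. The function name, the block size and the seed are illustrative choices; for stratified block randomization the same routine would be run separately within each prognostic stratum.

```python
import random


def block_randomize(n_participants, block_size=4, seed=0):
    """Assign participants to two arms in shuffled blocks so the arms stay balanced."""
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for two equal arms")
    rng = random.Random(seed)  # fixed seed only so the example is reproducible
    assignments = []
    while len(assignments) < n_participants:
        half = block_size // 2
        block = ["treatment"] * half + ["control"] * half
        rng.shuffle(block)     # random order within each block
        assignments.extend(block)
    return assignments[:n_participants]


assignments = block_randomize(12, block_size=4, seed=1)
print(assignments)
```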
40
Q

compliance, non-compliance, why and consequences

A

non-compliance: failure to follow the requirements of the protocol

  • reasons for it: toxic reactions to treatment, waning interest, desire to seek other therapies
  • consequences: smaller observed difference between treatment and comparison groups than truly exists
  • how to prevent: give a simple study regimen to follow, enroll motivated people, make sure they are aware of what they are required to do, contact them frequently throughout the study
41
Q

define interim analysis

A

analysis comparing intervention groups at any time before the formal completion of the trial. Used to stop trial if patients are at unnecessary risk

42
Q

intention to treat vs per-protocol analysis

A

ITT: captures real life. Results use data from all subjects. Advantage: preserves randomization. Disadvantage: does not determine the maximum potential effectiveness of a treatment
PPA: results use data only from subjects who followed the protocol. Advantage: evaluates the maximum benefit of a treatment

44
Q

main types of clinical trials

A
  • prevention trials
  • screening trials
  • diagnostic trials
  • treatment trials
  • QOL trials
  • compassionate use trials
45
Q

what is phase 4 of clinical trials, and why do we do them

A

also called post-marketing studies. They are used to get more information after a drug is on the market (e.g. long-term side effects, as with thalidomide)

46
Q

difference between descriptive and analytical studies

A

descriptive: who/where/when
analytical: why. Either observational (cohort and case-control) or experimental. Use descriptive studies to make a hypothesis and test it with analytical studies

47
Q

ecological study (for pop) (a descriptive study)

A

examines rates of disease in relation to a factor measured at the pop level (an aggregate/environmental/global measure).
They are quick, cheap and easy to understand.

48
Q

cross sectional study (for individual) (a descriptive study)

A

take a snapshot of a pop at a point in time and measure disease prevalence in relation to exposure. Used for public health planning and etiological research.
They are cheap and generalisable, but cannot give temporal sequence.

49
Q

how to do a cohort study. What would you calculate from this? When do you use it? How do you choose a cohort?

A

take a pop, sample people without the disease, find out who was exposed and not exposed, and follow them to see who gets the disease and who doesn’t.

  • calculate: measures of frequency: incidence (risk), incidence rate, attack rate (outbreak)
  • when: to test whether exposure is associated with outcome, and to estimate risk of outcome in exposed and unexposed cohorts
  • choosing a cohort: members must be alive, at risk of the outcome, and free of the outcome at the start of the study
50
Q

types of cohort studies. Absolute measures and relative measures in cohort studies? Why are rate and risk ratio sometimes different? Which one do we trust?

A

prospective (study starts before disease occurrence), retrospective (study starts after disease occurrence), combination

  • absolute: incidence difference
  • relative: rate ratio, risk ratio
  • they differ if the follow-up times are not equal between the 2 groups. In that case don't trust the risk ratio; use the rate ratio
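
A small hypothetical example of how the two ratios can disagree when follow-up differs between groups. All counts and person-years are invented for illustration.

```python
# Hypothetical cohorts: same number of people and cases, different follow-up time.
exposed = {"people": 100, "cases": 20, "person_years": 200.0}
unexposed = {"people": 100, "cases": 20, "person_years": 400.0}

# Risk ratio ignores how long each group was observed.
risk_ratio = (exposed["cases"] / exposed["people"]) / (
    unexposed["cases"] / unexposed["people"]
)

# Rate ratio divides by person-time, so unequal follow-up is accounted for.
rate_ratio = (exposed["cases"] / exposed["person_years"]) / (
    unexposed["cases"] / unexposed["person_years"]
)

print(risk_ratio, rate_ratio)
```

Here the risk ratio suggests no difference, while the rate ratio shows cases arising twice as fast in the exposed group.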
51
Q

advantages of a cohort study. Disadvantages

A

ADV: temporal relationship can be inferred, can directly measure disease incidence, can examine rare exposures, multiple outcomes can be studied, less vulnerable to bias
DIS: long, expensive, inefficient for rare outcomes, multiple exposures are difficult to assess, not suitable for diseases with long latency, exposure may change over time

52
Q

how to perform a case control study. WHEN to use? ADV? DIS? What can you calculate here?

A

choose cases and then controls (from pop which gave rise to case), find out who was exposed and who wasn’t (questionnaire to find frequency of exposure). These are retrospective.

  • WHEN: when exposure data are expensive, disease has a long latent period, disease is rare, little is known about the disease, pop is dynamic
  • ADV: cheap, easy, quick, multiple exposures can be examined, suitable for rare diseases and diseases with long latency
  • DIS: bias, direct incidence estimation not possible, temporal relationship unclear, multiple outcomes cannot be studied, inefficient for rare exposures.
  • Odds ratio (ad/bc)
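
A sketch of the ad/bc calculation. The table layout is an assumption (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls), and the counts are hypothetical. The same number falls out of dividing the exposure odds in cases by the exposure odds in controls.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio ad / bc for a 2x2 table.

    Assumed layout: a = exposed cases, b = exposed controls,
                    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


# Hypothetical case-control data.
a, b, c, d = 40, 20, 60, 80
or_cross_product = odds_ratio(a, b, c, d)

# Equivalent view: odds of exposure among cases / odds of exposure among controls.
or_exposure_odds = (a / c) / (b / d)

print(round(or_cross_product, 3), round(or_exposure_odds, 3))
```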
53
Q

Types of case-control studies

A

depend on how you select controls

  • general pop controls
  • hospital controls
  • special control (friends, spouse, siblings)
54
Q

if a measure of association (risk/odds ratio) is >1 it means? <1?

A
  • >1 : positive association
  • <1 : inverse/protective association
  • =1 : no / neutral association
55
Q

statistical inference

A

when you measure properties of a sample (mean and SD) and use these values to infer the properties of the entire pop

56
Q

steps for hypothesis testing

A
1- null and alternative hypothesis
2- calculate test statistic
3- specify significance level
4- determine P value
5- make statistical inference
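
The steps can be walked through with a simple one-sample z test. This is only a sketch: it assumes the population SD is known (otherwise a t test would be the usual choice), and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, null_mean, sigma, n, alpha=0.05):
    """Two-sided z test of H0: population mean == null_mean (sigma assumed known)."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error  # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-sided P value
    reject_null = p_value < alpha                   # compare with significance level
    return z, p_value, reject_null


# H0: mean = 100; H1: mean != 100. Hypothetical sample of 100 with mean 103, SD 15.
z, p_value, reject_null = one_sample_z_test(103, 100, sigma=15, n=100)
print(round(z, 2), round(p_value, 4), reject_null)
```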
57
Q

if p value is >5% what do we do

A

fail to reject the null hypothesis (the data do not provide enough evidence against it)

58
Q

alpha and beta errors and power of study

A

alpha: probability of a false positive result (type 1 error); usually set at 5%, aka the significance level
beta: probability of a false negative result (type 2 error); usually 10-20%
power of study: 1 - beta (80-90%)

59
Q

how to choose a test based on study (for continuous variables - height/weight/BMI)

A
  • one sample t test: tests the null hypothesis that the mean of a pop is equal to a constant value (non-parametric version is the sign test)
  • two sample t test: compares the mean outcome of 2 independent samples (if non-parametric use the Mann-Whitney test)
  • paired t test: compares 2 non-independent samples (if non-parametric use the Wilcoxon signed-rank test)
    (1- compare group with constant value. 2- compare means between 2 groups)
60
Q

tests to use for categorical data (a comparison of proportions- mortality rates)

A

use chi squared test
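
A minimal sketch of the chi-squared statistic for a 2x2 table, computed from observed and expected counts without a continuity correction. The counts are hypothetical; 3.841 is the standard critical value for 1 degree of freedom at the 5% level.

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n  # count expected under H0
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic


# Hypothetical: 30/100 deaths in one group vs 15/100 in the other.
chi2 = chi_squared_2x2(30, 70, 15, 85)
print(round(chi2, 3), chi2 > 3.841)
```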

61
Q

correlation. What tests to use?

A

explores the association between two continuous variables. Can be +/-/strong/weak.
- if data follow a normal distribution: Pearson's correlation
- if data are not normal: Spearman's rank correlation
Both take values from -1 to +1
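
A short sketch of both coefficients written out by hand. The rank helper assumes there are no tied values, and the data are made up: `y` rises with `x` along a curve, so the rank correlation is perfect while Pearson's is slightly below 1.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def ranks(values):
    """Rank from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson's correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```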

62
Q

confidence interval

A

gives a measure of the precision of the result from a sample. A 95% CI gives the range of values which we can be 95% confident includes the true value.
The probability (p value) only measures the strength of evidence against the null hypothesis

63
Q

bias

A

any systematic error in the design or the conduct of an epidemiological study resulting in a conclusion which is different from the truth

64
Q

random error

A

reflects the amount of variability due to chance

65
Q

main types of bias

A
  • selection bias (e.g. healthy worker effect): error arising from how the researchers choose who is included in a study, so results may not be applicable to a population outside of the study
  • information bias (e.g. recall bias, measurement bias): happens when researchers are unable to collect accurate data
  • confounding: when the effects of two exposures have not been separated and the analysis concludes that the effect is due to one variable rather than the other
66
Q

how to fix bias

A
  • randomization
  • restriction
  • matching: people in the case group and control group are matched based on characteristics
  • stratification: is the process of dividing members of the population into groups before sampling
  • statistical modeling
67
Q

how to calculate sensitivity

A

probability that the test detects the disease in those who have it: True Positive / (True Positive + False Negative)

68
Q

how to calculate specificity

A

probability that those without the disease test negative: TN / (TN + FP)
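
A quick sketch of both calculations from the four cells of a test-versus-truth table. The counts are hypothetical.

```python
def sensitivity(tp, fn):
    """Proportion of people with the disease who test positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of people without the disease who test negative."""
    return tn / (tn + fp)


# Hypothetical results: 100 diseased (90 detected), 200 healthy (160 cleared).
sens = sensitivity(tp=90, fn=10)
spec = specificity(tn=160, fp=40)
print(sens, spec)
```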

69
Q

positive and negative predictive values

A

Positive: how likely is someone with a positive test result to actually have the characteristic? TP / (TP + FP)
Negative: how likely is someone with a negative test result to actually not have the characteristic? TN / (TN + FN)
Note that sensitivity and specificity are characteristics of a test and do not change with prevalence, whereas positive and negative predictive values change according to the prevalence of the disease
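
To illustrate the dependence on prevalence, this sketch derives the predictive values from sensitivity, specificity and prevalence by working out the expected fraction of the population in each cell. The test characteristics and prevalences are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence              # diseased, test positive
    fn = (1 - sensitivity) * prevalence        # diseased, test negative
    tn = specificity * (1 - prevalence)        # healthy, test negative
    fp = (1 - specificity) * (1 - prevalence)  # healthy, test positive
    return tp / (tp + fp), tn / (tn + fn)


# Same test (90% sensitive, 90% specific) in a rare and a common condition.
ppv_rare, npv_rare = predictive_values(0.9, 0.9, prevalence=0.01)
ppv_common, npv_common = predictive_values(0.9, 0.9, prevalence=0.50)
print(round(ppv_rare, 3), round(ppv_common, 3))
```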

70
Q

how do we use GRADE

A

it is for rating the quality of a body of evidence

71
Q

odds ratio

A
  • In Cohort: is odds of a disease in exposed group vs odds of a disease in unexposed group (looking at occurrence)
  • In Case control: odds that the cases were exposed/odds that controls were exposed (looking at exposure)
72
Q

relative risk in cohort studies is

A

risk of disease in exposed / risk of disease in unexposed

73
Q

meta analysis

A

combines results from multiple studies to increase the statistical power to discern a small but significant clinical effect

74
Q

narrative review

A

conducted by experts. Prone to bias

75
Q

systematic review

A

minimises bias. Conducted by researchers