Stats Flashcards

1
Q

Name 3 types of descriptive studies.

A

case report/series
ecological study-examines rates of disease on a population level
cross-sectional study-looks at exposure and disease at the same time- like a survey

2
Q

what is equipoise?

A

in an RCT, you truly don’t know if your intervention is better than the standard of care, but you feel confident that withholding the intervention will not bring harm

3
Q

What is an intention-to-treat analysis?

A

include all participants in the analysis according to the group they were originally assigned to, regardless of compliance.

4
Q

what is an efficacy or per protocol analysis?

A

only include participants who were compliant

5
Q

What is an as treated analysis?

A

analyze participant data based on the treatment they actually received.

6
Q

what are the pros and cons of retrospective vs prospective cohort study?

A

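they differ in cost, time, and quality of the data
retrospective: cheaper and faster, but relies on existing records, so data quality is often lower
prospective: more expensive and slower, but data are collected for the study, so quality is usually better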

7
Q

What are some strengths of cohort studies?

A
  1. efficient for rare exposures
  2. clear temporal sequence between exposure and disease
  3. good information on exposures, confounders
  4. study the effect of exposure on multiple outcomes
8
Q

What are some limitations of cohort studies?

A

-not good for rare diseases or those with long latency
-not good for exposures that are expensive to determine
-requires large populations with long follow-up time
-loss to follow-up
-expensive and time consuming

9
Q

how do you design a case-control study?

A

start from the source population, identify people who have the disease you want to study (cases) and people who do not (controls), then compare the odds of exposure between cases and controls.

individuals are chosen based on their outcome status and then exposure status is assessed

10
Q

what are the benefits of case-control over a cohort study?

A

smaller, more efficient, with shorter follow up compared to cohort studies

11
Q

Case-controls are great for when…

A

-exposure data is expensive or difficult to obtain
-long latent period
-disease is rare
-population is difficult to follow
-little is known about the disease
-want to evaluate many exposures

12
Q

What is the definition of a control in a case control study?

A

a sample from the source population that produces the cases

so NOT just everyone who doesn’t have the disease

13
Q

What is the purpose of the control group in a case-control study?

A

estimate exposure distribution in the source population that gave rise to cases.

14
Q

What are some limitations of case-control studies?

A

-often limited to studying a single outcome
-inefficient for rare exposures
-more opportunity for bias
-temporal sequence between exposure and outcome can be hard to establish
-cannot calculate absolute measure of association

15
Q

Define cumulative incidence.

A

Proportion of population at risk that develops the disease or outcome over a specified time period

example: number of new cases during the time period / the total population at risk at the start of the time period

the 5-year risk of cervical cancer is the number of new cases of cervical cancer in those 5 years divided by the number of people with a cervix, but not cancer, in the population at the start of those 5 years.
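
A minimal sketch of the cumulative incidence calculation. The function name and the counts are hypothetical, chosen only to illustrate the formula on this card.

```python
def cumulative_incidence(new_cases: int, at_risk_at_start: int) -> float:
    """Proportion of the at-risk population that develops the outcome in the period."""
    if at_risk_at_start <= 0:
        raise ValueError("at-risk population must be positive")
    return new_cases / at_risk_at_start


# Hypothetical numbers: 50 new cases over 5 years among 10,000 people
# who were at risk (no disease yet) at the start of the period.
risk_5yr = cumulative_incidence(new_cases=50, at_risk_at_start=10_000)
print(f"5-year cumulative incidence: {risk_5yr:.2%}")
```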

16
Q

define prevalence.

A

number of ppl with the disease divided by the entire population

ex: number of ppl with cervical cancer divided by the total population (not just people with a cervix)

17
Q

Define incident rate.

A

number of new cases of disease during the time period divided by the total person-time observation in the population at risk

example- if one person is followed for 3 years and another is followed for 4 years, that is 7 person-years
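
A small sketch of the person-time arithmetic, using the 3-year and 4-year follow-up from the example. The single new case is a made-up number for illustration, not something stated on the card.

```python
def incidence_rate(new_cases: int, follow_up_times: list[float]) -> float:
    """New cases divided by the total person-time observed in the at-risk population."""
    person_time = sum(follow_up_times)
    if person_time <= 0:
        raise ValueError("total person-time must be positive")
    return new_cases / person_time


follow_up_years = [3, 4]                  # one person followed 3 years, another 4 years
total_person_years = sum(follow_up_years)  # 7 person-years

# Assumption for illustration: 1 new case occurred during follow-up.
rate = incidence_rate(new_cases=1, follow_up_times=follow_up_years)
print(f"{total_person_years} person-years, rate = {rate:.3f} cases per person-year")
```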

18
Q

how do you calculate an absolute risk difference?

A

risk difference is the cumulative incidence in the exposed group minus the cumulative incidence in the unexposed group

19
Q

how do you calculate an absolute rate difference?

A

Rate difference is the incidence rate in the exposed group minus the incidence rate in the unexposed group

20
Q

how do you calculate risk ratio AKA relative risk?

A

cumulative incidence in the exposed group divided by the cumulative incidence in the unexposed group
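
A sketch of the risk ratio next to the risk difference, so the relative and absolute measures can be compared on the same data. The cohort counts and function names are hypothetical.

```python
def risk(cases: int, total: int) -> float:
    """Cumulative incidence in one group."""
    return cases / total


def risk_difference(risk_exposed: float, risk_unexposed: float) -> float:
    """Absolute measure: exposed risk minus unexposed risk."""
    return risk_exposed - risk_unexposed


def risk_ratio(risk_exposed: float, risk_unexposed: float) -> float:
    """Relative measure: exposed risk divided by unexposed risk."""
    return risk_exposed / risk_unexposed


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop the disease.
r_exposed = risk(30, 100)
r_unexposed = risk(10, 100)

rd = risk_difference(r_exposed, r_unexposed)
rr = risk_ratio(r_exposed, r_unexposed)
print(f"risk difference = {rd:.2f}, risk ratio = {rr:.2f}")
```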

21
Q

how to calculate an Odds ratio?

A

numerator: the number of cases in the exposed group multiplied by the number of controls in the UNexposed group

denominator: the number of cases in the UNexposed group multiplied by the number of controls in the exposed group
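
A minimal sketch of the cross-product calculation. The 2x2 counts and the parameter names are hypothetical.

```python
def odds_ratio(exposed_cases: int, exposed_controls: int,
               unexposed_cases: int, unexposed_controls: int) -> float:
    """Cross-product ratio: (exposed cases * unexposed controls) /
    (unexposed cases * exposed controls)."""
    denominator = unexposed_cases * exposed_controls
    if denominator == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (exposed_cases * unexposed_controls) / denominator


# Hypothetical case-control data:
#              cases  controls
# exposed        40      20
# unexposed      60      80
or_estimate = odds_ratio(40, 20, 60, 80)
print(f"odds ratio = {or_estimate:.2f}")
```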

22
Q

define the odds of an event

A

Probability that an event will occur divided by the probability that it will not occur
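
A quick sketch of converting a probability to odds, that is p / (1 - p). The function name and the probabilities are illustrative only.

```python
def odds(probability: float) -> float:
    """Odds of an event: P(event) / P(no event)."""
    if not 0 <= probability < 1:
        raise ValueError("probability must be in [0, 1)")
    return probability / (1 - probability)


# A probability of 0.20 corresponds to odds of 1 to 4.
odds_20 = odds(0.20)
# A probability of 0.50 corresponds to even odds.
odds_50 = odds(0.50)
print(f"odds at p=0.20: {odds_20:.2f}, odds at p=0.50: {odds_50:.2f}")
```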

23
Q

Name two categories of error in epidemiologic research

A

random error (p value)
systematic error (bias, confounding)

24
Q

Name the three necessary criteria for a variable to be a confounder.

A
  1. must be an independent predictor of the outcome, like a risk factor for the disease
  2. must be associated with exposure
  3. cannot be caused by the exposure
25
Q

name three ways to reduce confounders during study DESIGN

A
  1. randomization
  2. restrict confounders through our exclusion criteria
  3. match confounders in study groups
26
Q

Name 3 ways to address confounders during study ANALYSIS.

A

-Standardization-“among women over 65…”
-stratification and pooling - split the data into strata, then combine the stratum-specific estimates into one pooled estimate
-modeling- multivariable statistical models

27
Q

What is internal validity?

A

is the study free of bias

28
Q

What is external validity?

A

generalizability

29
Q

name two ways to analyze data from an RCT?

A

intent to treat
per protocol

30
Q

describe categorical data
describe ordinal data
describe nominal data

A

data fits into specific categories

ordinal data- categories that have a specific order to them, (mild, mod, severe)

nominal data- there is no rank to the categories

31
Q

where is the tail on a positive skewed data?
where is the tail on a negative skewed data?

A

positive- tail is to the right
negative - tail is on the left

32
Q

define mean, median, mode

A

mean-average
median-value in the middle of the data set
mode-value that appears most often in a data set
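
A short sketch using Python's standard `statistics` module. The data set is made up, and is deliberately right-skewed so the mean lands above the median.

```python
import statistics

# Hypothetical data with one large value pulling the tail to the right.
values = [1, 2, 2, 3, 10]

mean_value = statistics.mean(values)      # arithmetic average
median_value = statistics.median(values)  # middle value of the sorted data
mode_value = statistics.mode(values)      # most frequent value

print(f"mean={mean_value}, median={median_value}, mode={mode_value}")
```

With a positive skew like this, the outlier drags the mean upward while the median stays put.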

33
Q

how do you present parametric vs non-parametric data?

A

parametric data is normally distributed, so you can use the mean and the standard deviation

non-parametric data is not normal or skewed, so use the median and interquartile range

34
Q

Which test should you use?

categorical data
independent data
small sample size

A

Fisher's exact test

35
Q

Which test should you use?

categorical data
independent data
Large sample size

A

chi-square

36
Q

Which test should you use?

categorical data
paired data- like pre and post intervention data
Large sample size

A

McNemar’s

37
Q

Which test should you use?

categorical data
paired data- like pre and post intervention data
small sample size

A

McNemar’s exact test

38
Q

Which test should you use?

continuous data
parametric
3 or more groups

A

one-way ANOVA

39
Q

Which test should you use?

continuous data
independent
parametric
2 groups comparing the means to see if there is a difference

A

unpaired (independent samples) t-test

40
Q

Which test should you use?

continuous data
independent
parametric
2 variables, assessing for a correlation between them

A

pearson correlation

41
Q

Which test should you use?

continuous data
independent
non-parametric
2 groups comparing the medians

A

Mann-Whitney U test AKA Wilcoxon rank-sum test

42
Q

Which test should you use?

continuous data
independent
non-parametric
2 variables, assessing for a correlation between them

A

spearman correlation

43
Q

Which test should you use?

continuous data
independent
non-parametric
3+ groups comparing the medians

A

kruskal wallis

44
Q

Which test should you use?

continuous data
paired
parametric
3+ groups comparing the means

A

repeated-measures ANOVA

45
Q

Which test should you use?

continuous data
paired
parametric
2 groups comparing the means

A

paired t-test

46
Q

Which test should you use?

continuous data
paired
non-parametric
2 groups comparing the medians

A

wilcoxon signed rank
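
The "which test" cards can be sketched as one lookup function. This is only a study aid under stated assumptions: the function name and parameters are hypothetical, the one-way ANOVA and unpaired t-test entries are the standard textbook answers, and the paired, non-parametric, 3+ group case is deliberately left out because no card covers it.

```python
def choose_test(data_type: str, paired: bool, *, small_sample: bool = False,
                parametric: bool = True, groups: int = 2,
                correlation: bool = False) -> str:
    """Return the test name the flashcards give for a data description."""
    if data_type == "categorical":
        if paired:
            return "McNemar's exact test" if small_sample else "McNemar's test"
        return "Fisher's exact test" if small_sample else "chi-square test"

    if data_type != "continuous":
        raise ValueError("data_type must be 'categorical' or 'continuous'")

    if correlation:
        return "Pearson correlation" if parametric else "Spearman correlation"

    if parametric:
        if groups >= 3:
            return "repeated-measures ANOVA" if paired else "one-way ANOVA"
        return "paired t-test" if paired else "unpaired t-test"

    # Non-parametric comparisons of medians.
    if groups >= 3:
        if paired:
            raise ValueError("paired, non-parametric, 3+ groups is not covered by these cards")
        return "Kruskal-Wallis test"
    return "Wilcoxon signed-rank test" if paired else "Mann-Whitney U test"


print(choose_test("categorical", paired=False, small_sample=True))
print(choose_test("continuous", paired=True, parametric=False))
```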

47
Q

What is a kaplan meier curve?

A

a graph that shows the probability of survival over time.

“ex: how long until the prolapse recurs? or how long until the patient stops the pessary”
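
A rough sketch of how the survival probabilities behind a Kaplan-Meier curve are computed (the product-limit estimate). The follow-up times and event flags are invented, and the function is a teaching aid rather than a replacement for a statistics package.

```python
def kaplan_meier(times: list[float], events: list[int]) -> list[tuple[float, float]]:
    """Return (time, survival probability) at each time where an event occurred.

    events[i] is 1 if the event (e.g. recurrence) was observed at times[i],
    and 0 if that participant was censored at times[i].
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events_at_t = 0
        leaving_at_t = 0
        # Group everyone who has an event or is censored at this same time.
        while i < len(data) and data[i][0] == t:
            events_at_t += data[i][1]
            leaving_at_t += 1
            i += 1
        if events_at_t:
            survival *= 1 - events_at_t / at_risk
            curve.append((t, survival))
        at_risk -= leaving_at_t
    return curve


# Hypothetical months to prolapse recurrence; 0 marks a censored participant.
months = [2, 3, 3, 5, 8]
recurred = [1, 1, 0, 1, 0]
curve = kaplan_meier(months, recurred)
for month, prob in curve:
    print(f"month {month}: estimated recurrence-free probability {prob:.2f}")
```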

48
Q

What test is used to compare kaplan meier curves?

A

log-rank test

49
Q

Let’s talk regressions

When do you use a linear regression and how do you report the result?

A
  1. continuous outcome
  2. mean difference
50
Q

Let’s talk regressions

When do you use a logistic regression and how do you report the result?

A
  1. binary (categorical) outcome
  2. Odds ratio
51
Q

Let’s talk regressions

When do you use a log-binomial regression and how do you report the result?

A
  1. binary (categorical) outcome
  2. risk ratio
52
Q

Let’s talk regressions

When do you use a poisson regression and how do you report the result?

A
  1. count data
  2. rate ratio
53
Q

Let’s talk regressions

When do you use a cox regression and how do you report the result?

A
  1. time to event data
  2. hazard ratio
54
Q

What is a type 1 error?

A

alpha
probability of rejecting a null hypothesis that is true

55
Q

What is a type 2 error?

A

beta
probability of failing to reject a false null hypothesis

56
Q

What type of error is a false positive?

A

alpha
rejecting the null when it is true.
saying that there is a difference when there is none

57
Q

What type of error is a false negative?

A

beta
failing to reject the null hypothesis when it is false
saying that there is no difference when there is one

58
Q

Explain the 95% confidence interval

A

if a study were repeated 100 times and a 95% CI computed each time, about 95 of those intervals would be expected to contain the true value

59
Q

What is statistical power?

A

1-beta

OR 1 minus the probability of committing a type 2 error

OR 1 minus the probability of failing to reject the null when the null is false.

typically 0.80, beta is 0.20

60
Q

What variables do you need to calculate the sample size?

A

power
alpha
difference you want to see

buffer for 20% loss of participants, desired n/0.80
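
A small sketch of the dropout buffer only (desired n / 0.80), not of the power calculation itself. The function name is hypothetical, and integer arithmetic is used so the rounding up is exact.

```python
def inflate_for_dropout(required_n: int, loss_percent: int = 20) -> int:
    """Enrollment target so that required_n remain after the expected loss.

    Equivalent to ceil(required_n / (1 - loss_percent / 100)).
    """
    if not 0 <= loss_percent < 100:
        raise ValueError("loss_percent must be between 0 and 99")
    retained_percent = 100 - loss_percent
    # Ceiling division on integers avoids floating-point rounding surprises.
    return -(-required_n * 100 // retained_percent)


# Hypothetical: the power calculation says 100 participants are needed.
enroll = inflate_for_dropout(100)
print(f"enroll {enroll} participants to end with about 100 after 20% loss")
```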

61
Q

Define sensitivity.

A

probability of testing positive, given that they have the disease

true positives/total population with disease

62
Q

define specificity

A

probability of testing negative, given that they do not have the disease
true negatives/ total population without the disease

63
Q

define PPV

A

probability of having the disease if they test positive

true positives/all positive test results

affected by prevalence of disease

64
Q

define NPV

A

probability of NOT having the disease if they test negative

true negatives/ all negative test results

affected by prevalence of disease
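
Sensitivity, specificity, PPV, and NPV all come from the same 2x2 table, which this sketch makes explicit. The counts and the function name are hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute the four test characteristics from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives / everyone with disease
        "specificity": tn / (tn + fp),  # true negatives / everyone without disease
        "ppv": tp / (tp + fp),          # true positives / all positive results
        "npv": tn / (tn + fn),          # true negatives / all negative results
    }


# Hypothetical screening results in 1,000 people, 100 of whom have the disease.
metrics = diagnostic_metrics(tp=90, fp=50, fn=10, tn=850)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```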

65
Q

If the prevalence of a disease increases, what happens to the following values?

Sensitivity
specificity
PPV
NPV

A

sen and spec are the same
PPV increases
NPV decreases
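
A sketch showing why this happens: hold sensitivity and specificity fixed and recompute PPV and NPV at two prevalences. The 90% values and the two prevalences are arbitrary illustration numbers.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same test (sensitivity 0.90, specificity 0.90) at low and high prevalence.
ppv_low, npv_low = predictive_values(0.90, 0.90, prevalence=0.01)
ppv_high, npv_high = predictive_values(0.90, 0.90, prevalence=0.20)

print(f"prevalence 1%:  PPV={ppv_low:.3f}, NPV={npv_low:.3f}")
print(f"prevalence 20%: PPV={ppv_high:.3f}, NPV={npv_high:.3f}")
```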

66
Q

Define positive likelihood ratio.

A

likelihood of positive result in someone with the disease vs someone who does not have the disease

sensitivity / (1 - specificity)

combo of sensitivity and specificity

higher the positive likelihood ratio, the better.

67
Q

Define negative likelihood ratio.

A

likelihood of negative result in someone with the disease vs someone who does not have the disease

(1 - sensitivity) / specificity

the closer the result is to 0, the better.
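
A minimal sketch of both likelihood ratios, using LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sensitivity and specificity values are hypothetical.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """How much more likely a positive result is in someone with the disease."""
    return sensitivity / (1 - specificity)


def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """How much more likely a negative result is in someone with the disease."""
    return (1 - sensitivity) / specificity


# Hypothetical test: sensitivity 0.90, specificity 0.80.
lr_pos = positive_likelihood_ratio(0.90, 0.80)
lr_neg = negative_likelihood_ratio(0.90, 0.80)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
```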

68
Q

What would be considered a high positive likelihood ratio?

69
Q

What would be considered a low negative likelihood ratio?