Stats Flashcards

1
Q

Name 3 types of descriptive studies.

A

case report/series
ecological study-examines rates of disease on a population level
cross-sectional study-looks at exposure and disease at the same time- like a survey

2
Q

what is equipoise?

A

In an RCT, you genuinely do not know whether your intervention is better than the standard of care, and you are confident that withholding the intervention will not cause harm.

3
Q

What is an intention-to-treat analysis?

A

Analyze all participants in the group to which they were assigned, regardless of the treatment they actually received.

4
Q

what is an efficacy or per protocol analysis?

A

Include only participants who complied with the assigned protocol.

5
Q

What is an as treated analysis?

A

Analyze participant data based on the treatment they actually received.

6
Q

what are the pros and cons of retrospective vs prospective cohort study?

A

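They differ in cost, time, and quality of the data:
-retrospective: cheaper and faster, but data quality is limited to what was already recorded
-prospective: more expensive and slower, but the investigators control how the data are collected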

7
Q

What are some strengths of cohort studies?

A
  1. efficient for rare exposures
  2. clear temporal sequence between exposure and disease
  3. good information on exposures, confounders
  4. study the effect of exposure on multiple outcomes
8
Q

What are some limitations of cohort studies?

A

-not good for rare diseases or those with long latency
-not good for exposures that are expensive to determine
-require large populations with long follow-up time
-loss to follow-up
-expensive and time-consuming

9
Q

how do you design a case-control study?

A

From the source population, identify people who have the disease you want to study (cases) and people who do not (controls), then compare the odds of exposure in the cases and the controls.

individuals are chosen based on their outcome status and then exposure status is assessed

10
Q

what are the benefits of case-control over a cohort study?

A

smaller, more efficient, with shorter follow up compared to cohort studies

11
Q

Case-control studies are great when…

A

-exposure data is expensive or difficult to obtain
-long latent period
-disease is rare
-population is difficult to follow
-little is known about the disease
-want to evaluate many exposures

12
Q

What is the definition of a control in a case control study?

A

a sample from the source population that produces the cases

so NOT just everyone who doesn’t have the disease

13
Q

What is the purpose of the control group in a case-control study?

A

estimate exposure distribution in the source population that gave rise to cases.

14
Q

What are some limitations of case-control studies?

A

-often limited to studying a single outcome
-inefficient for rare exposures
-more opportunity for bias
-temporal sequence between exposure and outcome can be hard to establish
-cannot calculate absolute measures of association

15
Q

Define cumulative incidence.

A

Proportion of population at risk that develops the disease or outcome over a specified time period

example: number of new cases during the time period/the total population at risk at the start of the time period

The risk of cervical cancer in 5 years is the number of new cases of cervical cancer in 5 years divided by the number of people with a cervix, but not cancer, in the population at the start of those 5 years.

16
Q

define prevalence.

A

number of people with the disease divided by the entire population

ex: number of people with cervical cancer divided by the total population (not just people with a cervix)

17
Q

Define incidence rate.

A

number of new cases of disease during the time period divided by the total person-time observation in the population at risk

example- if one person is followed for 3 years and another is followed for 4 years, that is 7 person-years
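The three frequency measures on cards 15 to 17 can be sketched in a few lines of Python. This is a minimal illustration, not a standard library API: the function names and all of the counts are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def cumulative_incidence(new_cases, at_risk_at_start):
    """New cases over a period / population at risk at the start of the period."""
    return new_cases / at_risk_at_start


def prevalence(existing_cases, total_population):
    """People with the disease / entire population."""
    return existing_cases / total_population


def incidence_rate(new_cases, person_time):
    """New cases / total person-time of observation among those at risk."""
    return new_cases / person_time


# Hypothetical counts, for illustration only.
risk_5yr = cumulative_incidence(new_cases=20, at_risk_at_start=10_000)
prev = prevalence(existing_cases=50, total_population=25_000)

# One person followed for 3 years and another for 4 years: 7 person-years.
person_years = 3 + 4
rate = incidence_rate(new_cases=1, person_time=person_years)
```

Note that cumulative incidence is a proportion (no units), while the incidence rate is expressed per unit of person-time.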

18
Q

how do you calculate an absolute risk difference?

A

Risk difference is the cumulative incidence (risk) in the exposed group minus the cumulative incidence in the unexposed group

19
Q

how do you calculate an absolute rate difference?

A

Rate difference is the incidence rate in the exposed group minus the incidence rate in the unexposed group

20
Q

how do you calculate risk ratio AKA relative risk?

A

cumulative incidence in the exposed group divided by the cumulative incidence in the unexposed group

21
Q

how to calculate an Odds ratio?

A

numerator: the number of cases in the exposed group multiplied by the number of controls in the UNexposed group

denominator: the number of cases in the UNexposed group multiplied by the number of controls in the exposed group
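The risk difference, risk ratio, and odds ratio from cards 18 to 21 can all be read off one 2x2 table. The sketch below assumes a hypothetical cohort-style table; the function name, the cell labels a, b, c, d, and the counts are illustrative assumptions. In a true case-control study only the odds ratio would be meaningful, because the risks cannot be estimated.

```python
def two_by_two_measures(a, b, c, d):
    """Measures of association from a 2x2 table.

    a = exposed with disease      b = exposed without disease
    c = unexposed with disease    d = unexposed without disease
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "risk_difference": risk_exposed - risk_unexposed,
        "risk_ratio": risk_exposed / risk_unexposed,
        # (cases exposed x controls unexposed) / (cases unexposed x controls exposed)
        "odds_ratio": (a * d) / (c * b),
    }


# Hypothetical counts: 30/100 exposed and 10/100 unexposed develop the disease.
measures = two_by_two_measures(a=30, b=70, c=10, d=90)
```

With these made-up counts the risk ratio is 3 while the odds ratio is about 3.86, a reminder that the odds ratio only approximates the risk ratio when the disease is rare.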

22
Q

define the odds of an event

A

Probability that an event will occur divided by the probability that it will not occur
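As a quick worked example, odds are the probability p of the event divided by (1 - p). The helper names below are hypothetical and exist only for this sketch.

```python
def odds(p):
    """Convert a probability (0 <= p < 1) into odds: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return p / (1 - p)


def probability_from_odds(o):
    """Convert odds back into a probability: o / (1 + o)."""
    return o / (1 + o)


# A 20% probability corresponds to odds of 0.25, i.e. 1 to 4.
odds_of_event = odds(0.20)
back_to_probability = probability_from_odds(odds_of_event)
```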

23
Q

Name two categories of error in epidemiologic research

A

random error (p value)
systematic error (bias, confounding)

24
Q

Name the three necessary criteria for a variable to be a confounder.

A
  1. must be an independent predictor of the outcome, like a risk factor for the disease
  2. must be associated with exposure
  3. cannot be caused by the exposure
25
Name three ways to reduce confounders during study DESIGN.
  1. randomization
  2. restrict confounders through the exclusion criteria
  3. match confounders in study groups
26
Name 3 ways to address confounders during study ANALYSIS.
-standardization- "among women over 65..."
-stratification and pooling- split the data into groups (strata), then pool the stratum-specific estimates into one summary risk
-modeling- multivariable statistical models
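One common way to do the "stratify, then pool" step is the Mantel-Haenszel pooled risk ratio. The card does not name a specific method, so treat this choice as an assumption. The function names and the two age strata below are hypothetical and were constructed so that the crude and pooled estimates differ.

```python
def mantel_haenszel_rr(strata):
    """Pooled risk ratio across strata.

    Each stratum is (a, b, c, d):
    a = exposed with disease, b = exposed without disease,
    c = unexposed with disease, d = unexposed without disease.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * (c + d) / n
        denominator += c * (a + b) / n
    return numerator / denominator


def crude_rr(strata):
    """Risk ratio that ignores the stratifying variable (all strata collapsed)."""
    a, b, c, d = (sum(s[i] for s in strata) for i in range(4))
    return (a / (a + b)) / (c / (c + d))


# Hypothetical data: age is a confounder (older people are more often exposed
# and have a higher baseline risk).
strata = [
    (10, 90, 20, 380),   # younger: risks 0.10 vs 0.05
    (160, 240, 20, 80),  # older:   risks 0.40 vs 0.20
]
crude = crude_rr(strata)
pooled = mantel_haenszel_rr(strata)
```

In this made-up data set the risk ratio is 2 within each stratum and the pooled estimate is also 2, while the crude risk ratio of 4.25 is inflated by confounding.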
27
What is internal validity?
is the study free of bias
28
What is external validity?
generalizability
29
name two ways to analyze data from an RCT?
intention-to-treat
per protocol
30
Describe categorical data, ordinal data, and nominal data.
categorical data- data that fit into specific categories
ordinal data- categories that have a specific order to them (mild, moderate, severe)
nominal data- there is no rank to the categories
31
Where is the tail on positively skewed data? Where is the tail on negatively skewed data?
positive- tail is on the right
negative- tail is on the left
32
define mean, median, mode
mean- average
median- value in the middle of the data set
mode- value that appears most often in a data set
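For a concrete check, Python's standard `statistics` module computes all three. The data set is made up and deliberately right-skewed, so the mean is pulled above the median.

```python
import statistics

# Hypothetical, right-skewed data (one large value in the right tail).
values = [1, 2, 2, 3, 12]

mean_value = statistics.mean(values)      # 4
median_value = statistics.median(values)  # 2
mode_value = statistics.mode(values)      # 2
```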
33
how do you present parametric vs non-parametric data?
parametric data are normally distributed, so you can use the mean and the standard deviation
non-parametric data are not normal (skewed), so use the median and interquartile range
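A small sketch of the two reporting styles, using only the standard library. The data are hypothetical, and `statistics.quantiles` is called with its default ("exclusive") method; other quartile conventions can give slightly different cut points.

```python
import statistics

# Hypothetical skewed data: one extreme value drags the mean upward.
data = [2, 3, 3, 4, 5, 6, 40]

# Parametric summary: mean and standard deviation.
mean_value = statistics.mean(data)
sd_value = statistics.stdev(data)

# Non-parametric summary: median and interquartile range (Q1 to Q3).
q1, median_value, q3 = statistics.quantiles(data, n=4)

print(f"mean (SD): {mean_value:.1f} ({sd_value:.1f})")
print(f"median (IQR): {median_value:.1f} ({q1:.1f} to {q3:.1f})")
```

Here the mean of 9 sits above six of the seven observations, which is why the median and IQR describe skewed data better.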
34
Which test should you use? Categorical data, independent, small sample size.
Fisher's exact test
35
Which test should you use? Categorical data, independent, large sample size.
chi-square test
36
Which test should you use? Categorical data, paired (like pre- and post-intervention data), large sample size.
McNemar's test
37
Which test should you use? Categorical data, paired (like pre- and post-intervention data), small sample size.
McNemar's exact test
38
Which test should you use? Continuous data, parametric, 3 or more groups.
ANOVA
39
Which test should you use? Continuous data, independent, parametric, 2 groups, comparing the means to see if there is a difference.
t-test
40
Which test should you use? Continuous data, independent, parametric, 2 groups, assessing for a correlation between the two groups.
Pearson correlation
41
Which test should you use? Continuous data, independent, non-parametric, 2 groups, comparing the medians.
Mann-Whitney U test, AKA Wilcoxon rank-sum test
42
Which test should you use? Continuous data, independent, non-parametric, 2 groups, assessing for a correlation between the two groups.
Spearman correlation
43
Which test should you use? Continuous data, independent, non-parametric, 3+ groups, comparing the medians.
Kruskal-Wallis test
44
Which test should you use? Continuous data, paired, parametric, 3+ groups, comparing the means.
repeated-measures ANOVA
45
Which test should you use? Continuous data, paired, parametric, 2 groups, comparing the means.
paired t-test
46
Which test should you use? Continuous data, paired, non-parametric, 2 groups, comparing the medians.
Wilcoxon signed-rank test
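The "Which test should you use?" cards (34 to 46) amount to a lookup table. The sketch below encodes exactly the combinations the cards cover and raises an error for anything else; the function name `choose_test` and its parameters are hypothetical, and card 38 is assumed to refer to independent groups.

```python
CATEGORICAL_TESTS = {
    # (paired, small_sample): test
    (False, True): "Fisher's exact test",
    (False, False): "chi-square test",
    (True, False): "McNemar's test",
    (True, True): "McNemar's exact test",
}

CONTINUOUS_TESTS = {
    # (paired, parametric, three_or_more_groups): test
    (False, True, False): "t-test",
    (False, True, True): "ANOVA",
    (False, False, False): "Mann-Whitney U test",
    (False, False, True): "Kruskal-Wallis test",
    (True, True, False): "paired t-test",
    (True, True, True): "repeated-measures ANOVA",
    (True, False, False): "Wilcoxon signed-rank test",
}

CORRELATION_TESTS = {True: "Pearson correlation", False: "Spearman correlation"}


def choose_test(data_type, paired=False, parametric=True, groups=2,
                small_sample=False, correlation=False):
    """Return the test named on the flashcards for a given situation."""
    if data_type == "categorical":
        return CATEGORICAL_TESTS[(paired, small_sample)]
    if data_type != "continuous":
        raise ValueError("data_type must be 'categorical' or 'continuous'")
    if correlation:
        return CORRELATION_TESTS[parametric]
    key = (paired, parametric, groups >= 3)
    if key not in CONTINUOUS_TESTS:
        raise ValueError("combination not covered by the cards")
    return CONTINUOUS_TESTS[key]


print(choose_test("categorical", paired=False, small_sample=True))
print(choose_test("continuous", paired=True, parametric=False, groups=2))
```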
47
What is a Kaplan-Meier curve?
A graph that shows the probability of survival over time. Ex: how long until the prolapse recurs? Or how long until the patient stops the pessary?
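The curve is built from the product-limit estimate: at each event time, the running survival probability is multiplied by (1 - events / number still at risk). The sketch below is a minimal pure-Python version; the function name and the follow-up data (months until a patient stops using a pessary) are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each participant
    events: True if the event was observed, False if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group everyone who shares this follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored participants leave the risk set.
        at_risk -= n_leaving
    return curve


# Hypothetical data: months until the patient stops the pessary.
months = [2, 3, 3, 5, 8]
stopped = [True, True, False, True, False]  # False = censored (still using it)
curve = kaplan_meier(months, stopped)
```

For these made-up data the estimated probability of still using the pessary is about 0.8 after month 2, 0.6 after month 3, and 0.3 after month 5. Censored participants never drop the curve; they only shrink the number at risk.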
48
What test is used to compare Kaplan-Meier curves?
log-rank test
49
Let's talk regressions. When do you use a linear regression, and how do you report the result?
  1. continuous outcome
  2. mean difference
50
Let's talk regressions. When do you use a logistic regression, and how do you report the result?
  1. independent categorical outcomes
  2. odds ratio
51
Let's talk regressions. When do you use a log-binomial regression, and how do you report the result?
  1. independent categorical outcomes
  2. risk ratio
52
Let's talk regressions. When do you use a Poisson regression, and how do you report the result?
  1. count data
  2. rate ratio
53
Let's talk regressions. When do you use a Cox regression, and how do you report the result?
  1. time-to-event data
  2. hazard ratio
54
What is a type 1 error?
alpha: the probability of rejecting a null hypothesis that is true
55
What is a type 2 error?
beta: the probability of failing to reject a null hypothesis that is false
56
What type of error is a false positive?
alpha (type 1 error)
rejecting the null when it is true; saying that there is a difference when there is none
57
What type of error is a false negative?
beta (type 2 error)
failing to reject the null hypothesis when it is false; saying that there is no difference when there is one
58
Explain the 95% confidence interval
If a study were repeated 100 times and a 95% CI were calculated each time, about 95 of those intervals would be expected to contain the true value.
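This repeated-sampling interpretation can be checked with a small simulation. The sketch assumes normally distributed data with a known standard deviation (so a z-based interval applies); the function name, parameter values, and seed are arbitrary choices for illustration.

```python
import random
from statistics import NormalDist


def ci_coverage(true_mean=10.0, sigma=2.0, n=30, repeats=2000, seed=0):
    """Fraction of simulated 95% CIs for the mean that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)      # about 1.96
    half_width = z * sigma / n ** 0.5    # known-sigma z interval
    hits = 0
    for _ in range(repeats):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        # The interval contains the true mean if the sample mean is within
        # one half-width of it.
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / repeats


coverage = ci_coverage()
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95, though not exactly on it, because the simulation itself is subject to random error.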
59
What is statistical power?
1 - beta, i.e., 1 minus the probability of committing a type 2 error (failing to reject the null when the null is false).
Typically 0.80, which corresponds to a beta of 0.20.
60
What variables do you need to calculate the sample size?
-power
-alpha
-difference you want to see
-buffer for 20% loss of participants: desired n/0.80
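As a worked example, the sketch below uses the usual normal-approximation formula for comparing two means, n per group = 2 * ((z_alpha + z_beta) * SD / difference)^2, and then applies the 20% loss buffer. The standard deviation is an extra input that the card does not list, and this formula is only one of several depending on the outcome type, so treat both as assumptions. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(alpha, power, difference, sd):
    """Normal-approximation sample size per group for comparing two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)


def inflate_for_loss(n, expected_loss=0.20):
    """Buffer for loss of participants: desired n / (1 - expected loss)."""
    return ceil(n / (1 - expected_loss))


# Hypothetical inputs: alpha 0.05, power 0.80, detect a 0.5-unit difference, SD 1.
n_needed = n_per_group(alpha=0.05, power=0.80, difference=0.5, sd=1.0)
n_to_enroll = inflate_for_loss(n_needed)
print(n_needed, n_to_enroll)
```

With these inputs the calculation gives 63 per group before the buffer and 79 per group after it.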
61
Define sensitivity.
probability of testing positive, given that they have the disease
true positives/total population with disease
62
define specificity
probability of testing negative, given that they do not have the disease
true negatives/total population without the disease
63
define PPV
probability of having the disease if they test positive
true positives/all positive test results
affected by prevalence of disease
64
define NPV
probability of NOT having the disease if they test negative
true negatives/all negative test results
affected by prevalence of disease
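The four definitions on cards 61 to 64 map directly onto the cells of a 2x2 table of test result versus disease status. The function name and the counts below are hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Test characteristics from true/false positives and negatives."""
    return {
        "sensitivity": tp / (tp + fn),  # positives among those with disease
        "specificity": tn / (tn + fp),  # negatives among those without disease
        "ppv": tp / (tp + fp),          # diseased among positive results
        "npv": tn / (tn + fn),          # non-diseased among negative results
    }


# Hypothetical counts: 100 people with the disease, 900 without.
metrics = diagnostic_metrics(tp=90, fp=45, fn=10, tn=855)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```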
65
If the prevalence of a disease increases, what happens to the following values? Sensitivity, specificity, PPV, NPV.
sensitivity and specificity stay the same
PPV increases
NPV decreases
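The effect of prevalence can be seen by holding sensitivity and specificity fixed and recomputing the predictive values with Bayes' rule. The function name and the numbers (a test with 90% sensitivity and 95% specificity, at 1% and 20% prevalence) are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same test, two different populations.
ppv_low, npv_low = predictive_values(0.90, 0.95, prevalence=0.01)
ppv_high, npv_high = predictive_values(0.90, 0.95, prevalence=0.20)

print(f"prevalence 1%:  PPV {ppv_low:.2f}, NPV {npv_low:.3f}")
print(f"prevalence 20%: PPV {ppv_high:.2f}, NPV {npv_high:.3f}")
```

With these inputs the PPV rises from roughly 0.15 to roughly 0.82 as prevalence goes from 1% to 20%, while the NPV falls slightly.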
66
Define positive likelihood ratio.
likelihood of a positive result in someone with the disease vs someone who does not have the disease
sensitivity/(1-specificity)
combines sensitivity and specificity
the higher the positive likelihood ratio, the better.
67
Define negative likelihood ratio.
likelihood of a negative result in someone with the disease vs someone who does not have the disease
(1-sensitivity)/specificity
the closer the result is to 0, the better.
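Both likelihood ratios follow from sensitivity and specificity alone: the positive ratio is sensitivity / (1 - specificity) and the negative ratio is (1 - sensitivity) / specificity. The function name and the example test characteristics are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) for a diagnostic test."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical test: 90% sensitivity, 95% specificity.
lr_pos, lr_neg = likelihood_ratios(0.90, 0.95)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")
```

For this made-up test the positive likelihood ratio is about 18 and the negative likelihood ratio is about 0.105.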
68
What would be considered a high positive likelihood ratio?
>10
69
What would be considered a low negative likelihood ratio?
<0.1
70