Stats Flashcards by R R

Name 3 types of descriptive studies.

case report/series
ecological study-examines rates of disease on a population level
cross-sectional study-looks at exposure and disease at the same time- like a survey

what is equipoise?

in an RCT, you truly don’t know if your intervention is better than the standard of care, but you feel confident that withholding the intervention will not bring harm

What is an intention-to-treat analysis?

include all participants in the analysis according to the group they were originally assigned to, regardless of the treatment they actually received.

what is an efficacy or per protocol analysis?

only include participants who complied with their assigned treatment protocol

What is an as treated analysis?

analyze participant data based on the treatment they actually received.

what are the pros and cons of retrospective vs prospective cohort study?

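retrospective: cheaper and faster, but the quality of the data is limited to what was already recorded
prospective: more expensive and slower, but you control what data is collected and how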

What are some strengths of cohort studies?

efficient for rare exposures
clear temporal sequence between exposure and disease
good information on exposures, confounders
study the effect of exposure on multiple outcomes

What are some limitations of cohort studies?

-not good for rare diseases or those with long latency
-not good for exposures that are expensive to determine
-require large populations with long f/u time
-loss to follow up
-expensive and time consuming

how do you design a case-control study?

start from the source population and identify people who have the disease you want to study (cases) and people who do not (controls), then compare the odds of exposure between cases and controls.

individuals are chosen based on their outcome status and then exposure status is assessed

what are the benefits of case-control over a cohort study?

smaller, more efficient, with shorter follow up compared to cohort studies

Case-controls are great for when…

-exposure data is expensive or difficult to obtain
-long latent period
-disease is rare
-population is difficult to follow
-little is known about the disease
-want to evaluate many exposures

What is the definition of a control in a case control study?

a sample from the source population that produces the cases

so NOT just everyone who doesn’t have the disease

What is the purpose of the control group in a case-control study?

estimate exposure distribution in the source population that gave rise to cases.

What are some limitations of case-control studies?

-often limited to studying a single outcome
-inefficient for rare exposures
-more opportunity for bias
-temporal sequence between exposure and outcome can be hard to establish
-cannot directly calculate absolute measures of association (e.g. risk difference)

Define cumulative incidence.

Proportion of population at risk that develops the disease or outcome over a specified time period

example: number of new cases during the time period / the total population at risk at the start of the time period

risk of cervical cancer in 5 years is the number of new cases of cervical cancer in 5 years divided by the number of people with a cervix, but not cancer, in the population at the start of those 5 years.
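
The cumulative incidence formula above can be sketched in a few lines of Python. The counts are hypothetical and only illustrate the arithmetic; the function name is an assumption, not something from the cards.

```python
def cumulative_incidence(new_cases, at_risk_at_start):
    """Proportion of the at-risk population that develops the outcome."""
    return new_cases / at_risk_at_start

# Hypothetical numbers: 50 new cases over 5 years among 10,000 people
# who were at risk (had a cervix, no cancer) at the start of the period.
risk_5yr = cumulative_incidence(new_cases=50, at_risk_at_start=10_000)
print(f"5-year risk: {risk_5yr:.3%}")
```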

define prevalence.

number of ppl with the disease divided by the entire population

ex: number of ppl with cervical cancer divided by the total population (not just people with a cervix)

Define incidence rate.

number of new cases of disease during the time period divided by the total person-time observation in the population at risk

example- one person is followed for 3 years and another is followed for 4 years; that is 7 person-years
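
As a quick check of the person-time arithmetic, here is a small sketch. The follow-up times come from the example above; the single new case is a hypothetical value added to make the rate computable.

```python
# Follow-up times (years) for the two people in the example.
follow_up_years = [3, 4]
person_years = sum(follow_up_years)

# Hypothetical: assume one new case was observed during follow-up.
new_cases = 1
rate = new_cases / person_years

print(f"{person_years} person-years, {rate:.3f} cases per person-year")
```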

how do you calculate an absolute risk difference?

risk difference is the cumulative incidence in the exposed group minus the cumulative incidence in the unexposed group

how do you calculate an absolute rate difference?

Rate difference is the incidence rate in the exposed group minus the incidence rate in the unexposed group

how do you calculate risk ratio AKA relative risk?

cumulative incidence in the exposed group divided by the cumulative incidence in the unexposed group
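
A minimal sketch of the risk ratio, with the absolute risk difference alongside for comparison. The cohort counts are hypothetical.

```python
# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed
# people develop the disease during follow-up.
risk_exposed = 30 / 100      # cumulative incidence, exposed
risk_unexposed = 10 / 100    # cumulative incidence, unexposed

risk_ratio = risk_exposed / risk_unexposed
risk_difference = risk_exposed - risk_unexposed

print(f"risk ratio = {risk_ratio:.2f}")
print(f"risk difference = {risk_difference:.2f}")
```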

how to calculate an Odds ratio?

numerator: the number of cases in the exposed group multiplied by the number of controls in the UNexposed group

denominator: the number of cases in the UNexposed group multiplied by the number of controls in the exposed group
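
The numerator and denominator above are the cross-product of a 2x2 table. A short sketch with hypothetical case-control counts:

```python
def odds_ratio(exposed_cases, exposed_controls,
               unexposed_cases, unexposed_controls):
    """Cross-product odds ratio from a 2x2 case-control table."""
    numerator = exposed_cases * unexposed_controls
    denominator = unexposed_cases * exposed_controls
    return numerator / denominator

# Hypothetical counts, chosen only to illustrate the arithmetic.
result = odds_ratio(exposed_cases=20, exposed_controls=80,
                    unexposed_cases=10, unexposed_controls=90)
print(f"odds ratio = {result:.2f}")
```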

define the odds of an event

Probability that an event will occur divided by the probability that it will not occur
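
A small sketch converting between probability and odds, assuming the usual definition odds = p / (1 - p). The function names are illustrative.

```python
def odds_from_probability(p):
    """Odds = P(event) / P(no event)."""
    return p / (1 - p)

def probability_from_odds(odds):
    """Inverse conversion: p = odds / (1 + odds)."""
    return odds / (1 + odds)

odds = odds_from_probability(0.75)   # a 75% probability is odds of 3 to 1
back = probability_from_odds(odds)
print(odds, back)
```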

Name two categories of error in epidemiologic research

random error (p value)
systematic error (bias, confounding)

Name the three necessary criteria for a variable to be a confounder.

must be an independent predictor of the outcome, like a risk factor for the disease
must be associated with exposure
cannot be caused by the exposure

name three ways to reduce confounders during study DESIGN

1. randomization
2. restrict confounders through our exclusion criteria
3. match confounders in study groups

Name 3 ways to address confounders during study ANALYSIS.

-standardization- "among women over 65..."
-stratification and pooling- split the data into groups (strata), then mathematically pool the stratum-specific estimates into one
-modeling- multivariable statistical models
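
To make "stratify and pool" concrete, here is a sketch using the Mantel-Haenszel pooled risk ratio. The cards do not name a pooling method, so choosing Mantel-Haenszel is an assumption, and the counts are hypothetical. In this made-up data the crude risk ratio suggests an effect, but within each stratum of the confounder there is none.

```python
# Each stratum: (exposed cases, exposed total, unexposed cases, unexposed total)
strata = [
    (10, 100, 5, 50),    # stratum 1: risk 0.10 in both groups
    (20, 50, 40, 100),   # stratum 2: risk 0.40 in both groups
]

def crude_risk_ratio(strata):
    """Risk ratio ignoring the stratifying variable."""
    a = sum(s[0] for s in strata); n1 = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata); n0 = sum(s[3] for s in strata)
    return (a / n1) / (c / n0)

def mantel_haenszel_risk_ratio(strata):
    """Pooled risk ratio: sum(a*n0/N) / sum(c*n1/N) over strata."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return num / den

crude = crude_risk_ratio(strata)
pooled = mantel_haenszel_risk_ratio(strata)
print(f"crude RR = {crude:.2f}, pooled RR = {pooled:.2f}")
```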

What is internal validity?

is the study free of bias

What is external validity?

generalizability

name two ways to analyze data from an RCT?

intent to treat
per protocol

describe categorical data
describe ordinal data
describe nominal data

categorical data- data fits into specific categories
ordinal data- categories that have a specific order to them (mild, mod, severe)
nominal data- there is no rank to the categories

where is the tail on positively skewed data?
where is the tail on negatively skewed data?

positive- tail is to the right
negative- tail is to the left

define mean, median, mode

mean- average
median- value in the middle of the ordered data set
mode- value that appears most often in a data set
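
The three definitions can be checked with Python's standard `statistics` module. The data set is hypothetical; it has one large value so the mean is pulled above the median, as happens with a right (positive) skew.

```python
from statistics import mean, median, mode

# Hypothetical data with one large value in the right tail.
data = [1, 2, 2, 3, 12]

avg = mean(data)     # sum of values divided by the count
mid = median(data)   # middle value of the sorted data
top = mode(data)     # most frequent value

print(avg, mid, top)
```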

how do you present parametric vs non-parametric data?

parametric data is normally distributed, so you can use the mean and the standard deviation
non-parametric data is not normal or is skewed, so use the median and interquartile range

Which test should you use?
categorical data
independent data
small sample size

Fisher's exact test
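
For intuition, a two-sided Fisher's exact test on a 2x2 table can be written directly from the hypergeometric distribution. This is a sketch for small tables with hypothetical counts; in practice a statistics package would normally be used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small tolerance so ties with the observed table are included.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Hypothetical small table: 3 of 4 events in one group vs 1 of 4 in the other.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```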

Which test should you use?
categorical data
independent data
large sample size

chi-square
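
A sketch of the chi-square test for a 2x2 table, without continuity correction. The counts are hypothetical. For 1 degree of freedom the p-value can be computed with `math.erfc`, which avoids needing a statistics package.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 df, no correction)."""
    n = a + b + c + d
    observed = [a, b, c, d]
    # Expected count = row total * column total / grand total.
    expected = [
        (a + b) * (a + c) / n, (a + b) * (b + d) / n,
        (c + d) * (a + c) / n, (c + d) * (b + d) / n,
    ]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p

# Hypothetical: 30/100 events in one group vs 10/100 in the other.
stat, p = chi_square_2x2(30, 70, 10, 90)
print(f"chi-square = {stat:.2f}, p = {p:.5f}")
```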

Which test should you use?
categorical data
paired data- like pre and post intervention data
large sample size

McNemar's

Which test should you use?
categorical data
paired data- like pre and post intervention data
small sample size

McNemar's exact test

Which test should you use?
continuous data
parametric
3 or more groups

ANOVA

Which test should you use?
continuous data
independent
parametric
2 groups
comparing the means to see if there is a difference

t-test

Which test should you use?
continuous data
independent
parametric
2 variables
assessing for a correlation between the two variables

Pearson correlation

Which test should you use?
continuous data
independent
non-parametric
2 groups
comparing the medians

Mann-Whitney U test AKA Wilcoxon rank sum test

Which test should you use?
continuous data
independent
non-parametric
2 variables
assessing for a correlation

Spearman correlation

Which test should you use?
continuous data
independent
non-parametric
3+ groups
comparing the medians

Kruskal-Wallis test

Which test should you use?
continuous data
paired
parametric
3+ groups
comparing the means

repeated measures ANOVA

Which test should you use?
continuous data
paired
parametric
2 groups
comparing the means

paired t-test

Which test should you use?
continuous data
paired
non-parametric
2 groups
comparing the medians

Wilcoxon signed rank test

What is a kaplan meier curve?

a graph that shows the probability of survival (remaining event-free) over time.
ex: how long until the prolapse recurs? or how long until the patient stops the pessary?
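
The curve itself comes from the product-limit (Kaplan-Meier) estimator: at each event time, the running survival probability is multiplied by the fraction of at-risk patients who did not have the event. A minimal sketch with hypothetical follow-up data, where `1` marks an event (e.g. recurrence) and `0` marks a censored patient:

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each event time."""
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        n_events = 0
        n_leaving = 0
        # Group everyone whose follow-up ends at time t.
        while i < len(pairs) and pairs[i][0] == t:
            n_events += pairs[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set.
        at_risk -= n_leaving
    return curve

# Hypothetical months of follow-up for five patients.
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
for t, s in curve:
    print(f"month {t}: {s:.2f}")
```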

What test is used to compare kaplan meier curves?

log-rank test

Let's talk regressions
When do you use a linear regression and how do you report the result?

1. continuous outcome
2. mean difference

Let's talk regressions
When do you use a logistic regression and how do you report the result?

1. independent categorical (typically binary) outcomes
2. odds ratio

Let's talk regressions
When do you use a log-binomial regression and how do you report the result?

1. independent categorical (typically binary) outcomes
2. risk ratio

Let's talk regressions
When do you use a Poisson regression and how do you report the result?

1. count data
2. rate ratio

Let's talk regressions
When do you use a Cox regression and how do you report the result?

1. time to event data
2. hazard ratio

What is a type 1 error?

alpha
probability of rejecting a null hypothesis that is true

What is a type 2 error?

beta
probability of failing to reject a null hypothesis that is false

What type of error is a false positive?

alpha
rejecting the null when it is true
saying that there is a difference when there is none

What type of error is a false negative?

beta
failing to reject the null hypothesis when it is false
saying that there is no difference when there is one

Explain the 95% confidence interval

if a study were repeated 100 times and a 95% CI calculated each time, about 95 of those intervals would be expected to contain the true value
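
The repeated-study interpretation can be illustrated with a small simulation. This sketch assumes normally distributed data with a known standard deviation (a simplification); the true mean, sample size, and number of simulated studies are all hypothetical.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(0)  # fixed seed so the run is repeatable

TRUE_MEAN, SIGMA, N, STUDIES = 50.0, 10.0, 40, 1000
z = NormalDist().inv_cdf(0.975)       # about 1.96
half_width = z * SIGMA / N ** 0.5     # half-width of each 95% CI

covered = 0
for _ in range(STUDIES):
    sample = [rng.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    m = fmean(sample)
    # Does this study's interval contain the true mean?
    if m - half_width <= TRUE_MEAN <= m + half_width:
        covered += 1

coverage = covered / STUDIES
print(f"{coverage:.1%} of the simulated intervals contain the true mean")
```

The printed proportion should land close to 95%, though any single run will differ slightly by chance.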

What is statistical power?

1-beta
OR 1 minus the probability of committing a type 2 error
OR 1 minus the probability of failing to reject the null when the null is false
typically 0.80 (beta is 0.20)

What variables do you need to calculate the sample size?

power
alpha
difference you want to see
buffer for 20% loss of participants: desired n/0.80
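
A sketch of how these inputs combine, using the standard normal-approximation formula for comparing two means. The choice of formula is an assumption, and it also needs an expected standard deviation, which is not listed on the card. All numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(alpha, power, difference, sd):
    """Per-group n for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)

# Hypothetical: detect a 5-point difference, SD of 10 (assumed input).
n = n_per_group(alpha=0.05, power=0.80, difference=5, sd=10)

# Buffer for 20% loss of participants: desired n / 0.80.
n_enroll = ceil(n / 0.80)

print(f"{n} per group needed, enroll {n_enroll} per group")
```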

Define sensitivity.

probability of testing positive, given that they have the disease
true positives/total population with disease

define specificity

probability of testing negative, given that they do not have the disease
true negatives/total population without the disease

define PPV

probability of having the disease if they test positive
true positives/all positive test results
affected by prevalence of disease

define NPV

probability of NOT having the disease if they test negative
true negatives/all negative test results
affected by prevalence of disease
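
Sensitivity, specificity, PPV, and NPV all come from the same 2x2 table of test result against true disease status. A sketch with hypothetical counts:

```python
# Hypothetical 2x2 table of test results vs true disease status.
tp, fn = 90, 10     # people WITH the disease: test positive / negative
fp, tn = 20, 180    # people WITHOUT the disease: test positive / negative

sensitivity = tp / (tp + fn)   # true positives / all with disease
specificity = tn / (tn + fp)   # true negatives / all without disease
ppv = tp / (tp + fp)           # true positives / all positive results
npv = tn / (tn + fn)           # true negatives / all negative results

print(f"sens={sensitivity:.2f} spec={specificity:.2f} "
      f"PPV={ppv:.2f} NPV={npv:.2f}")
```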

If the prevalence of a disease increases, what happens to the following values?
Sensitivity
specificity
PPV
NPV

sen and spec stay the same
PPV increases
NPV decreases
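
The prevalence effect can be seen by holding sensitivity and specificity fixed and applying Bayes' rule at different prevalences. The test characteristics and prevalences below are hypothetical.

```python
def ppv(sens, spec, prev):
    """P(disease | positive test) via Bayes' rule."""
    true_pos = sens * prev
    false_pos = (1 - spec) * (1 - prev)
    return true_pos / (true_pos + false_pos)

def npv(sens, spec, prev):
    """P(no disease | negative test) via Bayes' rule."""
    true_neg = spec * (1 - prev)
    false_neg = (1 - sens) * prev
    return true_neg / (true_neg + false_neg)

# Same hypothetical test (90% sensitive, 90% specific) at rising prevalence.
for prev in (0.01, 0.10, 0.50):
    print(f"prevalence {prev:.0%}: "
          f"PPV={ppv(0.9, 0.9, prev):.3f} NPV={npv(0.9, 0.9, prev):.3f}")
```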

Define positive likelihood ratio.

likelihood of a positive result in someone with the disease vs someone who does not have the disease
sensitivity/(1-specificity)
combo of sensitivity and specificity
the higher the positive likelihood ratio, the better.

Define negative likelihood ratio.

likelihood of a negative result in someone with the disease vs someone who does not have the disease
(1-sensitivity)/specificity
the closer the result is to 0, the better.

What would be considered a high positive likelihood ratio?

>10

What would be considered a low negative likelihood ratio?

<0.1
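
As a worked check of these thresholds, here is a sketch using the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The test characteristics are hypothetical.

```python
def positive_lr(sens, spec):
    """How much more likely a positive result is with disease than without."""
    return sens / (1 - spec)

def negative_lr(sens, spec):
    """How much more likely a negative result is with disease than without."""
    return (1 - sens) / spec

# Hypothetical test: 95% sensitive, 95% specific.
lr_pos = positive_lr(0.95, 0.95)
lr_neg = negative_lr(0.95, 0.95)

print(f"LR+ = {lr_pos:.1f} (high if > 10)")
print(f"LR- = {lr_neg:.3f} (low if < 0.1)")
```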