Prev Med 2 Flashcards
sample vs pop + ex
subset of pop (pts w/ high bp managed by fam med dr in Montgomery County; women 21-50yo in NRV Free Clinic in Christiansburg) vs entire ppl (all pts w/ high bp in VA; all adult women)
sample types: simple random vs systematic sampling vs stratified vs cluster vs convenience vs quota vs theoretical
q member of pop has equal chance of being selected vs samples = selected over fixed pattern or time interval vs when known diff exists, samples = taken from each category that are proportional to their vol in total pop vs used when pop is known to be relatively unvarying vs selects most readily avail members of pop vs takes %age of pop vs selecting cases for their potential representation of a theoretical construct (incidents, slices of life, time period, or people)
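A minimal sketch of three of the sampling types above (simple random, systematic, stratified), using only the Python standard library. The function names, the seed, and the toy population of patient IDs are hypothetical, chosen only for illustration.

```python
import random

def simple_random(pop, k, seed=0):
    """Every member has an equal chance of being selected."""
    return random.Random(seed).sample(pop, k)

def systematic(pop, step, start=0):
    """Select members on a fixed pattern: every `step`-th member."""
    return pop[start::step]

def stratified(pop, key, fraction, seed=0):
    """Sample each category in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = {}
    for member in pop:
        strata.setdefault(key(member), []).append(member)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: patient IDs 0-99, where IDs below 30 form one stratum.
population = list(range(100))
srs = simple_random(population, 10)
every_tenth = systematic(population, 10)
strat = stratified(population, key=lambda pid: pid < 30, fraction=0.1)
```

With a 10% fraction, the stratified sample takes 3 members from the 30-member stratum and 7 from the 70-member stratum, so the sample keeps the population's proportions.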
why use sample?
time limitations, cost of study
representative sample
how sample = selected, how recruitment for participation = done, how ppl = retained and provide clues to whether sample represents pop of interest
variables: quantitative vs qualitative
determine relationship b/w IV and DV in pop; can be descriptive (subjects measured once, est associations b/w variables) or experimental (subjects measured before & after tx) vs in-depth understanding of human behavior and reasons governing human behavior, relies on reasons behind various aspects of behavior; used in social sciences
random variable
anything capable of being measured; data, observations, measurements, indicators, characteristics
data types: categorical vs continuous
nominal (counted data, no scale or rank; ex: blood groups A/B/O/AB or eye color w/ numbered assignment like 1 = blue, 2 = brown, 3 = green); ordinal (order data if 2+ categories, ranked; ex: after tx pt can improve/stay same/become worse, or severity of illness can be minor/moderate/major); dichotomous/binary (counts in whole numbers, no decimals, implies which direction = favorable; ex: nml/abnml, well/sick, living/dead) vs measurements of any value along continuous scale, yes decimals; ex: baby birth wt, length of time to return lab test result, pt height, temp
descriptive statistics
simple graphical numerical techniques to summarize info about data; involves pop, sample, statistical inference (how info in sample can draw conclusions about pop), point estimate (single value used to estimate pop parameter), measures of center, measures of dispersion/spread
measures of center vs measures of dispersion/spread
represents location of data; mean, median, mode vs represents variability of data ie. spread of distribution; variance (measures variability in sample by mean of squared deviations), standard dev (measures variability away from its mean), range (diff b/w max and min value), interquartile range (diff b/w 25th and 75th percentiles; for skewed distribution)
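A short sketch of the measures above, using Python's `statistics` module. The BP readings are made-up values, not lecture data. Note one assumption: `statistics.variance` and `statistics.stdev` are the sample versions, which divide the squared deviations by n - 1 rather than n.

```python
import statistics

# Hypothetical systolic BP readings (mmHg); 170 acts as an outlier.
bp = [118, 122, 125, 130, 135, 140, 170]

# Measures of center
mean = statistics.mean(bp)
median = statistics.median(bp)

# Measures of dispersion/spread
sample_var = statistics.variance(bp)        # squared deviations / (n - 1)
sample_sd = statistics.stdev(bp)            # square root of the variance
data_range = max(bp) - min(bp)              # max - min
q1, q2, q3 = statistics.quantiles(bp, n=4)  # quartiles, default "exclusive" method
iqr = q3 - q1                               # 75th percentile - 25th percentile
```

Here the outlier pulls the mean (about 134.3) above the median (130), while the IQR ignores it, which is why the IQR is preferred for skewed distributions.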
know how to calculate standard dev w/ 68, 95, 99% confidence interval
yep
standard error vs confidence level vs margin of error
measure of variation by estimating probable error of sample mean to pop mean; ID sampling error in sampling process; standard dev/sqrt(sample size) vs you’re 95% confident that true mean in a pop is b/w [mean - 2 standard errors] and [mean + 2 standard errors] vs values above and below sample statistic in a confidence interval
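The standard error and the "mean ± 2 standard errors" interval above can be sketched as follows. The helper name `mean_ci` and the sample values are hypothetical; the multiplier 2 follows the card (1.96 is the more exact value for a 95% interval under a normal approximation).

```python
import math
import statistics

def mean_ci(values, z=2.0):
    """Return (mean, standard error, lower bound, upper bound)."""
    n = len(values)
    m = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # SD / sqrt(sample size)
    margin = z * se                               # margin of error
    return m, se, m - margin, m + margin

# Hypothetical sample of 8 measurements.
m, se, lower, upper = mean_ci([2, 4, 4, 4, 5, 5, 7, 9])
```

For this sample the mean is 5 and the standard error is about 0.76, so the interval runs from roughly 3.5 to 6.5.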
nml distribution
continuous distribution, unimodal & symmetric (mean/median/mode = same value), empirical rule (68% of data w/in 1 SD of mean, 95% of data w/in 2 SD of mean, 99.7% of data w/in 3 SD of mean)
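As a quick check of the empirical rule, the share of a normal distribution within k SD of the mean can be computed with `statistics.NormalDist`. The helper name `within_k_sd` is hypothetical.

```python
from statistics import NormalDist

def within_k_sd(k):
    """Share of a normal distribution lying within k SD of the mean."""
    standard_normal = NormalDist()  # mean 0, SD 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: within_k_sd(k) for k in (1, 2, 3)}
```

The exact shares come out to about 68.3%, 95.4%, and 99.7% for 1, 2, and 3 SD.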
know what L and R skewed distributions look like and where mean/median/mode goes
Lecture 10, slide 8&9. also unimodal (b/c one hump)
general def of hypothesis testing
sci inquiry into connection b/w cause and effect, sci method to test validity of claim about pop being studied, hypothesis statement specifies characteristics/parameters of process like location or spread
goal of hypothesis testing
to see if there is sufficient statistical evidence to reject null hypothesis and accept alt hypothesis
5 steps of hypothesis testing
- develop null and alt hypotheses
- est appropriate alpha lvl
- perform test of statistical significance on collected data
- compare p-value from test w/ alpha
- conclude result (reject null when p < alpha, fail to reject null when p ≥ alpha)
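The five steps above can be sketched with a two-sided one-sample z-test. This is an assumption made for illustration: the cards do not name a specific test here, and a z-test requires a known pop SD. The function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_decision(sample_mean, null_mean, pop_sd, n, alpha=0.05):
    """Two-sided one-sample z-test; returns (z, p-value, decision)."""
    # Step 1 (hypotheses): null says pop mean = null_mean, alt says it differs.
    # Step 2 (alpha lvl): passed in as `alpha`.
    # Step 3 (test of significance): z statistic and its two-sided p-value.
    se = pop_sd / math.sqrt(n)
    z = (sample_mean - null_mean) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    # Steps 4-5 (compare p with alpha, conclude).
    decision = "reject null" if p < alpha else "fail to reject null"
    return z, p, decision

# Hypothetical example: null mean BP of 120, pop SD 20, sample of 25 pts.
z_a, p_a, decision_a = z_test_decision(130, 120, 20, 25)
z_b, p_b, decision_b = z_test_decision(124, 120, 20, 25)
```

A sample mean of 130 gives z = 2.5 and p of about 0.012, so the null is rejected at alpha = 0.05; a sample mean of 124 gives z = 1 and p of about 0.32, so the null is not rejected.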
know how to set up null and alt hypotheses
Lecture 10, slide 16-18
alpha lvl vs p value
highest risk of making false pos error that PI is willing to accept (if alpha = 0.05 –> PI accepts 5% risk of being in error when rejecting null) vs probability of obtaining values as extreme or more extreme than observed test statistic, given null is true; probability of obtaining the result by chance alone
what to think about for statistical vs clinical significance
when p < alpha –> reject null and favor alt; but doesn’t mean it’s worth continuing study or that it’s clinically significant vs does benefit > risk for your pt? is info relevant to your pt? is there practical importance for tx effects?
factors affecting sample size
what size effect are you looking for? (takes larger sample size to find small effect than big effect); what’s the significance lvl? (smaller alpha lvl –> you need larger sample size); how much variability? (if you expect lots of variation, you need larger sample size); how much power? (if you want more power to detect a statistically significant effect, you need larger sample size)
factors affecting power
effect size (larger effect size –> better chances of finding it using same resources), significance lvl (larger alpha –> power inc, but type I error risk also inc; power = 1-beta), sample size (larger sample size –> power inc), pop SD (more variable data –> larger sample size needed or else power dec)
what does Power answer?
if null = false, what’s the probability that data from experiment will reject null? ie. finding a significant effect when one does exist
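To see how effect size, alpha, sample size, and pop SD move power, here is a rough sketch for a one-sided one-sample z-test with known pop SD. That test is an assumption chosen because its power has a simple closed form; the function name and numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_power(effect, pop_sd, n, alpha=0.05):
    """Power of a one-sided one-sample z-test: P(reject null | true shift = effect)."""
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha)  # rejection cutoff under the null
    shift = effect * math.sqrt(n) / pop_sd       # true effect in standard-error units
    return 1 - standard_normal.cdf(z_crit - shift)

base = z_test_power(effect=5, pop_sd=20, n=25)
bigger_sample = z_test_power(effect=5, pop_sd=20, n=100)
bigger_effect = z_test_power(effect=10, pop_sd=20, n=25)
more_variable = z_test_power(effect=5, pop_sd=40, n=25)
looser_alpha = z_test_power(effect=5, pop_sd=20, n=25, alpha=0.10)
```

Relative to `base`, power rises with a larger sample, a larger effect, or a larger alpha, and falls when the data are more variable.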
degrees of freedom
# of indep pieces of info that went into calculating estimate; # of values that are free to vary in a data set
type I vs II errors
alpha error, false pos (you said there was a sig diff but there actually wasn’t –> you take the “risk” when erroneously rejecting null) vs beta error, false neg
what are the 4 questions to determine which stat test to use?
what’s the type of data? are observations indep or dep? what’s the data distribution (any normality)? how many variables are investigated (2 variables => bivariate analysis, >2 variables => multivariable analysis)?
IV vs DV
represents quant manipulated in experiment; exposure, risk factor, predictor/regressor/explanatory vs quantity depends on IV that’s being manipulated; outcome, target, criterion
ex of parametric vs nonparametric tests
compares means; unpaired/student’s t-test, paired t-test, ANOVA, Pearson coeff vs compares medians; wilcoxon rank sum/MannWhitney U test, wilcoxon signed rank test, Kruskal-Wallis test, Spearman coeff, chi square
when to use parametric vs nonparametric test?
when collected data uses ratio or interval scale, nml distribution vs when nothing is known about parameters of variable, small sample size, nominal or ordinal data, doesn’t rely on mean or SD, non nml distribution (bimodal, outliers)
student t-test
look at diff in means b/w 2 groups
wilcoxon rank sum test
order groups from low to high and rank them, then compare medians to see if one pop = larger than the other
paired t-test vs wilcoxon signed rank test
look at diff in means w/in same group (before/after), when data = nmlly distributed vs look at diff in medians w/in same group, when data = not nmlly distributed
ANOVA vs Kruskal Wallis test
look at diff in means b/w 2+ groups, good when IV = categorical (groups) and DV = continuous; but need further (post hoc) testing to see which specific groups were significantly diff vs look at diff b/w 2+ groups when data = not nmlly distributed
chi square test vs fisher’s exact test
uses 2x2 table, compares proportions for tx by comparing actual (observed) counts to expected counts vs uses 2x2 table for small sample size (expected count <5 in any cell)
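A minimal sketch of the chi square calculation on a 2x2 table, written with the standard library only. It uses the plain Pearson statistic with no continuity correction (an assumption; software often applies Yates' correction by default), and the p-value shortcut via `math.erfc` is valid only for 1 degree of freedom, which is the 2x2 case. The function name and counts are hypothetical.

```python
import math

def chi_square_2x2(table):
    """Return (expected counts, chi-square statistic, p-value, small-cell flag)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    # Expected count for each cell = row total * column total / grand total.
    expected = [[r * col / n for col in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with df = 1
    small_cells = any(exp < 5 for row in expected for exp in row)
    return expected, chi2, p, small_cells

# Hypothetical counts: rows = tx / no tx, columns = improved / not improved.
expected, chi2, p, small_cells = chi_square_2x2([[20, 10], [10, 20]])
```

When `small_cells` is true, the table is a candidate for Fisher's exact test instead.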
pearson coeff vs spearman coeff
looks for corr/similarity b/w 2 variables, for nmlly distributed continuous data vs looks for corr/similarity b/w 2 variables, for non nmlly distributed continuous + ordinal data, for outlier in data
r = -1 vs 1 meaning in corr
sign = direction of corr (pos or neg corr), number = magnitude/strength of association (strong or weak)
corr coeff values representing weak vs moderate vs strong corr
0-0.3 vs 0.31-0.7 vs >0.7
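A small sketch that computes a Pearson coeff by hand and labels it with the direction and strength cutoffs from the cards above. The function names and data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe_corr(r):
    """Sign gives direction; absolute value gives strength."""
    size = abs(r)
    strength = "weak" if size <= 0.3 else "moderate" if size <= 0.7 else "strong"
    direction = "pos" if r > 0 else "neg" if r < 0 else "none"
    return direction, strength

r = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
label = describe_corr(r)
```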
confounding variable
2 variables may be strongly correlated but both can actually be caused by 3rd variable => confounding variable
corr vs regression
measures association b/w 2 variables w/o assuming causal relationship or dependence vs assumes 1 variable = dep on the other
linear regression: y = mx + b
m = slope, regression coeff
x = IV
b = intercept, where line cuts y-axis
y = DV
coefficient of determination (r^2)
in corr; how much variability in DV is explained by variability in IV
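The two cards above (y = mx + b and r^2) can be sketched with an ordinary least squares fit written out by hand. The function name `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares fit of y = m*x + b; returns (m, b, r_squared)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx            # slope (regression coeff)
    b = mean_y - m * mean_x  # intercept, where the line cuts the y-axis
    # r^2 = share of variability in the DV explained by the IV.
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return m, b, 1 - ss_res / ss_tot

# Hypothetical IV (x) and DV (y) values.
m, b, r_squared = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

For these points the fit is y = 0.6x + 2.2 with r^2 = 0.6, so 60% of the variability in the DV is explained by the IV.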
ordinary least squares regression vs logistic regression
used for continuous variables like height; simple (1 variable) and multiple (mult variables) regression vs used for binary variables like pass/fail, work/no work; most common in papers b/c they’re testing if tx works/doesn’t work; used for case control studies –> based on log of odds
how to talk about controversial topics to pts?
provide patients with risks and benefits and guide them, present info that pt can understand, respect the patient’s choice, don’t talk down to the patient
IRB levels of review: exempt vs expedited vs full review
de-ID data, anonymous surveys vs collection of biospecimens by noninvasive means, identifiable data vs interventions involving physical and emotional discomfort or sensitive data
p value = influenced by? low p value means?
sample size. large effect size –> result has major/clinical/practical importance
how to double dip funding?
same or overlapping topic funded by diff agencies, proposing hypotheses already resolved, grant apps shared by other PIs
odds ratio = 1 vs >1 vs <1
odds of case being exposed is same as odds of control being exposed. No relationship between exposure and disease vs odds of case being exposed is greater than odds of control being exposed. Increase in disease with exposure vs odds of case being exposed is less than odds of control being exposed. Decrease in disease with exposure
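A short sketch of the odds ratio from a case control 2x2 table, with the three interpretations above. The function names and counts are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    # Cross-product form; equal to
    # (exposed_cases / unexposed_cases) / (exposed_controls / unexposed_controls).
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

def interpret(or_value):
    if or_value > 1:
        return "increase in disease with exposure"
    if or_value < 1:
        return "decrease in disease with exposure"
    return "no relationship between exposure and disease"

# Hypothetical study: 30 of 100 cases exposed, 10 of 100 controls exposed.
or_value = odds_ratio(30, 70, 10, 90)
meaning = interpret(or_value)
```

Here the odds ratio is 2700/700, about 3.86, so cases had higher odds of exposure than controls.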
variance and SD = greatly affected by?
outliers
phase I vs II vs III vs IV of clinical trials
safety, drug interaxn, ADME/pharmacokinetics vs short term side effects, efficacy vs short term side effects, efficacy, new formulation vs ongoing surveillance, new indications
ex of conflict of interest in research
acting as promotional speakers on behalf of companies, accepting gifts, research involving your pts, research limiting right to publish, giving incentives to participate in research, avoiding “bias” when researching your clinical approaches
Madsen vs Offit vs Smeeth vs Licciardone
retrospective cohort using Danish registry –> no assoc vs systematic review –> can’t dis/prove but can only educate parents that vax = good vs case control –> no assoc; odds ratio depended on age joining GPRD before or after 1st bday vs dec tx for LBP did not change for 1st 6mo, then White pts had more LBP improvement (statistically significant but not clinically relevant) than Black pts