Midterm 2 Flashcards
What is the purpose of a t-test?
to compare 2 groups of scores
(does not require a large sample size but it does help)
What are the appropriate types of variables for a t-test?
Independent variable: dichotomous
Dependent variable: continuous
Null hypothesis: no difference between 2 groups being compared
Alternate hypothesis: 2 groups of scores are different
What is the purpose of an UNPAIRED t-test?
when comparing 2 groups of scores, the 2 groups of scores are INDEPENDENT of each other
AKA 2 group t test, 2 sample t test, independent group t test
For an unpaired t test, you are (adding/subtracting) the average score of one group (to/from) the average score of the other group
subtracting
For an unpaired/paired t test, how does t get bigger?
get more participants
What are the degrees of freedom of an unpaired t test?
N (total number of observations of the study) - 2
What are degrees of freedom used for?
t(#) where # = degrees of freedom
It is a statistic that is a shorthand indicating sample size. It tells us the shape of the distribution
If the t test was t(698) what are the degrees of freedom of this example?
df = N - 2
698 = N -2
N = 700
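The unpaired t-test cards above can be sketched in code. This is a minimal illustration that assumes the pooled-variance (Student's) form of the test, which is the form that goes with df = N - 2; the function name and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def unpaired_t(group_a, group_b):
    """Pooled-variance t for two independent groups (assumed form of the test)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2  # N - 2, where N is the total number of observations
    # Pool the two sample variances, weighting each by its own degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    # Subtract one group mean from the other, then scale by the standard error.
    t = (mean(group_a) - mean(group_b)) / std_error
    return t, df

# Hypothetical scores, for illustration only.
drug = [5, 7, 6, 8, 9]
placebo = [4, 5, 5, 6, 5]
t, df = unpaired_t(drug, placebo)
print(f"t({df}) = {t:.2f}")
```

Because the standard error shrinks as the group sizes grow, the same mean difference gives a larger t with more participants.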
How do you get closer to the normal distribution graph with t distribution?
obtain MORE observations
What is the critical value of t?
the smallest absolute value of t needed for the observations to be within the alpha level of statistical significance
unique for each number of degrees of freedom
You would need the t value to be greater than the critical value to be statistically significant
Ex: If 2 tailed t test t(698) = 0.75 and p = 0.57 and the critical value is 1.96, does this meet critical value? Is it significant?
No. 0.75 < 1.96, so t does not reach the critical value
It is not statistically significant because p = 0.57 > 0.05 (two-tailed alpha; 0.025 in each tail)
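A rough sketch of the critical value comparison. It assumes the normal approximation, which is reasonable only for large df such as 698, because the t distribution approaches the normal distribution as observations increase. The variable names are hypothetical.

```python
from statistics import NormalDist

alpha = 0.05
# Two-tailed test: alpha is split across both tails, so use the 97.5th percentile.
# Assumption: normal approximation to the t distribution (large df only).
critical = NormalDist().inv_cdf(1 - alpha / 2)

t_value = 0.75  # from the example t(698) = 0.75
significant = abs(t_value) > critical
print(round(critical, 2), significant)
```

For small samples the critical value of t is larger than 1.96, so a t table or statistics package should be used instead of this approximation.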
What is a paired t-test?
when comparing two groups of scores that are related in pairs (AKA matched groups t test, dependent t test)
scores are paired/linked with one another in some way
ex: for every subject, left eye receives the drug and right eye receives placebo, compare dryness of each eye
Paired t test involves (adding/subtracting) 2 values in a pair and then averaging that difference.
subtracting
For the degrees of freedom of a paired t test what is the formula?
df = N - 1
N (number of pairs, i.e. one difference score per pair)
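A minimal sketch of the paired t-test: subtract within each pair, average the differences, and divide by the standard error of those differences. It assumes N counts pairs; the function name and the dryness scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Paired t from the mean of the within-pair differences."""
    diffs = [a - b for a, b in zip(first, second)]  # subtract within each pair
    n = len(diffs)                                  # number of pairs
    t = mean(diffs) / (stdev(diffs) / sqrt(n))      # mean difference / its standard error
    return t, n - 1                                 # df = N - 1

# Hypothetical dryness scores for the drug eye and placebo eye of 5 subjects.
drug_eye = [3, 4, 2, 5, 4]
placebo_eye = [5, 6, 5, 6, 7]
t_paired, df_paired = paired_t(drug_eye, placebo_eye)
print(f"t({df_paired}) = {t_paired:.2f}")
```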
How does the graph for t distribution look?
the tails extend infinitely in both directions and NEVER touch the x axis
What is ANOVA?
Analysis of Variance
compare average score of 2 or more groups of scores
independent variable: categorical
dependent variable: continuous
(similar to t test but more groups compared)
ANOVA uses what size of groups and what is the statistic called?
small # of groups
F statistic
What is F in ANOVA?
variation between groups/ variation WITHIN groups
How is variation measured?
mean squares (MS)
What does variation between groups mean?
how different are the group means compared to the grand mean?
GRAND mean: average score of all observations from all groups
ex: hospital A vs all the hospital means
What does variation within groups mean?
not everyone in a group is identical
ex: scores within hospital A
If the groups are very different in an ANOVA test?
MS between > MS within
F>1
If the average score of each group is the same (or very similar)?
MS between ~ MS within
F~1
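The F cards above can be sketched as a small one-way ANOVA calculation: F is the mean square between groups divided by the mean square within groups. The function name and hospital scores are hypothetical, and this is only an illustration of the ratio, not a full ANOVA routine.

```python
from statistics import mean

def one_way_anova(groups):
    """Return F, df between, and df within for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)  # average of all observations from all groups
    k, n = len(groups), len(all_scores)
    # Between groups: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: how far each score is from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

# Hypothetical scores from three hospitals with clearly different means.
hospital_a, hospital_b, hospital_c = [4, 5, 6], [7, 8, 9], [10, 11, 12]
f_stat, df_b, df_w = one_way_anova([hospital_a, hospital_b, hospital_c])
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

Here the group means differ a lot relative to the spread inside each group, so F is far above 1.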
What is the F distribution?
infinite number of f distributions
shape of graph depends on degrees of freedom
What is the critical value of F?
the value of F beyond which 5% of the area under the curve lies (for alpha = 0.05)
What is the formula for degrees of freedom of NUMERATOR for F?
df = #groups - 1
What is the formula for degrees of freedom of the DENOMINATOR for F?
univariate ANOVA (1IV, 1DV)
df = N - # groups
What does the numerator of df in F mean?
how many groups are being compared
ex: if comparing 4 groups, df between = 3
What does the denominator of df in F mean?
total sample size
ex: if you have 250 participants distributed among 4 groups, df within = 246
What does the univariate ANOVA F(3, 246) mean?
4 total groups being compared with 250 sample size/observations
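As a quick check of the two df formulas, a reported F(df between, df within) can be decoded back into the number of groups and the sample size. The helper name is hypothetical.

```python
def decode_f_df(df_between, df_within):
    """Recover number of groups and total N from univariate ANOVA degrees of freedom."""
    groups = df_between + 1        # df between = #groups - 1
    total_n = df_within + groups   # df within = N - #groups
    return groups, total_n

print(decode_f_df(3, 246))  # (4, 250)
```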
When interpreting ANOVA with 3 or more groups, what conclusion can you draw?
you can reject the null hypothesis and there is SOME difference between the groups that you cannot exactly determine. You just know that they are not all equal (statistically significant)
How would you find out how the 3 or more groups are different in an ANOVA test?
perform post hoc tests
follow up tests/ pairwise comparisons
What are the 3 post hoc tests we need to know?
- Tukey’s test
- Fisher’s Least Significant Difference (LSD)
- Scheffe’s Method
What are descriptive statistics?
statistics that help you describe characteristics of your sample.
Primarily measures the central tendency and variability
What are some descriptive statistics we have learned so far?
raw scores
arithmetic mean
median
mode
standard deviation
number of participants
What are inferential statistics?
They describe the likelihood of your results occurring by chance or generalizing beyond your sample
you are inferring things beyond just your sample AKA statistical inference
What are some inferential statistics we have learned so far?
Student’s t
point estimate
confidence interval
standard error
beta
F (ANOVA)
What is absolute risk?
measure of likelihood of a certain event happening
ex: a smoker has a 3% overall chance of dying of lung cancer
What is relative risk? (risk ratio)
the likelihood of disease among “exposed” compared to the likelihood of disease among “unexposed”
does not provide any info about absolute risk
ex: a smoker is 7x more likely to die of lung cancer than a non-smoker
What is the equation of absolute risk?
(participants with disease in a group, exposed or unexposed) / (total participants in that same group)
What is the equation of relative risk? (ratio)
(absolute risk of disease in exposed) / (absolute risk of disease in unexposed)
What is attributable risk and how is it determined?
the amount of risk that can be attributed to the risk factor
absolute risk (exposed) - absolute risk (unexposed/baseline risk)
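The three risk measures can be computed from the same four counts. A minimal sketch with hypothetical counts and a hypothetical function name:

```python
def risk_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Absolute risk per group, relative risk, and attributable risk."""
    ar_exposed = exposed_cases / exposed_total        # absolute risk in the exposed
    ar_unexposed = unexposed_cases / unexposed_total  # baseline (unexposed) risk
    return {
        "absolute_risk_exposed": ar_exposed,
        "absolute_risk_unexposed": ar_unexposed,
        "relative_risk": ar_exposed / ar_unexposed,
        "attributable_risk": ar_exposed - ar_unexposed,
    }

# Hypothetical cohort: 30 of 1000 exposed and 10 of 1000 unexposed develop the disease.
result = risk_measures(30, 1000, 10, 1000)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Note that the relative risk of 3 says nothing about how large the underlying absolute risks are; both are small in this example.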
How do we interpret Risk ratio?
RR > 1 = positive association of risk factor and disease
RR < 1 = negative association of risk factor and disease (protective factor)
RR = 1 baseline risk or no association
If RR = 5, what does this mean?
5x risk of disease for those exposed to the risk factor vs those that are unexposed
If you are 2x more likely to get a disease, by how much has the risk increased?
100% increase in risk (or 2x the risk)
1 = baseline so if you add 1 more to that you have a 100% increase because you doubled 1
What is the interpretation of RR of 0.80?
risk of outcome in the exposed group was reduced by 20% relative to the unexposed group
What is the interpretation of RR of 3.30?
risk of outcome in the exposed group was increased by 230% relative to the unexposed group OR the outcome was 3.3 times more likely to occur in the exposed group than in the unexposed group
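The percent interpretations above all come from the same arithmetic: subtract the baseline of 1 from the RR and multiply by 100. A small sketch (the function name is hypothetical):

```python
def percent_change_in_risk(rr):
    """Percent change in risk for the exposed group relative to the unexposed group."""
    return (rr - 1.0) * 100  # 1.0 is the baseline (no association)

for rr in (0.80, 2.0, 3.30):
    change = percent_change_in_risk(rr)
    direction = "increase" if change > 0 else "reduction"
    print(f"RR = {rr:.2f}: {abs(change):.0f}% {direction}")
```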
A study finds that RR = 1.7 and 95% CI: 0.9-2.7. Is there a significant association?
NO because 1.0 is within the range of the CI
What is the confidence interval?
The range of values that is likely, with 95% confidence, to contain the true effect
How does the relative risk relate to the CI?
if the CI contains a value of 1.0 then the null hypothesis is not rejected = no statistically significant association between risk factor and disease
How would you interpret a wide CI?
true value lies within a large range of possible values = less precise
How would you interpret a narrow CI?
true value lies within a small range of possibilities = more precise
A study finds RR = 1.7 and CI = 1.02-2.6. Is there a significant association?
YES, because 1.0 is NOT within the range of the CI
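The CI rule for a risk ratio can be written as a one-line check: the association is significant only when the interval excludes 1.0. The function name is hypothetical.

```python
def ci_is_significant(ci_low, ci_high, null_value=1.0):
    """True when the confidence interval excludes the null value (1.0 for a ratio)."""
    return not (ci_low <= null_value <= ci_high)

print(ci_is_significant(0.9, 2.7))   # False: 1.0 is inside the interval
print(ci_is_significant(1.02, 2.6))  # True: 1.0 is outside the interval
```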
What is the definition of risk?
chance of outcome of interest out of all possible outcomes
What is the definition of odds?
the ratio of the chance of the outcome of interest occurring to the chance of the outcome not occurring
Where does relative risk come from?
Prospective cohort studies = follow forward in time
finding people who will develop the disease
Where does the odds ratio come from?
case control studies = follow backward in time
already determined who has the disease
What is the equation for odds ratio?
odds of exposure among cases (disease) / odds of exposure among controls (no disease)
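A minimal sketch of the odds ratio for a case control study, assuming the usual reading of the formula as odds of exposure in cases divided by odds of exposure in controls. The counts and names are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls

# Hypothetical study: 40 of 100 cases and 20 of 100 controls were exposed.
or_value = odds_ratio(40, 60, 20, 80)
print(f"OR = {or_value:.2f}")
```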
How do you interpret odds ratio?
similar to relative risk (in terms of CI)
What is number needed to treat?
number of people treated to have impact on one person (how likely is it that a therapy will help an individual person?)
What is the number needed to harm?
number of people treated to harm one person
What is the EER (experimental event rate) formula?
number of individuals in the treatment group who ended up with the event out of the total number in the treatment group
What is the CER (control event rate) formula?
how many individuals in your control group ended up with the event out of the total number in the control group
What is the absolute risk reduction (ARR)?
CER - EER (in absolute value)
How is NNT calculated when ARR is in decimal form?
1/ARR
How is NNT calculated when ARR is in percent form?
100/ARR
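The NNT cards above chain together as CER and EER, then ARR, then NNT. A small sketch with hypothetical event rates and function names:

```python
def absolute_risk_reduction(cer, eer):
    """ARR = CER - EER, in absolute value (rates given as decimals)."""
    return abs(cer - eer)

def nnt_from_decimal(arr):
    return 1 / arr          # ARR as a decimal, e.g. 0.05

def nnt_from_percent(arr_percent):
    return 100 / arr_percent  # ARR as a percent, e.g. 5

# Hypothetical trial: 20% of controls and 15% of treated patients have the event.
arr = absolute_risk_reduction(cer=0.20, eer=0.15)
print(f"ARR = {arr:.2f}, NNT = {nnt_from_decimal(arr):.0f}")
print(f"NNT from percent form = {nnt_from_percent(5):.0f}")
```

Both forms give the same NNT; the only difference is whether ARR is written as 0.05 or as 5%.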
What are some considerations for NNT?
if the clinical endpoint is devastating (death, heart attack), drugs with high NNT may still be indicated
NNT values are time specific so you must
- compare studies with similar time frames
- think about the time frame in treatment decisions
In an ideal world what would you consider a good NNT?
1 because every patient with treatment benefits
What is a good range for an NNT?
2-5 = effective treatment
What is a good NNT for prophylactic treatments (especially those devastating ones)?
20-100
NNH is calculated for?
every side effect/adverse effect
ex: death, blood clot, bleeding
Do you want a high NNH or low?
high!
What is statistical power?
the probability that you will find statistical significance with a given sample size, if the alternate hypothesis is true
What is statistical power analysis?
a statistical analysis conducted BEFORE you begin a study that estimates the necessary sample size to detect a statistically significant relationship (in percentage)
As you have larger between group differences, you need a (smaller/larger) number of observations?
smaller
As you have smaller within group differences, you need (smaller/larger) number of observations?
smaller
A type 1 error is what?
you reject the null hypothesis BUT there is actually NO true effect, so you have a false POSITIVE
the probability of making this error is statistical alpha
A type II error is?
there IS a true effect but you fail to reject the null hypothesis, so you have a false NEGATIVE (the probability of this error is statistical beta)
you should have found something but you didn’t
Are statistical beta and regression beta the same thing?
NO
For statistical alpha of 0.05 (5%), what type of error does this fall under?
type I
What statistical power do we ideally want?
80% (higher scores=better outcome)
Most studies want to avoid what type of error?
type I (statistical alpha)
FP (false positive) - you don’t want to claim something works when it actually doesn’t
Risk of type I error and type II error are (inversely/directly) proportional
inversely
What is power analysis?
a procedure that estimates an appropriate sample size to find an effect
usually performed with computer programs
What is the purpose of Pearson’s correlation?
measures how closely related 2 CONTINUOUS variables are (linear relationship between 2 numerical measurements)
What are the ranges for Pearson’s Correlation?
Range: -1 to +1
(+/-) indicates the direction of the slope
r=-1 perfect negative correlation
r=1 perfect positive correlation
r=0 no correlation
The smallest correlation would be a Pearson’s coefficient of?
something closest to 0
How do you interpret Pearson’s r? r(698) = -0.09
there is a very weak negative linear correlation (r = -0.09)
698 = degrees of freedom
df = N - 2
N = 700 observations
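Pearson's r can be computed by hand from deviations around each mean. A minimal sketch with hypothetical data and a hypothetical function name:

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson's correlation between two continuous variables of equal length."""
    mx, my = mean(x), mean(y)
    # Sum of cross-products of deviations (numerator).
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # Square root of the product of the two sums of squared deviations (denominator).
    scale = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cross / scale

# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
df_r = len(x) - 2  # df = N - 2
print(f"r({df_r}) = {r:.2f}")
```

The function divides by zero if either variable has no variation at all, so it is only meant for data where both variables actually vary.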
What is considered a small correlation?
r=0.10 or -0.10
What is considered a medium correlation?
r=0.30 or -0.30
What is considered a large correlation?
r=0.50 or -0.50
What is considered a stronger correlation?
more clustering around the line of best fit (r closer to -/+1)
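How would you interpret a straight horizontal or vertical line for Pearson’s correlation?
r = 0 (one variable does not change as the other changes, so there is no linear association)
What would Pearson’s coefficient be for a NONlinear (curved) relationship?
r = 0 is possible even when the variables are clearly related
Pearson’s r only measures LINEAR relationships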
What is a validation study?
comparing a new test(experimental test) to the gold standard
For correlation analysis and linear regression analysis where you are comparing 2 variables you report (r/r^2)?
r
For multiple regression analysis you report?
r^2
What is the effect size of an r^2?
r^2=0.01 small
r^2=0.10 medium
r^2=0.25 large
range = 0 to 1
What is the chi squared test?
counting things and trying to see if one group has more of something than of something else
IV = categorical
DV = frequency (converted into a percent)
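A sketch of the chi squared statistic for a table of counts: compare each observed count with the count expected if the groups did not differ. This is the basic Pearson form with no continuity correction (an assumption, since the cards do not say which form is used); the table and function name are hypothetical.

```python
def chi_squared(table):
    """Chi squared statistic and df for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and outcome were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical adverse event counts: rows = drug / placebo, columns = event / no event.
chi2, chi2_df = chi_squared([[10, 20], [20, 10]])
print(f"chi squared({chi2_df}) = {chi2:.2f}")
```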
Chi squared is common in what type of research?
adverse events
What is analysis of covariance? (ANCOVA)
compares the scores of 2+ groups using statistic F while CONTROLLING a potential confound (COVARIATE)
IV: categorical
DV: continuous
covariate: continuous
What wording suggests that ANCOVA was used?
analysis was adjusted, analysis was estimated, analysis was corrected for
What is the difference between clinical and statistical significance?
clinical significance - importance of a research result in terms of the symptom relief you can expect for your patient
statistical significance - how likely it is that a result occurred by chance
What are observational studies?
hypothesis generating
Ex: case reports/studies, cross-sectional, case control, cohort studies
What studies are level I (HIGH)?
well designed randomized controlled trials
What studies are level II (Good)?
well designed controlled trials without randomization, cohort or case control analytic studies, multiple time series with or without intervention
What studies are level III (POOR)?
case reports/series
cross sectional studies
reports of expert committees/organizations
Random subject selection underpins what?
statistical inference which helps ensure external validity
Random assignment (prevents/promotes) selection bias, which means that differences in study outcomes are due to study treatments and not from confounding factors
prevents
Random selection refers to
sampling
Random assignment refers to
group assignment
What is sensitivity?
accuracy of test to correctly identify all individuals in a population who have a particular disease
true positive/detection rate
sensitive to disease
What is specificity?
accuracy of the screening procedure to correctly identify those who do not have the disorder
true negative
specific to health
What is considered good range to have specificity and sensitivity?
70% sensitivity and specificity
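Sensitivity and specificity can be computed from the four cells of a validation table (test result vs gold standard). The standard formulas are used here; the counts and function names are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Share of people WITH the disease that the test correctly flags."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of people WITHOUT the disease that the test correctly clears."""
    return true_neg / (true_neg + false_pos)

# Hypothetical validation study: 100 diseased and 100 healthy people.
sens = sensitivity(true_pos=80, false_neg=20)
spec = specificity(true_neg=90, false_pos=10)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```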
What are the basic concepts in epidemiology?
morbidity
mortality
incidence
prevalence
What is incidence?
probability that healthy people will develop a disease over a specific period of time
rate at which new disease occurs in a group of people who are disease free
AKA attack rate, risk, probability of developing a disease
comes from prospective cohort studies
How do you calculate incidence?
number of new cases of disease during specific period (1yr)/size of population at risk during specific period
denominator is usually standardized (100,1000,10000)
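A small sketch of the incidence calculation with a standardized denominator. The counts and the function name are hypothetical.

```python
def incidence(new_cases, population_at_risk, per=1000):
    """New cases during the period per `per` people at risk during that period."""
    return new_cases * per / population_at_risk

# Hypothetical: 50 new cases in one year among 10,000 disease-free people.
print(incidence(50, 10_000))  # 5.0 new cases per 1,000 people at risk
```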
What is prevalence?
probability of people having a specific disease at a given time (coming from cross sectional studies)
these people already have the disease from the past at a given time
How do you calculate prevalence?
number of existing cases during specified point or period/ size of population at risk during specified point or period
expressed as percentage
expressed relative to age, gender, race, geographic regions
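Prevalence follows the same pattern as incidence but counts existing cases rather than new ones. A minimal sketch with hypothetical numbers and a hypothetical function name:

```python
def prevalence_percent(existing_cases, population):
    """Existing cases at a given point or period, as a percentage of the population."""
    return existing_cases * 100 / population

# Hypothetical: 200 people currently have the disease in a population of 10,000.
print(prevalence_percent(200, 10_000))  # 2.0
```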