stats final Flashcards
correlation levels of measurement
IV and DV are interval and ratio
Pearson product-moment correlation
interval/ratio
normal distribution
r
strength of correlation
0: null
1.0: perfect pos
-1.0: perfect neg
assumptions for correlation
scores rep population
normal distribution
has both x and y
x and y are independent measures
x and y are observed
linear relationship
interpretation of correlation
< .25 little to no
.25-.50 low to fair
.50-.75 moderate to good
> .75 strong relationship
limitations of correlations
only two variables
only linear
does not tell cause and effect
does not account for agreement
influenced by range
average values can suppress variation
coefficient of determination
square of correlation coefficient
the percent of variance in y that is explained by x
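A minimal sketch of how r and the coefficient of determination relate, using the standard deviation-score formula for Pearson r. The data and the function name `pearson_r` are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical interval/ratio data, not from the course
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2  # coefficient of determination: share of variance in y explained by x
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```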
significance of coefficient
very sensitive to sample size
conventional effect sizes for r
small: .10
medium: .30
large: .50
what are non parametric statistics based on?
comparisons of rank scores
comparisons of counts or signs of scores
when do you use non parametric tests?
when you violate more than 2 parametric assumptions
what are the advantages of non parametrics
appropriate for wide range of situations
can use with categorical data
simple computations
outliers have less effect
disadvantages of non parametrics
they waste information - collapsed data
less power - 65-95% of para counterparts
if outliers are not errors, effects may be underestimated
non para for unpaired t test
Mann-Whitney U
non para for paired t test
sign test
~ scores converted to signs
wilcoxon signed ranks test (more common)
~ gives magnitude of change
non para for IG ANOVA
kruskal-wallis ANOVA
non para for RM ANOVA
Friedman's ANOVA
how to rank ties
give each tied score the average of the ranks the tied scores would have occupied
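A short sketch of tie handling when ranking, assuming rank 1 goes to the lowest score. The helper name `average_ranks` and the scores are hypothetical.

```python
def average_ranks(scores):
    """Rank scores from lowest (rank 1) to highest; tied scores share
    the mean of the rank positions they would have occupied."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of scores tied with position i
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        # Sorted positions i..j (0-based) correspond to ranks i+1..j+1
        mean_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

# The two scores of 20 would have taken ranks 2 and 3, so each gets 2.5
ranks = average_ranks([10, 20, 20, 30])
print(ranks)
```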
spearman rank (rho) correlation coefficient
non para analog of pearson r
at least one variable will be ordinal
non normal distribution of ratio/interval data
can be used with curvilinear (monotonic) relationships
spearman value
since it is correlation -1 through +1
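One way to see the link to Pearson r: Spearman's rho can be computed as Pearson r on the ranks. The sketch below assumes no tied values (ties would need averaged ranks) and uses made-up data where y rises with x but not in a straight line.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simple_ranks(values):
    """Rank from 1 (lowest) to n (highest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

# Hypothetical data: y = x squared, curved but always increasing
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

pearson = pearson_r(x, y)                             # below 1: not linear
rho = pearson_r(simple_ranks(x), simple_ranks(y))     # 1: rank order is identical
print(f"Pearson r = {pearson:.3f}, Spearman rho = {rho:.3f}")
```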
chi-square
association between two categorical variables
goodness of fit chi square
compare observed frequencies of 1 variable to expected frequencies (e.g., uniform or known)
tests of association chi square
much more common
compare observed frequencies of one variable to observed frequencies of another variable
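A sketch of the chi-square statistic for a test of association, assuming the usual approach of comparing each observed cell count to the count expected if the two variables were unrelated (row total x column total / grand total). The table is hypothetical and no continuity correction is applied.

```python
def chi_square_association(table):
    """Chi-square statistic and degrees of freedom for a contingency
    table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical 2x2 table: rows = group, columns = outcome category
observed = [[30, 10],
            [20, 40]]
chi2, df = chi_square_association(observed)
print(f"chi-square = {chi2:.2f}, df = {df}")
```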
assumptions for chi square
frequencies represent individual counts
can only be part of one category
no subject is represented twice - not for paired
what is signal?
true score
what is noise?
error
define relative reliability
ratio of variability between subjects to total variability (between subjects plus error)
unitless
ICC and kappa
define absolute reliability
how much of a measured value is likely due to error
SEM
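The card only names SEM; a common way to estimate it is SEM = SD x sqrt(1 - reliability coefficient). The sketch below assumes that formula, and the SD and reliability values are made up.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM in the units of the measurement, assuming
    SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical: scores with SD = 10 units and a reliability (e.g., ICC) of 0.91
sem = standard_error_of_measurement(sd=10.0, reliability=0.91)
print(f"SEM = {sem:.2f} units")
```

Unlike ICC and kappa, the result keeps the units of the measurement.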
acceptable value of reliability
0.80
define internal consistency
how well do these questions reflect the same construct
does not show that the construct is actually being measured
3 things that a valid test should do
discriminate among those who do or do not have it
evaluate change in magnitude
predict an outcome
concurrent validity
target test correlating to standard taken at same time
predictive validity
can target test predict standard
convergent validity
correlates with other tests of closely related constructs
divergent validity
uncorrelated with tests of distinct or contrasting constructs
ICC
for continuous scale scores
values from 0-1
measures degree of relationship and agreement
2 or more raters or ratings
higher ICC value
greater reliability
negative ICC value
divergence or disagreement
ICC model 1
raters chosen from larger population
some subjects assessed by different raters
ICC model 2
each subject assessed by same set of raters
test-retest and inter-rater
can generalize to other raters
ICC model 3
same set of raters but only represent raters of interest
only for intra-rater
cannot generalize
ICC form 1
single measurement
ICC form k
several measurements
cohen’s kappa coefficients
for categorical scale scores
ICC interpretation
> 0.90 best for clinical measurements
> 0.75 good
< 0.75 poor to moderate
Cronbach’s alpha
correlation among items and correlation of each individual item with the total score
simply how often raters agree
recommended to be between 0.7-0.9
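A sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), where k is the number of items. The item scores are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of item columns, each holding the scores of the
    same respondents in the same order."""
    k = len(items)
    sum_item_variances = sum(variance(item) for item in items)
    # Total score per respondent across all items
    totals = [sum(scores) for scores in zip(*items)]
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

# Hypothetical questionnaire: 3 items answered by 5 respondents
item_1 = [2, 3, 4, 4, 5]
item_2 = [3, 3, 4, 5, 5]
item_3 = [2, 4, 4, 5, 5]

alpha = cronbach_alpha([item_1, item_2, item_3])
print(f"Cronbach's alpha = {alpha:.3f}")
```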
kappa coefficient
proportion of agreement between raters after chance agreement has been removed
nominal and ordinal
interpreted like ICC
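A sketch of unweighted Cohen's kappa for two raters, kappa = (observed agreement - chance agreement) / (1 - chance agreement). The ratings are made up for illustration.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same subjects."""
    n = len(rater_a)
    # Proportion of subjects on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's category proportions
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical nominal ratings of 10 subjects
rater_a = ["pos"] * 5 + ["neg"] * 5
rater_b = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "pos"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

Here the raters agree on 80% of subjects, but half of that agreement is expected by chance, so kappa is lower than the raw agreement.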
weighted kappa
best for ordinal data
can choose to make penalty worse for larger disagreements
kappa interpretation
<0.4 poor to fair
0.4-0.6 moderate
0.6-0.8 substantial
0.8-1.0 excellent
concurrent validity
do two criteria measured at same time correlate
predictive validity
can one criterion predict magnitude of the other
true positive
clinical test +
condition present
false negative
clinical test -
condition present
false positive
clinical test +
condition absent
true negative
clinical test -
condition absent
sensitivity
true pos / (true pos + false neg)
rule out
specificity
true neg / (false pos + true neg)
rule in
positive predictive value
true pos / all pos
negative predictive value
true neg / all neg
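The four formulas above can be sketched from a 2x2 table. The counts are hypothetical and the function name is only for illustration.

```python
def diagnostic_accuracy(tp, fn, fp, tn):
    """Sensitivity, specificity, and predictive values from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),  # among those with the condition
        "specificity": tn / (fp + tn),  # among those without the condition
        "ppv": tp / (tp + fp),          # among all positive tests
        "npv": tn / (tn + fn),          # among all negative tests
    }

# Hypothetical counts: 100 people with the condition, 100 without
stats = diagnostic_accuracy(tp=90, fn=10, fp=30, tn=70)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```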
likelihood ratios
0-1 decreased probability of disease
1 null value
> 1 increases probability of disease
LR+
likelihood a positive was obtained in someone with disease compared to someone without the disease
LR-
likelihood a negative was obtained in someone with disease compared to someone without the disease
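A sketch of the usual formulas, LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sensitivity and specificity values are made up.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)   # true positive rate / false positive rate
    lr_neg = (1 - sensitivity) / specificity   # false negative rate / true negative rate
    return lr_pos, lr_neg

# Hypothetical test with sensitivity 0.90 and specificity 0.70
lr_pos, lr_neg = likelihood_ratios(0.90, 0.70)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```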
large and often conclusive shift in LR
LR+ >10
LR- <0.1
moderate shift
LR+ 5 - 10
LR- 0.1 - 0.2
small: sometimes important
LR+ 2 - 5
LR- 0.2 - 0.5
small: rarely important
LR+ 1 - 2
LR- 0.5 - 1
cohort studies
based on exposure
usually prospective
case-control study design
based on outcome
retrospective
controls selected from same population as cases
relative risk
cohort studies
odds ratio
case-control studies
RR and OR = 1
null value
RR and OR > 1
considered harmful
RR and OR < 1
considered protective
RR
incidence of disease in exposed / incidence of disease in unexposed
OR
odds of exposure among cases / odds of exposure among controls
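A sketch of both ratios from raw counts. All counts and function names are hypothetical.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of disease in the exposed divided by risk in the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (cases_exposed / cases_unexposed) / (controls_exposed / controls_unexposed)

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop disease
rr = relative_risk(30, 100, 10, 100)

# Hypothetical case-control study: 40 of 100 cases and 20 of 100 controls were exposed
odds = odds_ratio(40, 60, 20, 80)

print(f"RR = {rr:.2f}, OR = {odds:.2f}")
```

Both values are above 1, so in this made-up example the exposure would be read as harmful.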
experimental event rate
% pts in experimental group with bad outcome
control event rate
% pts in control group with bad outcome
number needed to treat
how many pts you have to provide treatment to in order to prevent one bad outcome
closer to 1 the better
if absolute risk reduction (control event rate - experimental event rate) is 0, NNT is infinity
smaller is better
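A sketch of the usual calculation, NNT = 1 / absolute risk reduction, where absolute risk reduction = control event rate - experimental event rate. The event rates are hypothetical, and the result is left unrounded here.

```python
import math

def number_needed_to_treat(control_event_rate, experimental_event_rate):
    """NNT = 1 / (CER - EER); infinite when treatment makes no difference."""
    absolute_risk_reduction = control_event_rate - experimental_event_rate
    if absolute_risk_reduction == 0:
        return math.inf
    return 1 / absolute_risk_reduction

# Hypothetical trial: 30% bad outcomes in control, 20% in the experimental group
nnt = number_needed_to_treat(0.30, 0.20)
print(f"NNT = {nnt:.1f}")
```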
number needed to harm
measure of adverse treatment effect
larger is better