Reliability and Validity Flashcards
Reliability
- refers to the replicable nature of research studies/tools
- high reliability does not guarantee scientific validity
Testing reliability
- test-retest correlation
- the instrument is administered twice to the same population
- with 2-14 days between administrations
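A minimal Python sketch of a test-retest correlation (hypothetical scores, not from the source); the Pearson correlation between the two administrations serves as the reliability estimate:

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical scores from the same 8 subjects, administered 2-14 days apart
time1 = np.array([12, 18, 9, 22, 15, 30, 11, 25])
time2 = np.array([14, 17, 10, 20, 16, 28, 12, 24])

r, p = pearsonr(time1, time2)  # test-retest (Pearson) correlation
print(f"test-retest r = {r:.2f}, p = {p:.3f}")
```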
Cronbach’s alpha
- measures the internal consistency of a test, i.e. how strongly its items correlate with one another (it reflects the average inter-item correlation, taking the number of items into account)
- it can take values from negative infinity up to a maximum of 1, but only positive values are meaningful
- an arbitrary cut-off of 0.70 is used to call the evaluated test internally consistent (see the sketch below)
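A minimal sketch of the usual computational form of alpha, k/(k-1) x (1 - sum of item variances / variance of the total score), using made-up questionnaire data:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: subjects x items matrix of scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5 subjects x 4 items
data = np.array([[3, 4, 3, 4],
                 [2, 2, 3, 2],
                 [4, 5, 4, 5],
                 [1, 2, 1, 2],
                 [3, 3, 4, 3]])
print(round(cronbach_alpha(data), 2))  # compare against the 0.70 cut-off
```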
Split-half reliability
- refers to splitting a scale into two halves and examining the correlation between them
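A minimal sketch of a split-half estimate on made-up data, splitting into odd- and even-numbered items; the Spearman-Brown step is a commonly used full-length correction not mentioned in the card above:

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical subjects x items matrix
items = np.array([[3, 4, 3, 4, 2, 3],
                  [2, 2, 3, 2, 1, 2],
                  [4, 5, 4, 5, 4, 4],
                  [1, 2, 1, 2, 2, 1],
                  [3, 3, 4, 3, 3, 4]])
half1 = items[:, ::2].sum(axis=1)   # odd-numbered items
half2 = items[:, 1::2].sum(axis=1)  # even-numbered items

r, _ = pearsonr(half1, half2)       # correlation between the two halves
spearman_brown = 2 * r / (1 + r)    # correction to full scale length
print(round(r, 2), round(spearman_brown, 2))
```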
Interrater reliability
- assessed by having two or more raters rate the same population with the same scale
Intraclass correlation coefficient
- used for continuous variables
- the proportion of total variance of the measurement that reflects true between subject variability
- ranges between 0 (unreliable) and 1 (perfect reliability)
- relative ICC is always higher than absolute ICC
ANOVA intraclass correlation coefficient
- used for quantitative data with more than two raters/groups
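A minimal sketch of a one-way ANOVA-based ICC (the ICC(1,1) form) on made-up ratings; note that several ICC variants exist (e.g. absolute vs relative agreement), so this is only one of them:

```python
import numpy as np

def icc_oneway(scores: np.ndarray) -> float:
    """One-way random-effects ICC(1,1); scores: subjects x raters."""
    n, k = scores.shape
    subject_means = scores.mean(axis=1)
    ms_between = k * ((subject_means - scores.mean()) ** 2).sum() / (n - 1)
    ms_within = ((scores - subject_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical ratings: 5 subjects rated by 3 raters
ratings = np.array([[9, 10, 8],
                    [5,  6, 5],
                    [7,  7, 8],
                    [3,  4, 3],
                    [8,  9, 9]])
print(round(icc_oneway(ratings), 2))  # 0 = unreliable, 1 = perfect reliability
```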
Nominal data reliability
- assessed with kappa; when there are more than two categories (especially ordered ones), a weighted kappa can be used
Validity of an instrument
-the extent to which an instrument measures what it proposes to measure
Face validity
-refers to a subjective measure of deciding whether the test measures the construct of interest on its face value (what it was designed for)
Construct validity
- measures whether a test really measures the theoretical construct of interest or something else
Content validity
- refers to whether the contents, i.e. the individual subscales, items or elements of the test, are in line with the general objectives or specifications the test was originally designed to measure
Criterion validity
-refers to performance against an external criterion such as another instrument (concurrent) or future diagnostic possibility (predictive)
Concurrent validity
- refers to the ability of a test to distinguish between subjects who differ concurrently on other measures, e.g. those who score high on an insomnia scale may also score high on a fatigue rating scale
Predictive validity
-the ability of a test to predict future group differences according to current group differences in score
Incremental validity
-refers to the ability of a measure to predict or explain variance over and above other measures
Convergent validity
- refers to the agreement between instruments that measure the same construct
- form of construct validity
Discriminant validity
- refers to the degree of disagreement between two scales measuring different constructs
- form of construct validity
Experimental validity
- refers to the sensitivity to change
- an instrument must show a difference in results when an intervention is carried out to modify the measured domain
- form of construct validity
Factorial validity
-form of construct validity established via factor analysis of items in a scale
Precision
- the degree to which a calculated central value (e.g. a mean) varies with repeated sampling
- the narrower the variation, the more precise the value
- random error leads to imprecision
Factors reducing precision
- wider limits of the confidence interval
- requiring a higher confidence level (e.g. 99% rather than 95%), which widens the interval (see the sketch below)
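An illustrative sketch (made-up data, normal-approximation intervals) of how demanding a higher confidence level widens the interval and so reduces precision:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=10, size=40)   # hypothetical measurements
mean = sample.mean()
se = sample.std(ddof=1) / np.sqrt(len(sample))   # standard error of the mean

for level in (0.95, 0.99):
    z = norm.ppf(1 - (1 - level) / 2)            # critical value
    print(f"{level:.0%} CI: {mean - z * se:.1f} to {mean + z * se:.1f}")
# The 99% interval is wider than the 95% interval: more confidence, less precision.
```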
Accuracy
-the correctness of the mean value
Precision
-comparable to reliability while accuracy is comparable to validity
Bias
-compromises validity/accuracy
Face
-does the scale appear to be fit for the purpose of measuring the variable of interest?
Content
-does the scale appear to include all the important domains of the measured attribute?
Criterion
-is the scale consistent with what we already know (concurrent) and what we expect (predictive)?
Convergent
-does this new scale associate with a different scale that measures a similar construct?
Discriminant
-does the new scale disagree with scales that measure unrelated constructs?
Kappa
kappa = observed agreement beyond chance / maximum possible agreement beyond chance
OR
kappa = (observed agreement - agreement expected by chance) / (100% - agreement expected by chance)
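A minimal sketch applying this formula to a hypothetical 2x2 agreement table (counts are invented for illustration):

```python
import numpy as np

# Hypothetical counts: rows = rater A (present/absent), columns = rater B
table = np.array([[40, 10],
                  [ 5, 45]])

total = table.sum()
observed = np.trace(table) / total                                   # observed agreement
expected = (table.sum(axis=1) * table.sum(axis=0)).sum() / total**2  # agreement by chance
kappa = (observed - expected) / (1 - expected)
print(round(observed, 2), round(expected, 2), round(kappa, 2))       # 0.85, 0.5, 0.7
```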
Beyond chance agreement
- kappa indicates the level of agreement achieved beyond what would be expected by chance
When is Kappa calculated?
-only for agreement on categorical variables such as presence or absence of a diagnosis
Weighted Kappa
-for ordinal variables
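If scikit-learn is available, its cohen_kappa_score supports linear or quadratic weights; a minimal sketch with invented ordinal severity ratings:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical ordinal ratings (0 = none, 1 = mild, 2 = moderate, 3 = severe)
rater_a = [0, 1, 2, 2, 3, 1, 0, 3, 2, 1]
rater_b = [0, 1, 1, 2, 3, 2, 0, 2, 2, 1]

plain = cohen_kappa_score(rater_a, rater_b)                       # unweighted kappa
weighted = cohen_kappa_score(rater_a, rater_b, weights="linear")  # penalises near-misses less
print(round(plain, 2), round(weighted, 2))
```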
Bland-Altman
-used for continuous variables where pairs of score differences are plotted against the mean
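A minimal matplotlib sketch of a Bland-Altman plot on made-up paired measurements; the mean difference (bias) and 1.96 SD limits of agreement are the usual additions to the plot:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical paired scores from two measurement methods
a = np.array([10.2, 12.5, 9.8, 14.1, 11.0, 13.3, 10.9, 12.0])
b = np.array([10.5, 12.1, 10.2, 13.8, 11.4, 13.0, 11.2, 12.4])

means = (a + b) / 2
diffs = a - b
bias = diffs.mean()
loa = 1.96 * diffs.std(ddof=1)          # 95% limits of agreement

plt.scatter(means, diffs)
plt.axhline(bias, linestyle="--")
plt.axhline(bias + loa, linestyle=":")
plt.axhline(bias - loa, linestyle=":")
plt.xlabel("Mean of the two measurements")
plt.ylabel("Difference between measurements")
plt.title("Bland-Altman plot (hypothetical data)")
plt.show()
```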
Kappa values and degree of agreement
0 = no agreement; 0-0.2 = slight; 0.2-0.4 = fair; 0.4-0.6 = moderate; 0.6-0.8 = substantial; 0.8-1.0 = almost perfect
What is the kappa statistic?
- if two investigators independently assess the same group, there will be an extent to which their results ‘agree’
- simple percent agreement overestimates the degree of agreement, because some agreement occurs by chance alone; this is why kappa statistics are used
- kappa indicates the level of agreement achieved beyond what would be expected by chance
Interpreting kappa
- affected by the prevalence of the outcome studied: the higher the proportion of positive assessments, the higher the kappa
- statistical significance cannot be tested directly from kappa; what matters is the actual degree of agreement it indicates
Observed agreement calculation
- draw two by two table
- take the two values agreed and add them together
- then divide by 100
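- worked example (hypothetical counts, matching the kappa sketch above): if both raters record 'present' for 40 subjects and 'absent' for 45 out of 100 rated, observed agreement = (40 + 45) / 100 = 85%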
Kappa coefficient
- dependent on the prevalence of the measured condition
- common disorders tend to have a high kappa, whereas rare disorders tend to have a low kappa despite high percentage agreement
- one must therefore also look at the actual percentage agreement
Phi
- similar to kappa
- all cells are utilised and statistical significance is possible
- can be used with small sample sizes
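A minimal sketch of the phi coefficient for a hypothetical 2x2 table (same invented counts as the kappa sketch), with a chi-square test for significance:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table of counts for two raters
table = np.array([[40, 10],
                  [ 5, 45]])
a, b = table[0]
c, d = table[1]

phi = (a * d - b * c) / np.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Unlike kappa (as described above), a significance test is straightforward here
chi2, p, _, _ = chi2_contingency(table, correction=False)
print(round(phi, 2), round(p, 4))
```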