Module 6: Reliability and Validity Flashcards
Reliability
Reliability refers to the consistency or stability of a measuring instrument. In other words, the measuring instrument must measure exactly the same way every time it is used.
Systematic errors
Problems that stem from the experimenter and the testing situation.
Trait errors
Problems that stem from the participants. Were they truthful? Did they feel well?
True score
The true score is what the score on the measuring instrument would be if there were no error.
Error score
The error score is any measurement error (systematic or trait).
Observed score
The score recorded for a participant on the measuring instrument used.
Conceptual formula for observed score
Observed score = True score + Error score
Random errors
Errors in measurement that lead to measurable values being inconsistent when repeated measurements of a constant attribute or quantity are taken
Conceptual formula for reliability
Reliability = True score / (True score + Error score)
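The two conceptual formulas above can be illustrated with a minimal numeric sketch in Python. One assumption beyond the flashcard wording: in practice the ratio is computed from the variances of the scores across a group of participants, and all numbers here are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true scores for 100 participants and random error on each measurement.
true_scores = rng.normal(loc=50, scale=10, size=100)
error_scores = rng.normal(loc=0, scale=5, size=100)

# Observed score = True score + Error score
observed_scores = true_scores + error_scores

# Conceptual reliability: true-score variance as a share of total (observed) variance.
reliability = true_scores.var() / (true_scores.var() + error_scores.var())
print(round(reliability, 2))  # roughly 0.80 with these made-up spreads
```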
Correlation coefficients
A correlation coefficient measures the degree of relationship between two sets of scores and can vary between -1.00 and +1.00. The stronger the relationship between the variables, the closer the coefficient is to either -1.00 or +1.00.
Positive correlation
A positive correlation indicates a direct relationship between variables: when we see high scores on one variable, we tend to see high scores on the other.
The graph runs from the bottom left to the top right.
Negative correlation
A negative correlation indicates an inverse, or negative, relationship: high scores on one variable go with low scores on the other and vice versa.
The graph runs from the top left to the bottom right.
Rules-of-thumb correlation coefficient
- .00-.29 = none to weak
- .30-.69 = moderate
- .70-1.00 = strong
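A minimal sketch of computing a correlation coefficient with NumPy and reading it against the rule of thumb above; the scores are made up for illustration.

```python
import numpy as np

# Made-up scores for the same participants on two variables.
x = np.array([4, 7, 8, 3, 9, 5, 6, 2])
y = np.array([5, 6, 9, 2, 8, 6, 5, 3])

# Pearson correlation coefficient (off-diagonal element of the 2x2 correlation matrix).
r = np.corrcoef(x, y)[0, 1]

# Rule-of-thumb label from the flashcard above.
strength = "none to weak" if abs(r) < 0.30 else "moderate" if abs(r) < 0.70 else "strong"
print(f"r = {r:.2f} ({strength})")
```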
Types of reliability
- Test/retest
- Alternate forms
- Split-half
- Interrater reliability
Test/retest reliability
One of the most often used and obvious ways of establishing reliability is to repeat the same test on a second occasion. The correlation between the scores on the two administrations needs to be high for the test to be considered reliable.
Practice effects
Some people get better at the second testing, and this practice lowers the observed correlation
Alternate form reliability
Using alternate forms of the testing instrument and correlating individuals' performance on the two different forms.
Split-half reliability
Here you split the items of a single test into two halves (e.g. the odd-numbered versus the even-numbered items) and correlate participants' scores on the two halves.
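A minimal sketch of the split-half idea, assuming a made-up matrix of item scores (rows = participants, columns = items): the items are split into odd and even halves and the two half-test totals are correlated.

```python
import numpy as np

# Made-up item scores: 6 participants (rows) answering 8 items (columns).
items = np.array([
    [3, 4, 3, 5, 4, 3, 4, 5],
    [1, 2, 2, 1, 2, 1, 1, 2],
    [4, 5, 4, 4, 5, 5, 4, 4],
    [2, 2, 3, 2, 1, 2, 3, 2],
    [5, 4, 5, 5, 4, 5, 5, 4],
    [3, 3, 2, 3, 3, 2, 3, 3],
])

# Total score on the odd-numbered items and on the even-numbered items.
odd_half = items[:, ::2].sum(axis=1)
even_half = items[:, 1::2].sum(axis=1)

# Split-half reliability: correlation between the two half-test scores.
split_half_r = np.corrcoef(odd_half, even_half)[0, 1]
print(round(split_half_r, 2))
```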
Interrater reliability
Here you test how consistent the assessments of two or more raters or judges are.
Con: you need to establish the reliability between the raters themselves.
Conceptual formula interrater reliability
Interrater reliability = (Number of agreements / Number of possible agreements) x 100
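A small sketch of the agreement formula above, assuming two hypothetical raters who each coded the same ten observations.

```python
# Hypothetical codes given by two raters to the same ten observations.
rater_a = ["yes", "no", "yes", "yes", "no", "no", "yes", "no", "yes", "yes"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]

# Interrater reliability = (number of agreements / number of possible agreements) x 100
agreements = sum(a == b for a, b in zip(rater_a, rater_b))
interrater_reliability = agreements / len(rater_a) * 100
print(f"{interrater_reliability:.0f}% agreement")  # 8 of 10 codes agree -> 80%
```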
Cronbach’s alpha
A measure of internal consistency, a kind of average correlation between the items.
(I.e. checking whether a participant's answers to the different items of the test are consistent with one another.)
Rules-of-thumb Cronbach’s alpha
- > .80: reliability = good.
- .60 - .80: reliability = sufficient.
- Less than .60 = insufficient
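A minimal sketch of computing Cronbach's alpha for a made-up item-score matrix and reading it against the rule of thumb above. The computational formula used here, k / (k - 1) x (1 - sum of item variances / variance of the total score), is the standard one but is not spelled out on the flashcard itself.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a (participants x items) score matrix."""
    item_scores = np.asarray(item_scores, dtype=float)
    k = item_scores.shape[1]                              # number of items
    item_variances = item_scores.var(axis=0, ddof=1)      # variance of each item
    total_variance = item_scores.sum(axis=1).var(ddof=1)  # variance of the total score
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

# Made-up responses: 5 participants (rows) on 4 items (columns).
scores = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 2],
    [1, 2, 1, 1],
]

alpha = cronbach_alpha(scores)
label = "good" if alpha > 0.80 else "sufficient" if alpha >= 0.60 else "insufficient"
print(f"alpha = {alpha:.2f} ({label})")
```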
Validity
Validity refers to whether a measuring instrument measures what it claims to measure. The extent to which the observations reflect what we want to measure, i.e. the extent to which the observation reflects the concept or construct under investigation.
Differences validity and reliability
- Reliability refers to observations (scores)
- Validity refers to conclusions based on observations.
- Reliability concerns random measurement error.
- Validity issues have to do with systematic error. E.g. our policemen are only 90 meters apart instead of 100, so the observation does not reflect what we want to measure.
E.g. John scores higher on the IQ-test than Peter. Reliability: are we sure their true scores are different? Validity: Is John more intelligent than Peter?
Statistically significant
What is important for validity coefficients is that they are statistically significant at the .05 or .01 level (i.e. the p-value falls below that threshold).
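A small sketch, assuming SciPy is available and using made-up data, of checking whether a validity coefficient is statistically significant at the .05 level.

```python
from scipy import stats

# Made-up scores: a new test (x) and an established criterion measure (y) for the same people.
x = [12, 15, 9, 20, 17, 11, 14, 18, 10, 16]
y = [30, 34, 25, 41, 38, 27, 31, 40, 24, 35]

# Pearson correlation and its p-value.
r, p = stats.pearsonr(x, y)
print(f"r = {r:.2f}, p = {p:.3f}, significant at .05: {p < .05}")
```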
7 types of validity
- Content validity
- Face validity
- Criterion validity
  - Concurrent validity
  - Predictive validity
- Construct validity
- Statistical Conclusion validity
- Internal validity
- External validity
  - Population validity
  - Ecological validity
Content validity
Looks at the content of tests. Does it cover a representative sample of the domain you are researching?
Face validity
Face validity is whether or not a test looks valid on its surface (not the content!). Does the operationalization appear to be valid on its surface?
Criterion validity
Criterion validity measures how accurately an instrument predicts the behavior or ability in question. There are two types of criterion validity.
• Concurrent validity is used to estimate present performance. Is the test for bipolar disorder good at distinguishing people with and without depression?
• Predictive validity is used to estimate future performance. Is the personality test a good predictor for study success?
Construct validity
Assesses the extent to which a measuring instrument accurately measures a theoretical construct or trait that it is designed to measure.
Some examples of theoretical constructs or traits are verbal fluency, neuroticism, depression, anxiety, intelligence, and scholastic aptitude. Can the conclusions that were made actually be drawn from the research that was done?
(Statistical) Conclusion validity
Do the observations allow for the conclusion that variables are related?
Internal validity
Does the operationalization allow for the conclusion that variables are causally related?
External validity
Extent of generalizability of the conclusions
• Population validity: does the sample allow for conclusions about the target population?
• Ecological validity: does the procedure followed in the study allow for conclusions about more natural circumstances?
Cons of test/retest-reliability
Con:
- Practice effects
- Individuals may remember how they answered previously, both correctly and incorrectly. In this case we may be testing their memories and not the reliability of the testing instrument
Cons alternate form reliability
Con:
- Difficult to make the two forms parallel: same number of items, same difficulty, etc.
- Practice effects (not as much as test/retest)
Con split-half reliability
Con:
- Assesses the reliability of the test itself, but not its stability over time (the test is not administered twice in its entirety).
- Difficult to divide the items equally