First Exam Flashcards
Reliability
How consistent the entire instrument is; the closer the reliability coefficient is to 1, the more reliable the instrument
Psychometric theory looks at 2 things
One side looks at the entire test (reliability); the other side looks at item quality (non-dichotomous & dichotomous items)
How do you construct an instrument?
- by looking at the entire test (reliability) and at item quality (non-dichotomous & dichotomous items)
The entire test has 4 different types of reliability
-inter-rater, test-retest, internal consistency, parallel forms
Non-dichotomous and how it relates to variance
-you want higher variance to get a better normal curve; the more items you add, the more you increase the variance
Validity
Accuracy: how well the instrument measures what it is intended to measure (all of probability is based on infinity)
Reliability and error
Error can affect the consistency of scores
2 types of error
Systematic error & random error
Systematic error
Errors that occur consistently because of a particular characteristic of the person being tested (e.g., reading proficiency)
Random error
Errors that occur by chance (e.g., blacking out, a distraction); more common than systematic error
Different types of random error
Content differences, subjective scoring and temporal instability
Content differences (content based)
Non-standardized administration (the administrator may inadvertently speak differently when giving the test); ex: court-ordered testing or a child using the restroom during the test
Subjective scoring
Rater differences: raters' subjective views of the client may differ
Temporal instability
Day-to-day changes in the test taker; ex: the test taker had the flu on one day, or the first day of testing went well but an earthquake on the second day hurt performance
What are some ways to decrease measurement error
Writing clear items, making test instructions easily understood, adhering closely to the prescribed conditions for administering an instrument, training raters, and making subjective scoring rules as explicit as possible
Where does most measurement error come from?
Most comes from the person administering the test, but it decreases as the administrator becomes more experienced
Test-retest reliability (coefficient of stability)
When you take a single group of subjects and test them repeatedly with the same instrument at different times
What is the gold standard for test-retest reliability
2 weeks between the first test and the second test; this is where you get optimal test-retest reliability
In test-retest reliability what is the difference between the shorter and longer gap?
The longer the time gap, the lower the correlation; the shorter the gap, the more similar the factors that contribute to error
Artificial inflation
When researchers use a shorter gap to obtain a better (inflated) correlation
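The coefficient of stability described above is just the Pearson correlation between the two administrations' scores. A minimal sketch, using hypothetical illustration scores (not data from the notes):

```python
# Test-retest reliability (coefficient of stability) as a Pearson correlation.
# Scores below are hypothetical illustration data.

def pearson_r(x, y):
    """Pearson correlation between two score lists of equal length."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

time1 = [12, 15, 11, 18, 14, 16]   # first administration
time2 = [13, 14, 12, 17, 15, 16]   # second administration, ~2 weeks later
r = pearson_r(time1, time2)        # coefficient of stability
```

A value near 1 indicates stable scores across the two administrations; a long gap between testings tends to lower it.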
Parallel forms reliability
Assessing if two forms of the same instrument produce similar results when testing the same person (sometimes hard to achieve)
What is form A & form B (parallel forms reliability)
How reliable the two forms are with one another; having the two forms helps eliminate practice effects
What is a key problem with parallel forms reliability?
Difficult to randomly divide and hard to create large number of items
What is a key part of parallel forms reliability?
Developing a large number of items and then randomly dividing them into two tests
Coefficient of equivalence
How correlated a person's scores are when taking two different forms of the same instrument
When should the two forms for parallel forms reliability be sent out?
They should be administered at least 2 weeks apart
What happens if the correlation between the two testings is lower than .2?
There is significant measurement error
What happens if you administer the forms on the same day for parallel forms reliability?
The test may reflect state rather than trait, and you will not have a statistically significant difference
Internal consistency reliability
How related items are within the entire scale and within the subscales
What do we want with internal consistency reliability?
The content should be similar for reliability to be high; you need an adequate number of items, and you want the items to appropriately reflect a single underlying construct
Different types of internal consistency reliability
Split half reliability, Kuder-Richardson #20 (KR-20), Cronbach's alpha
Split half reliability
Split the examinees' item scores into two halves and then correlate the scores of the two halves
How does split half reliability look like in speeded tests?
An odd-even split may produce artificially high internal consistency if the examinee runs out of time
How to get good idea of split half reliability
Take the odd-numbered questions as one half and the even-numbered questions as the other; this odd-even split gives a better idea of split-half reliability
What are some problems with split half reliability?
The natural order of test taking (the content of the first half is not the same as the second half) & the issue of timed tests (some people never get to the second half)
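The odd-even split described above can be sketched as follows, with hypothetical 0/1 item responses. The Spearman-Brown step-up at the end is a standard correction (not covered in these notes) for each half being only half the test's length:

```python
# Split-half reliability with an odd-even split.
# Rows = examinees, columns = hypothetical dichotomous (0/1) item responses.

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

items = [
    [1, 0, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 1, 0, 1, 0],
]
odd  = [sum(row[0::2]) for row in items]   # scores on items 1, 3, 5
even = [sum(row[1::2]) for row in items]   # scores on items 2, 4, 6
r_half = pearson_r(odd, even)              # correlation of the two halves
# Spearman-Brown step-up: estimate reliability of the full-length test
r_full = 2 * r_half / (1 + r_half)
```

The odd-even split avoids the first-half/second-half content problem, but a speeded test can still inflate the result.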
Kuder-Richardson #20 (KR-20)
A formula that computes split-half reliability under the assumption that the questions are scrambled
How does Kuder-Richardson stop a confound in your test?
By removing the natural-order confound: the result no longer depends on item order
The drawbacks of KR 20
It only works with dichotomous scoring systems (only right-or-wrong question responses)
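The KR-20 computation can be sketched directly: it compares the sum of item variances p(1-p) to the variance of total scores. A minimal sketch with hypothetical 0/1 responses:

```python
# KR-20: reliability for dichotomously scored (0/1) items.
# Formula: (k / (k-1)) * (1 - sum(p*q) / variance_of_total_scores)

def kr20(items):
    """items: rows = examinees, columns = dichotomous (0/1) responses."""
    n = len(items)                      # number of examinees
    k = len(items[0])                   # number of items
    totals = [sum(row) for row in items]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n  # population variance
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in items) / n  # proportion getting item j right
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_t)

responses = [   # hypothetical data: 5 examinees, 6 right/wrong items
    [1, 0, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 1, 0, 1, 0],
]
rel = kr20(responses)
```

Because each item's variance is p(1-p), the formula only makes sense for right/wrong items, which is exactly the drawback noted above.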
Cronbach's alpha
Can be used to assess internal consistency for those tests that have different scoring systems
When and how can Cronbach's alpha be used?
It can be used on any scoring system and allows for scrambling of the questions; it is used more than any other measure of internal consistency and is equivalent to the average of all split-half correlations
Internal consistency & Cronbach's alpha
High coefficient alpha does not always mean that you are measuring only one factor or latent construct (unidimensionality)
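Cronbach's alpha generalizes KR-20 by replacing the sum of p(1-p) terms with the sum of the item variances, which is why it works for any scoring system. A minimal sketch with hypothetical 1-5 Likert responses:

```python
# Cronbach's alpha: (k / (k-1)) * (1 - sum_of_item_variances / total_variance)
# Works for any scoring system, not just right/wrong items.

def cronbach_alpha(items):
    """items: rows = respondents, columns = item scores (any scale)."""
    n = len(items)
    k = len(items[0])

    def var(xs):                        # population variance
        m = sum(xs) / n
        return sum((x - m) ** 2 for x in xs) / n

    totals = [sum(row) for row in items]
    item_vars = sum(var([row[j] for row in items]) for j in range(k))
    return (k / (k - 1)) * (1 - item_vars / var(totals))

likert = [   # hypothetical data: 5 respondents, 4 items scored 1-5
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
]
alpha = cronbach_alpha(likert)
```

A high alpha here means the items covary strongly, but as noted above, that alone does not prove the scale is unidimensional.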