Lecture 3 - Reliability Flashcards
Define reliability.
The extent to which a measurement tool gives consistent measurements.
What is reliability?
Consistency in measurement.
What is Classical Test Theory?
The concept that any actual/observed score is a combination of an individual’s true score and measurement error.
Classical Test Theory is the traditional conceptual basis of psychometrics. T/F
True
What is True Score Theory?
Another name for classical test theory.
What is a true score?
The aspect we actually want to measure, i.e. the underlying behaviour or trait captured by our measurement (e.g. real intelligence or real level of extroversion).
What is measurement error?
Everything captured within our observed score that isn’t what we wanted to measure.
If a whole egg was your observed score, what is the true score and measurement error?
Egg yolk - true score (e.g. middle of the brain, measuring intelligence, ability etc.)
Egg white - measurement error
Observed score is not fallible. T/F
False. It is fallible, i.e. prone to error.
True score is an ideal measurement (perfect and consistent) and constant for an individual. T/F
True.
Errors of measurement are random and unrelated to the true score. T/F
True.
Measurement error can easily be eliminated. T/F
False. It cannot be eliminated entirely.
How can the classical test theory (X=T+E) be described in terms of variation between people or more specifically, with variance?
Total Variation (X) = True Variation (T) (systematic) + Error Variation (E) (unsystematic)
What is reliability in terms of the relationship between true and total variance?
Reliability is the proportion of total score variance that is true score variance, i.e. true variance divided by total variance.
Variance is a way of measuring variation, and it is standard deviation squared. T/F
True.
Why do we describe classical test theory in terms of variance rather than standard deviations?
Variance is additive, standard deviation is not.
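The two ideas above (variance is additive; reliability is true variance over total variance) can be sketched with a small simulation. This is a hypothetical example with made-up numbers, not from the lecture:

```python
import random

random.seed(42)
n = 10_000

# Classical test theory: each observed score X is a true score T
# plus independent random error E (X = T + E).
T = [random.gauss(100, 15) for _ in range(n)]  # hypothetical true scores
E = [random.gauss(0, 5) for _ in range(n)]     # hypothetical measurement error
X = [t + e for t, e in zip(T, E)]              # observed scores

def var(xs):
    """Sample variance (standard deviation squared)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Variance is additive for independent components:
# var(X) ≈ var(T) + var(E) in a finite sample.
print(var(X), var(T) + var(E))

# Reliability = true variance / total variance
reliability = var(T) / var(X)
print(round(reliability, 2))  # close to 15**2 / (15**2 + 5**2) = 0.9
```

Note that the standard deviations (15 and 5) do not add up to the standard deviation of X; only the variances do, which is why classical test theory is stated in terms of variance.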
Give an example of X=T+E in terms of driving.
Total variation - variation in scores from questionnaires measuring low and high speeders.
True variation - people's actual speeding behaviour.
Error variation - error due to the questionnaire not reflecting what people actually choose to do.
True variance.
Hypothetical variation of test scores in a sample if there is no measurement error.
Total variance.
Actual variation in data, including error variation.
Lower measurement error = Higher reliability
Higher measurement error = Lower reliability
T/F
True.
If a person took the same test multiple times and their scores were widely spread out, would this be considered low or high reliability? Why?
Low reliability, because their scores were inconsistent.
Describe the various sources of measurement error.
Test construction (e.g. item sampling/content sampling: not every piece of content can be asked, so some people may, by luck, happen to know the answers to the particular subset of questions included in an exam)
Test administration (e.g. whether or not there were any distracting noises when the test was administered)
Test scoring (e.g. whether markers are more or less stringent; biased examiners)
Other influences: motivation, self-efficacy, etc.
What is item sampling/content sampling?
The sample of items, drawn from the full content of the construct being assessed, that is included in a particular test or measure.
Why can we only estimate the reliability of a test and not measure it directly?
Because true variance is hypothetical and cannot be measured directly - therefore we can only infer reliability.
Four methods available to help ESTIMATE reliability of a test.
Internal consistency; test-retest; alternate/parallel forms; inter-rater reliability.
How much the item scores in a test correlate with one another on average (e.g. Cronbach's alpha, KR-20)
Internal consistency.
If a test involves an examiner making a rating - get two of them to do the rating independently and see how much their ratings correlate.
Inter-rater reliability.
If people sit the same test twice, how much do their scores correlate between the two sittings.
Test-retest reliability.
If people do two different versions of the same test, how much do their scores on the two versions correlate.
Alternate-forms reliability.
Internal consistency.
Conceptually, this is the average correlation between the items on your scale. If all items on questionnaire are measuring the same thing, do individuals give consistent responses?
What are alternative names for internal consistency?
Inter-item consistency
Internal coherence
High internal consistency.
Scores from the items in a questionnaire that measure the same thing are consistent.
Low internal consistency.
Scores are inconsistent. Unreliable test.
What is Cronbach’s alpha a measure of?
Internal consistency.
When should you use Cronbach’s alpha?
When there are more than two possible outcomes to a question.
How do you calculate Cronbach’s alpha in SPSS?
Select Analyze; Scale; Reliability Analysis; select model ‘Alpha’
Select all items in your scale.
Click OK
Describe the steps involved in calculating Cronbach’s alpha by hand.
- Split questionnaire in half.
- Calculate total score for each half.
- Compute bivariate correlation between total scores for each half
- Repeat with every possible split-halves of the questionnaire
- Work out the average of all split-half correlations
- Adjust the correlation using the Spearman-Brown formula.
What does the Spearman-Brown formula achieve in regards to measuring Cronbach’s alpha?
Splitting the questionnaire halves the number of items, and reliability depends on the number of items, so each split-half correlation underestimates the reliability of the full questionnaire. The Spearman-Brown formula corrects for this (for a half-length split: corrected r = 2r / (1 + r)).
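The by-hand steps above can be sketched in Python. The data are made up (six people answering a 4-item scale), and the result is only an estimate related to alpha; averaging split-half correlations matches standardized alpha exactly only under certain assumptions:

```python
from itertools import combinations
from statistics import mean

# Hypothetical data: rows = people, columns = items of a 4-item scale.
scores = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 4],
    [1, 2, 1, 1],
    [4, 4, 3, 4],
]
k = len(scores[0])

def pearson(x, y):
    """Bivariate (Pearson) correlation between two lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Every possible split into two halves. Fixing item 0 in the first half
# ensures each split and its mirror image are counted only once.
correlations = []
for half in combinations(range(k), k // 2):
    if 0 not in half:
        continue
    other = [i for i in range(k) if i not in half]
    a = [sum(row[i] for i in half) for row in scores]   # total score, half 1
    b = [sum(row[i] for i in other) for row in scores]  # total score, half 2
    correlations.append(pearson(a, b))

# Average the split-half correlations, then apply the Spearman-Brown
# correction back up to full test length.
avg_r = mean(correlations)
alpha_estimate = 2 * avg_r / (1 + avg_r)
print(round(alpha_estimate, 3))
```

In practice you would use SPSS (or a statistics library) rather than this procedure, but it shows why splitting a test and why the Spearman-Brown adjustment are both needed.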
What is KR-20?
Kuder-Richardson 20. A measure of internal consistency used when the answers are dichotomous (e.g. true/false, yes/no, correct/incorrect).
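The standard KR-20 formula is (k / (k - 1)) × (1 - Σp·q / variance of total scores), where p is the proportion answering an item correctly and q = 1 - p. A minimal sketch with made-up dichotomous data (conventions differ on whether the total-score variance uses an n or n - 1 denominator; this sketch uses n - 1):

```python
# Hypothetical data: rows = people, columns = items (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1],
]
n = len(scores)
k = len(scores[0])

# p_i = proportion correct on item i; q_i = 1 - p_i
p = [sum(row[i] for row in scores) / n for i in range(k)]
pq_sum = sum(pi * (1 - pi) for pi in p)

# Sample variance (n - 1 denominator) of total test scores
totals = [sum(row) for row in scores]
mean_total = sum(totals) / n
var_total = sum((t - mean_total) ** 2 for t in totals) / (n - 1)

kr20 = (k / (k - 1)) * (1 - pq_sum / var_total)
print(round(kr20, 3))
```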
An examination is multiple choice, with four possible responses per question. To work out the internal consistency of the examination, should I use Cronbach's alpha or the Kuder-Richardson 20 formula?
KR-20. Even though you’ve got four things to choose between, there are only two ways it can go. You can either get it right or wrong (two outcomes).