Standardised methods Flashcards

1
Q

What does the reliability of a measure refer to?

A

Is it as free as possible from random error?
- accurate and consistent

2
Q

What does the validity of a measure refer to?

A

Does it measure what it says it measures?
- free from random AND systematic error

3
Q

What does unidimensionality of a measure refer to?

A

Are we measuring just the one thing we want to measure, or have we ended up measuring other things too?

4
Q

What does discrimination of a measure refer to?

A

How well do our items distinguish between levels of the thing we’re measuring?

5
Q

What does equivalence of a measure refer to?

A

Does the measure perform the same way for different groups of people?

6
Q

What is norm-referencing?

A

How are scores distributed in the population?

7
Q

How are measures standardised?

A
  • Rigorously tested for validity and reliability
  • Norm-referenced (compare scores against population norms)
  • Often delivered in tightly controlled ways
8
Q

What is the equation for observed scores?

A

Observed score = true score ± error
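E.g. (hypothetical numbers, echoing the tape-measure example in the next card): observed 174.6 cm = true 174.4 cm + error of +0.2 cm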

9
Q

What are random errors?

A

Usually small deviations above or below the true score
E.g., you measure a table three times using the same tape measure and get slightly different values: 174.6 cm, 174.2 cm, 174.4 cm
If we take a number of measurements, the sum and mean of the random errors should tend towards zero.
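A minimal Python sketch of this cancelling-out, assuming a hypothetical true length and small Gaussian noise:

    import random

    true_length = 174.4  # hypothetical true table length in cm

    # Take many measurements, each with a small random error above or below the true score
    measurements = [true_length + random.gauss(0, 0.2) for _ in range(10_000)]

    mean_error = sum(m - true_length for m in measurements) / len(measurements)
    print(mean_error)  # close to zero: random errors tend to cancel out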

10
Q

What are systematic errors?

A

Unlike random errors, systematic errors do not cancel each other out across multiple measurements: they accumulate
E.g., the plastic tape measure that you use to measure the table has been stretched from years of use, so it consistently underestimates the true length of the table
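A companion Python sketch for the stretched-tape case (hypothetical bias of -1.5 cm), showing that averaging does not remove the error:

    import random

    true_length = 174.4  # hypothetical true table length in cm
    bias = -1.5          # stretched tape consistently under-reads (hypothetical)

    measurements = [true_length + bias + random.gauss(0, 0.2) for _ in range(10_000)]

    mean_error = sum(m - true_length for m in measurements) / len(measurements)
    print(mean_error)  # stays near -1.5: averaging does not remove systematic error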

11
Q

What is an example of random error and systematic error in a questionnaire?

A

Random error: today you answer 5 (strongly agree); next week you answer 4 (mostly agree)
Systematic error: administering the questionnaire during the Covid-19 pandemic, when very few people are socialising regularly

12
Q

When do systematic errors occur?

A

When items are supposed to measure just one dimension of a construct (unidimensional) but in fact measure more than one.
E.g., intended dimension: extraversion; unintended dimension: testing environment (during a pandemic)

13
Q

How could random error be reduced?

A
  • Repeat measurements and average them (not as simple for psychological variables)
14
Q

How can systematic errors be reduced?

A
  • Use multiple measures, each with different downsides (nuisance factors): the variable of interest is measured consistently, but the nuisance factors are not
15
Q

What does it mean to be consistent/dependable? (reliability)

A
  • across time and context
16
Q

What is test-retest reliability?

A

If you measure something at one point in time, will it remain consistent at another point in time?
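A minimal Python sketch of how this is commonly quantified - correlating scores from two administrations (made-up data):

    import numpy as np

    # Hypothetical scores from the same six people at two time points
    time1 = np.array([24, 31, 18, 27, 22, 29])
    time2 = np.array([25, 30, 17, 28, 21, 31])

    r = np.corrcoef(time1, time2)[0, 1]
    print(r)  # a high correlation suggests the measure is temporally stable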

17
Q

What is parallel form reliability?

A

Will the measured characteristic be the same when using multiple versions of a measure?

18
Q

What is internal consistency?

A

Are all items doing just as good a job as one another in measuring the psychological construct of interest?
- relates to how the construct is operationalised
If the items measure the same thing, they will be highly correlated

19
Q

What is a strength of test-retest reliability?

A

Demonstrates that the measure is temporally stable

20
Q

What are 3 weaknesses of test-retest reliability?

A
  • Based only on the total score
  • Doesn’t account for transient states such as emotion or motivation
  • Unclear how long to leave between testing sessions
21
Q

What is a strength of parallel forms reliability?

A

Reduces risk of learning effects when evaluating reliability over time

22
Q

What is a measure of internal consistency?

A

Cronbach’s Alpha

23
Q

What does Cronbach’s alpha measure?

A
  • The mean correlation between items in a subscale
  • The number of items in the subscale (computation sketched below)
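A minimal Python sketch of the standard variance form of the alpha formula, with toy data:

    import numpy as np

    def cronbach_alpha(items):
        """Cronbach's alpha for a respondents-by-items score matrix."""
        k = items.shape[1]                          # number of items in the subscale
        item_vars = items.var(axis=0, ddof=1)       # variance of each item
        total_var = items.sum(axis=1).var(ddof=1)   # variance of the total score
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    # Toy data: 5 respondents answering 4 Likert items
    scores = np.array([
        [4, 5, 4, 4],
        [2, 2, 3, 2],
        [5, 4, 5, 5],
        [3, 3, 3, 4],
        [1, 2, 1, 2],
    ])
    print(round(cronbach_alpha(scores), 2))  # 0.96 here; > 0.7 is the usual research threshold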
24
Q

What is the maximum value of Cronbach’s alpha?

A

1 (higher = more reliable)

25
Q

What Cronbach’s alpha value indicates acceptable reliability for research purposes?

A

> 0.7

26
Q

What is the split-half technique?

A

Another way of quantifying internal consistency
- compare scores across the two halves of a measure
E.g., a questionnaire has 20 items: does the total score of the first 10 items correlate with the total score of the last 10?
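A minimal Python sketch of the split-half idea, simulating responses driven by one hypothetical latent trait:

    import numpy as np

    rng = np.random.default_rng(0)

    # 50 hypothetical respondents: each item score = latent trait + noise
    trait = rng.normal(3, 1, size=50)
    responses = trait[:, None] + rng.normal(0, 1, size=(50, 20))  # 20 items

    first_half = responses[:, :10].sum(axis=1)   # total of items 1-10
    second_half = responses[:, 10:].sum(axis=1)  # total of items 11-20

    r = np.corrcoef(first_half, second_half)[0, 1]
    print(r)  # a high correlation suggests the two halves measure the same thing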

27
Q

What are 2 strengths of internal consistency?

A
  • It’s essential! Poor internal consistency can only be due to items measuring different things
  • Rubbish in, rubbish out…
28
Q

What are 2 weaknesses of internal consistency?

A
  • If you increase the number of items, Cronbach’s alpha increases
  • Extremely high Cronbach’s alpha values might be bloated - too narrow a range of questions was asked
29
Q

What is inter-rater reliability largely used for?

A

Coding of observational data
- can be subjective (criteria harder to interpret)
- can be objective (clearer criteria)
A simple agreement check is sketched below.
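A minimal Python sketch of percentage agreement between two raters (hypothetical codes; dedicated statistics also exist):

    # Hypothetical behaviour codes assigned by two raters to the same 8 observations
    rater_a = ["play", "rest", "play", "talk", "rest", "play", "talk", "rest"]
    rater_b = ["play", "rest", "talk", "talk", "rest", "play", "talk", "play"]

    agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
    print(agreement)  # 0.75: proportion of observations coded identically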

30
Q

What is internal validity?

A

Could the observed causal relationship be explained by other factors (confounds)?

31
Q

What is external validity?

A

Can we generalise to other situations/populations?

32
Q

What is construct validity?

A

How well we are measuring what we want to measure

33
Q

What is translation validity?

A

Is the operationalisation a good reflection of the construct?

34
Q

What is criterion validity?

A

How well does the measure agree with some external standard?

35
Q

What is face validity?

A

Does the instrument appear to measure the construct?
- not based on theoretical concepts

36
Q

What is content validity?

A

To what extent do the items actually represent the whole of the construct dimension that we are trying to measure?

37
Q

What are the 4 sub-categories of criterion validity?

A
  • predictive validity
  • concurrent validity
  • convergent validity
  • discriminant validity
38
Q

What is predictive validity?

A

Does a score on the measure predict the value of another variable in the future?

39
Q

What is concurrent validity?

A

Does the measure correlate with information from a related measure taken at the same time?

40
Q

What is convergent validity?

A

Does the measure correlate with another variable that it should theoretically be related to?

41
Q

What is discriminant validity?

A

Does the measure correlate with a conceptually unrelated construct? (It shouldn’t - a high correlation here is bad)

42
Q

What are 2 ways to score subscales?

A
  • sum scores (if the scale is not equally weighted)
  • mean scores across the subscale (unequal numbers of items in subscales but equal weighting); both are sketched below
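A minimal Python sketch of both scoring options (hypothetical item scores):

    # Hypothetical subscale with 4 items scored 1-5
    items = [4, 3, 5, 4]

    sum_score = sum(items)                # 16
    mean_score = sum(items) / len(items)  # 4.0, comparable across subscales of different lengths
    print(sum_score, mean_score)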
43
Q

How do you get standardised scores?

A
  • collect large amounts of data from samples that are representative of the population
  • convert raw scores to standard scores, such as z-scores: how far a score is from the mean, in standard deviations (see the sketch below)
  • 50% of participants will score below the mean and 50% above it
  • an ongoing process
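A minimal Python sketch of the raw-to-z conversion (made-up norming data):

    import statistics

    # Hypothetical raw scores from a large, representative norming sample
    raw_scores = [12, 15, 9, 18, 14, 11, 16, 13, 10, 17]

    mean = statistics.mean(raw_scores)
    sd = statistics.stdev(raw_scores)

    # z-score = how many standard deviations a raw score lies from the mean
    z_scores = [(x - mean) / sd for x in raw_scores]
    print(round(z_scores[0], 2))  # z for a raw score of 12 (about -0.5 here)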
44
Q

Why do you want to use standardised scores?

A
  • Gives a reference point for where a score on a measure lies compared to the population
  • Can assess people against these norms
45
Q

What are extreme scores (ref. standardised scores)?

A

A certain number of SDs below the mean (usually 1.5 or 2)
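A minimal Python sketch of flagging extreme scores (hypothetical values and cutoff):

    z_scores = [-2.1, -0.4, 0.3, 1.8, -1.7, 0.0]  # hypothetical standardised scores

    cutoff = -1.5  # 1.5 SDs below the mean; 2 is also commonly used
    extreme = [z for z in z_scores if z < cutoff]
    print(extreme)  # [-2.1, -1.7]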

46
Q

What are advantages of using standardised measures?

A

Rigorous design process
- start with hundreds of possible items
- initial item list often evaluated by an expert panel
- often subjected to factor analysis
Validity and reliability repeatedly tested
Tests have descriptive statistics for population norms, which you can use to compare with your own data

47
Q

Why is adapting an existing measure risky?

A
  • Even a slight alteration of wording can impact how people answer
  • You can’t lay claim to the original questionnaire’s reliability or validity after adaptation
  • Adapted questionnaires should undergo some pre-testing to evaluate reliability and validity, and these findings must be reported