Standardised methods Flashcards
What does the reliability of a measure refer to?
Is it as free as possible from random error?
- accurate and consistent
- free from random error
What does the validity of a measure refer to?
Does it measure what it says it measures?
- free from random AND systematic error
What does unidimensionality of a measure refer to?
Are we measuring just the one thing we want to measure or have we ended up measuring other things too?
What does discrimination of a measure refer to?
How well do our items distinguish between levels of the thing we’re measuring?
What does equivalence of a measure refer to?
Does the measure perform the same way for different groups of people?
What is norm-referencing?
How are scores distributed in the population?
How are measures standardised?
- rigorously tested for validity and reliability
- Norm-referenced (compare scores against population norms)
- Often delivered in tightly controlled ways
What is the equation for observed scores?
Observed score = true score +/- error
What are random errors?
Usually small deviations above or below true score
E.g., you measure a table three times using the same tape measure and get slightly different values: 174.6 cm, 174.2 cm, 174.4 cm
If we take a number of measurements, the sum and mean of random errors should tend towards zero.
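A quick simulation can illustrate this cancelling-out. The sketch below (true length, noise level, and number of measurements are all invented for illustration) adds a small symmetric random error to each measurement and shows the mean drifting back towards the true score:

```python
import random

random.seed(0)

TRUE_LENGTH = 174.4  # hypothetical true table length in cm

# Each measurement adds a small random error drawn from a
# symmetric distribution centred on zero.
measurements = [TRUE_LENGTH + random.gauss(0, 0.3) for _ in range(10000)]

mean_measurement = sum(measurements) / len(measurements)
# With many measurements, the random errors above and below the
# true score largely cancel, so the mean approaches TRUE_LENGTH.
print(round(mean_measurement, 1))
```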
What are systematic errors?
Unlike random errors, systematic errors do not cancel each other out with multiple measurements: they accumulate
E.g. The plastic tape measure that you use to measure the table has been stretched out from years of use. It consistently underestimates the true length of the table
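The stretched-tape example can be simulated the same way (again with invented numbers): a constant bias is added to every measurement, and averaging removes only the random part of the error, not the bias.

```python
import random

random.seed(1)

TRUE_LENGTH = 174.4   # hypothetical true table length in cm
STRETCH_BIAS = -2.0   # stretched tape under-reads by 2 cm every time

# Every measurement carries the same systematic bias
# plus a small random error.
measurements = [TRUE_LENGTH + STRETCH_BIAS + random.gauss(0, 0.3)
                for _ in range(10000)]

mean_measurement = sum(measurements) / len(measurements)
# Averaging removes the random error but NOT the systematic bias:
# the mean settles near TRUE_LENGTH + STRETCH_BIAS, not TRUE_LENGTH.
print(round(mean_measurement, 1))
```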
What is an example of random error and systematic error in a questionnaire?
Random error: answering 5 (strongly agree) today but 4 (mostly agree) next week
Systematic error: administering the questionnaire during the Covid-19 pandemic, when very few people are socialising regularly
When do systematic errors occur?
When items are supposed to measure just one dimension of a construct (unidimensionality) but in fact measure more than one.
eg. intended dimension: extraversion
Unintended dimension: testing environment (during pandemic)
How could random error be reduced?
- repeat measurements and average them (not as simple for psychological variables)
How can systematic errors be reduced?
- Use multiple measures, each with different downsides (nuisance factors) - the variable of interest is measured consistently, but the nuisance factors are not
What does it mean to be consistent/dependable? (reliability)
- across time and context
What is test-retest reliability?
If you measure something at one point in time, will it remain consistent at another point in time?
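Test-retest reliability is typically quantified by correlating scores from the two testing sessions. A minimal sketch (the participant scores and the two-week gap are invented for illustration):

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical total scores for 6 participants, two weeks apart:
time_1 = [22, 35, 28, 40, 18, 31]
time_2 = [24, 33, 27, 41, 20, 30]

# A high correlation suggests the measure is stable over time.
test_retest_r = pearson(time_1, time_2)
```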
What is parallel form reliability?
Will the measured characteristic be the same when using multiple versions of a measure?
What is internal consistency?
Are all items doing just as good a job as one another in measuring the psychological construct of interest?
- operationalization
If items measure the same thing, their scores will be highly correlated
What is a strength of test-retest reliability?
Demonstrates that the measure is temporally stable (consistent over time)
What are 3 weaknesses of test-retest reliability?
- Based on total score
- What about emotion or motivation?
- How long between testing sessions?
What is a strength of parallel forms reliability?
Reduces risk of learning effects when evaluating reliability over time
What is a measure of internal consistency?
Cronbach’s Alpha
What is Cronbach’s alpha based on?
- Mean correlation between items in a subscale
- Number of items in a subscale
What is the maximum value of Cronbach’s alpha?
1 (higher=more reliable)
What Cronbach’s alpha value indicates reliability for research purposes?
> 0.7
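Cronbach’s alpha can be computed as α = k/(k−1) × (1 − Σ item variances / variance of total scores). A minimal Python sketch, with invented item scores for a 4-item subscale answered by 6 participants:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding one score
    per participant (all lists the same length)."""
    k = len(item_scores)
    sum_item_vars = sum(pvariance(item) for item in item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]  # per-participant totals
    total_var = pvariance(totals)
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)

# Hypothetical 4-item subscale, 6 participants: items that rise
# and fall together across participants give a high alpha.
items = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 4],
    [5, 5, 2, 4, 1, 5],
    [3, 4, 3, 4, 2, 4],
]
alpha = cronbach_alpha(items)
```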
What is the split-half technique?
Another way of quantifying internal consistency
- compare scores across 2 halves of a measure
eg. questionnaire has 20 items – does total score of first 10 items correlate with total score of second 10 items?
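The 20-item example above can be sketched directly: total the first 10 items and the last 10 items per participant, then correlate the two halves. The response data below are invented for illustration:

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical responses: each row is one participant's 20 item scores.
responses = [
    [4, 5, 4, 3, 5, 4, 4, 5, 3, 4,   5, 4, 4, 3, 5, 4, 5, 4, 3, 4],
    [2, 1, 2, 3, 1, 2, 2, 1, 3, 2,   1, 2, 2, 3, 1, 2, 1, 2, 3, 2],
    [3, 3, 4, 3, 3, 4, 3, 3, 4, 3,   3, 4, 3, 3, 4, 3, 3, 3, 4, 4],
    [5, 5, 4, 5, 5, 4, 5, 5, 4, 5,   5, 4, 5, 5, 4, 5, 5, 5, 4, 5],
    [1, 2, 1, 2, 2, 1, 1, 2, 2, 1,   2, 1, 1, 2, 2, 1, 2, 1, 2, 1],
]

first_half = [sum(r[:10]) for r in responses]
second_half = [sum(r[10:]) for r in responses]
# A strong correlation between the two halves suggests the items
# are measuring the same underlying construct.
split_half_r = pearson(first_half, second_half)
```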
What are 2 strengths of internal consistency?
- It’s essential! Poor internal consistency can only be due to items measuring different things
- Rubbish in, rubbish out…
What are 2 weaknesses of internal consistency?
- If you increase number of items, Cronbach’s alpha increases
- Extremely high Cronbach’s alpha values might be bloated - too narrow a range of questions were asked
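The first weakness can be seen from the standardised form of alpha, which depends only on the number of items k and the mean inter-item correlation: holding the mean correlation fixed, alpha climbs as items are added. The item counts and correlation below are illustrative:

```python
def standardised_alpha(k, mean_r):
    """Standardised Cronbach's alpha for k items with mean
    inter-item correlation mean_r."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)

# Mean inter-item correlation held fixed at 0.3, yet alpha
# increases purely because more items are added:
alphas = {k: round(standardised_alpha(k, 0.3), 2) for k in (5, 10, 20, 40)}
```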
What is inter-rater reliability largely used for?
Coding of observational data
- could be subjective - hard criteria to interpret
- could be objective - clearer criteria
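A common statistic for inter-rater reliability with categorical codes is Cohen’s kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch (the behaviour codes and two raters’ judgements are invented for illustration):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical codes
    to the same set of observations."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same code
    # if each assigned codes at their own base rates.
    expected = sum(counts_a[c] * counts_b[c]
                   for c in set(rater_a) | set(rater_b)) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical codes two raters assigned to 10 observed behaviours:
rater_a = ["play", "play", "talk", "idle", "play",
           "talk", "talk", "idle", "play", "talk"]
rater_b = ["play", "play", "talk", "idle", "play",
           "talk", "idle", "idle", "play", "talk"]
kappa = cohens_kappa(rater_a, rater_b)
```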
What is internal validity?
Can the causal relationship be explained by factors other than the variables studied (confounds)?
What is external validity?
Can we generalise to other situations/populations?
What is construct validity?
How well we are measuring what we want to measure
What is translation validity?
Is the operationalisation a good reflection of the construct?
What is criterion validity?
How well does the measure agree with some external standard?
What is face validity?
Does the instrument appear to measure the construct?
- not based on theoretical concepts
What is content validity?
To what extent do the items actually represent the whole of the construct dimension that we are trying to measure?
What are the 4 sub-categories of criterion validity?
- predictive validity
- concurrent validity
- convergent validity
- discriminant validity
What is predictive validity?
Does a score on the measure predict the value of another variable in the future?
What is concurrent validity?
Does the measure now correlate with info from a related measure?
What is convergent validity?
Does the measure correlate with another variable that it should theoretically be related to?
What is discriminant validity?
Does the measure correlate with a conceptually unrelated construct? (BAD)
What are 2 ways to score subscales?
- sum scores across items in a subscale (comparable only if subscales have equal numbers of items)
- mean scores across subscale (unequal number of items in subscales but equal weighting)
How do you get standardised scores?
- collect large amounts of data from sample that are representative of population
- Convert raw scores to standard scores (such as z-scores- how far a score is from the mean using standard deviations)
- 50% of participants will score below the mean and 50% above it
- an ongoing process
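The z-score conversion in the steps above can be sketched directly: subtract the mean and divide by the standard deviation, so each score becomes a distance from the mean in SD units. The raw scores below are invented for illustration:

```python
from statistics import mean, pstdev

def z_scores(raw_scores):
    """Convert raw scores to z-scores: how far each score lies
    from the mean, in standard-deviation units."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    return [(x - m) / sd for x in raw_scores]

# Hypothetical raw scores from a normative sample:
raw = [12, 15, 9, 18, 21, 15, 12, 18]
zs = z_scores(raw)
# By construction, z-scores have mean 0 and SD 1.
```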
Why do you want to use standardised scores?
- Gives a reference point for where a score on a measure lies compared to population
- can assess people against these norms
What are extreme scores (ref. standardised scores)?
A certain number of SDs below (or above) the mean (usually 1.5 or 2)
What are advantages of using standardised measures?
Rigorous design process
- start with hundreds of possible measures (items)
- Often initial item list evaluated by expert panel
- often subject to factor analysis
Validity and reliability repeatedly tested
Tests have descriptive statistics for population norms which you can use to compare with your own
Why is adapting an existing measure risky?
- Even a slight alteration of wording can impact how people answer
- can’t lay claim to the original questionnaire’s reliability or validity after adaptation
- Adapted questionnaires should undergo some pre-testing to evaluate reliability and validity - must report these findings