Week 4 - Test Construction Flashcards

1
Q

Rational-empirical approach

A

relies on both reasoning from what is known about the psychological construct, and collecting and evaluating data about how the test and items actually behave when administered

2
Q

Empirical approach

A

relies on collecting and evaluating data about how each item from a pool of items discriminates between groups who are thought to show or not show the measured attribute

3
Q

Steps (5)

A
  1. Test Conceptualisation
  2. Test Construction
  3. Test Tryout
  4. Item Analysis
  5. Test Revision
4
Q

Steps (specific)

A
  1. Specify Attributes
  2. Check literature for existing test
  3. Choose measurement model
  4. Write and edit items
  5. Administer and analyse responses
  6. Select ‘best’ items for test
  7. Check reliability and validity
  8. Norm
  9. Prepare test manual
  10. Publish test
5
Q

Specification of the attribute

A

Attribute, construct, latent trait, test specification

6
Q

Attribute

A

a consistent set of behaviours, thoughts or feelings that make up a characteristic

7
Q

Construct

A

a specific idea or concept about a psychological process or underlying trait

8
Q

Latent Trait

A

involves the strong assumption that there is only one dimension underlying the attribute

9
Q

Test specification

A

a written statement of the attribute or construct that the test constructor is seeking to measure and the conditions under which the test will be used

10
Q

Literature search

A
  • See how others have approached the problem in the past
  • Identify theories or other constructs that may be relevant
  • Obtain a clear, theory-informed conceptualisation and definition of the target construct
11
Q

Literature search questions

A

○ Do psychological traits and states exist?
○ Can they be measured?
○ Is test behaviour predictive?
○ What are the test's strengths/weaknesses/errors?
○ Is it fair and will it benefit society?

12
Q

Types of measure

A
  • Nominal - classification or categorisation - categories themselves are not meaningful - no relationship between categories (includes yes/no questions)
  • Ordinal - classification but in some sort of rank - not units of measurement, not evenly spaced
  • Interval - equal intervals between each number
  • Ratio - as per interval, but has a true zero
13
Q

Models of measurement

A

a formal statement of how observations of objects are mapped to numbers that represent the relationships among the objects

14
Q

Trace line

A

a graph of the probability of response to an item
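
A trace line can be tabulated with a few lines of Python. This is only a sketch: the two-parameter logistic form and the parameter names `discrimination` and `difficulty` are assumptions chosen for illustration, not something the cards specify.

```python
import math

def trace_line(theta, discrimination=1.0, difficulty=0.0):
    """Probability of a keyed response to an item at trait level theta.

    Assumed form: two-parameter logistic curve (illustrative only).
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

# Tabulate the curve for a hypothetical item of average difficulty.
trait_levels = [-3, -2, -1, 0, 1, 2, 3]
curve = [trace_line(theta, discrimination=1.5) for theta in trait_levels]
for theta, prob in zip(trait_levels, curve):
    print(f"theta={theta:+d}  P(response)={prob:.3f}")
```

Under this assumed form the probability rises with the trait level and equals 0.5 where the trait level matches the item's difficulty.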

15
Q

Classical test theory, Item Response theory (CTT, IRT)

A

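Classical test theory (CTT) - an observed score is treated as a true score plus random error; analysis focuses on the test as a whole
Item response theory (IRT) - models the probability of a particular response to an item as a function of the test-taker's level on the latent trait and the properties of the item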

16
Q

Differential item functioning

A

possibility that a psychological test item will behave differently for different groups of respondents

17
Q

Item writing and editing

A
  • Items should be relevant and representative
  • Plan for item writing - a plan of the number and type of items that are required for a test
  • Over inclusion of items is recommended at this point
  • Aim for a 5th to 7th grade reading level
18
Q

Item writing guidelines

A
  • Use straightforward language
  • Avoid double-barrelled items
  • Avoid slang and colloquial expressions that can quickly become obsolete
  • Consider if using positive and negative words is a good idea
  • Write items that the majority can respond to appropriately
  • Ask about sensitive issues using straightforward and non-judgemental language
  • Phrasing is consistent with response options
19
Q

Likert Scale

A

Typically provides the test-taker with 5 or 7 possible responses along a continuum

Pros - degree of trait can be measured, lots of information, easy to use, works best with strong statements
Cons - must choose between an odd and an even number of response options (i.e. whether to offer a neutral midpoint)

20
Q

Binary choice scale

A

two options, i.e. true/false, yes/no

Pros - easy to construct and score, quick to administer, a lot of questions
Cons - Allows guessing, only suits dichotomous content, content not as rich

21
Q

Paired comparisons

A

the test-taker chooses between two options on the basis of some rule, with each option assigned a value (0 or 1)

22
Q

Comparative scaling

A

sorting or ranking stimuli according to a rule

23
Q

Written/essay formats

A

Pros - assesses written communication, allows complex and imaginative responses, information is generated rather than recognised

Cons - narrow content, bluffing possible, hiding behind good writing, scoring is time consuming, inter-rater reliability issues

24
Q

Test try out

A
  • Administer test on representative sample
  • Use standardised instructions
  • Data is then used to narrow down number of items
25
Q

Item Analysis

A

Properties to investigate:
Item difficulty/distribution, Dimensionality (i.e. factor analysis), Item reliability, Item validity, Item discrimination

26
Q

Item Validity

A

the extent to which the score on an item correlates with an external criterion relevant to the attribute

27
Q

Performance of items

Item Difficulty Index

A

Performance on each individual item should differ between examinees (e.g. an item that 100% of examinees answer correctly tells you nothing)

Item difficulty index = number of examinees who answered correctly / total number of examinees

  • High index = low difficulty
  • The probability of guessing correctly is taken into account when deciding the optimal item-difficulty index
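
The index is easy to compute from scored responses. A minimal sketch follows, with made-up data. The midpoint rule in `optimal_difficulty` (halfway between the chance success rate and 1.0) is one common rule of thumb and is an assumption here; the cards only say that guessing is taken into account.

```python
def item_difficulty(responses):
    """Proportion of examinees who answered correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)

def optimal_difficulty(n_options):
    """Assumed rule of thumb: midpoint between chance-level success and 1.0."""
    chance = 1 / n_options
    return (chance + 1.0) / 2

# Hypothetical scored responses from ten examinees on one item.
item_scores = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
p = item_difficulty(item_scores)

print(f"difficulty index: {p}")
print(f"target for a 4-option item: {optimal_difficulty(4)}")
print(f"target for a true/false item: {optimal_difficulty(2)}")
```

Seven of the ten examinees answer correctly here, so the index is 0.7, a fairly easy item.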
28
Q

Item distributions

A
  • Consider removing items with skewed distribution - these are items most people will answer in the same way
  • Keep items with high variance/distribution
  • Keep items with a mean close to centre of range of possible scores
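
These checks can be sketched as a simple screen over item scores. The data, the 1 to 5 response scale, and both cut-offs (`min_variance`, `max_distance`) are hypothetical values picked for illustration; real cut-offs are a judgement call.

```python
from statistics import mean, pvariance

def summarise_item(scores, scale_min=1, scale_max=5):
    """Mean, variance, and distance of the mean from the centre of the scale."""
    centre = (scale_min + scale_max) / 2
    item_mean = mean(scores)
    return {
        "mean": item_mean,
        "variance": pvariance(scores),
        "distance_from_centre": abs(item_mean - centre),
    }

def worth_keeping(summary, min_variance=0.5, max_distance=1.0):
    """Hypothetical screen: enough spread and a mean near the scale centre."""
    return (summary["variance"] >= min_variance
            and summary["distance_from_centre"] <= max_distance)

# Hypothetical responses on a 1-5 scale.
items = {
    "item_a": [1, 2, 3, 4, 5, 3, 2, 4],  # responses spread across the scale
    "item_b": [5, 5, 5, 4, 5, 5, 5, 5],  # almost everyone answers the same way
}
for name, scores in items.items():
    summary = summarise_item(scores)
    print(name, summary, "keep" if worth_keeping(summary) else "consider removing")
```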
29
Q

Dimensionality

A

Some items may not have a common underlying variable or they may have several underlying variables (go to factor analysis)

30
Q

Factor analysis

A
  • New scale development usually starts with exploratory factor analysis (EFA) to identify a manageable number of factors
  • Confirmatory factor analysis (CFA) used when number of factors is known
  • Determine the number of underlying latent variables or constructs
  • Help condense information
  • Define the content or meaning of the factors
  • Helps identify items that are performing better or worse
31
Q

Factor analysis decisions

A
  • Number of factors to extract - use eigenvalues (> 1) and/or a scree plot
  • Rotation - oblique (factors are correlated) or orthogonal (factors are uncorrelated)
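
The eigenvalue rule can be illustrated with NumPy. This is a sketch under stated assumptions: the correlation matrix `R` is invented (two pairs of related items), and only the eigenvalue count is shown, not a full factor analysis with extraction and rotation.

```python
import numpy as np

# Hypothetical correlation matrix for four items:
# items 1-2 correlate 0.6, items 3-4 correlate 0.6, cross-pairs 0.2.
R = np.array([
    [1.0, 0.6, 0.2, 0.2],
    [0.6, 1.0, 0.2, 0.2],
    [0.2, 0.2, 1.0, 0.6],
    [0.2, 0.2, 0.6, 1.0],
])

# eigvalsh is for symmetric matrices and returns ascending order; reverse it.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Eigenvalue > 1 rule: count the factors to extract.
n_factors = int(np.sum(eigenvalues > 1.0))

# Listing the eigenvalues in order is the raw material for a scree plot.
for rank, value in enumerate(eigenvalues, start=1):
    print(f"factor {rank}: eigenvalue {value:.2f}")
print(f"factors with eigenvalue > 1: {n_factors}")
```

For this matrix the eigenvalues are 2.0, 1.2, 0.4 and 0.4, so the rule suggests two factors, matching the two item pairs built into the example.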
32
Q

Item reliability

A
  • a measure of internal consistency

Are the items homogeneous? - Correlation between the score for the test item and the scale score (item-scale correlations), inter-relatedness (Cronbach’s alpha)
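
Both statistics can be computed by hand. A minimal sketch with invented data (rows are respondents, columns are items) follows. The item-scale correlation here is uncorrected, meaning the item is included in the scale score, which is an assumption about how the scale score is formed.

```python
from statistics import mean, variance

def cronbach_alpha(rows):
    """Cronbach's alpha from a respondents-by-items table of scores."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def item_scale_correlations(rows):
    """Correlation of each item with the total (scale) score."""
    totals = [sum(row) for row in rows]
    return [pearson([row[i] for row in rows], totals) for i in range(len(rows[0]))]

# Hypothetical scores: five respondents, three items.
scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(scores)
correlations = item_scale_correlations(scores)
print(f"alpha = {alpha:.3f}")
print("item-scale correlations:", [round(r, 3) for r in correlations])
```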

33
Q

Item-discrimination index

A
  • Does the item separate high and low scorers
  • Comparison of top and bottom performers on the test
  • Commonly calculated using a point-biserial correlation
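
Both approaches can be sketched in a few lines. The data are invented, and the 27% cut for the top and bottom groups is a common convention assumed here rather than something the cards state.

```python
from statistics import mean, pstdev

def point_biserial(item, totals):
    """Point-biserial correlation between a 0/1 item and total test scores."""
    passed = [t for i, t in zip(item, totals) if i == 1]
    failed = [t for i, t in zip(item, totals) if i == 0]
    p = len(passed) / len(item)
    return (mean(passed) - mean(failed)) / pstdev(totals) * (p * (1 - p)) ** 0.5

def discrimination_index(item, totals, fraction=0.27):
    """Proportion correct in the top group minus proportion correct in the bottom group."""
    n_group = max(1, round(fraction * len(item)))
    ranked = [i for _, i in sorted(zip(totals, item), key=lambda pair: pair[0], reverse=True)]
    upper, lower = ranked[:n_group], ranked[-n_group:]
    return mean(upper) - mean(lower)

# Hypothetical data: item responses (1 = correct) and total test scores.
item = [1, 1, 1, 0, 0, 1, 0, 0]
totals = [9, 8, 7, 4, 3, 8, 5, 2]

r_pb = point_biserial(item, totals)
d = discrimination_index(item, totals)
print(f"point-biserial r = {r_pb:.3f}")
print(f"discrimination index d = {d}")
```

In this example everyone in the top group passes the item and everyone in the bottom group fails it, so the item separates high and low scorers as well as it possibly can.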
34
Q

Test revision

A

Once a test has been revised, it needs to be tried out and go through item analysis again

Existing tests - tests can ‘age’ (interpretations, domains, stimuli change; word meanings, test norms, theories behind test) - may need to be reviewed

35
Q

Cross-validation

A

Collection of additional criterion-related validity data
- Is the test applicable to this population?

Validity Shrinkage: Often lower validity the second time around
- Inevitable
- Generally a slight difference
- Eliminating chance results
- Near enough is good enough!

36
Q

Norming

A

In a representative population
- General population vs specific population

Move on to creating a test manual/instructions and publication