WEEK 10 - Testing Flashcards

1
Q

What are the steps in test questionnaire construction?

A
  1. Define the test
  2. Select a scaling method
  3. Construct the items
  4. Test the items
  5. Revise the test
  6. Publish the test
2
Q

How is a test defined?

A
  • Test/questionnaire
  • Item
  • Measure
  • Has a test already been developed?
3
Q

What is an item in relation to a test?

A
  • Generic word for various forms of content in a psychological test or questionnaire
  • Measurement of attribute
  • Carefully selected
4
Q

In defining a test, how do you establish what you are seeking to measure?

A
  • Develop a clear idea or specification of the attribute
  • Use existing theory as a guide
  • Write a document containing specifications for the development of items that includes:
    • A clear definition of the attribute
    • The outcome of a literature review
  • If more than one attribute is to be measured, a separate specification is needed for each
5
Q

In defining a test, it is costly and time consuming to develop a new test/questionnaire. Where would you go to find existing mental tests?

A

Mental Measurements Yearbook:

  • Commercial product released every 5 years
  • Contains info about a test's purpose, publisher, pricing, population and scoring
  • Includes only commercially available tests and those in English
6
Q

What is the Kaufman and Kaufman model of the test definition process?

A
  1. Measure attribute/construct from a strong theoretical and research basis
  2. Must have capacity to distinguish between different attributes
  3. Yield scores that are translatable to an intervention
  4. Include novel tasks or questions
  5. Be easy to administer and objective to score
  6. Be sensitive to the diverse needs of the groups being assessed
7
Q

What are the types of data?

A

Categorical

  • Gender
  • Age band/group
  • Political party

Numerical

  • Discrete
    • No. of children
    • Assignment mark
    • Coffees in one day
  • Continuous
    • Weight
    • Voltage
    • Length
8
Q

What is nominal measurement?

A
  • Categorical: assigns people to groups
  • A number may be assigned based on the group the person belongs to, but the numerical value is meaningless

9
Q

What is ordinal measurement?

A
  • Still categorical, but in ranking order
  • Tells the order but not the distance between each point

10
Q

What is interval measurement?

A
  • Where continuous data is obtained
    • eg. temperature
  • Equal distance between points
  • No true zero
  • People can provide responses according to an ordered response option scale
  • Also referred to as a Likert-type scale
11
Q

What is a ratio measurement?

A
  • Continuous
  • Starting point of zero
  • Differences between points are meaningful
  • Ratio scales are rare in psychological measurement
12
Q

What is included when constructing the items of a test?

A
  • Item format
    • Related to the scaling method of choice
    • Dozens of choices available
  • Types of formats
    • MCQ
    • True/false
    • Forced-choice
    • Likert
13
Q

What are some limitations with MCQs?

A
  • Difficult to construct items
  • Provides cues for the correct response (does not assess free recall)

14
Q

What are some limitations of true/false questions?

A
  • Answers may reflect social desirability more than personality traits
  • Not much variability
15
Q

What are the strengths for forced-choice methodology?

A
  • Often used in personality tests
  • Overcomes the social desirability problem of true/false questions

16
Q

What are the problems with forced-choice methodology?

A

People don’t always fit in either category

17
Q

What are the strengths of Likert-type scales?

A
  • One of the most widely used formats
  • Can better account for individual differences
  • Good for assessing attitudes and perceptions
  • Reduces desirability bias
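
The scoring side of a Likert-type scale can be sketched briefly. This is a minimal illustration with invented item names and responses; negatively worded items are reverse-scored before summing so that higher totals consistently mean "more of" the attribute:

```python
# Minimal sketch of scoring a Likert-type scale (item names and data
# are hypothetical). Negatively worded items are reverse-scored before
# summing so higher totals consistently indicate more of the attribute.

def score_likert(responses, reverse_items, scale_max=5):
    """Sum item responses, reverse-scoring the flagged items."""
    total = 0
    for item, value in responses.items():
        if item in reverse_items:
            value = (scale_max + 1) - value  # e.g. 5 -> 1, 2 -> 4
        total += value
    return total

# One respondent's answers on a hypothetical 4-item, 5-point scale:
answers = {"q1": 4, "q2": 5, "q3": 2, "q4": 1}
print(score_likert(answers, reverse_items={"q3", "q4"}))  # 4 + 5 + 4 + 5 = 18
```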
18
Q

What are some problems with Likert-type scales?

A
  • Is it consistently measuring the construct in question?
  • Are all of the items appropriate and contributing to the overall interpretation of the test?
  • Assumes the strength/intensity of an attitude is linear
  • People don’t always fit into a specified option
  • Social desirability can still occur
19
Q

How may you test the items in a questionnaire to make sure they are reliable?

A
  1. Conduct a pilot study to ensure the items are clear and easily understood
  2. Administer questionnaire to large participant sample
  3. Analyse the data in statistical software
  4. Investigate psychometric properties for individual items
    * item characteristics
    * Create statistically sound sub scales
    * Throw out non-performing items
  5. Determine reliability and validity for the sub scales/overall test
20
Q

How is a test revised?

A
  • Using the newly developed test/questionnaire, collect data in a new sample
  • Repeat previous steps
  • Make necessary refinements
  • Cross-validate
    • Does the test perform just as well in the new sample?
  • Obtain feedback from examinees or participants
21
Q

What is involved with publishing a test?

A
  • Produce testing materials
  • Develop a technical and user’s manual that includes:
    Background info
    Development history
    Administration instructions
    Reliability
    Validity
    Normative info
  • Publish a scientific paper
22
Q

What are the 3 main concepts of testing?

A

Standardisation and norms
Validity
Reliability

23
Q

What is standardisation and norming?

A

The process of administering a test to a representative sample for the purpose of establishing norms is referred to as standardising a test.

24
Q

What are standardisation groups?

A
  • Once we have an individual’s score, we want to know where that score fits in comparison with the individual’s peers
  • Large groups of people are tested and their scores are used to work out test norms
  • We can use the mean and SD of these to work out where an individual sits in comparison to others
  • Depending on the purpose of the test, the standardisation group might be quite specific or general
  • Norms might also change over time (eg. Flynn effect)
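
The mean/SD comparison described above amounts to a z-score. A minimal sketch, using an invented norm sample:

```python
# Sketch: locating one person against a standardisation group via the
# group mean and SD (a z-score). The norm sample here is hypothetical.
from statistics import mean, stdev

norm_sample = [85, 90, 95, 100, 100, 105, 110, 115, 100, 100]
m, s = mean(norm_sample), stdev(norm_sample)

individual = 115
z = (individual - m) / s  # SDs above (positive) or below (negative) the mean
print(f"z = {z:.2f}")
```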
25
Q

What are percentiles?

A

When we have a group of scores, we can also work out where a score fits in a distribution

  • Can be done for specific sub-categories as well
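
One simple way to compute a percentile rank, sketched with invented norm scores (the percentage of scores in the distribution falling below a given score):

```python
# Sketch: percentile rank of a score within a distribution of
# hypothetical norm scores.

def percentile_rank(score, distribution):
    below = sum(1 for s in distribution if s < score)
    return 100 * below / len(distribution)

norms = [55, 60, 62, 65, 68, 70, 72, 75, 80, 90]
print(percentile_rank(72, norms))  # 60.0: higher than 60% of the group
```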

26
Q

What is test validity?

A
  • Reflects a test’s ability to assess the construct it was designed to measure
27
Q

What are the types of test validity?

A
  • Content validity
  • Construct validity
  • Criterion-related validity
28
Q

What is content validity?

A

Determined by the degree to which items on the test are representations of the domain of behaviour the test purports to measure

29
Q

Describe construct validity

A

The appropriateness of the inference about the underlying construct

30
Q

What is a construct?

A

A theoretical, intangible quality or trait in which individuals differ

31
Q

What are the psychometric approaches to understanding the construct validity of tests?

A
  • Identifies groups of items that intercorrelate highly
  • Correlation
  • Factor analysis
32
Q

What is factor analysis in relation to construct validity?

A

Statistical technique to determine the pattern of correlations or variability amongst the items; correlated items, or items that share variance, form factors or dimensions

  • Factors represent underlying abilities
  • Factors in a test can be correlated with factors in other alternative tests
33
Q

What is correlation?

A

Statistical measure to indicate the extent to which 2 variables are related

  • Measured on a scale from -1 to 1
  • 0 = no correlation, 1 = perfect positive, -1 = perfect negative
34
Q

What is criterion validity?

A

The extent to which the test predicts or is related to an outcome
eg. does performance on an IQ test predict academic success?

35
Q

What is reliability in relation to tests?

A

Concerns measurement consistency or the ability of a test to produce consistent results

  • Is it consistently measuring the construct in question?
  • Are all of the items appropriate and contributing to the overall interpretation of the test?
36
Q

What are the two types of reliability?

A

Internal

External

37
Q

What is internal reliability?

A

Concerns the extent to which a measure is consistent with itself
- Also referred to as internal consistency
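
Internal consistency is commonly estimated with Cronbach's alpha. A minimal sketch with invented respondent-by-item data:

```python
# Sketch: Cronbach's alpha, a widely used index of internal
# consistency (respondent-by-item data invented for illustration).
from statistics import variance

def cronbach_alpha(rows):
    k = len(rows[0])             # number of items
    items = list(zip(*rows))     # one tuple of scores per item
    item_vars = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_vars / total_var)

data = [
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 4],
    [2, 3, 2],
    [4, 4, 5],
]
print(round(cronbach_alpha(data), 2))  # near 1: items hang together
```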

38
Q

What is external reliability?

A

Concerns the extent to which a measure varies from one use to another

Test-retest reliability: stability over time
Inter-rater reliability: the degree to which different raters give consistent estimates of the same behaviour

39
Q

What are some sources of error in tests?

A

Test construction
Test administration
Test scoring and interpretation