Week 9 Flashcards

1
Q

What are psychometrics?

A

Psychometrics is the branch of psychology that deals with the design, administration, and interpretation of quantitative tests for the measurement of psychological variables such as intelligence, aptitude, and personality traits.

  • normally starts with a theoretical construct (eg a theory of personality)
  • a test can then be designed that should measure the construct of interest. Then we can use statistics to check that the test does measure the construct
2
Q

Steps in test/questionnaire construction

A
  1. Defining the test
  2. Selecting a scaling method
  3. Constructing the items
  4. Testing the items
  5. Revising the test
  6. Publishing the test
3
Q

Describe the first step of developing a test: 1. Defining the test

A

What is a test/questionnaire?

  • a set of items that allows measurement of some attribute of an individual:
  • problems for which correct answers must be found
  • questions about behaviours, feelings or thoughts.
  • questions about attitudes, preferences

What is an item?

  • generic word for the various forms of content in a psychological test or questionnaire (eg a question; a 50-question test is a 50-item test):
  • permits measurement of an attribute
  • carefully selected (not random)

What is it you are seeking to measure?

  • develop clear idea or specification of the attribute
  • use existing theory as a guide where possible
  • write a document containing specifications for the development of items. It will include: a clear definition of the attribute; the outcome of a literature search covering central theoretical claims and findings. If more than one attribute is to be measured, a separate specification is needed for each.

Has there already been a test developed that measures this attribute?

Kaufman and Kaufman model of the test definition process

  1. Measure attribute/construct from a strong theoretical and research basis
  2. Must have capacity to distinguish between different attributes
  3. Yield scores that are translatable to an intervention
  4. Include novel tasks or questions
  5. Be easy to administer and objective to score
  6. Be sensitive to the diverse needs of the groups being assessed
4
Q

Step two of creating a test/questionnaire: 2. Selecting a scaling method

A

Measurement - the assignment of numbers to objects according to a set of rules

Types of data
Four-fold classification system (Stevens, 1951)

  1. Nominal - purely categorical: just the group a person is put in. A number can be assigned based on the group the person belongs to, but the actual numerical value assigned to each group is meaningless (eg dog person = group 1)
  2. Ordinal - still categorical, but with a rank order, eg 1st place, 2nd place in a race. Tells us the order, but not the distance between each point
  3. Interval - this is where you can obtain continuous data, eg temperature (in Celsius). Equal distances between points, but no true zero point (eg 20 degrees isn't twice as hot as 10 degrees). People can also provide responses on an ordered response option scale, often referred to as a Likert-type scale (eg 1 = strongly agree, 2 = agree, 3 = disagree, 4 = strongly disagree); the response options are assumed to be equal distances apart
  4. Ratio - continuous measurement with a true zero point, so differences between points on the scale are meaningful (eg weight, length). Ratio scales are rare in psychological measurement - is there a meaningful zero in psychology? Not really
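The interval-vs-ratio distinction above can be sketched in a few lines of Python (the temperature values are just illustrative):

```python
# Why interval scales (Celsius) have no meaningful ratios,
# while ratio scales (Kelvin) do.

def celsius_to_kelvin(c: float) -> float:
    """Convert Celsius (interval scale) to Kelvin (ratio scale)."""
    return c + 273.15

# On the Celsius scale, 20 looks like "twice" 10 ...
naive_ratio = 20 / 10  # 2.0

# ... but measured from a true zero point (Kelvin), it is not.
true_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print(round(naive_ratio, 3))  # 2.0
print(round(true_ratio, 3))   # 1.035
```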
5
Q

Step three of creating a test: 3. Constructing the items

A

Initial questions in test and questionnaire construction:
- homogeneity vs. heterogeneity:
Does it allow for attributes to be reliably measured (homogeneity)?
Does it allow for adequate differentiation of people (heterogeneity)?

Expect to construct excess items in the beginning, because not all items will be deemed suitable

Table of specifications:
Details the information and tasks the person is being assessed on, eg a content-by-process matrix

6
Q

Multiple choice format

A

Common for student exams/tests to assess knowledge or ability

Sometimes used for psychological tests (eg the Reading the Mind in the Eyes test)

Usually one correct answer

Permits quick scoring

Problems:
Difficult to construct items (eg need good distractor items)
Provides cues for the correct response (doesn't assess free recall)

7
Q

True/false questions

A

Often used in personality tests
Easy to understand and answer

Problems:
answers may reflect social desirability
Doesn’t permit much variability

8
Q

Forced choice methodology

A

Often used in personality tests
Overcomes the social desirability problem of true/false questions

Problems: people don't always fit into either category (a problem shared with true/false questions)

9
Q

Likert-type scales

A
  • one of the most widely used response formats
  • can better account for individual differences (many shades of answers)
  • good for assessing attitudes and perceptions
  • reduces social desirability bias

Problems:

  • is it consistently measuring the construct in question?
  • are all of the items appropriate and contributing to the overall interpretation of the test?
  • assumes strength/intensity of an attitude is linear
  • people don’t always fit into specified options
  • social desirability can still occur
10
Q

Tips for writing good test/questionnaire items

A
  • questions should be simple and to the point
  • use words with a clear meaning (eg 'most' vs. 'the majority of')
  • avoid double-barrelled questions (eg 'Is your lecturer easy to listen to and does he have a good sense of humour?')
  • offer an 'out' for questions that don't apply (eg a middle response option in a Likert scale)
  • avoid offering too few or too many options
  • don't ask questions that are too hard to remember (more common events = shorter window of recall)
11
Q

Step four of a test: 4. Testing the items

A
  1. Conduct a pilot study (small exploratory study) to first ensure the items are clear and can easily be answered (refine items if needed)
  2. Administer the questionnaire to a large participant sample
  3. Do some number crunching in special statistical software
  4. Investigate psychometric properties for the individual items
    - item characteristics (eg difficulty levels)
    - create statistically sound subscales (eg factor analysis)
    - throw out non-performing items
  5. Determine reliability and validity for the subscales / overall test
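A rough sketch of step 4 in Python, using made-up response data: item difficulty (the proportion answering correctly) and item-total correlation, two of the item characteristics mentioned above.

```python
# Hypothetical item analysis: difficulty and item-total correlation.

def pearson(x, y):
    """Pearson correlation computed from first principles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Rows = respondents, columns = items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
totals = [sum(row) for row in responses]

difficulties = []
item_total_r = []
for i in range(len(responses[0])):
    item = [row[i] for row in responses]
    difficulties.append(sum(item) / len(item))   # proportion correct
    item_total_r.append(pearson(item, totals))   # does it track the total?

print(difficulties)  # [0.6, 0.6, 0.2, 0.6]
print([round(r, 2) for r in item_total_r])
```

Items with very low item-total correlations would be candidates for "throwing out" in a revision round.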
12
Q

Step five of developing a test: 5. Revising the test

A
  1. Using the newly developed test/questionnaire, collect data from a new sample
  2. Repeat the previous steps (from testing the items)
  3. Make necessary refinements
  4. Cross-validate (does the test perform just as well in the new sample?)
  5. Obtain feedback

Testing the items and revising the test may go around and around and around

13
Q

Step six of developing a test: 6. Publishing the test

A

Produce testing materials

Develop a technical and user manual

  • background information
  • development history
  • administration instructions
  • reliability
  • validity
  • normative information

Publish a scientific paper

14
Q

Standardisation group

A

Once we have an individual's score, we want to know where that score sits in comparison with the individual's peers

Large groups of people are tested and their scores are used to work out test norms
- we can use the mean and standard deviation of these norms to work out where an individual sits in comparison to others

Depending on the test, norms may also be worked out for specific subgroups (eg age groups)
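As a quick sketch of using norms (the mean of 100 and SD of 15 are just assumed, IQ-style values):

```python
# Place an individual's score relative to a hypothetical
# standardisation group using the norm mean and standard deviation.

norm_mean, norm_sd = 100, 15   # assumed IQ-style norms
score = 115

z = (score - norm_mean) / norm_sd  # standard (z) score
print(z)  # 1.0 -> one standard deviation above the mean
```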

15
Q

Score to percentile conversions

A

Percentiles: when we have a group of scores, we can work out where a score sits in the distribution

This can be done for specific subcategories as well
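A minimal score-to-percentile sketch, assuming the scores are roughly normally distributed with hypothetical norms (mean 100, SD 15):

```python
# Convert a raw score to a percentile under assumed normal norms.
from statistics import NormalDist

norms = NormalDist(mu=100, sigma=15)   # hypothetical norm values
percentile = norms.cdf(115) * 100      # % of peers scoring at or below 115
print(round(percentile, 1))  # ~84.1
```

So a score one SD above the mean sits at roughly the 84th percentile.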

16
Q

Validity of tests

A

Validity reflects a test's ability to assess the construct it was designed to measure

  • is the test measuring what you think it is measuring?
  • can you draw meaningful conclusions from it? (eg IQ test scores predict ability to succeed in school)
17
Q

Content validity

A

Content validity is determined by the degree to which items on the test are representative of the domain of behaviour the test purports to measure

The exam for this part of the course would have low content validity if it were made up entirely of questions about the WAIS

18
Q

Construct validity

A

Construct validity is the appropriateness of the inferences about the underlying construct

Construct: a theoretical, intangible quality or trait in which individuals differ.

Is the construct you are examining in your test (eg extroversion) related to other tests that measure similar constructs (ie that also measure outgoingness)?

19
Q

Psychometric approaches to understanding an instrument's construct validity.

A

Factor analysis identifies groups of items that intercorrelate highly (eg vocab + verbal reasoning)

Correlation: a statistical measure indicating the extent to which two variables are related
- measured on a scale from -1 to 1 (0 = no correlation, 1 = strong positive, -1 = strong negative)

Factor analysis: statistical technique to determine the pattern of correlations or variability amongst the items; correlated items or items that share variance form factors or dimensions.

  • factors represent underlying abilities
  • factors in a test can be correlated with factors in other alternative tests
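Correlating a new measure with an established one (as in the extroversion example above) can be sketched like this, with entirely made-up scores:

```python
# Hypothetical convergent check: do two scales measuring similar
# constructs rank people similarly?
from statistics import mean

test_a = [12, 18, 9, 22, 15, 20]   # made-up new-scale scores
test_b = [14, 19, 8, 25, 13, 22]   # made-up established-scale scores

ma, mb = mean(test_a), mean(test_b)
cov = sum((a - ma) * (b - mb) for a, b in zip(test_a, test_b))
sa = sum((a - ma) ** 2 for a in test_a) ** 0.5
sb = sum((b - mb) ** 2 for b in test_b) ** 0.5
r = cov / (sa * sb)  # Pearson correlation
print(round(r, 2))   # 0.97 -> close to 1, strong positive relation
```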
20
Q

Criterion validity

A

Criterion validity is the extent to which the test predicts or is related to an outcome

Does performance on IQ tests predict academic success?

21
Q

Internal reliability

A

Concerns the extent to which a measure is consistent within itself
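One common index of this within-test consistency is Cronbach's alpha; a minimal sketch with hypothetical 5-point Likert responses:

```python
# Cronbach's alpha on made-up data: compares the variance of item
# scores with the variance of total scores.
from statistics import pvariance

# Rows = respondents, columns = items on the same subscale.
data = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [3, 2, 3],
    [4, 4, 5],
]
k = len(data[0])                                   # number of items
item_vars = [pvariance([row[i] for row in data]) for i in range(k)]
total_var = pvariance([sum(row) for row in data])  # variance of totals

alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))  # 0.86 -> fairly consistent set of items
```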

22
Q

External reliability

A

Concerns the extent to which a measure varies from one use to another use

Test-retest reliability: stability over time
-if I do an intelligence test today and again in 6 months' time, will I get the same score?

Inter-rater reliability: the degree to which different raters or scorers of the same test agree