Week 7: Tests Flashcards
What is psychometrics?
The branch of psychology concerned with testing and measurement
What is classical test theory?
Observed score = true score + error
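A minimal sketch of the classical test theory equation, assuming the error term is random, normally distributed and centred on zero. The function name, true score of 50 and error SD of 5 are hypothetical values chosen for illustration only.

```python
import random
import statistics


def simulate_observed_scores(true_score, error_sd, n_administrations, seed=0):
    """Return observed scores, each equal to the true score plus random error."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    return [true_score + rng.gauss(0, error_sd) for _ in range(n_administrations)]


# Hypothetical person with a true score of 50, tested many times.
scores = simulate_observed_scores(true_score=50.0, error_sd=5.0, n_administrations=10_000)
mean_observed = statistics.mean(scores)
print(round(mean_observed, 2))
```

Under these assumptions the random errors tend to cancel out over many administrations, so the mean observed score sits close to the true score, while any single observed score may be well off.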
What is the goal of testing?
To produce useful tests and reduce the capacity for error as much as possible
List some examples of test construction that could influence error
- poorly worded questionnaires
- extreme statements
- confusing words and expressions
- culturally confusing terms
What are some factors that could influence errors in test administration?
- deviation from instructions
- testing environment
- test administrator
- participant-derived error
Examples of participant-derived error
- hostile or uncooperative
- donkey vote
- misunderstanding of instructions
- temporary illness or condition that might impact test scores
- contrast your motivations in collecting data with participant motivations in taking part in research
What are the types of validity?
- face
- construct
- content
- criterion
Face validity
Does the test meet participants' expectations and appear to measure what it is supposed to?
Content validity
Does the test assess the whole content area of what it’s meant to be assessing?
List some useful non-specific bases to cover in generating content
- emotions
- thoughts
- behaviours
- interpersonal relationships
What are four ways of categorising constructs?
- component analysis
- factor analysis
- cluster analysis
- taxometric analysis
Describe component analysis
Taking a large set of variables and reducing them to a single composite (or a small number of components)
Describe factor analysis
Finding the underlying constructs of items
Describe cluster analysis
Used for grouping participants based on current data or predicted outcomes
What is the aim of cluster analysis?
To create groups that are as homogeneous as possible
Describe taxometric analysis
- attempts to answer continuum vs category questions
What is null hypothesis significance testing?
A process of logic that examines how probable the observed data (or data more extreme) would be if they were drawn from the sampling distribution expected when no effect is occurring (the null hypothesis)
When do we reject the null?
When the probability is small (e.g.
What are some shortcomings of significance testing?
- larger sample sizes tend to produce lower p values
- a lower p value is tied to a stronger effect, so significance testing does not necessarily detect small but consistent effects
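The link between sample size and p value can be sketched as follows. This is only an illustration, assuming a two-sided one-sample z-test with a known standard deviation and an observed standardised effect that stays fixed at d = 0.1 while n changes. The function name is hypothetical.

```python
import math


def two_sided_p_from_effect(effect_size_d, n):
    """Two-sided p value for a one-sample z-test, given a standardised effect and n."""
    z = effect_size_d * math.sqrt(n)  # the test statistic grows with the square root of n
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))


# Same small effect, increasing sample size.
for n in (20, 200, 2000):
    print(n, two_sided_p_from_effect(0.1, n))
```

With the effect held constant, the p value shrinks as n grows, which is why a significant result alone says little about how large or important an effect is.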
Discuss how to cope with assumption violations
- run the original tests and qualify this in the discussion
- modify the distribution using a transformation
- identify tests that have different assumptions that might be more appropriate (e.g. non-parametric)
What is the ‘optimal’ solution for assumption violations?
- run the statistics both in raw form and in a form that compensates for the violated assumptions
- only report the modified data if the pattern of results substantially differs
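As a rough sketch of checking data in raw and transformed form, the example below compares skewness before and after a log transformation. The data values are made up for illustration, the log transformation is just one possible choice, and the skewness helper is a simple population (moment-based) formula rather than any particular package's estimator.

```python
import math


def skewness(values):
    """Population skewness: third central moment divided by variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical positively skewed scores (all values must be > 0 for a log transform).
raw_scores = [1, 2, 2, 3, 3, 4, 5, 8, 13, 40]
log_scores = [math.log(v) for v in raw_scores]

skew_raw = skewness(raw_scores)
skew_log = skewness(log_scores)
print(round(skew_raw, 2), round(skew_log, 2))
```

For this made-up sample the transformed scores are noticeably less skewed than the raw ones. In practice the same analysis would be run on both versions, and the transformed results reported only if the pattern of results changes substantially.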
Why is significance controversial?
- significant results are more likely to be reported and published, which biases the published literature
How do computerised tasks typically operate?
Examining number of errors (accuracy) or reaction time (speed)
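A small sketch of how accuracy and speed might be summarised from trial data. The trial records, field names and the choice to average reaction time over correct trials only are assumptions made for illustration, not a description of any specific task software.

```python
from statistics import mean


def summarise_trials(trials):
    """Return (proportion correct, mean reaction time in ms over correct trials)."""
    accuracy = sum(1 for t in trials if t["correct"]) / len(trials)
    correct_rts = [t["rt_ms"] for t in trials if t["correct"]]
    # Assumption: reaction time is averaged over correct trials only.
    mean_rt = mean(correct_rts) if correct_rts else None
    return accuracy, mean_rt


# Hypothetical data from four trials of a computerised task.
trials = [
    {"correct": True, "rt_ms": 500},
    {"correct": True, "rt_ms": 600},
    {"correct": False, "rt_ms": 900},
    {"correct": True, "rt_ms": 700},
]

accuracy, mean_rt = summarise_trials(trials)
print(accuracy, mean_rt)
```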
List some methodological decisions for testing
- number of items
- evenness of content distribution
- format of response scale
- midpoint option in response scale
- use of reverse coding
- instructions framing questionnaire
- time period respondents are asked to reflect on
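Reverse coding from the list above can be sketched like this, assuming a 1 to 5 response scale. The function names, item labels and responses are hypothetical.

```python
def reverse_code(response, scale_min=1, scale_max=5):
    """Flip a response so that high becomes low, e.g. 1 -> 5 and 4 -> 2 on a 1-5 scale."""
    if not scale_min <= response <= scale_max:
        raise ValueError("response is outside the response scale")
    return scale_min + scale_max - response


def total_score(responses, reverse_items, scale_min=1, scale_max=5):
    """Sum item responses after reverse coding the items named in reverse_items."""
    return sum(
        reverse_code(value, scale_min, scale_max) if item in reverse_items else value
        for item, value in responses.items()
    )


# Hypothetical three-item questionnaire where item "q2" is negatively worded.
responses = {"q1": 4, "q2": 2, "q3": 5}
total = total_score(responses, reverse_items={"q2"})
print(total)
```

Note that the midpoint of an odd-numbered scale is unchanged by reverse coding (3 stays 3 on a 1 to 5 scale).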