Week 1 - Psychological Tests Flashcards
What year did David Wechsler publish an individual adult test of intelligence?
1939
What is the one thing that all psychological tests are considered to have in common?
They are tools that psychologists use to collect data about people. More specifically they are objective procedures for sampling and quantifying human behaviour in order to make inferences using standardised stimuli
Define criterion-referenced test
A psychological test that uses a predetermined empirical standard as an objective reference point for evaluating the performance of a test taker
Define norm-referenced test
A psychological test that uses the performance of a representative group on the test for evaluating the performance of the test taker
Define psychometric properties
The criteria that a test has to fulfil in order to be useful; they include how accurate and reproducible the test scores are and how well the test measures what it intends to measure
What are some limitations of psychological tests?
- They are only tools - they cannot make decisions for test users
- They are often used in an attempt to capture the effects of a hypothetical construct
- Tests can become obsolete due to the continual development and refinement of theories, technology and the passage of time
- Can sometimes disadvantage a subgroup or culture due to their experience or language background
A psychological test can be thought of as
A sample of items relevant to the construct of interest
In terms of decision theory, the base rate involves
The sum of false negatives and valid positives
Test-retest reliability might be found to be low
Because the construct being measured varies from time to time
Donald McElwain and George Kearney were responsible for developing the
Queensland test
A psychological test can become obsolete when
- Society changes to render the test's norms obsolete
- Psychological theory develops to render the basis of the test obsolete
- Society changes to render the content of items less appropriate
Content validity is usually restricted to
Achievement tests
Tests used as a sample of behaviour require
The direct performance of the behaviour of interest
Test reliability can be calculated if
- Equivalent forms of the test are available
- The items of the test are intercorrelated
- The test is split into halves
The relationship between reliability and test length
Is non-linear with larger numbers of items being required at higher levels of reliability
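Worked example (a sketch, not part of the original card set): the non-linearity can be illustrated with the Spearman-Brown formula solved for the lengthening factor k. The function name and the 20-item test with reliability .60 are hypothetical, chosen only for illustration.

```python
def length_factor(current_r: float, target_r: float) -> float:
    """Spearman-Brown formula solved for k, the factor by which a test
    must be lengthened to move from current_r to target_r."""
    if not (0 < current_r < 1 and 0 < target_r < 1):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    return (target_r * (1 - current_r)) / (current_r * (1 - target_r))


# Hypothetical 20-item test with a reliability of .60
for target in (0.70, 0.80, 0.90, 0.95):
    k = length_factor(0.60, target)
    print(f"target {target:.2f}: lengthen by a factor of {k:.2f} "
          f"(about {20 * k:.0f} items)")
```

Under these assumptions each further step up in reliability costs proportionally more items than the step before it.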
What is the difference between psychological testing and psychological assessment?
When we talk of psychological testing we are referring to the process of administering a psychological test and obtaining and interpreting the test scores. Psychological assessment is broader and takes into account other forms of information as well as test results
Define self-report tests
A psychological test that requires test takers to report their behaviour or experience
Most common when the interest is in typical behaviour (e.g. personality and attitudes)
Define performance tests
A psychological test that requires test takers to respond by answering questions or solving problems
Are used to assess the limits of what a person can do (e.g. aptitudes or abilities)
Define psychometrics
The field concerned with psychological measurement and the theories that underpin it
What needs to be considered before administering a psychological test?
- Ensure the test is appropriate - age, education, ethnicity
- Ensure a suitable venue
- Check all test materials are present and intact
- Ensure adequate time is spent becoming familiar with the test
Define a culture-fair test
This is a test where there is no systematic distortion of scores resulting from differences in the cultural background of the test taker
There must be an equivalence across cultures in what is termed the test's construct validity and in its predictive or criterion validity
Define norms
Tables of the distribution of scores on a test for specified groups in a population that allow interpretation of any individual's score on a test by comparison to the scores of a relevant group
Define item score
The score for each item on a test
Define raw score total
The total score on the test found by summing item scores
Define criterion-referencing
A way of giving meaning to a test score by specifying the standard that needs to be reached in relation to a limited set of behaviours
Define norm referencing
A way of giving meaning to a test score by relating it to the performance of an appropriate reference group for the person
Define linear transformation
A transformation that preserves the order and the equivalence of distances of the original set of scores
Define z-score
A linear transformation of test scores that expresses the distance of each score from the mean of the distribution of scores in units of the standard deviation of the distribution
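Worked example (a minimal sketch; the raw scores are made up): each z-score is the raw score minus the mean, divided by the standard deviation. The population SD of the sample is used here as an assumption; in practice the mean and SD would normally come from the test's norm tables.

```python
from statistics import mean, pstdev


def z_scores(raw_scores):
    """Express each raw score as its distance from the mean in SD units."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # assumption: population SD of this sample
    if sd == 0:
        raise ValueError("all scores are identical, so z is undefined")
    return [(x - m) / sd for x in raw_scores]


raw = [12, 15, 18, 21, 24]  # hypothetical raw score totals
for x, z in zip(raw, z_scores(raw)):
    print(f"raw {x:>2} -> z {z:+.2f}")
```

Because the transformation is linear, the order of the scores and the relative distances between them are unchanged.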
Define percentile
An expression of the position of a score in a distribution of scores by dividing the distribution into 100 equal parts
What is the deviation IQ?
A term that Wechsler used to capture the essential link between his metric for intelligence and the z score
What are the three main ways to determine percentiles?
- Graphic interpolation
- Arithmetic calculation
- Reading from the tables of the normal curve
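Worked example (a sketch under stated assumptions): two of these routes can be imitated in code. `percentile_rank` uses one common arithmetic convention (percentage of scores below, plus half of those tied), and `percentile_from_z` stands in for reading a table of the normal curve. Both function names and the example data are hypothetical.

```python
from statistics import NormalDist


def percentile_rank(score, norm_scores):
    """Arithmetic calculation: percentage of the norm group scoring below
    the given score, counting half of any ties (one common convention)."""
    below = sum(s < score for s in norm_scores)
    tied = sum(s == score for s in norm_scores)
    return 100 * (below + 0.5 * tied) / len(norm_scores)


def percentile_from_z(z):
    """Normal curve route: area under the standard normal curve below z,
    expressed as a percentage. Assumes scores are normally distributed."""
    return 100 * NormalDist().cdf(z)


print(percentile_rank(3, [1, 2, 3, 4, 5]))
print(round(percentile_from_z(-2.0), 1))
```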
What is the Flynn effect?
Refers to a steady increase in scores on IQ tests since about the 1930s - James Flynn
Define reliability
The consistency with which a test measures what it purports to measure in any given set of circumstances
Define domain-sampling model
A way of thinking about the composition of a psychological test that sees the test as a representative sample of the larger domain of possible items that could be included in the test
Define classical test theory
The set of ideas, expressed mathematically and statistically, that grew out of attempts in the first half of the 20th century to measure psychological variables, and that turns on the central idea of a score on a psychological test comprising both a true score and an error score component
Who devised the first of the modern intelligence tests?
Binet - he proposed a method of quantifying intelligence in terms of the concept of mental age
What is the standard error of measurement?
An expression of the precision of an individual test score as an estimate of the trait it purports to measure
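Worked example (a sketch; the card itself gives no formula): in classical test theory the standard error of measurement is usually computed as SD x sqrt(1 - reliability). The 95% band of plus or minus 1.96 SEM around the observed score is a common convention, assumed here, and the numbers are illustrative only.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1 - reliability)


def confidence_band(observed_score, sem, z=1.96):
    """Approximate band around an observed score (z = 1.96 for about 95%)."""
    return observed_score - z * sem, observed_score + z * sem


sem = standard_error_of_measurement(sd=15, reliability=0.91)
low, high = confidence_band(100, sem)
print(f"SEM = {sem:.2f}, band = {low:.1f} to {high:.1f}")
```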
Define reliability coefficient
An index (often a Pearson product moment correlation coefficient) of the ratio of true score variance to observed score variance in a test as used in a given set of circumstances
i.e. the proportion of observed score variance that is due to true score variance
Define split-half reliability
The estimate of reliability obtained by correlating scores on the two halves of a test formed in some systematic way (e.g. odd v. even items)
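What does the Spearman-Brown formula tell us?
It estimates what the reliability of a test would be if its length were changed, for example the reliability of the full test from the correlation between its two halves. In that sense it purports to tell us about an otherwise unknown state of affairs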
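Worked example (a minimal sketch): the Spearman-Brown formula predicts reliability when a test is lengthened by a factor k, r_new = k*r / (1 + (k - 1)*r). With k = 2 it steps a split-half correlation up to an estimate for the full-length test. The function name and the .60 value are illustrative.

```python
def spearman_brown(r, k=2.0):
    """Predicted reliability when a test with reliability r is
    lengthened by a factor k (k = 2 corrects a split-half correlation)."""
    return (k * r) / (1 + (k - 1) * r)


half_test_r = 0.60  # hypothetical correlation between odd and even halves
print(f"estimated full-test reliability: {spearman_brown(half_test_r):.2f}")
```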
Define Cronbach’s alpha
An estimate of reliability that is based on the average intercorrelation of the items in a test
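Worked example (a sketch, with made-up data): alpha is commonly computed from variances as k/(k - 1) x (1 - sum of item variances / variance of total scores), where k is the number of items. This variance form is assumed here rather than taken from the card; the function name and scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(score_rows):
    """score_rows: one list of item scores per test taker."""
    k = len(score_rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_columns = list(zip(*score_rows))  # regroup scores by item
    sum_item_vars = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in score_rows])
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)


# Hypothetical item scores: 5 test takers x 4 items
scores = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 2, 1],
    [3, 3, 4, 3],
]
print(f"alpha = {cronbach_alpha(scores):.2f}")
```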
Define test-retest reliability
A long-standing approach used by researchers seeking to evaluate reliability because its meaning is intuitively obvious.
It is the estimate of reliability obtained by correlating scores on the test obtained on two or more occasions. Traits considered stable should correlate highly
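Worked example (a minimal sketch with invented scores): a test-retest estimate for two occasions is the Pearson correlation between the two sets of scores. The helper below is written out by hand so that it does not depend on any particular library version.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation between two score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length (at least 2 scores)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for the same six people on two occasions
time_1 = [98, 105, 110, 92, 120, 101]
time_2 = [100, 103, 112, 95, 118, 99]
print(f"test-retest r = {pearson_r(time_1, time_2):.2f}")
```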
Define generalisability theory
A set of ideas and procedures that follow from the test constructor specifying the desired range of conditions over which the consistency or precision of a test is to hold.
It asks the user to specify what generalisation they are seeking to make and then asks whether there are data that support such a generalisation
Define inter-rater reliability
An estimate of reliability based on the degree of agreement among raters with respect to the quantification of the construct of interest
What rules of thumb did Nunnally (1967) give for assessing reliability?
.5 or better for test development
.7 or better for using a test in research
.9 or better for use in individual assessment
Define equivalent forms reliability
The estimate of reliability of a test obtained by comparing two forms of a test constructed to measure the same construct
How can reliability be improved?
Extending the sample, that is, lengthening the test
Define validity
The extent to which the test measures what it purports to measure
Define construct validity
Construct validity sees the test as an operation for giving a construct meaning and asks how well it does that
Define content validity
The meaning that can be attached to a score on a psychological test on the basis of inspection of the material that constitutes the test
How is predictive validity evaluated?
It is evaluated in terms of the extent to which scores on the test allow us to estimate scores on a criterion external to the test itself
Define concurrent validity
A form of predictive validity in which the index of social behaviour is obtained close in time to the score on the psychological test
i.e. prediction when the time interval is minimal
What is the standard error of estimate?
It can be thought of as the standard deviation of the distribution of the differences between actual and predicted scores
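Worked example (a sketch; the formula is the usual regression result rather than something stated on the card): the standard error of estimate is commonly computed as SD of the criterion x sqrt(1 - r squared), where r is the correlation between test and criterion. The numbers are illustrative.

```python
import math


def standard_error_of_estimate(criterion_sd, validity_r):
    """SEE = SD_criterion * sqrt(1 - r**2)."""
    if not -1 <= validity_r <= 1:
        raise ValueError("a correlation must lie between -1 and 1")
    return criterion_sd * math.sqrt(1 - validity_r ** 2)


# Hypothetical criterion SD of 10 and test-criterion correlation of .60
print(f"SEE = {standard_error_of_estimate(10, 0.60):.1f}")
```

With a correlation of zero the error of prediction equals the criterion SD, and it shrinks towards zero as the correlation approaches 1.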
Define valid positive decisions
Those decisions where the person is predicted to show the characteristic of interest and this is in fact the case
Define valid negatives
Are those in which the prediction is that the person does not show the characteristic of interest and this is the case
What are false positives
Those decisions in which the prediction is that the person has the characteristic but in fact does not
Define false negative
Are those decisions in which the prediction is that the person does not have the characteristic of interest but in fact does
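Worked example (a sketch with invented counts): the four decision outcomes can be laid out as a table, from which the base rate (valid positives plus false negatives, as a proportion of everyone) and the proportion of correct decisions follow. The class and property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DecisionTable:
    valid_positives: int   # predicted to have it, and do
    false_positives: int   # predicted to have it, but do not
    valid_negatives: int   # predicted not to have it, and do not
    false_negatives: int   # predicted not to have it, but do

    @property
    def total(self) -> int:
        return (self.valid_positives + self.false_positives
                + self.valid_negatives + self.false_negatives)

    @property
    def base_rate(self) -> float:
        """Proportion who actually have the characteristic."""
        return (self.valid_positives + self.false_negatives) / self.total

    @property
    def proportion_correct(self) -> float:
        """Proportion of decisions that were valid."""
        return (self.valid_positives + self.valid_negatives) / self.total


table = DecisionTable(valid_positives=30, false_positives=10,
                      valid_negatives=50, false_negatives=10)
print(table.base_rate, table.proportion_correct)
```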
Define convergent and discriminant validity
The subjection of a multitrait-multimethod matrix to a set of criteria that specify which correlations should be large and which small in terms of a psychological theory of the constructs
Define a factor
A linear combination of the elements of a data matrix
Campbell and Fiske argued that tests can be invalidated
By correlating too highly with tests of constructs they are not supposed to be measuring
Generalisability theory
Is a broader theory of reliability than domain sampling theory
Psychological tests are
Important tools for psychological research
Content validity is usually restricted to
Achievement tests
The Army Alpha and Beta tests were developed under the leadership of
Robert Yerkes
The standard score is the basis of which derived scores used in psychological testing?
The stand score, the standardised score and the t score
The standard error of estimate can be determined from knowledge of the
Correlation between test and criterion and the standard deviation of the criterion
Interpreting and integrating interview data into the psychological report inevitably involves what?
clinical judgment
In the 1960s, C. Rogers emphasised understanding the proper interpersonal ingredients necessary for an optimal therapeutic relationship. Which of the following is NOT one of these?
warmth
genuineness
positive regard
directiveness (the correct answer; the other three are the ingredients Rogers emphasised)
A written psychological report is preferred over a verbal report because
it provides an enduring record
How many composite scores can be derived from the core subtests of the WAIS-IV?
five
What are the areas usually covered in a mental status examination?
appearance, orientation, affect, thought content and process, and insight
What is a psychological test?
An objective procedure for sampling and quantifying human behaviour to make an inference about a particular psychological construct using standardised stimuli
Why do we need psychological tests?
Human judgment is subjective and fallible. Many factors can affect human judgment, such as stereotyping, personal bias, positive and negative halo effects, and errors of central tendency.
Psychologists consider psychological tests better than personal judgment in informing decision making in many situations because of the nature and defining characteristics of these tests
What is the difference between psychological testing and assessment?
Testing - the process of administering a psychological test and obtaining and interpreting the test scores
Assessment - a broad process of answering referral questions, which includes but is not limited to psychological testing
What is a construct?
A hypothetical entity with theoretical links to other hypothesised variables, that is postulated to bring about the consistent set of observable behaviours, thoughts or feelings that is the target of a psychological test
What is reliability?
Reliability is the consistency with which a test measures what it purports to measure in any given set of circumstances
What are some of the different types of reliability?
inter-rater
test-retest
split-half
Cronbach's alpha
parallel forms
What factors can affect the reliability of test results?
environmental and time factors
social desirability
individual factors
motivational factors
What is validity?
The validity of a test has been traditionally defined as the extent to which the test measures what it purports to measure
What are the different types of validity?
face validity
social validity
content validity
criterion validity
construct validity
predictive validity
concurrent validity
Z scores have a M of ___ and a SD of ___
0; 1
T scores have a M of ___ and a SD of ___
50; 10
Standard scores have a M of ___ and a SD of ___
100; 15
e.g. IQ score
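Worked example (a minimal sketch): because each of these scales is a linear transformation of z, a score on one can be converted to another from its mean and SD alone. The function names are hypothetical; the means and SDs are the ones on the cards above.

```python
def z_to_t(z):
    """T score scale: mean 50, SD 10."""
    return 50 + 10 * z


def z_to_standard_score(z):
    """Standard score scale (e.g. IQ): mean 100, SD 15."""
    return 100 + 15 * z


def standard_score_to_z(ss):
    """Inverse of z_to_standard_score."""
    return (ss - 100) / 15


for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"z {z:+.1f} -> T {z_to_t(z):.0f}, SS {z_to_standard_score(z):.0f}")
```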
If a person receives a SS of 100 on an intelligence test, what does this mean?
The client is in the middle of the average range in the intelligence test
If a person receives a percentile rank of 2 on an intelligence test what does this mean?
98% of people scored higher.
The person is in the bottom 2%
In the context of psychological testing, what does standardisation mean?
The process of administering a test to a representative sample of test takers for the purpose of establishing norms.
What are norms?
Tables of the distribution of scores on a test for specified groups in a population that allow interpretation of any individual’s score on the test by comparison to the scores for a relevant group
What are the different types of norms?
demographic background
socioeconomic status
culture
ethnicity
age
language
location
Ideally norm samples should be …………
matched to the client
relevant
optimal size (2000 excellent)
When test administration is not followed exactly as outlined in a test manual, what are the possible implications?
Comparison to the norms will not be accurate - the conclusions drawn may not be relevant
Error is an inherent part of psychological testing. True or False? Explain why.
TRUE
- systematic error
- non-systematic error
Psychological tests diagnose individuals. True or False? Explain why.
FALSE
- should not be used in isolation