Psych Testing Flashcards
What is validity?
-
Test Standards Definition: “Validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests”
- Test standards = current framework/operational guidelines
- Validation is the joint responsibility of the test developer and the test user
- Test developer should present a rationale for recommended use and interpretation, accompanied by evidence and theory.
What are some past definitions of validity?
-
1954 Criterion-based view: A test is valid for anything it correlates with. Validity was treated as a static property of the test.
- Problems: The same test may be used for different purposes in different groups, so a single static validity does not hold.
-
1966 Tripartite view: Criterion validity (concurrent and predictive), Content validity (relevant and representative of domain) and Construct validity (convergent/discriminant)
- Problems: Based on nomological network. If the test isn’t valid, is the theory or the test wrong? Overemphasis on different forms of validity (often indistinct) and correlations as proof.
- Updated in 1985 to include consequences of testing.
What is the current 5 source view of validity?
-
Unchanged from 1999: Unitary form of validity based on evidence from multiple sources to support an argument of what test scores mean.
- No different types of validity; validity is a property of the interpretation, not of the test
- Evidence from 5 sources:
- Content: Relevance and representativeness of content
- Response Processes: if the test is intended to measure a process, this should be demonstrable, i.e. not affected by manipulations
- Internal Structure: Factor analysis should match theory
- Relationship to other variables: Convergent/discriminant, test-criterion, and generalisation across situation (pop, conditions) and purpose (type of jobs)
- Consequences of testing: consider intended and unintended consequences of testing (e.g. NAPLAN: funding vs competition)
What are some of the major purposes of psychological testing?
- Classification: Selection (education and employment), Screening, Certification, Placement
- Diagnosis/Treatment planning: Clinical, Educational (giftedness/learning difficulties), neuropsychological deficits
- Coaching/training: Insight (self-knowledge), career-counselling, coaching
- Legal Application: Diminished responsibility, special dispensations, compensation claims
- Research
- Program Evaluation
What are some of the main tests used to measure intelligence and aptitude?
- Aptitude tests differ slightly from intelligence tests since they relate to trainable output
-
Individually administered tests: Primarily for children, diagnosis, emphasis on rapport
- Stanford-Binet: Verbal and non-verbal factors across 5 areas.
- Wechsler scales: 3 tests for different ages; subscales and tests vary.
- Woodcock-Johnson: achievement and intelligence test batteries.
-
Group administered tests:
- ASVAB: Armed Services Vocational Aptitude Battery
- GAMSAT: Graduate Australian Medical School Admissions Test
- Raven’s Progressive Matrices
What is Holland’s vocational interest model?
- Holland’s vocational interest model has 6 domains
- Realistic: practical, hands-on, tool-oriented
- Investigative: analytical, intellectual, scientific, explorative
- Artistic: creative, independent, chaotic
- Social: cooperative, supporting, helping, healing
- Enterprising: competitive, leadership, persuading
- Conventional: detail-oriented, organising, clerical.
- These domains are arranged in a hexagon, in order of correlations between them (lowest correlations opposite)
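A minimal sketch of the hexagon, assuming the six domains sit around it in the order listed above (R-I-A-S-E-C). The function names `hexagon_distance` and `opposite` are illustrative only, and the step count is just a rough stand-in for how weakly two domains are expected to correlate.

```python
# Domains in their order around the hexagon (same order as the list above).
RIASEC = ["Realistic", "Investigative", "Artistic",
          "Social", "Enterprising", "Conventional"]


def hexagon_distance(a: str, b: str) -> int:
    """Number of steps between two domains around the hexagon (0 to 3)."""
    i, j = RIASEC.index(a), RIASEC.index(b)
    steps = abs(i - j)
    # Going the other way around the hexagon may be shorter.
    return min(steps, len(RIASEC) - steps)


def opposite(domain: str) -> str:
    """Domain directly across the hexagon (expected lowest correlation)."""
    i = RIASEC.index(domain)
    return RIASEC[(i + 3) % len(RIASEC)]


# Adjacent domains are 1 step apart; opposite domains are 3 steps apart.
neighbours = hexagon_distance("Realistic", "Conventional")  # 1
across = opposite("Artistic")                               # "Conventional"
```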
What are some applications of psychological tests?
-
Neuropsychology: Checklists for frontal lobe dysfunction
- Luria-Nebraska Neuropsychological Battery (attention, language, memory, spatial, executive function), the Mini-Mental State Examination (MMSE)
-
Health Psychology: McGill Pain Questionnaire, Beck Depression Inventory
- Alcoholism: TWEAK (tolerance, worry, eye-opener, amnesia, cut-down)
- Forensic Assessment: malingering, assessment for insanity plea, child custody
How can psychometric tests be used for selection and training in the workplace?
-
Selection: Important to match selection criteria with job requirements. Steps:
- Job analysis: What tasks are required?
- Write job description: What qualities does the person need?
- Test candidate pool
- Select best candidate
-
Score Feedback for training: Focus on profile not raw scores, focus on developmental planning
- Compensatory strategies: reshape problem, externalise
- Developmental activities: 1. Deliberate practice, 2. Training, 3. Mentoring, 4. Goal setting, 5. Plan, monitor, evaluate.
What four factors affect reliability?
-
People taking the test: Reliability is based on variability between people: large SD = strong reliability
- Match person to test to avoid floor/ceiling effects
-
Test Characteristics: Bandwidth vs fidelity (more specific test = higher reliability)
- Don’t sacrifice content coverage for reliability
-
Item Characteristics: internal consistency affected by # of items and correlation between them.
- A reliable test either has many items with small rs, or few with strong rs.
- Method used to estimate reliability: test-retest vs internal consistency etc. Consider appropriateness of method (e.g. test-retest is a poor choice if the construct is expected to change over time).
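To illustrate the item-characteristics point, here is a small sketch using the standardised Cronbach's alpha formula, alpha = k * r / (1 + (k - 1) * r), where k is the number of items and r is the average inter-item correlation. It assumes equal item variances; the function name and the example numbers are made up for illustration.

```python
def standardized_alpha(n_items: int, mean_r: float) -> float:
    """Standardised Cronbach's alpha from item count and mean inter-item r."""
    if n_items < 2:
        raise ValueError("alpha needs at least 2 items")
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


# Two routes to similar internal consistency:
many_weak = standardized_alpha(20, 0.15)   # about 0.78 (many items, small rs)
few_strong = standardized_alpha(5, 0.40)   # about 0.77 (few items, strong rs)
```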
What is reliability?
-
Reliability = the ratio of true score variance to observed score variance (true score variance plus error variance). Aim for:
- .9 for high stakes, .7 for research, .6 if multiple measures
- Validity is dependent on reliability: the maximum correlation between 2 variables is limited by the measurement error in each (at most the square root of the product of their reliabilities).
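A short worked sketch of the ratio definition and of the ceiling that reliability places on validity. The ceiling uses the classical test theory result that an observed correlation cannot exceed sqrt(r_xx * r_yy); the function names and variance values are hypothetical.

```python
import math


def reliability(true_var: float, error_var: float) -> float:
    """True score variance divided by observed (true + error) variance."""
    return true_var / (true_var + error_var)


def max_observed_correlation(r_xx: float, r_yy: float) -> float:
    """Upper bound on the correlation between two imperfectly reliable measures."""
    return math.sqrt(r_xx * r_yy)


# A test with true variance 9 and error variance 1 has reliability 0.9.
r_test = reliability(9.0, 1.0)
# Paired with a criterion of reliability 0.7, the observed validity
# coefficient cannot exceed about 0.79.
ceiling = max_observed_correlation(r_test, 0.7)
```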
How does reliability relate to test length?
-
Reliability increases as the number of items increases:
- Spearman-Brown formula: predicted reliability is a function of the change in test length and existing reliability (assuming added items are as reliable as the existing ones).
-
Balancing test length:
- Too long a test causes boredom, exhaustion, loss of motivation
- Problems with short tests: previous item exposure, inadequate domain sampling.
- Solution: Adaptive testing and Computerised adaptive testing
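The Spearman-Brown prediction mentioned above can be sketched as follows, where `length_factor` is the new test length divided by the current length. The second function rearranges the same formula to ask how much longer a test must be to reach a target reliability. Function names and numbers are illustrative.

```python
def spearman_brown(r_current: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by length_factor."""
    return length_factor * r_current / (1 + (length_factor - 1) * r_current)


def length_factor_needed(r_current: float, r_target: float) -> float:
    """How many times longer the test must be to reach r_target."""
    return r_target * (1 - r_current) / (r_current * (1 - r_target))


# Doubling a test with reliability 0.70 predicts about 0.82.
doubled = spearman_brown(0.70, 2)
# Reaching 0.90 from 0.70 needs a test roughly 3.9 times as long,
# which is where boredom and fatigue become a practical concern.
factor = length_factor_needed(0.70, 0.90)
```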
What is adaptive testing? What are the advantages and disadvantages?
-
Testing is adapted to the person’s level of ability: previous responses determine the next questions.
- Used in major batteries like the Stanford-Binet
-
Computerised adaptive testing (CAT): Computer algorithm used to select further items according to a rule.
- Used in large scale testing where security is important (ASVAB, TOEFL)
-
CAT Advantages: Tests are shorter but just as reliable:
- economic advantage, fewer problems with motivation, easier to maintain test security
- CAT Disadvantages: Substantial preparation and outlay needed (very large item pool, analysis of item difficulty, algorithms), requires computers.
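A toy sketch of the "select further items according to a rule" idea. The rule assumed here is a simple staircase: give the unused item closest in difficulty to the current ability estimate, then move the estimate up after a correct answer and down after an error, halving the step each time. This is not how the ASVAB or TOEFL select items (operational CATs typically rely on item response theory); the function name, item pool and simulated respondent are all hypothetical.

```python
def run_adaptive_test(item_difficulties, answer_fn, n_items=5, start=0.0, step=1.0):
    """Administer up to n_items adaptively.

    item_difficulties: dict mapping item id -> difficulty (arbitrary scale).
    answer_fn: callable(item_id) -> True if the item was answered correctly.
    Returns (final ability estimate, list of administered item ids).
    """
    remaining = dict(item_difficulties)
    estimate = start
    administered = []
    for _ in range(min(n_items, len(remaining))):
        # Rule: pick the unused item whose difficulty is closest to the estimate.
        item = min(remaining, key=lambda i: abs(remaining[i] - estimate))
        del remaining[item]
        administered.append(item)
        # Move up after a correct answer, down after an error, then halve the step.
        estimate += step if answer_fn(item) else -step
        step /= 2
    return estimate, administered


# Hypothetical pool: nine items with difficulties from -2.0 to 2.0.
pool = {f"q{k}": -2.0 + 0.5 * k for k in range(9)}
# Simulated respondent who passes every item at or below difficulty 1.2.
estimate, items_used = run_adaptive_test(pool, lambda i: pool[i] <= 1.2, n_items=3)
```

Only three of the nine items are given, and they cluster around the respondent's level, which is the sense in which an adaptive test can be shorter without wasting items that are far too easy or too hard.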
What are anchoring vignettes?
-
Problems with self-rating scales: response styles vary significantly between individuals and between cultures
- extreme vs conservative responders, tendency to ‘agree’ with statements
-
Anchoring vignettes: respondents rate vignettes describing hypothetical people.
- The respondent’s average vignette rating is then subtracted from their self-rating
-
Examples: significant cross-cultural discrepancies have been resolved using anchoring vignettes, such as:
- relationship between teacher helpfulness and achievement
- relationship between conscientiousness and life expectancy
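The subtraction step described above can be sketched as below. The function name, the 1 to 5 scale and the ratings are invented for illustration.

```python
from statistics import mean


def vignette_adjusted_score(self_rating, vignette_ratings):
    """Self-rating minus the respondent's own average rating of the vignettes."""
    return self_rating - mean(vignette_ratings)


# Two respondents both rate themselves 4 on a 1-5 scale.
# The lenient rater also scores the vignette characters highly,
# so the adjustment removes most of their apparent advantage.
lenient = vignette_adjusted_score(4, [4, 5, 3])   # 4 - 4 = 0
strict = vignette_adjusted_score(4, [2, 3, 1])    # 4 - 2 = 2
```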
What are situational judgement tests?
-
SJTs give situations and require the respondent to choose the best response option
- can be typical or maximal performance (would/should)
- Seen as more engaging (higher face-validity)
- Show lower adverse impacts than IQ tests
-
Development of SJTs: use subject matter experts
- Collect critical situations from SMEs, and summarise these into items
- Collect responses from everyday respondents and SMEs
- Score answers based on SME opinions
- Test items and select the most reliable ones
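One possible way to "score answers based on SME opinions" is consensus scoring: each response option earns the proportion of SMEs who chose it. This is an assumed scheme for illustration, not necessarily the one used in any particular SJT; the function names and data are hypothetical.

```python
from collections import Counter


def build_key(sme_choices):
    """Map each response option to the proportion of SMEs who chose it."""
    counts = Counter(sme_choices)
    total = len(sme_choices)
    return {option: n / total for option, n in counts.items()}


def score_sjt(responses, keys):
    """Sum the SME-endorsement proportions of the respondent's chosen options."""
    # Options no SME chose earn 0.
    return sum(keys[item].get(choice, 0.0) for item, choice in responses.items())


# Hypothetical SME panel of four, two items.
sme_panel = {"item1": ["A", "A", "B", "A"], "item2": ["C", "C", "C", "D"]}
keys = {item: build_key(choices) for item, choices in sme_panel.items()}
# Respondent picks the majority option on item1 and the minority option on item2.
total = score_sjt({"item1": "A", "item2": "D"}, keys)  # 0.75 + 0.25 = 1.0
```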
What are the different motivations and situations that influence response distortion?
-
High stakes situations are prone to faking:
- Faking Good: employment selection, internet dating and educational selection. NEO-PI-R example “I strive for excellence”
-
Faking Bad: Legal (benefits/diminished responsibility), Education (special comp), military (discharge, special duties, conscription)
- Faking estimated in 30% of personal injury cases; in WWII both sides dropped instructions on faking to the opposing military
-
Types of faking: Conscious and unconscious biases
- Self-deceptive enhancement: Linked to Egoistic Bias. Value = agency, strong, competent, exaggeration of status (social, physical etc)
- Self-deceptive denial: Linked to Moralistic Bias. Value = communion, good, kind. Deny socially deviant impulses/behaviours.
- Impression management: conscious bias