Psych Testing Flashcards
What is validity?
-
Test Standards Definition: “Validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests”
- Test standards = current framework/operational guidelines
- Validation is the joint responsibility of the test developer and the test user
- Test developer should present a rationale for recommended use and interpretation, accompanied by evidence and theory.
What are some past definitions of validity?
-
1954 Criterion-based view: A test is valid for anything it correlates with. Validity was treated as a static property of the test.
- Problems: the same test may be used for different purposes in different groups.
-
1966 Tripartite view: Criterion validity (concurrent and predictive), Content validity (relevant and representative of domain) and Construct validity (convergent/discriminant)
- Problems: based on the nomological network (if the test isn't valid, is the theory or the test wrong?). Overemphasis on supposedly distinct forms of validity (often indistinct) and on correlations as proof.
- Updated in 1985 to include consequences of testing.
What is the current 5 source view of validity?
-
Unchanged from 1999: a unitary view of validity based on evidence from multiple sources to support an argument about what test scores mean.
- No different types of validity; validity is a property of the interpretation, not the test
- Evidence from 5 sources:
- Content: Relevance and representativeness of content
- Response Processes: if the test is intended to measure a process, this should be demonstrable, i.e. responses behave as expected under manipulation
- Internal Structure: Factor analysis should match theory
- Relationship to other variables: Convergent/discriminant, test-criterion, and generalisation across situations (populations, conditions) and purposes (types of jobs)
- Consequences of testing: consider intended and unintended consequences of testing (e.g. NAPLAN: funding vs. competition)
What are some of the major purposes of psychological testing?
- Classification: Selection (education and employment), Screening, Certification, Placement
- Diagnosis/Treatment planning: Clinical, Educational (giftedness/learning difficulties), neuropsychological deficits
- Coaching/training: Insight (self-knowledge), career-counselling, coaching
- Legal Application: Diminished responsibility, special dispensations, compensation claims
- Research
- Program Evaluation
What are some of the main tests used to measure intelligence and aptitude?
- Aptitude tests differ somewhat from intelligence tests in that they relate to trainable outcomes
-
Individually administered tests: Primarily for children, diagnosis, emphasis on rapport
- Stanford-Binet: verbal and non-verbal factors across 5 areas.
- Wechsler scales: 3 tests for different ages; subscales and tests vary.
- Woodcock-Johnson: achievement and intelligence test batteries.
-
Group administered tests:
- ASVAB: Armed Services Vocational Aptitude Battery
- GAMSAT: Graduate Australian Medical School Admissions Test
- Raven's Progressive Matrices
What is Holland's vocational interest model?
- Holland's vocational interest model tests 6 domains (RIASEC)
- Realistic: practical, hands-on, tool-oriented
- Investigative: analytical, intellectual, scientific, explorative
- Artistic: creative, independent, chaotic
- Social: cooperative, supporting, helping, healing
- Enterprising: competitive, leadership, persuading
- Conventional: detail-oriented, organising, clerical.
- These domains are arranged in a hexagon, ordered by the correlations between them (the least correlated domains sit opposite each other)
What are some applications of psychological tests?
-
Neuropsychology: Checklists for frontal lobe dysfunction
- Luria-Nebraska Neuropsychological Battery (attention, language, memory, spatial, executive function), the Mini-Mental State Exam (MMSE)
-
Health Psychology: McGill Pain Questionnaire, Beck Depression Inventory
- Alcoholism: TWEAK (tolerance, worry, eye-opener, amnesia, cut-down)
- Forensic Assessment: malingering, assessment for insanity plea, child custody
How can psychometric tests be used for selection and training in the workplace?
-
Selection: Important to match selection criteria with job requirements. Steps:
- Job analysis: What tasks are required?
- Write job description: What qualities does the person need?
- Test the candidate pool
- Select best candidate
-
Score Feedback for training: Focus on profile not raw scores, focus on developmental planning
- Compensatory strategies: reshape problem, externalise
- Developmental activities: 1. deliberate practice, 2. training, 3. mentoring, 4. goal setting, 5. plan/monitor/evaluate
What four factors affect reliability?
-
People taking the test: Reliability is based on variability between people: larger SD = higher reliability
- Match person to test to avoid floor/ceiling effects
-
Test Characteristics: Bandwidth vs fidelity (more specific test = higher reliability)
- Don’t sacrifice content coverage for reliability
-
Item Characteristics: internal consistency affected by # of items and correlation between them.
- A reliable test either has many items with weak inter-item correlations, or fewer items with strong ones.
- Method used to estimate reliability: test-retest vs internal consistency etc. Consider the appropriateness of the method (e.g. whether the construct is expected to change over time).
What is reliability?
-
Reliability = the ratio of true score variance to observed score variance (true plus error). Aim for:
- .9 for high-stakes decisions, .7 for research, .6 if multiple measures are combined
- Validity depends on reliability: the maximum observable correlation between 2 variables is limited by the measurement error in each test (see the formulas below).
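In classical test theory notation (a standard formulation, not spelled out on this card): reliability is the proportion of observed variance that is true-score variance, and it caps the correlation two measures can show (the attenuation ceiling).

```latex
r_{xx} = \frac{\sigma^2_T}{\sigma^2_X} = \frac{\sigma^2_T}{\sigma^2_T + \sigma^2_E}
\qquad\qquad
r_{xy}^{\max} = \sqrt{r_{xx}\, r_{yy}}
% e.g. tests with reliabilities .7 and .8 can correlate at most
% sqrt(.7 x .8) = sqrt(.56) ~ .75, however strong the true relationship.
```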
How does reliability relate to test length?
-
Reliability increases as the number of items increases:
- Spearman-Brown formula: predicted reliability is a function of test length and existing reliability (assuming the added items are parallel to the existing ones).
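A standard statement of the formula, with a worked example (the numbers are illustrative):

```latex
r_{kk} = \frac{k\, r_{11}}{1 + (k - 1)\, r_{11}}
% k = factor by which test length changes, r_11 = current reliability.
% Doubling (k = 2) a test with r_11 = .60 predicts
% (2)(.60) / (1 + .60) = .75; halving it (k = .5) predicts .43.
```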
-
Balancing test length:
- Too long a test causes boredom, exhaustion, loss of motivation
- Problems with short tests: previous item exposure, inadequate domain sampling.
- Solution: Adaptive testing and Computerised adaptive testing
What is adaptive testing? What are the advantages and disadvantages?
-
Testing is adapted to the person's level of ability: previous responses determine the next question.
- Used in major batteries like the Stanford-Binet
-
Computerised adaptive testing (CAT): a computer algorithm selects further items according to a rule (see the sketch after this card).
- Used in large scale testing where security is important (ASVAB, TOEFL)
-
CAT Advantages: Tests are shorter but just as reliable:
- economic advantage, fewer problems with motivation, easier to maintain test security
- CAT Disadvantages: substantial preparation and outlay needed (very large item pool, analysis of item difficulty, algorithms); requires computers.
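A minimal sketch of the adaptive logic, assuming a simple difficulty-matching rule with a shrinking step size (operational CATs use IRT-based estimation instead; all names and numbers here are illustrative):

```python
def run_cat(items, answer, n_questions=5, ability=0.0, step=1.0):
    """items: dict of item id -> difficulty (same scale as ability).
    answer: callable taking an item id, returning True if correct."""
    unused = dict(items)
    for _ in range(min(n_questions, len(unused))):
        # Select the unused item closest to the current ability estimate.
        item = min(unused, key=lambda i: abs(unused[i] - ability))
        del unused[item]
        # Move the estimate toward harder items if correct, easier if not,
        # halving the step each round so the estimate settles.
        ability += step if answer(item) else -step
        step /= 2
    return ability

# Example: a test taker who can answer items up to difficulty 1.0.
bank = {"q1": -2.0, "q2": -1.0, "q3": 0.0, "q4": 1.0, "q5": 2.0}
print(run_cat(bank, answer=lambda q: bank[q] <= 1.0, n_questions=4))
```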
What are anchoring vignettes?
-
Problems with self-rating scales: there are significant variations in response styles across individuals and cultures
- extreme vs conservative responders, tendency to ‘agree’ with statements
-
Anchoring vignettes: respondents rate vignettes describing hypothetical people.
- The mean vignette rating is then subtracted from the self-rating (see the sketch below)
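A minimal sketch of the subtraction adjustment described above (the ratings are invented for illustration):

```python
def adjusted_self_rating(self_rating, vignette_ratings):
    # Express the self-rating relative to the respondent's own
    # average rating of the hypothetical vignette people.
    anchor = sum(vignette_ratings) / len(vignette_ratings)
    return self_rating - anchor

# An extreme and a conservative responder rating the same vignettes
# land on a comparable adjusted scale despite different raw scores.
print(adjusted_self_rating(6, [5, 6, 7]))  # 0.0
print(adjusted_self_rating(4, [3, 4, 5]))  # 0.0
```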
-
Examples: significant cross-cultural discrepancies have been resolved using anchoring vignettes, such as:
- relationship between teacher helpfulness and achievement
- relationship between conscientiousness and life expectancy
What are situational judgement tests?
-
SJTs give situations and require the respondent to choose the best response option
- can be typical or maximal performance (would/should)
- Seen as more engaging (higher face validity)
- Show lower adverse impact than IQ tests
-
Development of SJTs: use subject matter experts (SMEs)
- Collect critical situations from SMEs and summarise these into items
- Collect responses from everyday people and SMEs
- Score answers based on SME opinions (see the sketch below)
- Pilot-test items and select the most reliable ones
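A minimal sketch of SME-consensus scoring, one common scheme (the item, options, and counts are hypothetical):

```python
# Each response option earns the proportion of SMEs who endorsed it.
sme_endorsements = {  # item -> option -> how many of 10 SMEs chose it
    "angry_customer": {"apologise": 7, "escalate": 2, "ignore": 1},
}

def score_response(item, option, n_smes=10):
    return sme_endorsements[item].get(option, 0) / n_smes

print(score_response("angry_customer", "apologise"))  # 0.7
print(score_response("angry_customer", "ignore"))     # 0.1
```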
What are the different motivations and situations that influence response distortion?
-
High stakes situations are prone to faking:
- Faking Good: employment selection, internet dating and educational selection. NEO-PI-R example “I strive for excellence”
-
Faking Bad: legal (benefits/diminished responsibility), education (special consideration), military (discharge, special duties, conscription)
- Faking is estimated in ~30% of personal injury cases; in WWII, instructions on faking were dropped to opposing troops (by both sides)
-
Types of faking: Conscious and unconscious biases
- Self-deceptive enhancement: linked to egoistic bias. Values agency (strong, competent); exaggeration of status (social, physical etc.)
- Self-deceptive denial: linked to moralistic bias. Values communion (good, kind); denies socially deviant impulses/behaviours.
- Impression management: conscious bias
What are some methods for detecting faking?
-
Lie Scales: Paulhus' Balanced Inventory of Desirable Responding (BIDR). Ask about socially aversive but universal behaviours, e.g. “I’ve never wanted to swear”
- Problem: may be measuring personality. Neuroticism, conscientiousness and agreeableness all correlate strongly.
-
Response time rubrics: longer response times suggest faking
- But for some people, outright faking can be quicker than honest responding
-
Over-claiming technique: Paulhus - rate familiarity with concepts, some of which don't exist (foils). Compare claimed familiarity with real terms to foils (see the sketch after this card).
- Works well but limited in the concepts you can test
- Bayesian truth serum: for each item, respondents estimate the proportion of people who would give the same answer. Via the false consensus effect, honest answers over-estimate how many others share the belief.
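A rough sketch of an over-claiming index (the published technique uses signal detection analysis; the terms and ratings below are invented):

```python
real_terms = ["cognitive dissonance", "working memory"]
foil_terms = ["retroactive salience", "bimodal priming"]  # do not exist

def overclaiming_index(ratings):
    # Mean claimed familiarity with foils minus real terms;
    # values closer to zero suggest exaggerated responding.
    mean = lambda terms: sum(ratings[t] for t in terms) / len(terms)
    return mean(foil_terms) - mean(real_terms)

honest = {"cognitive dissonance": 5, "working memory": 4,
          "retroactive salience": 1, "bimodal priming": 1}
faker = {"cognitive dissonance": 5, "working memory": 5,
         "retroactive salience": 4, "bimodal priming": 5}

print(overclaiming_index(honest))  # -3.5: claims little about foils
print(overclaiming_index(faker))   # -0.5: claims familiarity with foils
```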
What are some methods for reducing faking?
- Warnings: Best warnings are consequence based but can also be based on detection, reasoning (best interest), educational (validity of test) or moral.
-
Forced Choice: test takers must choose between 2 equally desirable alternatives (“which is more like you?”).
- Results are only relative (ipsative): actual levels cannot be compared between people (see the sketch after this card)
- Verifiable statements: people are less likely to fake information that is easily verified, e.g. “I work more than needed” vs “How many hours of overtime did you work?”
- Other reports: ratings from referees or friends show lower levels of faking (but it is still present)
- Implicit measurement techniques: eg implicit associations test
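A minimal sketch of why forced-choice scoring is ipsative (traits and choices are hypothetical): every pair awards exactly one point, so each respondent's scores sum to the same constant and only within-person comparisons are meaningful.

```python
pairs = [("organised", "outgoing"), ("outgoing", "calm"),
         ("calm", "organised")]
choices = ["organised", "outgoing", "calm"]  # option picked per pair

scores = {trait: 0 for pair in pairs for trait in pair}
for picked in choices:
    scores[picked] += 1

print(scores)                # {'organised': 1, 'outgoing': 1, 'calm': 1}
print(sum(scores.values()))  # always equals the number of pairs (3)
```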
What are three paradigms in faking research? What has been found about faking levels?
-
Group Comparison: compare job applicants to other samples
- Measures the lower limit of faking (not everyone will fake)
- There may be real group differences
- Findings: changes in OCEAN levels found, largest variation for N and C
-
Instructed faking: compare scores under “answer honestly” vs “maximise your scores” instructions. Instruction wording varies, e.g. “imagine you’re applying to X” (and can affect results)
- Honest answers still have self-deceptive biases.
- Findings: Huge changes found in OCEAN, particularly for N and C
-
Incentive manipulation: Compare no stakes conditions to reward conditions. eg top 10% get $10.
- But hard to mimic real life reward levels.
- Conclusion: people fake but not maximally
What are the recommendations for dealing with faking in high stakes situations?
- Social desirability scales should not be used as indicators of faking:
- can indicate real personality factors and exclude good candidates
- If faking is detected, re-test or interpret with caution:
- risk of false positives
- Try to minimise rather than detect faking
- Use personality to screen out the lowest scorers rather than screen in the best
- Neutralise evaluative content of items
What are the reasons for measuring job performance?
- Decision making about individuals: high performance (promotions, bonuses, probational periods) as well as low performance (retention, termination, layoffs)
- Organisational Planning: Benchmarking performance, identifying developmental needs, assisting in goal identification
- Legal requirements for the profession: legal requirements for certain levels of performance (e.g. doctors), legal defence of hiring/firing decisions
- Feedback: individual, team and organisational
- Evaluation of procedures or changes: did selection processes work? did training work? other changes
What are some examples of subjective measurements of job performance?
-
Subjective measures: rating scales filled out by employee or supervisor
- Graphic rating scales: tick along a physical scale
- Behaviourally anchored rating scales (BARS): developed for a specific job dimension within a specific job. Each scale point lists example behaviours. Can the employee do X?
- Behavioural observation scale: Developed for specific job, have you observed the worker perform these behaviours?
- Checklists: list of behaviours, tick the ones that are observed
What are some objective measures of job performance? What are some problems with them?
-
Objective measures of job performance:
- Production Counts: eg number of bricks laid
- Biodata: eg absenteeism
-
Problems with objective data:
- Production counts sometimes not possible (eg a nanny)
- Doesn’t always take quality into account
- Production is dependent on situational variables as well as the worker (e.g. # of customers served)
What are some issues with rating measures of job performance?
-
Correlation between raters: meta-analyses have shown variations
- Harris and Schaubroeck: self/peer = .36, self/supervisor = .35, but peer/supervisor = .62 (reasonable)
-
Conway and Huffcutt: both reliability and agreement were higher for low-complexity, non-managerial jobs.
- Reliability was highest for supervisors (lowest for subordinates)
- Correlations between sources were lower than Harris and Schaubroeck's but showed the same pattern
-
Sources of error in rating scales
- Social desirability (faking)
- Leniency/severity errors (response styles: personal thresholds for high/low ratings)
- “Halo” or “horns” effect: impression based on one quality
- Recency effects
- Causal attribution errors: effort>ability, actor/observer bias
- Personal bias (pregnancy, race, age)
What is the difference between task and contextual performance?
-
Task Performance: activities that contribute to an organisation's technical core
- tasks required by formal job role
- Lower correlations with personality
-
Contextual performance: Activities that contribute to the social and psychological core of the organisation
- tasks are discretionary and not explicitly stated
- Higher correlations with personality
What is job satisfaction and how can it be improved?
- Job satisfaction is the positive and negative feelings and attitudes about one's job.
- Shows a modest correlation with job performance (r = .30)
-
Measurements of job satisfaction: Global or Specific measures
- Job Descriptive Index: 5 facets (work itself, supervision, pay, promotions, coworkers)
-
Ways to increase job satisfaction:
- Work factors: 1. job rotation, 2. job enlargement (add more tasks), 3. job enrichment (add responsibility)
- Pay factors: 1. perception of fairness, 2. skill/knowledge-based pay, 3. merit-based pay (bonuses, commission), 4. profit sharing
- Hours/flexibility: 1. compressed work weeks (3x12hr days), 2. flexitime.
What is the best way to conduct a performance review?
- Two parts: 1. Performance Assessment 2. Performance Feedback
-
8 Feedback Principles:
- Descriptive (not evaluative)
- Specific (not general)
- Appropriate (considers needs of employer, worker and situation)
- Directed toward changeable behaviours
- Well-timed (immediate is better)
- Honest (not manipulative, self-serving)
- Understood by both parties
- Pro-active (specific directions for change)
How do personality factors relate to work performance? What 4 factors influence this relationship?
-
Job proficiency vs training proficiency:
- Only C predicts job proficiency at a non-trivial level
- E,C and O all predict training proficiency (still small effect)
-
Productivity vs subjective ratings:
- Only C shows non-trivial relationship with productivity
- C and E show non-trivial relationships for subjective ratings
- Inverted (negative) relationships for A and N
-
Task vs contextual performance:
- All facets predict contextual more than task (except for O)
- C is the strongest predictor (particularly achievement striving and dutifulness)
-
Cultural factors: Salgado compared meta-analytic data from Europe and America
- Similarity: C is strongest predictor
- Differences: A rather than E predicted training proficiency, Low N predicted job performance and proficiency
How do EI and intelligence relate to job performance? What factors moderate this?
- Correlations: intelligence is highly related to job performance (>.50). All EI streams correlate lower: ability EI is lowest, then self-efficacy, with trait EI strongest.
- The EI-performance link is moderated by the degree of emotional labour: trait EI in high emotional-labour jobs predicts more strongly than IQ
-
Job performance facets: task performance vs organisational citizenship behaviours (OCB) vs counterproductive workplace behaviours (CWB):
- Stream 1: lowest for all 3, slightly higher for task
- Stream 2: Strongest for OCB, still strong for other 2
- Stream 3: Strongest for OCB, but stronger than 2 for CWB
What is the affective pathways model?
- A proposed mechanism through which EI relates to work performance
- Correlations between facets are linked to the emotions felt at work (.39)
-
EI predicts regulation of emotion:
- Pathway through positive affect leads to OCB (moderate effect sizes; significant for streams 2 and 3)
- Pathway through negative affect leads to reduced CWB (significant for all streams, though effect sizes vary)
What did Schmidt and Hunter find to be the best predictors of workplace performance?