assessment Flashcards
appraisal
process of assessing or estimating attributes, e.g., through observations, surveys, interviews
psychometrics
study of psychological measurement
subjective format of test
relies mainly on the scorer's opinion; personal bias can affect the rating.
halo effect
When a trait that is not being evaluated influences the researcher's rating of another trait, e.g., an attractive person may receive a higher rating on a test.
objective test
the rater's judgment plays little or no role in scoring.
forced choice
items are known as recognition items (2 or more options). The person must respond with one of the given answers (e.g., multiple choice)
difficulty index
indicates the proportion of individuals who answered each item correctly. Ranges from 0 to 1 (e.g., 0.75 means 75% answered correctly).
ipsative measures
compares traits within the same individual; does not compare a person to others who took the same instrument. The person taking an ipsative measure is measured against their own standard of behavior. The measure points out highs and lows that exist within the same individual, e.g., Mr. Johns's depression is improving.
normative test
each item is independent of all other items. Can compare persons to each other who take same test
power test
evaluates level of mastery without a time limit. A timed test can still be a power test as long as the time limit allows most people to complete it.
Achievement test: measures maximum performance (personality and interest inventories measure typical performance).
spiral test
Items get progressively more difficult (think of spiral staircase)
cyclical test
Several sections that are spiral in nature (each section gets harder as you go).
test battery
a horizontal test. Several measures are used to produce results that could be more accurate than those derived from merely using a single source.
vertical test
Versions for different developmental/education levels.
parallel forms (aka equivalent forms)
when a test has two interchangeable forms
most critical factors in test selection
validity and reliability
validity
whether the test measures what it says it does. The number one factor in test selection. A valid test is always reliable.
5 types of validity
Content (aka rational/logical): Does the test sample the behavior under scrutiny? E.g., an IQ test that didn’t sample the entire range of
Construct: Ability to measure a theoretical construct like intelligence, talent, etc. Any trait you cannot directly measure or observe is a construct.
Concurrent: How well test compares to other well established instruments with same purpose.
Predictive: Ability to predict future behavior.
Consequential: social implications of using tests.
convergent validity
Relationship/correlation of a test with an independent measure or trait.
discriminant validity
test will not reflect unrelated variable
criterion validity
could be predictive or concurrent
face validity
the extent that a test looks or appears to measure the intended attribute
incremental validity
describes the process by which a test is refined and becomes more valid as contradictory items are dropped. Also refers to a test’s ability to improve predictions compared with existing measures: the test provides additional valid information that was not attainable via other procedures.
synthetic validity
Derived from “synthesized”. The helper or researcher looks for tests that have been shown to predict each job element or component, then combines these tests to create one with synthetic validity.
reliability
how consistently a test measures an attribute. Second most important concern in test selection. Tests can be reliable, but not valid.
reliability coefficient
0-1 (1= perfect and only occurs in physical measurements). In psychological tests, 0.90 is considered excellent. 0.70 is considered somewhat typical. Tells you the percentage of the score that is accurate.
.80 is considered acceptable for admissions to jobs, etc.
true variance aka coefficient of determination
the square of the correlation/reliability coefficient (e.g., if the correlation is .7, the true variance is .49 or 49%).
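A minimal sketch of the squaring step described on this card. The function name is illustrative, not from any testing library.

```python
def coefficient_of_determination(r: float) -> float:
    """Return the squared correlation (the 'true variance' on this card)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return r * r


# The card's own example: a correlation of .7 gives .49, or 49%.
print(round(coefficient_of_determination(0.7), 2))  # 0.49
```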
test-retest reliability
giving the test to the same group of people two times and correlating the scores.
equivalent or alternate forms reliability
Giving the same population alternate (parallel) forms of the same test. A reliability coefficient is computed from the two sets of scores.
split-half correlation method
Dividing a test into two halves and then finding the correlation between the half scores to find the reliability coefficient.
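The split-half procedure can be sketched as follows. The odd/even split, the function names, and the sample data are assumptions made for illustration; the card does not say how the halves are chosen.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores (e.g. 0/1) per examinee.

    The test is split into odd- and even-numbered items (an assumed
    splitting rule), and the two half-test totals are correlated.
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return pearson_r(odd_totals, even_totals)


# Hypothetical data: five examinees, four items each.
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]
print(round(split_half_reliability(sample), 2))
```

The result describes a test half as long as the real one, which is why the half-test correlation is usually adjusted upward with the Spearman-Brown formula.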
inter rater/inter observer method
aka “scorer reliability.” Used with subjective tests to see if the scoring method provides the same score despite different scorers. Several raters assess the same performance.
consistency reliability
homogeneity or internal consistency or inter-item consistency.
counterbalancing
Necessary practice for equivalent forms reliability testing. Half individuals get parallel form A first and half get form B, then switch for the second.
stability
the ability of a test score to remain stable or fluctuate over time if the client takes the test again
physical measurements
more reliable than psychological measurements
IQ
intelligence quotient
IQ formula
MA/CA x 100: mental age divided by chronological age, times 100. “Ratio IQ.”
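As a quick sketch of the ratio IQ arithmetic (the function name and the example ages are illustrative):

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# A child with a mental age of 10 and a chronological age of 8.
print(ratio_iq(10, 8))  # 125.0
```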
Francis Galton
Did research and concluded that intelligence was normally distributed. Felt intelligence was a unitary faculty. Believed exceptional mental abilities were genetic.
J. P. Guilford
Isolated 120 factors that added up to intelligence; also known for his thoughts on convergent and divergent thinking. Convergent: divergent thoughts are combined into a singular concept. Divergent: the ability to generate novel ideas.
Kuder-Richardson coefficients of equivalence
Helps to find out if each item on the test is measuring the same thing as every other item
Lee Cronbach's alpha coefficient
also used to determine if each item is measuring the same thing as every other item
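The card gives no formula, so the sketch below assumes the standard textbook form of coefficient alpha: k/(k-1) times (1 minus the sum of the item variances divided by the variance of the total scores). The function name and data layout are illustrative.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """scores holds one row per examinee and one column per item."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across examinees.
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of each examinee's total score.
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have no variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: items that always agree give an alpha of 1.
consistent = [[1, 1, 1], [0, 0, 0]]
# Hypothetical data: two unrelated items give an alpha of 0.
unrelated = [[1, 0], [0, 1], [1, 1], [0, 0]]
print(round(cronbach_alpha(consistent), 2), round(cronbach_alpha(unrelated), 2))
```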
cross-validation
When a researcher further examines the criterion validity of a test by administering the test to a new sample. In most cases the cross-validation coefficient is indeed smaller than the initial validity coefficient (aka “shrinkage”).
stanford-binet IQ test
1905, 30-question standardized test for school children. Binet used his daughters as subjects. Also cited as a pioneer for his work with inkblots. Used age-related tasks that 50% of the children of a certain age could answer successfully. Was created to differentiate children with and without intellectual disability.
SAS
standard age score. In the current version of the IQ test, it has a mean of 100 and a standard deviation of 16.
standard test
scoring and administration procedures are formal and well delineated. Include quantitative information related to “standards” of performance.
wechsler iq test
added performance skills to Binet test.
WPPSI
Wechsler preschool and primary scale of intelligence ages 2-7.
WAIS-IV
Wechsler adult intelligence scale: 16-90 yrs.
Mean is 100, standard deviation is 15 (16 for Stanford Binet)
Based on neurocognitive research and the Cattell-Horn-Carroll theory, a leading theory of human intelligence.
Can be scored online
60-90 mins to complete.
10 subject areas
4 index scores
Can measure IQ from 40-160 (Stanford-Binet can do up to 180)
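Because the two scales above share a mean of 100 but use different standard deviations (15 versus 16), the same relative standing gives slightly different numbers on each. The sketch below rescales through a z-score. That is plain arithmetic offered as an assumption for illustration, not a conversion procedure from either test's manual, and the function name is hypothetical.

```python
def convert_standard_score(score, from_mean, from_sd, to_mean, to_sd):
    """Re-express a standard score on a scale with a different mean or SD."""
    z = (score - from_mean) / from_sd  # distance from the mean in SD units
    return to_mean + z * to_sd


# Two SDs above the mean: 130 on an SD-15 scale is 132 on an SD-16 scale.
print(convert_standard_score(130, 100, 15, 100, 16))  # 132.0
```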
WISC-IV
Wechsler intelligence scale for children: 6-16 yrs
merrill-palmer scale of mental tests
intelligence test for below 7 yrs.
group iq tests
e.g., Otis-Lennon, Lorge-Thorndike, California Test of Mental Abilities. Advantage: group tests are quicker to administer. Disadvantage: less accurate and lower reliability.
group iq test movement
began in WWI with Army Alpha and Army Beta.
culture fair test
items are known to the subject regardless of his or her culture
jacob ertl
invented an electronic machine to analyze neural efficiency
raymond cattell
fluid & crystallized intelligence. Crystallized: content. Fluid: content-free reasoning.
arthur jensen
1969 article in the Harvard Educational Review that suggested the closer people are genetically, the more alike their IQs will be. Controversy re: statement that African Americans were “bred for strength rather than intelligence.”
robert williams
Demonstrated that African Americans excelled when given a test that was culturally relevant/used language African Americans knew & understood.
Larry P. v. Wilson Riles
court case accusing the Wechsler and Binet tests of racism.
projective test
client is shown something neutral. Examiner bias is common. Some formats of projective tests:
Association: e.g., “What comes to mind?”
Completion: e.g., “Finish this sentence.”
Construction: e.g., drawing.
16 pf
Raymond Cattell- personality factor questionnaire developed for people 16+ and measuring key personality factors.
factor analysis tests or inventories
analyze data outside of a given theory.
james mckeen cattell
coined the term “mental test” and worked with Galton and Wilhelm Wundt.
myers briggs
reflects work of carl jung
oscar buros
mental measurements yearbook
projective measures
favored by psychodynamic clinicians, who rely on the unconscious mind
aptitude test
assesses potential and predicts
achievement test
assesses what has been learned, what you know or how well you can currently perform
thematic apperception test (TAT)
uses pictures, 31 cards, intended for ages 4+, projective test
rotter incomplete sentence blank (RISB)
projective, client completes incomplete sentence with feeling
bender gestalt II
expressive, projective measure named for Lauretta Bender; ages 4+. Full name: Bender Visual Motor Gestalt Test.
interest inventory
work best with individuals of high school age and older (interests become stable around age 25); tend to emphasize professional positions and minimize blue-collar jobs; reliable; not threatening to the test taker. Problem: people try to answer questions in a socially acceptable manner (i.e., social desirability).
association for assessment and research in counseling (AARC)
1 of 20 ACA divisions
guilford zimmerman temperament survey (GZTS)
personality measure for people who do not have severe psychiatric disabilities
california psychological inventory (CPI)
similar to above, shares questions with MMPI
acquiescence
when a client always agrees with something
deviation
When an individual, purposely or when in doubt, gives unusual responses
standard error of measurement
how accurate or inaccurate a test score is. Tells the counselor what would most likely occur if the same individual took the same test again. A lower standard error is better. X = T + E (X = obtained score, T = true score, E = error).
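The card does not give a computing formula. The sketch below assumes the common textbook one, SEM = SD x sqrt(1 - reliability), and uses it to build a band around an obtained score. The function names and example values are illustrative.

```python
from math import sqrt


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability); assumed formula, not from the card."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1.0 - reliability)


def score_band(obtained: float, sem: float, z: float = 1.0):
    """Range of obtained score plus or minus z standard errors."""
    return (obtained - z * sem, obtained + z * sem)


# Hypothetical test with SD = 15 and reliability = .91.
sem = standard_error_of_measurement(15, 0.91)
print(round(sem, 2))  # 4.5
print(score_band(100, 4.5))  # (95.5, 104.5)
```

A perfectly reliable test (reliability of 1) would give an SEM of 0, which fits the card's point that a lower standard error is better.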
social loafing
phenomenon in which a person in a group puts forth less effort than they would if they were attempting to accomplish goal individually.
spearman brown formula
used to estimate the impact that lengthening or shortening a test will have on its reliability coefficient.
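A short sketch of the estimate, assuming the usual statement of the formula: new reliability = n x r / (1 + (n - 1) x r), where n is the factor by which the test length changes. The function and parameter names are illustrative.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by length_factor.

    length_factor = 2 doubles the test; 0.5 cuts it in half.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Doubling a test whose reliability is .60 (e.g. correcting a split-half r).
print(round(spearman_brown(0.6, 2), 2))  # 0.75
# Halving a test whose reliability is .75.
print(round(spearman_brown(0.75, 0.5), 2))  # 0.6
```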
CPT
current procedural terminology codes. Used to let insurance companies know what services you are providing
reactive effect
impact of self monitoring to bias results of self report
informal assessment technique
journals, case notes, interviews, professional staffings, checklists, sociograms of groups.
infant IQ test
relatively unreliable; a toddler IQ test can pick up abnormalities such as intellectual disabilities
item difficulty index
Ranges from 0.0 to 1.0. The higher the index number, the greater the number of examinees who will answer the question correctly (or rather, the greater the number, the easier the question is to answer).
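A minimal sketch of the index as the proportion of examinees who answer an item correctly. The function name and the sample responses are illustrative.

```python
def item_difficulty(responses) -> float:
    """Proportion of examinees who answered the item correctly (0.0 to 1.0)."""
    if not responses:
        raise ValueError("need at least one response")
    return sum(1 for r in responses if r) / len(responses)


# Three of four hypothetical examinees answered correctly: an easy item.
print(item_difficulty([True, True, True, False]))  # 0.75
```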
buckley amendment
A college student can view their record, including test data; a parent can view infant IQ tests at a preschool; a client can demand correction of something they discover while reading their file. Persons over 18 can inspect their own records and those of their children.
FERPA
Family Educational Rights and Privacy Act. Info cannot be released without adult consent. Children over 18 can view their own records. Schools and programs that don't receive federal funds are exempt from FERPA guidelines.
lewis terman
Americanized the Binet. Associated with Stanford (the Binet became the Stanford-Binet in the US).