Assessment Flashcards

1
Q

Item difficulty index

A

percentage of people who got an item CORRECT

The lower the score, the more difficult the question is
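
A minimal sketch of the calculation, assuming each response to the item is recorded as True (correct) or False (incorrect). The function name and sample data are illustrative only.

```python
def item_difficulty(responses):
    """Return the proportion of examinees who answered the item correctly."""
    if not responses:
        raise ValueError("responses must not be empty")
    return sum(1 for r in responses if r) / len(responses)


# Hypothetical data: five examinees per item.
easy_item = [True, True, True, True, False]
hard_item = [True, False, False, False, False]

print(item_difficulty(easy_item))  # 0.8
print(item_difficulty(hard_item))  # 0.2
```

The lower value (0.2) marks the harder item, matching the rule above.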

2
Q

Who Americanized the Binet?

A

Lewis Terman
at Stanford University
thus, Stanford-Binet

3
Q

The Buckley Amendment

A

AKA FERPA (Family Educational Rights and Privacy Act of 1974)
Students over 18 can view their own school records (including test data)
Parents can view their children’s test data
Either can demand corrections to the file
Educational testing information cannot be released without adult consent

4
Q

Projective tests may also be called

A

self-expressive (e.g., sentence completion or word association)

5
Q

reactivity

A

clients/participants monitoring their own behavior and thus giving inaccurate answers

6
Q

How can you increase reliability?

A

Increase the test’s length

7
Q

Spearman-Brown Prophecy formula

A

used to estimate the impact that lengthening or shortening a test will have on a test’s reliability coefficient

when estimating split-half reliability, the Spearman-Brown prophecy formula can be used to compensate mathematically for the shorter length
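
A minimal sketch of the formula in its standard form, r_new = (n × r) / (1 + (n - 1) × r), where r is the current reliability and n is the factor by which the test length changes. The function name and sample values are illustrative assumptions.

```python
def spearman_brown(r, n):
    """Estimate reliability after changing test length by a factor of n.

    r: current reliability coefficient (0 to 1)
    n: length factor (2 = doubled, 0.5 = halved)
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return (n * r) / (1 + (n - 1) * r)


# Split-half correction: the half-test correlation is stepped up to full length (n = 2).
print(round(spearman_brown(0.60, 2), 2))    # 0.75

# Cutting a test in half (n = 0.5) lowers the estimated reliability.
print(round(spearman_brown(0.80, 0.5), 2))  # 0.67
```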

8
Q

Aptitude-achievement tests

A

GRE, MAT, MCAT, SAT

Ex: GRE attempts to predict graduate school performance but also tests level of current knowledge

9
Q

Generally, school selection tests assess

A

aptitude

10
Q

The ACA division for testing

A

AMECD

Association for Measurement and Evaluation in Counseling and Development

11
Q

Interests and abilities are ____ correlated.

A

not highly

12
Q

Bender Visual Motor Gestalt Test

A

named after Lauretta Bender
expressive projective measure, though best known for its ability to discern whether brain damage is present. Suitable for ages 4+; the client copies 9 geometric figures

13
Q

Interest inventories

A

Work best with high-school age and beyond

Interests are not stable until age 25

14
Q

Aptitude tests

A

assess Potential and Predict (aPtitude)

15
Q

Tests that analyze data outside of a given theory

A

factor-analytic tests

16
Q

Raymond Cattell

A

developed 16 Personality Factors

Responsible for defining fluid and crystallized intelligence

17
Q

James Cattell

A

coined the term “mental test”

18
Q

Projective tests use one of 3 formats

A

Association (word)
Completion (sentence)
Construction (draw a person)

19
Q

Projective tests use ___ stimuli

A

vague

20
Q

MMPI-2 for adolescents

A

MMPI-A

suitable for ages 14 to 18

21
Q

Arthur Jensen

A

sparked tremendous controversy with his 1969 Harvard Educational Review article
Claimed that Whites score 11-15 IQ points higher than Blacks, arguing that, due to slavery, Blacks were bred for strength rather than intelligence
Claimed that heredity contributes 80% to IQ and environment only 20%

22
Q

Robert Williams

A

made the BITCH (Black Intelligence Test of Cultural Homogeneity) to demonstrate that Blacks often excel when given a test with questions familiar to their community. Argued that IQ tests were part of “scientific racism”

23
Q

John Ertl

A

weirdo who claimed he invented an electronic machine to take the place of paper and pencil IQ tests. It literally had a strobe light on it.

24
Q

Group tests are ___ accurate and have __ reliability, compared to individual tests

A

less accurate and less reliable

25
Q

Means and SDs of Wechsler and Binet IQ tests

Difference between the two

A

Wechsler: M = 100, SD = 15
Stanford-Binet: M = 100, SD = 16

The Binet is considered less suitable for adults, so the Wechsler is the most popular

26
Q

Forms of the Wechsler IQ tests

A

WPPSI - preschool and primary; ages 2 years 6 months to 7 years
WAIS - age 16+
WISC - for children; ages 6 years to 16 years 11 months

27
Q

A 9-year-old task on the Binet is one that X of 9-year-olds could answer correctly

A

50%

28
Q

Today’s Binet is scored…

A

with a Standard Age Score (SAS)
Mean of 100
SD of 16

29
Q

IQ is calculated by

A

(Mental Age / Chronological Age) × 100
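
A quick worked example of this ratio formula, as a sketch. The function name and the ages used are illustrative assumptions.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return mental_age / chronological_age * 100


# An 8-year-old performing at the level of a typical 10-year-old.
print(ratio_iq(10, 8))  # 125.0

# Mental age equal to chronological age gives the average score.
print(ratio_iq(8, 8))   # 100.0
```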

30
Q

Alternatives to the split-half method of measuring internal consistency (inter-item consistency) of a test

A

Cronbach’s alpha

Kuder-Richardson formulas (KR-20 or KR-21)
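
A minimal sketch of Cronbach's alpha using the standard formula, alpha = (k / (k - 1)) × (1 - sum of item variances / variance of total scores), where k is the number of items. The data layout (one list of scores per item, same examinees in the same order) and the sample scores are assumptions for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: list of items, each a list of scores for the same examinees."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(item_scores[0])
    if n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("every item needs scores for the same 2+ examinees")
    # Total score per examinee across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have no variance")
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical 3-item scale answered by 4 examinees.
items = [[2, 4, 3, 5], [3, 4, 3, 5], [2, 5, 4, 5]]
print(round(cronbach_alpha(items), 2))
```

For items scored only right or wrong, the same formula gives the KR-20 value.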

31
Q

Cross-validation

A

When a researcher further examines a test’s criterion validity by administering the test to a new sample.
This helps ensure the test is applicable to other populations who will take the exam. Helps guard against error factors, which are likely to be present if the original sample was small.
The cross-validation coefficient will likely be smaller than the initial validity coefficient. This is called “shrinkage”

32
Q

J. P. Guilford

A

isolated 120 factors that added up to intelligence
Two of the dimensions are divergent thinking (coming up with new ideas) and convergent thinking (when divergent thoughts and ideas are combined into a singular concept)

33
Q

Charles Spearman

A

in 1904, said that two factors were applicable to any mental task:
G - general ability
S - specific ability

34
Q

Francis Galton

A
cousin of Darwin!
first intelligence theory
believed intelligence was a single "unitary" factor and that exceptional abilities were genetic and ran in families
eugenics :/
Hereditary Genius (1869)
35
Q

coefficient of determination

A

the proportion of variance in one variable that is accounted for by another
square the correlation coefficient
ex: the same test is given twice to the same group of people and the correlation between the administrations is .70. The shared variance is .70 squared: .70 x .70 = .49 (49%)
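
The arithmetic in the example can be checked with a short sketch. The function name is an illustrative assumption.

```python
def coefficient_of_determination(r):
    """Return the shared variance (r squared) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    return r ** 2


print(round(coefficient_of_determination(0.70), 2))  # 0.49
```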

36
Q

For psychological tests, an acceptable reliability coefficient is X. For admissions to jobs/schools (achievement), it is X.

A

Psychological - .70 reliability is good

For admissions to jobs/schools (achievement) - .80 or even .90

37
Q

A reliability coefficient of .70 means…

A

70% of the obtained score on the test represented the true score
30% of the obtained score could be accounted for by error

AKA 70% is true variance while 30% is error variance.

(NOT that 70% of ppl who are tested will get their true score)

38
Q

A reliable test is ___ valid.

A valid test is ___ reliable.

A

A reliable test is not always valid.

A valid test is always reliable.

39
Q

Incremental validity (2 definitions)

A

The process by which a test is refined and becomes more valid as contradictory items are dropped

ALSO refers to a test’s ability to improve predictions when compared to existing measures. When a test has incremental validity, it gives you additional good info that wasn’t available from other tests.

40
Q

According to the 1974 committee that drafted Standards for Educational and Psychological Tests, face validity is ___

A

not required

41
Q

A construct is any trait that ___

A

you cannot measure or observe directly

42
Q

What is the #1 consideration in test construction

A

Validity

43
Q

5 types of validity

A
content validity
construct validity
concurrent validity
predictive validity
consequential validity
44
Q

content validity

A

AKA rational or logical validity

does the test examine the behavior under scrutiny?

45
Q

construct validity

A

a test’s ability to measure a theoretical construct (like intelligence, self-esteem, etc)

46
Q

predictive validity

A

AKA empirical validity

test’s ability to predict future behavior

47
Q

concurrent and predictive validity may be lumped under ___ validity.

A

criterion validity, which is an estimate of the extent to which a measure agrees with a gold standard (i.e., an external criterion of the phenomenon being measured)

48
Q

concurrent validity

A

relationship between an instrument’s results and another currently obtainable criterion (give a new depression assessment to people you already know are depressed)

49
Q

consequential validity

A

tries to ascertain the social implications of using tests

50
Q

horizontal vs vertical tests

A

horizontal - assess for different things (math, language)

vertical - versions for different age brackets or levels of education (preschooler, middle-school math assessments)

51
Q

spiral test

A

the items get progressively more difficult

52
Q

cyclical test

A

has several sections that are each spiral in nature; within each section, the items get progressively more difficult

53
Q

ipsative test

A

does not measure absolute strengths
measures a person’s progress in relation to themselves
comparing their score to another person’s is meaningless
items are independent of one another

54
Q

convergent validity

A

established when measures of constructs that theoretically should be related are observed to be related
e.g., scores on GAD are related to another anxiety measure

55
Q

discriminant validity

A

established when measures of constructs that are not theoretically related are observed to have no relationship

56
Q

standard error of estimate

A

a statistic that gives the expected margin of error in a predicted criterion score due to the imperfect validity of the test

57
Q

validity coefficient

A

correlation between a test score and the criterion measure

58
Q

a person’s observed score (X) = ?

A

true score + error

X = T + e

59
Q

standard error of measurement

A

SEM
used to estimate how scores from repeated administrations of the same instrument to the same individual are distributed around the true score. SEM is computed using the SD and reliability coefficient:

SEM = SD × √(1 - r)
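
A minimal sketch of the SEM computation, taking the SD and the reliability coefficient r as inputs. The function name and sample values are illustrative. Reading observed score ± 1 SEM as roughly a 68% band is a common interpretation that assumes normally distributed error.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r), where r is the reliability coefficient."""
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1 - reliability)


# Hypothetical test with SD = 15 and reliability r = .91.
sem = standard_error_of_measurement(15, 0.91)
print(round(sem, 2))  # 4.5

# Band of +/- 1 SEM around an observed score of 100.
print(round(100 - sem, 1), round(100 + sem, 1))  # 95.5 104.5
```

A perfectly reliable test (r = 1.00) would have an SEM of 0.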

60
Q

factors that influence reliability

A

test length
homogeneity of test items (reliability goes up when items are homogeneous)
range expansion (reliability is lowered by a restriction of range)
heterogeneity of test group (higher reliability)
speed tests (high reliability because nearly everyone gets everything right)

61
Q

reliability coefficient

A

reliability is expressed in this coefficient

closer to 1.00, the more reliable the scores

62
Q

NOIR

A

nominal scale - no order or equal intervals
ordinal - order, but no equal intervals
interval - equal intervals, but no true 0
ratio - equal intervals, true 0

63
Q

Semantic differential

A

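Good _ _ _ _ _ _ Bad
respondent places a mark between two opposite adjectives at the point that matches how they feel
similar to a Likert scale, but anchored by bipolar adjectives rather than levels of agreement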

64
Q

Thurstone scale

A

Agree or Disagree only

65
Q

Guttman scale

A

measures the intensity of a variable because items are presented in a progressive order so that a respondent who agrees with one statement will also agree with all previous, less extreme items

66
Q

percentile rank

A

indicates the % of scores falling at or below a given score

range from 1 to 99+ and have a mean of 50
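
A minimal sketch of the "at or below" definition above. The function name and the sample score list are illustrative assumptions, and published tests may use slightly different conventions for ties.

```python
def percentile_rank(scores, score):
    """Percentage of scores in the group falling at or below the given score."""
    if not scores:
        raise ValueError("scores must not be empty")
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)


# Hypothetical norm group of ten scores.
group = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_rank(group, 70))  # 40.0
```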

67
Q

z-score

A
mean = 0
SD = 1

z = (X - M)/SD

68
Q

T-score

A
mean = 50
SD = 10

T = 10(z) + 50

69
Q

deviation IQ

A

also known simply as the standard score (SS); used to interpret scores from achievement and aptitude tests

mean = 100
SD = 15

SS = 15(z) + 100

70
Q

stanine

A

mean = 5
SD = 2
range from 1 to 9
round to the nearest whole number

stanine = 2(z) + 5

71
Q

normal curve equivalent

A

developed for the U.S. Department of Education and used to measure student achievement
1 to 99
mean = 50
SD = 21.06

NCE = 21.06(z) + 50
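
The z-score, T-score, deviation IQ, stanine, and NCE formulas are all linear conversions of the same z value, which a short sketch makes visible. Function names and the sample raw score are illustrative. Two assumptions are made here: the stanine is rounded to the nearest whole number, and both stanine and NCE are clamped to their stated ranges.

```python
import math


def z_score(x, mean, sd):
    """z = (X - M) / SD"""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd


def t_score(z):
    return 10 * z + 50


def deviation_iq(z):
    return 15 * z + 100


def stanine(z):
    # Assumption: round half up to the nearest whole number, then clamp to 1..9.
    rounded = math.floor(2 * z + 5 + 0.5)
    return max(1, min(9, rounded))


def nce(z):
    # Assumption: clamp to the 1..99 range given for NCE scores.
    return max(1, min(99, 21.06 * z + 50))


# Hypothetical raw score of 65 on a test with M = 50 and SD = 10.
z = z_score(65, 50, 10)
print(z, t_score(z), deviation_iq(z), stanine(z), round(nce(z), 2))
# 1.5 65.0 122.5 8 81.59
```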

72
Q

___ tests are usually used in high stakes testing

A

criterion-referenced (have you learned X curriculum)

73
Q

Mental Status Exam

A
AAMMTPTJI
Appearance
Attitude
Movement and behavior
Mood and affect
Thought content
Perceptions
Thought process
Judgment and insight
Intellectual functioning and memory
74
Q

suicide assessment acronyms - 3

A

PIMP (Plan, Intent, Means, Prior attempts)
SLAP (Suicidal ideations, Lethality, Access, Plan)

SAD PERSONS (sex, age, depression, previous attempt, ethanol abuse, rational thought loss, social supports lacking, organized plan, no spouse, sickness)

75
Q

types of test bias

A

examiner bias - examiner’s beliefs or behavior influence test administration
interpretive - interpretation of results is unfair
response - when clients answer every question the same way (a response set)
situational - testing conditions
ecological - global systems affect results (e.g., giving all students a test in English)

76
Q

Army Alpha vs. Army Beta

A

Alpha - verbal; for literate English speakers
Beta - nonverbal; for non-English speakers and those who could not read
used to test intelligence of military recruits during WWI

77
Q

Arthur Otis

A

developed the first scientifically reliable intelligence test for groups
Otis Group Intelligence Test

78
Q

Frank Parsons

A

father of vocational guidance and counseling

79
Q

NBCC and ACA ethical guidelines for assessment

A
  1. competence to use and interpret
  2. informed consent
  3. release of results to qualified professionals
  4. instrument selected
  5. conditions of administration
  6. scoring and interpretation of assessments
  7. obsolete assessments and outdated results
  8. assessment construction
80
Q

the Joint Committee on Testing Practices (JCTP) developed…

A

Rights and Responsibilities of Test Takers
Test User Qualifications
Code for Fair Testing Practices in Education

81
Q

IDEA

A

Individuals with Disabilities Education Improvement Act of 2004
right of students with disabilities to receive testing at the expense of the public school system
right to an IEP (individualized education program)

82
Q

ADA

A

Americans with Disabilities Act (1990)
employment testing must accurately measure a person’s ability to perform relevant job tasks
people with disabilities get appropriate accommodations for testing

83
Q

Carl D. Perkins act

A

Vocational and Technical Education Act of 1984
provides vocational assessment, counseling, and placement for low SES, disabled, single parents, those with limited English proficiency, incarcerated individuals

84
Q

Civil Rights Act of 1964 and 1972, 1978, and 1991 amendments

A

assessments used to determine employability must relate strictly to the duties outlined in the job description and cannot discriminate based on race, color, religion, pregnancy, gender, or national origin

85
Q

criterion validity

A

effectiveness of an instrument in predicting an individual’s performance on a specific criterion

86
Q

item discrimination

A

Proportion of the top quarter of total scorers who answered the item correctly minus the proportion of the bottom quarter who did
An item has good discrimination when high-scorers get it right and low-scorers get it wrong (positive item discrimination)
items with 0 and negative item discrimination are poor
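
A minimal sketch of the upper-minus-lower comparison described above, using the top and bottom quarters of total scores. The function name, the parallel-list data layout, and the sample data are illustrative assumptions.

```python
def item_discrimination(total_scores, item_correct, fraction=0.25):
    """Proportion correct in the top group minus proportion correct in the bottom group.

    total_scores: total test score per examinee
    item_correct: 1 (or True) if that examinee got the item right, else 0
    fraction: share of examinees in each extreme group (0.25 = quarters)
    """
    n = len(total_scores)
    if n == 0 or n != len(item_correct):
        raise ValueError("inputs must be non-empty and the same length")
    group_size = max(1, int(n * fraction))
    # Examinee indices ordered from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    low, high = order[:group_size], order[-group_size:]
    p_high = sum(item_correct[i] for i in high) / group_size
    p_low = sum(item_correct[i] for i in low) / group_size
    return p_high - p_low


# Hypothetical data for eight examinees.
totals = [10, 20, 30, 40, 50, 60, 70, 80]
correct = [0, 0, 1, 0, 1, 1, 1, 1]
print(item_discrimination(totals, correct))  # 1.0
```

A positive result means high scorers outperform low scorers on the item. Zero or negative values flag a poor item.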

87
Q

classical test theory

A

observed score = true score + error

88
Q

item response theory

A

applies mathematical models to the data collected from assessments to evaluate how well individual items work
AKA modern test theory

89
Q

construct-based validity model

A

AKA unified construct model
validity is a holistic construct; it does not have separate components, as classical test theory holds (e.g., the 3: content, criterion, and construct validity)

90
Q

what are the 3 types of test theory

A
classical
item response (AKA modern)
construct-based validity model
91
Q

criterion-referenced assessment

A

provide info about a person’s score by comparing it to a predetermined standard or set criterion
e.g., A = 90-100; B = 80-89, and so on
The NCE (National Counselor Examination) and CPCE (Counselor Preparation Comprehensive Examination) are criterion-referenced assessments
as opposed to norm-referenced tests which make meaning by comparing a person’s score to the norm group

92
Q

achievement vs. aptitude tests

A

achievement - what one has learned at the time of testing

aptitude - what a person is capable of learning (GRE, SAT)

93
Q

ASVAB

A

Armed Services Vocational Aptitude Battery

the most widely used multiple aptitude test in the world. Measures aptitude for military and civilian jobs

94
Q

Louis Thurstone

A

unlike Charles Spearman’s two-factor approach to intelligence (g, s - general and specific factors), Louis Thurstone identified 7 primary mental abilities

95
Q

Howard Gardner

A

theory of multiple intelligences (8 intelligences)

96
Q

Cattell-Horn-Carroll (CHC)

A

theory of cognitive abilities - the most empirically validated theoretical model of intelligence
intelligence is hierarchical and consists of 3 strata:
general intelligence “g”
broad cognitive abilities
narrow cognitive abilities

97
Q

high-stakes testing usually uses ___ assessment

A

criterion-referenced

98
Q

performance assessments

A

non-verbal form of assessment
client completes a task
good for foreign language speakers
ex: Draw-a-Man test; (Raymond) Cattell Culture Fair Intelligence Test; Test of Non-Verbal Intelligence (TONI)

99
Q

computer-adaptive testing

A

the computer adapts the test structure and items to the examinee’s ability level
ex: GRE

100
Q

the 3 main types of validity

A

content
criterion
construct

101
Q

____ is the most widely used intelligence test

A

Wechsler scales