Testing Flashcards
ppl believe tests can see & tell everything about someone
protection of privacy
clients need to be informed of what will be shown (& not shown) from the test
need to know the nature of what is being done
informed consent
only give tests that are relevant to the case
issue of relevance
he created the OLSAT, w/c assesses the verbal & nonverbal reasoning abilities in students that are related to success in school
test is used for students in grades K-12
Arthur Otis
Otis-Lennon School Ability Test (OLSAT)
created the WAIS-III, WISC-IV, & WPPSI-III
developed a point scale & used subtests, guided by a focus on the global nature of the individual's intelligence
the overall IQ obtained from a Wechsler scale represents an index of general mental ability
David Wechsler
perceived intelligence as a form of biological adaptation to one's environment
cognitive development is divided into 4 major periods, each characterized by stages & substages
each stage represents a form of cognitive organization that is more complex than the preceding one
Jean Piaget
he criticizes psychometric testing as a way to predict what a person can do if asked to do it
he believes that the best approach is to study the way ppl do their work in order to find how they do it best
his position is that assessment should use competencies tied to job performance
David McClelland
this is a proportion obtained by comparing the performance of 2 subgroups of test takers
the 2 groups together can represent the entire set of test takers, or only a portion of the total group
discrimination index for a test
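A minimal sketch of how this index can be computed, assuming the 2 subgroups are the top and bottom scorers on the whole test. The function name `discrimination_index`, the `fraction` parameter, and the sample data are hypothetical, not from the cards.

```python
def discrimination_index(item_correct, total_scores, fraction=0.5):
    """Upper-lower index: proportion correct in the upper group
    minus proportion correct in the lower group.

    item_correct: 0/1 per test taker for one item
    total_scores: total test score per test taker, same order
    fraction: share of test takers put in each group (assumption)
    """
    n = len(total_scores)
    k = max(1, int(n * fraction))
    # Indices of test takers, ordered from lowest to highest total score
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower

# Hypothetical data: 8 test takers, one item
item = [1, 1, 1, 0, 1, 0, 0, 0]
totals = [95, 90, 85, 80, 60, 55, 50, 40]

d_all = discrimination_index(item, totals)          # 2 halves = entire set
d_tails = discrimination_index(item, totals, 0.25)  # top & bottom quarter only
```

With `fraction=0.5` the 2 groups cover everyone; with `0.25` they cover only a portion of the total group.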
the correlation b/w individuals' responses to a particular item on a test & their score on a criterion measure
validity coefficient for each item on a test
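As a rough illustration, an item's validity coefficient can be computed as a Pearson correlation b/w 0/1 item responses & criterion scores. The `pearson` helper and the data (e.g. a supervisor rating as the criterion) are assumptions made up for this sketch.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: 1 = answered the item correctly, 0 = incorrectly
item_responses = [1, 1, 0, 1, 0, 0]
# Hypothetical criterion measure for the same 6 ppl
criterion = [4.0, 3.5, 2.0, 3.0, 2.5, 1.5]

item_validity = pearson(item_responses, criterion)
```

A value near 0 would suggest the item tells little about the criterion.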
used to detect & remove items that fail to discriminate; doing so enhances the reliability & validity of the test
item analysis
appropriate for maximal performance tests (achievement & aptitude) b/c the analysis requires that test items be scored as correct/incorrect
most common measure = % of test takers who answer the item correctly
item difficulty
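The "% who answer correctly" measure can be sketched as below, assuming responses are stored as one row per test taker with 1 = correct & 0 = incorrect. The layout and the name `item_difficulty` are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of test takers answering each item correctly.

    responses: list of rows, one per test taker; each row holds 0/1 per item.
    """
    n = len(responses)
    # zip(*responses) turns rows (ppl) into columns (items)
    return [sum(col) / n for col in zip(*responses)]

# Hypothetical data: 4 test takers, 3 items
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]

difficulty = item_difficulty(responses)
```

Here the 1st item is answered correctly by everyone, so it cannot separate test takers at all.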
indicates the extent to w/c diff. types of ppl answer an item in diff. ways
it approaches each individual item as a separate measure of the characteristic being tested
the usefulness of an item depends on its ability to measure whatever the test as a whole is measuring
can be used to determine w/c items best measure the construct or content under study
item discrimination
in a multiple choice question, an incorrect alternative is called a distracter
evaluates the % of ppl selecting each incorrect alternative to determine if the distracters are useful
distracter power
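One way to tabulate this, as a sketch: count how often each incorrect alternative is chosen. The function name, the option labels, and the data are assumptions for illustration.

```python
from collections import Counter

def distracter_percentages(choices, options, correct):
    """% of test takers selecting each incorrect alternative.

    choices: the option each test taker picked
    options: all alternatives offered on the item
    correct: the keyed (correct) alternative
    """
    counts = Counter(choices)  # missing options count as 0
    n = len(choices)
    return {opt: 100 * counts[opt] / n for opt in options if opt != correct}

# Hypothetical data: 10 test takers, options A-D, B is correct
choices = ["B", "A", "B", "C", "B", "A", "B", "A", "B", "B"]
pct = distracter_percentages(choices, options="ABCD", correct="B")
```

In this made-up data nobody picks D, so D is doing no work as a distracter.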
a statistical procedure used to study the internal structure of a test; it uses the pattern of item correlations to identify the # of different factors or characteristics measured by the test
factor analysis
current lvls of skill & knowledge based on previous learning & experience
what you already have learned
achievement
ability to use existing skills & knowledge to learn new skills & knowledge
your potential for learning
aptitude
describes samples & makes inferences about population parameters; helps communicate info about test scores & assists in drawing conclusions about them
also determines the amount of error present in test scores
statistical assessment
describes the person
designed to measure attributes of the person, such as intelligence, motor skills, reading ability, adaptive behavior, etc.; psychological assessment produces a test score or rating that is said to reflect the attribute
clinical assessment
test now, then again later
test-retest reliability
all items are measuring the same things (homogeneous), all drawn from the same domain
internal consistency
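The card does not name a statistic. One common index of internal consistency is Cronbach's alpha, used here as an assumption; the data and function name are hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals).

    responses: list of rows, one per test taker; each row holds one score per item.
    """
    k = len(responses[0])  # number of items
    item_vars = [pvariance(col) for col in zip(*responses)]
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 test takers, 4 items scored 0/1
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)
```

Higher values suggest the items hang together (are more homogeneous).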
2 different versions of the same test
alternate form reliability
when you compare 1/2 of the test to the other 1/2 of the test
the logic is that if items are equally representative of a particular domain, ppl should perform equally well on them
split half reliability
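A minimal sketch of the comparison, assuming an odd/even split of the items (the card does not say how the halves are chosen). The helper names and data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_half(responses):
    """Correlate each person's score on odd-numbered items with even-numbered items."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, ...
    return pearson(odd, even)

# Hypothetical data: 5 test takers, 4 items scored 0/1
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

r_halves = split_half(responses)
```

With so few items per half the correlation is unstable; the data are only there to show the mechanics.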
a valid test is one that measures knowledge & characteristics that are appropriate for its purpose
is it measuring what it's supposed to?
validity
do the patterns of correlation w/ other measures make theoretical sense?
construct
test should correlate most highly w/ test of the same thing (highest correlation/sameness)
congruent
things that come together similar
2 tests should correlate at an intermediate lvl, should show correlation at the middle (middle correlation)
convergent
tests of separate things should correlate least (least correlation)
discriminant
uses correlation or linear regression to relate test scores to performance on another measure (the criterion); the criterion needs to be a more direct measure of the construct, not just another test (types: predictive, concurrent)
criterion-related
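The linear regression step can be sketched w/ ordinary least squares: fit criterion = intercept + slope * test score, then predict. All names and numbers below are made up for illustration.

```python
def fit_line(x, y):
    """Least-squares slope & intercept for predicting y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical data: test scores & a criterion measure (e.g. a performance rating)
test_scores = [1, 2, 3, 4, 5]
criterion = [2.0, 2.5, 3.5, 4.0, 5.0]

slope, intercept = fit_line(test_scores, criterion)
# Predicted criterion performance for a new test taker who scores 6
predicted = intercept + slope * 6
```

If the criterion is collected later, this is the predictive case; if it is collected at the same time as the test, the concurrent case.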
criterion occurs in future
predictive validity
criterion taken at the same time as the test
criterion in the here and now
concurrent validity
are all different aspects of construct adequately represented
content validity
face value: subjectively, does it appear to be measuring what it's supposed to?
face validity
ratio of…
(VAR of true scores) / (VAR of total scores)
expresses the degree of consistency in a measurement of test scores
ranges from 0.00 (absence of reliability) to 1.00 (perfect reliability)
reliability coefficient
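A toy illustration of the ratio, assuming total score = true score + error & that the true scores are known. In practice true scores can't be observed, so this only shows the arithmetic; the numbers are made up.

```python
from statistics import pvariance

# Hypothetical true scores & measurement errors for 4 test takers
true_scores = [10, 10, 20, 20]
errors = [-1, 1, -1, 1]  # chosen so error is uncorrelated w/ true score

# Total (observed) score = true score + error
total_scores = [t + e for t, e in zip(true_scores, errors)]

# Reliability = (VAR of true scores) / (VAR of total scores)
reliability = pvariance(true_scores) / pvariance(total_scores)
```

Here VAR(true) = 25 & VAR(total) = 26, so the coefficient is 25/26; more error variance would push it toward 0.00.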