Test Construction Flashcards

1
Q
An examiner administers and
scores the same test numerous
times without deviating from the
procedure in order to reduce
the possibility of measurement
error. This exemplifies what?
A

Standardization

2
Q
The scores of a representative
population sample, to which an
examiner compares an individual's
test scores, are referred to as
\_\_\_\_\_\_\_\_; while they allow for
comparisons on a person's
performance on different tests, they
do not provide the ultimate standard
of performance.
A

Norms

3
Q
A psychological test that is
regarded as \_\_\_\_\_\_\_\_ is
administered, scored, and
interpreted independently of
the subjective judgment of
the examiner.
A

Objective

4
Q
The SAT and GRE are
examples of \_\_\_\_\_\_\_\_ tests,
as they provide information
about a person's best possible
performance, while the MMPI-2
and PAI are \_\_\_\_\_\_\_\_ tests,
providing information about a
person's usual experience.
A

Maximum
performance;
typical
performance

5
Q

________ tests assess the difficulty level
an examinee can attain (e.g., Information
from WAIS), ________ tests assess the
person’s response rate (e.g., Digit
Symbol from WAIS), and ________ tests
help determine whether an individual can
attain a certain level of acceptable
performance (e.g., test of reading skills).

A

Power;
speed;
mastery

6
Q
A \_\_\_\_\_\_\_\_ occurs when an instrument
cannot take on a value higher than some
limit due to the measure not including
enough difficult items, resulting in all
high-achieving examinees getting similar
scores (test is too easy); conversely, a
\_\_\_\_\_\_\_\_ occurs when an instrument
cannot take on a value lower than some
limit, and thus all low-achieving
examinees get similar scores (test is too hard).
A

Ceiling
effect; floor
effect

7
Q
In contrast to normative
measures, these types of
measures require individuals to
use their own frame of
reference to compare 2 or more
desirable options and choose
the one that is most preferred.
A

Ipsative

measures

8
Q
\_\_\_\_\_\_\_\_ is the consistency of
a test, or the degree to which a
test provides the same results
under the same conditions;
\_\_\_\_\_\_\_\_ refers to the degree
that a test measures what it
claims to be measuring.
A

Reliability;

validity

9
Q

A perfectly reliable test would yield each
examinee's ________ every time it was
administered, as this score would indicate the
examinee's actual ability on whatever the
test is measuring; however, a test is
never perfectly reliable due to ________,
which is random and can be caused by
environmental noise, examinee’s mood
on testing day, and any other number of
factors.

A

True score;
measurement
error

10
Q

The most commonly used methods of estimating
reliability of a test use a correlation coefficient,
referred to as the ________, ranging in value
from 0.0 to +1.0, where coefficients closer to 0.0
indicate less reliability and values closer to +1.0
indicate increasing reliability; the coefficient is
not squared to determine the proportion of
variability, unlike other correlation coefficients,
rather it is interpreted directly.

A

Reliability

coefficient

11
Q
A researcher administers the same
instrument to the same group of
college students on 2 separate
occasions; following the second
administration, the researcher
correlates scores from the first and
second administrations. What type of
reliability is the researcher
attempting to obtain?
A

Test-retest
reliability (or
“coefficient of
stability”)

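Test-retest reliability is just the Pearson correlation between the two administrations; a minimal pure-Python sketch (the score lists are hypothetical illustration data):

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores from the first and second administrations
first = [10, 12, 14, 16, 18]
second = [11, 13, 13, 17, 19]
r_test_retest = pearson_r(first, second)  # coefficient of stability
```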
12
Q
TRUE or FALSE: It is not
recommended to use the
test-retest coefficient when
attempting to obtain
reliability for a test that
measures attributes that
are unstable (e.g., mood).
A
TRUE: Low coefficients, in
such cases, would likely
be more a reflection of the
attribute's unreliability
rather than the test's
unreliability
13
Q
A researcher administers one
form of a test on one day, then
administers an equivalent form
to the same group of people at
a later date/time. What type of
reliability is being sought in this
example?
A

Alternate forms
reliability (or “coefficient
of equivalence;”
parallel-forms reliability)

14
Q

When correlations are obtained among individual
test items, ________ reliability is being
assessed; the 3 methods for obtaining this
reliability include ________ (involves dividing
test into 2 parts then correlating responses from
the 2 parts), ________ (used when test items are
dichotomously scored, e.g., “true/false”), and
________ (used for tests with multiple scored
response options, e.g., “never/rarely/sometimes/always”).

A
Internal consistency (or
"coefficient of internal
consistency"); split-half;
Kuder-Richardson
Formula 20; Cronbach's
coefficient alpha
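Cronbach's coefficient alpha follows the standard formula alpha = (k / (k - 1)) * (1 - sum(item variances) / variance(total scores)); a short sketch with hypothetical responses:

```python
def cronbach_alpha(items):
    """Cronbach's coefficient alpha. `items` is a list of per-item
    score lists, one entry per examinee in each."""
    k = len(items)      # number of items
    n = len(items[0])   # number of examinees

    def var(xs):        # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    totals = [sum(item[i] for item in items) for i in range(n)]
    return (k / (k - 1)) * (1 - sum(var(it) for it in items) / var(totals))

# Hypothetical 3-item scale scored 0-3, four examinees
responses = [
    [2, 3, 1, 0],  # item 1
    [2, 2, 1, 1],  # item 2
    [3, 3, 0, 1],  # item 3
]
alpha = cronbach_alpha(responses)
```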
15
Q
Because splitting a test in half shortens it,
the split-half method usually lowers the
reliability coefficient
artificially; the \_\_\_\_\_\_\_\_ can
be used to correct for the
effects of shortening the
measure.
A

Spearman-Brown

prediction formula

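The Spearman-Brown prediction formula is r' = n*r / (1 + (n - 1)*r); with n = 2 it corrects a split-half coefficient back up to full-test length. A quick sketch:

```python
def spearman_brown(r, n=2.0):
    """Predicted reliability when a test is lengthened by a factor of n;
    n = 2 corrects a split-half coefficient to full-test length."""
    return (n * r) / (1 + (n - 1) * r)

corrected = spearman_brown(0.70)  # hypothetical split-half r of .70
```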
16
Q
Measures of internal
consistency are not
good at assessing
reliability for
\_\_\_\_\_\_\_\_ tests.
A

Speed tests, as the
correlation would
be spuriously
inflated

17
Q
Instruments that rely on
rater judgments would be
best to have high
\_\_\_\_\_\_\_\_ reliability, which
is increased when scoring
categories are \_\_\_\_\_\_\_\_
and \_\_\_\_\_\_\_\_.
A
Inter-rater (interscorer);
mutually exclusive (a
particular behavior belongs to
a single category); exhaustive
(categories cover all possible
responses/behaviors)
18
Q
The \_\_\_\_\_\_\_\_ estimates the
amount of error to be expected
in an individual test score and
is used to determine a range,
referred to as a/an \_\_\_\_\_\_\_\_,
within which an examinee's true
score will likely fall.
A

Standard Error of
Measurement;
confidence
interval

19
Q

What is the
formula for the
standard error of
the measurement?

A
SEM = SDx√(1 - rxx)
(SDx = standard deviation
of test scores; rxx
= reliability
coefficient)
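The SEM formula can be computed directly; a sketch using a hypothetical SD of 15 and reliability of .91:

```python
import math

def standard_error_of_measurement(sd, r_xx):
    """SEM = SDx * sqrt(1 - rxx)."""
    return sd * math.sqrt(1 - r_xx)

# Hypothetical deviation-IQ test: SD = 15, reliability = .91
sem = standard_error_of_measurement(15, 0.91)
```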
20
Q
What is the probability that a
person's true score lies within a
range of plus or minus 1
standard error of measurement
(SEM) of their obtained score?
How about plus or minus 1.96
(2) SEM? And finally, plus or
minus 2.58 (2.5) SEM?
A

68% of the
time; 95% of
the time; 99%
of the time

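Combining the SEM with a z multiplier gives the confidence interval: obtained score ± z * SEM, with z = 1.0, 1.96, or 2.58 for roughly 68%, 95%, and 99%. A sketch with a hypothetical IQ-style score:

```python
import math

def true_score_interval(obtained, sd, r_xx, z=1.96):
    """obtained +/- z * SEM; z = 1.0 -> ~68%, 1.96 -> ~95%, 2.58 -> ~99%."""
    margin = z * sd * math.sqrt(1 - r_xx)
    return obtained - margin, obtained + margin

# Hypothetical obtained score of 100, SD = 15, rxx = .91 (SEM = 4.5)
low, high = true_score_interval(100, 15, 0.91)
```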
21
Q
TRUE or FALSE:
Hypothetically, a test
with a reliability
coefficient of +1.0 would
have a standard error of
measurement of 0.0.
A

TRUE: A test
with perfect
reliability will
have no error

22
Q
The standard error of
measurement is \_\_\_\_\_\_\_\_
related to the reliability
coefficient (rxx) and
\_\_\_\_\_\_\_\_ related to the
standard deviation of test
scores (SDx).
A

Inversely;

positively

23
Q

What reliability
coefficient, when
practical, is the
best to use?

A

Alternate-forms

24
Q
Classical test
theory states that
an observed score
reflects \_\_\_\_\_\_\_\_
plus \_\_\_\_\_\_\_\_.
A

True score
variance;
random error
variance

25
Q
Methods of recording behaviors include
\_\_\_\_\_\_\_\_ recording (elapsed time that
behavior occurs is recorded), \_\_\_\_\_\_\_\_
recording (number of times behavior
occurs is recorded), \_\_\_\_\_\_\_\_ recording
(rater notes whether subject engages in
behavior during given time period), and
\_\_\_\_\_\_\_\_ recording (all behavior during
an observation session is recorded).
A

Duration;
frequency;
interval;
continuous

26
Q
Simply put,
\_\_\_\_\_\_\_\_ refers to
the degree a test
measures what it
purports to measure.
A

Validity

27
Q
A depression scale that
only assesses the affective
aspects of depression but
fails to account for the
behavioral aspects would
be lacking what type of
validity?
A
Content validity, which
refers to the extent to
which test items
represent all facets of
the content area being
measured (e.g., EPPP)
28
Q
TRUE or FALSE: Content
validity assessment
requires a degree of
agreement between
experts in the subject
matter, thus it includes an
element of subjectivity.
A
TRUE: Tests should
also correlate highly
with other tests that
measure the same
content domain
29
Q
In contrast to content validity,
\_\_\_\_\_\_\_\_ occurs when a test
appears valid to examinees,
administrators, and other
untrained observers; it is not
technically a type of test
validity.
A

Face

validity

30
Q
A personality test that
effectively predicts the
future behavior of an
examinee demonstrates
what type of validity?
A
Criterion-related validity,
which is obtained by
correlating scores on a
predictor test to some
external criterion (e.g.,
academic achievement,
job performance)
31
Q
Criterion-related validity is assessed
using a/an \_\_\_\_\_\_\_\_ to determine the
relationship between the predictor and
the criterion; for interpretation this value
can be squared, producing the
"\_\_\_\_\_\_\_\_," which indicates the
proportion of variability in the criterion
that is explained by variability in the
predictor.
A

Correlation
coefficient;
coefficient of
determination

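For example, squaring a hypothetical validity coefficient of .50:

```python
# Hypothetical validity coefficient between predictor and criterion
r = 0.50
coefficient_of_determination = r ** 2  # proportion of criterion variance
                                       # explained by the predictor
```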
32
Q
The process of \_\_\_\_\_\_\_\_ validation
involves the predictor and the criterion
being collected at the same time,
providing information regarding a test's
usefulness for predicting a given current
behavior; \_\_\_\_\_\_\_\_ validation involves a
waiting period between collection of
predictor scores and criterion data,
providing information regarding a test's
usefulness for predicting future behavior.
A

Concurrent;

predictive

33
Q
When interpreting a
person's predicted score
on a given criterion
measure, the \_\_\_\_\_\_\_\_
will determine within what
range of scores their
actual score will likely fall.
A

Standard
Error of
Estimate

34
Q
The standard error of measurement
constructs a confidence interval
around an examinee's \_\_\_\_\_\_\_\_
score (using a reliability coefficient),
while the standard error of estimate
does the same for an examinee's
\_\_\_\_\_\_\_\_ score (using a validity
coefficient).
A

Obtained;

predicted

35
Q
Interviewees are given an aptitude test
(predictor) to predict work success
(criterion), with hiring contingent on
achieving a certain minimum score,
called a/an \_\_\_\_\_\_\_\_ score. The
manager then rates performance on work
tasks, an indication of success, and only
those who score above a certain
\_\_\_\_\_\_\_\_ are deemed successful.
A

Predictor
cutoff;
criterion cutoff

36
Q
Scoring above both the predictor and
criterion cutoff points produces
\_\_\_\_\_\_\_\_; scoring above the predictor
cutoff point but below the criterion cutoff
point produces \_\_\_\_\_\_\_\_; scoring below
the predictor cutoff point but above the
criterion cutoff point produces \_\_\_\_\_\_\_\_;
and scoring below both the predictor and
criterion cutoff points produces
\_\_\_\_\_\_\_\_.
A
True positives (valid
acceptances); false
positives (invalid
acceptances); false
negatives (invalid
rejections); true negatives
(valid rejections)
37
Q
Some factors contributing
to a low validity coefficient
include the validation
group being \_\_\_\_\_\_\_\_ or
the predictor and/or
criterion being \_\_\_\_\_\_\_\_.
A

Homogeneous;

unreliable

38
Q
When a test has a different
validity coefficient for one group
compared to another, the
variables affecting validity are
called \_\_\_\_\_\_\_\_ variables;
when this is the case, the test
is said to have \_\_\_\_\_\_\_\_.
A

Moderator;
differential
validity

39
Q
This is the process
whereby an already
validated test is
re-validated with a
different sample of people
than the original validation
sample.
A

Cross-validation

40
Q
What term is used to
describe the reduction
that occurs in a
criterion-related validity
coefficient after
cross-validation?
A

Shrinkage

41
Q
The greatest shrinkage occurs
when the original validation sample
is \_\_\_\_\_\_\_\_, the original item pool
is \_\_\_\_\_\_\_\_, the number of items
retained is \_\_\_\_\_\_\_\_ relative to the
items in the item pool, and/or items
are not chosen based on \_\_\_\_\_\_\_\_
or \_\_\_\_\_\_\_\_.
A
Small; large; small;
previously formulated
hypothesis;
experience with the
criterion
42
Q
\_\_\_\_\_\_\_\_ is one way a predictor might
end up looking more valid than it actually
is, which occurs when predictor scores
themselves influence any person's
criterion status (e.g., manager is aware
that factory worker did well on predictor,
this knowledge positively influences
manager's ratings on criterion
performance).
A

Criterion

contamination

43
Q

How is criterion
contamination
prevented?

A
Criterion raters
should have no
prior knowledge of
examinees'
predictor scores
44
Q
Theorized psychological variables
(e.g., personality, intelligence) that
are abstract and not directly
observable are referred to as
\_\_\_\_\_\_\_\_, hence \_\_\_\_\_\_\_\_
provides an indication of the degree
to which an instrument measures or
correlates with such variables.
A

Construct;
construct
validity

45
Q
A newly developed test of
personality has a high
correlation with the MMPI-2 and
a low correlation with the
Wechsler Memory Scale,
indicating the test has both
\_\_\_\_\_\_\_\_ validity and
\_\_\_\_\_\_\_\_ validity, respectively.
A

Convergent;
discriminant/divergent
- both are forms of
construct validity

46
Q
TRUE or FALSE: The only time
a low correlation coefficient
provides evidence of high
validity is when discriminant
validity is indicated due to there
being a low correlation between
2 tests that measure different
constructs.
A
TRUE: In all other
cases, high validity
is indicated by a
high correlation
coefficient
47
Q
What complex procedure for
assessing convergent and
discriminant validity requires
the assessment of 2 or more
traits (e.g., personality,
depression) by 2 or more
methods (e.g., self-report, peer
rating)?
A

Multitrait-multimethod

matrix

48
Q

When using the multitrait-multimethod
matrix, ________ validity is indicated
when tests that measure the same traits
are highly correlated, even when different
methods of measurement are used;
conversely, ________ validity is indicated
when tests that measure different
constructs are minimally correlated, even
when the same method of measurement is used.

A

Convergent;

discriminant

49
Q

The ________ coefficient is a reliability
coefficient, as it indicates the correlation of a
measure with itself; correlations between two
measures that measure the same trait using
different methods are called ________
coefficients; correlations between two measures
that measure different traits using the same
method are called ________ coefficients; and
correlations between 2 measures that measure
different traits using different methods are called
________ coefficients.

A

Monotrait-monomethod;
monotrait-heteromethod;
heterotrait-monomethod;
heterotrait-heteromethod

50
Q
When assessing validity using the
multitrait-multimethod matrix, convergent
validity is indicated when there is a high
\_\_\_\_\_\_\_\_ correlation, while discriminant
validity is indicated by a low \_\_\_\_\_\_\_\_
correlation and further confirmed by a
\_\_\_\_\_\_\_\_ heterotrait-heteromethod
correlation.
A

Monotrait-heteromethod;
heterotrait-monomethod;
low

51
Q
\_\_\_\_\_\_\_\_, often used to assess the
construct validity of a test or tests,
involves reducing a larger set of
variables into fewer classified sets
of variables based on the construct
that is primarily "picked-up" by each
measure; each variable is
correlated with every other variable,
creating a \_\_\_\_\_\_\_\_
A

Factor
analysis;
factor matrix

52
Q
The main purpose of factor analysis
is to reveal how many and to what
degree underlying constructs, also
called \_\_\_\_\_\_\_\_ due to the fact that
the analysis does not directly intend
to measure them, can account for
scores on a larger number of tests.
A

Latent

variables

53
Q
In a hypothetical factor analysis, the
factor matrix indicates a correlation
coefficient of .68 between the
depression subscale of the MMPI-2
and Factor II. What term is used to
describe the correlation between
the depression subscale and Factor
II?
A
Factor loading, which refers to
the correlation between a given
test and a given factor (e.g., the
depression subscale loads .68
on Factor II); it can be squared
to determine the proportion of
variability
54
Q
\_\_\_\_\_\_\_\_ determines
the proportion of
variance of a test that is
attributable to the
factors; it is the sum of
squared factor loadings.
A
Communality
(h-squared); note the
sum-of-squared-loadings
rule does not hold when
oblique rotation is used
55
Q
The amount of variability in a
test that can be explained by
whatever traits are represented
by the factors is referred to as
\_\_\_\_\_\_\_\_, while variance that
is specific to the test and not
explained by the factors is
referred to as \_\_\_\_\_\_\_\_.
A
Common variance
(represents
communality); unique
variance (represents
specificity)
56
Q
In a factor analysis, these values
indicate the amount of variance in
all the tests accounted for by the
factor; they are analyzed to
determine whether or not the factor
is accounting for a significant
amount of variability in the tests.
A

Eigenvalues
(or explained
variance)

57
Q
If a factor analysis is
performed on 8 tests,
what is the largest the
sum of the
eigenvalues can be?
A
Since the sum of the
eigenvalues can be no
larger than the number
of tests included in the
factor analysis, the
answer is 8
58
Q
A procedure that facilitates
factor matrix interpretation
is \_\_\_\_\_\_\_\_, which
involves re-dividing the
test's communalities so
that a clearer pattern of
loadings emerges.
A

Rotation

59
Q
Two general rotation strategies
include \_\_\_\_\_\_\_\_ for factors that
are uncorrelated (independent of
each other) and \_\_\_\_\_\_\_\_ for
correlated factors; the decision as
to which one is used is based on
the researcher's theoretical
assumptions.
A

Orthogonal;

oblique

60
Q
When construct validity is
being assessed using factor
analysis, a high correlation
between a test and a factor
the test is expected to
correlate highly with is
referred to as what?
A

Factorial

validity

61
Q
While factor analysis assumes
variance in a variable is
composed of \_\_\_\_\_\_\_\_,
\_\_\_\_\_\_\_\_, and \_\_\_\_\_\_\_\_,
principal components analysis
assumes variance is composed
of \_\_\_\_\_\_\_\_ and \_\_\_\_\_\_\_\_.
A
Communality;
specificity; error;
explained
variance; error
variance
62
Q
Factor is to factor
analysis as \_\_\_\_\_\_\_\_
or \_\_\_\_\_\_\_\_ is to
principal components
analysis.
A

Principal
component;
eigenvector

63
Q
What method might a
researcher who is
interested in developing a
taxonomy (classification
system) of different
personality characteristics
use?
A

Cluster

analysis

64
Q
In \_\_\_\_\_\_\_\_ analysis, only interval and
ratio data can be used and researchers
typically have an a priori hypothesis
about what traits a set of variables
measure; by contrast, \_\_\_\_\_\_\_\_ can be
performed using any type of data
(interval, ratio, nominal, ordinal) and is
not designed for studies where the
researcher has an a priori hypothesis.
A

Factor analysis;

cluster analysis

65
Q
TRUE or FALSE: A
reliable test is not
always a valid test,
though a valid test must
be a reliable test.
A
TRUE: Reliability is
a necessary but
not sufficient
condition for
validity
66
Q
The \_\_\_\_\_\_\_\_ coefficient
is less than or equal to the
square root of the
\_\_\_\_\_\_\_\_ coefficient; it
cannot be any higher, thus
the latter sets a \_\_\_\_\_\_\_\_
on the former.
A

Validity;
reliability;
ceiling (or
upper-limit)

67
Q
A researcher discovers a test
has low reliability; however, she
is interested in what the validity
coefficient of the predictor
would be if both the predictor
and the criterion were perfectly
reliable. What formula would
she use?
A

Correction
for
attenuation
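The standard correction-for-attenuation formula divides the observed validity by the square root of the product of the two reliabilities; a sketch with hypothetical coefficients:

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated validity if predictor and criterion were perfectly
    reliable: r_xy / sqrt(r_xx * r_yy)."""
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical: observed validity .42, predictor rxx = .70, criterion ryy = .80
corrected = correct_for_attenuation(0.42, 0.70, 0.80)
```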

68
Q
What is the correlation
between the factors in
a factor analysis
where an orthogonal
rotation is used?
A

By definition,
the correlation
would be 0.0

69
Q
What is used to determine
which test items will be
retained for the final
version of a test and to
ensure that a test is both
reliable and valid from the
start?
A

Item

analysis

70
Q

The ________
the p-value, the
________ the
item.

A

Higher (lower);
less difficult
(more difficult)

71
Q
The percentage of examinees
that answer an item correctly is
referred to as a/an \_\_\_\_\_\_\_\_,
which is abbreviated \_\_\_\_\_\_\_\_;
most test developers prefer
items with a \_\_\_\_\_\_\_\_ value at
or around \_\_\_\_\_\_\_\_.
A

Item difficulty
index; p; p;
.50
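Computing p is simply the proportion of examinees who answer the item correctly; a sketch with hypothetical 0/1 item scores:

```python
def item_difficulty(scores):
    """p = proportion of examinees answering the item correctly;
    `scores` is a list of 0/1 item scores."""
    return sum(scores) / len(scores)

p = item_difficulty([1, 1, 0, 1, 0, 0, 1, 1])  # hypothetical: 5 of 8 correct
```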

72
Q

The rule-of-thumb for item difficulty on a
test is that the optimal difficulty level of
test items should be approximately
halfway between 1.0 (i.e., everyone is
correct) and the level of success
expected by chance alone. That known,
what is the optimal item difficulty level of
a multiple choice test with 4 options (e.g.,
EPPP)?

A
p = .625, halfway
between 1.0 and the
.25 success rate
expected by chance
with 4 options
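The halfway rule can be written directly: chance level is 1 / (number of options), so a 4-option item gives (1.0 + .25) / 2 = .625:

```python
def optimal_difficulty(n_options):
    """Halfway between 1.0 and the chance success rate (1 / n_options)."""
    chance = 1.0 / n_options
    return (1.0 + chance) / 2

p_four_option = optimal_difficulty(4)  # multiple choice with 4 options
p_true_false = optimal_difficulty(2)   # true/false item
```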
73
Q
According to Anastasi, the
p-level expresses item difficulty
in terms of an \_\_\_\_\_\_\_\_ scale,
as conclusions cannot be made
about the differences in
difficulty between items, only
that certain items are
easier/harder than others.
A

Ordinal (difficulty
levels are rankings,
according to
Anastasi)

74
Q
The degree to which a test item
differentiates among test-takers
in terms of the behavior the test
is designed to measure is
called \_\_\_\_\_\_\_\_ and can be
assessed by calculating a/an
\_\_\_\_\_\_\_\_, which is abbreviated
as "\_\_\_\_\_\_\_\_."
A

Item
discrimination; item
discrimination
index; D
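One common way to compute D (an assumption here; texts vary in how the comparison groups are formed) is the proportion passing the item in a high-scoring group minus the proportion passing in a low-scoring group:

```python
def discrimination_index(p_upper, p_lower):
    """D = proportion passing the item in the high-scoring group minus
    the proportion passing in the low-scoring group."""
    return p_upper - p_lower

# Hypothetical: 80% of high scorers vs 30% of low scorers pass the item
D = discrimination_index(0.80, 0.30)
```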

75
Q
An item on a measure of
anxiety would have good
\_\_\_\_\_\_\_\_ if low-anxiety
examinees consistently
answered it differently than
high-anxiety examinees.
A

Discriminability
(item
discrimination)

76
Q
An item's \_\_\_\_\_\_\_\_ level
places a ceiling on its
\_\_\_\_\_\_\_\_ index; higher
levels of discriminability
are associated with
\_\_\_\_\_\_\_\_ levels of
difficulty.
A

Difficulty;
discrimination;
moderate

77
Q
TRUE or FALSE: The
reliability of a test will
decrease as the mean
discrimination index
(D) increases.
A
FALSE: There is a
direct correlation
between test
reliability and
mean D
78
Q
A graphical depiction of
both item difficulty and
item discrimination is
called a/an \_\_\_\_\_\_\_\_;
analysis based on
\_\_\_\_\_\_\_\_ is derived from
these.
A

Item
characteristic
curve (ICC); item
response theory

79
Q
What are the 2
technical properties of
an item characteristic
curve that are used to
describe it?
A

Item difficulty
and item
discrimination

80
Q
Item response theory assumes (1)
performance on an item is related
to the estimated amount of a/an
\_\_\_\_\_\_\_\_ being measured by the
item, and (2) \_\_\_\_\_\_\_\_ (an item
should have the same
characteristics regardless of the
sample of people taking the test).
A

Latent trait;
invariance of
item
parameters

81
Q
The computerized
selection of test
items for individual
examinees is
referred to as what?
A

Computer
adaptive
assessment (or
testing)

82
Q
What item difficulty
level is associated
with the maximum
level of differentiation
among examinees?
A

.50, indicating half
answered correctly
and half answered
incorrectly

83
Q

What factor
most affects an
item’s difficulty
level?

A

Characteristics

of examinees

84
Q
What type of
interpretation indicates
where the examinee
stands in relation to
others who have taken
the same test?
A

Norm-referenced

interpretation

85
Q
Providing a general
indication as to the
progression a person has
made along the normal
developmental path,
\_\_\_\_\_\_\_\_ norms include
\_\_\_\_\_\_\_\_ and \_\_\_\_\_\_\_\_.
A

Developmental;
mental age;
grade equivalent
scores

86
Q

What is the
calculation
for ratio IQ?

A

(mental
age/chronological
age) x 100

87
Q
A 20-year-old performs
as well on a test as the
average 10-year-old.
His mental age is
\_\_\_\_\_\_\_\_ and his ratio
IQ is \_\_\_\_\_\_\_\_.
A

10-years-old;

50
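The ratio-IQ calculation from this card as code:

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ = (mental age / chronological age) x 100."""
    return (mental_age / chronological_age) * 100

iq = ratio_iq(10, 20)  # the 20-year-old from this card
```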

88
Q
Indicating the grade level a
person's performance is
equivalent to, \_\_\_\_\_\_\_\_
are typically used in the
interpretation of
educational achievement
tests.
A
Grade equivalent
scores (e.g., Wide
Range
Achievement Test,
4th Ed [WRAT-4])
89
Q
TRUE or FALSE: When
using developmental
norms, scores obtained
by people of different
age groups are not
comparable.
A
TRUE: This is due
to the fact that
standard deviation
is not accounted
for
90
Q
Including percentile ranks
and standard scores,
\_\_\_\_\_\_\_\_ norms compare
examinee scores to those
of the most nearly
comparable
standardization sample.
A

Within-group

91
Q
Z-scores, t-scores, stanine
scores, and deviation IQ
scores are all examples of
\_\_\_\_\_\_\_\_, which express
a raw score's distance
from the mean in terms of
standard deviation.
A

Standard

scores

92
Q
Identify the mean (M)
and standard deviation
(sd) of: z-scores,
t-scores, stanine scores,
and deviation IQ scores.
A
Z-score (M = 0, sd = 1);
T-score (M = 50, sd =
10); Stanine (M = 5, sd
= about 2); Deviation IQ
(M = 100, sd = 15)
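Any raw score can be moved between these scales through its z-score; a sketch using the means and SDs listed in this card:

```python
def z_score(raw, mean, sd):
    """Distance from the mean in standard-deviation units."""
    return (raw - mean) / sd

def to_scale(z, new_mean, new_sd):
    """Re-express a z-score on another standard-score scale."""
    return new_mean + z * new_sd

z = z_score(115, 100, 15)            # hypothetical deviation IQ of 115 -> z = 1.0
t = to_scale(z, 50, 10)              # T-score
stanine = to_scale(z, 5, 2)          # stanine
deviation_iq = to_scale(z, 100, 15)  # back to deviation IQ
```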