Assessment & Testing Flashcards
Appraisal can be defined as
a. the process of assessing or estimating attributes.
b. testing which is always performed in a group setting.
c. testing which is always performed on a single individual.
d. a pencil and paper measurement of assessing attributes.
a. the process of assessing or estimating attributes.
A test can be defined as a systematic method of measuring a sample of behavior. Test format refers to the manner in which test items are presented. The format of an essay test is considered a(n) ________ format.
a. subjective
b. objective
c. very precise
d. concise
a. subjective
The National Counselor Exam (NCE) is a(n) ________ test because the scoring procedure is specific.
a. subjective
b. objective
c. projective
d. subtest
b. objective
A short answer test is a(n) ________ test.
a. objective
b. culture-free
c. forced choice
d. free choice
d. free choice
The ________ index indicates the percentage of individuals who answered each item correctly.
a. difficulty
b. critical
c. intelligence
d. personal
a. difficulty
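As a rough illustration of the difficulty index, the sketch below computes the proportion of test takers who answered each item correctly. The function name `item_difficulty` and the response data are made up for this example; each row is one person and each column one item, scored 1 (correct) or 0 (incorrect).

```python
def item_difficulty(responses):
    """Return, for each item, the proportion of people who answered it correctly.

    responses: list of rows (one per person), each row a list of 1/0 item scores.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_people for i in range(n_items)]


# Hypothetical data: 4 test takers, 3 items.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]

difficulty = item_difficulty(responses)
print(difficulty)  # item 1 answered correctly by everyone, item 3 by one person in four
```

Multiplying each value by 100 gives the percentage form used in the definition above.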
A test format could be normative or ipsative. In the normative format
a. each item depends on the item before it.
b. each item depends on the item after it.
c. the client must possess an IQ within the normal range.
d. each item is independent of all other items.
d. each item is independent of all other items.
A client who takes a normative test
a. cannot legitimately be compared to others who have taken the test.
b. can legitimately be compared to others who have taken the test.
c. could not have taken an IQ test.
d. could not have taken a personality test.
b. can legitimately be compared to others who have taken the test.
In an ipsative measure the person taking the test must compare items to one another. The result is that
a. an ipsative measure cannot be utilized for career guidance.
b. you cannot legitimately compare two or more people who have taken an ipsative test.
c. an ipsative measure is never a forced choice format.
d. an ipsative measure is never reliable.
b. you cannot legitimately compare two or more people who have taken an ipsative test.
Since the ipsative measure does not reveal absolute strengths, comparing one person’s score to another’s is relatively meaningless.
The person is measured in response to his or her own standard of behavior.
The ipsative measure points out the highs and lows that exist within a single individual.
Hence, when a colleague tells you that Mr. Johnson’s anxiety is improving, she has given you an ipsative description. This description, however, would not lend itself to comparing, say, Mr. Johnson’s anxiety to Mrs. McBee’s.
Tests are often classified as speed tests versus power tests. A timed typing test used to hire secretaries would be
a. a power test.
b. neither a speed test nor a power test.
c. a speed test.
d. a fine example of an ipsative measure.
c. a speed test.
An achievement test measures maximum performance or present level of skill. Tests of this nature are also called attainment tests, while a personality test or interest inventory measures
a. typical performance.
b. minimum performance.
c. unconscious traits.
d. self-esteem by always relying on a Q-Sort design.
a. typical performance.
In a spiral test
a. the items get progressively easier.
b. the difficulty of the items remains constant.
c. the client must answer each question in a specified period of time.
d. the items get progressively more difficult.
d. the items get progressively more difficult.
In a cyclical test
a. the items get progressively easier.
b. the difficulty of the items remains constant.
c. you have several sections which are spiral in nature.
d. the client must answer each question in a specified period of time.
c. you have several sections which are spiral in nature.
A test battery is considered
a. a horizontal test.
b. a vertical test.
c. a valid test.
d. a reliable test.
a. a horizontal test.
In a test battery, several measures are used to produce results that could be more accurate than those derived from merely using a single source. This can get confusing. Remember that in the section on group processes I talked about vertical and horizontal interventions.
In testing, a vertical test would have versions for various age brackets or levels of education (e.g., a math achievement test for preschoolers and a version for middle school children).
A horizontal test measures various factors (e.g., math and science) during the same testing procedure.
Which is more important, validity or reliability?
a. Reliability.
b. They are equally important.
c. Validity.
d. It depends on the test in question.
c. Validity.
Experts nearly always consider validity the number one factor in the construction of a test. A test must measure what it purports to measure.
In the field of testing, validity refers to
a. whether the test really measures what it purports to measure.
b. whether the same test gives consistent measurement.
c. the degree of cultural bias in a test.
d. the fact that numerous tests measure the same traits.
a. whether the test really measures what it purports to measure.
Which measure would yield the highest level of reliability?
a. A TAT, projective test popular with psychodynamic helpers.
b. The WAIS-IV, a popular IQ test.
c. The MMPI-2, a popular personality test.
d. A very accurate postage scale.
d. A very accurate postage scale.
In the real world, physical measurements are more reliable than psychological ones.
Construct validity refers to the extent that a test measures an abstract trait or psychological notion. An example would be
a. height.
b. weight.
c. ego strength.
d. the ability to name all men who have served as U.S. presidents.
c. ego strength.
Any trait you cannot “directly” measure or observe can be considered a construct.
Face validity refers to the extent that a test
a. looks or appears to measure the intended attribute.
b. measures a theoretical construct.
c. appears to be constructed in an artistic fashion.
d. can be compared to job performance.
a. looks or appears to measure the intended attribute.
A job test which predicted future performance on a job very well would
a. have high criterion/predictive validity.
b. have excellent face validity.
c. have excellent construct validity.
d. not have incremental validity or synthetic validity.
a. have high criterion/predictive validity.
A new IQ test which yielded results nearly identical to other standardized measures would be said to have
a. good concurrent validity.
b. good face validity.
c. superb internal consistency.
d. all of the above.
a. good concurrent validity.
Criterion validity could be “concurrent” or “predictive.” Concurrent validity answers the question of how well your test stacks up against a well-established instrument that measures the same behavior, construct, or trait.
Evidence for reliability and validity is expressed via correlation coefficients. Suffice it to say that the closer they are to 1.00, the better.
You also should be familiar with the terms convergent and discriminant validity. These terms relate to both criterion validity and construct validity.
The relationship or correlation of a test to an independent measure or trait is known as convergent validity.
Convergent validity is actually a method used to assess a test’s construct/criterion validity by correlating test scores with an outside source. Say, for example, that a measure purports to measure phobic responses.
A client who has a snake phobia is then exposed to a snake and experiences extreme panic. If the client scores higher on the test than he would in a relaxed state, then this would display convergent validity.
The test also should show discriminant validity. This means the test will not reflect unrelated variables. Hence, if phobias are unrelated to IQ, then when one correlates clients’ IQ scores to their scores on the test for phobias, this should produce a near zero correlation.
Similarly, if discriminant validity is evident, a counselor who is genuinely qualified to sit for a state licensing exam should score higher on the exam than a student who flunked an introductory counseling course.
When a researcher is engaged in test validation, both convergent and discriminant validity should be thoroughly examined.
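The correlation logic behind convergent and discriminant validity can be sketched as follows. All numbers are invented for illustration, and the variable names (`phobia_test`, `observed_panic`, `iq`) are hypothetical. The phobia test should correlate strongly with an outside measure of the same trait (convergent) and near zero with an unrelated variable such as IQ (discriminant).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five clients (not real data).
phobia_test = [1, 2, 3, 4, 5]        # scores on the phobia measure
observed_panic = [2, 3, 5, 6, 9]     # outside rating of panic during exposure
iq = [100, 110, 95, 110, 100]        # variable assumed unrelated to phobias

r_convergent = pearson_r(phobia_test, observed_panic)
r_discriminant = pearson_r(phobia_test, iq)

print(round(r_convergent, 2))    # high: evidence of convergent validity
print(round(r_discriminant, 2))  # near zero: evidence of discriminant validity
```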
A valid test is ________ reliable.
a. not always
b. always
c. never
d. 80%
b. always
One method of testing reliability is to give the same test to the same group of people two times and then correlate the scores. This is called
a. test–retest reliability.
b. equivalent forms reliability.
c. alternate forms reliability.
d. the split-half method.
a. test–retest reliability.
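A minimal sketch of the test–retest idea: the same people take the same test twice, and the two sets of scores are correlated. The scores below are invented, and `pearson_r` is a small helper written for this example rather than part of any particular testing package.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five people on two administrations.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 13, 14, 19, 21]

test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 2))  # a value close to 1.00 suggests consistent measurement
```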
One method of testing reliability is to give the same population alternate forms of the identical test. Each form will have the same psychometric/statistical properties as the original instrument. This is known as
a. test–retest reliability.
b. equivalent or alternate forms reliability.
c. the split-half method.
d. internal consistency.
b. equivalent or alternate forms reliability.
A counselor doing research decided to split a standardized test in half by using the even items as one test and the odd items as a second test and then correlating them. The counselor
a. used an invalid procedure to test reliability.
b. was testing reliability via the split-half correlation method.
c. was testing reliability via the equivalent forms method.
d. was testing reliability via the inter-rater method.
b. was testing reliability via the split-half correlation method.
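The odd/even split described in this item can be sketched as below. The item responses are made up, and the helper names are hypothetical. Each person gets one score from the odd-numbered items and one from the even-numbered items, and the two half-test scores are then correlated.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: 5 people, 6 items, scored 1 (correct) or 0 (incorrect).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

# Items 1, 3, 5 sit at list indices 0, 2, 4; items 2, 4, 6 at indices 1, 3, 5.
odd_scores = [sum(row[0::2]) for row in responses]
even_scores = [sum(row[1::2]) for row in responses]

split_half_r = pearson_r(odd_scores, even_scores)
print(odd_scores, even_scores, round(split_half_r, 2))
```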
Which method of reliability testing would be useful with an essay test but not with a test of algebra problems?
a. Test–retest.
b. Alternate forms.
c. Split-half.
d. Inter-rater/inter-observer.
d. Inter-rater/inter-observer.
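One simple way to picture inter-rater reliability is the proportion of essays on which two raters assign the same score. This is only one possible index, chosen here for illustration. The ratings and the function name `agreement_rate` are invented for this sketch.

```python
def agreement_rate(rater_a, rater_b):
    """Proportion of essays that two raters scored identically."""
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)


# Hypothetical scores given by two raters to the same five essays.
rater_a = [4, 3, 5, 2, 4]
rater_b = [4, 3, 4, 2, 4]

agreement = agreement_rate(rater_a, rater_b)
print(agreement)  # the raters disagree on one essay out of five
```

An algebra test with a fixed answer key leaves little room for raters to disagree, which is why this method matters mainly for subjectively scored formats such as essays.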
A reliability coefficient of 1.00 indicates
a. a lot of variance in the test.
b. a score with a high level of error.
c. a perfect score which has no error.
d. a typical correlation on most psychological and counseling tests.
c. a perfect score which has no error.