Assessment And Test Flashcards by ASHLEIGH JOYNER

Appraisal can be defined as

a. the process of assessing or estimating attributes.
b. testing which is always performed in a group setting.
c. testing which is always performed on a single individual.
d. a pencil and paper measurement of assessing attributes.

The process of assessing or estimating attributes.

Appraisal is a broad term which includes more than merely “testing clients.” Appraisal could include a survey, observations, or even clinical interviews.

A test can be defined as a systematic method of measuring a sample of behavior. Test format refers to the manner in which test items are presented. The format of an essay test is considered a(n) ________ format.

a. subjective
b. objective
c. very precise
d. concise

Subjective.

A “subjective” paradigm relies mainly on the scorer’s opinion. If the rater knows the test taker’s attributes, the rater’s “personal bias” can significantly affect the rating. For example, an attractive examinee might be given a higher rating. (This is the so-called halo effect.)

The National Counselor Exam (NCE) is a(n) ________ test because the scoring procedure is specific.

a. subjective
b. objective
c. projective
d. subtest

Objective.

A short answer test is a(n) ________ test.

a. objective
b. culture-free
c. forced choice
d. free choice

Free Choice.

Some exams will call this a “free response” format. In any case, the salient point is that the person taking the test can respond in any manner he or she chooses. Although free choice response patterns can yield more information, they often take more time to score and increase subjectivity (i.e., there is more than one correct answer).

The NCE and the CPCE would be examples of a(n) ________ test.

a. free choice
b. forced choice
c. projective
d. intelligence

Forced Choice.

“Forced choice” items are sometimes known as “recognition items.” This book is composed of forced choice/recognition items. On some tests this format is used to control for the “social desirability phenomenon,” which asserts that the person puts the answer he or she feels is socially acceptable (i.e., the test provides alternatives that are all equal in terms of social desirability). The MMPI-2 (Minnesota Multiphasic Personality Inventory), for example, uses forced choices to create a “lie scale” composed of human frailties we all possess. This scale, therefore, ferrets out those individuals who tried to make themselves look good (i.e., the way they believe they “should” be).

The ________ index indicates the percentage of individuals who answered each item correctly.

a. difficulty
b. critical
c. intelligence
d. personal

Difficulty.

The higher the number of people who answer a question correctly, the easier the item is—and vice versa. A 0.5 difficulty index (also called a difficulty value) would suggest that 50% of those tested answered the question correctly, while 50% did not. Most theorists agree that a “good measure” provides a wide range of items that even a poor performer will answer correctly.
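
As a rough illustration of the arithmetic, the difficulty index of one item is simply the proportion of examinees who answered it correctly. The sketch below assumes responses are stored as a list of True/False values; the function name and the sample data are hypothetical, not taken from any test manual.

```python
def difficulty_index(responses):
    """Return the proportion of examinees who answered an item correctly."""
    if not responses:
        raise ValueError("need at least one response")
    correct = sum(1 for answered_correctly in responses if answered_correctly)
    return correct / len(responses)


# Hypothetical item: 10 examinees, 5 of whom answered correctly.
item = [True, False, True, True, False, False, True, False, True, False]
print(difficulty_index(item))  # 0.5
```

A value near 1.0 marks an easy item and a value near 0.0 marks a hard one.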

Short answer tests and projective measures utilize free response items. The NCE and the CPCE use forced choice or so-called ________ items.

a. vague
b. subjective
c. recognition
d. numerical

Recognition.

Recognition items give the examinee two or more alternatives.

A true/false test has ________ recognition items.

a. similar
b. free choice
c. dichotomous
d. no

Dichotomous.

“Dichotomy” simply means that you are presented with two opposing choices. This explains why choice “a” is definitely incorrect. When a test gives the person taking the exam three or more forced choices (e.g., the NCE, the CPCE, or this book) then psychometricians call it a “multipoint item.”

A test format could be normative or ipsative. In the normative format

a. each item depends on the item before it.
b. each item depends on the item after it.
c. the client must possess an IQ within the normal range.
d. each item is independent of all other items.

Each item is independent of all other items.

Ipsative measures compare traits within the same individual; they do not compare a person to other persons who took the instrument. The Kuder Career Planning instruments are often cited as falling into this category. The ipsative measure allows the person being tested to compare items.

A client who takes a normative test

a. cannot legitimately be compared to others who have taken the test.
b. can legitimately be compared to others who have taken the test.
c. could not have taken an IQ test.
d. could not have taken a personality test.

Can legitimately be compared to others who have taken the test.

In an ipsative measure the person taking the test must compare items to one another. The result is that

a. an ipsative measure cannot be utilized for career guidance.
b. you cannot legitimately compare two or more people who have taken an ipsative test.
c. an ipsative measure is never a forced choice format.
d. an ipsative measure is never reliable.

You cannot legitimately compare two or more people who have taken an ipsative test.

Since the ipsative measure does not reveal absolute strengths, comparing one person’s score to another is relatively meaningless.

Tests are often classified as speed tests versus power tests. A timed typing test used to hire secretaries would be

a. a power test.
b. neither a speed test nor a power test.
c. a speed test.
d. a fine example of an ipsative measure.

A speed test.

In terms of difficulty, a speed test is really intended to be fairly easy. The difficulty is induced by time limitations, not the difficulty of the tasks or the questions themselves.

A counseling test consists of 300 forced response items. The person taking the test can take as long as he or she wants to answer the questions.

a. This is most likely a projective measure.
b. This is most likely a speed test.
c. This is most likely a power test.
d. This is most likely an invalid measure.

This is most likely a power test.

An achievement test measures maximum performance or present level of skill. Tests of this nature are also called attainment tests, while a personality test or interest inventory measures

a. typical performance.
b. minimum performance.
c. unconscious traits.
d. self-esteem by always relying on a Q-Sort design.

Typical performance.

In a spiral test

a. the items get progressively easier.
b. the difficulty of the items remains constant.
c. the client must answer each question in a specified period of time.
d. the items get progressively more difficult.

The items get progressively more difficult.

Just remember that a spiral staircase seems to get more difficult to climb as you walk up higher.

In a cyclical test

a. the items get progressively easier.
b. the difficulty of the items remains constant.
c. you have several sections which are spiral in nature.
d. the client must answer each question in a specified period of time.

You have several sections which are spiral in nature.

In each section the questions would go from easy ones to those which are more difficult.

A test battery is considered

a. a horizontal test.
b. a vertical test.
c. a valid test.
d. a reliable test.

A horizontal test.

In a test battery, several measures are used to produce results that could be more accurate than those derived from merely using a single source.

In a counseling research study, two groups of subjects took a test with the same name. However, when they talked with each other they discovered that the questions were different. The researcher assured both groups that they were given the same test. How is this possible?

a. The researcher is not telling the truth. The groups could not possibly have taken the same test.
b. The test was horizontal.
c. The test was not a power test.
d. The researcher gave parallel forms of the same test.

The researcher gave parallel forms of the same test.

When a test has two versions or forms that are interchangeable they are termed parallel forms or equivalent forms of the same test. From a statistical/psychometric standpoint each form must have the same mean, standard error, and other statistical components.

The most critical factors in test selection are

a. the length of the test and the number of people who took the test in the norming process.
b. horizontal versus vertical.
c. validity and reliability.
d. spiral versus cyclical format.

Validity and reliability.

Validity refers to whether the test measures what it says it measures, while reliability tells how consistently a test measures an attribute.

Which is more important, validity or reliability?

a. Reliability.
b. They are equally important.
c. Validity.
d. It depends on the test in question.

Validity.

Experts nearly always consider validity the number one factor in the construction of a test. A test must measure what it purports to measure.

In the field of testing, validity refers to

a. whether the test really measures what it purports to measure.
b. whether the same test gives consistent measurement.
c. the degree of cultural bias in a test.
d. the fact that numerous tests measure the same traits.

Whether the test really measures what it purports to measure.

A counselor peruses a testing catalog in search of a test which will repeatedly give consistent results. The counselor

a. is interested in reliability.
b. is interested in validity.
c. is looking for information which is not available.
d. is magnifying an unimportant issue.

Is interested in reliability.

Which measure would yield the highest level of reliability?

a. The TAT, a projective test popular with psychodynamic helpers.
b. The WAIS-IV, a popular IQ test.
c. The MMPI-2, a popular personality test.
d. A very accurate postage scale.

A very accurate postage scale.

In the real world physical measurements are more reliable than psychological ones.

Construct validity refers to the extent that a test measures an abstract trait or psychological notion. An example would be

a. height.
b. weight.
c. ego strength.
d. the ability to name all men who have served as U.S. presidents.

Ego Strength.

Any trait you cannot “directly” measure or observe can be considered a construct.

Face validity refers to the extent that a test

a. looks or appears to measure the intended attribute.
b. measures a theoretical construct.
c. appears to be constructed in an artistic fashion.
d. can be compared to job performance.

Looks or appears to measure the intended attribute.

Face validity, like a person’s face, merely tells you whether the test looks like it measures the intended trait.

A job test which predicted future performance on a job very well would

a. have high criterion/predictive validity.
b. have excellent face validity.
c. have excellent construct validity.
d. not have incremental validity or synthetic validity.

Have high criterion/predictive validity.

A new IQ test which yielded results nearly identical to other standardized measures would be said to have

a. good concurrent validity.
b. good face validity.
c. superb internal consistency.
d. all of the above.

Good concurrent validity.

Concurrent validity answers the question of how well your test stacks up against a well-established instrument that measures the same behavior, construct, or trait.

When a counselor tells a client that the Graduate Record Examination (GRE) will predict her ability to handle graduate work, the counselor is referring to

a. good concurrent validity.
b. construct validity.
c. face validity.
d. predictive validity.

Predictive validity.

The Graduate Record Examination (GRE), the Scholastic Aptitude Test (SAT), the American College Test (ACT), and public opinion polls are effective only if they have high predictive validity, which is the power to accurately describe future behavior or events. Again, the subtypes of criterion validity are concurrent and predictive.

A reliable test is ________ valid.

a. always
b. 90%
c. not always
d. 80%

Not always.

A valid test is ________ reliable.

a. not always
b. always
c. never
d. 80%

Always.

One method of testing reliability is to give the same test to the same group of people two times and then correlate the scores. This is called

a. test–retest reliability.
b. equivalent forms reliability.
c. alternate forms reliability.
d. the split-half method.

Test–retest reliability.

One method of testing reliability is to give the same population alternate forms of the identical test. Each form will have the same psychometric/statistical properties as the original instrument. This is known as

a. test–retest reliability.
b. equivalent or alternate forms reliability.
c. the split-half method.
d. internal consistency.

Equivalent or alternate forms reliability.

A counselor doing research decided to split a standardized test in half by using the even items as one test and the odd items as a second test and then correlating them. The counselor

a. used an invalid procedure to test reliability.
b. was testing reliability via the split-half correlation method.
c. was testing reliability via the equivalent forms method.
d. was testing reliability via the inter-rater method.

Was testing reliability via the split-half correlation method.
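
The odd/even procedure described in this card can be sketched in a few lines. This is only an illustration under stated assumptions: each examinee's answers are stored as a list of 0/1 item scores, the function names are hypothetical, and the correlation is a plain Pearson r computed by hand rather than with any particular statistics package.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores must vary to compute a correlation")
    return cov / (spread_x * spread_y)


def split_half_r(item_scores):
    """Correlate each examinee's odd-item total with the even-item total."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Hypothetical data: 4 examinees, 6 items each (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
print(round(split_half_r(scores), 2))
```

The result is the correlation between the two half-length tests, not yet an estimate for the full-length instrument.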

Which method of reliability testing would be useful with an essay test but not with a test of algebra problems?

a. Test–retest.
b. Alternate forms.
c. Split-half.
d. Inter-rater/inter-observer.

Inter-rater/inter-observer.

This method is also called “scorer reliability” and is utilized with subjective tests such as projectives to ascertain whether the scoring criteria are such that two persons who grade or assess the responses will produce roughly the same score.

A reliability coefficient of 1.00 indicates

a. a lot of variance in the test.
b. a score with a high level of error.
c. a perfect score which has no error.
d. a typical correlation on most psychological and counseling tests.

A perfect score which has no error.

This generally occurs only in physical measurement.

An excellent psychological or counseling test would have a reliability coefficient of

a. 50.
b. .90.
c. 1.00.
d. –.90.

.90.

Ninety percent of the score measures the attribute in question, while 10% of the score is indicative of error.

A researcher working with a personality test discovers that the test has a reliability coefficient of .70, which is somewhat typical. This indicates that

a. 70% of the score is accurate while 30% is inaccurate.
b. 30% of the people who are tested will receive accurate scores.
c. 70% of the people who are tested will receive accurate scores.
d. 30% of the score is accurate while 70% is inaccurate.

70% of the score is accurate while 30% is inaccurate.

Seventy percent of the obtained score on the test represented the true score on the personality attribute, while 30% of the obtained score could be accounted for by error. Seventy percent is true variance while 30% constitutes error variance.

A career counselor is using a test for job selection purposes. An acceptable reliability coefficient would be ________ or higher.

a. .20
b. .55
c. .80
d. .70

.80.

Although .70 is generally acceptable for most psychological attributes, for admissions for jobs, schools, and so on, it should be at least .80, and some experts will not settle for less than .90.

The same test is given to the same group of people using the test–retest reliability method. The correlation between the first and second administration is .70. The true variance (i.e., the percentage of shared variance or the level of the same thing measured in both) is

a. 70%.
b. 100%.
c. 50%.
d. 49%.

49%.

Here’s the key to simplifying a question such as this. To demonstrate the variance of one factor accounted for by another you merely square the correlation (i.e., reliability coefficient). So .70 × .70 = .49 and .49 × 100 = 49%. Your exam could refer to this principle as the coefficient of determination.
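
The squaring step in this explanation is easy to check by hand or in code. The short sketch below just reproduces that arithmetic; the function name is a hypothetical label for "square the correlation."

```python
def coefficient_of_determination(r):
    """Square a correlation to get the proportion of shared variance."""
    return r * r


shared = coefficient_of_determination(0.70)
print(f"{shared * 100:.0f}%")  # 49%
```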

IQ means

a. a query of intelligence.
b. indication of intelligence.
c. intelligence quotient.
d. intelligence questions for test construction.

Intelligence quotient.

________ did research and concluded that intelligence was normally distributed like height or weight and that it was primarily genetic.

a. Spearman
b. Guilford
c. Williamson
d. Galton

Galton.

Francis Galton felt intelligence was a single or so-called unitary factor.

Francis Galton felt intelligence was

a. a unitary faculty.
b. best explained via a two factor theory.
c. best explained via the person’s environment.
d. fluid and crystallized in nature.

A unitary faculty.

J. P. Guilford isolated 120 factors which added up to intelligence. He also is remembered for his

a. thoughts on convergent and divergent thinking.
b. work on cognitive therapy.
c. work on behavior therapy.
d. work to create the first standardized IQ test.

Thoughts on convergent and divergent thinking.

Using factor analysis, Guilford determined that there were 120 elements/abilities which added up to intelligence. Two of the dimensions, convergent and divergent thinking, are still popular terms today. Convergent thinking occurs when divergent thoughts and ideas are combined into a singular concept. Divergent thinking is the ability to generate a novel idea.

A counselor is told by his supervisor to measure the internal consistency reliability (i.e., homogeneity) of a test but not to divide the test in halves. The counselor would need to utilize

a. the split-half method.
b. the test–retest method.
c. the Kuder–Richardson coefficients of equivalence.
d. cross-validation.

The Kuder–Richardson coefficients of equivalence.
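
For readers who want to see what a Kuder–Richardson calculation looks like, here is a minimal sketch of the KR-20 formula, KR-20 = (k / (k - 1)) × (1 - Σpq / variance of total scores), where k is the number of items, p is the proportion passing an item, and q = 1 - p. It assumes dichotomously scored (0/1) items and uses the population variance of total scores; the function name and sample data are hypothetical.

```python
def kr20(item_scores):
    """KR-20 internal consistency for rows of 0/1 item scores (one row per examinee)."""
    n_people = len(item_scores)
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")

    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_people
    # Population variance of total scores (an assumption of this sketch).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people
    if var_total == 0:
        raise ValueError("total scores must vary")

    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_people  # proportion correct
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: 4 examinees, 5 items each.
answers = [
    [1, 1, 1, 1, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
print(round(kr20(answers), 2))
```

No halving is involved: every item contributes to the estimate, which is why this approach fits the supervisor's restriction.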

The first intelligence test was created by

a. David Wechsler.
b. J. P. Guilford.
c. Francis Galton.
d. Alfred Binet and Theodore Simon.

Alfred Binet and Theodore Simon.

Today, the Stanford–Binet IQ test is

a. a nonstandardized measure.
b. a standardized measure.
c. a projective measure.
d. b and c.

A standardized measure.

The Stanford–Binet is standardized because the scoring and administration procedures are formal and well delineated.

IQ stands for intelligence quotient, which is expressed by

a. CA/MA × 100.
b. CA/MA × 50.
c. MA/CA × 50.
d. MA/CA × 100.

MA/CA × 100.
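
The ratio formula in this card, mental age (MA) divided by chronological age (CA) times 100, can be worked through with a quick example. The function name and the ages used below are hypothetical.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# Hypothetical child: mental age 10, chronological age 8.
print(ratio_iq(10, 8))  # 125.0
```

When mental age equals chronological age the result is exactly 100.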

The Binet stressed age-related tasks. Utilizing this method, a 9-year-old task would be one which

a. only a 10-year-old child could answer.
b. only an 8-year-old child could answer.
c. 50% of the 9-year-olds could answer correctly.
d. 75% of the 9-year-olds could answer correctly.

50% of the 9-year-olds could answer correctly.

A 9-year-old task was defined as one in which one half of the 9-year-olds tested could answer successfully.

Simon and Binet pioneered the first IQ test around 1905. The test was created to

a. assess high school seniors in America.
b. assess U.S. military recruits.
c. discriminate children without an intellectual disability from children with an intellectual disability.
d. measure genius in the college population.

Discriminate children without an intellectual disability from children with an intellectual disability.

The Minister of Public Instruction for the Paris schools wanted a test to identify children with an intellectual disability so that they could be taught separately. The assumption was made that intelligence was basically the ability to understand school-related material.

Today the Stanford–Binet is used from age 2 to adulthood. The IQ formula has been replaced by the

a. SAS.
b. SUDS.
c. entropy.
d. KR-20 formula.

SAS.

SAS stands for “standard age score.”

Most experts would agree that the Wechsler IQ tests gained popularity, as the Binet

a. must be administered in a group.
b. favored the geriatric population.
c. didn’t seem to be the best test for adults.
d. was biased toward women.

Didn’t seem to be the best test for adults.

The best IQ test for a 22-year-old single male would be the

a. WPPSI-III.
b. WAIS-IV.
c. WISC-IV.
d. any computer-based IQ test.

WAIS-IV.

The WAIS-IV (Wechsler Adult Intelligence Scale) is intended for ages 16–90 years.

The best intelligence test for a sixth-grade girl would be the

a. WPPSI-IV.
b. WAIS-IV.
c. WISC-IV.
d. Merrill–Palmer.

WISC-IV.

The WISC-IV is recommended for children from ages 6–16 years and 11 months.

The best intelligence test for a kindergartner would be the

a. WPPSI-IV.
b. WAIS-IV.
c. WISC-IV.
d. Myers–Briggs Type Indicator.

WPPSI-IV.

The mean on the Wechsler and the Stanford–Binet Intelligence scales (SB5) is ________ and the standard deviation is ________.

a. 100; 100
b. 100; 15 Wechsler, 16 Stanford–Binet
c. 100; 20
d. 100; 1

100; 15 Wechsler, 16 Stanford–Binet.

IQs above 100 are above average and those shy of 100 are below average.

Group IQ tests like the Otis–Lennon, the Lorge–Thorndike, and the California Test of Mental Abilities are popular in school settings. The advantage is that

a. group tests are quicker to administer.
b. group tests are superior in terms of predicting school performance.
c. group tests always have a higher degree of reliability.
d. individual IQ tests are not appropriate for school children.

Group tests are quicker to administer.

The group IQ test movement began

a. in 1905.
b. with the work of Binet.
c. with the Army Alpha and Army Beta in World War I.
d. with Freudian psychoanalysis and the psychodynamic movement.

With the Army Alpha and Army Beta in World War I.

Note the word group.

In a culture-fair test

a. items are known to the subject regardless of his or her culture.
b. the test is not standardized.
c. culture-free items cannot be utilized.
d. African Americans generally score higher than whites.

Items are known to the subject regardless of his or her culture.

The culture-fair test attempts to expunge items which would be known only to an individual due to his or her background.

The black versus white IQ controversy was sparked mainly by a 1969 article written by ________.

a. John Ertl
b. Raymond B. Cattell
c. Arthur Jensen
d. Robert Williams

Arthur Jensen.

The MMPI-2 is

a. an IQ test.
b. a neurological test.
c. a projective personality test.
d. a standardized personality test.

A standardized personality test.

The original version of this instrument was created in 1940. The Minnesota Multiphasic Personality Inventory-2, the current version used since 1989, is known as a “self-report” personality inventory.

The word psychometric means

a. a form of measurement used by a neurologist.
b. any form of mental testing.
c. a mental trait which cannot be measured.
d. the test relies on a summated or linear rating scale.

Any form of mental testing.

Psychometrics literally refers to the branch of counseling or psychology which focuses on testing.

In a projective test the client is shown

a. something which is highly reinforcing.
b. something which is highly charged from an emotional standpoint.
c. a and b.
d. neutral stimuli.

Neutral stimuli.

The 16 PF reflects the work of

a. Raymond B. Cattell.
b. Carl Jung.
c. James McKeen Cattell.
d. Oscar K. Buros.

Raymond B. Cattell.

The 16 PF (16 Personality Factor Questionnaire), developed by Raymond B. Cattell, is suitable for persons age 16 and above and has been the subject of over 2,000 papers or other communications! The test measures key personality factors such as assertiveness, emotional maturity, and shrewdness. A couple can even decide that each party will take the 16 PF, and both an individual and joint profile will be compiled, which can be utilized for marital counseling.

The Myers–Briggs Type Indicator reflects the work of

a. Raymond B. Cattell.
b. Carl Jung.
c. William Glasser.
d. Oscar K. Buros.

Carl Jung.

The counselor who favors projective measures would most likely be a

a. Rogerian.
b. strict behaviorist.
c. TA therapist.
d. psychodynamic clinician.

Psychodynamic clinician.

Choices “a,” “b,” and “c” all reflect positions that do not rely heavily on the unconscious mind (especially the behaviorists, who believe that if you can’t directly measure the behavior, it is not meaningful). However, some theorists (e.g., Allport) would contend that even if it is true that unconscious impulses exist, they are not very important.

An aptitude test is to ________ as an achievement test is to ________.

a. what has been learned; potential
b. potential; what has been learned
c. profit from learning; potential
d. a measurement of current skills; potential

Potential; what has been learned.

Both the Rorschach and the Thematic Apperception Test (TAT) are projective tests. The Rorschach uses 10 inkblot cards while the TAT uses

a. a dozen inkblot cards.
b. verbal and performance IQ scales.
c. pictures.
d. incomplete sentences.

Pictures.

The TAT consists of 31 cards. The test, which is intended for ages 4 and beyond, uses up to 20 cards when administered to any given individual (i.e., 19 selected to fit the age and sex of the client, plus one blank card). The pictures on each card are intentionally ambiguous, and the client is asked to make up a story for each of them.

Test bias primarily results from

a. a test being normed solely on white middle-class clients.
b. the use of projective measures.
c. using whites to score the test.
d. using IQ rather than personality tests.

A test being normed solely on white middle-class clients.

A counselor who fears the client has an organic, neurological, or motoric difficulty would most likely use the

a. Bender Gestalt II.
b. Rorschach.
c. Minnesota Multiphasic Personality Inventory-2.
d. Thematic Apperception Test.

Bender Gestalt II.

The Bender Visual Motor Gestalt Test (named after psychiatrist Lauretta Bender) is actually an expressive projective measure, though first and foremost it is known for its ability to discern whether brain damage is evident. The test is suitable for age 4 years and beyond. The client is instructed to copy 16 geometric figures, which remain in view while he or she draws.

An interest inventory would be least valid when used with

a. a first-year college student majoring in philosophy.
b. a third-year college student majoring in physics.
c. an eighth-grade male with an IQ of 136.
d. a 46-year-old white male construction worker.

An eighth-grade male with an IQ of 136.

Interest inventories work best with individuals who are of high school age or above inasmuch as interests are not extremely stable prior to that time. Interests become quite stable around age 25.

One major criticism of interest inventories is that

a. they have far too many questions.
b. they are most appropriate for very young children.
c. they emphasize professional positions and minimize blue-collar jobs.
d. they favor jobs that will require a bachelor’s degree or higher.

They emphasize professional positions and minimize blue-collar jobs.

Interest inventories are positive in the sense that

a. they are reliable and not threatening to the test taker.
b. they are always graded by the test taker.
c. they require little or no reading skills.
d. they have high validity in nearly all age brackets.

They are reliable and not threatening to the test taker.

Generally, an interest inventory would be the least threatening variety of test.

A counselor who had an interest primarily in testing would most likely be a member of

a. HS-BCP.
b. AARC.
c. NASW.
d. ACES.

AARC.

The AARC (Association for Assessment and Research in Counseling) is one of 20 ACA divisions. Can you name the other choices?

The NCE is

a. an intelligence test.
b. an aptitude test.
c. a personality test.
d. an achievement test.

An achievement test.

The NCE is testing your knowledge and application of material in the counseling profession.

The ________ are examples of aptitude tests.

a. O*NET Ability Profiler and the MCAT
b. GZTS and the MMPI-2
c. CPI and the MMPI-2
d. Strong and the LSAT

O*NET Ability Profiler and the MCAT.

Exam Hint: School selection tests assess aptitude.

One problem with interest inventories is that the person often tries to answer the questions in a socially acceptable manner. Psychometricians call this response style phenomenon

a. standard error.
b. social desirability (the right way to feel in society).
c. cultural bias.
d. acquiescence.

Social desirability (the right way to feel in society).

An aptitude test predicts future behavior while an achievement test measures what you have mastered or learned. In the case of a test like the ________ the distinction is unclear.

a. Binet
b. Wechsler
c. GRE
d. Bender

GRE.

Your supervisor wants you to find a new personality test for your counseling agency. You should read

a. professional journals.
b. the Buros Mental Measurements Yearbook.
c. classic textbooks in the field as well as test materials produced by the testing company.
d. all of the above.

All of the above.

Moreover, it has been discovered that if the counselor involves the client in the process of test selection it will improve his or her cooperation in the counseling process.

The standard error of measurement tells you

a. how accurate or inaccurate a test score is.
b. what population responds best to the test.
c. something about social loafing.
d. the number of people used in norming the test.

How accurate or inaccurate a test score is.

A new IQ test has a standard error of measurement (SEM) of 3. Tom scores 106 on the test. If he takes the test a lot, we can predict that about 68% of the time

a. Tom will score between 100 and 103.
b. Tom will score between 100 and 106.
c. Tom will score between 103 and 109.
d. Tom will score higher than Betty who scored 139.

Tom will score between 103 and 109.

This is calculated simply by taking 106 – 3 = 103 and 106 + 3 = 109. Hint: Your exam could refer to this as the “68% confidence interval” (i.e., 103 to 109). Classical test theory suggests the formula X = T + E, where X is the obtained score, T is the true score, and E is the error. Hence, psychometricians know that if a client takes the same test over and over, random error (i.e., E in the formula) will cause the score to fluctuate.
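
The band in Tom's case is just the obtained score plus and minus one SEM. The sketch below reproduces that arithmetic; the function name and the optional `n_sems` parameter are hypothetical conveniences, not part of any published scoring procedure.

```python
def score_band(obtained, sem, n_sems=1):
    """Return (low, high): the obtained score plus or minus n_sems standard errors."""
    return obtained - n_sems * sem, obtained + n_sems * sem


# Tom's case from the card: obtained score 106, SEM of 3.
print(score_band(106, 3))  # (103, 109)
```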

A counselor created an achievement test with a reliability coefficient of .82. Many clients felt the test was too long. The counselor shortened the test but logically assumed that the reliability coefficient would now

a. be approximately .88.
b. remain at .82.
c. be at least 10 points higher or lower.
d. be lower than .82.

Be lower than .82.

Increasing a test’s length raises reliability. Shorten it and the antithesis occurs. Note: The Spearman–Brown formula is used to estimate the impact that lengthening or shortening a test will have on a test’s reliability coefficient.
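
The Spearman–Brown estimate mentioned in the note can be sketched as follows. The formula is r_new = (n × r) / (1 + (n - 1) × r), where n is the ratio of the new test length to the old one. The card does not say how much the test was shortened, so the halving used below (n = 0.5) is purely a hypothetical example, as is the function name.

```python
def spearman_brown(reliability, length_ratio):
    """Estimate reliability after changing test length by length_ratio (new / old)."""
    if length_ratio <= 0:
        raise ValueError("length_ratio must be positive")
    return (length_ratio * reliability) / (1 + (length_ratio - 1) * reliability)


# Hypothetical scenario: the .82 test is cut to half its original length.
shortened = spearman_brown(0.82, 0.5)
print(round(shortened, 2))  # 0.69
```

Passing a ratio above 1 (a longer test) moves the estimate in the opposite direction, which matches the card's point that length and reliability rise together.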

A counselor can utilize psychological tests to help secure a ________ diagnosis if third-party payments are necessary.

a. CPT
b. DSM or ICD
c. percentile
d. standard error

DSM or ICD.

Diagnosis is a medical term which asserts that you classify a disease based on symptomatology. CPT (Current Procedural Terminology) codes are used to let insurance companies, managed care firms, etc. know which service you provided, such as individual therapy or family therapy.

A colleague of yours invents a new projective test. Seventeen counselors rated the same client using the measure and came up with nearly identical assessments. This would indicate

a. high validity.
b. high reliability.
c. excellent norming studies.
d. culture fairness.

High reliability.

Counselors often shy away from self-reports since

a. clients often give inaccurate answers.
b. ACA ethics do not allow them.
c. clients need a very high IQ to understand them.
d. they are generally very lengthy.

Clients often give inaccurate answers.

In most instances, who would be the best qualified to give the Rorschach Inkblot Test?

a. A counselor with NCC after his or her name.
b. A clinical psychologist.
c. A D.O. psychiatrist.
d. A social worker with LCSW after his or her name.

A clinical psychologist.

Generally, a clinical psychologist would have the most training in projective measures while the social worker would have the least education regarding tests and measurements.

Your client, who is in an outpatient hospital program, is keeping a journal of irrational thoughts. This would be

a. an unethical practice based on NBCC ethical guidelines.
b. considered a standardized test.
c. an informal assessment technique.
d. an aptitude measure.

An informal assessment technique.

Self-reports, case notes, checklists, sociograms of groups, interviews, and professional staffings would also fall into the informal assessment category.

You are uncertain whether a test is intended for the population served by your not-for-profit agency. The best method of researching this dilemma would be to

a. contact a local APA clinical psychology graduate program.
b. e-mail the person who created the test.
c. read the test manual included with the test.
d. give the test to six or more clients at random.

Read the test manual included with the test.

The manual should specify the target population for the test in question.

Clients should know that

a. validity is more important than reliability.
b. projective tests favor psychodynamic theory.
c. face validity is not that important.
d. a test is merely a single source of data and not infallible.

A test is merely a single source of data and not infallible.

Although the first three choices are important to the counselor, the final statement should be explained to the client. An extremely high score, say on a mechanical aptitude test, does not automatically imply that the client will prosper as a mechanic.

One major testing trend is

a. computer-assisted testing and computer interpretations.
b. more paper and pencil measures.
c. to give school children more standardized tests.
d. to train pastoral counselors to do projective testing.

Computer-assisted testing and computer interpretations.

One future trend which seems contradictory is that some experts are pushing for

a. a greater reliance on tests while others want to rely on them less.
b. social workers to do most of the testing.
c. psychiatrists to do most of the testing.
d. counselors to ban all computer-assisted tests.

A greater reliance on tests while others want to rely on them less.

It seems we counselors just can’t agree on anything. Many counselors would like to see a greater emphasis in the future on tests which assess creative and motivational factors.

Most counselors would agree that

a. more preschool IQ testing is necessary.
b. teachers need to give more personality tests.
c. more public education is needed in the area of testing.
d. the testing mystique has been beneficial to the general public.

More public education is needed in the area of testing.

________ would be an informal method of appraisal.

a. IQ testing
b. Standardized personality testing
c. GRE scores
d. A checklist

A checklist.

The WAIS-IV is given to 100,000 individuals in the United States who are picked at random. A counselor would expect that

a. approximately 68% would score between 85 and 115.
b. approximately 68% would score between 70 and 130.
c. the mean IQ would be 112.
d. 50% of those tested would score 112 or above.

Approximately 68% would score between 85 and 115.
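A quick check of the 68% figure, offered as a sketch rather than part of the original card. It assumes IQ scores follow a normal distribution with a mean of 100 and a standard deviation of 15, the usual convention for Wechsler scales; those two numbers are not stated on the card itself.

```python
from statistics import NormalDist

# Assumed scale: mean 100, standard deviation 15 (not given on the card)
iq = NormalDist(mu=100, sigma=15)

# Share of scores within one SD of the mean (85 to 115)
share_85_115 = iq.cdf(115) - iq.cdf(85)

# Share of scores within two SDs of the mean (70 to 130)
share_70_130 = iq.cdf(130) - iq.cdf(70)

print(round(share_85_115, 3))  # 0.683
print(round(share_70_130, 3))  # 0.954
```

Under these assumptions, 85 to 115 covers about 68% of test takers, while 70 to 130 covers about 95%, which is why choice b overstates the range.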

A word association test would be an example of

a. a neuropsychological test.
b. a motoric test.
c. an achievement test.
d. a projective test.

A projective test.

Infant IQ tests are

a. more reliable than those given later in life.
b. more unreliable than those given later in life.
c. not related to learning experiences.
d. never used.

More unreliable than those given later in life.

These “toddler tests” are sometimes capable of picking up gross abnormalities such as severe intellectual disabilities.

A good practice for counselors is to

a. always test the client yourself rather than referring the client for testing.
b. never generalize on the basis of a single test score.
c. stay away from culture-free tests.
d. stay away from scoring the test yourself.

Never generalize on the basis of a single test score.

You want to admit only 25% of all counselors to an advanced training program in psychodynamic group therapy. The item difficulty on the entrance exam for applicants would be best set at

a. 0.0.
b. .5 regardless of the admission requirement.
c. 1.0.
d. .25.

.25.

According to Public Law 93–380, also known as the Buckley Amendment, a 19-year-old college student attending college

a. could view her record, which included test data.
b. could view her daughter’s infant IQ test given at preschool.
c. could demand a correction she discovered while reading a file.
d. all of the above.

All of the above.

Persons over age 18 can inspect their own records and those of their children. The Family Educational Rights and Privacy Act (FERPA) also stipulates that information cannot be released without adult consent.

Lewis Terman

a. constructed the Wechsler tests.
b. constructed the initial Binet prior to 1910.
c. constructed the Rorschach.
d. Americanized the Binet.

Americanized the Binet.

Since Terman was associated with Stanford University, the test became the Stanford–Binet.

In constructing a test you notice that all 75 people correctly answered item number 12. This gives you an item difficulty of

a. 1.2.
b. .75.
c. 1.0.
d. 0.0.

1.0.

The item difficulty index is calculated by dividing the number of persons who answered the item correctly by the total number of persons tested. Hence, in this case 75/75 = 1.0. This maximum score for item 12 tells you it is probably much too easy for your examinees.
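The item difficulty calculation can be sketched as a one-line division. The function name `item_difficulty` is just a label for this example, and the second call uses a hypothetical count (30 of 75 correct) that does not appear on the card.

```python
def item_difficulty(num_correct, num_tested):
    """Proportion of examinees who answered the item correctly (0.0 to 1.0)."""
    if num_tested <= 0:
        raise ValueError("num_tested must be positive")
    return num_correct / num_tested


# Item 12 from the card: all 75 examinees answered correctly
print(item_difficulty(75, 75))  # 1.0

# Hypothetical harder item: only 30 of 75 answered correctly
print(item_difficulty(30, 75))  # 0.4
```

Note that a higher index means an easier item, so 1.0 is the easiest possible value and 0.0 means nobody answered correctly.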
