CACREP AREA: Assessment and Testing Flashcards

1
Q

Appraisal can be defined as

A

the process of assessing or estimating attributes.

HINT:

Appraisal could include…
1) surveys
2) observations
3) clinical interviews

2
Q

A test can be defined as a systematic method of measuring a sample of behavior. Test format refers to the manner in which test items are presented.

The format of an essay test is considered a ___?___ format.

A

Subjective

HINT: “subjective” paradigm relies mainly on the scorer’s opinion

3
Q

The National Counselor Exam (NCE) is an ___?___ test because the scoring procedure is specific.

A

Objective

4
Q

A short answer test is a ___?___ test

A

Free choice

NOTE: CPCE exam may refer to this as “free response”

5
Q

The NCE and the CPCE are examples of a ___?___ test

A

Forced choice

HINT: forced choice is sometimes also known as “recognition items”

6
Q

The ___?___ index indicates the percentage of individuals who answered each item correctly

A

Difficulty (index)

HINT:

The higher the number of people who answer an item correctly, the easier the item is, and vice versa

A difficulty index (aka difficulty value) of 0.5 suggests that 50% of those tested answered the question correctly, while the other 50% did not

7
Q

Short answer tests and projective measures utilize free response items.

The NCE and the CPCE use forced choice or so-called ___?___ items

A

Recognition (items)

8
Q

A true/false test has ___?___ recognition items.

A

Dichotomous

HINT:

Dichotomy = presented with two opposing choices

9
Q

A test format could be normative or ipsative. In the normative format, each item…?

A

Each item is independent of all other items.

HINT: Ipsative measures compare traits within the same individual; they do NOT compare a person to other persons who took the instrument

10
Q

What is true of a client who takes a normative test?

A

They can legitimately be compared to others who have taken the test.

HINT: Normative interpretation is when the individual’s score is evaluated by comparing it to others who took the same test.

11
Q

In an ipsative measure the person taking the test must compare items to one another.

The result is that…?

A

You cannot legitimately compare two or more people who have taken an ipsative test.

HINT: The ipsative approach is a within-person analysis

Ipsative does NOT reveal absolute strengths

The person taking the assessment is measured in response to their OWN standard of behavior

The ipsative measure points out the highs and lows that exist within a single individual

12
Q

Tests are often classified as speed tests versus power tests.

A timed typing test used to hire secretaries would be considered what type of test?

A

A speed test

HINT:

A timed test is an example of a speed test. The items themselves are usually not hard; the difficulty comes from the time limit.

A good timed speed test is purposely set up so nobody finishes it

13
Q

A counseling test consists of 300 forced response items. The person taking the test can take as long as he or she wants to answer the questions.

This most likely is what type of test?

A

This is most likely a power test.

HINT: A power test is designed to evaluate the level of mastery without a time limit – time is NOT an issue

14
Q

An achievement test measures maximum performance or present level of skill.

Tests of this nature are also called attainment tests, while a personality test or interest inventory measures what?

A

Typical performance.

15
Q

In a spiral test, the items get…?

A

The items get progressively more difficult.

16
Q

In a cyclical test, what is true?

A

You have several sections which are spiral in nature

(in other words: within each section, the items run from easiest to most difficult, and the pattern starts over in the next section)

17
Q

A test battery is considered what type of test?

A

Horizontal test

HINT:

Horizontal Test = Compares performance across different subjects or content areas within the same grade level.

Test Battery = A collection of multiple tests administered together to assess different skills, abilities, or knowledge areas within a single evaluation.

Vertical Test = Compares performance across different grade levels on the same content.

18
Q

In a counseling research study, two groups of subjects took a test with the same name. However, when they talked with each other they discovered that the questions were different.

The researcher assured both groups that they were given the same test. How is this possible?

A

The researcher gave parallel forms of the same test (parallel meaning there are multiple equivalent versions)

19
Q

The most critical factors in test selection are ___?___ and ___?___

A

Validity and reliability.

HINT:

Validity = Refers to how well a test measures what it is supposed to measure.

Reliability = Refers to how consistently a test produces the same results over time.

20
Q

Which is more important, validity or reliability?

A

Validity

HINT: Validity is ALWAYS considered the most important factor, especially compared to reliability, when constructing a test.

21
Q

In the field of testing, validity refers to what?

A

Whether the test really measures what it purports to measure.

HINT:

FIVE TYPES OF VALIDITY:
1) Content Validity: Ensures that the test covers all relevant content areas or topics it is supposed to assess.

2) Construct Validity: Determines whether the test accurately measures the theoretical concept or construct (idea) it is intended to measure.

3) Concurrent Validity: Assesses how well the test results correlate with those from an established test measuring the same thing, taken at the same time.

4) Predictive Validity: Evaluates how well the test predicts future performance or outcomes.

5) Consequential Validity: Considers the social consequences and implications of using the test, including its impact on test-takers and society.

22
Q

A counselor peruses (looks through) a testing catalog in search of a test which will repeatedly give consistent results.

The counselor is interested in…?

A

Is interested in reliability.

HINT:

CAUTION - a test can be reliable BUT NOT valid

Reliability can limit how valid a test can be, but validity doesn’t limit how reliable a test is.

A test can have a high reliability coefficient but a low validity coefficient

23
Q

Which measure would yield the highest level of reliability?

A

A very accurate postage scale (postage scale measures the weight of mail)

HINT: physical measurements are MORE reliable than psychological ones

24
Q

Construct validity refers to the extent that a test measures an abstract trait or psychological notion.

An example would be…?

A

Ego strength

HINT: any trait that you cannot directly measure/observe can be considered a “construct”

25
Q

Face validity refers to the extent that a test…?

A

Looks or appears to measure the intended attribute.

HINT: Face validity tells you whether a test looks like it measures the intended trait

26
Q

A job test which predicted future performance on a job very well would have…?

A

Have high criterion/predictive validity.

27
Q

A new IQ test which yielded results nearly identical to other standardized measures would be said to have what?

A

good concurrent validity.

HINT:

Concurrent validity measures how well the test compares to a well established instrument that measures the same thing

NOTE: Criterion validity can be concurrent or predictive

28
Q

When a counselor tells a client that the Graduate Record Examination (GRE) will predict her ability to handle graduate work, the counselor is referring to what type of validity?

A

predictive validity

29
Q

A reliable test is ___?___ valid

A

Not always (valid)

30
Q

A valid test is ___?___ reliable

A

Always (reliable)

HINT: a valid test is ALWAYS reliable

31
Q

One method of testing reliability is to give the same test to the same group of people two times and then correlate the scores.

This is called what?

A

test-retest reliability

HINT:

Test-retest approach/reliability = Giving the same test to the same people twice to see if they get similar scores both times

High test-retest reliability means that the test yields similar results upon repeated administrations.

32
Q

One method of testing reliability is to give the same population alternate forms of the identical test. Each form will have the same psychometric/statistical properties as the original instrument.

This is known as what?

A

equivalent or alternate forms reliability

HINT:

Counterbalancing = A method used to prevent order effects in tests by varying the order of test conditions for different participants

Counterbalancing is necessary when testing reliability with alternate forms.

Example = If you’re testing the effect of two different teaching methods on student performance, you might have half the students use Method A first and then Method B, while the other half use Method B first and then Method A. This way, the order of the methods doesn’t unfairly affect the results.

33
Q

A counselor doing research decided to split a standardized test in half by using the even items as one test and the odd items as a second test and then correlating them.

The counselor was testing reliability via…?

A

was testing reliability via the split-half correlation method
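
EXAMPLE: A minimal Python sketch of the odd/even split described above. The helper names (`pearson_r`, `split_half_correlation`) and the response data are made up for illustration, and the items are assumed to be scored 0 (wrong) or 1 (right).

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def split_half_correlation(item_scores):
    """item_scores: one list of 0/1 item scores per examinee."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Hypothetical data: 4 examinees, 6 items each
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
]
r_halves = split_half_correlation(responses)
print(round(r_halves, 2))
```

The closer the correlation between the two half-tests is to 1.00, the more consistently the two halves measure the same thing.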

34
Q

Which method of reliability testing would be useful with an essay test but not with a test of algebra problems?

A

Inter-rater/inter-observer.

35
Q

A reliability coefficient of 1.00 indicates a…?

A

A perfect score which has no error

36
Q

An excellent psychological or counseling test would have a reliability coefficient of…?

A

.90

HINT: this means that 90% of the score reflects the attribute being measured, while 10% is due to error.

37
Q

A researcher working with a personality test discovers that the test has a reliability coefficient of .70 which is somewhat typical.

This indicates that…?

A

70% of the score is accurate while 30% is inaccurate.

HINT: 70% of obtained score on the test represented the true score on the personality attribute, while 30% is due to error

38
Q

A career counselor is using a test for job selection purposes.

An acceptable reliability coefficient would be ___?___ or higher

A

.80

HINT: for admissions for jobs, schools, and so on, a test’s reliability coefficient should be at least 0.80 (80%)

39
Q

The same test is given to the same group of people using the test-retest reliability method. The correlation between the first and second administration is .70.

The true variance (i.e., the percentage of shared variance or the level of the same thing measured in both) is what percentage?

A

49%

HINT:

To find how much one factor’s variance is explained by another, square the correlation (e.g., 0.70 x 0.70 = 0.49), then convert it to a percentage (e.g., 0.49 x 100 = 49%).

NOTE: CPCE exam might refer to this as coefficient of determination
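
EXAMPLE: The squaring step can be sketched in a few lines of Python. The function name `shared_variance_percent` is made up for illustration.

```python
def shared_variance_percent(r):
    """Square a correlation coefficient and express it as a percentage."""
    return r ** 2 * 100


# Correlation of .70 between the first and second administration
print(round(shared_variance_percent(0.70), 1))
```

The rounding is only there to hide floating-point noise; the arithmetic is the same 0.70 x 0.70 x 100 shown in the hint.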

40
Q

IQ means/stands for…?

A

intelligence quotient

41
Q

___?___ did research and concluded that intelligence was normally distributed like height or weight and that it was primarily genetic.

A

Francis Galton

HINT: Francis Galton felt intelligence was a single or unitary factor

42
Q

Francis Galton felt intelligence was…?

A

A unitary faculty

43
Q

J.P. Guilford isolated 120 factors which added up to intelligence. He also is remembered for his…?

A

thoughts on convergent and divergent thinking

HINT:

Convergent thinking = occurs when divergent thoughts and ideas are combined into a singular concept

Divergent thinking = the ability to generate a new idea

44
Q

A counselor is told by his supervisor to measure the internal consistency reliability (i.e., homogeneity) of a test but not to divide the test in halves.

The counselor would need to utilize the…?

A

the Kuder-Richardson coefficients of equivalence.

HINT:

Internal consistency is also known as “inter-item consistency”

Kuder-Richardson reliability/item consistency estimates are used to determine whether performance on one item is related to performance on another
* Uses KR-20 or KR-21 formulas
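
EXAMPLE: A minimal Python sketch of the standard KR-20 formula, KR-20 = (k / (k - 1)) x (1 - sum of p x q / variance of total scores), where k is the number of items, p is the proportion answering an item correctly, and q = 1 - p. The function name and data are made up for illustration. The sketch assumes items scored 0 or 1 and uses the population variance of total scores; some textbooks use the sample variance instead, which gives a slightly different value.

```python
def kr20(item_scores):
    """item_scores: one list of 0/1 item scores per examinee."""
    n = len(item_scores)        # number of examinees
    k = len(item_scores[0])     # number of items

    # Sum of p * q across items (p = proportion correct, q = 1 - p)
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in item_scores) / n
        sum_pq += p * (1 - p)

    # Population variance of the examinees' total scores (an assumption)
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n

    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: 4 examinees, 3 items each
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(responses))
```

Unlike the split-half method, nothing here divides the test into halves; every item contributes through its own p x q term.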

45
Q

The first intelligence test was created by…?

A

Alfred Binet and Theodore Simon

46
Q

Today, the Stanford-Binet IQ test is a…?

A

A standardized measure.

47
Q

IQ stands for intelligence quotient, which is expressed by what formula?

A

MA/CA X 100

(Mental Age / Chronological Age x 100)

HINT: Standard Deviation (SD) = 15, in Binet it is 16
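
EXAMPLE: The ratio IQ formula as a small Python sketch. The function name `ratio_iq` and the ages are made up for illustration.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, times 100."""
    return mental_age / chronological_age * 100


# A child with a mental age of 10 and a chronological age of 8
print(ratio_iq(10, 8))

# Mental age equal to chronological age gives the average IQ
print(ratio_iq(9, 9))
```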

48
Q

The Binet stressed age-related tasks. Utilizing this method, a 9-year-old task would be one in which…?

A

50% of the 9-year-olds could answer correctly

49
Q

Simon and Binet pioneered the first IQ test around 1905. The test was originally created to…?

A

To discriminate children without an intellectual disability from children with an intellectual disability.

50
Q

Today the Stanford-Binet is used from age 2 to adulthood. The IQ formula has been replaced by the…?

A

SAS (Standard Age Score)

51
Q

Most experts would agree that the Wechsler IQ tests gained popularity, whereas the Binet…?

A

didn’t seem to be the best test for adults

HINT: the Binet was initially created for children

52
Q

The best IQ test for a 22-year-old single male would be which test?

A

WAIS-IV

HINT:

WAIS-IV = Wechsler Adult Intelligence Scale - 4th edition

WAIS-IV is best suited for ages 16-90

53
Q

The best intelligence test for a sixth-grade girl would be which test?

A

WISC-IV.

HINT:

WISC-IV = Wechsler Intelligence Scale for Children - 4th edition

WISC-IV is best suited for ages 6-16

54
Q

The best intelligence test for a kindergartner would be which test?

A

WPPSI-IV

HINT: WPPSI-IV (4) is best suited for children (i.e., kindergartners) ages 2-7

55
Q

The mean on the Wechsler and the Stanford-Binet Intelligence scales (SB5) is ___?___ and the standard deviation is ___?___

A

100; 15 Wechsler, 16 Stanford-Binet

56
Q

Group IQ tests like the Otis-Lennon, the Lorge-Thorndike, and the California Test of Mental Abilities are popular in school settings.

The advantage is that group tests…?

A

group tests are quicker to administer

HINT:

School districts, governments, industries prefer tests that can be administered to many at the same time, which makes it quicker to administer

The downside is group tests are LESS accurate and have LOWER reliability

57
Q

The group IQ test movement began with…?

A

with the Army Alpha and Army Beta in World War I

58
Q

What is true of a culture-fair test?

A

items are known to the subject regardless of his or her culture

HINT:

Culture-fair test attempts to remove items which would be known only to an individual due to their background

NOTE: it is unethical to give tests to a client from a given population unless the test/inventory has been normed on that specific population

59
Q

The black versus white IQ controversy was sparked mainly by a 1969 article written by ___?___

A

Arthur Jensen

HINT: Jensen speculated that, due to slavery, it was possible that African Americans were “bred for strength rather than intelligence”

60
Q

The MMPI-2 is a…?

A

A standardized personality test.

HINT:

MMPI-2 = Minnesota Multiphasic Personality Inventory-2

This version of the test is intended to help clinicians diagnose and treat patients

61
Q

The word psychometric means…?

A

any form of mental testing

HINT:

Psychometrics = branch of counseling or psychology that focuses on testing

62
Q

In a projective test the client is shown what?

A

Neutral stimuli

HINT:

Acceptable formats for projective tests:

1) association (what comes to mind when…?)

2) completion (complete these sentences with real feeling…)

3) construction (draw this picture…)

63
Q

The 16 PF reflects the work of who?

A

Raymond B. Cattell

HINT:

Tests/inventories like the 16 PF (Sixteen Personality Factor Questionnaire) analyze data outside of a given theory

These tests are called “factor analytic tests” or “inventories” rather than theory based tests

64
Q

The Myers-Briggs Type Indicator reflects the work of who?

A

Carl Jung

65
Q

The counselor who favors projective measures would most likely be a…?

A

A psychodynamic clinician

66
Q

An aptitude test is to ___?___ as an achievement test is to ___?___

A

potential; what has been learned

HINT:

APTITUDE TEST = assesses “potential” and “predicts” – it predicts the individual could do well in a certain area (does NOT determine they actually are skilled in that area)

ACHIEVEMENT TEST = examines what you know and how well an individual currently performs in a specific area

Predictive validity is important when choosing an APTITUDE test

67
Q

Both the Rorschach and the Thematic Apperception Test (TAT) are projective tests.

The Rorschach uses 10 inkblot cards while the TAT uses…?

A

Pictures

68
Q

Test bias primarily results from what?

A

A test being normed solely on white middle-class clients

69
Q

A counselor who fears the client has an organic, neurological, or motoric difficulty would most likely use what assessment?

A

Bender Gestalt II

HINT:

Bender Visual Motor Gestalt Test is an expressive measure and known for its ability to determine whether brain damage is evident in an individual

This test is suitable for ages 4 years and up

70
Q

An interest inventory would be least valid when used with which type of person?

A

an eighth-grade male with an IQ of 136

HINT: Interest inventories work best with individuals who are of high school age or above

71
Q

One major criticism of interest inventories is that they…?

A

they emphasize professional positions and minimize blue-collar jobs.

72
Q

Interest inventories are positive in the sense that they are…?

A

they are reliable and not threatening to the test taker.

HINT: interest inventory tests would be the LEAST threatening variety of tests

73
Q

A counselor who had an interest primarily in testing would most likely be a member of what organization?

A

AARC = Association for Assessment and Research in Counseling

74
Q

The NCE is what type of test?

A

An achievement test

75
Q

The ___?___ and ___?___ are examples of aptitude tests.

A

O*NET Ability Profiler and the MCAT

HINT: As an FYI – selection tests used by schools assess APTITUDE

76
Q

One problem with interest inventories is that the person often tries to answer the questions in a socially acceptable manner.

Psychometricians call this response style phenomenon what?

A

social desirability (the right way to feel in society)

77
Q

An aptitude test predicts future behavior while an achievement test measures what you have mastered or learned.

In the case of a test like the ___?___ the distinction is unclear.

A

GRE

78
Q

Your supervisor wants you to find a new personality test for your counseling agency.

You should read which sources?

HINT: Multiple answers

A

1) professional journals.

2) the Buros Mental Measurements Yearbook.

3) classic textbooks in the field as well as test materials produced by the testing company.

79
Q

What does the standard error of measurement tell you?

A

how accurate or inaccurate a test score is.

HINT:

The standard error of measurement (SEM) shows how much a test score may vary from the true score.

Example: A test score of 85 with an SEM of 3 means the true score is likely between 82 and 88.
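
EXAMPLE: A minimal Python sketch of the band around an obtained score. The function name `score_band` and the `n_sem` parameter are made up for illustration; one SEM on either side corresponds to roughly 68% of retest scores, assuming errors are normally distributed.

```python
def score_band(observed_score, sem, n_sem=1):
    """Return (low, high): the observed score plus or minus n_sem SEMs."""
    return (observed_score - n_sem * sem, observed_score + n_sem * sem)


# Score of 85 with an SEM of 3
print(score_band(85, 3))

# Score of 106 with an SEM of 3
print(score_band(106, 3))
```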

80
Q

A new IQ test has a standard error of measurement (SEM) of 3.

Tom scores 106 on the test. If he takes the test a lot, we can predict that about 68% of the time Tom will score between…?

A

Tom will score between 103 and 109

HINT: increasing a test’s length RAISES/INCREASES reliability

80
Q

A counselor created an achievement test with a reliability coefficient of .82. The test is shortened since many clients felt it was too long.

The counselor shortened the test but logically assumed that the reliability coefficient would now be…?

A

be lower than .82
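
EXAMPLE: The card does not say how to estimate the new coefficient. One standard tool for this is the Spearman-Brown prophecy formula, new r = (n x r) / (1 + (n - 1) x r), where n is the ratio of new test length to old test length. The Python sketch below assumes, purely for illustration, that the test was cut to half its length (n = 0.5); the function name is made up.

```python
def spearman_brown(reliability, length_ratio):
    """Predicted reliability after changing test length.

    length_ratio = new number of items / original number of items.
    """
    return (length_ratio * reliability) / (1 + (length_ratio - 1) * reliability)


# Original reliability of .82, test assumed to be cut in half
shortened = spearman_brown(0.82, 0.5)
print(round(shortened, 2))

# Doubling the test instead would be predicted to raise reliability
lengthened = spearman_brown(0.82, 2)
print(round(lengthened, 2))
```

Under this formula a shorter test comes out below .82 and a longer one above it, which matches the rule that test length and reliability move together.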

81
Q

A counselor can utilize psychological tests to help secure a ___?___ diagnosis if third-party payments are necessary

A

DSM or ICD

82
Q

A colleague of yours invents a new projective test. Seventeen counselors rated the same client using the measure and came up with nearly identical assessments.

This would indicate what?

A

High reliability

HINT: this scenario is known as inter-rater reliability

83
Q

Counselors often shy away from self-reports because…?

A

clients often give inaccurate answers

84
Q

In most instances, who would be the best qualified to give the Rorschach Inkblot Test?

A

A clinical psychologist

HINT: a clinical psychologist would have the most training in projective measures

85
Q

Your client, who is in an outpatient hospital program, is keeping a journal of irrational thoughts.

This would be an…?

A

an informal assessment technique

HINT:

Other items that fall within the category of informal assessment

1) self reports
2) case notes
3) checklists
4) sociograms of groups
5) interviews
6) professional staffings

86
Q

You are uncertain whether a test is intended for the population served by your not-for-profit agency.

The best method of researching this dilemma would be to…?

A

read the test manual included with the test

HINT: the manual should specify the target population for the test in question

87
Q

Clients should know what regarding tests/assessments?

A

a test is merely a single source of data and not infallible

88
Q

One major testing trend is…?

A

computer-assisted testing and computer interpretations

89
Q

One future trend (in the realm of assessments) which seems contradictory is that some experts are pushing for…?

A

a greater reliance on tests while others want to rely on them less

90
Q

In the context of testing, most counselors would agree that…?

A

More public education is needed in the area of testing

91
Q

___?___ would be an informal method of appraisal

A

A checklist

92
Q

The WAIS-IV is given to 100,000 individuals in the United States who are picked at random.

A counselor would expect that…?

A

approximately 68% would score between 85 and 115.

HINT:

The distribution of scores will be normal – this tells you that the mean (average) score will be 100 (i.e., average IQ) and the standard deviation will be 15

NOTE: SD is 16 if using the Binet

In a normal distribution, approximately 68% of the population will fall between +/- 1 SD of the mean

With a SD of 15, you simply subtract 15 from 100 to get the low score (i.e., 85) and add 15 to 100 to get 115
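
EXAMPLE: The 68% figure can be checked with Python's standard library, assuming scores follow a normal distribution with a mean of 100 and an SD of 15. The variable names are made up for illustration.

```python
from statistics import NormalDist

# IQ scores modeled as normal with mean 100 and SD 15 (an assumption)
iq = NormalDist(mu=100, sigma=15)

# Proportion of scores between 85 and 115 (one SD either side of the mean)
share = iq.cdf(115) - iq.cdf(85)
print(round(share * 100, 1))

# Expected head count out of 100,000 randomly picked individuals
expected_count = round(share * 100_000)
print(expected_count)
```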

93
Q

A word association test would be an example of what type of test?

A

A projective test

94
Q

What is true about infant IQ tests?

A

They are more unreliable than those given later in life

95
Q

In the context of testing/assessments, a good practice for counselors is to…?

A

never generalize on the basis of a single test score

96
Q

You want to admit only 25% of all counselors to an advanced training program in psychodynamic group therapy.

The item difficulty on the entrance exam for applicants would be best set at…?

A

.25

97
Q

According to Public Law 93-380, also known as the Buckley Amendment, a 19-year-old college student attending college could…?

HINT: Multiple answers

A

1) could view her record, which included test data

2) could view her daughter’s infant IQ test given at preschool

3) could demand a correction she discovered while reading a file.

98
Q

What did Lewis Terman do?

A

Americanized the Binet.

HINT: Terman was associated with Stanford University – the test became the Stanford-Binet

99
Q

In constructing a test you notice that all 75 people correctly answered item number 12.

This gives you an item difficulty of…?

A

1.0

HINT:

Item difficulty index is calculated by taking the number of people tested who answered the item correctly divided by the total number of people tested

EX: 75/75 = 1.0

So, this maximum score for item 12 tells you it is probably much too easy for your examinees
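
EXAMPLE: The same calculation as a small Python sketch. The function name `item_difficulty` is made up for illustration.

```python
def item_difficulty(num_correct, num_tested):
    """Proportion of examinees who answered the item correctly."""
    return num_correct / num_tested


# All 75 examinees answered item 12 correctly
print(item_difficulty(75, 75))

# Half of 100 examinees answered correctly
print(item_difficulty(50, 100))
```

Values near 1.0 flag an item as very easy, while values near 0 flag it as very hard.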