CACREP AREA: Assessment and Testing Flashcards
Appraisal can be defined as
the process of assessing or estimating attributes.
**HINT:**
Appraisal could include…
1) surveys
2) observations
3) clinical interviews
A test can be defined as a systematic method of measuring a sample of behavior. Test format refers to the manner in which test items are presented.
The format of an essay test is considered an ___?___ format.
Subjective
HINT: The “subjective” paradigm relies mainly on the scorer’s opinion.
The National Counselor Exam (NCE) is an ___?___ test because the scoring procedure is specific.
Objective
A short answer test is a ___?___ test
Free choice
NOTE: CPCE exam may refer to this as “free response”
The NCE and the CPCE would be examples of an ___?___ test
Forced choice
HINT: forced choice is sometimes also known as “recognition items”
The ___?___ index indicates the percentage of individuals who answered each item correctly.
Difficulty (index)
HINT:
The higher the number of people who answer an item correctly, the easier the item is, and vice versa.
A difficulty index (aka difficulty value) of 0.5 suggests that 50% of those tested answered the question correctly, while the other 50% did not.
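EXAMPLE: A minimal Python sketch of the difficulty index, assuming each person's response to one item is coded 1 (correct) or 0 (incorrect). The function name and the data are hypothetical and only illustrate the arithmetic.

```python
def difficulty_index(item_responses):
    """Proportion of test takers who answered one item correctly.

    item_responses: list of 1 (correct) / 0 (incorrect), one entry per person.
    """
    if not item_responses:
        raise ValueError("need at least one response")
    return sum(item_responses) / len(item_responses)


# Hypothetical data: 10 test takers, 5 of whom answered the item correctly.
responses = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
p = difficulty_index(responses)
print(f"difficulty index = {p:.2f} ({p:.0%} answered correctly)")
```

A value closer to 1.0 means an easier item; a value closer to 0.0 means a harder one.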
Short answer tests and projective measures utilize free response items.
The NCE and the CPCE use forced choice or so-called ___?___ items
Recognition (items)
A true/false test has ___?___ recognition items.
Dichotomous
HINT:
Dichotomy = presented with two opposing choices
A test format could be normative or ipsative. In the normative format, each item…?
Each item is independent of all other items.
HINT: Ipsative measures compare traits within the same individual; they do NOT compare a person to other persons who took the instrument
What is true of a client who takes a normative test?
They can legitimately be compared to others who have taken the test.
HINT: Normative interpretation is when the individual’s score is evaluated by comparing it to others who took the same test.
In an ipsative measure the person taking the test must compare items to one another.
The result is that…?
You cannot legitimately compare two or more people who have taken an ipsative test.
**HINT:** The ipsative approach is a within-person analysis
Ipsative does NOT reveal absolute strengths
The person taking the assessment is measured in response to their OWN standard of behavior
The ipsative measure points out the highs and lows that exist within a single individual
Tests are often classified as speed tests versus power tests.
A timed typing test used to hire secretaries would be considered what type of test?
A speed test
HINT:
A timed test is an example of a speed test. The items are usually easy; the difficulty comes from the time limit.
A good timed speed test is purposely set up so that nobody finishes it.
A counseling test consists of 300 forced response items. The person taking the test can take as long as he or she wants to answer the questions.
This most likely is what type of test?
This is most likely a power test.
HINT: A power test is designed to evaluate the level of mastery without a time limit – time is NOT an issue
An achievement test measures maximum performance or present level of skill.
Tests of this nature are also called attainment tests, while a personality test or interest inventory measures what?
Typical performance.
In a spiral test, the items get…?
The items get progressively more difficult.
In a cyclical test, what is true?
You have several sections which are spiral in nature
(in other words: the test revisits the same topics multiple times, each time with more detail or complexity)
A test battery is considered what type of test?
Horizontal test
HINT:
Horizontal Test = Compares performance across different subjects or content areas within the same grade level.
Test Battery = A collection of multiple tests administered together to assess different skills, abilities, or knowledge areas within a single evaluation.
Vertical Test = Compares performance across different grade levels on the same content.
In a counseling research study, two groups of subjects took a test with the same name. However, when they talked with each other they discovered that the questions were different.
The researcher assured both groups that they were given the same test. How is this possible?
The researcher gave parallel forms of the same test (parallel meaning there are multiple interchangeable versions)
The most critical factors in test selection are ___?___ and ___?___
Validity and reliability.
HINT:
Validity = Refers to how well a test measures what it is supposed to measure.
Reliability = Refers to how consistently a test produces the same results over time.
Which is more important, validity or reliability?
Validity
**HINT:** Validity is ALWAYS considered the most important factor, especially compared to reliability, when constructing a test.
In the field of testing, validity refers to what?
Whether the test really measures what it purports to measure.
HINT:
FIVE TYPES OF VALIDITY:
1) Content Validity: Ensures that the test covers all relevant content areas or topics it is supposed to assess.
2) Construct Validity: Determines whether the test accurately measures the theoretical concept or construct (idea) it is intended to measure.
3) Concurrent Validity: Assesses how well the test results correlate with those from an established test measuring the same thing, taken at the same time.
4) Predictive Validity: Evaluates how well the test predicts future performance or outcomes.
5) Consequential Validity: Considers the social consequences and implications of using the test, including its impact on test-takers and society.
A counselor peruses (looks through) a testing catalog in search of a test which will repeatedly give consistent results.
The counselor is interested in…?
Reliability.
HINT:
CAUTION - a test can be reliable BUT NOT valid
Reliability places an upper limit on how valid a test can be, but validity does not limit how reliable a test is.
A test can have a high reliability coefficient but a low validity coefficient
Which measure would yield the highest level of reliability?
A very accurate postage scale (postage scale measures the weight of mail)
HINT: physical measurements are MORE reliable than psychological ones
Construct validity refers to the extent that a test measures an abstract trait or psychological notion.
An example would be…?
Ego strength
HINT: any trait that you cannot directly measure/observe can be considered a “construct”
Face validity refers to the extent that a test…?
Looks or appears to measure the intended attribute.
HINT: Face validity tells you whether a test looks like it measures the intended trait
A job test which predicted future performance on a job very well would have…?
High criterion/predictive validity.
A new IQ test which yielded results nearly identical to other standardized measures would be said to have what?
Good concurrent validity.
HINT:
Concurrent validity measures how well the test compares to a well established instrument that measures the same thing
NOTE: Criterion validity can be concurrent or predictive
When a counselor tells a client that the Graduate Record Examination (GRE) will predict her ability to handle graduate work, the counselor is referring to what type of validity?
predictive validity
A reliable test is ___?___ valid
Not always (valid)
A valid test is ___?___ reliable
Always (reliable)
HINT: a valid test is ALWAYS reliable
One method of testing reliability is to give the same test to the same group of people two times and then correlate the scores.
This is called what?
test-retest reliability
**HINT:**
Test-retest approach/reliability = Giving the same test to the same people twice to see if they get similar scores both times
High test-retest reliability means that the test yields similar results upon repeated administrations.
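EXAMPLE: A minimal Python sketch of the test-retest idea, assuming the two administrations are correlated with a Pearson coefficient. The scores and the helper name `pearson_r` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five people on two administrations.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 12, 14, 19, 21]

r = pearson_r(first_administration, second_administration)
print(f"test-retest correlation = {r:.2f}")
```

The closer the correlation is to 1.00, the more similar the two sets of scores are.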
One method of testing reliability is to give the same population alternate forms of the identical test. Each form will have the same psychometric/statistical properties as the original instrument.
This is known as what?
equivalent or alternate forms reliability
HINT:
Counterbalancing = A method used to prevent order effects in tests by varying the order of test conditions for different participants
Counterbalancing is necessary when testing reliability.
Example = If you’re testing the effect of two different teaching methods on student performance, you might have half the students use Method A first and then Method B, while the other half use Method B first and then Method A. This way, the order of the methods doesn’t unfairly affect the results.
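EXAMPLE: A minimal Python sketch of counterbalancing with two conditions, assuming participants are simply alternated between the two possible orders. The function name and participant labels are hypothetical.

```python
def counterbalance(participants, conditions=("A", "B")):
    """Alternate the two possible orders of two conditions across participants."""
    if len(conditions) != 2:
        raise ValueError("this sketch only handles exactly two conditions")
    orders = [tuple(conditions), tuple(reversed(conditions))]
    # Even-indexed participants get A then B; odd-indexed get B then A.
    return {person: orders[i % 2] for i, person in enumerate(participants)}


# Hypothetical participants.
students = ["s1", "s2", "s3", "s4"]
assignment = counterbalance(students)
for person, order in assignment.items():
    print(person, "->", " then ".join(order))
```

With an even number of participants, half receive each order, so any order effect is spread equally across both conditions (or both test forms).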
A counselor doing research decided to split a standardized test in half by using the even items as one test and the odd items as a second test and then correlating them.
The counselor was testing reliability via…?
The split-half correlation method.
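EXAMPLE: A minimal Python sketch of the odd/even split described in this card, assuming items are scored 1 (correct) or 0 (incorrect) and the two half scores are correlated with a Pearson coefficient. The data and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half_scores(item_matrix):
    """Return (odd-item totals, even-item totals), one total per person.

    item_matrix: one row per person, one 0/1 entry per item, item 1 first.
    """
    odd_totals = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    return odd_totals, even_totals


# Hypothetical responses: five people, six items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

odd_totals, even_totals = split_half_scores(responses)
r_halves = pearson_r(odd_totals, even_totals)
print(f"split-half correlation = {r_halves:.2f}")
```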
Which method of reliability testing would be useful with an essay test but not with a test of algebra problems?
Inter-rater/inter-observer.
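EXAMPLE: A minimal Python sketch of one simple inter-rater index, percent agreement, assuming two raters score the same essays on the same rubric. This is only one of several possible indices; the scores and function name are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Share of essays on which two raters assigned the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of essays")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical scores (1-5 rubric) from two raters on the same six essays.
rater_a = [4, 3, 5, 2, 4, 1]
rater_b = [4, 3, 4, 2, 4, 2]

agreement = percent_agreement(rater_a, rater_b)
print(f"raters agreed on {agreement:.0%} of the essays")
```

An algebra test with one correct answer per problem leaves no room for rater judgment, which is why this kind of check matters for essays but not for such a test.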
A reliability coefficient of 1.00 indicates a…?
A perfect score which has no error
An excellent psychological or counseling test would have a reliability coefficient of…?
.90
HINT: this means that 90% of the score reflects the attribute being measured, while 10% is due to error.
A researcher working with a personality test discovers that the test has a reliability coefficient of .70 which is somewhat typical.
This indicates that…?
70% of the score is accurate while 30% is inaccurate.
HINT: 70% of obtained score on the test represented the true score on the personality attribute, while 30% is due to error
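EXAMPLE: A minimal Python sketch of the interpretation used in these cards, where the reliability coefficient is read as the share of the obtained score that reflects the true score and the remainder is read as error. The function name is hypothetical.

```python
def true_and_error_share(reliability):
    """Split a reliability coefficient into (true-score share, error share)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability coefficient must be between 0 and 1")
    return reliability, 1.0 - reliability


# The two coefficients mentioned in the cards above.
for coefficient in (0.90, 0.70):
    true_share, error_share = true_and_error_share(coefficient)
    print(f"r = {coefficient:.2f}: {true_share:.0%} true score, {error_share:.0%} error")
```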
A career counselor is using a test for job selection purposes.
An acceptable reliability coefficient would be ___?___ or higher
.80
HINT: for admissions for jobs, schools, and so on, a test’s reliability coefficient should be at least 0.80 (80%)
The same test is given to the same group of people using the test-retest reliability method. The correlation between the first and second administration is .70.
The true variance (i.e., the percentage of shared variance or the level of the same thing measured in both) is what percentage?
49%
HINT:
To find how much one factor’s variance is explained by another, square the correlation (e.g., 0.70 x 0.70 = 0.49), then convert it to a percentage (e.g., 0.49 x 100 = 49%).
NOTE: CPCE exam might refer to this as coefficient of determination
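EXAMPLE: A minimal Python sketch of the squaring step from this card, assuming the input is a correlation between -1 and 1. The function name is hypothetical.

```python
def coefficient_of_determination(r):
    """Square a correlation to get the proportion of shared variance."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must be between -1 and 1")
    return r * r


# Test-retest correlation from the card above.
shared = coefficient_of_determination(0.70)
print(f"shared variance = {shared:.2f} ({shared:.0%})")
```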
IQ means/stands for…?
intelligence quotient