Assessment and Testing Flashcards
What is appraisal?
Appraisal refers to the process of assessing or estimating attributes. It can include surveys, observations, or clinical interviews. A test is simply an instrument that measures a given sample of behavior. Measure connotes that a number or score has been assigned to the person’s attribute or performance.
What is the study of psychometrics?
Psychometrics is the study of psychological measurement. Someone who primarily administers and interprets tests has the job title of a psychometrician. It is critical that counselors inform clients about the limitations of any tests they administer.
What is a test and what is test format?
A test is a systematic method of measuring a sample of behavior. Test format refers to the manner in which test items are presented. In a subjective format, scoring relies mainly on the scorer’s opinion and can therefore be affected by personal bias. In an objective test, the rater’s judgment plays little or no part in the scoring process.
What is a free-choice test?
In a free-choice or free-response test or question, the person taking the test can respond in any manner he or she chooses. Although free choice responses can yield more information, they often take more time to score and increase subjectivity.
What is a forced-choice test?
Forced-choice items are also known as recognition items, e.g. multiple choice. On some tests, this format is used to control for the social desirability phenomenon (when a person gives the answers he or she thinks are socially desirable). The MMPI uses forced choices to create a “lie scale” composed of human frailties we all possess, so the scale ferrets out people who try to make themselves look good rather than answering honestly.
What is the difficulty index?
This is the percentage of individuals who answered a given item correctly. The more people who answer a question correctly, the easier it is, and vice versa. A .5 difficulty index would suggest that 50% of those tested answered the question correctly. Most theorists agree that a “good measure” will provide a wide range of items, including some that even a poor performer can answer correctly.
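As a minimal sketch, the difficulty index for each item can be computed as the proportion of correct answers. The function name and the response data below are made up for illustration.

```python
def difficulty_index(item_responses):
    """Proportion of examinees who answered one item correctly (0.0 to 1.0)."""
    if not item_responses:
        raise ValueError("need at least one response")
    return sum(1 for correct in item_responses if correct) / len(item_responses)

# Hypothetical data: each row is one examinee, each column is one item.
# True means the examinee answered that item correctly.
responses = [
    [True, True, False],
    [True, False, False],
    [True, True, False],
    [True, False, True],
]

# zip(*responses) regroups the rows into one column per item.
per_item = [difficulty_index(column) for column in zip(*responses)]
print(per_item)  # [1.0, 0.5, 0.25]
```

Here the first item is the easiest (everyone got it right) and the third is the hardest (only 25% got it right).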
What are recognition items?
Recognition items are response types, like multiple choice, that give the examinee two or more alternatives. A true/false test has dichotomous recognition items. If an item has 3 or more forced choices, psychometricians call it a multipoint item.
What is a normative test format?
Normative tests are used to compare someone to other people who took the same test. They can be used to assess a quality, trait, etc.
A client who takes a normative test can be compared to others who have taken the test. A normative interpretation is one in which the individual’s score is evaluated by comparing it to others who took the same test – a percentile rank is an excellent example.
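A percentile rank can be sketched in a few lines. This version assumes the simple definition "percentage of the norm group scoring below the client"; some textbooks also count half of the tied scores. The norm-group scores are invented for illustration.

```python
def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores strictly below `score`.

    Assumption: ties are not counted. Other conventions add half of the ties.
    """
    if not norm_scores:
        raise ValueError("norm group is empty")
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)

# Hypothetical norm group of ten people who took the same test.
norm_group = [55, 60, 62, 70, 71, 75, 80, 84, 90, 95]

client_rank = percentile_rank(80, norm_group)
print(client_rank)  # 60.0
```

A raw score of 80 means little on its own. The normative interpretation is that the client scored higher than 60% of the comparison group.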
What is an Ipsative test format?
Ipsative measures compare traits within the same individual. They do not compare a person to other people who took the instrument (the MMPI, for example, is NOT ipsative).
You cannot legitimately compare two or more people who have taken an ipsative test because the ipsative measure does not reveal absolute strengths. The person is measured against his or her own standard of behavior (e.g. tracking a client’s own PHQ-9 or GAD-7 scores over time).
So when someone says, “Mr. Johnson’s anxiety is improving,” she has given an ipsative description, one that has nothing to do with comparing Mr. Johnson’s anxiety to another person’s. The ipsative approach yields a within-person analysis.
What is a speed test?
A speed test is a timed test that is intended to be fairly easy: the difficulty comes from the time limit, not from the tasks or questions themselves. A good speed test is purposely set up so that no one finishes it. A timed test is really a type of speed test, but a high percentage of test takers complete it, and its items are usually more difficult (e.g. the NCE).
What is a power test?
A power test is designed to evaluate the level of mastery without a time limit. Like a speed test, this is ideally designed so that no one receives a perfect score.
How does an achievement test (also called an attainment test) differ from a personality test or interest inventory?
Achievement/attainment tests measure maximum levels of skill or present performance of skill. A personality test or interest inventory measures typical performance. Interest inventories are popular with career counselors because they measure what the client likes or dislikes.
What is the Q-Sort?
This is a design often used to investigate personality traits. It involves a procedure in which an individual is given cards with statements and asked to place them in piles of “most like me” and “least like me”. Then the subject sorts them again to create the “ideal self”. The ideal self can then be compared to his or her current self-perception in order to assess self-esteem.
What is a spiral test?
In a spiral test, the items get progressively more difficult.
What is a cyclical test?
A cyclical test has several sections that are each spiral in nature. So within each section, the questions go from easy to more difficult.
What is a test battery?
In a test battery, several measures are used to produce results that could be more accurate than those derived from a single source. This is considered a horizontal test. A horizontal test measures various factors (e.g. math and science) during the same testing procedure.
What are parallel forms of a test?
When a test has two versions or forms that are interchangeable, they are termed parallel forms or equivalent forms of the same test. From a statistical/psychometric standpoint, each form must have the same mean, standard error, and other statistical components.
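As a rough illustration only, the snippet below compares the mean and standard deviation of two forms. The scores, the `tolerance` parameter, and the function name are hypothetical, and a real equivalence study relies on formal statistical procedures rather than a simple threshold.

```python
from statistics import mean, stdev

def forms_look_parallel(form_a, form_b, tolerance=1.0):
    """Descriptive check: are the means and SDs within `tolerance` points?

    Assumption: `tolerance` is an arbitrary illustrative cutoff,
    not a psychometric standard.
    """
    mean_gap = abs(mean(form_a) - mean(form_b))
    sd_gap = abs(stdev(form_a) - stdev(form_b))
    return mean_gap <= tolerance and sd_gap <= tolerance

# Hypothetical scores from comparable groups on three forms.
form_a = [70, 75, 80, 85, 90]
form_b = [71, 74, 81, 84, 90]
form_c = [50, 60, 70, 80, 90]

print(forms_look_parallel(form_a, form_b))  # True
print(forms_look_parallel(form_a, form_c))  # False
```

Forms A and B have the same mean and a similar spread, so they pass the check. Form C has a mean 10 points lower, so it does not.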
What are the most critical factors in test selection?
Validity and reliability. Validity refers to whether the test measures what it says it measures. Reliability tells how consistently a test measures an attribute.
Which is more important - validity or reliability?
Experts nearly always consider validity to be the number one factor in the construction of a test. A test must measure what it purports to measure. Reliability is then the second most important concern. A scale, for example, needs to actually measure body weight to be valid. To be reliable, it needs to give the same reading each time the same person steps on it.
What are the 5 types of validity?
Validity is a measure of whether a test really measures what it purports to measure – though note that a test that is valid for one population is not necessarily valid for another one. There are 5 basic types of validity:
- Content validity - does the test examine or sample the behavior under scrutiny in a comprehensive way? (E.g. an IQ test that only looks at memory can’t claim to have examined the entire range of intelligence.)
- Construct validity - this refers to a test’s ability to measure a theoretical concept like intelligence, self-esteem, artistic talent, etc. Any trait you cannot “directly” measure or observe can be considered a construct.
- Concurrent validity - this deals with how well the test compares to other instruments that are intended for the same purpose.
- Predictive validity - also known as empirical validity, this reflects the test’s ability to predict future behavior based on established criteria. Sometimes concurrent validity and predictive validity are lumped together under the title of “criterion validity”.
- Consequential validity - this tries to ascertain the social implications of using tests.
Can you have reliability without validity?
Yes. A test can be reliable but not valid, e.g. a scale that consistently reads 109 lbs for someone who weighs 140 lbs. So a test can have a high reliability coefficient but still have a low validity coefficient. Reliability places a ceiling on validity but validity doesn’t set limits on reliability.
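The scale example and the "ceiling" idea can be sketched as follows. The readings are hypothetical. The square-root relationship is a standard result from classical test theory rather than something stated on this card: a test's validity coefficient cannot exceed the square root of its reliability coefficient.

```python
import math
from statistics import mean

# Hypothetical repeated readings for a person whose true weight is 140 lbs.
readings = [109, 109, 109, 109]
true_weight = 140

spread = max(readings) - min(readings)  # 0: the scale is perfectly consistent
bias = mean(readings) - true_weight     # -31: it is consistently wrong
print(spread, bias)

def max_validity(reliability):
    """Upper bound on the validity coefficient under classical test theory."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return math.sqrt(reliability)

ceiling = max_validity(0.81)
print(round(ceiling, 2))  # 0.9
```

A reliability of .81 caps validity at .90, but nothing forces validity to get anywhere near that cap, which is why the scale above can be reliable without being valid.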
What is face validity?
Face validity merely tells you whether a test looks like it measures the intended trait. E.g. does the Wechsler appear to be an IQ test?
What is incremental validity?
This has been used to describe a number of testing phenomena: it has been used to describe the process by which a test is refined and becomes more valid as contradictory items are dropped. Incremental validity also refers to a test’s ability to improve predictions when compared to existing measures that purport to facilitate selection in business or educational settings. When a test has incremental validity, it provides you with additional valid information that was not attainable via other procedures.
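One common way to quantify the second meaning is the gain in explained variance (R squared) when the new test is added to an existing predictor. The sketch below uses the standard two-predictor formula for R squared. The correlation values are hypothetical, and this is only one of several ways the idea is put into practice.

```python
def incremental_r_squared(r_y1, r_y2, r_12):
    """Gain in R^2 from adding predictor 2 to predictor 1.

    r_y1: correlation between the existing measure and the criterion
    r_y2: correlation between the new test and the criterion
    r_12: correlation between the two predictors
    """
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    r2_both = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return r2_both - r_y1**2

# Hypothetical correlations for a selection setting.
gain = incremental_r_squared(r_y1=0.5, r_y2=0.4, r_12=0.3)
print(round(gain, 3))  # 0.069
```

In this made-up case the new test explains roughly 7% more of the criterion variance than the existing measure did alone.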
What is synthetic validity?
Synthetic validity is derived from the word “synthesized”.
This is a technique for inferring the validity of a selection test or other predictor of job performance from a job analysis. It involves systematically analyzing a job into its elements, estimating the validity of the test or predictor in predicting performance on each of these elements, and then combining the validities for each element to form an estimate of the validity of the test or predictor for the job as a whole.
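The combining step can be sketched as below. The card does not say how the element validities are combined, so the importance-weighted average used here is an assumption, and the job elements, weights, and validity estimates are all invented for illustration.

```python
def synthetic_validity(elements):
    """Importance-weighted average of element-level validity estimates.

    elements: list of (importance_weight, estimated_validity) pairs.
    Assumption: a weighted average is one plausible combining rule,
    not the only one in use.
    """
    total_weight = sum(weight for weight, _ in elements)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weight * validity for weight, validity in elements) / total_weight

# Hypothetical job analysis: (importance of element, validity of test for it).
job_elements = [
    (0.5, 0.40),  # e.g. data entry accuracy
    (0.3, 0.30),  # e.g. scheduling
    (0.2, 0.10),  # e.g. customer contact
]

estimate = synthetic_validity(job_elements)
print(round(estimate, 2))  # 0.31
```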