PSYC 549 Applied Measurement Techniques Flashcards
Achievement Test
needs ex
a test designed to measure an individual’s level of knowledge in a particular area; generally used in schools and educational settings. Unlike an aptitude test, which measures a person’s ability to learn something, an achievement test focuses specifically on how much a person knows about a specific topic. It measures an individual’s previous learning.
Aptitude Test
needs ex
measures a person’s potential to learn or acquire specific skills. Often used for measuring high school students’ potential for college. These are prone to bias.
Assessment Interview
an initial interview in which the counselor is gathering information about the patient and beginning to form a conceptualization of their case and presenting problems.
May be structured, in which case a certain order of questions and format is strictly adhered to, or they may be unstructured, in which the interviewer is free to follow their own course of questioning
Structured interviews are generally more reliable and valid, but lack the freedom of unstructured interviews to pursue a topic of interest or follow an instinct
EXAMPLE: A therapist prefers to use a mix of structured and unstructured techniques for his assessment interviews. Structured questioning ensures that he assesses important areas (e.g., suicide assessment). He then goes on to use an unstructured format, which gives him the freedom to explore areas of interest. For instance, he noticed the client’s relationship with their mother seems like a sore topic and he wants to explore this more.
Construct
complex, abstract concepts that are indirectly observed through a collection of related events; typically defined prior to conducting a study - can be easy or difficult to measure; characteristic which varies from individual to individual, but which is not directly observable.
The characteristic is an internal event or process that must be inferred from external behavior. Constructs may be derived from theory, research, or observation
Tests generally are designed to measure an internal construct
EXAMPLE: The counselor administered a paper and pencil assessment measure that solicited responses related to fidgeting, excessive worrying, difficulty concentrating - all representing the construct of anxiety. Anxiety can be measured indirectly by assessing the prevalence of these behaviors.
Criterion referenced scoring/tests
needs ex.
Similar to an achievement test. A test that describes the specific types of skills, tasks, or knowledge of an individual relative to a well-defined mastery criterion. Only what/how a person does matters; performance is not compared to a norm/standard. To establish a cut-off score, the test is given to 2 groups: a group that has knowledge in the area/has been taught and a group that does not. The least frequent score from either group (the antimode) establishes the point at which mastery begins.
Ex: driver’s test
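The cut-off procedure on this card can be sketched in Python. This is a rough illustration under stated assumptions, not a standard implementation: the scores are made up, the function name `antimode_cutoff` is hypothetical, scores are assumed to be integers, and the antimode is searched only between the two groups' most common scores.

```python
from collections import Counter

def antimode_cutoff(untaught_scores, taught_scores):
    """Least frequent pooled score lying between the two groups' modes."""
    low_mode = Counter(untaught_scores).most_common(1)[0][0]
    high_mode = Counter(taught_scores).most_common(1)[0][0]
    lo, hi = sorted((low_mode, high_mode))
    # Pool both groups so frequencies reflect everyone who took the test.
    pooled = Counter(untaught_scores) + Counter(taught_scores)
    candidates = list(range(lo + 1, hi))
    if not candidates:
        raise ValueError("group modes are too close to locate an antimode")
    # Pick the rarest score; ties are broken toward the lower score.
    return min(candidates, key=lambda s: (pooled[s], s))

# Made-up integer test scores, for illustration only.
untaught = [2, 3, 3, 4, 4, 4, 5, 5, 6, 6]
taught = [7, 8, 8, 9, 9, 9, 10, 10]

cutoff = antimode_cutoff(untaught, taught)
print(cutoff)  # 7
```

In this toy data the untaught group peaks at 4 and the taught group at 9; the score 7 is the rarest value between those peaks, so it would serve as the mastery cut-off.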
Criterion related validity
needs ex
the extent to which a measure is related to an outcome; measures what a person can do/ability; performance; can be predictive or concurrent; evidence is provided by high correlation between a test and well-defined standard
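The "high correlation between a test and a well-defined standard" can be illustrated with a Pearson correlation. The sketch below uses made-up numbers; the variable names and the idea of pairing each person's test score with a later criterion score are assumptions for illustration only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: one test score and one criterion score per person.
test_scores = [1, 2, 3, 4, 5]
criterion_scores = [2, 4, 5, 4, 5]

validity_coefficient = pearson_r(test_scores, criterion_scores)
print(round(validity_coefficient, 2))  # 0.77
```

If the criterion is collected at the same time as the test, the coefficient speaks to concurrent validity; if the criterion is collected later, it speaks to predictive validity.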
Cross validation
The process of evaluating a test or a regression equation for a sample other than the one used in the original studies. The best way to ensure that proper inferences are being made. A technique for estimating the performance of a predictive model by applying the test to another independent group; asks whether the model/assessment is generalizable.
EXAMPLE: The researchers created a new test to measure anxiety in children, and administered it to two new samples of children, different from the sample it was originally tested on. This cross-validation was necessary because chance and other factors such as cultural differences and SES may have influenced the original validation.
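One way to picture cross-validating a regression equation is to fit it on the original sample and then check its prediction error on an independent sample. The sketch below assumes a simple one-predictor linear regression and made-up data; the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(x, y, slope, intercept):
    """Average absolute gap between predicted and actual criterion scores."""
    return sum(abs((intercept + slope * a) - b) for a, b in zip(x, y)) / len(x)

# Made-up original (derivation) sample.
orig_x = [1, 2, 3, 4, 5]
orig_y = [2.1, 3.9, 6.2, 7.8, 10.0]

# Made-up independent sample, not used to fit the equation.
new_x = [2, 4, 6]
new_y = [4.2, 8.1, 11.5]

slope, intercept = fit_line(orig_x, orig_y)
error_original = mean_abs_error(orig_x, orig_y, slope, intercept)
error_new = mean_abs_error(new_x, new_y, slope, intercept)
print(round(error_original, 2), round(error_new, 2))  # 0.12 0.24
```

The equation predicts the new sample less accurately than the sample it was built on, which is the kind of drop cross-validation is meant to reveal.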
Normal curve
is the bell-shaped curve which is created by a normal distribution of a population. It is symmetrical in nature. Scores from large random samples often approximate a normal curve.
EXAMPLE: The child psychologist tested the adolescent’s IQ and discovered that the child’s IQ was 165, placing him in the 99th percentile, more than 3 SDs above the mean on the normal curve because IQ is normally distributed.
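The position of a score on the normal curve can be checked with Python's standard library. This sketch assumes the common IQ scaling of mean 100 and SD 15, which the card does not state, so treat those values as an assumption.

```python
from statistics import NormalDist

# Assumed scaling (not given on the card): mean 100, SD 15.
IQ_MEAN, IQ_SD = 100, 15
iq_curve = NormalDist(mu=IQ_MEAN, sigma=IQ_SD)

score = 165
z = (score - IQ_MEAN) / IQ_SD            # distance from the mean in SD units
percentile = iq_curve.cdf(score) * 100   # percent of the curve below the score

# Share of the curve within 1 SD of the mean (85 to 115).
within_one_sd = iq_curve.cdf(115) - iq_curve.cdf(85)

print(round(z, 2))              # 4.33
print(percentile > 99)          # True
print(round(within_one_sd, 3))  # 0.683
```

Under these assumptions a score of 165 sits a little over 4 SDs above the mean, consistent with the card's "more than 3 SDs" and "99th percentile".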
Norm referenced scoring/testing
related to testing; a test in which each test-taker’s results are compared to norms.
Norms are performances by defined groups on a given test
not standards, but rather are what a typical performance or result on a test looks like, based on a sample of results
Tests should be normed on a sample that is reflective of the population for which the test is intended. Can be problematic when tests are not normed with a culturally diverse population
Ex: cognitive tests, height and weight in children as assessed by pediatricians
EXAMPLE: The child psychologist tested the adolescent’s IQ and discovered that the child’s IQ was 165, placing him in the 99th percentile, more than 3 SDs above the mean on the normal curve because IQ is normally distributed. IQ testing is an example of norm referenced scoring/testing because an individual’s score is always interpreted in terms of typical performance/results.
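Interpreting a score against norms can be sketched as a percentile rank within a norm sample. The norm scores below are made up, and counting ties as half is one common convention rather than the only one.

```python
def percentile_rank(score, norm_scores):
    """Percent of the norm sample scoring below `score`, counting ties as half."""
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)

# Hypothetical norm sample of 10 test-takers.
norms = [85, 90, 95, 100, 100, 105, 110, 115, 120, 130]

rank = percentile_rank(110, norms)
print(rank)  # 65.0
```

The same raw score of 110 would earn a different percentile rank against a different norm sample, which is why the norming sample needs to reflect the population the test is intended for.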
Objective tests
context of testing; unbiased, structured tests
unambiguous stimuli and answers are scored quantitatively
not open to interpretation; there is a correct and a wrong answer
Clearly stated questions and answers
No subjective element, therefore not influenced by rater variables
EXAMPLE: The psychologist found that when assessing clients with Borderline Personality Disorder, objective tests of personality (such as the MMPI) were more valid in providing personality information than projective tests (such as the Rorschach Test), in which his own personal bias or judgment could distort the test results, thus affecting the reliability and validity of the measure. The objective tests were also easier to score.
Projective tests
context of testing; tests in which the test-taker is asked to provide a spontaneous response to ambiguous stimuli, rather than choosing an answer from provided response options
Based on projective hypothesis that says when people attempt to understand an ambiguous or vague stimulus, their interpretation of that stimulus reflects their needs, feelings, experiences, thought processes
Most often personality tests
Have fallen out of favor in recent years.
Tests include the Rorschach inkblot test and the TAT among others.
Usually these types of tests require extensive training and show little evaluator agreement
Most fall flat when psychometric properties are examined, i.e., low reliability and low validity
EXAMPLE: You are seeing a client and you ask them to interpret a black ‘blob’ while using the Rorschach inkblot test. This is a projective test that suggests the client saying that she sees a crab in the image might be indicative of her mood at the time of testing.
Reliability (types of)
extent to which a test or measure yields consistent results across administrations; extent to which scores are free from measurement error
foundational characteristic of “psychometric soundness”
There are four main types of reliability
Inter-Rater Reliability examines the degree of consistency between different raters’ scorings
Correlation between those scores (Kappa statistic)
Test-Retest Reliability examines consistency of a measure from one time to another
Same test given at two points in time
Correlation between those scores obtained by the same person on 2 occasions
Assumes trait does not change between time 1 and 2 so timing important
Interval between measurements must be considered
Parallel Forms Reliability examines the consistency of the results of two tests constructed in the same way from the same content domain
Tests must be very very similar!
Correlation between the equivalent forms of the test
Internal Consistency Reliability examines the consistency of items within a test
Done via split-half, KR20, and Alpha
Split-half is when test is split in half and the two halves are correlated
Internal Reliability: extent to which a measure is consistent within itself i.e. split-half, KR20, & Alpha
External Reliability: extent to which a measure varies from one use to another i.e. inter-rater, test-retest, and parallel forms
EXAMPLE: While developing a new version of an IQ test, researchers gave the test to the same group of subjects at several different times to evaluate the instrument’s test-retest reliability.
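The reliability coefficients named on this card can be sketched with the standard library. All data are made up, the function names are hypothetical, the split-half version uses an odd/even item split (one of several possible splits), and the kappa shown is Cohen's kappa for two raters.

```python
from math import sqrt
from statistics import variance

def pearson_r(x, y):
    """Correlation used for test-retest, parallel forms, and split-half."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    p_chance = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

def split_half_r(item_scores):
    """Correlate each person's odd-item total with their even-item total."""
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    return pearson_r(odd, even)

def cronbach_alpha(item_scores):
    """Alpha from item variances and the variance of total scores."""
    k = len(item_scores[0])
    item_vars = [variance(col) for col in zip(*item_scores)]
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Rows are people, columns are items (made-up responses).
items = [[2, 3, 2, 3], [4, 4, 5, 4], [1, 2, 1, 2], [5, 4, 5, 5], [3, 3, 3, 2]]
# Two raters coding the same 10 observations (made-up codes).
rater_a = ["y", "y", "n", "n", "y", "n", "y", "n", "y", "y"]
rater_b = ["y", "n", "n", "n", "y", "n", "y", "y", "y", "y"]

kappa = cohen_kappa(rater_a, rater_b)
half_r = split_half_r(items)
alpha = cronbach_alpha(items)
print(round(kappa, 2), round(half_r, 2), round(alpha, 2))  # 0.58 0.92 0.95
```

For test-retest or parallel forms reliability, `pearson_r` would be applied directly to the two sets of scores (time 1 vs. time 2, or form A vs. form B).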
Standard deviation
related to statistics and measurement techniques; the average amount that scores differ from the mean score of a distribution
Found by taking the square root of the variance, which is the average squared deviation around the mean
How spread out the data are ; a highly useful measure of the variability of a set of scores
gives an approximation of how much a typical score is above or below the average score; represents the spread of scores
Never negative; an SD of 0 occurs only when every score is identical, which is rare in practice because there’s almost always variability
In general, smaller SD = scores closer to mean; larger SD = larger distribution/spread
EXAMPLE: A psychologist administered a test to assess depression in a group of college students on a scale from 1-100. In the first group, the mean score was 70 and the standard deviation was 4. This means that the average student scored a 70 (indicated by the mean of 70) and also that most of them tended to score pretty close to 70, indicated by the standard deviation of 4. In a second group, the mean score was also 70 but the standard deviation was 20 - the high standard deviation indicates that there was a lot of variability.
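The calculation described on this card (square root of the average squared deviation around the mean) can be written out directly. The two groups below are made-up scores that share a mean of 70; they are not the data from the example.

```python
from math import sqrt
from statistics import pstdev

def standard_deviation(scores):
    """Square root of the average squared deviation around the mean."""
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / len(scores)
    return sqrt(variance)

# Made-up groups with the same mean (70) but different spread.
tight_group = [66, 68, 70, 72, 74]
spread_group = [45, 60, 70, 80, 95]

sd_tight = standard_deviation(tight_group)
sd_spread = standard_deviation(spread_group)
print(round(sd_tight, 2), round(sd_spread, 2))  # 2.83 17.03

# The hand-written version matches the library's population SD.
print(abs(sd_tight - pstdev(tight_group)) < 1e-9)  # True
```

This divides by N, matching the card's "average squared deviation"; when estimating a population SD from a sample, the squared deviations are usually divided by N - 1 instead (`statistics.stdev`).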
Standard scores
raw scores that are converted to z-scores that have a fixed mean and SD; convert raw scores into standard scores to make objective comparisons about the data; the mean z-score is always 0 and the SD is always 1
EXAMPLE: You have two clients in treatment for depression. They are both going through CBT and you want to compare the baseline “severity” of depression. They took different measures of depression, the BDI and the QIDS (Quick Inventory of Depressive Symptomatology). You convert their scores to standard scores, or z-scores, to compare the two.
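The conversion to z-scores is z = (raw score - mean) / SD. The sketch below uses hypothetical means and SDs for two unnamed measures; they are NOT published BDI or QIDS norms and are only there to show the arithmetic.

```python
def z_score(raw, norm_mean, norm_sd):
    """Express a raw score as SD units above or below the norm mean."""
    return (raw - norm_mean) / norm_sd

# Hypothetical norms for two different depression measures (illustration only).
client_a_z = z_score(30, norm_mean=20, norm_sd=8)   # measure A
client_b_z = z_score(16, norm_mean=10, norm_sd=5)   # measure B

print(client_a_z, client_b_z)  # 1.25 1.2
```

Although the raw scores (30 and 16) are not comparable, the z-scores are: under these assumed norms, client A is slightly further above the mean than client B.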
Test Bias
in the context of psychometrics; a systematic error in the measurement process that differentially influences scores for identified groups.
said to occur when a test yields higher or lower scores on average when it is administered to specific criterion groups, such as people of a particular race or sex, than when administered to an average population sample
The question then becomes, does this occur because of a real difference in the attribute being measured or is this due to cultural test bias?
If latter, issue of fairness
Can be due to poor standardization sample
EXAMPLE: An African American high school student comes to therapy after having been diagnosed with Social Anxiety Disorder (SAD) by a school psychologist. The school psychologist administered a Fear Questionnaire to him after he was referred to her by teachers who said the student seemed very nervous in class and did not interact with others. He would also miss classes. The student’s mother suggested he go to a therapist with a multicultural background. The student had just started high school at a school where he was one of the only people of color. The therapist decided to further assess the student for SAD because she knows there is a possibility of test bias in assessments that do not account for differences in experiences of multicultural individuals.