PSYC 549 Applied Measurement Techniques Flashcards

Question 1

Q

Achievement Test

needs ex

Answer

A

a test designed to measure an individual’s level of knowledge in a particular area; generally used in schools and educational settings. Unlike an aptitude test, which measures a person’s ability to learn something, an achievement test focuses specifically on how much a person already knows about a specific topic. It measures an individual’s previous learning.

Question 2

Q

Aptitude Test

needs ex

Answer

A

measures a person’s potential to learn or acquire specific skills. Often used to gauge high school students’ potential for college. These tests are prone to bias.

Question 3

Q

Assessment Interview

Answer

A

an initial interview in which the counselor gathers information about the patient and begins to form a conceptualization of their case and presenting problems.
May be structured, in which case a certain order of questions and format is strictly adhered to, or unstructured, in which case the interviewer is free to follow their own course of questioning.
Structured interviews are generally more reliable and valid, but lack the freedom of unstructured interviews to pursue a topic of interest or follow an instinct.

EXAMPLE: A therapist prefers to use a mix of structured and unstructured techniques for his assessment interviews. Structured questioning ensures that he assesses important areas (e.g., suicide risk). He then moves to an unstructured format, which gives him the freedom to explore areas of interest. For instance, he notices that the client’s relationship with their mother seems like a sore topic, and he wants to explore this more.

Question 4

Q

Construct

Answer

A

complex, abstract concepts that are indirectly observed through a collection of related events; typically defined prior to conducting a study - can be easy or difficult to measure; characteristic which varies from individual to individual, but which is not directly observable.
The characteristic is an internal event or process that must be inferred from external behavior. Constructs may be derived from theory, research, or observation
Tests generally are designed to measure an internal construct

EXAMPLE: The counselor administered a paper and pencil assessment measure that solicited responses related to fidgeting, excessive worrying, and difficulty concentrating, all representing the construct of anxiety. Anxiety can be measured indirectly by assessing the prevalence of these behaviors.

Question 5

Q

Criterion referenced scoring/tests

needs ex.

Answer

A

Similar to an achievement test. A test that describes the specific types of skills, tasks, or knowledge of an individual relative to a well-defined mastery criterion. Only what/how a person does matters; performance is not compared to a norm (other test-takers). To establish a cut-off score, the test is given to 2 groups: a group that has knowledge in the area/has been taught and a group that does not. The least frequent score across the two groups, the antimode, establishes the point at which mastery begins.

Ex: driver’s test

Question 6

Q

Criterion related validity

needs ex

Answer

A

the extent to which a measure is related to an outcome, such as what a person can do (ability, performance); can be predictive or concurrent; evidence is provided by a high correlation between a test and a well-defined standard (the criterion)

Question 7

Q

Cross validation

Answer

A

The process of evaluating a test or a regression equation on a sample other than the one used in the original studies. The best way to ensure that proper inferences are being made. A technique for estimating the performance of a predictive model by applying the test to another, independent group; asks whether the model/assessment is generalizable.

EXAMPLE: The researchers created a new test to measure anxiety in children, and administered it to two new samples of children, separate from the sample it was originally tested on. This cross-validation was necessary because chance and other factors such as cultural differences and SES may have influenced the original validation.

Question 8

Q

Normal curve

Answer

A

the bell-shaped curve produced by a normally distributed population. It is symmetrical around the mean. Scores from large random samples on many traits tend to approximate a normal curve.

EXAMPLE: The child psychologist tested the adolescent’s IQ and found that it was 165. Because IQ is normally distributed, this score falls more than 3 SDs above the mean on the normal curve, placing him above the 99th percentile.
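
The position of a score on the normal curve can be sketched in a few lines of Python. This is only an illustration: the card does not state the scale, so the mean of 100 and SD of 15 below are assumptions, and the helper names `z_score` and `percentile` are made up for this sketch.

```python
from statistics import NormalDist

# ASSUMPTION: a scale with mean 100 and SD 15 (not stated on the card).
IQ_MEAN = 100
IQ_SD = 15


def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd


def percentile(raw, mean, sd):
    """Percent of a normal distribution falling at or below the raw score."""
    return NormalDist(mu=mean, sigma=sd).cdf(raw) * 100


z = z_score(165, IQ_MEAN, IQ_SD)
pct = percentile(165, IQ_MEAN, IQ_SD)
print(f"z = {z:.2f} SDs above the mean, percentile = {pct:.3f}")
```

Under these assumed values, a score at the mean lands at the 50th percentile, and a score of 165 lands well beyond 3 SDs above it.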

Question 9

Q

Norm referenced scoring/testing

Answer

A

related to testing; a test in which each test-taker’s results are compared to norms.
Norms are performances by defined groups on a given test
not standards, but rather are what a typical performance or result on a test looks like, based on a sample of results
Tests should be normed on a sample that is reflective of the population for which the test is intended. Can be problematic when tests are not normed with a culturally diverse population
Ex: cognitive tests, height and weight in children as assessed by pediatricians

EXAMPLE: The child psychologist tested the adolescent’s IQ and found that it was 165. Because IQ is normally distributed, this score falls more than 3 SDs above the mean on the normal curve, placing him above the 99th percentile. IQ testing is an example of norm referenced scoring/testing because an individual’s score is always interpreted in terms of typical performance/results.

Question 10

Q

Objective tests

Answer

A

context of testing; unbiased, structured tests
unambiguous stimuli and answers are scored quantitatively
not open to interpretation; there is a correct and a wrong answer
Clearly stated questions and answers
No subjective element, therefore not influenced by rater variables

EXAMPLE: The psychologist found that when assessing clients with Borderline Personality Disorder, objective tests of personality, such as the MMPI, were more valid in providing personality information than projective tests, such as the Rorschach Test, in which his own personal bias or judgment could skew the results and so reduce the reliability and validity of the measure. The objective tests were also easier to score.

Question 11

Q

Projective tests

Answer

A

context of testing; tests in which the test-taker is asked to provide a spontaneous response to ambiguous stimuli, rather than choosing an answer from provided response options
Based on projective hypothesis that says when people attempt to understand an ambiguous or vague stimulus, their interpretation of that stimulus reflects their needs, feelings, experiences, thought processes
Most often personality tests
Have fallen out of favor in recent years.
Tests include the Rorschach inkblot test and the TAT among others.
Usually these types of tests require extensive training, and agreement between evaluators tends to be low
Most fall flat when psychometric properties are examined, i.e. low reliability and low validity

EXAMPLE: You are seeing a client and you ask them to interpret a black ‘blob’ from the Rorschach inkblot test. Under the projective hypothesis, the client saying that she sees a crab in the image might be indicative of her mood at the time of testing.

Question 12

Q

Reliability (types of)

Answer

A

extent to which a test or measure yields consistent results across administrations; extent to which scores are free from measurement error
foundational characteristic of “psychometric soundness”
There are four main types of reliability
Inter-Rater Reliability examines the degree of consistency between different raters’ scorings
Correlation or agreement between those scores (e.g., Kappa statistic)
Test-Retest Reliability examines consistency of a measure from one time to another
Same test given at two points in time
Correlation between those scores obtained by the same person on 2 occasions
Assumes trait does not change between time 1 and 2 so timing important
Interval between measurements must be considered
Parallel Forms Reliability examines the consistency of the results of two tests constructed in the same way from the same content domain
Tests must be very very similar!
Correlation between the equivalent forms of the test
Internal Consistency Reliability examines the consistency of items within a test
Done via split-half, KR20, and Alpha
Split-half is when test is split in half and the two halves are correlated
Internal Reliability: extent to which a measure is consistent within itself i.e. split-half, KR20, & Alpha
External Reliability: extent to which a measure varies from one use to another i.e. inter-rater, test-retest, and parallel forms
EXAMPLE: While developing a new version of an IQ test, researchers gave the test to the same group of subjects at several different times to evaluate the instrument’s test-retest reliability.
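
The internal consistency measures listed above (split-half and Alpha) can be sketched as follows. This is a minimal illustration, not a full psychometric routine: the item scores are hypothetical, the split is an odd/even item split (one of several possible splits), and the function names are invented for this sketch.

```python
from statistics import mean, pvariance

# HYPOTHETICAL item scores: one row per respondent, one column per item.
scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def split_half(rows):
    """Correlate each respondent's total on odd items with their total on even items."""
    odd_totals = [sum(r[0::2]) for r in rows]
    even_totals = [sum(r[1::2]) for r in rows]
    return pearson_r(odd_totals, even_totals)


def cronbach_alpha(rows):
    """Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / variance of totals)."""
    k = len(rows[0])
    item_vars = [pvariance([r[i] for r in rows]) for i in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


half_r = split_half(scores)
alpha = cronbach_alpha(scores)
print(f"split-half r = {half_r:.2f}, alpha = {alpha:.2f}")
```

The same `pearson_r` helper would also serve for test-retest or parallel forms reliability, where the two lists are scores from time 1 and time 2, or from form A and form B.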

Question 13

Q

Standard deviation

Answer

A

related to statistics and measurement techniques; the average amount that scores differ from the mean score of a distribution
Found by taking the square root of the variance, which is the average squared deviation around the mean
How spread out the data are; a highly useful measure of the variability of a set of scores
gives an approximation of how much a typical score is above or below the average score; represents the spread of scores
Never negative; equals 0 only when every score is identical, which rarely happens in practice because there’s almost always variability
In general, smaller SD = scores closer to mean; larger SD = larger distribution/spread

EXAMPLE: A psychologist administered a test to assess depression in a group of college students on a scale from 1-100. In the first group, the mean score was 70 and the standard deviation was 4. This means that the average student scored a 70 (indicated by the mean of 70) and also that most of them tended to score pretty close to 70, indicated by the standard deviation of 4. In a second group, the mean score was also 70 but the standard deviation was 20 - the high standard deviation indicates that there was a lot of variability.
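
A short sketch of the same idea in Python, using two made-up groups of scores that share a mean of 70 but differ in spread. The numbers are hypothetical and chosen only for illustration; `pstdev` from the standard library computes the population standard deviation (the square root of the average squared deviation around the mean).

```python
from statistics import mean, pstdev

# HYPOTHETICAL scores: both groups have the same mean but different spread.
group_1 = [66, 68, 70, 72, 74]
group_2 = [40, 55, 70, 85, 100]

for name, group in (("group 1", group_1), ("group 2", group_2)):
    print(f"{name}: mean = {mean(group)}, SD = {pstdev(group):.2f}")

# When every score is identical there is no variability, so the SD is 0.
no_spread = pstdev([70, 70, 70])
print(f"identical scores: SD = {no_spread}")
```

The means alone cannot tell the two groups apart; the SD is what shows that scores in the second group sit much farther from 70 on average.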

Question 14

Q

Standard scores

Answer

A

raw scores that are converted to z-scores that have a fixed mean and SD; convert raw scores into standard scores to make objective comparisons about the data; the mean z-score is always 0 and the SD is always 1

EXAMPLE: You have two clients in treatment for depression. They are both going through CBT, and you want to compare the baseline “severity” of their depression. They took different measures of depression, the BDI and the QIDS (Quick Inventory of Depressive Symptomatology). You convert their scores to standard scores, or z-scores, to compare the two.
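
The conversion in this example is z = (raw score - mean) / SD. A minimal sketch follows; the norm means and SDs are placeholder values invented for illustration and are NOT the published norms for the BDI or the QIDS, and the raw scores are likewise hypothetical.

```python
def z_score(raw, mean, sd):
    """Convert a raw score to a z-score (mean 0, SD 1)."""
    return (raw - mean) / sd


# PLACEHOLDER norm values for illustration only, not real BDI or QIDS norms.
BDI_NORM = {"mean": 20.0, "sd": 8.0}
QIDS_NORM = {"mean": 10.0, "sd": 4.0}

# Hypothetical raw baseline scores for the two clients.
client_a_z = z_score(32, **BDI_NORM)   # client A took the BDI
client_b_z = z_score(14, **QIDS_NORM)  # client B took the QIDS

print(f"client A: z = {client_a_z}, client B: z = {client_b_z}")
```

Because both z-scores are expressed in SD units from their own test’s mean, they can be compared directly even though the raw scores come from different scales.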

Question 15

Q

Test Bias

Answer

A

in the context of psychometrics; a systematic error in the measurement process that differentially influences scores for identified groups.
said to occur when a test yields higher or lower scores on average when it is administered to specific criterion groups, such as people of a particular race or sex, than when administered to an average population sample
The question then becomes, does this occur because of a real difference in the attribute being measured or is this due to cultural test bias?
If latter, issue of fairness
Can be due to poor standardization sample

EXAMPLE: An African American high school student comes to therapy after having been diagnosed with Social Anxiety Disorder (SAD) by a school psychologist. The school psychologist administered a Fear Questionnaire to him after he was referred to her by teachers who said the student seemed very nervous in class and did not interact with others. He would also miss classes. The student’s mother suggested he go to a therapist with a multicultural background. The student had just started high school at a school where he was one of the only people of color. The therapist decided to further assess the student for SAD because she knows there is a possibility of test bias in assessments that do not account for differences in experiences of multicultural individuals.

Question 16

Q

Validity (types of)

Answer

A

a psychometric property; the extent to which you’re measuring the construct you intended to measure; in general a validity coefficient of 0.3-0.4 is considered adequate
Content Validity: degree to which a measure represents all aspects of a given construct ; how well a measure encompasses the full domain of what it is trying to measure
Criterion Validity: extent to which the test corresponds with a particular criterion against which it is compared; how well one measure predicts the outcome of another measure
Indicated by high correlations between a test and a well-defined measure
For example, you might examine a suicide risk scale and suicide rates/attempts or driver skill test and number of infractions on the road
Provides evidence for predictive validity
Concurrent Validity: extent to which a new measure correlates with a previously established/validated measure
Construct Validity: the degree to which the test measures the construct or trait it intends to measure.
Convergent evidence: occurs when there is a high correlation between two or more tests that purport to assess the same construct
Discriminant evidence: occurs when two tests of unrelated constructs have low correlations; that is, they discriminate between two qualities that are not related to each other
Face Validity: a logical rather than statistical quality; the extent to which a test is subjectively viewed as covering the concept it purports to measure

In the context of research…are results trustworthy and meaningful?
Internal Validity: whether the effects observed in a study are due to the manipulation of the independent variable and not some other factor; a measure of how sound the research is
depends largely on the procedures of a study and how rigorously it is performed - did it avoid confounds?
External Validity: the extent to which the results of a study can be generalized to other situations and to other people
Ecological validity, an aspect of external validity, refers to whether a study’s findings can be generalized to the real world
Rigorous testing methods can weaken external validity

EXAMPLE: A business psychologist has developed a new IQ test that requires only 5 minutes per subject, as compared to 90 min for the test acknowledged as the gold standard. He administers both tests to a sample population and then compares the scores. The correlation between the two sets of scores is high, indicating that the new test has concurrent validity.
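
The comparison in this example comes down to a correlation between two sets of scores from the same people. A minimal sketch, assuming made-up scores for five test-takers (the data and the `pearson_r` helper are illustrative only):

```python
from statistics import mean

# HYPOTHETICAL scores for the same five people on both tests.
gold_standard = [95, 100, 110, 120, 130]
new_test = [97, 99, 112, 118, 131]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


validity_coefficient = pearson_r(new_test, gold_standard)
print(f"correlation with the gold standard: r = {validity_coefficient:.2f}")
```

A coefficient close to 1 for data like these would count as evidence of concurrent validity; with real data the sample would need to be far larger than five people.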

Question 17

Q

Variance

Answer

A

part of statistics and data analysis; a measure of variability; the average squared deviation around the mean
Must be squared because sum of deviations around mean would always = 0
Widely referenced and useful measure for statistical analysis but NOT useful as a descriptive statistic, because it is expressed in squared units

EXAMPLE: A clinical psychologist is doing research on a new tx for substance use disorders. She conducts an experiment in which she compares the tx group to a group that received the gold standard tx and to a control group. At first glance it looks like the level of symptomatic reduction is the same in the new tx and gold standard tx groups, but upon further inspection the psychologist notes that the new tx group has a large amount of variance. That is, some people saw significant sx reduction and others saw very minimal change. She needs to investigate this further. What is it that makes the new tx beneficial for some?
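
The situation in this example can be sketched with made-up symptom-reduction scores: two groups with the same mean but very different variance. The data are hypothetical; the sketch also checks why deviations have to be squared, since the raw deviations around the mean sum to 0.

```python
from statistics import mean, pvariance

# HYPOTHETICAL symptom-reduction scores; both groups average 16.
new_tx = [2, 30, 4, 28, 16]
gold_standard_tx = [14, 16, 18, 15, 17]

# Raw deviations around the mean cancel out, which is why they are squared.
new_mean = mean(new_tx)
deviations = [score - new_mean for score in new_tx]
print(f"sum of deviations: {sum(deviations)}")

# Variance is the average squared deviation around the mean.
new_var = sum(d ** 2 for d in deviations) / len(new_tx)
gold_var = pvariance(gold_standard_tx)
print(f"new tx variance = {new_var}, gold standard tx variance = {gold_var}")
```

Identical means hide the difference between the groups; the much larger variance in the new treatment group is the signal that some people changed a lot and others barely at all.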

(17 cards)