Midterm Flashcards
Many of the tasks that David Wechsler used in his WAIS, WAIS-R, WAIS-III, and WAIS-IV were adapted from what sources?
The Army Alpha, Army Beta, and Army Performance Scale Examination
Updating the WAIS-IV’s theoretical foundations was achieved by considering all of the following theoretical constructs EXCEPT which one?
Phonological processing!
It was achieved by considering Fluid Reasoning, Working Memory, and Processing Speed
What was the major structural change implemented from the WAIS-III to the WAIS-IV?
The WAIS-IV eliminated the Verbal and Performance IQs, but the four indexes were retained
Which Subtest is NOT new to the WAIS-IV?
Symbol Search
Which Subtests ARE new to the WAIS-IV?
Figure Weights, Visual Puzzles, and Cancellation are all new
Which WAIS-IV subtests offer process scores?
Digit Span, Block Design, and Letter-number Sequencing have process scores
Which index includes the subtest with the lowest loadings on the general (g) factor?
Note: General intelligence or general mental ability is denoted by g
Processing Speed (Coding, Symbol Search, and Cancellation)
What is the Flynn Effect?
The phenomenon that IQ test norms in the United States get out of date at the rate of about 3 points per decade!
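A quick arithmetic sketch of the rate above. The function name and the assumption that the drift is linear (3 points per decade, so 0.3 points per year) are illustrative only:

```python
def flynn_inflation(years_since_norming, points_per_decade=3.0):
    """Approximate IQ points by which outdated norms inflate scores.

    Assumes a constant (linear) drift, which is a simplification.
    """
    return points_per_decade * years_since_norming / 10.0


print(flynn_inflation(20))  # 6.0
```

Under that assumption, a test normed 20 years ago would overestimate scores by roughly 6 points.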
True or False? Analyses of ethnic differences on the WAIS-IV have shown that ethnicity accounts for more variance in IQ than socioeconomic status.
False!
Socioeconomic status and an array of other background, behavioral, and personal variables impact IQ far more than ethnicity alone
What is Galton known for?
Galton developed the first comprehensive individual test of intelligence, composed of sensorimotor tasks
What was the FIRST Wechsler test series?
Wechsler-Bellevue Intelligence Scale in 1939
What does the Verbal Comprehension Index (VCI) Measure?
Measures Crystallized Intelligence (Gc) and represents the ability to reason with previously learned information.
What are the four subtests that make up the Verbal Comprehension Index (VCI)?
Similarities
Vocabulary
Information
Comprehension (not a core subtest)
What subtests make up the Perceptual Reasoning Index (PRI)?
Block Design
Matrix Reasoning
Visual Puzzles
Figure Weights (not a core subtest)
Picture Completion (not a core subtest)
What does PRI measure?
Measures visual processing and fluid reasoning (Gv-Gf), and represents the ability to analyze and synthesize visual stimuli and reason with them
What does Working Memory Index (WMI) measure?
Measures short-term memory (Gsm), and represents ability to comprehend and hold or transform information in immediate awareness and then use it in a few seconds
What subtests make up Working Memory Index (WMI)?
Digit Span
Arithmetic
Letter-number Sequencing (not core)
What does Processing Speed Index (PSI) measure?
Measures Processing Speed (Gs) and represents ability to perform simple tasks quickly
When is the General Ability Index (GAI) used?
It is a good measure of global ability when the FSIQ is not interpretable
True or False? The examiner must memorize all directions in order to administer the test properly.
False! We can read the directions, but we still need to be very familiar with the test, procedures and prompts.
When applying the reverse rule, you must…
Go back to administer items until two consecutive items are correct, including the item with which you initially started
Which core subtests require the use of a stopwatch?
Block Design, Arithmetic, Symbol Search, Visual Puzzles, Coding
What is a Basal?
Establishing a basal means that a test-taker has answered a set number of items correctly so that they can continue with the rest of the test/subtest.
What is a Ceiling?
A ceiling is reached when a test-taker has answered an established number of questions incorrectly in a row.
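The ceiling idea can be sketched as a simple run counter. This is only an illustration: the function name is hypothetical, scores of 0 are assumed to mean "incorrect", and the required number of consecutive misses is a parameter because the actual discontinue rule differs by subtest and comes from the manual.

```python
def ceiling_index(item_scores, consecutive_misses):
    """Return the 0-based index of the item where the ceiling is reached.

    item_scores: item scores in administration order (0 = incorrect).
    consecutive_misses: hypothetical discontinue threshold.
    Returns None if the ceiling is never reached.
    """
    run = 0
    for index, score in enumerate(item_scores):
        run = run + 1 if score == 0 else 0  # reset the run on any correct item
        if run == consecutive_misses:
            return index
    return None


print(ceiling_index([1, 1, 0, 1, 0, 0, 0, 1], 3))  # 6
```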
All of the following are principles of the intelligent testing philosophy:
- Subtests measure what the individual has learned
- Subtests are samples of behavior and are not exhaustive
- Test batteries are optimally useful when interpreted from a theoretical model
- IQ tests assess mental functioning under fixed experimental conditions
- Hypotheses generated from the test profile should be supported with data from multiple sources
If the FSIQ is not interpretable, then…
Determine if the GAI can be calculated and interpreted as a reliable and valid estimate of a person’s general intellectual ability
An abnormally large discrepancy between PRI and VCI means it is…
Rare among the normal population
What is Psychological Assessment?
A comprehensive examination of psychological functioning that involves collecting, evaluating, and integrating test results and collateral information, and reporting information about an individual
What is Psychological Testing?
An objective and standardized measure of a sample of behavior
Pillars of Assessment
Norm referenced tests (this is what the IQ test is)
Interviews
Observations
Informal Assessment Procedures
What are some differences between Assessments and Testing?
o Complexity
▪ Testing- Unidimensional
▪ Assessment- multidimensional
o Duration
▪ Testing- Few minutes, hours
▪ Assessment- Hours to days
o Cost
▪ Testing- Inexpensive, but can be expensive
▪ Assessment- professional costs
o Purpose
▪ Testing- Decision making
▪ Assessment- Referral Question, problem
Why is assessment important?
o Clinical versus statistical prediction controversy
o Clinical interviews are not really that reliable because we are all biased and human
o Statistical prediction consistently outperforms clinical judgment
Psychological tests are used for…
…measuring characteristics of humans that pertain to overt (observable) and covert behavior. They are a set of items that are designed to measure characteristics of human beings that pertain to behavior
What is Binet known for?
He developed the first intelligence test standardized on a representative sample
Achievement tests
Measure previous learning
Aptitude tests
measure potential for acquiring a specific skill
Intelligence tests
measure an individual’s potential to solve problems, adapt to changing circumstances, and profit from experience
Personality tests
measure typical behavior, including traits, temperaments and dispositions
Statistics serve two important functions. What are they?
o Used for the purpose of description
o Used to make inferences, which are logical deductions about events that cannot be observed directly
Norm-referenced tests
Evaluate individuals relative to normative groups that depend on demographics such as age and culture
Norms
used to relate a score to a particular distribution for a subgroup of a population. Ex. Norms are used to describe where a child is (score) on some measure relative to other children of the same age (subgroup of the population)
Criterion-referenced test
describes the specific types of skills, tasks, or knowledge of an individual relative to a well-defined mastery criterion.
Descriptive Statistics
Provide a concise description of a collection of quantitative information
Inferential Statistics
Used to make inferences from observations of a sample to a larger group (population)
Measurement
Defined as the application of rules for assigning numbers to objects (rating scale for wine)
Magnitude
Represents quantity (A scale of height, we can say John is taller than Fred)
Equal Intervals
A scale has the property of equal intervals if the difference between two points at any place on the scale has the same meaning as the difference between two other points (A ruler, but not an IQ test)
Absolute 0
Obtained when nothing of the property being measured exists
Nominal Scales
▪ Not really scales
▪ For naming (such as naming teams)
▪ Can create frequency distributions, but not mathematical calculations
Ordinal Scales
▪ For ranking
▪ Has the property of magnitude
▪ Can rank (1, 2, 3) from taller to shorter
Interval Scales
▪ Property of equal intervals and Magnitude (e.g., temperature in degrees Fahrenheit or Celsius)
▪ But cannot reveal ratios
▪ Does not have an absolute zero
Ratio Scales
▪ Properties of magnitude, Equal Intervals, and Absolute 0
▪ Speed of travel
What are Frequency Distributions?
Displays of scores on a variable or a measure that reflect how frequently each value was obtained
Positive Skew
The tail goes off toward the higher or positive side of the X axis
Class Interval
The distance between each unit on the horizontal axis
Percentile Ranks
Answers the question, “what percent of the scores fall below a particular score”
You actually exclude the score of interest (it is a measure of relative performance)
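A minimal sketch of the percentile-rank calculation described above, assuming the common form "number of scores below the score of interest, divided by the total number of scores, times 100". The function name and the sample scores are made up for illustration.

```python
def percentile_rank(scores, score_of_interest):
    """Percent of scores that fall strictly below score_of_interest."""
    below = sum(1 for s in scores if s < score_of_interest)  # excludes the score itself
    return 100.0 * below / len(scores)


scores = [50, 60, 70, 80, 90]
print(percentile_rank(scores, 80))  # 60.0
```

Here three of the five scores fall below 80, so a score of 80 has a percentile rank of 60.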
Percentiles
The specific scores or points within a distribution
Mean
The average score or result of a sample.
Standard Deviation
a measure of how dispersed the data is in relation to the mean
Variance
The average of the squared differences from the mean.
Z-Score
The deviation of a score from the mean in standard deviation units.
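The four cards above (Mean, Standard Deviation, Variance, Z-Score) build on one another, which a short worked sketch makes visible. It assumes the "average of squared differences" form of the variance, that is, dividing by N; estimating from a sample often divides by N - 1 instead. The scores are invented.

```python
import math


def mean(scores):
    return sum(scores) / len(scores)


def variance(scores):
    """Average of the squared differences from the mean (divides by N)."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores) / len(scores)


def standard_deviation(scores):
    return math.sqrt(variance(scores))


def z_score(x, scores):
    """Deviation of x from the mean, in standard deviation units."""
    return (x - mean(scores)) / standard_deviation(scores)


scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(scores), variance(scores), standard_deviation(scores), z_score(9, scores))
# 5.0 4.0 2.0 2.0
```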
Expectancy Effects
Beliefs held by people administering and scoring tests might also get translated into inaccurate test scores
Data can sometimes be affected by what an experimenter expects to find. What is the name of this effect?
Rosenthal Effect!
The Point Scale Concept
In an age-scale format, the arrangement of items had nothing to do with their content and subjects didn’t receive points for each task completed. In a point scale, points are assigned to each item so the test yields a total overall score and a score for each content area
Performance Scale Concept
Requires the subject to do something rather than just answer questions. This way, the test can directly compare an individual’s verbal and nonverbal intelligence.
To determine scaled scores you need:
Examinee’s chronological age
Subtest raw scores on record form
Table A.1 from scoring manual
Converting Raw Scores to Scaled:
Transfer total scores for each subtest onto pg. 1 of record form
Find scaled score equivalent for each subtest using Table A.1 for appropriate age
Record each scaled score in corresponding column
What is the chronological age range for WAIS-IV participants?
16-90
What do IQ tests assess?
IQ tests assess mental functioning under fixed experimental conditions
________ generated from the test profile should be supported with data from multiple sources
Hypotheses
What are some weaknesses of the WAIS-IV?
● Block Design is the only hands-on subtest
● Ambiguity in some scoring criteria
● Annoying instructions on one subtest
● Some questionable scoring rules
Strengths of the WAIS-IV include:
o Generally strong reliability
o A much more user-friendly Record Form
o A high-quality standardization sample
What are the three independent traditions T. R. Taylor (1994) identified that have been employed to study the nature of human intelligence?
Psychometric Approach
Information-processing approach
Cognitive tradition
Psychometric Approach
Examines the elemental structure of a test. It is the OLDEST approach.
Information-processing approach
examines the processes that underlie how we learn and solve problems
Cognitive tradition
focuses on how humans adapt to real-world demands
Binet defined intelligence as the capacity to:
○ find and maintain a definite direction or purpose,
○ make necessary adaptations (that is, strategy adjustments) to achieve that purpose, and
○ engage in self-criticism so that necessary adjustments in strategy can be made.
Binet’s contributions
Age differentiation - estimated the mental ability of a child in terms of his or her completion of the tasks designed for the average child of a particular age, regardless of the child’s actual or chronological age.
General mental ability - the total product of the various separate and distinct elements of intelligence
Charles Spearman’s contribution
Intelligence consists of one general factor (g) plus a large number of specific factors, hence “Spearman’s g”.
Factor Analysis
A method for reducing a set of variables or scores to a smaller number of hypothetical variables called factors
What are the two basic types of intelligence?
Fluid and Crystallized
What is Fluid Intelligence?
Abilities that allow us to reason, think, and acquire new knowledge. These abilities allow us to learn and acquire information.
What is Crystallized Intelligence?
Represents the knowledge and understanding that we have acquired from previous learning.
1905 Binet-Simon Scale (first version)
■ purpose was restricted to identifying mentally disabled children in the Paris school system
■ 30 items presented in an increasing order of difficulty
■ Idiot described the most severe form of intellectual impairment, imbecile moderate levels of impairment, and moron the mildest level of impairment
What changed from the 1905 to the 1908 Binet-Simon Scale (second version)?
Items were grouped according to age level rather than simply one set of items of increasing difficulty
What is mental age based on?
A subject’s mental age is based on his or her performance compared with the average performance of individuals in a specific chronological age group.
1916 Stanford-Binet Intelligence Scale (third version of Binet scales)
Introduced the Intelligence Quotient (IQ)
Strength: the increased sample size clearly marked an improvement over the meager 50 and 203 individuals of the 1905 and 1908 Binet-Simon versions.
Limitation: entire standardization sample consisted exclusively of white, native-Californian children.
The concept of g refers to the ____.
view that one general mental ability factor underlies all intelligent behavior.
What major concept did Binet introduce in the 1908 Binet-Simon scale?
mental age
The standardization sample of the 1916 Stanford-Binet scale was inadequate in that _____.
it consisted exclusively of white children from California
A central problem of the 1937 revision of the Stanford-Binet scale was that _____.
different age groups showed significant differences in the standard deviation of IQ scores
_____ is a standard score with a mean of 100 and a standard deviation of 16 (later 15) that was first introduced in the 1960 revision of the Stanford-Binet.
The deviation IQ
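As a standard score, the deviation IQ can be sketched as a linear rescaling of a z-score. The function name is illustrative, and the defaults use the mean of 100 and SD of 16 given on this card.

```python
def deviation_iq(z, mean=100.0, sd=16.0):
    """Convert a z-score to a deviation IQ: mean + sd * z."""
    return mean + sd * z


print(deviation_iq(1.0))            # 116.0
print(deviation_iq(-2.0, sd=15.0))  # 70.0
```

The second call shows the same z-score logic with the later SD of 15.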
The knowledge you have acquired through your academic studies would best be described in terms of _____.
crystallized intelligence.
The gf-gc theory of intelligence is
a hierarchical model on which only the later versions of the Stanford-Binet are based.
Administration of the modern Stanford-Binet requires examiners to continue testing until the _____.
examinee’s ceiling is reached
What is the oldest approach to investigating human intelligence?
Psychometric
Binet believed that human intelligence was expressed through?
Judgment, attention, and reasoning
To support the notion of g, Spearman developed a statistical technique called
Factor analysis
Those abilities that allow us to learn and acquire information can be referred to as
Fluid intelligence
As used in the Stanford-Binet Scale, the deviation IQ is a standard score with a ____.
Mean of 100 and an SD of 16 (15 in later editions)
Item
An item is a specific stimulus to which a person responds overtly; this response can be scored or evaluated (e.g. classified, graded on a scale, or counted). In simple terms, items are the specific questions or problems that make up a test.
Psychological Test
A psychological test or educational test is a set of items that are designed to measure characteristics of human beings that pertain to behavior.
Psychological Testing
Refers to all the possible uses, applications, and underlying concepts of psychological and educational tests. The main use of these tests, though, is to evaluate individual differences or variations among individuals. Such tests measure individual differences in ability and personality and assume that differences shown on the test reflect actual differences among individuals.
Overt behavior
an individual’s observable behavior
Covert behavior
takes place within an individual and cannot be directly observed
To deal with distribution problems of scores, psychologists make use of ___, which relate raw scores on test items to some defined theoretical or empirical distribution.
scales
Individual Tests
Tests that can be given to only one person at a time
Test administrator
The person giving the test
Group Test
Is a test that can be administered to more than one person at a time by a single examiner, such as when an instructor gives everyone in the class a test at the same time.
Achievement
Refers to previous learning. A test that measures or evaluates how many words you can spell correctly is called a spelling achievement test.
Aptitude
refers to the potential for learning or acquiring a specific skill. A spelling aptitude test measures how many words you might be able to spell given a certain amount of training, education, and experience
Intelligence
Refers to a person’s general potential to solve problems, adapt to changing circumstances, think abstractly, and profit from experience.
Human ability
In view of considerable overlap of achievement, aptitude, and intelligence tests, all three concepts are encompassed by the term human ability.
Personality Tests
are related to the overt and covert dispositions of the individual.
Structured personality tests
provide a statement, usually of the “self-report” variety, and require the subject to choose between two or more alternative responses such as “true” or “false”.
Projective personality tests
are unstructured; either the stimulus (test materials) or the required response, or both, are ambiguous.
Reliability
the accuracy, dependability, consistency, or repeatability of test results. In more technical terms, reliability refers to the degree to which test scores are free of measurement errors.
Validity
refers to the meaning and usefulness of test results. More specifically, validity refers to the degree to which a certain inference or interpretation based on a test is appropriate.
Interview
a method of gathering information through verbal interaction, such as direct questions. Not only has the interview traditionally served as a major technique of gathering psychological information in general, but also data from interviews provide an important complement to test results.
Issues of Psychological Testing
Many social and theoretical issues, such as the controversial topic of racial differences in ability, accompany testing.
_____ are when two or more tests are used in conjunction.
Test batteries
Charles Darwin’s theory
Higher forms of life evolved partially because of differences among individual forms of life within a species. Those with the most adaptive characteristics survive at the expense of those who are less fit, and the survivors pass their characteristics on to the next generation.
Sir Francis Galton
concentrated on demonstrating that individual differences exist in human sensory and motor functioning, such as reaction time, visual acuity, and physical strength.
J. E. Herbart
developed mathematical models of the mind; he eventually used these models as the basis for educational theories that strongly influenced 19th century educational practices.
E. H. Weber
attempted to demonstrate the existence of a psychological threshold, the minimum stimulus necessary to activate a sensory system.
G.T. Fechner
devised the law that the strength of a sensation grows as the logarithm of the stimulus intensity.
Who founded the science of psychology?
Wilhelm Wundt!
G. Whipple
provided the basis for immense changes in the field of testing by conducting a seminar at the Carnegie Institute.
Alfred Binet and T. Simon
developed the first major general intelligence test (Binet-Simon Scale, 1905)
A ___ sample is one that comprises individuals similar to those for whom the test is to be used.
representative
How did World War I influence psychological testing?
The war created a demand for large-scale group testing because relatively few trained personnel could evaluate the huge influx of military recruits.
Robert Yerkes
developed two structured group tests of human abilities, the Army Alpha and the Army Beta. The Army Alpha required reading ability, whereas the Army Beta measured the intelligence of illiterate adults.
Traits
relatively enduring dispositions (tendencies to act, think, or feel in a certain manner in any given circumstance) that distinguish one individual from another (e.g., optimistic and pessimistic)
The first structured personality test was the ____________.
Woodworth Personal Data Sheet
The MMPI (Minnesota Multiphasic Personality Inventory)
It used empirical methods to determine the meaning of a test response and helped to revolutionize structured personality tests. It began a new era for structured personality tests.
Inferences
logical deductions about events that cannot be observed directly
Descriptive statistics
methods used to provide a concise description of a collection of quantitative information
Inferential statistics
methods used to make inferences from observations of a small group of people known as a sample to a larger group of people known as a population
Quartiles
points that divide the frequency distribution into equal fourths. The first quartile is the 25th percentile; the second quartile is the median, or 50th, percentile; and the third quartile is the 75th percentile.
Median
a value or quantity lying at the midpoint of a frequency distribution of observed values or quantities, such that there is an equal probability of falling above or below it. It is also known as the 50th percentile.
Interquartile Range
the interval of scores bounded by the 25th and 75th percentiles.
Deciles
similar to quartiles except that they use points that mark 10% rather than 25% intervals.
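Quartiles and the interquartile range can be checked with Python's standard library. This is a sketch on invented scores; `method="inclusive"` is one of several interpolation conventions, so other tools may give slightly different cut points on small data sets.

```python
import statistics

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 returns the three cut points that split the data into fourths.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1  # interval bounded by the 25th and 75th percentiles

print(q1, q2, q3, iqr)  # 3.0 5.0 7.0 4.0
```

The second quartile matches the median, and passing `n=10` instead would return the nine decile cut points.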
Stanine System
converts any set of scores into a transformed scale, which ranges from 1 to 9 (standardized to have mean 5, SD 2)
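Because stanines have a mean of 5 and an SD of 2, a z-score can be mapped to a stanine by rescaling, rounding, and clipping to the 1 to 9 range. This is a rough sketch that assumes roughly normally distributed scores; the function name is illustrative, and published tests typically assign stanines from percentile bands instead.

```python
import math


def stanine_from_z(z):
    """Approximate stanine for a z-score: round(2z + 5), clipped to 1..9."""
    rounded = math.floor(2.0 * z + 5.5)  # round half up, avoiding banker's rounding
    return max(1, min(9, rounded))


print(stanine_from_z(0.0))   # 5
print(stanine_from_z(1.0))   # 7
print(stanine_from_z(-3.0))  # 1
```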
Norms
the performances by defined groups on particular tests
Tracking
the tendency to stay at about the same level relative to one’s peers (e.g., height and weight)
Norm-referenced test
compares each person with a norm (more focused on comparison)
Criterion-referenced test
describes the specific types of skills, tasks, or knowledge, that the test-taker can demonstrate, such as mathematical skills (more individualized)
Reliability
extent to which measurement gets consistent results over repeated measurements (with same conditions each time)
Classical Test Theory
The distribution of random errors is assumed to be the same for all people, which gives the standard error of measurement (SEM)
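The SEM is commonly computed as the standard deviation times the square root of (1 - reliability). A minimal sketch assuming that standard formula; the SD of 15 and reliability of .96 are made-up illustration values, not figures from any test manual.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1.0 - reliability)


sem = standard_error_of_measurement(sd=15.0, reliability=0.96)
print(round(sem, 2))  # 3.0
```

A higher reliability shrinks the SEM, so the observed score is a tighter estimate of the true score.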
Item Response Theory
computer used to focus range of item difficulty to assess individual’s ability level
Sources of Error/What Affects Reliability
Testing environment
Test itself (unrepresentative of what’s being tested)
Examinee (physical or mental health)
Scoring of the test
Testing Reliability Methods
Test-Retest Method
Parallel Forms Method
Split-Half Method
Test-Retest Method
Only applies to measures of stable traits
“Carryover effect” = when the examinee remembers answers from the first session
Systematic carryover (when everyone’s scores increase) is not a concern
Practice effects (sharpened skills from previously taking test)
Time interval must be considered carefully
Consider events that occurred between test taking sessions
Parallel Forms Method
comparison of two equivalent forms of a test that measure same thing (each test uses different items)
Split-Half Method
Divides test into halves that are scored separately which are then compared
Spearman-Brown (not always advisable to use)
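The Spearman-Brown correction estimates full-test reliability from the correlation between the two halves, because each half is only half as long as the whole test. A minimal sketch of the standard formula, 2r / (1 + r); the half-test correlation of .60 is an invented example.

```python
def spearman_brown(half_test_r):
    """Estimated full-test reliability from a split-half correlation."""
    return 2.0 * half_test_r / (1.0 + half_test_r)


print(round(spearman_brown(0.60), 2))  # 0.75
```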
Content validity evidence
considers the adequacy of representation of the conceptual domain the test is designed to cover.
Construct underrepresentation
describes the failure to capture important components of a construct. (e.g., a test of mathematical knowledge included algebra, but not geometry)
Construct-irrelevant variance
occurs when scores are influenced by factors irrelevant to the construct. (e.g., a test of intelligence might be influenced by reading comprehension, test anxiety, or illness.)
Criterion validity evidence
tells how well a test corresponds with a particular criterion. Evidence is provided by high correlations between a test and a well-defined criterion measure.
Predictive validity evidence
the forecasting function of tests is actually a type or form of criterion validity evidence.
Concurrent validity evidence
applies when the test and the criterion can be measured at the same time.
Validity Coefficient
The relationship between a test and a criterion, expressed as a coefficient. The coefficient tells the extent to which the test is valid for making statements about the criterion.
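A validity coefficient is usually a correlation between test scores and criterion scores. The sketch below computes a Pearson correlation by hand; the function name and the tiny data set are hypothetical and only illustrate the calculation.

```python
import math


def pearson_r(test_scores, criterion_scores):
    """Pearson correlation between paired test and criterion scores."""
    n = len(test_scores)
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    cross = sum((x - mean_x) * (y - mean_y)
                for x, y in zip(test_scores, criterion_scores))
    ss_x = sum((x - mean_x) ** 2 for x in test_scores)
    ss_y = sum((y - mean_y) ** 2 for y in criterion_scores)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical test scores paired with a hypothetical criterion measure.
print(round(pearson_r([1, 2, 3], [1, 3, 2]), 2))  # 0.5
```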
A construct is defined as something built by ______.
mental synthesis
Convergent Evidence
When a measure correlates well with other tests believed to measure the same construct.
Discriminant Evidence
one type of evidence a person needs in test validation is proof that the test measures something unique
To demonstrate discriminant evidence for validity, a test should have ____ correlations with measures of unrelated constructs, or evidence for what the test does not measure
low