Testing and Assessment (Neukrug & Fawcett, 2020) Flashcards
Assessment
Assessment includes a broad array of evaluative procedures that yield information about a person (Hunsley, 2002).
Tests
Tests are a subset of assessment techniques that yield scores based on the gathering of collective data (e.g., finding the sum of correct items on a multiple-choice exam).
distinguishing between testing and assessment
The History of Assessment
Although the modern era of assessment began near the beginning of the twentieth century, assessment procedures can be found in ancient times.
- Jean Esquirol- Used language to identify intelligence- forerunner of verbal IQ
*Edouard Seguin- developed the form board to increase motor control- forerunner of “performance IQ”
The History of Assessment
- Sir Francis Galton- examined relationship of sensory motor responses to intelligence
*Wilhelm Wundt- Developed one of the first psychological laboratories
*James McKeen Cattell - Brought Statistics to mental testing- coined term mental test
*G.S. Hall - Early experimental psychologist. First president of APA.
The Emergence of Ability Tests (Testing in the Cognitive Domain)
*Alfred Binet- Created first modern intelligence test
*Lewis Terman- Enhanced Binet’s work to create Stanford-Binet intelligence test
*Intelligence Quotient- mental age divided by chronological age, multiplied by 100
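A minimal sketch of the ratio IQ described above, assuming the conventional form in which the mental age / chronological age quotient is multiplied by 100. The function name and the example ages are illustrative only.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# Hypothetical child: mental age 10, chronological age 8.
print(ratio_iq(10, 8))  # 125.0
```

A child whose mental age equals his or her chronological age scores 100 on this scale.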
Group testing in the cognitive domain
Robert Yerkes- as president of the APA, chaired a special committee to create a screening test for new recruits during World War I.
The original test the committee developed was known as the Army Alpha.
Group testing in the cognitive domain - developers
James Bryant Conant -developed SAT to equalize educational opportunities.
Edward Thorndike- Developer of the Stanford Achievement Test
Frank Parsons- Leader in vocational counseling
Ethical Issues in Assessment
Confidentiality - Ethical guideline to protect client information
Cross-Cultural Sensitivity
Cross-Cultural Sensitivity- Ethical guideline to protect clients from discrimination and bias in testing.
Informed Consent
Informed Consent - Permission given by client after assessment process is explained.
Invasion of privacy
Invasion of privacy- testing is an invasion of a person’s privacy.
Proper diagnosis
Proper diagnosis- choose appropriate assessment techniques for accurate diagnosis
Release of test data
Release of test data -Test data are protected - client release required.
Test administration
Test administration - use established and standardized methods
Test Security
Ensure integrity of test content and test itself.
Test scoring and interpretation
Take into consideration the problems with tests
Standards in Assessment
Used to further ethical application of assessment techniques
Moral model
Consider moral principles involved in ethical decision-making
The Family Educational Rights and Privacy Act (FERPA) of 1974
Affirms right to access test records in the school
The Health Insurance Portability and Accountability Act (HIPAA)
Ensures privacy of medical and counseling records
Privileged Communications
Privileged Communications- Legal right to maintain privacy of conversation
Jaffee v. Redmond- Affirms privileged communication laws.
Freedom of Information Act
Freedom of Information Act- affirms right to access federal and state records
Civil Rights Acts (1964 and Amendments)
Civil Rights Acts- Test must be valid for job in question
Americans with Disabilities Act (ADA)
Americans with Disabilities Act- accommodations for testing must be made
Individuals with Disabilities Education Act (IDEA)
IDEA - assures right to be tested for learning disabilities in schools
Section 504
Assessment for programs must measure ability, not disability
Carl Perkins Act
Carl Perkins Act - ensures access to vocational assessment, counseling, and placement
Accreditation bodies
Assist in setting curriculum standards (APA, NASP, and CACREP)
Forensic Evaluations
Completed by forensic health evaluators and
DSM (Diagnostic and Statistical Manual)
The first edition DSM was published in 1952 with three broad categories
DSM II -1968
DSM III - introduced the multiaxial system
DSM IV - contained 365 diagnoses; used a five-axis diagnosis
DSM 5 - accepted diagnostic classification system for mental disorders; became single axis
DSM 5-TR
Making and Reporting Diagnosis
Principal Diagnosis - The reason the person came to treatment is listed first
Subtype - “Specify Whether”
Specifier- “Specify if”- pick as many as apply
Subtypes can be used to help communicate greater clarity.
Specifiers are not mutually exclusive, so more than one can be selected
Other and Unspecified Disorder
Other specified disorder- doesn’t fit a standard diagnosis, with an explanation of why not
Unspecified Disorder- doesn’t fit a standard diagnosis without explanation
validity and reliability
validity- whether the test measures what it's supposed to measure;
reliability- whether the score an individual has received on a test is an accurate measure of his or her true score;
cross-cultural fairness- whether the score the individual has obtained is a true reflection of the individual, and not a function of a cultural bias inherent in the test or in the examiner
practicality- whether it makes sense to use a test in a particular situation
correlation coefficient
Correlation coefficient- shows the relationship between two sets of scores; a statistical concept frequently used in discussions of the critical factors just listed.
Correlation coefficients range from -1.00 to +1.00
Positive Correlation - the two sets of scores move in the same direction
Negative Correlation - the two sets of scores move in opposite directions
Examples of Positive and Negative Correlations
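One way to see a positive and a negative correlation side by side is to compute the Pearson coefficient on small made-up data sets. The sketch below is illustrative only: the helper name `pearson_r` and all the numbers are assumptions, not data from the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return cov / sqrt(ss_x * ss_y)


# Made-up data for five students.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [55, 60, 70, 75, 90]   # rises as study time rises
errors_made = [20, 17, 12, 9, 3]    # falls as study time rises

r_pos = pearson_r(hours_studied, exam_score)
r_neg = pearson_r(hours_studied, errors_made)
print(round(r_pos, 2), round(r_neg, 2))
```

The first coefficient comes out close to +1.00 (scores move together) and the second close to -1.00 (scores move in opposite directions).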
Validity
Validity- Evidence supporting the use of test scores
Content Validity- evidence that test items represent the domain
Face Validity- Superficial appearance of a test - not true validity
Criterion-related validity- relationship between test scores and another standard
Validity (continued)
Concurrent validity- relationship between test scores and another currently obtainable benchmark
Predictive validity- relationship between test scores and a future standard
Construct validity- evidence that an idea or concept is being measured by a test
Experimental design validity - using experimentation to show that a test measures a concept.
Validity (continued)
Factor analysis- statistically examining the relationship between subscales and the larger construct.
Convergent validity- relationship between a test and other similar tests
Discriminant validity- showing a lack of relationship between a test and other dissimilar tests.
Reliability
Reliability - Amount of freedom from measurement error- consistency of test scores
test-retest reliability- relationship between scores from one test given at two different administrations
Reliability (continued)
Alternate forms reliability- relationship between scores from two similar versions of the same test
internal consistency- reliability measured statistically by going “within” the test
split half reliability- correlating one half of a test against the other half
Reliability (continued)
Coefficient alpha or Kuder-Richardson- reliability based on a mathematical comparison of individual items with one another and the total score. In brief, these methods correlate the scores for each item on the test with the total score on the test and find the average correlation for all of the items.
Item Response Theory- examining each item for its ability to discriminate as a function of the construct being measured
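For the coefficient alpha card above, here is a hedged sketch using the standard variance-based formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). This is the usual textbook formula rather than the item-total correlation summary given in the card. The function name and the response matrix are made up for illustration.

```python
from statistics import pvariance


def coefficient_alpha(responses):
    """Coefficient (Cronbach's) alpha for rows of item scores, one row per person."""
    k = len(responses[0])                  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    items = list(zip(*responses))          # transpose: one tuple of scores per item
    item_var_sum = sum(pvariance(item) for item in items)
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Made-up right/wrong (1/0) responses: 5 test takers, 6 items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
]

alpha = coefficient_alpha(responses)
print(round(alpha, 2))  # 0.73
```

When every item is scored 0 or 1, as here, this calculation gives the same value as the Kuder-Richardson formula 20.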
Cross-cultural fairness
Cross-cultural fairness- degree to which cultural background, class, disability, and gender do not affect test results.
Selecting and administering a good test
1. Determine the goals of your client
2. Choose instrument types to reach client goals
- Access information about possible instruments
- Examine validity, reliability, cross-cultural fairness, and practicality of possible instruments
- Choose an instrument wisely
Raw Score
Raw Score- untreated score before manipulation or processing
Frequency distributions
Frequency Distributions - list of scores and number of times a score occurred
Cumulative distributions
line graph to examine percentile rank of a set of scores
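The frequency and cumulative distribution cards above can be sketched in a few lines. The raw scores are made up, and percentile rank is taken here as the percentage of scores at or below a given score.

```python
from collections import Counter

scores = [70, 75, 75, 80, 80, 80, 85, 85, 90, 95]  # made-up raw scores

freq = Counter(scores)   # score -> number of times it occurred
n = len(scores)

cumulative = 0
table = []
for score in sorted(freq):
    cumulative += freq[score]                 # running count at or below this score
    percentile_rank = 100 * cumulative / n    # percent at or below this score
    table.append((score, freq[score], cumulative, percentile_rank))

# Columns: score, frequency, cumulative frequency, percentile rank
for row in table:
    print(row)
```

Plotting the last column against the score gives the cumulative distribution line graph.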
Normal and Skewed Curves
Quincunx- Board developed by Sir Francis Galton to demonstrate the bell shaped curve
normal curve- bell-shaped distribution that human traits tend to fall along
skewed curve - test scores not falling along a normal curve
negatively skewed curve- majority of scores at upper end
positively skewed curve- majority of scores at lower end
measures of central tendency
mean - arithmetic average of a set of scores
median - score where 50% of scores fall above and 50% fall below
mode- most frequently occurring score
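A quick illustration of the three measures of central tendency, using Python's standard `statistics` module on made-up scores.

```python
from statistics import mean, median, mode

scores = [70, 75, 75, 80, 80, 80, 85, 85, 90, 95]  # made-up raw scores

avg = mean(scores)      # arithmetic average: 815 / 10
mid = median(scores)    # middle of the ordered scores
most = mode(scores)     # most frequently occurring score

print(avg, mid, most)   # 81.5 80.0 80
```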
measures of variability
range- difference between highest and lowest score plus 1
interquartile range- middle 50% of scores around the median
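A small sketch of the two measures above on made-up scores. The range follows the card's "plus 1" definition. The quartiles come from `statistics.quantiles` with its default method, which is an assumption: several quartile conventions exist and they can give slightly different values.

```python
from statistics import quantiles

scores = [70, 75, 75, 80, 80, 80, 85, 85, 90, 95]  # made-up raw scores

# Range as defined in the card: highest minus lowest, plus 1.
inclusive_range = max(scores) - min(scores) + 1    # 95 - 70 + 1 = 26

# Quartile cut points; the interquartile range spans the middle 50% of scores.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1

print(inclusive_range, q1, q3, iqr)
```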
measures of variability (continued)
standard deviation - how scores vary from the mean
SD is important because in all normal curves the percentage of scores between the standard deviation units is the same
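The claim that every normal curve has the same percentage of scores between standard deviation units can be checked numerically. This sketch uses `statistics.NormalDist`; the two example distributions (mean 100, SD 15 and mean 50, SD 10) and the helper name are chosen only for illustration.

```python
from statistics import NormalDist


def share_within(dist, k):
    """Proportion of a normal distribution lying within k SDs of its mean."""
    upper = dist.cdf(dist.mean + k * dist.stdev)
    lower = dist.cdf(dist.mean - k * dist.stdev)
    return upper - lower


iq_like = NormalDist(mu=100, sigma=15)
t_like = NormalDist(mu=50, sigma=10)

for k in (1, 2, 3):
    print(k, round(share_within(iq_like, k), 4), round(share_within(t_like, k), 4))
```

Both columns agree: about 68% of scores fall within ±1 SD, about 95% within ±2 SD, and about 99.7% within ±3 SD, whatever the mean and SD are.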
A normal distribution is
norm referencing versus criterion referencing
norm referencing - comparison of individual test scores to average scores of a group
criterion referencing - comparison of test scores to a predetermined standard
normative comparisons and derived scores
derived score- raw score converted by comparison to a norm group
percentiles- percentage of people falling at or below a score
standard scores
standard scores- derived score based on mean and standard deviation
standard scores (continued)
z-scores- standard score with mean of 0 and SD of 1
**Once the raw score has been converted to a z-score, almost any other type of derived score can be found, including percentiles, T-scores, DIQ, stanines, and so forth.
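A minimal sketch of converting a raw score to a z-score and then to a percentile. It assumes the scores are normally distributed; the function names and example numbers are illustrative.

```python
from statistics import NormalDist


def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies above or below the mean."""
    return (raw - mean) / sd


def percentile_from_z(z):
    """Percentage of a normal distribution falling at or below z."""
    return 100 * NormalDist().cdf(z)


# Hypothetical test with mean 50 and SD 10; raw score of 65.
z = z_score(raw=65, mean=50, sd=10)
print(z, round(percentile_from_z(z), 1))  # 1.5 93.3
```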
standard scores (continued)
T-scores- standard score with mean of 50 and SD of 10 (generally used with personality tests)
Deviation IQ- standard score with mean of 100 and SD of 15.
standard scores (continued)
Stanines- have a mean of 5 and a standard deviation of 2, and range from 1 to 9; often used with achievement tests.
Sten Scores- standard score with mean of 5.5 and standard deviation of 2.
NCE- normal curve equivalents scores - standard score with mean of 50 and SD of 21.06
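Each standard score in these cards is a z-score rescaled to a new mean and SD. The sketch below uses the means and SDs listed in the cards. The stanine rounding rule (nearest whole number, limited to 1 through 9) is a common convention assumed here, and the function names are illustrative.

```python
from math import floor


def from_z(z, mean, sd):
    """Rescale a z-score to a standard score with the given mean and SD."""
    return mean + z * sd


def stanine(z):
    """Stanine: mean 5, SD 2, rounded to a whole number and limited to 1-9."""
    return max(1, min(9, floor(2 * z + 5.5)))


z = 1.0  # one standard deviation above the mean
print("T-score:", from_z(z, 50, 10))        # 60.0
print("Deviation IQ:", from_z(z, 100, 15))  # 115.0
print("Sten:", from_z(z, 5.5, 2))           # 7.5
print("NCE:", round(from_z(z, 50, 21.06), 2))
print("Stanine:", stanine(z))               # 7
```

Published sten scores are usually also reported as whole numbers from 1 to 10; the unrounded value is shown here only to make the rescaling visible.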
developmental norms
Age Comparisons- comparison of individual score to average score of others at the same age
Grade equivalent- comparison of individual score to average score of others at the same grade level
standard error of measurement
range where a “true” score might lie
reliability coefficient and SEM - as reliability decreases, the SEM (the range of true scores) increases.
**standard error of estimate - range where a predicted score might lie
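The relationship between reliability and the SEM can be made concrete with the standard formulas: SEM = SD × sqrt(1 − reliability), and standard error of estimate = SD of the predicted variable × sqrt(1 − r²). These formulas are not stated in the cards; the function names and example values below are illustrative assumptions.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM: SD of the test times the square root of (1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


def standard_error_of_estimate(sd_predicted, r):
    """SEest: SD of the predicted variable times the square root of (1 - r squared)."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and +1")
    return sd_predicted * sqrt(1 - r ** 2)


# Hypothetical test with SD 15: lower reliability gives a wider SEM.
for rel in (0.91, 0.75):
    sem = standard_error_of_measurement(15, rel)
    print(rel, round(sem, 2), "-> 68% band for a score of 100:",
          round(100 - sem, 1), "to", round(100 + sem, 1))
```

With reliability .91 the SEM is about 4.5 points; with reliability .75 it grows to 7.5 points, so the band in which the true score might lie widens.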
scales of measurement
To distinguish between different kinds of test scores, and subsequently know what kind of statistics can be applied to them, four kinds of scales of measurement have been identified: nominal scales, ordinal scales, interval scales, and ratio scales.
scales of measurement (continued)
nominal scale- numbers arbitrarily assigned to represent categories
1= Asian
2= Latino
3= African American
4= Caucasian
scales of measurement (continued)
Ordinal scale - numbers with rank order but unequal distances between
1- strongly disagree
2- somewhat disagree
3- neither agree nor disagree
4- somewhat agree
scales of measurement (continued)
interval scale- establishes equal distances between measurements but has no absolute zero reference point. (ex: SAT test scores)
scales of measurement (continued)
ratio scale- numbers with equal intervals and a meaningful zero
(very few behavioral measures fall into this category); examples tend to be height, weight, and temperature on the Kelvin scale (Celsius has no absolute zero, so it is an interval scale).
Defining intelligence testing
Intelligence testing is a subset of intellectual and cognitive functioning assessment; it assesses a broad range of cognitive capabilities and generally results in an “IQ” score
Intelligence models
Big five personality