Assessment & Testing Flashcards
Assessment
Processes and procedures for collecting information about human behavior
Appraisal (evaluation)
Going beyond assessment and making judgments about human attributes and behaviors.
3 measures of central tendency
- Mean: the arithmetic average (M or X-bar)
- Median: middle score in distribution
- Mode: most frequent score
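The three measures above can be sketched with Python's statistics module on a small hypothetical score set:

```python
# A minimal sketch of the three measures of central tendency, using
# Python's statistics module and hypothetical test scores.
from statistics import mean, median, mode

scores = [70, 75, 80, 80, 85, 90, 100]  # hypothetical test scores

print(mean(scores))    # arithmetic average (M)
print(median(scores))  # middle score in the distribution
print(mode(scores))    # most frequent score
```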
Positive vs negative skew
Positive skew: the tail extends toward the high (right) end, and the mean is pulled above the median. Negative skew: the tail extends toward the low (left) end, and the mean is pulled below the median.
Range
Highest score minus lowest score
(Inclusive range: highest minus lowest, plus 1)
Standard deviation
Describes the variability within a distribution of scores.
(SD for a sample)
(Sigma for population)
Essentially the square root of the average squared deviation from the mean.
Variance
The square of the standard deviation
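A minimal sketch of SD and variance, contrasting the sample (n - 1) and population (n) formulas on hypothetical scores:

```python
# Standard deviation and variance via Python's statistics module;
# the scores are hypothetical and chosen so the population SD is 2.0.
from statistics import stdev, pstdev, variance, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5

print(pstdev(scores))     # population SD (sigma): 2.0 here
print(pvariance(scores))  # population variance = SD squared: 4.0
print(stdev(scores))      # sample SD: slightly larger (divides by n - 1)
```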
Normal curve (bell curve)
Divides scores into 6 standard-deviation units - 3 above the mean and 3 below
Percentile
Example: a score at the 75th percentile is higher than 75% of the scores; 25% of the scores are higher than this
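A percentile rank can be sketched as the percentage of scores that fall below a given score (one common convention; others count ties as half). The data are hypothetical:

```python
# A minimal sketch of percentile rank: the percentage of scores in a
# distribution that fall below a given score.
def percentile_rank(scores, x):
    below = sum(1 for s in scores if s < x)
    return 100 * below / len(scores)

scores = list(range(1, 101))        # hypothetical scores 1..100
print(percentile_rank(scores, 76))  # 75.0: higher than 75% of scores
```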
Stanine
Standard 9
Split normal curve into 9 equal parts
Mean of 5, standard deviation of about 2
Standardized scores
Convert raw scores
Allow for direct comparison
Express a person's distance from the average
Z- score
Z = zero
Mean of 0, standard deviation of 1
Scores typically range from -3 to +3
T-Score
T = 10
Mean of 50, standard deviation of 10
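The two standardized scores above can be sketched as simple conversions; the raw score, group mean, and SD below are hypothetical:

```python
# z-score (mean 0, SD 1) and T-score (mean 50, SD 10) conversions,
# assuming a known group mean and standard deviation.
def z_score(raw, mean, sd):
    return (raw - mean) / sd                 # distance in SD units

def t_score(raw, mean, sd):
    return 50 + 10 * z_score(raw, mean, sd)  # same distance, T scale

print(z_score(115, 100, 15))  # 1.0: one SD above the mean
print(t_score(115, 100, 15))  # 60.0
```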
Correlation coefficient (r)
Ranges from -1.00 (perfect negative correlation) to +1.00 (perfect positive correlation)
Shows the relationship between two sets of numbers.
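One way to sketch r is as the covariance of paired scores divided by the product of their standard deviations; the paired data below are hypothetical:

```python
# A minimal sketch of the Pearson correlation coefficient r,
# computed from deviations about each variable's mean.
from statistics import mean, pstdev

def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    n = len(xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return cov / (pstdev(xs) * pstdev(ys))

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0, perfect positive
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0, perfect negative
```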
Bivariate correlation
Correlation between two variables
Multivariate correlation
A correlation between 3 or more variables
Reliability
Consistency of a test or measure
Extent to which a measure is free of error
Stability or test-retest reliability
Results of two administrations are correlated
Two weeks is a good interval between test administrations
Equivalence reliability
Alternative forms of the same test administered to the same group and then correlated
Internal consistency or split half reliability
Test divided into two halves
Correlation between two halves is calculated
May apply the Spearman-Brown formula to estimate how reliable the full-length test would be had it not been split
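The Spearman-Brown correction for a split-half correlation can be sketched as follows (the .60 half-test correlation is a hypothetical value):

```python
# A minimal sketch of the Spearman-Brown formula: estimate the
# reliability of the full-length test from the half-test correlation.
def spearman_brown(r_half):
    return 2 * r_half / (1 + r_half)

print(spearman_brown(0.6))  # 0.75: the full test is more reliable
```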
Internal consistency reliability
The more homogeneous the items, the more reliable the test
For dichotomous items (true-false or yes-no), use the Kuder-Richardson formulas
For non-dichotomous items (multiple choice or essay), use Cronbach's alpha coefficient
True and error variance
Two tests are administered. Each one measures true variance (T1 and T2) and error variance (E1 and E2).
If the correlation between two tests or two forms of the same test is, for example, .90, then the amount of true variance measured in common is the correlation squared (.90 squared = .81, or 81%).
Coefficient of determination vs coefficient of nondetermination
r^2 is the proportion of variance the two measures share.
If the correlation is .90, squaring it gives .81, or 81%; this is the coefficient of determination. The remaining 19% is the coefficient of nondetermination and represents error variance.
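The .90 example can be sketched directly:

```python
# Coefficient of determination (r squared) and nondetermination
# (1 - r squared) for a correlation of .90.
r = 0.90
determination = r ** 2         # about .81 -> 81% common (true) variance
nondetermination = 1 - r ** 2  # about .19 -> 19% error variance

print(round(determination, 2), round(nondetermination, 2))  # 0.81 0.19
```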
Standard Error of Measurement (SEM)
Measure of reliability
Helps determine the range within which an individual's true score probably falls
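The SEM is commonly computed as the test's SD times the square root of (1 - reliability); the SD, reliability, and observed score below are hypothetical:

```python
# A minimal sketch of the standard error of measurement (SEM) and a
# one-SEM (~68%) confidence band around an observed score of 100.
import math

def sem(sd, reliability):
    return sd * math.sqrt(1 - reliability)

s = sem(15, 0.91)        # 4.5 for SD 15, reliability .91
print(100 - s, 100 + s)  # band within which the true score likely falls
```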
Validity
Test measures what it purports to measure
Face validity
The instrument looks valid
Example: a math test has math items
Content validity
Assesses whether a test is representative of all aspects of the construct.
Example: two professors of Psychology 101 devise final exams that cover the important content they both teach.
Predictive validity
Predictions made by test are confirmed by later behavior (criterion)
Example: the scores on the GRE predict later grade point average.
Concurrent validity
The results of the test are compared with other test results or behaviors (criteria) at or about the same time.
Example: scores of an art aptitude test may be compared to grades already assigned to students in an art class.
Construct validity
The extent to which the tool measures some hypothetical construct, such as anxiety or creativity.
Convergent validation
High correlation between the construct under investigation and related measures.
Discriminant validation
When there is no significant correlation between the construct under investigation and unrelated constructs.
Tests may be reliable but not valid
Valid tests are reliable unless there is a change in the underlying trait or characteristic being measured (maturation, training, development)
Power based vs speed based tests
Power: no time limits or generous ones (NCE and CPCE)
Speed: timed, with emphasis placed on speed and accuracy (measures of intelligence, ability, aptitude)
Norm referenced assessment
Comparing individuals to a norm group of others who have taken the test. Norms can be national, state, or local.
Criterion referenced assessment
Comparing an individual's performance to some predetermined criterion.
Example: NCE cutoff scores (cutoff score = criterion)
Ipsatively interpreted
Comparing the results on the test within the individual.
Example: an individual's high and low scores on the same test
Purposes/rationale for using tests
- Check if client is in range of services
- Help client gain self-understanding
- Counselor gain better understanding
- Which counseling methods?
- Predict future performance
- Make decisions about future
- Identify interests
- Evaluate the outcomes of counseling
Circumstances where testing may be useful:
- Placement - education or work setting
- Admissions
- Diagnosis
- Educational planning
- Evaluation
- Licensure & certification
- Self-understanding
Regression toward the mean
If one earns a very low or very high score on a pretest, that individual will probably earn a score closer to the mean on the posttest
Standardized vs non-standardized assessment
Standardized: instruments administered under a formal, structured procedure, with scoring specified.
Non-standardized: no formal or routine instructions for administration or scoring. Examples: checklists or rating scales
Intelligence tests
Stanford- Binet Intelligence scales
Wechsler Adult Intelligence Scale (WAIS-IV)
Cognitive Abilities Test
Specialized ability tests
Kaufman Assessment Battery for Children
System of Multicultural Pluralistic Assessment
ACT
SAT
Miller Analogies Test (MAT)
GRE
Achievement tests
California Achievement Test
Iowa Test of Basic Skills
Stanford Achievement Test
Specialized achievement tests
General Education Development (GED)
College Board's Advanced Placement Program
College- level Examination Program (CLEP)
Aptitude
Differential Aptitude Test
O*Net Ability Profiler
Armed Services Vocational Aptitude Battery (ASVAB)
Career Ability Placement Survey (CAPS)
Personality:
The dynamic product of genetic factors, environmental experiences, and learning to include traits and characteristics.
Projective Tests
Rorschach
Thematic Apperception Test (TAT)
Rotter Incomplete Sentences Blank (Second Edition)
Draw-A-Person Test
Inventories
Minnesota Multiphasic Personality Inventory
California Psychological Inventory (CPI)
NEO Personality Inventory
Beck Depression Inventory
Myers-Briggs Type Indicator
Specialized inventories
Tennessee Self Concept Scale
Bender Visual-Motor Gestalt Test
Interest inventories
Strong Interest Inventory
Self-Directed Search
Career Assessment Inventory
Campbell Interest and Skill Survey
O*NET Interest Profiler
Semantic differential
Report where you are on a range of polar opposites
Very good <———————> Very Bad
Intrusive vs unobtrusive measurement
Intrusive (reactive): the participant knows they are being watched or questioned, which may affect performance
Unobtrusive (nonreactive): data are collected without the individual's awareness.
Social desirability
Tendency to respond in ways that are perceived as socially desirable
Using & interpreting test scores
- Training in test theory and studying test manual
- Prepare for the test interpretation
- Describe test in non technical terms
- Describe nature of scores
- Organize the data so it makes most sense to client (explain interrelationships)
- Ask for reactions and feelings
- Scores are additional data
- Go slowly
Advantages of computer based assessment
- Standardized administration and scoring
- Feedback and results may be available immediately
- If computers are available cost is less
- Profiles of results and reports can be generated
Disadvantages of computer based assessment
- Not all assessments are available on computer
- Can be scary for some people
- Computers can be expensive
- Personal contact with an administrator may not be available
Ethical issues in testing
- Tests may be biased against nonwhite and female test takers
- Counselors must be trained and competent
- Test may label and stereotype people
Assessment resources
The Mental Measurements Yearbook: reviews of and references for published tests.
Tests in Print IX: descriptive information on testing instruments
A comprehensive guide to career assessment
Association for Assessment and Research in Counseling (AARC): one of the 18 divisions of the American Counseling Association