Final Exam Flashcards
Patterns of responding to scale items that result in false or misleading information
Response bias
When an individual agrees (or disagrees) with statements without regard for their meaning
Acquiescence bias
Agreeing with all items regardless of content
Yea-saying
Disagreeing with all items regardless of content
Nay-saying
Tendency to avoid or endorse extreme response options
Extreme/Moderate responding
The tendency for a person to respond in a way that seems socially appealing, regardless of his or her true characteristics.
Social desirability bias
Test takers intentionally attempt to appear socially desirable
Impression management
Test takers underreport negative aspects of themselves or hold unrealistically positive views of themselves; trait-like
Self-deception
Respondents are motivated to appear more cognitively impaired, emotionally distressed, physically challenged, or psychologically disturbed than they actually are.
Malingering
Carelessness or lack of motivation to respond meaningfully; a concern with Likert-type items
Random Responding
Some respondents may be “luckier” than others and answer items correctly by chance; a concern with correct/incorrect items
Guessing
May increase accurate responding
May also increase random responding
Anonymity
Pairs or sets of items that are equally socially desirable
Forced choice assessment
Instruments used to collect important information from individuals
Surveys
Using statistical (chance-based) methods to select a sample so that it is representative of the population
Probability sampling
The body of knowledge or behaviors that the test represents
Testing universe
The group of individuals who will take the test
Target audience
The information that the test will provide to the test user
Purpose
Determine whether students have the skills or knowledge necessary to understand new material
Determine how much information students already know about the new material
Decisions made at the beginning of instruction
Placement assessments
Assessments that help teachers determine what information students are and are not learning during the instructional process.
Decisions made during instruction
Formative assessments
Assessments that involve an in-depth evaluation of an individual to identify characteristics for treatment or enhancement. Decisions made during instruction
Diagnostic assessments
Determine what students do and do not know
Gauge student learning
Assign earned grades
Often the same tests are used for formative and summative assessment
Decisions made after instruction
Summative assessments
Collections of an individual’s work used to highlight and assess aspects of student learning and performance that may be difficult to assess with standardized testing
Portfolios
When a student’s test performance significantly affects educational paths or choices.
High-stakes tests
Measure understanding rather than application
Too structured
Typically only true/false and multiple-choice questions
Traditional assessment
Measures a student’s ability to apply, in real-world settings, the knowledge and skills he or she has learned
Authentic assessment
Teaching to the test is _______ for traditional assessment, but ______ for authentic assessment
Discouraged, encouraged
Treatment methods with documented research evidence that they are effective for solving the problem being addressed
Evidence-based treatment methods
The integration of the best available research with clinical expertise in the context of patient characteristics, culture, and preferences
Evidence-based practice
One of the strongest and most consistent predictors of performance
Moderated by job complexity
Cognitive Ability/General Mental Ability testing
Underlying concepts or constructs that the tests or groups of test questions are measuring
Factors
An advanced statistical procedure based on the concept of correlation that helps investigators identify the underlying constructs or factors being measured
Factor analysis
No formal hypothesis about the factors
Exploratory factor analysis
Factor structure specified in advance based on theory
Confirmatory factor analysis
Factor Analysis Limitations
Relies on a linear approximation
Not well suited for categorical or binary data responses
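A minimal exploratory factor analysis sketch in Python (assuming scikit-learn and NumPy are available; the respondents, items, loadings, and factor count below are invented for illustration):

```python
# Exploratory factor analysis sketch (illustrative data only).
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Simulate 200 respondents answering 6 items driven by 2 latent factors.
latent = rng.normal(size=(200, 2))                     # unobserved factor scores
loadings = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.0],
                     [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]])
items = latent @ loadings.T + rng.normal(scale=0.3, size=(200, 6))

# Exploratory: no hypothesis about the structure in advance, just request 2 factors.
fa = FactorAnalysis(n_components=2, random_state=0).fit(items)
print(np.round(fa.components_, 2))   # estimated loadings: rows = factors, columns = items
```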
A theory that relates the performance of each item to a statistical estimate of the test taker’s ability on the construct being measured
Item Response Theory
Item Response Theory Limitations
Heavily reliant on very large sample sizes
Often not feasible in organizational settings
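A small sketch of the idea behind IRT, assuming a two-parameter logistic (2PL) item; the discrimination and difficulty values below are invented:

```python
# 2PL item characteristic curve sketch (illustrative parameter values).
import math

def p_correct(theta, a=1.2, b=0.0):
    """Probability of a correct response given ability theta,
    item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# The probability of answering the item correctly rises with the test taker's ability.
for theta in (-2, -1, 0, 1, 2):
    print(theta, round(p_correct(theta), 2))
```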
A good psychological test
Representative sample of behaviors
Standardized testing conditions
Scoring rules
Statements by professionals regarding what they believe are appropriate and inappropriate behaviors when practicing the profession
Ethical standards
APA Ethical Principles
Beneficence and Nonmaleficence, Fidelity and Responsibility, Integrity, Justice, Respect for People’s Rights and Dignity
Estimates a person’s standing on the underlying trait being measured, rather than their score on the test
Item Response Theory
Assigning numbers to phenomena according to specific rules
Measurement
Most basic level of measurement
Data “in name only”
Numbers assigned to categories to give them labels
Nominal Scale
Same properties as the nominal scale, but adds order
Doesn’t tell us the distance between adjacent ranks
Ordinal Scale
The distance from one point on the scale to the next is the same
Interval Scale
Adds an absolute zero point, representing the complete absence of the property measured
Ratio Scale
Shows the observed distribution of scores
Histogram
Where distributions are centered
Central tendency
How spread out (distributed) are groups of scores
Variability
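A minimal sketch, with made-up scores, of the two ideas above: the mean as a measure of central tendency and the standard deviation as a measure of variability.

```python
import statistics

scores = [72, 85, 90, 66, 78, 95, 81]   # made-up test scores
print(statistics.mean(scores))          # central tendency: where the scores are centered
print(statistics.stdev(scores))         # variability: how spread out the scores are
```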
Test scores achieved by an identified group of individuals, used as a reference for interpreting individual scores
Norms
X = T + E
X = observed score, T = true score, E = random error
Average score obtained if an individual took a test an infinite number of times
Can never truly be known
Random errors cancel each other out over an infinite number of times
True score
Difference between the true score and the observed score
Over an infinite number of testing occasions, ____ error will be zero
Also reduced by adding more items
Normally distributed
Random error
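A quick simulation of X = T + E with invented numbers: over many hypothetical administrations, random errors cancel out and the average observed score approaches the true score.

```python
import random

random.seed(0)
true_score = 50                                   # T: the (unknowable) true score

# Each administration adds random error E to the true score.
observed = [true_score + random.gauss(0, 5) for _ in range(100_000)]

# Random errors cancel out on average, so the mean observed score approaches T.
print(round(sum(observed) / len(observed), 2))    # ~50.0
```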
Single source of error which always increases or decreases the true score by the same amount
Hard to detect
Practice effects and order effects are ______ error
Does not reduce reliability
Systematic error
Evidence that the interpretations that are being made from the scores on a test are appropriate for their intended purpose
Validity
The extent to which the questions on a test are representative of the material that should be covered by the test
Content validity
An attribute, trait, or other characteristic that is abstracted from observable behavior
Construct
The behavior we want to predict
Criterion
The extent to which the scores on a test correlate with scores on a measure of performance or behavior
Criterion-related validity
Evidence that a test relates to other tests and behaviors as predicted by theory
Construct validity
Perceptions of the test takers that the test measures what it intended to measure
Face Validity
A method of defining a construct by identifying its relationships with as many other constructs as possible
Nomological network
Two measures labeled with the same construct but uncorrelated
Jingle
Two measures labeled with different constructs but correlated
Jangle
The measure(s) of performance (or some other outcome) that we expect to correlate with test scores
Criterion
Assess job applicants on the predictor; measure the criterion later, after they are hired
Predictive method
Assess current employees on both predictors and criteria, or examine previously existing data
Concurrent method
Correlation between test scores (predictors) and performance (criterion) representing the strength of the validity evidence
Validity coefficient (r)
The amount of shared variance between predictor and criterion
Coefficient of determination (R²)
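A small sketch, with simulated data, of how these two statistics relate: squaring the validity coefficient r gives the proportion of variance shared by the predictor and the criterion.

```python
import numpy as np

rng = np.random.default_rng(1)
test_scores = rng.normal(size=300)                       # predictor
performance = 0.5 * test_scores + rng.normal(size=300)   # criterion

r = np.corrcoef(test_scores, performance)[0, 1]          # validity coefficient
print(round(r, 2), round(r ** 2, 2))                     # r and shared variance r^2
```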
The process of administering a test to another sample of test takers, representative of the target population
Cross-validation
A reduction in the correlation between predictors and criteria, due to random error, observed when results from the first sample are compared with those from the second (cross-validation) sample
Shrinkage
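A sketch of shrinkage with invented data (assuming NumPy): prediction weights estimated in one sample capitalize on chance, so the validity coefficient is typically lower in a second, cross-validation sample.

```python
import numpy as np

rng = np.random.default_rng(2)

def draw_sample(n=100):
    x = rng.normal(size=(n, 5))               # five predictors
    y = 0.4 * x[:, 0] + rng.normal(size=n)    # criterion (only one predictor matters)
    return x, y

x1, y1 = draw_sample()                        # original sample
x2, y2 = draw_sample()                        # cross-validation sample

# Least-squares weights estimated in the first sample.
w, *_ = np.linalg.lstsq(x1, y1, rcond=None)

r_first = np.corrcoef(x1 @ w, y1)[0, 1]       # validity in the original sample
r_cross = np.corrcoef(x2 @ w, y2)[0, 1]       # validity in the cross-validation sample
print(round(r_first, 2), round(r_cross, 2))   # second value is typically smaller
```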
Scores on a test taken by different subgroups in the population (e.g., men and women; minority and majority) need to be interpreted differently because of some characteristic of the test not related to the construct being measured
Measurement bias