Lesson 1-3 Flashcards
The process of measuring Psychology-related variables by means of devices or procedures designed to obtain a sample of behavior.
Psychological Testing
It is the gathering and integration of Psychology-related data for the purpose of making a psychological evaluation, accomplished through the use of tools such as tests, interviews, case studies, behavioral observation, and specially designed apparatuses and measurement procedures.
Psychological Assessment
To obtain some gauge, usually numerical in nature, with regard to an ability or attribute
Objective of Testing
To answer a referral question, solve a problem, or arrive at a decision through the use of tools of evaluation
Objective of Assessment
May be individual or group in nature
Process of Testing
It is typically individualized
Process of Assessment
Tester is not the key to the process
Role of Evaluator in Testing
Assessor is the key to the process:
selecting tests/tools and drawing conclusions
Role of Evaluator in Assessment
Requires technician-like skills: administering, scoring, and interpreting
Skill of Evaluator in Testing
Requires an educated selection of tools of evaluation, skills in evaluation, and integration of data
Skill of Evaluator in Assessment
Yields a test score or a series of test scores
Outcome of Testing
Entails a logical problem-solving approach to shed light on a referral question
Outcome of Assessment
Process of Assessment
Referral, Initial Meeting, Tool Selection, Formal Assessment, Report Writing, Feedback Sessions
From: Teacher, Counselor, Health Provider, Employer, Individual
Referral
Intake Interview (clarify reason for referral)
Initial Meeting
Preparation for assessment
Tool Selection
Actual assessment begins
Formal Assessment
Writes a report of the findings that is designed to answer the referral question
Report Writing
Between client and assessor (third parties may be scheduled)
Feedback Sessions
8 Tools of Psychological Assessment
Test, Interview, Portfolio, Case History Data, Behavioral Observation, Role-Play Tests, Computers, Other tools
A measuring device or procedure
Test
Device or procedure designed to measure variables related to Psychology
Psychological Test
Almost always involves analysis of a sample of behavior
Psychological Test
The behavioral sample could range from responses to a pencil-and-paper questionnaire, to oral responses to questions, to the performance of some task.
Psychological Test
Method of gathering information through direct communication involving reciprocal exchange
INTERVIEW
Face-to-face: Verbal and non-verbal behavior
Face-to-face
Changes in voice pitch, long pauses, signs of emotions
Telephone
online interview, e-mail interview, text messaging
Electronic
Samples of one’s ability and accomplishment
Portfolio
Refers to records, transcripts, and other accounts in written, pictorial, or other form that preserve archival information, official and informal accounts, and other data and items relevant to an assessee
CASE HISTORY DATA
Monitoring the actions of others or oneself by visual or electronic means while recording quantitative and/or qualitative information regarding those actions
Behavioral Observation
Tool of assessment wherein assessees are directed to act as if they were in a particular situation
ROLE-PLAY TESTS
Can serve as test administrators and as highly efficient test scorers
Computer
Mere listing of scores
Simple scoring
Statistical analyses
Extended scoring
Numerical or narrative statements
Interpretive
Written in language appropriate for communication between professionals, may provide expert opinion (analysis of data)
Consultative
Inclusion of data from sources other than the test
Integrative
Video, thermometer, sphygmomanometer
OTHER TOOLS
Create tests or other methods of assessment
Test Developer
Clinicians, counselors, school psychologists, human resources personnel, etc.
Test User
Anyone who is the subject of an assessment or an evaluation
Test-taker
Evolving society causes changes to psychological variables
Society at large
Tests or aids that can be adequately administered, scored, and interpreted with the aid of the manual and a general orientation
Level A
Achievement, Proficiency
Level A
Tests or aids that require some technical knowledge of test construction and use, and of supporting psychological and educational fields
Level B
Aptitude
Level B
Tests or aids that require substantial understanding of testing and supporting psychological fields together with supervised experience in the use of these devices
Level C
Projective tests, Individual Mental Tests
Level C
The nature of the transformation of the test into a form ready for administration to individuals with disabling conditions will depend on the nature of the disability
Testing people with disabilities
Legal and Ethical Considerations
*Rights of Testtakers
- Right of Informed Consent
- Right to be Informed of Test Findings
- Right to privacy and confidentiality
- Right to the Least Stigmatizing Label
- Psychological Traits and States exist
- Psychological Traits and States can be quantified and measured
- Test-related behavior predicts non-test related behavior
- Tests and measurement techniques have strengths and weaknesses
- Various sources of error are part of the assessment process
- Testing and Assessment can be conducted in a fair and unbiased manner
- Testing and Assessment benefit society
Some Assumptions about
Psychological Testing and
Assessment
Any distinguishable, relatively enduring way in which one individual varies from another
Psychological Traits and States exist
Trait
Also distinguishes one person from another but is relatively less enduring
Psychological Traits and States exist
State
Test developers provide test users with a clear operational definition of the construct under study/assessment.
Psychological Traits and States can be quantified and measured
Once having defined the trait, state, or other construct to be measured, a test developer considers the types of item content that would provide insight into it.
Psychological Traits and States can be quantified and measured
Measuring traits and states by means of a test also entails appropriate ways to score the test and interpret the result.
Psychological Traits and States can be quantified and measured
The tasks in some tests mimic the actual behaviors that the test user is trying to understand.
Test-related behavior predicts non-test related behavior
The obtained sample of behavior is typically used to make predictions about future behavior.
Test-related behavior predicts non-test related behavior
In some forensic matters, psychological tests may be used not to predict behavior but to postdict it.
Test-related behavior predicts non-test related behavior
Understanding of behavior that has already taken place
Postdict
- Complex nature of violence
- Low base rate
- False positives and false negatives
- Dynamic nature of behavior
- Ethical and legal concerns
- Cultural and social bias
- Inadequate data and research
- Limited understanding of causality
- Contextual factors
Why do you think it is difficult to predict violence by means of a test?
Competent test users understand and appreciate the limitations of the tests they use, as well as how those limitations might be compensated for by data from other sources.
* Users understand:
* How a test was developed
* Circumstances under which it is appropriate
* How it should be administered and to whom
* How results should be interpreted
Tests and other measurement techniques have strengths and weaknesses
-How a test was developed
-Circumstances under which it is appropriate
-How it should be administered and to whom
-How results should be interpreted
Users understand
Refers to factors, other than what a test attempts to measure, that will influence performance on the test
Various sources of error are part of the assessment process
Error
Component of a test score attributable to sources other than the trait or ability measured
Error variance
Potential sources of error variance
- Assessee
- Assessor
- Measuring instruments
- All major test publishers strive to develop instruments that are fair when used in strict accordance with guidelines in the test manual.
- One source of fairness-related problems is the test user who attempts to use a particular test with people whose background and experience are different from the background and experience of people for whom the test was intended.
Testing and Assessment can be conducted in a fair and unbiased manner
In a world without tests or assessment procedures:
1. People could present themselves as professionals regardless of their background, ability, or professional credentials.
2. Personnel might be hired on the basis of nepotism rather than documented merit.
3. Teachers and school administrators could arbitrarily place children in different types of special classes simply because that is where they believed the children belonged.
Testing and Assessment benefit society
What is a “good” test?
Criteria for a good test:
* Clear instructions for administration, scoring, and interpretation
* Offers economy in the time and money it takes to administer, score, and interpret it
* Measures what it purports to measure
Psychometric Soundness
Reliability, Validity
Involves the consistency of the tool
Reliability
Measures what it purports to measure
Validity
Refers to the consistency and stability of the results obtained from a particular assessment tool or measurement instrument.
* High _________ is crucial in psychological testing because it indicates that the results are dependable and not subject to significant fluctuations or random errors.
Reliability
Reliability Estimates
- Test-Retest
- Parallel-Forms and Alternate Forms
- Split-Half
- Inter-Rater Reliability
- Internal Consistency
- Others
- Refers to the extent to which a test or assessment tool accurately and effectively measures the specific psychological construct it is intended to assess.
- It is a critical concept because it ensures that the results obtained from a test are meaningful and relevant for the purpose for which the test was designed.
Validity
Types of Validity
- Content Validity
- Criterion-Related Validity
- Construct Validity
- Face Validity
Refers to the established standards or reference points that allow test scores to be interpreted in a meaningful way.
Norms
Also referred to as normative data
Norms
Provide context by comparing an individual’s or group’s test scores to a representative sample of people who have taken the same test under similar conditions.
Norms
Norm-referenced testing and assessment
Process of administering a test to a representative sample of testtakers under clearly specified conditions; the data are then scored and interpreted for the purpose of establishing norms.
Standardization
A portion of the universe of people deemed to be representative of the whole population
Sample
Process of selecting the portion of the universe deemed to be representative of the whole population
Sampling
Population is divided into subgroups, called strata, based on certain characteristics or attributes that are of interest to the researcher
Stratified Sampling
Population is divided into subgroups, called strata, based on characteristics; involves random selection of participants from each stratum
Stratified-random sampling
Selecting individuals or groups from a population based on predetermined criteria and the researcher’s judgment
Purposive Sampling
Occurs when data are gathered opportunistically, as the opportunity arises, without the primary intention of conducting formal research
Incidental Sampling
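The stratified-random sampling card above can be sketched in code. A minimal sketch, assuming a population of simple records and an illustrative stratifying characteristic (the field names and sizes here are made up for the example):

```python
import random

def stratified_random_sample(population, stratum_of, per_stratum, seed=0):
    """Divide the population into strata using stratum_of, then draw a
    random sample of per_stratum participants from each stratum.
    (Illustrative sketch; real norming studies size strata proportionally.)"""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical population: 20 people, stratified by a "sex" field
population = [{"id": i, "sex": "F" if i % 2 else "M"} for i in range(20)]
sample = stratified_random_sample(population, lambda p: p["sex"], 3)
```

Each stratum contributes the same number of randomly chosen participants, which is what distinguishes stratified-random from purely incidental sampling.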
Basic steps:
1. Define the test and its purpose.
2. Identify the target population.
3. Collect data from the target population.
4. Collect demographic information.
5. Score the test.
6. Analyze the data.
7. Create norm tables or charts.
8. Interpret the norms.
9. Publish the norms.
10. Regularly update norms.
11. Ensure that ethical guidelines are followed.
Developing Norms
Types of Norms
- Percentile
- Age Norms
- Grade Norms
- National Norms
- National Anchor Norms
- Subgroup Norms
- Local Norms
Divides the distribution into 100 equal parts
Percentile
Used in the context of norms to indicate the relative standing or performance of an individual or a group within a larger population.
Percentile
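The percentile cards above can be made concrete with a short sketch. This uses one common definition of a percentile rank (the percentage of norm-group scores falling *below* a given score); some texts also credit half of the tied scores, so treat this as one convention among several:

```python
def percentile_rank(score, norm_scores):
    """Percentage of scores in the norm group that fall below `score`.
    (Simple "percent below" convention; tie-handling varies by text.)"""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)

# A score of 5 against a hypothetical norm group of 1..10:
pr = percentile_rank(5, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])  # 40.0
```

A percentile rank of 40 means the testtaker scored higher than 40% of the norm group, which is exactly the "relative standing within a larger population" the card describes.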
Based on the principle that individuals of different ages may have varying abilities, characteristics, and developmental stages.
Age Norms
Used to evaluate an individual’s performance, development, or behavior in relation to what is considered typical or expected for their age group.
Age Norms
Typically used in the context of standardized tests and assessments to evaluate how students in a particular grade are performing in relation to their peers of the same grade.
Grade Norms
Used to assess and compare the performance or characteristics of a specific group or population within a given country
Provide a benchmark for understanding how individuals or groups in the country compare to the larger national population in terms of various attributes
National Norms
Provide a common basis for comparing scores on different tests; derived by administering the tests to be anchored to a common sample so that equivalency tables of scores can be developed
National Anchor Norms
Derived by examining data from subgroups or subpopulations that share common characteristics, such as gender, age, ethnicity, socioeconomic status, or other demographic factors
Subgroup Norms
Used to evaluate and compare the performance of students or educational institutions within a specific local or regional context
Local Norms
Typically derived from data collected from schools, districts, or educational institutions within a particular geographic area
Local Norms
Compare an individual’s performance to that of a norming or reference group
Norm-Referenced
Aim to determine how a testtaker’s performance ranks relative to others
Norm-Referenced
Scores: percentiles or standard scores
Norm-Referenced
Determine whether a student has achieved specific learning objectives, skills, or standards
Criterion-Referenced
Focus on the mastery of content or skills
Criterion-Referenced
Scores: predefined criterion or standard
Criterion-Referenced
Refers to the consistency in measurement
Reliability
An index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance
Reliability coefficient
A statistic useful in describing sources of test score variability
Variance
Refers to all the factors associated with the process of measuring some variable other than the variable being measured.
MEASUREMENT ERROR
Caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process
Random Error
Error that is typically constant or proportionate to what is presumed to be the true value of the variable being measured
Systematic Error
SOURCES OF VARIANCE
TEST CONSTRUCTION
TEST ADMINISTRATION
TEST SCORING & INTERPRETATION
Item sampling or content sampling
TEST CONSTRUCTION
Test environment, testtaker variables, examiner-related variables
TEST ADMINISTRATION
Scorers and scoring systems
TEST SCORING & INTERPRETATION
Obtained by correlating pairs of scores from the same people on two different administrations of the same test
TEST-RETEST RELIABILITY
Appropriate: reliability of a test that purports to measure something that is relatively stable over time
TEST-RETEST RELIABILITY
Possible source of error variance: passage of time
TEST-RETEST RELIABILITY
When the interval between testings is greater than 6 months (test-retest reliability)
Coefficient of stability
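The test-retest cards above boil down to a Pearson r between two administrations of the same test to the same people. A minimal sketch (scores are illustrative, not from any real test):

```python
from statistics import mean

def pearson_r(first, second):
    """Pearson correlation between scores from two administrations
    of the same test to the same people (test-retest reliability)."""
    m1, m2 = mean(first), mean(second)
    cov = sum((a - m1) * (b - m2) for a, b in zip(first, second))
    var1 = sum((a - m1) ** 2 for a in first)
    var2 = sum((b - m2) ** 2 for b in second)
    return cov / (var1 * var2) ** 0.5

# Hypothetical scores for five testtakers, tested twice:
time_1 = [10, 12, 9, 15, 11]
time_2 = [11, 12, 8, 14, 12]
r_tt = pearson_r(time_1, time_2)  # high r suggests a stable trait measure
```

Anything that changes between administrations, such as the passage of time the cards mention, shows up as a lower r.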
Degree of relationship between various forms of a test that can be evaluated by means of an alternate-forms or parallel forms coefficient of reliability
PARALLEL-FORMS & ALTERNATE-FORMS
COEFFICIENT OF EQUIVALENCE
Obtained by administering different versions of an assessment tool (both versions must contain items that probe the same construct) to the same group of individuals at the same time
PARALLEL-FORMS
Consistency of test results between two different, but equivalent, forms of a test.
Used when it is necessary to have two forms of the same test (administered at different times)
ALTERNATE FORMS
DEGREE OF CORRELATION AMONG ALL ITEMS
SINGLE ADMINISTRATION OF A SINGLE FORM OF A TEST
USEFUL: HOMOGENEITY OF THE TEST
INTERNAL CONSISTENCY
Obtained by correlating two pairs of scores from equivalent halves of a single test administered once
SPLIT-HALF
- Divide the test into equivalent halves:
* Randomly assign items to one or the other half of the test
* Odd-even reliability
* Divide the test by content
- Calculate a Pearson r between scores on the two halves of the test.
- Adjust the half-test reliability using the Spearman-Brown formula.
* The Spearman-Brown formula allows a test developer or user to estimate internal consistency from a correlation of two halves of a test
Interpretation: at least 0.70 or higher to establish reliability
COMPUTATION OF A COEFFICIENT OF SPLIT-HALF RELIABILITY
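The Spearman-Brown adjustment in the last step above is a one-line formula; a minimal sketch:

```python
def spearman_brown(r, n=2):
    """Spearman-Brown formula: estimate the reliability of a test
    lengthened by a factor of n, given correlation r. For split-half
    reliability, n = 2 (the half-test is doubled back to full length)."""
    return (n * r) / (1 + (n - 1) * r)

# A half-test correlation of .70 adjusts upward for the full test:
full_test_r = spearman_brown(0.70)
```

The adjustment is needed because a half-length test is less reliable than the full test, so the raw half-test correlation understates the full test's reliability.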
A statistic of choice for determining the inter-item consistency of dichotomous items
Kuder-Richardson formula 20 or KR-20
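KR-20 itself can be sketched directly from its definition. A minimal sketch for dichotomous (0/1) items, using population variance (texts differ on the variance convention, so treat the exact values as one convention):

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomous items.
    responses: rows are testtakers, columns are 0/1 item scores."""
    n = len(responses)            # number of testtakers
    k = len(responses[0])         # number of items
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n   # proportion passing item i
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)
```

When every item agrees perfectly with every other item, the item variances are small relative to the total-score variance and KR-20 approaches 1.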
Appropriate for use on tests containing non-dichotomous items
Calculated to help answer questions about how similar sets of data are
COEFFICIENT ALPHA
Note: It is possible to conceive of data sets that would yield negative values of alpha. If this happens, alpha coefficient should be reported as 0
COEFFICIENT ALPHA
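Coefficient alpha generalizes KR-20 to items on any scale; for 0/1 items the two give the same value. A minimal sketch, again using population variance:

```python
def coefficient_alpha(responses):
    """Cronbach's coefficient alpha. responses: rows are testtakers,
    columns are item scores on any scale (Likert, 0/1, etc.)."""
    n = len(responses)
    k = len(responses[0])

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    sum_item_vars = sum(var([row[i] for row in responses]) for i in range(k))
    total_var = var([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)
```

As the note above says, contrived data sets can yield negative alpha; in that case the coefficient is conventionally reported as 0.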
Focuses on the degree of difference that exists between item scores
APD
Average Proportional Distance
- Calculate the absolute differences between scores for all pairs of items.
- Average the differences between scores.
- Obtain the APD by dividing the average difference between scores by the number of response options on the test, minus one.
* An obtained value of .2 or lower: excellent internal consistency
* A value of .25 to .2: acceptable range
COMPUTATION OF AVERAGE PROPORTIONAL DISTANCE
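The APD steps above translate almost directly to code. A minimal sketch that follows the cards' procedure for a single testtaker's item scores (published treatments compute it across testtakers and average; the scores here are illustrative):

```python
from itertools import combinations

def average_proportional_distance(item_scores, n_options):
    """APD per the steps above: average the absolute differences between
    scores on all pairs of items, then divide by (response options - 1)."""
    diffs = [abs(a - b) for a, b in combinations(item_scores, 2)]
    return (sum(diffs) / len(diffs)) / (n_options - 1)

# Three items on a hypothetical 5-point scale:
apd = average_proportional_distance([1, 2, 3], 5)
```

Lower values mean the item scores sit close together, i.e., better internal consistency, matching the .2-or-lower benchmark above.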
Also referred to as “scorer reliability,” “judge reliability,” and “observer reliability”
* Degree of agreement or consistency between two or more scorers with regard to a particular measure
* If consensus can be demonstrated in the ratings, the researchers can be more confident regarding the accuracy of the ratings and their conformity with the established rating system.
* Method: Calculate a coefficient of correlation
INTER-SCORER RELIABILITY