Mid Term Flashcards
An ongoing fluid and dynamic process that continues throughout the course of the helping relationship
Assessment
Refers to any systematic procedure for collecting information that is used to make inferences or decisions about the characteristics of a person
Assessment
Is a complex problem-solving process
Assessment
Encompasses a broad array of data collection methods from multiple sources to yield relevant, accurate, and reliable information about an individual
Assessment
Considered an ongoing process of gathering information
Assessment
Often incorrectly used interchangeably with testing
Assessment
Can proceed effectively without testing
Assessment
Three ways assessment and testing overlap
Collects info
Measures
Evidence based
A single assessment instrument should never be the sole determinant of the decision-making process
True
Four purposes of assessment
Screen
Diagnose
Intervene
Monitor
Using multiple methods of data collection is referred to as a ____ approach to assessment
Multimodal
Three methods for assessment
Interview
test
observe
Instruments designed to measure specific attributes of an individual
Tests
An assessment method that involves witnessing and documenting the behavior in particular environments
Observation
There is no set number of methods or sources that are required in an
Assessment
Additional sources and methods of gaining information lead to a more ___ and ___ picture of the individual
Complete and accurate
Four steps of assessment
1) ID the problem,
2) select proper assessment methods,
3) evaluate the assessment information,
4) report results/ make recommendations
How many basic competencies created by professional associations are there?
23
What does the acronym RUST stand for?
Responsibilities of Users of Standardized Tests
Four qualifications necessary to administer and interpret standardized tests
Purpose
characteristics,
setting/ conditions,
roles of test selectors, administrators, scorers, and interpreters
As early as what year were essay exams given to civil service employees in China?
2200 BC
Whose philosophies emphasized the importance of assessing an individual's competency and aptitude?
Socrates and Plato
Who identified items to screen for learning disabilities
FitzHerbert
Who was the first to suggest a formal IQ test?
Huarte
Who had the first psychological lab?
Wundt
First IQ test creator
Binet
Who applied the theory of evolution in an attempt to demonstrate the role heredity plays in intelligence?
Galton
Who pioneered educational measurement?
Thorndike
Who wrote The Origins of Intelligence in Children?
Piaget
Who wrote The Bell Curve?
Herrnstein and Murray
What year was the No Child Left Behind Act implemented?
2001
The Individuals with Disabilities Education Improvement Act was passed in what year?
2004
Methods and sources of assessment vary greatly depending on these things
Needs (of the client),
purpose (of the assessment),
setting,
availability
May come primarily from collateral sources and records
assessment information
Assesses pathology
Standardized testing
These (two things) will seldom provide enough information to make a useful decision
Observation and interview
What is always the first step with a client no matter what direction you choose to go in
Interview
Always use one method for assessment information
FALSE
(More than one method should always be used)
What are two types of assessments
Formal and informal
Three types of assessments are
Interviews
tests
observations
What is considered the cornerstone of assessment?
Initial interview
What begins prior to other assessment methods
Initial interview
What is the primary purpose of the initial interview?
Gather background information relevant to the problem
List three things that depend on the purpose of the interview
Setting,
population,
counselor skills
What are the three categories of interviews?
Structured
semi structured
unstructured
Counselors must be able to
Establish rapport
Be warm, respectful, empathetic
Provide a safe and accepting space
Good listening skills
Effective probing and reflecting
List six interview guidelines
Physical setting
Purpose
Confidentiality
Abide by standards
Avoid why questions
Alert to verbal and nonverbal behavior
Tests that have a great impact on one's life path are called
High stakes
Five attributes that tests are used to measure
Cognition
Knowledge
Skills
Abilities
Personality traits
Purposes of tests
(6 items)
Screening
Classifying
Placing
Diagnose
Intervene
Progress
Two different types of tests
Content - (purpose)
Format- (structure)
Five categories of assessments
Intellectual
Achievement
Aptitude
Career
Personality
What are educational and psychological measurement based on?
____ ____ ____
Statistical principles and data
Six types of test variables
Qualitative
Quantitative
Continuous
Discrete
Observable
Latent
Give an example of nominal data
Gender, hair color, nationality
Give an example of ordinal measurement
Rank, grades
N.O.I.R. (the four levels of measurement)
Nominal - gender
Ordinal - rank
Interval - IQ (no true zero)
Ratio - height (true zero)
The means of putting disorganized scores in order is called
Frequency distribution
Two examples of frequency distribution are the
Histogram
frequency polygon- (Bell curve)
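As a study aid, the frequency-distribution idea above can be sketched in a few lines of Python (an illustrative example; the function name is my own, not from the course material):

```python
from collections import Counter

def frequency_distribution(scores):
    """Put disorganized scores in order and tally how often each occurs."""
    return dict(sorted(Counter(scores).items()))

# Example: seven disorganized scores become an ordered frequency table
print(frequency_distribution([85, 90, 85, 100, 90, 85, 110]))
# {85: 3, 90: 2, 100: 1, 110: 1}
```

Plotting these counts as bars gives a histogram; connecting their midpoints gives a frequency polygon.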
If a frequency distribution is symmetrical, what does it look like?
Bell shaped curve
If a distribution is asymmetrical, it is called
Skewed
If a frequency distribution is negatively skewed, it is
Skewed left (the tail extends to the left)
If a frequency distribution is positively skewed, it is
Skewed right (the tail extends to the right)
What are the measures of central tendency?
Mean mode median
What are the measures for variability? (3)
Range, variance, standard deviation
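The measures of central tendency and variability above can be computed with Python's standard library (a minimal sketch; population formulas for variance and SD are assumed, and the score list is my own example):

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Variability (population formulas)
score_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)
sd = statistics.pstdev(scores)

print(mean, median, mode, score_range, variance, sd)
```

For this sample the mean is 5, the median 4.5, the mode 4, the range 7, the variance 4, and the standard deviation 2.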
What percentage of data falls between one standard deviation
68%, 34% on either side
What percentage of data falls within two standard deviations
95%
What percent of data will fall within three standard deviations?
99.7%
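The bell-curve percentages above can be checked against the normal distribution using only the standard library (an illustrative sketch; note that the commonly cited empirical-rule values are about 68.3%, 95.4%, and 99.7%):

```python
import math

def proportion_within(k_sd):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k_sd / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k):.1%}")
```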
What are two types of scores?
Normative and criterion
Give an example of normative reference scores
Standardized test, IQ and achievement
What’s an example of criterion referenced scores?
Proficiency tests, mastery test
Five questions necessary to evaluate a normative group
1) Is the norm group representative of the intended population?
2) Is the norm group large enough?
3) Are the normative data current?
4) How were the normative data gathered?
5) Is the norm group relevant to the purpose of the test?
Six types of reference scores
(Think of complete bell curve scores)
Percentile Rank
Standard scores
Z scores
T scores
Scaled scores
Stanine
Percentage and percentile ranks are the same thing.
True or false?
False
Standard scores have a mean of ____ and a standard deviation of ____
M 100
SD 15
What is the average range for IQ?
90 to 109
These scores express performance in standard deviation units (mean 0, SD 1) and are not very sensitive to small differences
Z scores
A fixed standard score with a mean of 50 and a standard deviation of 10, with a 40 to 60 average
T scores
A fixed standard score with a mean of 10 and a standard deviation of 3; 8 to 12 is average.
Scaled scores
A fixed standard score with a mean of 5 and a standard deviation of 2, with a 1-9 range
Stanine
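Each fixed standard score above is a linear transformation of a z score, so the conversions can be sketched directly (illustrative code; Wechsler-style scaled scores with mean 10 and SD 3 are assumed, and the function names are my own):

```python
def to_z(raw, mean, sd):
    """z score: mean 0, SD 1."""
    return (raw - mean) / sd

def z_to_t(z):
    """T score: mean 50, SD 10 (40-60 average)."""
    return 50 + 10 * z

def z_to_standard(z):
    """Deviation-IQ standard score: mean 100, SD 15."""
    return 100 + 15 * z

def z_to_scaled(z):
    """Scaled score: mean 10, SD 3 (8-12 average, assumed Wechsler-style)."""
    return 10 + 3 * z

def z_to_stanine(z):
    """Stanine: mean 5, SD 2, clipped to the 1-9 range."""
    return max(1, min(9, round(5 + 2 * z)))

z = to_z(raw=115, mean=100, sd=15)  # one SD above the mean
print(z, z_to_t(z), z_to_standard(z), z_to_scaled(z), z_to_stanine(z))
# 1.0 60.0 115.0 13.0 7
```

A raw score one SD above the mean lands at T = 60, standard score 115, scaled score 13, stanine 7: the same relative standing in every metric.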
What are two types of test scores that compare performance, based on developmental levels
Age equivalent,
grade equivalent
Always include either ____ scores, or _____ ranks when interpreting test scores
Standard, percentile
Involves interpretive and descriptive data
Qualitative assessment
An IQ of 130 and above is classified as
Very superior
An IQ of 120 to 129 is classified as
Superior
Beck score of 29 to 63 indicates what
Severe depression
This measures performance against a set of standards and shows clear proficiency in specific areas.
Criterion reference
This compares an individual to a group, showing strengths relative to peers.
Norm reference
Why do we need to be careful when determining to use criterion reference or norm referenced interpretation?
It has a significant impact on validity
What type of score may not provide enough units to differentiate amongst scores?
Stanine
3 types of interviews
Structured interview
semi structured interview
unstructured interview
Types of tests
(5)
Standardized vs. Non-standardized
Individual vs. Group
Maximum vs. Typical-Performance
Objective vs. Subjective
Verbal vs. Non-verbal
The degree to which evidence and theory support the interpretation of test scores for proposed uses of the test.
Validity
A ____ can be a whole test with many parts, a test with one part, or a subtest measuring a specific characteristic
scale
Give an example of a type of scale test
Stanford Binet IQ test
A distinct exam given in one setting that can be made up of many parts of different tests
Battery
A _____ test combines measures such as an IQ test, an anxiety scale, and an autism assessment into a complete evaluation
Battery test
The national counselor exam is an example of
Computer-based tests
Give an example of a computer adaptive test
Graduate management admission test, GMAT
Monitoring and making a record of others or oneself in a particular context is called
Observation
Seeing what a person actually does in situations is called
Observation
Methods for identifying what happens immediately before and after a behavior are called
Antecedents and consequences
Gathering information to identify problem behaviors and develop interventions is called what
Functional behavior assessment
An observation that is graded and uses a rubric is what type of observation
Formal
An observation that’s not graded, and based on past performance is what type of observation
Informal
The type of observation that uses the senses, such as sight and smell
Direct observation
This type of observation is reliant on reports from others
Indirect observation
This setting offers a more accurate reflection of real-life circumstances
Natural setting
This setting is created by the observer
Contrived setting
The observer doesn’t intrude on the research context
Unobtrusive observation
The researcher becomes a participant in the culture or context of situation
Participant observation
List three methods of recording observations
Event, duration, time sampling
These measure general functioning or specific skills
Rating scales
This measures multiple domains of functioning
Broadband scales
This measures one or a few domains, more in-depth
Narrow band scales
Third-party reporters are called
Collateral sources
Provide very important data from teachers, family, and employers when the purpose is behavioral assessment
Collateral source
Required source when conducting a forensic evaluation
Collateral source
Confidentiality is very important in
Collateral source
Permission must be obtained and written consent is required for
Collateral sources
Assessments, scoring reports, adapted tests, and statistical packages such as SPSS and SAS are all ____ based
Computer
Computer-based assessments can be used as standalone, clinical evaluations- T or F
False - we should never use these as standalone
Who is ultimately responsible for the accuracy of interpretation of assessments?
The clinician
_____ requires interpretation to have meaning for the individual
Results
All of these:
Invasion of privacy,
too much reliance on a test score,
testing bias,
incriminating results,
IQ tests not measuring the intended construct,
demonstration of competency for a diploma,
calls to replace multiple-choice tests with authentic performance assessment,
too much pressure on stakeholders from high-stakes testing
Are examples of:
Controversies about assessments
Types of information needed, needs of the client, resources, timeframe for assessment, quality of the instrument, and qualifications of the counselor are all criteria in determining what
Selecting appropriate assessment instrument
There is a single source that catalogs every possible formal and informal assessment instrument. True or false?
False
References, publishers' websites, specimen sets, manuals, research literature, and professional organizations are all sources for
Locating assessment instrument information
What questions should be asked before choosing an instrument?
What is the purpose of the instrument? What is the makeup of the norm group? Are the results of the instrument reliable? Is there evidence of validity? Does the manual provide clear instructions?
What practical issues should be considered, when choosing an instrument (5 options)
Time, ease, cost, scoring,
interpretation
Self report, individually administered, group administration, computer administration, video administered, audio administered, sign language administered, nonverbal are all modes for
Administering assessment instruments
What must be done before you administer an instrument
Obtain informed consent, maintain a copy in your records at all times
Psychometric property pertaining to consistency and dependability in the production of test scores is known as
Reliability
This refers to the degree to which test scores are dependable
Reliability
Dependability, consistent and stable is the definition of
Reliability
What is one of the most important characteristics of assessment results?
Reliability
If a scale fluctuates and produces different results each time, it is said to be
Unreliable
This refers to the results obtained with an assessment instrument not the actual instrument
Reliability
Instruments are rarely totally consistent or error-free
T or f
True
The greater the amount of measurement error on test scores equals
Lower reliability
Amount of error in an instrument is called
Measurement error
Any fluctuation that results from factors related to the measurement that is irrelevant to what is being measured is called
Measurement error
The concept of true scores is totally
Theoretical
You’re really not going to have 100% of a ____ ____
True score
Some degree of error is inherent in all instruments is known as
Standard error of measurement
A simple measure of an individual's test score fluctuation if the test were given to them repeatedly is known as
Standard error of measurement
An estimation of the accuracy of an individual's observed score as compared to the true score is known as
Standard error of measurement
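The standard error of measurement has a simple formula, SEM = SD × √(1 − r), where r is the reliability coefficient. A minimal sketch (the example numbers are my own):

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r): the expected fluctuation of an observed score."""
    return sd * math.sqrt(1 - reliability)

# An IQ-style test with SD 15 and reliability .91 has an SEM of about 4.5,
# so an observed score would be expected to fluctuate roughly +/- 4.5 points.
print(standard_error_of_measurement(sd=15, reliability=0.91))
```

Note that a perfectly reliable test (r = 1.0) would have an SEM of zero, which is why the true score remains theoretical.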
What are three types of measurement error
Time sampling
Interrater differences
Content sampling
(TIC)
Repeated testing of the same individual is known as
Time sampling
The greatest source of error in instrument scores is from
Content sampling
An error that results from selecting test items that inadequately measure the content that’s intended is known as:
Content sampling
The subjectivity of the individual scoring the test is called
Interrater differences
Personality test or IQ tests are a form of what type of sampling
Content sampling
Quality of the test items, test length, test taker variables, and test administration are examples of other
Measurement errors
What is the oldest and most commonly used method of estimating reliability?
Test retest
This is most useful in measuring traits, abilities, or characteristics that are stable and generally do not change over time
Test retest
Giving two different versions or forms of the same test at the same time is called
Simultaneous administration
Giving two different versions of the same test on different days to the same group is an example of
Delayed administration
An example of a test with alternate forms for delayed or simultaneous administration is the ______
Woodcock-Johnson (Forms A and B)
Measuring the extent to which items on the instrument measure the same ability or trait is an example of
Internal consistency reliability
Having a high internal consistency reliability means that the tests are
Homogenous
If there is a strong correlation among the test items, then there is a
High degree of internal consistency
Split-half reliability,
Kuder-Richardson formula, and coefficient alpha
are three means for determining
Internal consistency
What's another name for coefficient alpha?
Cronbach's alpha (computed in SPSS)
What is used for items answered yes or no, right or wrong, or zero or one, to assess internal consistency?
Kuder-Richardson formula
A potential source of error is the lack of agreement among raters for this reliability
Interrater reliability
This can be done by correlating the scores obtained independently by two or more raters
Interrater reliability
This does not reflect content sampling or time sampling errors
Interrater reliability
Sensitive only to the differences among raters
Interrater reliability
What is a test designed to be given more than one time
Test retest, or alternate forms
This evaluates the extent to which different items on the test measure the same content
Internal consistency
If items are heterogeneous and the test measures more than one construct, the reliability will be
Low
Two types of scales that have low reliability
Joy and depression
For tests with more than one construct, what method is appropriate?
Split half method
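The split-half method above correlates two halves of the same test and then applies the Spearman-Brown correction, since a half-length test underestimates full-length reliability. A minimal sketch (the example score lists are my own):

```python
def pearson_r(x, y):
    """Pearson correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def split_half_reliability(half_a, half_b):
    """Correlate the halves, then apply Spearman-Brown: r_full = 2r / (1 + r)."""
    r = pearson_r(half_a, half_b)
    return 2 * r / (1 + r)

# Odd-item vs. even-item scores for four examinees
print(split_half_reliability([10, 8, 6, 4], [9, 8, 5, 4]))  # about 0.99
```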
Which reliability coefficients are acceptable and which are unacceptable?
.70 is acceptable
.59 is unacceptable
What does SEM stand for?
Standard error of measurement
Scores by a single individual if tested multiple times =
Standard error of measurement
Spread of scores obtained by a group of test takers on a single test
Standard deviation
Confidence intervals (bell curve)
68% within 1 SD
95% within 2 SD
99.7% within 3 SD
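Combining the SEM with the bell-curve percentages above gives a confidence interval around an observed score (a hedged sketch; the example numbers are my own):

```python
import math

def score_confidence_interval(observed, sd, reliability, z=1.0):
    """observed +/- z * SEM; z=1.0 gives ~68% confidence, z=1.96 gives ~95%."""
    sem = sd * math.sqrt(1 - reliability)
    return observed - z * sem, observed + z * sem

# IQ-style score of 100, SD 15, reliability .91 -> SEM of about 4.5, so we can
# be ~68% confident the true score falls between roughly 95.5 and 104.5.
low, high = score_confidence_interval(observed=100, sd=15, reliability=0.91)
print(round(low, 1), round(high, 1))  # 95.5 104.5
```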
Longer tests improve
Reliability
Larger number of test items can more accurately measure the ______ thus reducing content sampling errors.
Construct
Using multiple-choice items, writing unambiguous questions, making sure questions are not too hard or too easy, clearly stating administration and scoring procedures, and training those administering, grading, or interpreting the test are examples of
Factors that improve reliability
Something that is sound, meaningful, and accurate
Validity
Can be viewed as the extent to which test scores provide answers to the targeted questions
Validity
A test can be reliable but not
Valid
A test cannot be valid without being
Reliable
Does the measure produce similar results each time similar people take it?
Reliability
It measures what it claims to measure:
Validity
Refers to the appropriateness of the use and interpretation of test results, not the test itself
Validity
This is a matter of degree; it is not all or none.
Validity
This is a single unified concept
Validity
Three subtypes of validity
Content
criterion
construct
Test manuals are constructed from what types of validity
Content, criterion, construct
This type of concept looks at test content, response processes, internal structure, relations to other variables, and consequences of testing
Unitary concept
Most textbooks use which type of terminology?
Traditional: content,
criterion, construct
This is specific to a particular purpose
Validity
No test is valid for all purposes
True
What’s another name for construct?
Latent variables
What are some examples of latent variables?
Aggression, morale, happiness, quality of life
What are scientifically developed concepts/ ideas used to describe behavior called
Constructs
What cannot be measured directly or observed directly
Constructs
What is defined by a group of interrelated variables that can be measured
Construct
An example of an interrelated construct variable that can be measured
Aggression: measured by physical violence, verbal attacks, and poor social skills
If we have evidence that the interpretation of the results is valid based on the purpose of the test, then the results are considered to
Reflect the construct being measured
A measure that provides inconsistent results cannot provide
Valid scores
What is the most fundamental consideration in developing and evaluating tests?
Validity
Validity centers on the relationship between the ____ of the test and the ____ based on the test scores
Purpose, interpretation
The greater the impact the results have on someone's life, the more ____ is required
Evidence
Evidence of the relationship between the content of the test and the construct it presumes to measure is known as
Test content validity
_____ areas reflect essential knowledge, behaviors, skills that represent the construct
Content
______ content comes from educational standards, accreditation standards, school curricula, syllabi, and textbooks
Achievement
Personality and clinical inventories come primarily from
Characteristics in the DSM
This comes from job descriptions, employment assessments, and the activities, tasks, and duties of the specific job
Career
Some instruments are designed to measure a general _____, while others are designed to measure ____ components of a construct
construct,
several
The predictor variable is compared to the criterion it is designed to predict:
Criterion based evidence
(aptitude test is the predictor variable)
The predictor variable is concurrently related to some criterion (for example, depressed mood)
Concurrent evidence
The degree to which the test score estimates some future level of performance
Predictive evidence
The chosen criterion must be _____ to the intended purpose of the test
Appropriate
(Example IQ test is not predictor of morality)
Relevant, reliable, and uncontaminated are what
Qualities that criterion measures should have
A criterion measure that is not influenced by external factors unrelated to the criterion is said to be; this is the definition of
Uncontaminated
The means by which we evaluate the relationship between test results and a criterion measure
Validity coefficients
The purpose is to show that the test scores accurately predict the criterion performance
Validity coefficient
The range is from -1 to +1;
.50 is very high,
.21 is very low
Validity coefficient
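A validity coefficient is simply the Pearson correlation between test scores and the criterion measure, so it can be sketched the same way (illustrative code; the data are hypothetical):

```python
def validity_coefficient(test_scores, criterion_scores):
    """Pearson correlation between a predictor test and its criterion measure."""
    n = len(test_scores)
    mt = sum(test_scores) / n
    mc = sum(criterion_scores) / n
    cov = sum((t - mt) * (c - mc) for t, c in zip(test_scores, criterion_scores))
    var_t = sum((t - mt) ** 2 for t in test_scores)
    var_c = sum((c - mc) ** 2 for c in criterion_scores)
    return cov / (var_t * var_c) ** 0.5

# Aptitude test scores vs. later job-performance ratings (hypothetical data)
print(validity_coefficient([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))  # 0.8
```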
A means of providing evidence of internal structure of a test
Evidence of homogeneity
This can be demonstrated by a high internal consistency coefficient
Homogeneity
______ ______ between scales or subtests on a single instrument provides evidence that these components measure the construct that was intended
High correlation
____ ____ is gathered by correlating one instrument with other instruments that assess the same construct
Convergent evidence
Test developers will use other
Well-established instruments
When revising an instrument developers will use ___ ____ to compare with the latest version to be sure both are measuring the same construct
Previous versions
_____ evidence uses consistently low correlations between the test and other tests that measure different constructs
Divergent
Another means of providing evidence of construct validity is called
Group differentiation
If two groups have vastly ___ scores in a predicted way, then the test has evidence of ____
Different, construct validity
Shows the degree to which test scores change with age
Age differentiation
Source of construct validity
Experimental results
The expectation that benefits will come from the test scores is known as
Intended Evidence-based consequences
Actual and potential consequences of the test use, and the social impact are known as
Unintended consequences
The actions, processes, and emotional traits that the test taker invokes in responding to a test
Evidence based response process
Does it look legitimate? Is it too hard or too childish? Is it too long or too short? These are examples of
Evidence-based response process
(Does the test appear to test what it's intended to test?)
Disruptive behavior in a classroom is best assessed through
Observation
The degree to which instrument measures what it is supposed to measure.
Validity
The amount of variation of a random variable expected about its mean.
Standard deviation