Quiz 2 Flashcards
Five principles of assessment
- It is thorough 
- It uses a variety of assessment methods
- It is valid
- It is reliable
- It is tailored to the individual client
Possible essay question about five principles of assessment
A good assessment must have foundational integrity. There are five principles we can use to test whether an assessment is a good measuring tool. First, a good assessment is thorough. It should gather as much relevant information as possible so that a correct diagnosis can be reached and meaningful recommendations for therapy can be made. Second, a good assessment uses a variety of assessment methods. It should consist of interviews, case history information, and client observations along with formal and informal testing. Third, a good assessment is valid. This means it evaluates the intended skills. Fourth, a good assessment is reliable. Repeated administration of the same test to the same child should yield the same results. Finally, a good assessment is tailored to the individual client. It should be appropriate for the client's age, gender, skill level, and cultural background.
Norm-reference tests
They are always standardized. They compare an individual's performance to the performance of a larger group. This type of test is preferred by school districts and insurance companies for third-party payment and qualification purposes.
Mean
Determines the peak of the bell curve and represents the average performance.
(In a perfect distribution, the peak also depicts the median, which is the middle of the distribution, and the mode, which is the most frequently occurring score.)
Empirical Rule
For a normal bell curve, the empirical rule states that:
68% of all outcomes fall within one standard deviation of the mean (34% on each side)
95% of all outcomes fall within two standard deviations of the mean (47.5% on each side)
99.7% of all outcomes fall within three standard deviations of the mean (49.85% on each side)
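The three percentages above can be checked with the standard library alone, since for a normal distribution the probability of falling within k standard deviations of the mean is erf(k/√2):

```python
import math

def within_k_sd(k: float) -> float:
    """Probability that a normally distributed outcome falls
    within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {within_k_sd(k):.1%}")
# → within 1 SD: 68.3%
# → within 2 SD: 95.4%
# → within 3 SD: 99.7%
```

The flashcard's 68/95/99.7 figures are rounded versions of these exact values.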
Mode
Most frequently occurring score
Median
The middle of the distribution.
In a perfect distribution, the peak also depicts the median.
Standard deviation
A measurement used to quantify the variability, or dispersion, of a set of given values around the mean.
Criterion-referenced tests
May or may not be standardized.
Don't attempt to compare an individual's performance to anyone else's.
Identify what a client can and can't do compared to a predefined criterion.
How does my client compare to an expected level of performance?
Assume that there is a level of performance that must be met for a behavior to be acceptable.
**Used most often when assessing a client for neurologic disorders, fluency and voice disorders
Authentic assessment
These are non-traditional assessments.
Identify what a client can and can't do.
Emphasis on contextualized test stimuli.
It is ongoing. Information is maintained in a client portfolio. Provides an opportunity for the client to self-monitor and self-evaluate.
Systematic observations: social skills observed during lunch, recess.
Real-life situations: language sampling, structured symbolic play.
Self-monitoring and self-assessment. Use of anecdotal notes and checklists, video and audio taping, and involvement of caregivers and other professionals.
Dynamic assessment
This is a form of authentic assessment.
Purpose is to evaluate a client's learning potential based on his or her ability to modify responses after the clinician provides teaching or other assistance. (Test-teach-retest)
Great for kids with cognitive-communicative disorders and kids from culturally and linguistically diverse backgrounds.
Mediated learning experience (MLE)
The clinician teaches strategies specific to the skill being evaluated, observing the client's response to instruction and adjusting teaching accordingly.
Test-teach-retest
Validity
Test truly measures what it claims to measure.
Validity types
Face validity
Content validity
Construct validity
Criterion validity
Concurrent validity
Predictive validity
Face validity
Test looks like it assesses the skills it claims to assess. Face validity alone is not a valuable measure of validity because it is based merely on appearance, not on content or outcome.
Content validity
Test contents are representative of the content domain of the skill being assessed.
A valid articulation test should elicit all phonemes, thereby assessing the full spectrum of articulation. Content validity is related to face validity; content validity, though, judges the actual content of the test rather than superficial appearance and is judged by individuals with expert knowledge.
Construct validity
Measures a predetermined theoretical construct, which is an explanation of a behavior or attribute based on empirical observation. For example, the theoretical construct that preschool children's language skills improve with age is based on language development studies. Therefore, a valid test of early language development will show improved language skills when administered to normally developing preschool children of progressively increasing age.
Criterion validity
Refers to validity that is established by use of an external criterion. There are two types of criterion validity.
Concurrent validity refers to a test's validity in comparison to a widely accepted standard.
For example, the Stanford-Binet Intelligence Scale is already accepted as a valid assessment of intelligence. Newer intelligence tests are compared to the Stanford-Binet, which serves as the criterion measure.
Predictive validity refers to a test's ability to predict performance (the criterion measure) in another situation or at a later time. It implies that there is a known relationship between the behaviors the test measures and the behaviors or skills exhibited at some future time. College entrance exams, such as the Graduate Record Examination (GRE), are used because of their predictive validity. The GRE scores are expected to predict future academic performance.
Reliability
Results are replicable. When administered properly, the test gives consistent results on repeated administrations or with different interpreters judging the same administration. There are several types of reliability.
Types of reliability
- Test-retest reliability
- Split-half reliability
- Rater reliability
- Intra-rater reliability
- Inter-rater reliability
- Alternate form reliability/parallel form reliability
Test-retest reliability
Refers to a test's stability over time. It is determined by administering the same test multiple times to the same group and then comparing the scores. If the scores from the different administrations are the same or very similar, the test is considered stable and reliable.
Split-half reliability
Refers to a test's internal consistency. Scores from one half of the test should correlate with results from the other half of the test. The halves must be comparable in style and scope, and all items should assess the same skill. This is often achieved by dividing the test into odd-numbered questions and even-numbered questions.
Rater reliability
Refers to the level of agreement among individuals rating a test. It is determined by administering a single test and audio or video taping it so it can be scored multiple times. There are two types of rater reliability.
Intra-rater reliability is established if results are consistent when the same person rates the test on more than one occasion.
Inter-rater reliability is established if results are consistent when more than one person rates the test.

Alternate form reliability (also called parallel form reliability)
Refers to a test's correlation coefficient with a similar test. It is determined by administering a test (test A) to a group of people and then administering a parallel form of the test (test B) to the same group of people. The two sets of test results are compared to determine the test's alternate form reliability.
Formal tests
Also called standardized tests.
These provide standard procedures for the administration and scoring of the test.
Standardization is accomplished so that test-giver bias and other extraneous influences do not affect the client's performance and so that results from different people are comparable.
Most of the standardized tests clinicians use are norm-referenced, but standardized is not synonymous with norm-referenced.
Standardized
Standardized is not synonymous with norm referenced.
Standardized tests, also called formal tests, are those that provide standard procedures for administration and scoring of the test.
Any type of test can be standardized as long as uniform test administration and scoring are used.
Test developers are responsible for clearly outlining the standardization and psychometric aspects of the test. Each test manual should include information about:
The purpose of the test
The age range for which the test is designed and standardized
Test construction and development
Administration and scoring procedures
The normative sample group and statistical information derived from it
Test reliability
Test validity
Chronological age
The exact age of the person in years, months, and days. It is important for analyzing findings on standardized tests, as it allows the clinician to convert raw scores into meaningful scores.
To calculate chronological age:
- Record the test administration date as years, months, days.
- Record the client's birthdate as years, months, days.
- Subtract the birthdate from the test date. If necessary, borrow 12 months from the years column and add them to the months column, reducing the years by one, and/or borrow 30 or 31 days (based on the number of days in the month being borrowed from) from the months column and add them to the days column, reducing the months by one.
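The column-subtraction steps above can be sketched in Python. This sketch counts every borrowed month as 30 days for simplicity, a common clinical convention; some test manuals instead use the actual length of the month being borrowed from:

```python
def chronological_age(test_date, birth_date):
    """Compute chronological age as (years, months, days) by column
    subtraction. Dates are (year, month, day) tuples. A borrowed month
    is counted as 30 days here; some manuals use the actual month length."""
    ty, tm, td = test_date
    by, bm, bd = birth_date
    years, months, days = ty - by, tm - bm, td - bd
    if days < 0:           # borrow 30 days from the months column
        days += 30
        months -= 1
    if months < 0:         # borrow 12 months from the years column
        months += 12
        years -= 1
    return years, months, days

# Example: tested 2024-03-10, client born 2017-07-25
print(chronological_age((2024, 3, 10), (2017, 7, 25)))  # → (6, 7, 15)
```

For adjusted age (next card), the same calculation is run with the child's due date in place of the birthdate.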
Adjusted age/corrected age
For premature infants and toddlers. Takes into account the gestational development that was missed due to premature delivery. Adjusted age is determined by using the child's due date, rather than the actual birthdate, when calculating chronological age. Adjusted age becomes less relevant as a child grows, and is generally not a consideration for children over the age of three.
Basal
Refers to the starting point for test administration and scoring. Allows the tester to home in on only the most relevant testing materials. It would not be worthwhile or efficient, for example, to spend time assessing pre-speech babbling skills in a client who speaks in sentences, or vice versa.
Ceiling
Refers to the ending point for test administration and scoring. Allows the tester to home in on only the most relevant testing materials. It would not be worthwhile or efficient, for example, to spend time assessing pre-speech babbling skills in a client who speaks in sentences, or vice versa.
Z-score
Also called standard score, it allows the clinician to compare the client's score to the normative sample. The z-score tells how many standard deviations the raw score is from the mean. The z-score is useful because it shows where an individual score lies along the continuum of the bell-shaped curve, and thus tells how different the test taker's score is from the average.
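The definition above reduces to a one-line formula, z = (raw − mean) / SD. A minimal sketch (the mean of 100 and SD of 15 are hypothetical norms for illustration, not values from these notes):

```python
def z_score(raw: float, mean: float, sd: float) -> float:
    """Standard score: how many standard deviations the raw score
    lies from the normative sample's mean."""
    return (raw - mean) / sd

# Hypothetical norms: mean 100, SD 15
print(z_score(85, 100, 15))   # → -1.0  (one SD below the mean)
print(z_score(130, 100, 15))  # → 2.0   (two SDs above the mean)
```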
Percentile rank
Another expression of an individual's standing in comparison to the normal distribution. The percentile rank tells us the percentage of people scoring at or below a particular score. For example, scoring in the 75th percentile indicates that the individual scored as high as or higher than 75% of the people taking the same test. The 50th percentile is the median; 50% of the test takers scored at or below the median score.
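When scores are normally distributed, a percentile rank can be derived from a z-score with the standard normal cumulative distribution function; a minimal stdlib-only sketch:

```python
import math

def percentile_rank(z: float) -> float:
    """Percentage of a normal distribution scoring at or below
    a given z-score (standard normal CDF * 100)."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(percentile_rank(0.0)))   # → 50  (the mean/median)
print(round(percentile_rank(1.0)))   # → 84  (one SD above the mean)
```

This also ties back to the empirical rule: a z-score of 0 is the 50th percentile, and one SD above the mean is roughly the 84th (50% + 34%).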
Stanine (standard nine)
An additional method of ranking an individual's test performance. A stanine is a score based on a nine-unit scale where a score of five describes average performance. Each stanine unit (except 1 and 9) is equally distributed across the curve. Most people (54%) score a stanine of four, five, or six; fewer people (8%) score a stanine of one or nine.
Confidence interval
Represents the degree of certainty on the part of the test developer that the statistical values obtained are true. Confidence intervals allow natural human variability to be taken into consideration. Many test manuals provide statistical data for a confidence interval of 95% (some lower, but the higher the better when considering test reliability). This allows the clinician to obtain a range of possible scores in which the true value of the score exists 95% of the time. In other words, a 95% confidence interval provides a range of reliable scores, not just a single reliable score.
Age equivalent, grade equivalent
Be aware that these scores are the least useful and most misleading scores obtained from a standardized test.
An age equivalent score is the average raw score for a particular age. For example, if 30 is the average raw score for eight-year-olds, then all test takers who obtain a raw score of 30 obtain an age equivalent score of eight years. Although it seems logical that raw scores transfer easily to age equivalents, age equivalent scores do not take into account the normal distribution of scores within a population. It would be incorrect to conclude that a 10-year-old child with an age equivalent score of eight years is performing below expectations based on age equivalent scores alone.
**Age equivalent and grade equivalent scores are not considered a reliable measure and should generally not be used**