Psychological Assessment & Testing Ch 4 Flashcards
Test battery
A selection of tests designed to measure different variables but having a common objective, e.g., an intelligence test, a personality test, and a neuropsychological test might be used together to obtain a general psychological profile of an individual.
Criterion-referenced scoring
A score on the assessment specifically indicates what a student is capable of and what knowledge they possess. Criterion-referenced scores are most appropriate when an educator wants to assess a specific concept or skill a student has learned in the classroom. They show how students perform against an objective or standard rather than against other students.
Norm-referenced score
It shows one student's ability by comparing that student's performance with the class average (the norm group).
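To make the contrast between the two scoring approaches concrete, here is a minimal Python sketch. The cutoff of 80, the class scores, and both function names are made up for illustration; a real test would define its own standard and norm group.

```python
from bisect import bisect_left, bisect_right


def criterion_referenced(score, cutoff):
    """Compare a score with a fixed standard (the cutoff is hypothetical)."""
    return "mastery" if score >= cutoff else "non-mastery"


def percentile_rank(score, group_scores):
    """Norm-referenced view: percent of the group scoring below this score,
    counting tied scores as half."""
    ordered = sorted(group_scores)
    below = bisect_left(ordered, score)
    ties = bisect_right(ordered, score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Made-up class data for illustration only.
class_scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

print(criterion_referenced(70, cutoff=80))  # judged against the standard
print(percentile_rank(70, class_scores))    # judged against classmates
```

The same raw score of 70 gets two different interpretations: one says whether the standard was met, the other says where the student stands in the group.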
Latent variable
A variable that is not directly observable.
Purposes of test development
Theoretical advances
Empirical advances
Practical needs
Stages of test development
1- test conceptualisation 2- test construction 3- test tryout (pilot) 4- item analysis 5- test revision
Test conceptualisation
The beginnings of any published test can probably be traced to thoughts: self-talk, in behavioral terms. The test developer says to himself or herself something like “There ought to be a test designed to measure [fill in the blank] in [such and such] way.”
When a new disease comes to the attention of medical researchers, they attempt to develop diagnostic tests to assess its presence or absence as well as the severity of its manifestations in the body.
The development of a new test may be in response to a need to assess mastery in an emerging occupation or profession. For example, new tests may be developed to assess mastery in fields such as high-definition electronics, environmental engineering, and wireless communication.
Pilot work
In the context of test development, terms such as pilot work, pilot study, and pilot research refer, in general, to the preliminary research surrounding the creation of a prototype of the test. Test items may be pilot studied (or piloted) to evaluate whether they should be included in the final form of the instrument. In developing a structured interview to measure introversion/extraversion, for example, pilot research may involve open-ended interviews with research subjects believed for some reason (perhaps on the basis of an existing test) to be introverted or extraverted.
Another type of pilot study might involve physiological monitoring of the subjects (such as monitoring of heart rate) as a function of exposure to different types of stimuli.
The process may entail the creation, revision, and deletion of many test items in addition to literature reviews, experimentation, and related activities. Once pilot work has been completed, the process of test construction begins. Keep in mind, however, that depending on the nature of the test, particularly its need for updates and revisions, the need for further pilot research is always a possibility.
Test construction
Pilot work, like many of the other elements of test conceptualization and construction that we discuss in this chapter, is a necessity when constructing tests or other measuring instruments for publication and wide distribution. Of course, pilot work need not be part of the process of developing teacher-made tests for classroom use (see Everyday Psychometrics). As you read about more formal aspects of professional test construction, think about which (if any) technical procedures might lend themselves to modification for everyday use by classroom teachers.
Scaling
We have previously defined measurement as the assignment of numbers according to rules. Scaling may be defined as the process of setting rules for assigning numbers in measurement. Stated another way, scaling is the process by which a measuring device is designed and calibrated and by which numbers (or other indices)—scale values—are assigned to different amounts of the trait, attribute, or characteristic being measured.
Types of scaling
Scales are instruments used to measure something, such as weight. In psychometrics, scales may also be conceived of as instruments used to measure. Here, however, that something being measured is likely to be a trait, a state, or an ability. When we think of types of scales, we think of the different ways that scales can be categorized. We saw that scales can be meaningfully categorized along a continuum of level of measurement and be referred to as nominal, ordinal, interval, or ratio. But we might also characterize scales in other ways.
If the testtaker’s test performance as a function of age is of critical interest, then the test might be referred to as an age-based scale. If the testtaker’s test performance as a function of grade is of critical interest, then the test might be referred to as a grade-based scale. If all raw scores on the test are to be transformed into scores that can range from 1 to 9, then the test might be referred to as a stanine scale. A scale might be described in still other ways. For example, it may be categorized as unidimensional as opposed to multidimensional. It may be categorized as comparative as opposed to categorical. This is just a sampling of the various ways in which scales can be categorized.
There is no best type of scale. Test developers scale a test in the manner they believe is optimally suited to their conception of the measurement of the trait (or whatever) that is being measured.
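As an illustration of the stanine scale mentioned above, here is a rough Python sketch that transforms raw scores into values from 1 to 9. It assumes the common z-score approximation (stanines have a mean of 5 and a standard deviation of about 2); published tests usually assign stanines from percentile bands in a norm group instead, so treat `to_stanines` as a hypothetical helper, not a standard procedure.

```python
import math
import statistics


def to_stanines(raw_scores):
    """Approximate stanines (1-9) from raw scores via z-scores.

    Assumption: stanine = 2 * z + 5, rounded to the nearest whole
    number and clipped to the range 1..9.
    """
    mean = statistics.mean(raw_scores)
    sd = statistics.pstdev(raw_scores)
    if sd == 0:
        # Everyone scored the same, so everyone sits at the middle stanine.
        return [5 for _ in raw_scores]
    stanines = []
    for x in raw_scores:
        z = (x - mean) / sd
        s = math.floor(2 * z + 5 + 0.5)  # round half up
        stanines.append(min(9, max(1, s)))
    return stanines


# Made-up raw scores for illustration only.
print(to_stanines([40, 50, 60]))
```

A score at the group mean lands in stanine 5, and scores further from the mean move toward 1 or 9.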
MoCA (Montreal Cognitive Assessment)
When a patient starts to experience memory loss and other forms of cognitive decline, it can be a stressful, uncertain, and trying time for everyone involved, from the patient to their family, friends, caretakers, and even healthcare professionals. No matter what the cause of the cognitive impairment, it’s important to quickly find out how an individual’s cognitive function is affected so that an appropriate treatment plan can be devised. This is where MoCA comes in—our straightforward tool for diagnosing patients and gauging an appropriate follow-up and treatment plan. With the ability to assess several cognitive domains, the MoCA test is a proven and useful cognitive screening tool for many illnesses, including
Two types of measures
1- Self-report questionnaires
Self-efficacy
BDI-II
Subjective
2- Rating scales: clinician/spouse rates the patient
Behaviour
Hamilton Rating Scale for Depression
Objective
Classical test theory
Classical test theory (CTT) in psychometrics is all about reliability. We use the word reliable or reliability often in our colloquial language. Your friend who is always on time is reliable, for instance. But in psychology, reliability refers to how consistent a test or measure is. In other words, if you took the same test several times, you should get about the same score each time. So, assuming the conditions are the same, you’d get the same score on a test because the test itself is well designed.
There are three ideas we need to keep in mind when we’re talking about CTT: test score, error, and true score. The test score is what we call the observed score. So, if you take a math exam and get an 85, that’s your test score. Error refers to, well, exactly what it sounds like! It’s the amount of error that is found in a test or measure. This might be a mistake in the test, or it might also refer to things in the external environment that we can’t totally control but that impact testing. Let’s say you’re taking your history exam, and there’s construction going on in the building next door. Hammering isn’t great for concentration, is it? This is a form of error because the terrible noise might impact your score. Then, we have the true score. This is the score you would have achieved if there were absolutely no errors in the measurement. Alas, this isn’t really possible. But psychometrics assumes everyone has, in theory, a true score. We can calculate this true score with an equation.
Charles Spearman, a psychologist and statistician, thought that we could reduce random error as much as possible, thereby making tests better. Spearman is widely considered one of the founders of CTT. So, the important takeaway of CTT is that it’s a theory that tries to explain and deal with error, so our tests are more reliable.
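The equation behind CTT is usually written as X = T + E: observed score = true score + error. In practice the true score cannot be read off directly, but a simulation can show how the pieces relate. The sketch below is only illustrative: the population mean of 70, the standard deviations, and the function name `simulate_ctt` are assumptions, and reliability is computed with the standard CTT definition, true-score variance divided by observed-score variance.

```python
import random
import statistics


def simulate_ctt(n_people=5000, true_sd=10.0, error_sd=5.0, seed=0):
    """Simulate X = T + E and return the reliability var(T) / var(X)."""
    rng = random.Random(seed)
    # T: each person's true score (assumed normally distributed).
    true_scores = [rng.gauss(70.0, true_sd) for _ in range(n_people)]
    # X: observed score = true score plus random error with mean zero.
    observed = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return statistics.pvariance(true_scores) / statistics.pvariance(observed)


# With true_sd=10 and error_sd=5 the theoretical reliability is
# 100 / (100 + 25) = 0.8; the simulated value should land close to that.
print(round(simulate_ctt(), 2))

# Less error (a quieter testing room, say) gives a more reliable test.
print(round(simulate_ctt(error_sd=2.0), 2))
```

Shrinking the error term pushes reliability toward 1, which is the sense in which CTT says that reducing random error makes a test better.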
Item Response Theory
Criterion validity
One measure predicts an outcome. For example, IQ predicts that you will get good marks at university.
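Criterion validity is commonly summarized as a correlation between the test score and the outcome it is meant to predict. Here is a small Python sketch of that idea using a hand-written Pearson correlation. The IQ and grade values are invented for illustration and say nothing about any real relationship.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between a predictor (x) and a criterion (y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Invented data: IQ scores (predictor) and later university marks (criterion).
iq = [95, 100, 105, 110, 120, 130]
marks = [2.4, 2.9, 2.8, 3.2, 3.5, 3.7]

print(pearson_r(iq, marks))
```

A coefficient near 1 would suggest the test predicts the criterion well; a coefficient near 0 would suggest it does not.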