Developing & Administering Psych Tests Flashcards
CTT
Test development is performed by the intuitive & more or less random collection of a sample of items from an infinite domain of potential items (face validity)
IRT
Test development is carried out from the identification of behaviours that will constitute the empirical representation of the latent traits
Dimensions (Factors)
Dimensions & factors sometimes come together
In geometry, a dimension is a measure of the size of an object, usually given as length, width & height
Unidimensional vs multidimensional
Unidimensional= 1 dimension, e.g. attitude- favourable unfavourable
Bidimensional= 2 dimensions, e.g. skill- verbal (insufficient sufficient) & non-verbal (insufficient sufficient)
Tridimensional= 3 dimensions, e.g. burnout- intensity, level & influence
Guilford’s (1967) model of the structure of the intellect
Operation
Product
Content
Descriptors
A descriptor refers to an observable behaviour stated objectively & concisely
Descriptors are the significant & important features of the construct & provide condensed info on what should be evaluated
Descriptors should
1) be representative & significant for the measurement of the psychological attribute
2) have validity, objectivity & consistency
3) be focused on clear, practical & easy to understand aspects
4) be easy to measure, based on easily available info
5) allow relationship with other descriptors without overlapping them
6) serve as a reference to the development of items
Bloom’s taxonomy
Triangle
1) Remember
2) understand
3) apply
4) analyse
5) evaluate
6) create
Test blueprint (specifications)
Tells you exactly what skills will be tested & how many points each question is worth
May include important details
Ensures that test assesses level or depth of learning you want to measure
Development of test specs more common for skills tests
Test specs indicate what dimensions & descriptors can be evaluated in the test & in what proportions
The operationalisation of subjective concepts enables their measurement through a set of descriptors, which will then represent the phenomenon under investigation
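As a rough illustration of how a blueprint's proportions could be turned into item counts, here is a minimal Python sketch. The function name `allocate_items`, the 60/40 weights and the 20-item test length are hypothetical, chosen only for the example; the largest-remainder rounding is one possible choice, not something the cards prescribe.

```python
def allocate_items(total_items, weights):
    """Split total_items across dimensions in proportion to integer weights.

    Uses largest-remainder rounding so the counts always sum to total_items.
    (Hypothetical helper for illustration only.)
    """
    total_weight = sum(weights.values())
    counts = {}
    remainders = {}
    for name, weight in weights.items():
        # Whole items this dimension gets, plus the leftover fraction (as a remainder).
        counts[name], remainders[name] = divmod(total_items * weight, total_weight)
    # Hand out any items lost to rounding, largest remainder first.
    leftover = total_items - sum(counts.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        counts[name] += 1
    return counts


# Assumed example: a 20-item skills test weighted 60% verbal, 40% non-verbal.
blueprint = allocate_items(20, {"verbal": 60, "non_verbal": 40})
print(blueprint)
```

The same call could be repeated per descriptor within each dimension if the specification goes down to that level.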
Items
A command or question posed to an individual for the investigation of a descriptor
Seeks to examine perceptions, interests, opinions, attitudes, knowledge, skills & aptitudes related to the test content domain
Must represent a descriptor fully & accurately
Sources of items
Theory & specifications matrix
Other measures of the same construct
Interview with target pop
Interview with experts in psychometrics or in the area the test is intended for
Specific criteria for development of items
1) Behavioural= items must express a behaviour, not an abstraction
2) Objectivity= items should allow for a right or wrong response
3) Simplicity= items should express a single idea in order to avoid ambiguities
4) Clarity= items should be intelligible even to the lowest level of target pop, short sentences with simple & unambiguous expressions
5) Relevance= items must be consistent with the psych trait & other items covering the same construct
6) Precision= items must have a defined position on the attribute continuum & be distinct from other items that cover the same continuum
7) Neutrality= do not use extreme expressions, e.g. excellent, miserable; the magnitude of the person's reaction is given in the response scale
Specific criteria regarding test development
1) amplitude= range of simple descriptors to more complex descriptors
2) balance of amplitude
Number of items in test
Depends on complexity of the construct
CTT= 3 times more than the final number expected, pool of items
IRT= 10% additional items, theoretical validity
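The two rules of thumb above amount to simple arithmetic, sketched below in Python. The function name `initial_pool_size` is hypothetical, "3 times more" is read here as three times the final count, and rounding the 10% up to a whole item is an assumption of this sketch.

```python
def initial_pool_size(final_items, framework):
    """Estimate how many items to draft for a test of final_items items.

    CTT: three times the expected final number (pool of items).
    IRT: the final number plus 10% additional items, rounded up (assumption).
    """
    if final_items <= 0:
        raise ValueError("final_items must be positive")
    if framework == "CTT":
        return 3 * final_items
    if framework == "IRT":
        extra = -(-final_items // 10)  # integer ceiling of final_items / 10
        return final_items + extra
    raise ValueError("framework must be 'CTT' or 'IRT'")


print(initial_pool_size(20, "CTT"))  # 60
print(initial_pool_size(20, "IRT"))  # 22
```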
Malhotra (2013); classification of scaling techniques
Scaling techniques (comparative scales & non comparative scales)
Comparative scales (paired comparison, rank order, constant sum/allocating points, Q-sort & other procedures)
Non-comparative scales (continuous rating scales & itemised rating scales)
Itemised rating scales (Likert, semantic differential, Stapel)
Advantages of Dichotomous item format
Ease & speed of application, processing & analysis
Statistical analyses can be performed to identify strengths & weaknesses of test items
Scoring system is unambiguous
Markers' personal opinions & impressions can be controlled
Disadvantages of dichotomous item format
Demand a lot of time & care while writing distractors
Reduced number of alternatives that can be chosen
Respondent may be influenced by the alternatives presented
Advantages of polytomous item format
Can be considered as ordinal or interval variables
Since they have different levels or degrees to evaluate same attribute, polytomous items tend to be more accurate than dichotomous items when attitudes are measured
Disadvantage of polytomous item format
Difficult to choose between category levels
Itemised rating scale decisions
Number of categories; no single optimal number, traditional guidelines suggest between 5-9
Odd or even number of categories (odd= unforced, has a neutral midpoint; even= forced choice)
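A small Python sketch of these two decisions, assuming categories are simply numbered from 1. The function name `describe_scale` and the returned keys are hypothetical; the 5-9 check only mirrors the traditional guideline on the card and is not a hard rule.

```python
def describe_scale(n_categories):
    """Summarise an itemised rating scale with n_categories response options."""
    if n_categories < 2:
        raise ValueError("a rating scale needs at least 2 categories")
    has_midpoint = n_categories % 2 == 1  # odd scales have a neutral middle option
    return {
        "categories": list(range(1, n_categories + 1)),
        "midpoint": (n_categories + 1) // 2 if has_midpoint else None,
        "forced_choice": not has_midpoint,  # even scales force a direction
        "within_guideline": 5 <= n_categories <= 9,
    }


print(describe_scale(5))
print(describe_scale(6))
```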
Testing
Testing is a social construct in which all parties should seek a common shared understanding of the process
Testing procedures are normally designed to be administered under carefully controlled or standardised conditions that embody systematic scoring protocols
Test administration should
Provide measures of performance & involve drawing inferences from samples of behaviour
Include procedures that may result in the qualitative classification or ordering of people
Rights & responsibilities of test givers & takers
1) Purpose & procedures for testing clearly stated & communicated to all parties involved in testing process
2) how test info will be used is clearly stated & communicated to all parties involved
3) procedures for dealing with enquiries & complaints about process of testing are clearly stated & communicated to all parties
Tester & test administrators expect test takers to
1) ask questions prior to testing if uncertain about why test is administered, how administered, what they will do & what will be done with results
2) inform the administrator about any condition they believe might invalidate the results or which they would wish to have taken into consideration
3) follow instructions of administrator
4) be aware of the consequences of not taking a test if they choose not to & accept them
After the test session
Tell test takers what to do with any equipment or materials they’ve used & what they may take away with them
Tell them what will happen next, including when they will be told their results & given any feedback
Open administration
Tests available for completion by anyone on demand
Controlled administration
Provides restricted access to the test session, but administration is carried out without someone being present to supervise
Supervised administration
Traditional mode for test administration in group testing
Provides the level of control needed for maximum performance testing
Data protection
Personal data may only be kept in a form that permits identification of pp for no longer than is necessary for purposes for which it was processed
Can keep anonymised data for as long as you want
Test maladministration
Any act that affects the security, confidentiality or integrity of the assessment or could lead to results that don’t reflect the pp’s real scores
Examiners should be aware that interaction with test takers can influence the results
Stereotype threat
May have very important effects on test scores
May explain 50-80% of the difference between males & females on the SAT maths section
Being aware of negative stereotype may inhibit performance on tests & academic performance