W3 - Chapter 8 - Test Development - DN Flashcards
1
Q
anchor protocol
A
- a test answer sheet
- developed by a test publisher
- used to check the accuracy of examiners’ scoring
p.280
2
Q
biased test item
A
- an item that favours one group in relation to another
- when differences in group ability are controlled
p.271
3
Q
binary-choice item
A
- a multiple-choice item
- that contains only two possible responses (e.g., true-false)
p.254
4
Q
categorical scaling
A
- system of scaling
- stimuli placed in one of two or more alternative categories that differ quantitatively with respect to some continuum
p.249
5
Q
categorical scoring
A
- a method of evaluation
- where test responses earn credit toward placement in a particular class/category
- sometimes testtakers must meet a set number of responses corresponding to a particular criterion to be placed in a specific category
- also called class scoring
- contrast with cumulative scoring & ipsative scoring
p.260
6
Q
ceiling effect
A
- diminished utility of a tool of assessment in distinguishing testtakers at the high end of the ability, trait, or other attribute being measured
p. 259, 307
7
Q
class scoring
A
- a method of evaluation
- where test responses earn credit toward placement in a particular class/category
- sometimes testtakers must meet a set number of responses corresponding to a particular criterion to be placed in a specific category
- contrast with cumulative scoring & ipsative scoring
p.260
8
Q
comparative scaling
A
- in test development
- a method of developing ordinal scales
- through the use of a **sorting task**
- entails judging a stimulus in comparison with every other stimulus used on the test
p.249
9
Q
completion item
A
- requires an examinee to provide a word or phrase that completes a sentence
p. 254
10
Q
computerized adaptive testing (CAT)
A
- an interactive, computer-administered testtaking process
- items are presented to the testtaker based in part on the testtaker’s performance on previous items
p.15, 255-256
11
Q
co-norming
A
- the test norming process conducted on two or more tests
- using the same sample of testtakers
- when used to validate all of the tests being normed, this process may also be referred to as co-validation
p.138n4, 278
12
Q
constructed-response format
A
- a form of test item requiring a testtaker to construct or create a response
- as opposed to simply selecting a response
- contrast with selected-response format
p.252
13
Q
co-validation
A
- a test validation process conducted on two or more tests
- using the same sample of testtakers
- when conducted together with the creation of norms, the process may also be referred to as co-norming
p.278
14
Q
cross-validation
A
- a revalidation on a sample of testtakers
- other than the testtakers on whom test performance was originally found to be a valid predictor of some criterion
p.278
15
Q
essay item
A
- a test item that requires a testtaker to write a composition
- typically one that demonstrates recall of facts, understanding, analysis, and/or interpretation
p.255
16
Q
expert panel
A
- in test development process
- a group of people knowledgeable about the subject matter being tested and/or the population for whom the test is being designed
- they can provide input to improve the test’s content, fairness, etc.
p.274-275
17
Q
floor effect
A
- a phenomenon arising from the diminished utility of a tool of assessment in distinguishing testtakers at the low end of the ability, trait, or other attribute being measured
p. 256-259
18
Q
giveaway item
A
- a test item, usually near the beginning of a test of ability or achievement
- designed to be relatively easy
- usually for the purpose of building the testtaker’s confidence or reducing test-related anxiety
p.263n4
19
Q
What three criteria must be met when correcting for the impact of guessing?
A
- it must recognize that guesses are not always totally random
- it must deal with the problem of omitted items
- it must account for the fact that some testtakers are luckier guessers than others
p.269-271
20
Q
Guttman scale
A
- a scale on which items range sequentially from weaker to stronger expressions of the attitude or belief being measured
- constructed so that endorsement of a stronger (later) item implies endorsement of all of the weaker (earlier) items
- named after its developer, Louis Guttman
p.249
21
Q
ipsative scoring
A
- approach to scoring & interpretation
- responses & presumed strength of measured trait are interpreted relative to the measured strength of other traits for that testtaker
- contrast with class scoring & cumulative scoring
p.260
22
Q
item analysis
A
- general term used to describe various procedures
- usually statistical, designed to explore how individual items work compared to others in the test & in the context of the whole test
- e.g., to explore the level of difficulty of individual items on an achievement test
- e.g., to explore the reliability of a personality test
- contrast with qualitative item analysis
p.262-275
23
Q
item bank
A
- a collection of questions to be used in the construction of a test
p. 255, 257-259, 282-284
24
Q
item branching
A
- in computerised adaptive testing (CAT)
- the individualised presentation of test items drawn from an item bank based on the testtaker’s previous responses
p.260
25
item-characteristic curve (ICC)
* a **graphic** representation of the **probabilistic relationship** between a person's **level** of the **trait** (ability, characteristic) being measured and the **probability** of **responding** to an item in a **predicted** way
* also known as a category response curve or an item trace line
p.177, 268, 281
26
item-difficulty index
* items cannot be too easy or too hard if they are to differentiate between testtakers' knowledge of the subject matter
* a statistic obtained by calculating the **proportion** of the **total number** of **testtakers** who answered an item **correctly**
* *p* is used to denote item difficulty
* a subscript refers to the item number (e.g., *p*1 denotes the difficulty of item 1)
* can **range from 0-1**
* the larger the item-difficulty index, the easier the item
* (i.e., the higher the *p*, the easier the item, because *p* represents the **proportion of testtakers passing** the item)
p.263-264
27
item-discrimination index
* a measure of how well an item discriminates between high scorers and low scorers on the test as a whole
* symbolised by *d*
p.264-268
28
item-endorsement index
* the name given to the item-difficulty index (used in achievement testing) when the statistic is used in **other contexts** (e.g., personality testing)
p. 263
29
item fairness
* a reference to the **degree of bias**, if any, in a test item
p. 271-272
30
item format
* a reference to the **form, plan, structure, arrangement,** or **layout** of individual test items
* including whether the test items require testtakers to **select or create** a response
p.252-255
31
item pool
* the reservoir or well from which items will or will not be **drawn** for the final version of the test
* the **collection of items** to be further **evaluated** for **possible selection** for use in an **item bank**
p.251
32
item-reliability index
* provides an indication of the **internal consistency** of a test
* the **higher the index**, the greater the internal consistency
* index is equal to
* the product of the item-score standard deviation (*s*) and
* the correlation (*r*) between the item score and the total test score
p.264
33
item-validity index
* a statistic designed to provide an indication of the **degree** to which a **test is measuring** what it **purports to measure**
* **important** when a test developer's **goal** is to maximise the **criterion-related validity** of a test
* the higher the item-validity index, the greater the test's criterion-related validity
* to calculate we must first know
* the item-score standard deviation (symbolised as *s*1, *s*2, *s*3 etc.)
* and the correlation between the item score and the criterion score
* then we use the item difficulty index *p*1 in the following formula
* *s*1 = square root of *p*1 (1 - *p*1)
* the correlation between the score on item 1 and a score on a criterion measure (*r*1c) is multiplied by item 1's item-score standard deviation (*s*1)
* the product is an **index of an item's validity** (*s*1*r*1c)
p.264
34
Likert scale
* **summative rating scale** with **5 alternative responses**
* ranging on a continuum from e.g., "strongly agree" to "strongly disagree"
p.247
35
matching item
* the testtaker is presented with two columns
* *premises* on the left & *responses* on the right
* task is to determine which response is best matched to which premise
* young testtakers may be asked to draw a line between matching entries
* others are typically asked to write a letter or number as a response
p.253
36
method of paired comparisons
* a **scaling** method
* the testtaker is presented with **pairs of stimuli** (e.g., photos) and asked to select one **according to a rule**
* (e.g., "select the one that is more appealing")
p.248
37
multiple-choice format
* one of the three types of **selected-response** item formats
* three elements
1. a stem
2. a correct alternative or option
3. and several incorrect alternatives (referred to as distractors or foils)
p.252
38
pilot work
* also referred to as pilot study & pilot research
* **preliminary research** surrounding the creation of a prototype test
* general objective is to determine how best to
* **gauge**
* **assess**, or
* **evaluate** the **targeted construct**(s)
p.243-244
39
qualitative item analysis
* **non-statistical** procedures designed to explore how individual test items work
* both compared to **other items** in the test & in the **context** of the **whole test**
* unlike statistical measures, they involve **exploration** of the issues by **verbal means**
* (e.g., interviews & group discussions with testtakers & other relevant parties)
p.272-275
40
qualitative methods
* techniques of **data generation & analysis**
* rely primarily on **verbal** rather than mathematical or statistical procedures
p.272
41
rating scale
* a system of **ordered numerical** or **verbal descriptors**
* used to make **judgements** about the **presence, absence, or magnitude** of a particular trait, attitude, emotion, or other variable
p.205, 247, 371
42
scaling
* 1) in **test construction**
* the process of **setting rules** for **assigning numbers** in measurement
* 2) the process by which a measuring device
* is designed and calibrated &
* the way numbers (or other indices) are assigned to different amounts of a trait, attribute, or characteristic being measured
p.244-251
43
scalogram analysis
* an **item-analysis** procedure
* entails **graphic mapping** of a testtaker's **responses**
p.250
44
scoring drift
* a **discrepancy** between the scoring in an **anchor protocol** and the scoring of **another protocol**
p. 280
45
selected-response format
* a form of test item
* requiring testtakers to **select a response**
* (e.g., true/false, multiple choice, and matching items)
* as opposed to creating one
* contrast with constructed-response format
p.252
46
sensitivity review
* a **study of test items**
* usually during test development
* items are examined for **fairness** to all prospective testtakers
* for the presence of offensive language, stereotypes, or situations
p.274
47
short-answer item
* may also be referred to as a completion item
* a word, term, sentence, or paragraph may qualify
* anything beyond this is an essay item
p.254
48
summative scale
* an index derived from the **summing of selected scores** on a test or sub-test
p. 247
49
test conceptualization
* an early stage of the test development process
* when an **idea** for a particular test or test revision is **conceived**
p.240, 241-244
50
test construction
* a stage in the process of test development
* entails **writing test items** (or **rewriting/revising** existing items)
* as well as **formatting items, setting scoring rules**, and otherwise **designing** and **building** a **test**
p.240
51
test development
* an umbrella term for all that goes into the process of creating a test
p. 240-284
52
test revision
* action taken to **modify** a test's **content** or **format**
* for the purpose of **improving** the test's **effectiveness** as a tool of **measurement**
p.240
53
test tryout
* a stage in the process of test development that entails **administering a preliminary version** of a test to a **representative sample** of testtakers
* under **conditions** that **simulate** the **conditions** under which the **final version** of the test will be administered
p.240, 261-262
54
"think aloud" test administration
* a method of **qualitative** item analysis
* examinees **verbalize** their **thoughts** as they take the test
* useful in understanding how
* **individual items function** in a test
* testtakers **interpret or misinterpret** the **meaning** of the individual items
p.274
55
true-false item
* a **binary-choice** item
* i.e., the testtaker selects one of only two possible responses
* requires testtaker to indicate whether a statement **is or is not a fact**
p.254
56
validity shrinkage
* the **decrease** in item validities that inevitably occurs **after cross-validation**
p. 278
57
What is the optimal item difficulty?
* usually **midpoint** between **1.0** and the **probability** of answering **correctly** by **guessing**
* which is called the **chance success proportion**
* e.g., a five-option multiple-choice item (chance success proportion = .20): (.20 + 1.00) / 2 = .60; a true-false item (.50 chance): (.50 + 1.00) / 2 = .75
p.263
58
How can you create a **visual representation** of the **best items** on a test
(i.e., if the objective is to **maximise criterion-related validity**)?
* this can be achieved by **plotting** each item's
* item-validity index and
* item-reliability index
p.265
Fig 8-5