Week 4 Flashcards
describe test conceptualisation
Review of existing tests –> decide whether you need to develop your own. The target construct is written as a definition of what the test measures and the applicable population.
Describe test construction
aim to generate an item pool with good content validity; over-include items (more than the final test will need).
what are some guidelines for test construction
use straightforward language
avoid double-barrelled items, slang and colloquial terms (they become obsolete)
consider the impact of positively and negatively worded items
items should be able to be responded to by the majority
straightforward and non-judgmental language
what is nominal data
simplest form of measurement, in which information is placed into categories (football team, yes/no).
what is ordinal data
categorisation incorporates rank but no unit of measurement (iTunes Top 20 countdown, runners in a race)
what is interval data
categories are units of measurement with equal intervals between each number. E.g. average height of adults, results of an IQ test
what is ratio data
measurement with equal intervals and an absolute zero. E.g. a ruler, the number of times someone has experienced something, measuring grip strength.
what are some pros and cons of the Likert scale
Pros: measures the degree of a trait. easy to use. informative.
Cons: the number of response options needs to be considered
(odd vs even number of response options).
what are some pros and cons of the binary choice scale (T/F)
Pros = easy to construct, easy to score, quick; because items are quick, more items can be included.
Cons = allows guessing.
interpretation of a question may be vague. only suits questions with dichotomous responses. content not as rich.
what is a paired comparison?
Test-taker has to choose one of two options (e.g., a statement, object, picture) on the basis of some rule. The value (e.g., 1 or 0) of each option in each paired comparison is determined by judges prior to test administration.
what is comparative scaling?
Sorting or ranking stimuli (e.g., statements, objects, photographs, etc.) according to some rule (e.g., best to worst, most justifiable to least justifiable, etc.)
what is categorical scaling?
Categorising stimuli (e.g., statements, objects) according to some rule (e.g., “best” and “worst”, “always justifiable” and “never justifiable”)
What are some pros and cons of MCQ
Pros = covers lots of content, quick to administer and score, prevents bluffing.
Cons = suppresses creativity. doesn't suit all subject matter.
what are some pros and cons of Essays?
Pros = allow complex, imaginative, original use of knowledge. assess written communication. information is generated, not recognised (less reliance on recognition memory).
Cons = narrow content coverage, bluffing, hiding behind good writing, time-consuming to score, inter-rater reliability issues.
describe the test tryout phase
the test is administered to a representative sample with standardised instructions.
Describe item analysis
data from the tryout are used to improve the test's psychometrics through assessment of item difficulty, dimensionality and distribution.
what is item difficulty and how is it assessed?
performance on each individual item should differ on the basis of the test-taker's knowledge.
if an item is answered 100% correctly or incorrectly, it is a bad item.
IDI = number of examinees who answered correctly / total number of examinees (mean ≈ .5, range = .3-.8).
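The IDI formula above can be sketched in a few lines of Python; the scored responses and the `acceptable` helper below are made up for illustration, not part of the flashcards.

```python
# Item Difficulty Index: proportion of examinees who answered the item correctly.
# Hypothetical scored responses for one item: 1 = correct, 0 = incorrect.
responses = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # ten examinees

idi = sum(responses) / len(responses)
print(idi)  # 0.7

# Rough screening rule from the notes: aim for a mean IDI near .5,
# keeping items roughly within the .3-.8 range.
def acceptable(idi, low=0.3, high=0.8):
    return low <= idi <= high

print(acceptable(idi))  # True
```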
what is item distribution?
remove items which are skewed,
i.e. items everyone answers the same.
keep items with high variance or with a mean close to the centre of the possible range.
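The distribution check can be sketched as follows; the Likert responses and the variance cut-off are invented for illustration.

```python
import statistics

# Hypothetical 5-point Likert responses for three items.
items = {
    "item_a": [3, 2, 4, 3, 5, 2, 4],   # spread of responses, mean near centre
    "item_b": [5, 5, 5, 5, 5, 4, 5],   # nearly everyone answers the same (skewed)
    "item_c": [1, 5, 2, 4, 3, 2, 4],   # high variance, mean near centre
}

for name, scores in items.items():
    mean = statistics.mean(scores)
    var = statistics.pvariance(scores)
    # Flag items with very low variance: most people answer identically.
    flag = "drop" if var < 0.5 else "keep"
    print(f"{name}: mean={mean:.2f} variance={var:.2f} -> {flag}")
```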
describe how dimensionality is assessed
Usually starts with exploratory factor analysis (EFA) to identify a manageable number of factors to extract.
Confirmatory factor analysis (CFA) used when number of factors is known
what does Factor analysis assist with?
determine the number of underlying latent variables or constructs (do all items measure one thing?)
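One common heuristic for deciding how many factors to extract in EFA is the Kaiser criterion: count the eigenvalues of the inter-item correlation matrix that exceed 1. A sketch with NumPy, using simulated response data built to have two latent factors (all data here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 respondents x 6 items. Items 0-2 load on one
# latent factor, items 3-5 on another, plus random noise.
f1 = rng.normal(size=(200, 1))
f2 = rng.normal(size=(200, 1))
noise = rng.normal(scale=0.5, size=(200, 6))
data = np.hstack([f1, f1, f1, f2, f2, f2]) + noise

# Eigenvalues of the inter-item correlation matrix.
corr = np.corrcoef(data, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)

# Kaiser criterion: retain factors with eigenvalue > 1.
n_factors = int(np.sum(eigenvalues > 1))
print(n_factors)  # 2 for this simulated two-factor structure
```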
distinguish between oblique and orthogonal rotations
Oblique = factors assumed to be correlated. Orthogonal = factors assumed to be uncorrelated.
what is an item-scale correlation and when is it used?
the correlation between the score on a test item and the total scale score. used during item analysis: items with low item-scale correlations may not measure the same construct as the rest of the scale and are candidates for removal.
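A sketch of a corrected item-total correlation, which correlates each item with the sum of the remaining items so an item is not correlated with itself; the response data are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical scored responses: 100 test-takers x 5 items measuring
# one latent trait, plus noise.
latent = rng.normal(size=(100, 1))
items = latent + rng.normal(scale=0.8, size=(100, 5))

for i in range(items.shape[1]):
    # Scale score computed without item i (the "corrected" total).
    rest = np.delete(items, i, axis=1).sum(axis=1)
    r = np.corrcoef(items[:, i], rest)[0, 1]
    print(f"item {i}: item-scale r = {r:.2f}")
```

Items with a low r here would be flagged for removal or editing during item analysis.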
what is test revision?
used when modifying an old test or creating a new one. after the analysis is performed and items are removed or edited, the test goes back to the tryout phase.
why do tests ‘age’?
domains and interpretations change; stimuli age; words change meaning; test norms become outdated; theories behind tests change
what is cross validation?
administer the same test to two different samples: are the results correlated?
what is validity shrinkage?
validity is lower the second time around, but near enough is good enough.
what is co-validation?
two different tests measuring the same construct, administered to the same sample, should yield similar scores (useful for test batteries).