Psychometrics Flashcards
Psychometrics can only be done on what types of tests?
standardized norm-referenced
Standardized (norm-referenced) tests must have:
- standard administration
- validity
- reliability
- diagnostic accuracy
Standardized
refers to the method of administration: the test tells you what you’re supposed to say, which standardizes the responses
Norm-referenced
normed against (given to) a large group, in our case children, to find the range of scores that represents what normal looks like. This allows for a meaningful comparison among children.
A good standardized norm-referenced test should have what 3 things?
- validity
- reliability
- diagnostic accuracy
4 types of validity
- construct
- content
- face
- criterion-related
What is validity?
the extent to which the test accurately measures what it says it’s measuring
Construct Validity
the idea that the items we choose actually go with the theoretical construct. ex: for happiness, every item you use to get at happiness should actually relate to happiness
ex: if testing receptive language, ask a series of questions that people would agree get at it. It doesn’t have to be a questionnaire. You cannot measure a construct directly; you have to get at it in different ways.
* a lot of what we measure is a construct because we are dealing with behavior *
Content Validity
the extent to which the test covers the entire body of content (the behavioral domain) it is meant to measure. Driven by experts in the field or by statistics
- the 2 questions within content validity: to what degree does the test include a representative sample of all the important parts of that behavioral domain, and to what extent is the test free from the influence of irrelevant variables
2 questions within content validity
- to what degree does the test include a representative sample of all the important parts of that behavioral domain?
- ex: a math test for 3rd graders that only has multiplication problems does not have good content validity
- to what extent is the test free from the influence of irrelevant variables?
- ex: if the math test is made up of word problems, validity is threatened because you could actually be measuring reading as well as math
Face validity
not necessarily judged by experts in the field: do you look at the test and think “yeah, that’s what it measures”?
- very close to construct validity, but face validity is much broader and lighter (more superficial)
Criterion-related validity
whether the test is related to some other gold standard. One way is to look at concurrent validity (do people score similarly on this other test?)
- the question for criterion-related validity: do 2 tests that are supposed to measure the same thing give you the same answer?
construct
ex: happiness, anger, motivation. We assume these things drive human behavior, and we try to get at them by asking certain questions
predictive validity
how well the test predicts future performance on related tests
Reliability-3 types
- inter-rater
- test-retest
- internal consistency
Reliability
the consistency of the test: is it doing a consistent, repeatable job of measuring language?
inter-rater reliability
2 judges score the same responses and you check whether they reach the same results; you want the 2 judges’ results to be identical or close to it. This is where you use statistics and look at how correlated their scores are
want inter-rater reliability to be 90% or greater
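A minimal sketch of how the 90% criterion could be checked, assuming reliability is reported as simple percent agreement between the 2 judges. The function name and the item scores are hypothetical, and real test manuals may report a correlation or another statistic instead.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of items on which both raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)

# Hypothetical item scores (1 = correct, 0 = incorrect) from 2 judges
judge_1 = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
judge_2 = [1, 1, 0, 1, 1, 1, 1, 1, 0, 1]

agreement = percent_agreement(judge_1, judge_2)
print(f"agreement: {agreement}%")                 # the judges differ on 1 of 10 items
print("meets 90% criterion:", agreement >= 90)
```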
Test-retest reliability
checks whether the test is stable over time
- ex: GRE scores
- scores tend not to fluctuate hugely from one administration to the next; there’s a problem if they do
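One common way to quantify stability over time is to correlate scores from 2 administrations of the same test. The sketch below assumes a Pearson correlation is used; the helper name and the scores are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same 5 people tested on 2 occasions
first_testing = [85, 92, 100, 108, 115]
second_testing = [87, 90, 102, 107, 116]

r = pearson_r(first_testing, second_testing)
print(f"test-retest r = {r:.2f}")  # values close to 1 mean scores were stable
```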
Internal consistency reliability
looks at the individual items in a test
- the higher the %, the more confident you can be that the items are testing the construct
- each individual item gets a score, and no item should pull away from what your construct is
Normative sample & derived scores
- normative sample
- raw scores
- convert to standard scores
- percentile rank
- age/grade equivalent scores
Normative sample
the group the test was normed on; check it against who you are assessing (SES, range)
Raw scores
uninterpretable on their own because what a raw score means depends on age, which is why they get converted to standard scores
Standard scores (z-scores, t-scores, scaled scores)
developed by assessing your normative sample: give the test and find the mean and standard deviation (on average, how far from the mean the group is). If the standard deviation is big (scores fall far from the mean), you have a flat curve
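A minimal sketch of the conversion from a raw score to the standard scores named on this card. The normative mean and standard deviation are hypothetical, and the target scales (T-score: mean 50, SD 10; standard score: mean 100, SD 15; scaled score: mean 10, SD 3) are common conventions assumed here, so check the test manual for the scale a given test uses.

```python
def z_score(raw, norm_mean, norm_sd):
    """How many standard deviations the raw score is from the normative mean."""
    if norm_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - norm_mean) / norm_sd

def rescale(z, mean, sd):
    """Express a z-score on another scale with the given mean and SD."""
    return mean + sd * z

# Hypothetical normative sample for one age group: mean raw score 50, SD 8
z = z_score(38, norm_mean=50, norm_sd=8)
t = rescale(z, mean=50, sd=10)           # T-score
standard = rescale(z, mean=100, sd=15)   # standard score
scaled = rescale(z, mean=10, sd=3)       # scaled score

print(z, t, standard, scaled)  # -1.5 35.0 77.5 5.5
```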
Percentile rank
the percentage of the normative sample that you scored at or better than. It is not the percentage of items you got correct on the test. If your score is average, your percentile rank is 50
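To make the difference between percentile rank and percent correct concrete, here is a small sketch using the "at or better than" definition on this card. The normative scores, the child's score, and the 150-item test length are all hypothetical.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the normative sample scoring at or below the given score."""
    if not norm_scores:
        raise ValueError("normative sample is empty")
    at_or_below = sum(s <= score for s in norm_scores)
    return 100 * at_or_below / len(norm_scores)

def percent_correct(items_correct, items_total):
    """Percentage of test items answered correctly (not a percentile rank)."""
    return 100 * items_correct / items_total

# Hypothetical normative sample of 10 scores, and a child who scored 95
norm_scores = [70, 80, 85, 90, 95, 100, 105, 110, 120, 130]
rank = percentile_rank(95, norm_scores)
correct = percent_correct(95, 150)   # assumes a hypothetical 150-item test

print(rank)     # 50.0
print(correct)  # a different number answering a different question
```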
Age/grade equivalent score
takes the raw score and converts it to the age or grade at which that score is typical
- worst method because language disorders vary so much
Downfall to age/grade equivalent scores
- suggests to parents that their child has the language of a child of whatever that age is
- tiny differences in raw score make big differences in the equivalent score the older you get