Exam two Flashcards
reliability is
consistency
how close an individual's obtained score is to his or her true score
we can never reach the true score exactly because of measurement error
true score theory says a definite true score exists, but we can never measure it exactly
t
the closer a number is to one the stronger the reliability
t
test-retest reliability
consistency of results when the same test is administered to the same group on two different occasions
alternate or parallel forms reliability
Consistency of results among two or more different forms of a test.
split-half reliability
one-time administration to one group; the two halves are correlated and the correlation is corrected with the Spearman-Brown formula
types of reliability
- test retest
- alternate or parallel
- split half
symbol for reliability
Rxx
formula for reliability
Rxx = S²T / S²X (true-score variance divided by observed-score variance)
X=T+E
observed score = true score + error
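The X = T + E decomposition can be checked with a small simulation; the true-score and error spreads (SDs of 15 and 5) are made-up illustration values, not from the cards:

```python
import random
import statistics

# Classical test theory: observed = true + error (X = T + E).
# The true-score SD (15) and error SD (5) are made-up illustration values.
random.seed(0)
true_scores = [random.gauss(100, 15) for _ in range(10_000)]
errors = [random.gauss(0, 5) for _ in range(10_000)]
observed = [t + e for t, e in zip(true_scores, errors)]

# Reliability Rxx = true-score variance / observed-score variance.
r_xx = statistics.pvariance(true_scores) / statistics.pvariance(observed)
print(round(r_xx, 2))  # should land near 15**2 / (15**2 + 5**2) = 0.90
```

Because error variance adds to the denominator but not the numerator, more error always means lower reliability.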
SEM is
the standard error of measurement: an estimate of how much obtained scores vary around the true score (SEM = SX times the square root of 1 - Rxx)
what is good reliability
- .9 and greater is excellent
- .8 to .89 is good
- .70 to .79 is adequate
low SEM is good for reliability
t
reliability means
most of the variance found in observed scores is true-score variance rather than error variance
content sampling
sampled appropriately from known areas of information
time sampling
results are more similar when the test-retest interval is shorter
interrater scorer differences
the degree of agreement between scores assigned by different raters
factors influencing reliability
- test length (longer over shorter)
- range of scores
- test difficulty (too hard, too easy)
- speed tests (timed tests) (not good for determining reliability)
mean
X with a bar over it
variance
S squared
correlation
r
summation
Greek capital sigma (Σ), which resembles an E
standard deviation
SD or SX
SEM and confidence limits
if a person gets 100 on an IQ test and the SEM is 3, add 3 for 103 and subtract 3 for 97; we can be confident the true score falls between 97 and 103. The smaller the SEM (e.g., 1 or 2), the more confident we can be.
what does an sem of 4 mean
add and subtract 4 from the obtained score to form the confidence band; the same applies to any other SEM value
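The SEM-to-confidence-band step can be sketched with the standard formula SEM = SD × √(1 − reliability); the SD of 15 and reliability of .96 below are illustrative values chosen so SEM comes out near 3, matching the 97-103 example on the card:

```python
import math

# SEM = SD * sqrt(1 - reliability); the confidence band is the
# obtained score +/- SEM. SD = 15 and reliability = .96 are
# illustrative values, not from the cards.
def sem(sd, reliability):
    return sd * math.sqrt(1 - reliability)

s = sem(15, 0.96)                     # about 3.0
low, high = 100 - s, 100 + s
print(round(low, 1), round(high, 1))  # 97.0 103.0
```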
validity asks four questions (2 and 3 apply the most)
- what trait is measured by the test
- what behaviors can be predicted by the test scores
- what does the test measure
- how well does it measure that attribute
validity symbol
R squared xy
v stands for valid variance
t
i stands for invalid variance
t
formula for validity
R²xy = S²V / S²X (valid variance divided by observed-score variance)
three types of validity
- content
- criterion
- construct
content validity
- how adequately a set of items sample some domain
- sample of items from some universe of items
criterion validity
- how well a test measures or predicts an outcome
- test predicts some kind of behavior
- how well the test score correlates with some external behavior
content validity is concerned with
- describe the domain
- representative sampling
- no content bias (don’t overemphasize some things and neglect others)
methods for determining content validity
- expert judges
- pretest-posttest
- correlate scores from similar tests
- develop expert rating scales
- face validity
what is a criterion
- standard of mastery
- measure of success
characteristics of a good criterion
- must be reliable
- must be relevant
- free of bias
how to obtain criterion validity
- validity coefficient
to interpret a validity coefficient such as .89 or .70, square it: the result is the proportion of criterion variance the test explains, and whatever is left over is unexplained
t
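The squaring rule above can be shown directly; the .70 coefficient is just an example value:

```python
# Squaring a validity coefficient gives the proportion of criterion
# variance the test accounts for; whatever is left over is unexplained.
def variance_explained(r_xy):
    return r_xy ** 2

print(round(variance_explained(0.70), 2))  # 0.49 -> 49% explained, 51% left out
```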
how to obtain criterion validity
- validity coefficient
- multiple predictors
construct validity
- measure some attribute
methods of measuring construct validity
- internal structure
- correlative with another similar measure
- discriminant validity (shows the test does not correlate with qualities it is not supposed to measure)
validity range
- .35 and above: excellent
- .21 to .34: good
- .11 to .20: useful
item is another word for
question on a survey
an instrument is another word for
survey
standardization
the process of putting different variables on the same scale.
Norms
the average or typical scores of a standardization (reference) group, used to interpret an individual's score.
Z scores
transformation of a raw score into a standard score expressing how many standard deviations it falls from the mean
parallel forms
differing versions of tests or assessments that contain the same information, only in different order.
test-retest
Repeatability or test–retest reliability is the closeness of the agreement between the results of successive measurements of the same measure carried out under the same conditions of measurement.
K-R 20
KR-20 (Kuder-Richardson Formula 20) is a measure of reliability for tests with binary (right/wrong) items, estimated from a single administration
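A minimal sketch of the KR-20 computation, assuming a small made-up matrix of 0/1 item responses (rows are examinees, columns are items):

```python
import statistics

# KR-20 = (k / (k - 1)) * (1 - sum(p*q) / variance of total scores),
# where p is the proportion answering item i correctly and q = 1 - p.
def kr20(data):
    k = len(data[0])                          # number of items
    totals = [sum(row) for row in data]       # each person's total score
    var_total = statistics.pvariance(totals)  # variance of total scores
    pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in data) / len(data)
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_total)

# Made-up responses: 5 examinees, 4 binary items.
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(kr20(data), 2))  # 0.8
```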
Spearman brown
a formula relating psychometric reliability to test length, used by psychometricians to predict the reliability of a test after changing its length (e.g., correcting a split-half correlation up to full-test length).
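The Spearman-Brown prophecy formula can be sketched as follows; the half-test correlation of .60 is an example value:

```python
# Spearman-Brown prophecy formula: predicted reliability after changing
# test length by a factor k (k = 2 corrects a split-half correlation
# up to full-test length).
def spearman_brown(r, k=2):
    return k * r / (1 + (k - 1) * r)

# A half-test correlation of .60 implies full-test reliability of .75.
print(round(spearman_brown(0.6), 2))  # 0.75
```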
relationship between reliability and validity
reliability is whether a result is consistent; validity is whether the test measures what it claims to measure. A test can be reliable without being valid, but it cannot be valid without being reliable.
Z score looks like
Z = (X - X̄) / S (raw score minus the mean, divided by the standard deviation)
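The Z-score transformation in code, using a made-up score list:

```python
import statistics

# Z = (X - mean) / SD: how many standard deviations a raw score falls
# from the group mean. The score list is made up for illustration.
scores = [70, 80, 90, 100, 110, 120, 130]
mean = statistics.mean(scores)    # 100
sd = statistics.pstdev(scores)    # population SD = 20.0
z = (130 - mean) / sd
print(z)  # 1.5
```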
percentiles
each of the 100 equal groups into which a population can be divided according to the distribution of values of a particular variable.
quartiles
each of four equal groups into which a population can be divided according to the distribution of values of a particular variable.
deciles
each of ten equal groups into which a population can be divided according to the distribution of values of a particular variable
stanine
any of the nine classes into which a set of normalized standard scores is divided according to rank in educational testing
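Percentiles, quartiles, deciles, and stanines are all groupings of the same ranked distribution; a minimal percentile-rank sketch (assuming "percent of scores at or below" as the definition):

```python
# Percentile rank = percent of scores at or below a given score.
# Quartiles, deciles, and stanines are coarser cuts of the same ranking.
def percentile_rank(scores, value):
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)

scores = list(range(1, 101))        # 100 made-up scores
print(percentile_rank(scores, 75))  # 75.0 -> 3rd quartile, 8th decile
```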
group norms
norms based on the performance of a specific reference group, used to compare an individual's score with that group
developmental norms
used to assess whether infants, toddlers, children, and/or adolescents are developing cognitive, communication, motor, socioemotional, and adaptive skills at approximately the same rate as their peers.