Reliability Flashcards
refers to the consistency of scores obtained by the same person when re-examined with the same test on different occasions, or with different sets of equivalent items, or under other variable examining conditions
Reliability
true or false
measurement error is common in all fields of science
true
true or false
tests that are relatively free of measurement error are considered reliable, while tests that contain relatively large measurement error are considered unreliable
true
identification
the difference between the observed score and the true score
measurement error
identification
the standard deviation of the distribution of errors for each repeated application of the same test on an individual
standard error of measurement
true or false
although it is impossible to eliminate all measurement error, test developers do strive to maximize psychometric nuisance through careful attention to the sources of measurement error
false. minimize
t/f
the greater the number of items, the higher the reliability
true
identification
a problem in the use of a limited number of items to represent a larger and more complicated construct
domain sampling model
3 sources of measurement error
item selection
test administration
test scoring
identify
one source of measurement error is the instrument itself
item selection
identify the source of measurement error and answer
true/false
general environmental conditions, momentary fluctuations in anxiety, motivation, and attention, as well as the examiner, can all contribute to and be sources of measurement error in the process
true
test administration
identification
with the help of a computer, the item difficulty is calibrated to the mental ability of the test taker
item response theory
t/f
correlation coefficient (r) expresses the direction and magnitude of the linear relationship between two sets of scores obtained from the same persons
true
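The correlation coefficient above can be computed directly. A minimal Python sketch (the two score lists are made-up examples of the same five people tested twice):

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson r between two paired lists of scores from the same people."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Five people tested twice; r near 1 means consistent scores.
test1 = [100, 105, 110, 95, 120]
test2 = [102, 104, 111, 96, 118]
print(round(pearson_r(test1, test2), 3))  # 0.991
```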
forms of reliability
test-retest
parallel forms
inter-rater reliability
split half reliability
it is established by comparing the scores obtained from two successive measurements of the same individuals and calculating a correlation between the two sets of scores
test-retest reliability
other term for test-retest reliability that measures the error associated with administering a test at two different times
time sampling reliability
corresponds to the random fluctuations of performance from one test session to the other
error variance
you took an IQ test today and will take it again after exactly a year; if your scores are almost the same, then the measure has good ________. what kind of reliability is this an example of?
test-retest reliability
limitations of test-retest reliability
carryover effect
practice effect
occurs when the first testing session influences the results of the second session and this can affect the test-retest reliability of a psychological measure
carryover effect
a type of carryover effect wherein the scores on the second test administration are higher than they were on the first
practice effect
t/f sometimes a poor test-retest correlation does not mean that the test is unreliable. it might mean that the variable under study has changed
true
It is established when at least two different versions of the test yield almost
the same scores.
parallel form reliability
other term for parallel form reliability; it compares two equivalent forms of a test that measure the same attribute to
make sure that the items indeed assess a specific characteristic.
item sampling or alternate forms reliability
The Purdue Non-Language Test (PNLT) has Forms A and B, and both
yield nearly identical scores for the test taker. This is an example of what reliability?
parallel forms reliability
Tests should contain the same number of items, and the items should
be expressed in the same form and should cover the same type of
content. This is an example of what reliability?
parallel forms
true or false
The error variance in parallel forms represents fluctuations in
performance from one set of items to another, but not fluctuations over
time.
true
true or false
The range and level of difficulty of the items in parallel forms reliability should be equal.
true
It is the degree of agreement between two observers who simultaneously
record measurements of the behaviors.
inter- rater reliability
Two psychologists observe the aggressive behavior of elementary school
children. If their individual records of the construct are almost the
same, then the measure has a good ______reliability.
inter rater reliability
Two parents evaluated the ADHD symptoms of their child. If they both
yield identical ratings, then the measure has good __________ reliability
inter-rater
what statistic does inter-rater reliability use to assess the level of agreement among
several raters using nominal scales?
kappa statistic
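For the two-rater case, Cohen's kappa corrects observed agreement for the agreement expected by chance. A Python sketch (the ratings are invented for illustration):

```python
from collections import Counter

def cohens_kappa(r1, r2):
    """Cohen's kappa: agreement between two raters beyond chance."""
    n = len(r1)
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n        # observed agreement
    c1, c2 = Counter(r1), Counter(r2)
    p_exp = sum(c1[k] * c2[k] for k in c1) / (n * n)       # chance agreement
    return (p_obs - p_exp) / (1 - p_exp)

# Two raters classifying six behaviors on a nominal (yes/no) scale.
rater1 = ["yes", "yes", "no", "yes", "no", "no"]
rater2 = ["yes", "no", "no", "yes", "no", "yes"]
print(round(cohens_kappa(rater1, rater2), 3))  # 0.333
```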
It is obtained by splitting the items on a questionnaire or test in half,
computing a separate score for each half, and then calculating the degree of
consistency between the two scores for a group of participants.
split-half reliability
This model of reliability measures the internal consistency of the test which
is the degree to which each test item measures the same construct. It is simply
the intercorrelations among the items.
split-half
t/f
The test can be divided according to the odd and even numbers of the items
(odd-even system).
true
what are the formulas used to measure the internal consistency of a test in split-half reliability
Spearman-Brown, Kuder-Richardson, and Cronbach’s alpha
A statistic that allows a test developer to estimate what the correlation
between the two halves would have been if each half had been the length of
the whole test, assuming the halves have equal variances.
Spearman- Brown Formula
Spearman Brown Formula
r_SB = 2r_hh / (1 + r_hh)
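The formula above steps a half-test correlation up to an estimate for the full-length test; a minimal Python sketch:

```python
def spearman_brown(r_hh):
    """Full-test reliability estimated from the half-test correlation r_hh."""
    return 2 * r_hh / (1 + r_hh)

# A half-test correlation of .80 steps up to about .89 for the whole test.
print(round(spearman_brown(0.80), 3))  # 0.889
```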
A statistic that allows the test developer to confirm that a test has
substantial reliability even when the two halves of a test have unequal variances.
Cronbach’s coefficient alpha
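Coefficient alpha is (k / (k - 1)) * (1 - sum of item variances / variance of total scores). A Python sketch with made-up Likert responses:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    sum_item_var = sum(pvariance(scores) for scores in items)
    totals = [sum(col) for col in zip(*items)]   # per-respondent total scores
    return (k / (k - 1)) * (1 - sum_item_var / pvariance(totals))

# Three Likert items answered by four respondents (illustrative data).
items = [[3, 4, 3, 5],
         [2, 4, 3, 5],
         [3, 5, 4, 5]]
print(round(cronbach_alpha(items), 3))  # 0.957
```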
The statistic used for calculating the reliability of a test in which the items
are dichotomous, scored as 0 or 1.
Kuder-Richardson 20 (KR20) Formula
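KR20 is coefficient alpha specialized to 0/1 items: (k / (k - 1)) * (1 - sum of pq / variance of total scores), where p is each item's proportion correct. A sketch with invented responses:

```python
from statistics import pvariance

def kr20(responses):
    """responses: one 0/1 answer list per examinee."""
    n, k = len(responses), len(responses[0])
    p = [sum(row[i] for row in responses) / n for i in range(k)]  # item p-values
    sum_pq = sum(pi * (1 - pi) for pi in p)
    totals = [sum(row) for row in responses]     # per-examinee total scores
    return (k / (k - 1)) * (1 - sum_pq / pvariance(totals))

# Four examinees answering four dichotomous items (illustrative data).
responses = [[1, 1, 1, 0],
             [1, 1, 0, 0],
             [1, 0, 0, 0],
             [1, 1, 1, 1]]
print(round(kr20(responses), 3))  # 0.667
```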
t/f
For tests that have two forms, use parallel forms reliability.
true
true /false
For tests that are designed to be administered to an individual more
than once, use split-half reliability
false. test-retest
t/f
For tests with a continuous Likert-type scale, use Cronbach’s coefficient
alpha.
true
t/f
For tests which involve dichotomous items or forced-choice items, use the Spearman-Brown formula
False. KR20
t/f
For tests with items carefully ordered according to difficulty, use inter-rater reliability
false. split-half
t/f
For tests which involve some degree of subjective scoring, use test-retest reliability
false. inter-rater reliability
is a statistic that quantifies
reliability, ranging from 0 (not at all reliable) to 1 (perfectly reliable).
reliability coefficient
usually refers to a mistake of some sort that could have
been prevented had a person been more conscientious, more skilled, or better informed.
error
refers to the inherent uncertainty associated with any measurement, even
after care has been taken to minimize preventable mistakes
measurement error
consists of unpredictable
fluctuations and inconsistencies of other variables in the measurement process
random error
refers to the
proportion of the total variance attributed to true variance.
reliability
When the interval between testing is greater than six months, the
estimate of test-retest reliability is often referred to as the ________
coefficient of stability
The degree of the relationship
between various forms of a test can be evaluated by means of an alternate-forms or parallel-forms
coefficient of reliability, which is often termed the
coefficient of equivalence
refers to an estimate
of the extent to which item sampling and other errors have affected test scores on versions of the
same test when, for each form of the test, the means and variances of observed test scores are equal.
parallel forms reliability
are simply different versions of a test that have been constructed so as
to be parallel. Although they do not meet the requirements for
the legitimate designation “parallel,” they
are typically designed to be equivalent with respect to variables
such as content and level of difficulty.
alternate forms
refers to an estimate of the extent to which
these different forms of the same test have been affected by item
sampling error, or other error.
alternate forms reliability
refers to the degree of correlation among all the
items on a scale.
inter-item consistency
seek to estimate the extent
to which specific sources of variation under defined conditions are contributing to the test
score.
domain sampling theory
tool used to estimate or infer the extent to
which an observed score deviates from a true score.
standard error of measurement
t/f
the standard error of measurement is also called the standard error of a score and is
denoted by the symbol σmeas
true
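A common formula for the standard error of measurement is SEM = SD × √(1 − r_xx), where SD is the test's standard deviation and r_xx its reliability coefficient. A Python sketch (IQ-scale numbers chosen for illustration):

```python
import math

def sem(sd, reliability):
    """Standard error of measurement from the test SD and its reliability."""
    return sd * math.sqrt(1 - reliability)

# IQ scale: SD = 15, reliability = .91  ->  SEM of about 4.5 points.
print(round(sem(15, 0.91), 1))  # 4.5
```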