RELIABILITY Flashcards
a measure of the accuracy (consistency) of a test or measuring instrument, obtained by measuring the same individuals twice and computing the correlation between the two sets of measures
- It is used to determine how much of the variability in observed scores is due to true differences in the construct being measured, as opposed to measurement error.
Ex: Imagine a test designed to measure job satisfaction. If the reliability coefficient is 0.85, it suggests that 85% of the variance in test scores is due to true differences in job satisfaction, while 15% is due to random or measurement error.
reliability coefficient
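The interpretation above (85% true variance vs. 15% error) can be sketched as a ratio from classical test theory. This is an illustrative sketch with made-up variance figures, not real test data:

```python
# Hedged sketch: the reliability coefficient interpreted as the ratio of
# true-score variance to total observed-score variance (classical test theory).
# The variance values below are illustrative assumptions, not real data.

def reliability_coefficient(true_var: float, error_var: float) -> float:
    """Reliability = true variance / (true variance + error variance)."""
    return true_var / (true_var + error_var)

# Mirrors the job-satisfaction example: 85% true variance, 15% error variance.
r_xx = reliability_coefficient(true_var=85.0, error_var=15.0)
print(r_xx)  # 0.85
```

The same ratio explains why reducing measurement error (the denominator's error term) raises the coefficient toward 1.0.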
this assumes that each person has a true score that would be obtained if there were no errors in measurement
classical test theory
refers, collectively, to all of the factors associated with the process of measuring some variable, other than the variable being measured
- If a thermometer reads 1°C too high (systematic error) and the readings also fluctuate slightly (random error), these combined are referred to as ___
"Both types of error"
measurement error
source of error in measuring a targeted variable caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process
- Measuring heart rate with slight variations each time due to changes in the subject’s movement or the observer’s timing.
“Unpredictable fluctuations”
random error
a source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value
- A weighing scale that always adds 2 kg to the actual weight introduces a systematic error.
“bias in one direction”
systematic error
item sampling or content sampling - terms that refer to variation among items within a test as well as to variation among items between tests
test construction
sources of error variance that occur during test administration and may influence the test taker's attention or motivation
test administration
scorers and scoring systems are potential sources of error variance
test scoring and interpretation
an estimate of reliability obtained by correlating pairs of scores from the SAME PEOPLE on TWO DIFFERENT ADMINISTRATIONS of the same test
- Not appropriate for dynamic (unstable) characteristics
-To determine whether a test produces consistent results across time.
test retest reliability
estimate of test-retest reliability when the interval between testings is greater than six months
• If a personality inventory is administered today and then again after six months, the correlation between the scores would represent the___
coefficient of stability
occurs when the first testing session influences the results of the second session, and this can affect the test-retest reliability of a psychological measure
- More likely when the interval between administrations is short (e.g., less than 15 days)
carry over effect
- a type of carryover effect wherein the scores on the second test administration are higher than they were on the first
• A student taking the same cognitive test multiple times might score higher because they’ve become familiar with the questions, not because their cognitive abilities have improved.
- Occurs when the administrations are too close together (e.g., 5 days apart)
practice effect
uses one set of questions divided into two equivalent sets ("forms"), where both sets contain questions that measure the same construct, knowledge, or skill
- This means they have the same difficulty level, the same number of items, and measure the construct in an identical way.
parallel forms reliability
- an estimate of the extent to which these different forms of the same test have been affected by item sampling error, or other error
- refers to different versions of a test that have been constructed so as to be parallel
- The forms are equivalent in content but may not be statistically identical.
alternate forms reliability
one of the most rigorous and burdensome assessments of reliability: test developers have to create two forms of the same test, and practical constraints make it difficult to retest the same group of individuals
limitations of parallel and alternate forms reliability
obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once
Three (3) steps:
Step 1. Divide the test into equivalent halves.
Step 2. Calculate a Pearson r between scores on the two halves of the test.
Step 3. Adjust the half-test reliability using the Spearman–Brown formula.
split half reliability
a statistic that allows a test developer to estimate what the correlation between the two halves would have been if each half had been the length of the whole test (assuming the halves have equal variances)
spearman brown formula
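The formula has a general "prophecy" form for any length change, of which the split-half correction (doubling the test, n = 2) is a special case. A minimal sketch with illustrative values:

```python
# Hedged sketch of the general Spearman-Brown prophecy formula:
# predicted reliability when a test is lengthened by factor n.
# The input values below are illustrative, not from real data.

def spearman_brown(r: float, n: float) -> float:
    """r_new = n*r / (1 + (n - 1)*r); n=2 gives the split-half correction."""
    return (n * r) / (1 + (n - 1) * r)

# A half-test correlation of 0.60 stepped up to full length (n = 2).
print(round(spearman_brown(0.60, 2), 2))  # 0.75
```

The same function answers planning questions such as "how much more reliable would a test be if it were tripled in length?" (use n = 3).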
the degree to which a test contains items that measure a single trait
homogeneity
the degree to which a test measures different factors
heterogeneity
the statistic used for calculating the reliability of a test whose items are dichotomous (scored 0 or 1)
- A coefficient above 0.50 indicates reasonable reliability; the test is considered homogeneous if the coefficient is above 0.90.
Kuder-Richardson formula 20 (KR-20)
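The KR-20 computation can be sketched directly from its formula, KR20 = k/(k-1) × (1 - Σpq / total-score variance), where p is each item's pass rate and q = 1 - p. The response matrix below is invented for illustration:

```python
# Hedged sketch of Kuder-Richardson formula 20 for dichotomous (0/1) items.
# The response matrix is invented for illustration.
from statistics import pvariance

responses = [  # rows = examinees, columns = 0/1 item scores
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
k = len(responses[0])  # number of items
n = len(responses)     # number of examinees

# p = proportion passing each item, q = 1 - p.
p = [sum(row[j] for row in responses) / n for j in range(k)]
sum_pq = sum(pi * (1 - pi) for pi in p)

# Variance of examinees' total scores (population variance).
totals = [sum(row) for row in responses]
var_total = pvariance(totals)

kr20 = (k / (k - 1)) * (1 - sum_pq / var_total)
print(round(kr20, 3))  # 0.8
```

Here the nearly Guttman-like response pattern (harder items failed by lower scorers) yields a high coefficient, consistent with a homogeneous item set.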
- the preferred statistic for obtaining an estimate of internal consistency reliability
- may be thought of as the mean of all possible split-half correlations, corrected by the Spearman–Brown formula
- appropriate for use on tests containing non-dichotomous items
coefficient alpha (Cronbach’s Alpha)
the degree of agreement or consistency between two or more scorers (or judges or raters) with regard to a particular measure
interrater reliability
the best method for assessing the level of agreement between raters is __
- Used when there are two or more raters
kappa statistics
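For two raters assigning nominal categories, Cohen's kappa is (p_o - p_e) / (1 - p_e): observed agreement corrected for the agreement expected by chance. A sketch with invented ratings:

```python
# Hedged sketch of Cohen's kappa for two raters assigning nominal categories.
# The ratings below are invented for illustration.
from collections import Counter

rater_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "no"]
n = len(rater_a)

# Observed agreement: proportion of cases where the raters match.
p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement: product of each rater's marginal proportions per category.
ca, cb = Counter(rater_a), Counter(rater_b)
p_e = sum((ca[c] / n) * (cb[c] / n) for c in set(rater_a) | set(rater_b))

kappa = (p_o - p_e) / (1 - p_e)
print(round(kappa, 3))  # 0.5
```

Kappa of 1 means perfect agreement, 0 means agreement no better than chance; note that raw percent agreement (here 0.75) overstates agreement relative to kappa.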
___ tests (measuring a single trait) have reasonably higher internal consistency than heterogeneous tests (measuring multiple factors or traits)
- Cronbach's alpha is used when there is only one factor
- Factor analysis is used when there are two or more factors
Ex: all participants share the same trait, like age group or occupation, ensuring the group is uniform for analysis.
homogeneity