Chapter 5 Reliability Flashcards
It refers to consistency in measurement; something that is consistent is not necessarily good or bad, but simply consistent
Reliability
an index of reliability, a proportion that indicates the ratio between true score variance on a test and the total variance.
Reliability Coefficient
An index describing the consistency of scores across contexts
Reliability Coefficient
This theory states that a score on an ability test is presumed to reflect not only the test taker’s true score but also the error
Classical Test Theory
the portion of the observed score that reflects the test taker's actual ability, characteristics, and behavior
True score
the component of the observed test score that does not have to do with the test taker’s ability
Error
It is a statistic useful in describing sources of test score variability. It is useful because it can be broken down into components
Variance
What are the two components of variance?
True variance and Error Variance
It is a variance from true differences
true variance
Variance from irrelevant various sources
error variance
The greater the proportion of the total variance attributed to true variance, the more ____ the test is
reliable
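The variance decomposition in the cards above can be sketched numerically. A minimal sketch, with made-up variance figures (not from the text):

```python
# Hypothetical variance components (illustrative figures, not from the text).
true_variance = 80.0    # variance from true differences in the trait
error_variance = 20.0   # variance from irrelevant sources ("noise")
total_variance = true_variance + error_variance

# Reliability coefficient: proportion of total variance that is true variance.
reliability = true_variance / total_variance
print(reliability)  # 0.8 -> 80% of score variance reflects true differences
```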
all factors associated with the process of measuring some variable, other than the variable being measured
measurement error
It is a source of error caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process. It is also called “noise”
Random error
A type of error that is typically constant, or proportionate to what is presumed to be the true value of the variable being measured
systematic error
True or False. Systematic error can be fixed once it is discovered.
TRUE
True or False. Systematic error affects score consistency.
FALSE
According to this theory, we can estimate the true score by finding the mean of the observed scores from repeated administrations
Basic Sampling theory
Which source of error is this situation? The Extraversion personality test constructed by the students of Ms. Salinas has variation among items within a test and variation among items between tests
Test construction under item/content sampling
What are the three sources of error under test administration?
Test-environment, test-taker variables, and examiner-related-variables
A source of error in which the sample does represent the population, but the sample size is not large enough
Sampling error
A source of measurement error arising from the variability inherent in test scores as a function of their being obtained at one point in time rather than another. The same test given at different points in time may produce different scores, even from the same test takers; even tests of relatively stable traits or behaviors may be prone to this error
Time Sampling
A source of measurement error that results from selecting test items that inadequately cover the content area the test is supposed to evaluate
Item/content sampling
A source of measurement error that is concerned with the intercorrelations between items in a test.
If the test is designed to measure a single construct and all items are equally good candidates to measure that attribute, then there should be a high correspondence among items
Inter item inconsistency
A source of measurement error in which different judges observing the same event may record different numbers
Observer Differences
A reliability estimate obtained by correlating pairs of scores from the SAME people who are administered the SAME test at two DIFFERENT times.
Test-Retest Reliability Estimate
What is the effect of time sampling error on scores in test-retest reliability?
scores are likely to fluctuate as a result of time sampling error
longer interval = lower correlation
What is the ideal interval of re-administering a test in test-retest reliability?
The ideal interval between tests is 2-4 weeks
What are the statistical procedures used in test-retest reliability?
statistics: Pearson r -> interval and ratio
Spearman rho -> ordinal
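As a sketch of the Pearson r computation for test-retest data: the scores below are made up for five test takers, and `pearson_r` is a hypothetical helper written from the standard product-moment formula (not from the text):

```python
import statistics

# Hypothetical interval-level scores: same five test takers, two administrations.
time1 = [10, 12, 14, 16, 18]
time2 = [11, 13, 13, 17, 19]

def pearson_r(x, y):
    """Pearson product-moment correlation for interval/ratio scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# The test-retest reliability estimate is the correlation between the two sets.
print(round(pearson_r(time1, time2), 3))  # 0.962
```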
What intervening factors may occur if we do not follow the 2-4 week interval when re-administering a test in test-retest reliability?
carry over effect
Practice effect
Mortality
Changes in Participants
Combination of all these factors
Explain the carry-over effect in test-retest reliability
Occurs when the first testing session affects the second (e.g., the test taker remembers test items and may review them between sessions)
Only a concern when it is random or unpredictable and affects only some respondents
If it is a systematic error that affects all respondents, then reliability is not affected
Explain the practice effect in test-retest reliability
Occurs when test takers score better because they have sharpened their abilities with the passing of time (development shows plasticity)
Explain the coefficient of stability
When the interval between administrations is very long (e.g., 6 months), test takers no longer remember much of the test; the resulting test-retest estimate is called the coefficient of stability
Explain Mortality in test-retest reliability
Test takers dropping out of the study.
Explain changes in participants in test-retest reliability
Non-normative changes and normative history-graded influences
It is the degree of the relationship between various forms of a test, which can be evaluated by means of an alternate-forms or parallel-forms coefficient of reliability
Coefficient of equivalence
In this reliability estimate, the means and variances of observed test scores are equal across forms
Parallel forms
In theory, the mean scores obtained on ______ correlate equally with true score and with other measures
Parallel Forms
A reliability estimate based on different versions of a test that have been constructed in an attempt to be parallel. They are merely different versions, without the equality of observed-score means and variances required of parallel forms
Alternate Forms
They are typically designed to be equivalent with respect to variables such as content and level of difficulty
Alternate Forms
How are reliability estimates obtained in parallel/alternate forms?
- two administrations with the same group are required (Form A at one point and Form B at the other)
- the equivalent form is administered either immediately or fairly soon after
- after administration, a correlation coefficient is obtained between the results of the two forms
What statistical procedures can we use in alternate or parallel forms?
- statistics Pearson r for scale measurements (interval/ratio)
- Spearman rho for ordinal
Source of error in immediate Alternate/Parallel forms
content sampling
Source of error in delayed Alternate/Parallel forms
content sampling and time sampling
True or False. Alternate forms are INDEPENDENTLY CONSTRUCTED tests.
True
What should alternate and parallel forms have in common?
- the same number of items
- items expressed in the same form
- items that cover the same type of content
- same difficulty
- same instructions
-same time limits, format, and all other aspects of the test
degree of correlation among all the items on a scale calculated from a single administration of a single form of a test
Internal Consistency
Explain the term “assess homogeneity” in Internal consistency
extent to which items measure a single trait
True or False. It is possible for heterogeneous items to form a homogeneous test.
True
Explain why it is possible for heterogeneous items to form a homogeneous test
It is still possible even if the items are heterogeneous, as long as the items are measured per subscale.
A test with subscales is still homogeneous as long as each subscale measures only a single construct.
In this reliability estimate, two scores are obtained for each person by dividing the test into equivalent halves
Split-Half Reliability Estimate
What are the steps of Split-Half Reliability Estimate?
- divide the test into two equivalent halves
- calculate Pearson r between scores on the two halves of the test
- adjust half-test reliability using the Spearman-Brown formula
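The three steps above can be sketched in Python using an odd-even split. All item scores below are made up for illustration, and `pearson_r` is a hypothetical helper (not from the text):

```python
import statistics

# Rows = test takers, columns = six hypothetical items (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
]

def pearson_r(x, y):
    """Pearson product-moment correlation."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Step 1: divide the test into two equivalent halves (odd-even split).
odd_half = [sum(row[0::2]) for row in scores]
even_half = [sum(row[1::2]) for row in scores]

# Step 2: calculate Pearson r between the two half-scores.
r_half = pearson_r(odd_half, even_half)

# Step 3: adjust the half-test reliability with the Spearman-Brown formula.
r_full = (2 * r_half) / (1 + r_half)
print(round(r_half, 3), round(r_full, 2))  # 0.923 0.96
```

Note that the adjusted coefficient (0.96) is higher than the raw half-test correlation (0.923), reflecting the longer full-length test.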
What are the acceptable ways to divide the test in Split Half Reliability?
- random assignment
- odd-even
- dividing the test by content so that each half contains equivalent items with respect to content and difficulty
Explain the disadvantage of split-half reliability
The reliability of a test is directly related to its length, so when you use split-half the correlation between the two half-scores may fall below .70. The Spearman-Brown formula is applied to correct for this. Rule of thumb: the greater the number of items, the higher the reliability
What is the Spearman-Brown formula?
- it is used in split-half reliability to correct the half-test correlation
- estimates the internal consistency reliability from two halves of a test
- can be used to estimate the reliability of a test once it is shortened or lengthened
What are the other functions of the Spearman-Brown formula?
- it can determine the number of items needed to attain a desired level of reliability
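A minimal sketch of the general Spearman-Brown prophecy formula, r_new = n·r / (1 + (n-1)·r), and of the same formula solved for the length factor n; the function names and the .60/.80 figures are illustrative assumptions, not from the text:

```python
def spearman_brown(r_original, n):
    """Predicted reliability when test length is multiplied by a factor n."""
    return (n * r_original) / (1 + (n - 1) * r_original)

def length_factor_needed(r_original, r_desired):
    """Factor by which to lengthen a test to reach a desired reliability
    (the Spearman-Brown formula solved for n)."""
    return (r_desired * (1 - r_original)) / (r_original * (1 - r_desired))

# Hypothetical: doubling a test whose reliability is .60.
print(round(spearman_brown(0.60, 2), 2))           # 0.75
# Factor needed to raise reliability from .60 to .80:
print(round(length_factor_needed(0.60, 0.80), 2))  # 2.67
```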
When adding new items using the Spearman-Brown formula, what factors should be considered?
The new items must be equivalent in content and difficulty, so that the lengthened test still measures what the original test measures
If reliability is low in the context of split-half and Spearman-Brown, what can we do?
- abandon the instrument
- locate or develop a suitable alternative
- create new items, clarify the test instructions, or simplify the scoring rules
used for non-dichotomous items (tests with no wrong answers, such as personality tests)
Cronbach’s Coefficient Alpha
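Coefficient alpha can be sketched from its standard formula, alpha = k/(k-1) · (1 - sum of item variances / total score variance). The Likert-type responses below are made up for illustration:

```python
import statistics

# Hypothetical Likert-type responses (rows = test takers, columns = items).
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [3, 3, 4],
    [5, 4, 5],
]

def cronbach_alpha(rows):
    """Cronbach's alpha from a single administration of a k-item test."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # one tuple of scores per item
    item_vars = sum(statistics.pvariance(col) for col in items)
    total_var = statistics.pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_vars / total_var)

print(round(cronbach_alpha(scores), 3))  # 0.923
```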
True or False. It is appropriate to use split-half reliability for heterogeneous tests and speed tests.
False
According to Streiner, an alpha value above ____ may be too high and indicate redundant items
.90
Cronbach’s alpha will be higher when a measure has more than ___items
25
used for tests with dichotomous items of varying difficulty
Tests with answer choices and only one correct answer per item, where the level of difficulty differs from item to item
KR-20 (Kuder-Richardson formula 20)
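KR-20 can be sketched from its standard formula, KR-20 = k/(k-1) · (1 - sum of p·q / total score variance), where p is the proportion passing each item. The right/wrong responses below are made up for illustration:

```python
import statistics

# Hypothetical dichotomous responses (rows = test takers, columns = items).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

def kr20(rows):
    """Kuder-Richardson formula 20 for dichotomously scored items."""
    k = len(rows[0])                      # number of items
    n = len(rows)                         # number of test takers
    pq = 0.0
    for col in zip(*rows):
        p = sum(col) / n                  # proportion answering the item correctly
        pq += p * (1 - p)
    total_var = statistics.pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - pq / total_var)

print(round(kr20(scores), 2))  # 0.8
```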
Who are the proponents of the KR-20 (Kuder-Richardson formula 20)?
Kuder and Richardson
used for tests with dichotomous items with same difficulty or average 50% difficulty
KR-21
A relatively new measure for evaluating the internal consistency of a test. It focuses on the degree of difference that exists between item scores
Average Proportional Distance (APD)
Interpretation of the APD
.2 or lower: excellent internal consistency
.25 to .2: acceptable range
What does an APD above .25 mean?
It suggests a problem with the internal consistency of the test
What is one potential advantage of the APD over Cronbach's alpha?
APD is not connected to the number of items on a measure
A reliability estimate that focuses on the degree of agreement, or consistency, between two or more raters with regard to a particular measure
Inter-rater reliability
Inter-rater reliability is often used in _____
evaluating non-verbal behavior
What is the source of error in inter-rater reliability?
Differences between raters in how they assess/score
What are the statistical procedures used in inter-rater reliability?
Cohen's kappa: 2 raters
Fleiss' kappa: 3 or more raters
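A minimal sketch of Cohen's kappa for two raters, kappa = (p_observed - p_chance) / (1 - p_chance); the ratings below are made-up categorical judgments ("y"/"n") over ten observations:

```python
from collections import Counter

# Hypothetical ratings from two observers over ten observations.
rater1 = ["y", "y", "n", "y", "n", "y", "y", "n", "y", "n"]
rater2 = ["y", "n", "n", "y", "n", "y", "y", "n", "y", "y"]

def cohens_kappa(a, b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each rater's marginal proportions per category.
    ca, cb = Counter(a), Counter(b)
    p_chance = sum((ca[c] / n) * (cb[c] / n) for c in set(a) | set(b))
    return (p_observed - p_chance) / (1 - p_chance)

print(round(cohens_kappa(rater1, rater2), 3))  # 0.583
```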
If the test is designed for use at various times or measures stable traits (e.g., employee performance, personality traits), then what reliability estimate should we use?
Test-retest reliability
If the test is for a single administration only, then what reliability estimate should we use?
Internal Consistency
____ is a source of error attributable to variations in the test taker's feelings, moods, or mental state over time
Transient Errors