Reliability Qualifying Exam Flashcards

Card 1
Q

The relationship between reliability and validity

A

What is reliability?
What is validity?
What is the relationship between them?

Card 2
Q

How do we measure reliability?

A
  1. Test-retest method: stability
  2. Split-half method: internal consistency
  3. Multiple-forms (parallel-forms) method: equivalence
  4. Rater reliability (inter-rater, intra-rater): consistency of scoring
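
The split-half method above can be illustrated with a short sketch. This is a minimal Python example with hypothetical item data; the odd/even split and the Spearman-Brown correction are the standard procedure, but the data matrix is made up.

```python
# Minimal sketch of split-half reliability with the Spearman-Brown correction.
# Hypothetical data: rows = test takers, columns = item scores (1 = correct).

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def split_half_reliability(items):
    # Split the test into odd- and even-numbered items and total each half.
    odd = [sum(row[0::2]) for row in items]
    even = [sum(row[1::2]) for row in items]
    r_half = pearson_r(odd, even)
    # Spearman-Brown: project the half-test correlation to full test length.
    return 2 * r_half / (1 + r_half)

data = [[1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 0, 1, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0]]
print(round(split_half_reliability(data), 3))  # → 0.72
```

Note that the estimate depends on how the test is split; the Spearman-Brown step is needed because a half-length test is less reliable than the full test.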
Card 3
Q

What are the strengths and weaknesses of each method of measuring reliability?

A
  1. Test-retest Method
    strength:
    weakness:
  2. Split-half
    strength:
    weakness:
  3. Multiple-forms
    strength:
    weakness:
Card 4
Q

What are some key statistical figures to know in reliability measures? What do they mean?

A

Correlation coefficients: used to estimate test-retest, parallel-forms, and inter-rater reliability; values closer to 1 indicate greater consistency.

Cronbach's alpha: an estimate of internal consistency based on the number of items and the ratio of item variances to total-score variance.
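
Cronbach's alpha can be computed directly from a score matrix. A minimal sketch with hypothetical data, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores):

```python
# Cronbach's alpha from an item-score matrix
# (rows = test takers, columns = items).

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(items):
    k = len(items[0])  # number of items
    item_vars = sum(variance([row[i] for row in items]) for i in range(k))
    total_var = variance([sum(row) for row in items])
    return k / (k - 1) * (1 - item_vars / total_var)

data = [[1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 0, 1, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0]]
print(round(cronbach_alpha(data), 3))  # → 0.8
```

Alpha rises when items correlate with each other; a low alpha suggests the items are not measuring a single underlying trait.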

Card 5
Q

How do we measure validity?

A

Content validity
Criterion-related validity (predictive, concurrent)
Construct validity (convergent, discriminant)

Common statistical tools: correlations, factor analysis, and the multitrait-multimethod (MTMM) matrix
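
The MTMM logic can be sketched as a comparison among correlations. This toy example uses entirely hypothetical traits, methods, and correlation values; it only shows the pattern the MTMM matrix is meant to reveal.

```python
# Toy MTMM check with hypothetical correlations.
# Convergent validity: same trait measured by different methods correlates highly.
# Discriminant validity: different traits correlate lower, even with the same method.

corr = {
    ("reading:test", "reading:interview"): 0.75,  # same trait, different methods
    ("reading:test", "writing:test"): 0.40,       # different traits, same method
    ("reading:test", "writing:interview"): 0.25,  # different traits, different methods
}

convergent = corr[("reading:test", "reading:interview")]
# Evidence for construct validity: the convergent correlation exceeds
# both discriminant correlations.
ok = convergent > corr[("reading:test", "writing:test")] > corr[("reading:test", "writing:interview")]
print(ok)  # → True
```

The same-trait/different-method cell corresponds to "maximally different methods" in Card 6: high agreement there supports validity, not just reliability.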

Card 6
Q

Things to consider in understanding the relationship between reliability and validity in language testing

A

Reliability: consistency of measurement
Validity: the degree to which accumulated evidence supports the inferences that are made from the scores
Classical test theory: observed score = true score + error score

Reliability: the agreement between two efforts to measure the same trait through maximally similar methods
Validity: the agreement between two attempts to measure the same trait through maximally different methods

performance assessment
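
The observed/true/error decomposition can be illustrated numerically. This is a sketch with made-up scores; in practice true scores are unobservable and reliability must be estimated indirectly.

```python
# Classical test theory: observed score = true score + error score.
# Reliability = true-score variance / observed-score variance.
# Hypothetical scores; the error scores here sum to zero and are
# uncorrelated with the true scores, so the variances add.

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

true_scores = [50, 60, 70, 80, 90]
error_scores = [2, -2, 0, -2, 2]
observed = [t + e for t, e in zip(true_scores, error_scores)]  # [52, 58, 70, 78, 92]

reliability = variance(true_scores) / variance(observed)
print(round(reliability, 3))  # → 0.984
```

Larger error variance shrinks this ratio toward 0; with no error it equals 1, which is why reliability is read as the proportion of score variance that reflects the trait rather than noise.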

Card 7
Q

Describe the features of (a) purpose, (b) content, (c) frame of reference, (d) scoring procedure, and (e) testing method in test development.

A
  1. Educational purposes - tests are used for a wide variety of decisions; classify them by the type of decision to be made:
    (a) admission - selection, entrance, readiness
    (b) identifying the appropriate level or area of instruction - placement tests, diagnostic tests
    (c) monitoring learning progress - progress reports, achievements, attainments (mastery)
  2. Research purposes - test results are used to compare the performance of individuals with different characteristics, or under different conditions of acquisition or instruction. Language tests are also used to test hypotheses about the nature of language proficiency.
Card 8
Q

What are the different types of tests?

A
  1. For diagnostic tests, …
  2. For placement tests, …
  3. For selection tests, …
  4. For formative tests, …
  5. For proficiency tests, …
  6. For achievement tests, …
Card 9
Q

Difference between subjective and objective scoring methods

A

Objective scoring: the correctness of a response can be determined without scorer judgment (e.g., multiple-choice, true/false); scores are highly consistent across scorers.
Subjective scoring: the score depends on a rater's judgment (e.g., essays, interviews); inter-rater and intra-rater reliability become central concerns.

Card 10
Q

Factors that influence test performance

A
  1. Communicative language ability
  2. Test method facets
  3. Personal attributes
  4. Random factors

Test method facets:
testing environment, test rubric, input, expected response, relationship between input and expected response

Card 11
Q

Criterion referenced (CR) test vs. Norm-referenced (NR) test

A

Results of a language test can be interpreted in two ways, depending on the frame of reference.

NR interpretation: a score is interpreted in relation to the performance of a group (the norm), a large group of individuals similar to those for whom the test is designed. In practice, results are often interpreted relative to the group actually taking the test rather than a separate norm group. Typical statistics: mean, median, standard deviation, percentile rank. Scores are assumed to follow a normal distribution and are ranked; the goal is to maximize distinctions among individuals in a given group. (Standardized tests have fixed content and standard procedures for administration and scoring, and are rigorously tested and empirically validated.)

CR interpretation: a score is interpreted against a criterion level of ability or content domain (mastery of the subject). The reference points (criterion levels, content domain) must be specified in advance, and items are selected by how adequately they represent those levels, so good coverage of the content domain is needed. Subject-matter experts evaluate test items against the test specifications. Everyone who demonstrates mastery receives an A, regardless of how many others do.

Keep in mind that a cut-off score is harder to implement when most scores cluster in one region (high, middle, or low): the unnaturally low variance depresses the reliability estimate.
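
The two frames of reference can be contrasted on the same score. A minimal sketch with hypothetical scores and a hypothetical cut-off of 80; the function names are illustrative only.

```python
# Contrasting NR and CR interpretations of the same test score.

scores = [55, 62, 70, 70, 74, 78, 81, 85, 90, 95]  # hypothetical group results

def percentile_rank(score, group):
    # NR interpretation: percent of the group scoring below this score.
    return 100 * sum(s < score for s in group) / len(group)

def is_mastery(score, cutoff=80):
    # CR interpretation: compare against a fixed criterion, ignoring the group.
    return score >= cutoff

print(percentile_rank(81, scores))  # → 60.0
print(is_mastery(81))               # → True
```

The NR value would change if the group changed; the CR value would not. This is why a score of 81 can be unremarkable in a strong group yet still count as mastery.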
