Chapters 2, 3, 4, Test Development and Psychometrics Flashcards

1
Q

What are the four basic types of measurement scales?

A

nominal, ordinal, interval, and ratio

2
Q

The most elementary of the measurement scales; it involves classifying by name based on characteristics of the person or object being measured.

A

nominal scale

3
Q

Provides a measure of magnitude (rank order) and thus often provides more information than a nominal scale.

A

ordinal scale

4
Q

Units are in equal intervals; thus, a difference of 5 points between 45 and 50 represents the same amount of change as a difference of 5 points between 85 and 90.

A

Interval scale

5
Q

Has the same properties as an interval scale, plus the existence of a meaningful (true) zero.

A

ratio scale

6
Q

an individual’s score is compared with scores of other individuals who have taken the same test

A

norm-referenced instrument

7
Q

an individual’s score is compared with an established standard or criterion

A

criterion-referenced instrument

8
Q

A predetermined cutoff score indicates whether the person has attained an established level

A

Mastery component
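
The norm-referenced and criterion-referenced interpretations above can be contrasted with a minimal Python sketch. The norm group, the cutoff of 75, and the definition of percentile rank as the percentage of the norm group scoring below the score are illustrative assumptions, not taken from the cards.

```python
# Hypothetical norm-group scores and cutoff, for illustration only.
norm_group = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
cutoff = 75
score = 80

# Norm-referenced: where does the score fall relative to other test takers?
below = sum(1 for s in norm_group if s < score)
percentile_rank = 100 * below / len(norm_group)

# Criterion-referenced with a mastery component: was the cutoff reached?
mastered = score >= cutoff

print(percentile_rank, mastered)
```

The same raw score of 80 yields two different statements: a standing relative to others, and a pass/fail decision against a fixed standard.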

9
Q

Converting scores into a _________ helps you understand how your score compares with those of others who also took the test

A

Frequency distribution
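
As a rough illustration, a frequency distribution can be tallied in Python with `collections.Counter`. The scores below are made up for the example.

```python
from collections import Counter

# Hypothetical test scores, for illustration only.
scores = [70, 85, 85, 90, 70, 85, 100]

# Count how many test takers obtained each score.
freq = Counter(scores)

# Print the distribution from lowest to highest score.
for score in sorted(freq):
    print(score, freq[score])
```

A person who scored 85 can see that this was the most common score in the group, with only two scores above it.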

10
Q

Often used in assessment because this graphic representation makes the data easier to understand (x-axis horizontal, y-axis vertical)

A

Frequency polygon

11
Q

What are the three measures of central tendency that are useful in interpreting an individual’s results on an instrument?

A

Mode
Median
Mean
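
A small sketch of the three measures using Python's standard `statistics` module; the score list is hypothetical.

```python
import statistics

# Hypothetical test scores, for illustration only.
scores = [60, 70, 70, 80, 95]

mode = statistics.mode(scores)      # most frequent score
median = statistics.median(scores)  # middle score when sorted
mean = statistics.mean(scores)      # arithmetic average

print(mode, median, mean)
```

Here the mean (75) sits above the median and mode (both 70) because the single high score of 95 pulls the average upward.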

12
Q

The most frequent score in a distribution (the score with the highest frequency)

A

Mode

13
Q

the score at which 50% of the people had a score below it and 50% of the people had a score above it.

A

Median

14
Q

The arithmetic average of the scores

A

Mean

15
Q

Provides a measure of the spread of scores and indicates the variability between the highest and the lowest scores. _________ is calculated by simply subtracting the lowest score from the highest score.

A

Range

16
Q

To avoid the problem of the deviations from the mean summing to zero, square the deviations, add them together, and divide by the number of scores

A

Variance or mean square deviation
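
The steps on this card can be sketched in Python as follows. The scores are hypothetical, and the calculation divides by the number of scores as the card describes (many texts divide by N - 1 when estimating from a sample).

```python
# Hypothetical test scores, for illustration only.
scores = [60, 70, 70, 80, 95]

mean = sum(scores) / len(scores)

# Raw deviations from the mean cancel out and sum to zero.
deviations = [s - mean for s in scores]
print(sum(deviations))

# Squaring removes the signs, so the deviations no longer cancel.
variance = sum(d ** 2 for d in deviations) / len(scores)
print(variance)
```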

17
Q

The square root of the variance is the ___________

A

standard deviation
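
A minimal sketch of that relationship, using hypothetical scores and the divide-by-N form of the variance (`statistics.pvariance` and `statistics.pstdev` in the standard library):

```python
import math
import statistics

# Hypothetical test scores, for illustration only.
scores = [60, 70, 70, 80, 95]

variance = statistics.pvariance(scores)  # mean squared deviation
sd = math.sqrt(variance)                 # back in the original score units

# pstdev computes the same quantity directly.
print(sd, statistics.pstdev(scores))
```

Taking the square root returns the measure of spread to the same units as the scores themselves, which is why the standard deviation is usually easier to interpret than the variance.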

18
Q

Scores on an instrument often fall into a ______

A

normal distribution or normal curve (bell-shaped)

19
Q

A ________ is one in which the majority of scores are at the lower end of the range of scores.

A

positively skewed distribution

20
Q

Where the majority of scores are on the higher end of the distribution

A

negatively skewed distribution
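
One way to check the direction of skew numerically is the sign of the third standardized moment. The sketch below uses that statistic as an illustrative choice; the helper name `skewness` and both data sets are hypothetical.

```python
def skewness(scores):
    """Third standardized moment: positive for a long right tail,
    negative for a long left tail."""
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((s - mean) ** 2 for s in scores) / n) ** 0.5
    return sum((s - mean) ** 3 for s in scores) / n / sd ** 3

# Most scores at the lower end, one very high score.
mostly_low = [1, 1, 2, 2, 2, 3, 10]
# Most scores at the higher end, one very low score.
mostly_high = [10, 10, 9, 9, 9, 8, 1]

print(skewness(mostly_low) > 0)   # positively skewed
print(skewness(mostly_high) < 0)  # negatively skewed
```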

21
Q

What are three examples of standard scores?

A

z scores
T scores
Stanines
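
The three standard scores can be sketched from a raw score, assuming the usual conventions: z scores have a mean of 0 and SD of 1, T scores a mean of 50 and SD of 10, and stanines run from 1 to 9 with a mean of 5 and SD of about 2. The stanine line is a common approximation (rounding 2z + 5 and clipping to 1..9), not an exact percentile-band lookup; the function name and example values are hypothetical.

```python
import math

def standard_scores(raw, mean, sd):
    """Convert a raw score to (z, T, stanine) given the group mean and SD."""
    z = (raw - mean) / sd
    t = 50 + 10 * z
    # Approximation: round 2z + 5 to the nearest whole number, clipped to 1..9.
    stanine = min(9, max(1, math.floor(2 * z + 5.5)))
    return z, t, stanine

# Hypothetical example: raw score 115 on a test with mean 100 and SD 15.
z, t, stanine = standard_scores(115, 100, 15)
print(z, t, stanine)
```

A score one standard deviation above the mean therefore reads as z = 1.0, T = 60, and a stanine of 7.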

22
Q

What are the three major theories related to reliability?

A

classical test theory
generalizability theory
item response theory

23
Q

Based on the degree to which there is error within the instrument. _________ suggests that every score has two hypothetical components: a true score and an error component

A

Classical test theory

24
Q

_______ means there is a system: methods are planned, orderly, and methodical

A

Systematic

25
Q

Means lack of system: occurrences are presumed to be unsystematic (e.g., not consistent, such as a coffee spill on just one person’s test)

A

Random error

26
Q

Estimates the range of scores a person would obtain if they took the test over and over again.

A

Standard error of measurement
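
A brief sketch of how this is commonly computed, using the standard formula SEM = SD × sqrt(1 − reliability). The formula is not stated on the card, and the SD, reliability, and observed score below are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r), where r is the reliability coefficient."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical instrument: SD of 15 and reliability of .91.
sem = standard_error_of_measurement(15, 0.91)

# Band of plus or minus 1 SEM around an observed score of 100.
observed = 100
low, high = observed - sem, observed + sem
print(round(sem, 2), round(low, 1), round(high, 1))
```

Under these assumptions, roughly 68% of the time the person's repeated scores would be expected to fall between about 95.5 and 104.5.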

27
Q

The name given to the test-retest reliability coefficient when there is a long period between the administrations of the instrument

A

coefficient of stability

28
Q

Measures how consistently the items within a test measure a particular construct

A

Internal consistency

29
Q

Helps measure internal consistency by splitting the test in half and comparing the results of each half, which should be consistent

A

Split-half Reliability
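
A minimal sketch of one way to do this: score the odd-numbered and even-numbered items separately, correlate the two half scores, and then apply the Spearman-Brown formula to estimate reliability for the full-length test. The odd/even split and the Spearman-Brown step are common practice but are assumptions here, and the item responses are made up.

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical responses: one row per person, one column per item (1 = correct).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
]

# Score each half of the test for every person.
odd_half = [sum(row[0::2]) for row in responses]
even_half = [sum(row[1::2]) for row in responses]

r_half = pearson_r(odd_half, even_half)
# Spearman-Brown: estimate reliability of the full-length test.
r_full = 2 * r_half / (1 + r_half)
print(round(r_half, 2), round(r_full, 2))
```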

30
Q

Measurement of consistency between test scores for different test scorers.

A

Interscorer (interrater) reliability

31
Q

Two forms of the instrument are given to the same people, and the scores are then compared to estimate reliability.

A

Parallel forms

32
Q

Norming Groups. Created through:

A

*Simple random sample (every person in the population has the same chance of being sampled)

*Stratified sample (often used in assessment, where the test developer matches the percentages in the population in terms of ethnic group, geographic location, socioeconomic status)

*Cluster sample (the sample is created from existing groups, e.g., a random sample of children for an achievement test obtained through random selection of clusters such as schools).
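
The three sampling approaches can be sketched with Python's `random` module. The population of 100 people, the urban/rural split, the ten schools, and the sample sizes are all hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 60 urban and 40 rural people spread over 10 schools.
population = [
    {"id": i, "region": "urban" if i < 60 else "rural", "school": i % 10}
    for i in range(100)
]

# Simple random sample: everyone has the same chance of selection.
simple = rng.sample(population, 10)

# Stratified sample: draw from each region in proportion to its population share.
stratified = []
for region in ("urban", "rural"):
    stratum = [p for p in population if p["region"] == region]
    k = round(10 * len(stratum) / len(population))
    stratified.extend(rng.sample(stratum, k))

# Cluster sample: randomly pick whole schools, then take everyone in them.
chosen_schools = rng.sample(range(10), 2)
cluster = [p for p in population if p["school"] in chosen_schools]

print(len(simple), len(stratified), len(cluster))
```

The stratified sample always contains 6 urban and 4 rural people, mirroring the 60/40 split in the population, whereas the simple random sample only matches that split on average.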

33
Q

Consistency of a measure.

A

Reliability

34
Q

Typically Pearson r; estimates a test's reliability. Ranges from -1 to 1, where 1 or -1 is a perfect relationship between the variables with no measurement error. The closer to 1 or -1, the stronger the relationship; 0 indicates there is no association between the variables.

Small association: .1 to .3 or -.1 to -.3
Medium association: .3 to .5 or -.3 to -.5
Strong association: .5 to 1.0 or -.5 to -1.0
Correlation NOT causation!

A

Reliability Coefficient
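
As a rough sketch, Pearson r can be computed by hand and labeled with the size bands listed on this card. The helper names, the "negligible" label for values under .1, and the two sets of scores are assumptions made for the example.

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def association_size(r):
    """Label |r| using the bands from the card (.1, .3, .5)."""
    size = abs(r)
    if size >= 0.5:
        return "strong"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # label not on the card; added for completeness

# Hypothetical scores from two administrations of the same test.
first = [1, 2, 3, 4, 5]
second = [2, 4, 5, 4, 5]

r = pearson_r(first, second)
print(round(r, 2), association_size(r))
```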

35
Q

The extent to which a test measures what it’s meant to measure.

A

Validity

36
Q

How well the test or assessment evaluates all aspects of the topic, construct or behaviour

A

Content-related validity

37
Q

Extent to which a test or instrument is related to or predicts an outcome (e.g., SAT as a predictor of college performance)

A

Criterion-related validity

38
Q

Extent to which test or instrument measures or reflects the intended construct (such as intelligence).

A

Construct validity

39
Q

Allows us to predict values for a response.

A

Regression
(Regression analysis used to estimate slope)
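
A minimal sketch of simple linear regression by least squares, estimating the slope and intercept and then predicting a response. The predictor and outcome values are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical data: predictor scores and the outcomes later observed.
predictor = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]

slope, intercept = fit_line(predictor, outcome)

# Predict the outcome for a new predictor value of 6.
predicted = intercept + slope * 6
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```

With these numbers the fitted line is roughly outcome = 2.2 + 0.6 × predictor, so a predictor score of 6 yields a predicted outcome of about 5.8.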