Practice quiz questions Flashcards

1
Q

Name at least 3 commonly used indicators of reliability.

A
  • Split Half Method
  • Test-Retest
  • Alternate Forms
  • Interrater Reliability (Kappa)
  • KR20
  • Cronbach’s Alpha
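For concreteness, here is a minimal sketch of how two of these indicators are typically computed, using invented score data (NumPy assumed; the numbers and variable names are purely illustrative):

```python
import numpy as np

# Invented data: 6 people answering a 4-item test on two occasions.
time1 = np.array([
    [3, 4, 2, 5],
    [1, 2, 1, 2],
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
])
time2 = time1 + np.random.default_rng(0).integers(-1, 2, size=time1.shape)

# Test-retest: correlate total scores from the two occasions.
test_retest = np.corrcoef(time1.sum(axis=1), time2.sum(axis=1))[0, 1]

# Split-half: correlate odd-item and even-item half scores, then step the
# correlation up with Spearman-Brown because each half is only half a test.
odd = time1[:, ::2].sum(axis=1)
even = time1[:, 1::2].sum(axis=1)
r_half = np.corrcoef(odd, even)[0, 1]
split_half = 2 * r_half / (1 + r_half)

print(f"test-retest r = {test_retest:.2f}, split-half reliability = {split_half:.2f}")
```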
2
Q

Increasing reliability:
1 Increase the number of ______
2 Derive unidimensional tests _____ _______ to reduce heterogeneity
3 _________ for ___________ to increase estimated correlation between tests
4 Apply c_________ measurement models to obtain composite variables

A

Items; Factor Analysis; Correction for attenuation; Congeneric

3
Q

Which are the 2 most commonly used types of psychological tests?

A

Ability & Personality tests

4
Q

Which scientist first extensively measured individual differences in the late 19th century? Which discipline did he specialise in?

A

Francis Galton, biologist/physiologist

5
Q

Which experimental psychologist set up the first laboratory for making systematic observations?

A

Wilhelm Wundt – set up the first psychological laboratory in Leipzig in 1879, using standardised conditions

6
Q

Which American psychologist first coined the term “mental test”?

A

James McKeen Cattell

7
Q

What are the three overlapping concepts described by the term “human ability”?

A

Achievement, aptitude and intelligence

8
Q

Who constructed the first widely used psychological test? What for, and when?

A

Binet & Simon, for individually testing the intelligence of children for classification of mental
retardation in France in 1905.

9
Q

Who devised the first widely used group intelligence test? What for?

A

Robert Yerkes – American psychologist, for group testing of army recruits in the 1910s

10
Q

Who developed the first personality questionnaire? What was that called?

A

Woodworth developed the Personal Data Sheet (published around 1920), a structured paper-and-pencil group test, to screen military recruits during WWI

11
Q

Who first provided systematic description and review of published tests? (O_______ Bu______)

A

Oscar Buros

12
Q

Which is the most referenced test in the psychological literature?

A

MMPI

13
Q

What characterises a psychological test, and what makes it a special type of psychological measure?

(a) A psychological test is characterised by s__________ a____________ and
scoring, the use of a m______ and usually the availability of population n_____ to
assist interpretation
(b) A set of items that has accepted levels of r_______ and v______, and allows
measurement of some attribute of an individual

A

standardised administration; manual; norms; reliability; validity

15
Q

Suggest 3 methods you would use to locate information about a published personality test.

A
  • Mental Measurements Yearbooks
  • distributors’ test catalogues (e.g., ACER)
  • test manuals
  • psychology databases (e.g., PsycINFO)
16
Q
  1. KR20 and coefficient alpha are both measures of:
    a. the extent to which items on a test are clearly related to the construct being measured.
    b. the extent to which items on a test are intercorrelated.
    c. the extent to which items on a test are of an appropriate level of difficulty.
    d. the extent to which items on a test are truly measuring what they purport to measure.

A

b. the extent to which items on a test are intercorrelated.
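As a minimal sketch of why alpha (and KR20, its special case for 0/1 items) indexes item intercorrelation, here is the standard item-variance formula applied to invented responses (the function name and data are illustrative):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for a persons x items score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented responses: 5 people x 4 dichotomous items, so alpha equals KR20 here.
scores = np.array([
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
])
print(f"alpha (= KR20 here) = {cronbach_alpha(scores):.2f}")
```

The more strongly the items intercorrelate, the larger the total-score variance relative to the summed item variances, and the higher alpha becomes.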

17
Q
  1. Administering a test to a group of individuals, re-administering the same test to the same group at a later
    time, and correlating test scores at times 1 and 2 demonstrates which method of estimating reliability?

a. alternate forms method
b. test-retest method
c. split-half method
d. internal consistency method

A

b. test-retest method

18
Q
  1. A test has a reliability coefficient of 0.77. This coefficient means that
    a. 77% of the variance in the test scores is true score variance, and 23% is error variance.
    b. 77% of items on this test are reliable, and 23% of the items are unreliable.
    c. 23% of the variance in the test scores is true score variance, and 77% is error variance.
    d. 77% of the variance in the test scores is unexplained variance, and 23% is error variance.
A

a. 77% of the variance in the test scores is true score variance, and 23% is error variance.
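A tiny worked illustration of the variance interpretation, using only the 0.77 from the question:

```python
# Classical test theory: observed variance = true-score variance + error variance,
# and the reliability coefficient is the true-score share of observed variance.
reliability = 0.77
error_proportion = 1 - reliability
print(f"true-score variance: {reliability:.0%}, error variance: {error_proportion:.0%}")
# -> true-score variance: 77%, error variance: 23%
```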

19
Q
  1. Test constructors can improve reliability by
    a. increasing the number of items on a test.
    b. decreasing the number of items on a test.
    c. retaining items that measure sources of error variation.
    d. increasing the number of possible responses to each item.
A

a. increasing the number of items on a test.

20
Q
  1. Administering two supposedly equivalent forms of a test (e.g., Form A and Form B) to the same group of
    individuals yields a correlation coefficient indicating:

a. test-retest reliability
b. split-half reliability
c. alternate forms reliability
d. internal consistency reliability

A

c. alternate forms reliability

21
Q
  1. The reliability of a difference score is expected to be:

a. lower than the reliability of either test on which it is based.
b. higher than the reliability of one test and lower than the reliability of the other test on which it is based.
c. higher than the reliability of both tests on which it is based.
d. unrelated to the reliability of either test on which it is based.

A

a. lower than the reliability of either test on which it is based.
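One way to see why: under the common equal-variance formulation, the reliability of a difference score D = X − Y depends on both tests' reliabilities and on how strongly X and Y correlate. A sketch with illustrative values:

```python
def difference_score_reliability(r_xx: float, r_yy: float, r_xy: float) -> float:
    """Reliability of D = X - Y, assuming X and Y have equal variances."""
    return ((r_xx + r_yy) / 2 - r_xy) / (1 - r_xy)

# Two reasonably reliable tests that correlate with each other yield a
# noticeably less reliable difference score.
print(round(difference_score_reliability(0.85, 0.85, 0.60), 2))  # 0.62
```

The more the two tests correlate, the more of their shared true-score variance cancels out of the difference, leaving error to dominate.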

22
Q
  1. Strong inter-rater (or inter-judge) reliability is probably most important to which type of test?

a. Structured (objective) personality tests
b. Achievement tests
c. Behaviour rating scales
d. Aptitude tests

A

c. Behaviour rating scales

23
Q
  1. The relative closeness of a person’s observed score to his/her true score is estimated by the

a. test-retest reliability coefficient
b. internal consistency reliability coefficient
c. standard deviation.
d. standard error of measurement.

A

d. standard error of measurement.
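A brief sketch of the standard error of measurement, with an invented IQ-style scale (SD and reliability values are illustrative):

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - r_xx): expected spread of observed scores
    around a person's true score."""
    return sd * math.sqrt(1 - reliability)

# Scale with SD = 15 and reliability .91 -> SEM of about 4.5 points,
# so a rough 95% band is the observed score +/- 1.96 * SEM.
print(round(standard_error_of_measurement(15, 0.91), 1))  # 4.5
```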

24
Q
  1. Sources of error associated with time-sampling (e.g., practice effects, carry-over effects) are best expressed in ________ coefficients, whereas error associated with the use of particular items is best expressed in ________ coefficients.

a. test-retest reliability, internal consistency reliability
b. alternate forms reliability, inter-rater reliability
c. internal consistency reliability, test-retest reliability
d. split-half reliability, alternate forms reliability

A

a. test-retest reliability, internal consistency reliability

25
Q
  1. Which of the following allows test developers to estimate what the correlation between two measures
    would have been if they had not been measured with error?

a. standard error of measurement
b. reliability of a difference score
c. correction for attenuation
d. Spearman-Brown prophecy formula

A

c. correction for attenuation
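A minimal sketch of the correction-for-attenuation formula, with an illustrative observed correlation and reliabilities:

```python
import math

def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimated correlation between two measures if both were free of
    measurement error: r_xy / sqrt(r_xx * r_yy)."""
    return r_xy / math.sqrt(r_xx * r_yy)

# An observed correlation of .40 between tests with reliabilities .70 and .80
# corresponds to an estimated error-free correlation of about .53.
print(round(correct_for_attenuation(0.40, 0.70, 0.80), 2))  # 0.53
```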

26
Q
  1. Criterion-related validity includes:

a. content validation
b. concurrent, predictive and construct validation
c. construct validation
d. predictive and concurrent validation

A

d. predictive and concurrent validation

27
Q
  1. Test A has been in use for many years. It is highly respected as a test of intelligence. Test B is a new test,
    much shorter and easier to administer than A. Test B also claims to be a test of intelligence. The validation of
    Test B with Test A would be:

a. face validation
b. concurrent validation
c. predictive validation
d. discriminant validation

A

b. concurrent validation

28
Q
  1. The ultimate criterion for most tests is:

a. academic achievement at school
b. personality assessment by other tests
c. actual performance in real life or on the job
d. convergent validation with moderator tests

A

c. actual performance in real life or on the job

29
Q
  1. A musical aptitude test is given to a number of students at the School of Music and also to a group of students at the Institute of Sport. This method of validating a test is known as:

a. concurrent method
b. contrasted groups
c. criterion validation
d. predictive validity

A

b. contrasted groups

30
Q
  1. Face validity should generally be considered in test development because it is:

a. a strong clue to the statistical evidence for their selection
b. justifiable as an important concept of validity
c. a substitute for more objective kinds of evidence
d. indicative of the test’s acceptability by the type of examinees for which it is designed.

A

d. indicative of the test’s acceptability by the type of examinees for which it is designed.

31
Q
  1. Which of the following is an example of criterion contamination in an industrial study?
    a. Many workers unable to perform the job adequately were fired prior to the gathering of the criterion data.
    b. The criterion used was unreliable and subject to large inter-scorer differences.
    c. The predictive test scores for examinees were known by the individuals making the criterion assessments.

    d. There was an unpredictable change in the choice of a criterion after the study began due to economic downturn policy changes.

A

c. The predictive test scores for examinees were known by the individuals making the criterion assessments.

32
Q
  1. There are two computer aptitude tests, one of which is individually administered at a cost of $75 per
    assessment. The second is a group test administered at a cost of $5 each. The second test would be used if its
    results could be shown to be closely similar to the results of the more expensive individual test. What type of
    validation would be most relevant to this decision:

a. content
b. face
c. concurrent
d. predictive

A

c. concurrent

33
Q
  1. A high correlation between measures of test anxiety obtained through a self-report inventory and through
    a physiological technique (e.g., GSR or urinalysis) illustrates:
    a. discriminant validity
    b. concurrent validity
    c. context validity
    d. convergent validity
A

d. convergent validity

34
Q
  1. To show good discriminant validity a test of numerical aptitude should:
    a. correlate highly with reading comprehension
    b. correlate low with reading comprehension
    c. correlate highly with arithmetic grades
    d. correlate low with arithmetic grades
A

b. correlate low with reading comprehension

35
Q
  1. The validity of a test:

a. Should be settled once and for all by the test publishers.
b. Is essentially the same concept as it was 50 years ago.
c. Addresses itself to the question of what the test really means
d. Has no bearing in the area of personality testing.

A

c. Addresses itself to the question of what the test really means

36
Q
  1. What are 3 commonly used indicators of reliability?
A
Test-retest reliability,
inter-observer reliability,
split-half reliability,
parallel forms,
internal consistency.
37
Q

What are 3 strategies for increasing reliability of measures?

A

Increase the number of items,
derive unidimensional tests through factor analysis,
apply correction for attenuation,
apply congeneric measurement models to obtain composite variables within structural equation modelling.

38
Q

What are 4 commonly used within-group norms that can enable us to interpret an individual’s test scores?
Per__________,
_ scores,
normalised s____________ scores (examples are T scores, stanines),
__’s.

A

Percentiles; Z Scores; standardised; IQ’s
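A rough sketch of how a single raw score maps onto these within-group norms, assuming approximately normal scores in the norm group (the mean, SD, raw score and helper name are all invented; the stanine line is the usual linear approximation):

```python
from statistics import NormalDist

def within_group_norms(raw: float, mean: float, sd: float) -> dict:
    """Convert a raw score to common within-group norm scales."""
    z = (raw - mean) / sd
    return {
        "z": round(z, 2),
        "T": round(50 + 10 * z),            # normalised standard score, M=50, SD=10
        "IQ-style": round(100 + 15 * z),    # deviation-IQ metric, M=100, SD=15
        "percentile": round(100 * NormalDist().cdf(z)),
        "stanine": min(9, max(1, round(2 * z + 5))),  # M=5, SD=2, clipped to 1-9
    }

# Invented example: raw score 26 in a norm group with mean 20 and SD 4.
print(within_group_norms(26, 20, 4))
# {'z': 1.5, 'T': 65, 'IQ-style': 122, 'percentile': 93, 'stanine': 8}
```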

39
Q

What is the relationship between validity coefficient of a measure and standard error of estimate?

Standard error of estimate is determined by the v________ c___________ of a test and the s________ d________ of criterion scores used to validate the test.

A

validity coefficient; standard deviation

40
Q
SEest = S2 √(1 − (R12)²)
  • where (R12)² = ________ ___________ squared
  • S2 = _________ ___________ of criterion scores
A

validity coefficient; standard deviation
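A quick worked example of the formula above, with an invented criterion SD and validity coefficient:

```python
import math

def standard_error_of_estimate(sd_criterion: float, validity: float) -> float:
    """SEest = S_criterion * sqrt(1 - r^2): spread of actual criterion scores
    around the scores predicted from the test."""
    return sd_criterion * math.sqrt(1 - validity ** 2)

# Criterion SD = 10 and validity coefficient r = .60 give SEest = 8.0.
print(round(standard_error_of_estimate(10, 0.60), 1))  # 8.0
```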

41
Q

Suggest 2 ways to determine construct-related evidence of validity.
A construct-validated instrument should have high correlations with other measures or
methods of measuring the same construct (c______ v________), but low correlations with
measures of different constructs (_________ v_________).

A

convergent validity; discriminant validity