Practice quiz questions Flashcards

1
Q

Name at least 3 commonly used indicators of reliability.

A
  • Split Half Method
  • Test-Retest
  • Alternate Forms
  • Interrater Reliability (Kappa)
  • KR20
  • Cronbach’s Alpha
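For concreteness, here is a minimal sketch of how two of these indicators are typically computed, using invented score data (NumPy assumed; the numbers and variable names are purely illustrative):

```python
import numpy as np

# Invented data: 6 people answering a 4-item test on two occasions.
time1 = np.array([
    [3, 4, 2, 5],
    [1, 2, 1, 2],
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
])
time2 = time1 + np.random.default_rng(0).integers(-1, 2, size=time1.shape)

# Test-retest: correlate total scores from the two occasions.
test_retest = np.corrcoef(time1.sum(axis=1), time2.sum(axis=1))[0, 1]

# Split-half: correlate odd-item and even-item half scores, then step the
# correlation up with Spearman-Brown because each half is only half a test.
odd = time1[:, ::2].sum(axis=1)
even = time1[:, 1::2].sum(axis=1)
r_half = np.corrcoef(odd, even)[0, 1]
split_half = 2 * r_half / (1 + r_half)

print(f"test-retest r = {test_retest:.2f}, split-half reliability = {split_half:.2f}")
```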
2
Q

Increasing reliability:
1 Increase the number of ______
2 Derive unidimensional tests _____ _______ to reduce heterogeneity
3 _________ for ___________ to increase estimated correlation between tests
4 Apply c_________ measurement models to obtain composite variables

A

Items; Factor Analysis; Correction for attenuation; Congeneric

3
Q

Which are the 2 most commonly used types of psychological tests?

A

Ability & Personality tests

4
Q

Which scientist first extensively measured individual differences in the late 19th century? Which discipline did he specialise in?

A

Francis Galton, biologist/physiologist

5
Q

Which experimental psychologist set up the first laboratory for making systematic observations?

A

Wilhelm Wundt – set up the first psychological laboratory in Leipzig in 1879, using standardised conditions

6
Q

Which American psychologist first coined the term “mental test”?

A

James McKeen Cattell

7
Q

What are the three overlapping concepts described by the term “human ability”?

A

Achievement, aptitude and intelligence

8
Q

Who constructed the first widely used psychological test? What for, and when?

A

Binet & Simon, for individually testing the intelligence of children for classification of mental
retardation in France in 1905.

9
Q

Who devised the first widely used group intelligence test? What for?

A

Robert Yerkes – American psychologist, for group testing of army recruits in the 1910s

10
Q

Who developed the first personality questionnaire? What was that called?

A

Woodworth developed the Personal Data Sheet (published around 1920), a structured paper-and-pencil group test, to screen military recruits during WWI

11
Q

Who first provided systematic description and review of published tests? (O_______ Bu______)

A

Oscar Buros

12
Q

Which is the most referenced test in the psychological literature?

A

MMPI

13
Q

What characterises a psychological test, and what makes it a special type of psychological measure?

(a) A psychological test is characterised by s__________ a____________ and
scoring, the use of a m______ and usually the availability of population n_____ to
assist interpretation
(b) A set of items that has accepted levels of r_______ and v______, and allows
measurement of some attribute of an individual

A

standardised administration; manual; norms; reliability; validity

15
Q

Suggest 3 methods you would use to locate information about a published personality test.

A
  • Mental Measurements Yearbooks
  • distributors’ test catalogues (e.g., ACER)
  • test manuals
  • psychology databases (e.g., PsycINFO)
16
Q
  1. KR20 and coefficient alpha are both measures of:
    a. the extent to which items on a test are clearly related to the construct being measured.
    b. the extent to which items on a test are intercorrelated.
    c. the extent to which items on a test are of an appropriate level of difficulty.
    d. the extent to which items on a test are truly measuring what they purport to measure.

A

b. the extent to which items on a test are intercorrelated.
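As a minimal sketch of why alpha (and KR20, its special case for 0/1 items) indexes item intercorrelation, here is the standard item-variance formula applied to invented responses (the function name and data are illustrative):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for a persons x items score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented responses: 5 people x 4 dichotomous items, so alpha equals KR20 here.
scores = np.array([
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
])
print(f"alpha (= KR20 here) = {cronbach_alpha(scores):.2f}")
```

The more strongly the items intercorrelate, the larger the total-score variance relative to the summed item variances, and the higher alpha becomes.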

17
Q
  1. Administering a test to a group of individuals, re-administering the same test to the same group at a later
    time, and correlating test scores at times 1 and 2 demonstrates which method of estimating reliability?

a. alternate forms method
b. test-retest method
c. split-half method
d. internal consistency method

A

b. test-retest method

18
Q
  1. A test has a reliability coefficient of 0.77. This coefficient means that
    a. 77% of the variance in the test scores is true score variance, and 23% is error variance.
    b. 77% of items on this test are reliable, and 23% of the items are unreliable.
    c. 23% of the variance in the test scores is true score variance, and 77% is error variance.
    d. 77% of the variance in the test scores is unexplained variance, and 23% is error variance.
A

a. 77% of the variance in the test scores is true score variance, and 23% is error variance.
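A tiny worked illustration of the variance interpretation, using only the 0.77 from the question:

```python
# Classical test theory: observed variance = true-score variance + error variance,
# and the reliability coefficient is the true-score share of observed variance.
reliability = 0.77
error_proportion = 1 - reliability
print(f"true-score variance: {reliability:.0%}, error variance: {error_proportion:.0%}")
# -> true-score variance: 77%, error variance: 23%
```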

19
Q
  1. Test constructors can improve reliability by
    a. increasing the number of items on a test.
    b. decreasing the number of items on a test.
    c. retaining items that measure sources of error variation.
    d. increasing the number of possible responses to each item.
A

a. increasing the number of items on a test.

20
Q
  1. Administering two supposedly equivalent forms of a test (e.g., Form A and Form B) to the same group of
    individuals yields a correlation coefficient indicating:

a. test-retest reliability
b. split-half reliability
c. alternate forms reliability
d. internal consistency reliability

A

c. alternate forms reliability

21
Q
  1. The reliability of a difference score is expected to be:

a. lower than the reliability of either test on which it is based.
b. higher than the reliability of one test and lower than the reliability of the other test on which it is based.
c. higher than the reliability of both tests on which it is based.
d. unrelated to the reliability of either test on which it is based.

A

a. lower than the reliability of either test on which it is based.
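One way to see why: under the common equal-variance formulation, the reliability of a difference score D = X − Y depends on both tests' reliabilities and on how strongly X and Y correlate. A sketch with illustrative values:

```python
def difference_score_reliability(r_xx: float, r_yy: float, r_xy: float) -> float:
    """Reliability of D = X - Y, assuming X and Y have equal variances."""
    return ((r_xx + r_yy) / 2 - r_xy) / (1 - r_xy)

# Two reasonably reliable tests that correlate with each other yield a
# noticeably less reliable difference score.
print(round(difference_score_reliability(0.85, 0.85, 0.60), 2))  # 0.62
```

The more the two tests correlate, the more of their shared true-score variance cancels out of the difference, leaving error to dominate.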

22
Q
  1. Strong inter-rater (or inter-judge) reliability is probably most important to which type of test?

a. Structured (objective) personality tests
b. Achievement tests
c. Behaviour rating scales
d. Aptitude tests

A

c. Behaviour rating scales

23
Q
  1. The relative closeness of a person’s observed score to his/her true score is estimated by the

a. test-retest reliability coefficient
b. internal consistency reliability coefficient
c. standard deviation.
d. standard error of measurement.

A

d. standard error of measurement.
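A brief sketch of the standard error of measurement, with an invented IQ-style scale (SD and reliability values are illustrative):

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - r_xx): expected spread of observed scores
    around a person's true score."""
    return sd * math.sqrt(1 - reliability)

# Scale with SD = 15 and reliability .91 -> SEM of about 4.5 points,
# so a rough 95% band is the observed score +/- 1.96 * SEM.
print(round(standard_error_of_measurement(15, 0.91), 1))  # 4.5
```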

24
Q
  1. Sources of error associated with time-sampling (e.g., practice effects, carry-over effects) are best expressed in ________ coefficients, whereas error associated with the use of particular items is best expressed in ________ coefficients.

a. test-retest reliability, internal consistency reliability
b. alternate forms reliability, inter-rater reliability
c. internal consistency reliability, test-retest reliability
d. split-half reliability, alternate forms reliability

A

a. test-retest reliability, internal consistency reliability

25
Q
  1. Which of the following allows test developers to estimate what the correlation between two measures
    would have been if they had not been measured with error?

a. standard error of measurement
b. reliability of a difference score
c. correction for attenuation
d. Spearman-Brown prophecy formula

A

c. correction for attenuation
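A minimal sketch of the correction-for-attenuation formula, with an illustrative observed correlation and reliabilities:

```python
import math

def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimated correlation between two measures if both were free of
    measurement error: r_xy / sqrt(r_xx * r_yy)."""
    return r_xy / math.sqrt(r_xx * r_yy)

# An observed correlation of .40 between tests with reliabilities .70 and .80
# corresponds to an estimated error-free correlation of about .53.
print(round(correct_for_attenuation(0.40, 0.70, 0.80), 2))  # 0.53
```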

26
Q
  1. Criterion-related validity includes:

a. content validation
b. concurrent, predictive and construct validation
c. construct validation
d. predictive and concurrent validation

A

d. predictive and concurrent validation

27
Q
  1. Test A has been in use for many years. It is highly respected as a test of intelligence. Test B is a new test,
    much shorter and easier to administer than A. Test B also claims to be a test of intelligence. The validation of
    Test B with Test A would be:

a. face validation
b. concurrent validation
c. predictive validation
d. discriminant validation

A

b. concurrent validation

28
Q
  1. The ultimate criterion for most tests is:

a. academic achievement at school
b. personality assessment by other tests
c. actual performance in real life or on the job
d. convergent validation with moderator tests

A

c. actual performance in real life or on the job

29
Q
  1. A musical aptitude test is given to a number of students at the School of Music and also to a group of students at the Institute of Sport. This method of validating a test is known as:

a. concurrent method
b. contrasted groups
c. criterion validation
d. predictive validity

A

b. contrasted groups

30
Q
  1. Face validity should generally be considered in test development because it is:

a. a strong clue to the statistical evidence for their selection
b. justifiable as an important concept of validity
c. a substitute for more objective kinds of evidence
d. indicative of the test’s acceptability by the type of examinees for which it is designed.

A

d. indicative of the test’s acceptability by the type of examinees for which it is designed.

31
Q
  1. Which of the following is an example of criterion contamination in an industrial study?
    a. Many workers unable to perform the job adequately were fired prior to the gathering of the criterion data.
    b. The criterion used was unreliable and subject to large inter-scorer differences.
    c. The predictive test scores for examinees were known by the individuals making the criterion assessments.

    d. There was an unpredictable change in the choice of a criterion after the study began due to economic downturn policy changes.

A

c. The predictive test scores for examinees were known by the individuals making the criterion assessments.

32
Q
  1. There are two computer aptitude tests, one of which is individually administered at a cost of $75 per
    assessment. The second is a group test administered at a cost of $5 each. The second test would be used if its
    results could be shown to be closely similar to the results of the more expensive individual test. What type of
    validation would be most relevant to this decision:

a. content
b. face
c. concurrent
d. predictive

A

c. concurrent

33
Q
  1. A high correlation between measures of test anxiety obtained through a self-report inventory and through
    a physiological technique (e.g., GSR or urinalysis) illustrates:
    a. discriminant validity
    b. concurrent validity
    c. context validity
    d. convergent validity
A

d. convergent validity

34
Q
  1. To show good discriminant validity a test of numerical aptitude should:
    a. correlate highly with reading comprehension
    b. correlate low with reading comprehension
    c. correlate highly with arithmetic grades
    d. correlate low with arithmetic grades
A

b. correlate low with reading comprehension

35
Q
  1. The validity of a test:

a. Should be settled once and for all by the test publishers.
b. Is essentially the same concept as it was 50 years ago.
c. Addresses itself to the question of what the test really means
d. Has no bearing in the area of personality testing.

A

c. Addresses itself to the question of what the test really means

36
Q
  1. What are 3 commonly used indicators of reliability?
A
Test-retest reliability,
inter-observer reliability,
split-half reliability,
parallel forms,
internal consistency.
37
Q

What are 3 strategies for increasing reliability of measures?

A

Increase the number of items,
derive unidimensional tests through factor analysis,
apply correction for attenuation,
apply congeneric measurement models to obtain composite variables within structural equation modelling.

38
Q

What are 4 commonly used within-group norms that can enable us to interpret an individual’s test scores?
Per__________,
_ scores,
normalised s____________ scores (examples are T scores, stanines),
__’s.

A

Percentiles; Z Scores; standardised; IQ’s
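A rough sketch of how a single raw score maps onto these within-group norms, assuming approximately normal scores in the norm group (the mean, SD, raw score and helper name are all invented; the stanine line is the usual linear approximation):

```python
from statistics import NormalDist

def within_group_norms(raw: float, mean: float, sd: float) -> dict:
    """Convert a raw score to common within-group norm scales."""
    z = (raw - mean) / sd
    return {
        "z": round(z, 2),
        "T": round(50 + 10 * z),            # normalised standard score, M=50, SD=10
        "IQ-style": round(100 + 15 * z),    # deviation-IQ metric, M=100, SD=15
        "percentile": round(100 * NormalDist().cdf(z)),
        "stanine": min(9, max(1, round(2 * z + 5))),  # M=5, SD=2, clipped to 1-9
    }

# Invented example: raw score 26 in a norm group with mean 20 and SD 4.
print(within_group_norms(26, 20, 4))
# {'z': 1.5, 'T': 65, 'IQ-style': 122, 'percentile': 93, 'stanine': 8}
```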

39
Q

What is the relationship between validity coefficient of a measure and standard error of estimate?

Standard error of estimate is determined by the v________ c___________ of a test and the s________ d________ of criterion scores used to validate the test.

A

validity coefficient; standard deviation

40
Q
SEest = S2 √(1 − (R12)²)
  • where (R12)² = ________ ___________ squared
  • S2 = _________ ___________ of criterion scores
A

validity coefficient; standard deviation
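A quick worked example of the formula above, with an invented criterion SD and validity coefficient:

```python
import math

def standard_error_of_estimate(sd_criterion: float, validity: float) -> float:
    """SEest = S_criterion * sqrt(1 - r^2): spread of actual criterion scores
    around the scores predicted from the test."""
    return sd_criterion * math.sqrt(1 - validity ** 2)

# Criterion SD = 10 and validity coefficient r = .60 give SEest = 8.0.
print(round(standard_error_of_estimate(10, 0.60), 1))  # 8.0
```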

41
Q

Suggest 2 ways to determine construct-related evidence of validity.
A construct-validated instrument should have high correlations with other measures or
methods of measuring the same construct (c______ v________), but low correlations with
measures of different constructs (_________ v_________).

A

convergent validity; discriminant validity