Week 3, Measurement, Key Terms Flashcards

1
Q

Measurement

A

The assignment of numbers to objects or events according to a set of rules

2
Q

Indicators

A

In psychological measurement we do not measure constructs directly (you cannot put a finger on IQ…).

Instead we measure the characteristics or properties associated with individuals.

We measure indicators (signs that point to something else).

3
Q

Why not measure organizational constructs directly?

A

We lose specificity as we move from the micro to the macro level – it is easier to measure directly at the individual level than at the organizational level

4
Q

Scales of Measurement

A

Psychological measurement varies in precision.

Differences in precision are reflected in the types of scales on which particular characteristics are being measured.

Four levels of measurement

Nominal
Ordinal
Interval
Ratio

5
Q

Nominal measurement

A

Lowest level of measurement

Represents differences in kind

Individuals are assigned or classified into qualitatively different categories

Merely labels

Frequently used to identify or catalog individuals and events
Ex.
Social Security numbers (SS#)
Assign 1 to males and 2 to females

The classes must be mutually exclusive

6
Q

Ordinal Measurement

A

Not only allows classification by category, but also provides an indication of magnitude

Rank ordered according to greater or lesser amounts of some dimension

If (a>b) and (b>c) then (a>c)

In top-down selection this may be all the information we need

7
Q

Interval Measurement

A

Adds further useful properties: equal intervals between scale points

Scores can be transformed in any linear fashion without altering the relationships between the scores

Allows two scores from different tests to be compared directly on a common metric

Standardization
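A minimal sketch (Python, with hypothetical score vectors) of the point above: standardization is a linear transformation, so it leaves the relationships between scores – for example, their correlation with another measure – unchanged.

import numpy as np

raw = np.array([12.0, 15.0, 9.0, 20.0, 17.0])   # hypothetical raw test scores
z = (raw - raw.mean()) / raw.std()              # standardization: z = (x - mean) / sd

other = np.array([3.0, 5.0, 2.0, 6.0, 4.0])     # hypothetical scores on another measure

# The correlation with the other measure is identical before and after the
# linear transformation, so relationships between scores are preserved.
print(np.corrcoef(raw, other)[0, 1])
print(np.corrcoef(z, other)[0, 1])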

8
Q

Ratio Measurement

A

Highest level of measurement

In addition to equality, transitivity, and additivity, the ratio scale has a natural or absolute zero point.

Height, distance, & weight are all ratio scales

We don’t see these scales much in psychological measurement

9
Q

Psychological Measurement

A

Principally concerned with individual differences in traits, attitudes, or behaviors.

Trait – a descriptive label applied to a group of interrelated behaviors

Based on standardized samples of individual behavior we infer the position or standing of the individual on the trait in question

10
Q

Systematic Nature of Measurement

A

TEST - a systematic procedure for measuring a sample of behavior.

Procedures are systematic in order to minimize the effects of unwanted contaminants (error or bias)

What is the difference between a personality “test” and a test of cognitive ability?

Found in:
-Mental Measurements Yearbook
-Publishers
-3rd parties (e.g., Rocket-Hire)
-Authors* (Taking the Measure of Work)

11
Q

Classifying tests

A

Content
Tests may be classified in terms of the task inherent in the scale

Ex. Cognitive ability tests
-Achievement
-Aptitude
vs.
Non-cognitive instruments (or inventories)

Tests may also be classified in terms of the efficiency with which they can be administered.

E.g.
Individual vs. Group
Speed vs. Power – designed to prevent perfect scores (always want variability on measurement tools)

Speed test – more items than can be answered in the time allotted
Power test – examinees may take as long as they want, and the test is scored on the number of correct answers. The longer people take, the more variance appears in the scores – someone could spend 24 hours on the test trying to do their best, which produces too much variance.

12
Q

Likert Scales

A

When I am stressed, sometimes I get high.

A. strongly disagree
B. disagree
C. agree
D. strongly agree

Self-report measure

13
Q

Behavioral Observation

A

The other end of the continuum

Past behavior is the best predictor of future behavior…

Issue of Obtrusiveness:
-Heisenberg uncertainty principle (observer effect)
–When people see that you are paying attention to them, their behavior changes
-Hawthorne effect
–Turning the heat up raised performance, turning the lights up raised performance, and turning the heat down also raised performance – why? Because the workers knew their performance was being observed

Can be cumbersome with a large N

To capture behavior you must be there when it occurs
Naturalistic observation

14
Q

Situational Judgment Test

A

The purpose is to identify a respondent’s intentions

Presents the person with a series of relevant incidents, and asks what he/she would do in that situation
The typical question is “What would you do if…?”

Often used to assess intelligence in a more “real world” fashion

Can assess a variety of constructs

15
Q

Theory Based

A

Goal setting theory
Intentions or goals are the immediate precursor of a person’s behavior
Added benefit of content validity

Attitudes>Intentions>Behavior

16
Q

Assessment Centers

A

Simulate the situation in which the individual will be performing

Predicts how successful that person will be in the actual situation

Exercises vary in fidelity and immersion

17
Q

Assessment Center Examples

A

AT&T developed and operated the Advanced Management Potential Assessment Program (AMPA) for itself and the Bell System Operating Companies. The program was used by all the Bell System companies from 1979 through 1983.

Dr. Rich’s example of a study he conducted in the early 2000s: he and his team immersed executives in situations all over Baltimore to test their adaptability. For example, they were told to talk to a man about a problem; when they reached him, they realized he was deaf. Some people simply gave up because they could not use sign language, while others grabbed a napkin and a pen so they could communicate with him.
The CEO could then see who was needed at the company and who wasn’t – like the person who gave up when they couldn’t figure out a situation.

18
Q

Psychometrics

A

RELIABILITY

If measurement procedures are to be useful, they must produce dependable scores

Consistency

Freedom from unsystematic (random) errors of measurement

19
Q

Methods to assess reliability

A

Test Re-test

Parallel (alternate) forms

Internal consistency
-Split half
–Splitting a test in half – you can split the test any way
-Kuder-Richardson 20 (KR-20)
–For tests with right and wrong answers
-Alpha
–Average of all possible split-half reliabilities (see the sketch below)
-Omega

Test–retest is a good way to assess reliability.
The downside to giving someone the same test twice is the practice effect – they will do better the second time because they have already taken it once.
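A minimal sketch (Python, with hypothetical 0/1 item responses) of two of the internal consistency estimates listed above: split-half (stepped up with the Spearman-Brown formula) and coefficient alpha.

import numpy as np

# rows = respondents, columns = items (hypothetical right/wrong data)
X = np.array([
    [1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 1, 1],
])

# Split-half: correlate odd-item and even-item half scores, then step the
# half-length correlation up to full length with Spearman-Brown.
odd, even = X[:, ::2].sum(axis=1), X[:, 1::2].sum(axis=1)
r_half = np.corrcoef(odd, even)[0, 1]
split_half = 2 * r_half / (1 + r_half)

# Coefficient alpha: k/(k-1) * (1 - sum of item variances / variance of total score)
k = X.shape[1]
alpha = k / (k - 1) * (1 - X.var(axis=0, ddof=1).sum() / X.sum(axis=1).var(ddof=1))

print(round(split_half, 2), round(alpha, 2))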

20
Q

Issues Related To Reliability

A

No fixed value indicates what counts as acceptable

Reliabilities often range from .70 to .90

Range of scores (need variability)
-A wider range of scores yields higher reliability estimates

Sample size & number of items
-The more observations (respondents and items) you have, the higher the reliability
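The “more items, higher reliability” point can be quantified with the Spearman-Brown prophecy formula (not named on the card); a minimal sketch with hypothetical numbers:

def spearman_brown(r, k):
    # Projected reliability when a test is lengthened k times,
    # given its current reliability r.
    return k * r / (1 + (k - 1) * r)

print(spearman_brown(0.70, 2))  # doubling a .70 test projects to about .82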

21
Q

Reliability & Validity

A

Theoretically it would be possible to develop a perfectly reliable measure whose scores were completely uncorrelated with any other variable.

This measure would have no practical value.

It would be highly reliable but would have no validity.

22
Q

Limit on validity

A

Validity is reduced by the unreliability in a set of measures

Ex. performance appraisal
-Typical reliabilities are low (around .60)
-This sets a cap on the possible criterion-related validity
-We can statistically correct for this type of unreliability (see the sketch below)
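A minimal sketch of the standard correction for attenuation, one common way to make the statistical correction mentioned above (the reliability and validity values here are hypothetical):

def correct_for_attenuation(r_xy, r_xx, r_yy):
    # Estimated correlation if both measures were perfectly reliable.
    return r_xy / (r_xx * r_yy) ** 0.5

# Hypothetical values: observed validity .30, predictor reliability .85,
# criterion (performance appraisal) reliability .60.
print(correct_for_attenuation(0.30, 0.85, 0.60))  # about .42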

23
Q

What is Validity ?

A

The extent to which a measurement procedure actually measures what it is designed to measure

Degree to which evidence and theory support the interpretation of test scores for their intended purpose

The process of gathering and evaluating data to assess this is called validation.

Really concerned with two issues: (1) what a test measures and (2) how well it measures it.

24
Q

Validity

A

Test scores are typically used to draw inferences about applicant behavior in situations beyond the testing environment

Test user must be able to justify the inferences drawn by having a cogent rationale or empirical support linking the test score to the inferred outcome

Nobody cares about the test score itself – what they care about are the inferences drawn from it (the consequences)

25
Q

Validation Strategies

A

Content - Related Evidence

Criterion - Related Evidence

Construct - Related Evidence

Standards (1999, 2014)

26
Q

Standards

A

Standards for Educational & Psychological Testing (2014).
Sources of validity evidence based on:
Test Content
Response Processes
Internal Structure
Relations to other Variables
Consequences of Testing

27
Q

Content Validity

A

The content of the test is drawn from the domain of interest

28
Q

Content Validation

A

Concerned with whether or not a measurement procedure contains a fair sample of the domain of situations it is supposed to represent
-Ex. suppose your first test had items drawn completely from texts that were not assigned for reading or covered in the lecture…

Our domain is usually job performance

Can also be other aspects of work, e.g., training proficiency

MUST provide evidence that a selection procedure samples knowledge or skills required for a job

MUST be based on accurate job information NEED A JOB ANALYSIS

MAY restrict job content domain to important or frequent activities (minimize the irrelevant)

In conducting a content validation study:
Content strategies are relatively data free

Need a panel of SMEs to rate each item on its relevance to the job

Can be quantified with a content validity index (CVI; sketched below)

Most of the inferences of validity are supported by the documentation surrounding the development of the test
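A minimal sketch (Python, with hypothetical SME ratings) of one common way a content validity index is computed: the proportion of SMEs who rate an item as relevant (3 or 4 on a 4-point relevance scale).

ratings = {  # item -> list of SME relevance ratings (1-4), hypothetical
    "item_1": [4, 4, 3, 4, 3],
    "item_2": [2, 3, 4, 2, 3],
}

def item_cvi(r):
    # Proportion of SMEs rating the item relevant (3 or 4).
    return sum(1 for x in r if x >= 3) / len(r)

for item, r in ratings.items():
    print(item, item_cvi(r))   # item_1 -> 1.0, item_2 -> 0.6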

29
Q

Criterion Validity

A

The criterion variable is a measure of some attribute or outcome that is of primary interest

The choice of the criterion and the measurement procedures used to obtain criterion scores are of central importance

Companies often overlook good measurement of the criterion (they use cheap, easily accessible criteria)

Requires data – nothing complex, but it needs data – at least about 100 subjects.

Ex. “g” (general mental ability) = the predictor trait

If we get a statistically significant relationship, that is evidence of criterion validity
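A minimal sketch (Python, with hypothetical simulated data; assumes numpy and scipy are available) of a basic criterion-related validity check: correlate predictor scores with a criterion and test the correlation for significance.

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
test_scores = rng.normal(100, 15, size=120)                                 # hypothetical predictor
performance = 0.3 * (test_scores - 100) / 15 + rng.normal(0, 1, size=120)  # hypothetical criterion

r, p = pearsonr(test_scores, performance)
print(f"r = {r:.2f}, p = {p:.4f}")  # a significant r is evidence of criterion validity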

30
Q

Feasibility

A

Job is reasonably stable and not in a period of rapid evolution

Relevant, reliable and uncontaminated criterion measure
Contaminated: Measuring things other than performance

Based on a sample that is reasonably representative

Statistical power
-Need a large enough sample to detect a typical validity coefficient (around .30)

31
Q

Predictive & Concurrent Validity

A

P - data on the selection procedure are collected at the time applicants are hired; after employees’ performance levels have stabilized, criterion data are collected. Applicant scores on the measure being validated are not used in the hiring decision!

C - the predictor and criterion data are collected on job incumbents at approximately the same time

32
Q

Construct Validity

A

Am I measuring what I intended to measure?

Specifying the meaning of the construct

Distinguishing it from other constructs

Indicating how the construct should relate to other variables

Nomological Network

33
Q

Conducting construct validation

A

Analysis of internal consistency
Factor analysis (establishing that items or item clusters share common variance)
Establishes that it is one construct

Correlations of a new procedure with established measures of the same construct (convergent validity) and with measures of unrelated constructs (divergent evidence)
Establishes what that construct is
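A minimal sketch (Python, with hypothetical simulated data) of the convergent/divergent logic: a new measure should correlate strongly with an established measure of the same construct and weakly with a measure of an unrelated construct.

import numpy as np

rng = np.random.default_rng(1)
new_scale   = rng.normal(size=200)
established = new_scale * 0.8 + rng.normal(scale=0.6, size=200)  # same construct
unrelated   = rng.normal(size=200)                               # different construct

print(np.corrcoef(new_scale, established)[0, 1])  # expected high (convergent evidence)
print(np.corrcoef(new_scale, unrelated)[0, 1])    # expected near zero (divergent evidence)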

34
Q

Construct Advanced Methods

A

Factor invariance

Does factor structure change when conditions change (when moderators are present)?
Constructs/items are different around the world
They are interpreted differently

For example, does factor structure change across cultures?

Big Five versus Chinese Personality Assessment Inventory

Etic vs. Emic