Week 1: Psychometrics Flashcards

1
Q

The quantitative assessment of latent (hidden, concealed, or not yet manifested) psychological constructs is also known as:

A

Psychometrics

2
Q

Precision and accuracy are examples of _____ properties (think very general)!

A

psychometric properties

3
Q

Representativeness is a sample quality.

What does this quality define?

Define Kaplan’s paradox of sampling which relates to this.

A

Representativeness describes how well a sample reflects the population.

The paradox of sampling is that we can’t test representativeness, and if we could, we wouldn’t need a sample in the first place!

4
Q

Name the quality which describes the degree of systematic (or random) error present in the sample. This quality can produce over- or underestimates of population values, and comes in many different forms.

A

Biasedness

5
Q

Describe the instability of psychological attributes present in the population from which the sample is drawn. How is this measure expressed?

A

The degree of homogeneity/non-homogeneity among the members of the population reflects the instability of that psychological attribute in that population, eg if the psychological attribute were stable, it would also be homogeneous.

6
Q

Standardised questionnaires are an example of _____ scoring whereas an assessor's judgement of a vignette or a projective test is an example of ____ scoring

A

Standardised questionnaires are an example of objective scoring whereas an assessor's judgement of a vignette or a projective test is an example of subjective scoring.

7
Q

IQ is an example of a _____ score

A

standardised

8
Q

Z-scores, T-scores and area transformations (quartiles, deciles, percentiles) are all examples of:

A

Standardisations and scale transformations
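These linear transformations can be sketched in a few lines of Python; the raw scores below are invented for illustration:

```python
# Minimal sketch of z-score and T-score transformations
# (invented raw scores; T-scores rescale z to mean 50, SD 10).
from statistics import mean, pstdev

raw = [10, 12, 14, 16, 18]           # invented raw test scores
m, sd = mean(raw), pstdev(raw)       # population mean and SD

z_scores = [(x - m) / sd for x in raw]       # z: mean 0, SD 1
t_scores = [50 + 10 * z for z in z_scores]   # T: mean 50, SD 10

print([round(z, 2) for z in z_scores])
print([round(t, 1) for t in t_scores])
```

Area transformations (quartiles, deciles, percentiles) would instead rank each score against the distribution rather than rescale it linearly.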

9
Q

How would a sample size limitation affect errors inferred from the test results?

A

Errors tend to be inversely proportional to sample size, therefore a limited sample size could mean that a large error is inferred.

10
Q

In psychometric testing, the degree to which a claim is correct or true is known as ____. This also reflects the appropriateness, usefulness or meaningfulness of test scores and their interpretations.

A

validity

11
Q

The levels of logical biases or statistical errors in the test construction and conclusions/outputs will greatly affect the ____ of a psychometric test

A

validity

12
Q

Why do assessments of validity constructs focus on scores/ data/ outcomes and functions?

A

Because they are measurable. If the outcome or function operates in the way that we claimed it would, then the construct is valid.

13
Q

In statistics, unknowns are seen as _____, in the same category as mistakes

A

In statistics, unknowns are seen as errors

14
Q

Why is construct validity also known as factorial validity? What does this validity type relate to?

A

Because all the constructs should fit one factor. Therefore the test is measuring the construct (or factor) that it claims to.

15
Q

What is operationalisation, in regards to creating psychometric tests?

A

Operationalisation is a way of constructing psychometric tests which allows for empirical assessment of constructs or variables in the test.

16
Q

High levels of correlation (statistical relation) between

(a) items that make up the same or related constructs,

or
(b) tests that measure the same or related constructs
describes which type of validity?

A

Convergent validity

17
Q

Low levels of correlation between

(a) items that make up unrelated constructs, or
(b) tests that assess unrelated constructs describes which type of validity?

A

Discriminant validity

18
Q

Concurrent validity and predictive validity are both examples of ____-____ validity. This validity type refers to the degree to which a test correlates with one or more parallel or outcome criteria.

A

Criterion-related validity

19
Q

Concurrent validity represents which kind of criterion-related validity?

Explain what long vs short or parallel forms of a personality measurement represents.

A

Concurrent validity represents criteria which are in the present.

Long versus short forms are different versions of the same assessment, with more or fewer questions (eg 500 items vs 50 items).

Parallel forms are different tests which use the same criterion eg both are measuring neuroticism.

20
Q

Although psychological testing has existed since ancient times, the systematic approach currently used has only been developed over the past ___ years

A

100

21
Q

Valid construct or assessment consistency across different settings, e.g. samples, populations, age, cultures, time-periods, etc, refers to ____ validity

A

external validity

22
Q

The degree to which a test score reflects a construct’s or phenomenon’s natural behaviour in the world is known as ____ validity, a type of external validity.

A

ecological validity

23
Q

The degree to which a relationship is not equal, eg Nik can mark my assignments but I can't mark his, is known as _____

A

asymmetry

24
Q

The degree of confidence in the nature of asymmetric causal relations (treatment and outcome relationships) between the measured constructs is known as internal validity.

What does this mean?

A

Internal validity represents the degree of confidence in the nature of asymmetric relations between the measured constructs:

How confident are we that some constructs that we measure cause other constructs that we also measure?

25
Q

When apples fall to earth, the term gravity is used to describe that behaviour. This is a good example of a ____

A

construct

26
Q

Validity testing of a test using a select group of people (eg university students) to generalise a criterion back to the general population presents statistical problems. Please explain.

A

The correlation between test scores and criterion measures is restricted when the range of test scores is restricted, eg if all scores are above 60, the correlation is estimated only from that narrow cluster, whereas without restricting the range, the correlation would be spread from the lowest test scores to the highest criterion values.

27
Q

Why do naturalistic designs, such as questionnaires, interviews and participant observations have good external validity (EV) and bad internal validity (IV)?

A

Naturalistic designs tend to have good EV and bad IV because in the real world there are so many variables, errors and unknowns.

28
Q

The less chance there is of confounding in a study, the higher the ____ validity

A

internal

29
Q

How does a normative score relate to a standardised score?

A

A normative score is a standardised score which is ranked against the other scores

30
Q

What are the following 2 examples of?

  1. A measure of job performance
  2. A GPA used to select employees.
A

They are both examples of criterion-related validity.

A concurrent criterion would be a measure of job performance.

A predictive criterion would be a GPA used to select employees.

31
Q

An aptitude test followed by a future job performance test to test the validity of that aptitude test is an example of what type of criterion-related validity?

A

An aptitude test followed by a future job performance test to test the validity of that aptitude test is an example of predictive (criterion-related) validity.

32
Q

A psychometric measure designed for a clinical population may not have good external validity. Explain why this is a good thing.

A

We may not want external validity outside of the clinical population in question; we may want the questionnaire items to measure that population only.

33
Q

What is an asymmetric causal relationship?

Give 2 examples.

A

An asymmetric causal relationship is an irreversible causal relationship. Eg. A causes B, and the process cannot be reversed by eliminating or reducing A.

Another example is that when Nico marks our exams, he will have an effect on the students but they will have no effect on him, eg A causes an effect but A is not affected by that effect.

34
Q

What is internal validity?

A

The degree to which asymmetrical causal relationships are consistently represented in the measure. eg. A construct represented by some item measurements consistently causes another construct represented by other item measurements.

35
Q

Why does naturalistic design have good external validity but poor, or sometimes no, internal validity?

A

Naturalistic designs have good EV but bad IV because the natural world is so unpredictable, it’s unlikely that the asymmetrical causal relationships will stay constant.

36
Q

Why do experimental designs have good IV (internal validity) but bad EV (external validity)?

A

Experimental designs have good IV because the conditions in an experimental design are controlled, and bad EV because the conditions in an experimental design are unlikely to be mirrored in non-experimental conditions.

37
Q

What is content validity?

A

Content validity is reflected by scores or test outputs representing the content area or domain which they claim to.

38
Q

Sampling bias, cluster bias, systematic error (accuracy bias) and the ceiling/floor effects represent what type of validity?

A

Sampling bias, cluster bias, systematic error (accuracy bias) and the ceiling/floor effects represent content validity

39
Q

What are ceiling/floor effects and what might be causing them?

A

Ceiling/floor effects occur when responses are clustered at either the bottom or the top of the score values. One possible cause could be that the scale is too narrow.

40
Q
  • The degree of consistency or stability of measurement scores across time or context
  • The degree of absence of construct fluctuations that are unaccounted by the measurement’s scores (output)
  • The degree of random error (unreliability) in the observed variability (changes) of measurement scores

These all represent what quality of psychometric tests?

A

Reliability.

41
Q

Classical Test Theory (CTT), can be represented by the following equation: X = T + E

What does the equation mean?

A

CTT purports that people (objects/entities) have a true score (T) whereas measurements (and individuals) have errors (E).

An observed score (X), is the sum of the true score + error

42
Q

The equation X = T + E can also be expressed in terms of variance (σ^2), in the following equation: σ^2X = σ^2T + σ^2E.

What does this mean?

A

The variance of the observed scores (σ^2X) equals the variance of the true scores (σ^2T) plus the variance of the errors (σ^2E).

43
Q

In classical test theory (CTT), the theoretical reliability of a measurement can be expressed via the reliability index (r) as r = σ^2T / σ^2X (reliability = variance of the true score / variance of the observed score).

If r=0.9, how much of your observed score is representative of your true score, and how much is error?

A

If r = 0.9, 90% of the variance in your observed score is true-score variance and 10% is error variance
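The reliability index can be illustrated by simulating X = T + E with made-up variances (true-score SD 15, error SD 5, so the nominal r is 225 / 250 = 0.9):

```python
# Sketch of the CTT reliability index r = var(T) / var(X),
# using simulated (invented) true scores and random errors.
import random
from statistics import pvariance

random.seed(1)
true = [random.gauss(100, 15) for _ in range(10_000)]  # true scores T
error = [random.gauss(0, 5) for _ in range(10_000)]    # random error E
observed = [t + e for t, e in zip(true, error)]        # X = T + E

# T and E are independent here, so var(X) is approximately var(T) + var(E)
r = pvariance(true) / pvariance(observed)
print(round(r, 2))  # close to the nominal 225 / 250 = 0.9
```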

44
Q

Idiosyncratic and generic are 2 sources of individual measurement error. Elaborate on these.

A

Idiosyncratic = language/ mood/ fatigue/ memory

Generic = lying to yourself or others (faking: desirability, impression formation and self-deception)

45
Q

The assumption that error (eg test-retest) is random 'noise' is a questionable assumption in psychological testing because…. This is a problem with _____ testing in CTT

A

Psychological circumstances may have changed between the initial and following tests. This is a problem with reliability testing in CTT

46
Q

The multiplicity problem with reliability testing in classical test theory (CTT) refers to:

A

Interactions (eg biological interactions with psychological well-being) are multiplicative. However, CTT asserts that all errors (variation) can be added to the 'true score', rather than multiplied.

47
Q

The concept of a true score relies on the assumption that a (psychological) construct exists.

This is an example of an issue with which theory?

A

CTT (classical test theory)

48
Q

Data metrics: variables create columns, people create rows. This can be transposed into Q metrics, so that:

Therefore, analysing individual sources of measurement error is also known as _ analysis

A

In Q metrics, people create the columns and variables create the rows, hence Q analysis

49
Q

Metrics are always which type of data?

A

Metrics are always quantitative data

50
Q

Endophenotypic error is another word for which type of idiosyncratic individual measurement error in psychometric testing?

A

Endophenotypic error is another word for memory error

51
Q

Acquiescence and nay-saying bias, 2 of the generic individual sources of measurement error exposed using Q analysis, refer to:

A

Acquiescence bias refers to agreeing with what is suggested in the test, whereas nay-saying bias is the opposite, disputing everything suggested.

52
Q

Random responses and mid-point or extreme responses (floor/ ceiling) are all examples of which type of Q-analysis source of individual measurement error?

A

They are all examples of generic sources of individual measurement error

53
Q

Content-related, format-related and administration-related errors are all types of which type of measurement error (also known as R-analysis)?

A

Content-related, format-related and administration-related errors are all types of item/scale measurement error (rather than individual/response error)

54
Q

The degree of homogeneity in responses to scale-items which measure the same construct reflects which type of reliability?

A

Internal consistency reliability

55
Q

Cronbach’s alpha coefficient is a _____ coefficient.

rα= 0 means that only ____ variance is present in the measurement, whereas rα= 1 means that only true scores are present

A

Cronbach’s alpha coefficient is a reliability coefficient.

rα= 0 means that only error variance is present in the measurement, whereas rα= 1 means that only true scores are present
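As a sketch, alpha can be computed by hand from its standard formula, alpha = (k / (k - 1)) × (1 - sum of item variances / variance of total scores); the 3-item response data below are invented:

```python
# Cronbach's alpha for a small invented 3-item scale.
from statistics import pvariance

items = [          # rows = respondents, columns = 3 scale items
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 4, 3],
]
k = len(items[0])
item_vars = [pvariance([row[i] for row in items]) for i in range(k)]
total_var = pvariance([sum(row) for row in items])

alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))  # → 0.99: the invented items move together almost perfectly
```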

56
Q

Test-retest reliability is also known as ____ reliability

A

Test-retest reliability is also known as temporal reliability

57
Q

Dropouts / non-response rates (bias)
Temporal instability of constructs
Optimal time-interval

These are all issues which could affect the ____ reliability coefficient

A

temporal

58
Q

Inter-rater reliability, which is measured using Cohen's Kappa coefficient, refers to which process involving experts?

A

Inter-rater reliability (Cohen’s Kappa Coefficient) refers to the process of at least 2 experts rating the reliability of test measures. Their responses are then compared.
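A sketch of Cohen's Kappa for two hypothetical raters making yes/no judgements; Kappa corrects the observed agreement for the agreement expected by chance:

```python
# Cohen's Kappa for two raters' categorical judgements (invented ratings).
from collections import Counter

rater_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "no"]

n = len(rater_a)
p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # raw agreement

# Chance agreement: product of the raters' marginal proportions per category
count_a, count_b = Counter(rater_a), Counter(rater_b)
p_chance = sum((count_a[c] / n) * (count_b[c] / n)
               for c in set(rater_a) | set(rater_b))

kappa = (p_observed - p_chance) / (1 - p_chance)
print(kappa)  # → 0.5 for these invented ratings
```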

59
Q

An index of the average degree of random error is known as _ _ _

A

SEM (standard error of measurement) is a measurement of the average degree of random error

60
Q

± 1 SEM, ± 2 SEM, ± 3 SEM refer to standard error measurements from the observed (mean) score, which represent confidence intervals that the true score sits within those SEMs. What are the intervals?

A

There is a 68% chance that the true score lies within ± 1 SEM of the observed (mean) score, a 95% chance that the true score lies within ± 2 SEMs from the observed score, and a 99% chance that the true score lies within ± 3 SEMs from the observed score.
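These bands can be sketched using the standard CTT formula SEM = SD × √(1 − r), which is not stated in the cards above but follows from the reliability index; the numbers below are invented:

```python
# Sketch of SEM-based confidence bands around an observed score.
# SEM = SD * sqrt(1 - r) is the standard CTT formula (invented numbers below).
import math

sd = 15            # standard deviation of test scores (IQ-style scale)
reliability = 0.9  # reliability coefficient r
observed = 110     # one person's observed score

sem = sd * math.sqrt(1 - reliability)  # about 4.74 here

for k, pct in [(1, 68), (2, 95), (3, 99)]:
    lo, hi = observed - k * sem, observed + k * sem
    print(f"~{pct}% chance the true score lies in [{lo:.1f}, {hi:.1f}]")
```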

61
Q

What is the difference between validity and reliability?

A

Validity represents the degree to which scores represent the variable being measured, whereas reliability refers to the consistency of that measurement across time (temporal), test items and researchers.