3: Foundations of Quantitative Measurement Flashcards
In the social sciences, quantitative and qualitative approaches are typically associated with what?
Deep philosophical differences in epistemology.
What is the classical test theory measurement model? What concepts does it underpin?
Observed score = true score + error.
Underpins concepts of reliability and validity.
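A minimal sketch of the model in symbols, assuming (as classical test theory does) that errors are random and uncorrelated with true scores, so reliability is the share of observed-score variance due to true scores:

```latex
X = T + E, \qquad \text{reliability} = \rho_{XX'} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}
```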
Levels of specificity in variables are predicated on constructs, operational definitions, and measures. Define each.
Construct: a psychological concept that is not directly observable. Has no tangible existence outside a person’s mind.
Operational definition: a clear, measurable definition of a construct based on theory. May capture only a portion of the entire construct.
Measures: a clearly defined set of procedures for obtaining scores on the construct of interest. Must be clear and precise enough to be replicated by other scientists.
What is construct validity?
How well you have translated the construct into a functioning, operating reality, i.e., whether you are measuring what you wish to measure.
In what way is it useful to think of construct validity?
As an umbrella term that encompasses all other forms of validity.
What is content validity? Provide an example. What can improve it?
Does the measure cover all aspects of the underlying construct?
E.g., a depression measure missing some of the DSM-5 diagnostic components may lack content validity.
Multiple measurements of the construct can improve content validity.
What is face validity? Provide an example.
The extent to which a measure ‘appears’ to measure the underlying construct.
E.g., the item “nervous” has face validity for measuring anxiety; the item “jealous” does not.
When studying older adults, some items for personality disorders may have poor _____ due to developmental changes.
Face validity.
What is criterion validity?
How well measure correlates with established “gold standard” measures of the same construct.
List the two subtypes of criterion validity. Provide examples.
Concurrent validity: “at the same time” (e.g., correlate a questionnaire with a clinical interview).
Predictive validity: “predicting the future” (e.g., does your anorexia measure predict future weight loss?).
Two concepts are specific to criterion validity for clinical psychology and medical diagnosis. What are they?
Sensitivity: how well it picks out people with the disorder (i.e., few false negatives).
Specificity: how well it avoids diagnosing healthy people with a disorder (i.e., few false positives).
How does one calculate sensitivity and specificity using signal detection theory?
Sensitivity = hits / (hits + misses)
Specificity = correct rejections / (correct rejections + false alarms)
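A minimal sketch in Python; the four counts are made-up numbers for illustration:

```python
# Sensitivity and specificity from signal-detection counts (illustrative data).
hits = 40                 # disorder present, test positive
misses = 10               # disorder present, test negative (false negatives)
false_alarms = 5          # disorder absent, test positive (false positives)
correct_rejections = 45   # disorder absent, test negative

sensitivity = hits / (hits + misses)                                     # 0.80
specificity = correct_rejections / (correct_rejections + false_alarms)  # 0.90
print(f"Sensitivity: {sensitivity:.2f}, Specificity: {specificity:.2f}")
```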
When using an ROC curve to find the cut-off score that best balances sensitivity and specificity, what indicates a good cut-off?
Larger values on the y-axis indicate better sensitivity (% of hits). The x-axis plots the false alarm rate (1 - specificity), so smaller values on the x-axis indicate better specificity (% of correct rejections).
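One common rule for choosing the cut-off (not named above, so treat it as an assumption) is Youden’s J, which picks the threshold maximizing sensitivity + specificity - 1. A sketch assuming scikit-learn, with invented scores and diagnoses:

```python
# Pick an ROC cut-off via Youden's J (tpr - fpr); the data are invented.
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])   # 1 = disorder present
scores = np.array([2, 3, 4, 5, 5, 6, 7, 8, 9, 10])  # questionnaire scores

fpr, tpr, thresholds = roc_curve(y_true, scores)  # fpr = 1 - specificity
best = thresholds[np.argmax(tpr - fpr)]           # threshold maximizing Youden's J
print(f"Cut-off with best sensitivity/specificity balance: {best}")
```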
As sensitivity increases, what happens to specificity?
Decreases.
Define convergent and discriminant validity.
Convergent validity: the measure correlates with other measures it should be related to.
Discriminant validity: the measure does not correlate with measures it should be unrelated to.
Unreliability is the amount of _____ in the measurement.
Error.
What is test-retest reliability? When is it generally more useful?
Is the measure consistent over time? Do scores stay more or less the same when measured repeatedly?
Useful for constructs that are theorized to be stable (e.g., personality traits) rather than transient states (e.g., fear).
Test-retest reliability is often assessed with what? What do higher values indicate?
Correlation coefficient. Higher values indicate higher test-retest reliability.
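A minimal sketch assuming SciPy; the two score vectors are invented:

```python
# Test-retest reliability as a Pearson correlation across two occasions.
from scipy.stats import pearsonr

time1 = [12, 15, 9, 20, 14, 18, 11, 16]   # scores at first administration
time2 = [13, 14, 10, 19, 15, 17, 12, 18]  # same people, retested later

r, p = pearsonr(time1, time2)
print(f"Test-retest r = {r:.2f}")  # higher r = more stable scores over time
```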
Describe the concept of internal consistency.
Used to assess a questionnaire with multiple items. Do all the items in the questionnaire more or less measure the same thing?
The statistic most commonly used to measure internal consistency is Cronbach’s alpha (α). Conceptually, it’s calculated as a function of what two things? How can you increase it?
The number of items and the average inter-item correlation.
Increase the number of items, or remove items that are very weakly correlated with the other items.
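A sketch of the usual computation, alpha from item variances and total-score variance, using a made-up respondents-by-items matrix:

```python
# Cronbach's alpha: alpha = k/(k-1) * (1 - sum(item variances) / total variance).
import numpy as np

items = np.array([   # rows = respondents, columns = questionnaire items (invented)
    [3, 4, 3, 5],
    [2, 2, 3, 2],
    [4, 5, 4, 4],
    [1, 2, 1, 2],
    [5, 4, 5, 5],
])

k = items.shape[1]
item_vars = items.var(axis=0, ddof=1)      # variance of each item
total_var = items.sum(axis=1).var(ddof=1)  # variance of the total score
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha: {alpha:.2f}")
```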
Describe the concept of inter-rater reliability. When is it used?
Two or more trained coders independently review the data and provide ratings. Ideally, ratings from all coders are similar.
Used when scores are derived from a trained coder looking at raw data.
What are the two statistics used to determine inter-rater reliability for nominal scales?
% agreement.
Cohen’s Kappa: more conservative, controls for agreement which occurs by chance alone.
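A sketch assuming scikit-learn for kappa; the two coders’ labels are invented:

```python
# Percent agreement and Cohen's kappa for two coders' nominal ratings.
from sklearn.metrics import cohen_kappa_score

coder1 = ["yes", "no", "yes", "yes", "no", "no", "yes", "no"]
coder2 = ["yes", "no", "yes", "no", "no", "no", "yes", "yes"]

agreement = sum(a == b for a, b in zip(coder1, coder2)) / len(coder1)
kappa = cohen_kappa_score(coder1, coder2)  # corrects for chance agreement
print(f"% agreement: {agreement:.2f}, Cohen's kappa: {kappa:.2f}")
```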
What are the two statistics used to determine inter-rater reliability for ordinal, interval, or ratio scales?
Pearson correlation.
Intraclass correlation: more conservative, more complex calculations for different situations.
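A minimal Pearson sketch with NumPy and made-up ratings from two hypothetical coders (an intraclass correlation would need a dedicated routine):

```python
# Inter-rater reliability for continuous ratings via Pearson r (invented data).
import numpy as np

rater_a = np.array([3.0, 4.5, 2.0, 5.0, 3.5, 4.0])
rater_b = np.array([3.5, 4.0, 2.5, 5.0, 3.0, 4.5])

r = np.corrcoef(rater_a, rater_b)[0, 1]
print(f"Pearson r between raters: {r:.2f}")
# Unlike Pearson r, absolute-agreement forms of the intraclass correlation also
# penalize systematic differences between the raters' means.
```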
What are the rules of thumb when evaluating reliability and validity stats that determine whether stats are:
- good
- acceptable
- marginal
- poor
Good: 0.80 (reliability); 0.50 (validity)
Acceptable: 0.70 (reliability); 0.30 (validity)
Marginal: 0.60 (reliability); 0.20 (validity)
Poor: 0.50 (reliability); 0.10 (validity)
What are the four types of basic hypotheses?
Descriptive: what is X like?
Descriptive/comparison: does group 1 differ from group 2?
Correlation: do X and Y covary?
Psychometric: is a measurement reliable and valid?
What are the three most common hypotheses in published psychology research?
Mediation: X leads to M (mediator), which in turn leads to Y.
Moderation: relationship between X and Y varies depending on the value of the moderator, M (see the sketch after this list).
Incremental validity: X1 predicts Y over and above another known predictor (X2).
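A sketch of the moderation test as an X-by-M interaction in OLS regression, assuming statsmodels and pandas, with simulated data (all variable names are hypothetical):

```python
# Moderation as an x-by-m interaction term in OLS (simulated data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
m = rng.normal(size=n)
y = 0.5 * x + 0.3 * m + 0.4 * x * m + rng.normal(size=n)  # interaction built in
df = pd.DataFrame({"x": x, "m": m, "y": y})

# "x * m" expands to x + m + x:m; a significant x:m coefficient = moderation.
fit = smf.ols("y ~ x * m", data=df).fit()
print(fit.summary())
# Incremental validity would instead compare "y ~ x2" against "y ~ x2 + x1".
```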