Week 8: Interpretation of test results Flashcards by Myrthe Bouquet

A person’s raw score has little meaning without which two things?

A comparison to a normative sample
A method for interpreting the meaning of the comparison

What do we call a measure that can be used to compare values from different data sets?

Relative standing

What is it called when interpreting and communicating test performance depends on having an appropriate comparative sample and a common ‘language’ of descriptions?

Rule of thumb

What is a positively skewed distribution?

More scores fall below the mean compared to above the mean (left side higher and right side lower)

What is a negatively skewed distribution?

More scores fall above the mean compared to below the mean (left side lower and right side higher)

What happens when scores from a normative sample are not normally distributed (2)?

The mean and median are not identical
Z-scores will not accurately translate into sample percentile rank values
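Both consequences can be illustrated with a minimal sketch. The sample below is made up and deliberately positively skewed; the function names are hypothetical helpers, not part of any test manual.

```python
from statistics import NormalDist, mean, median, pstdev

# Hypothetical, positively skewed normative sample (illustrative values only)
scores = [1, 1, 2, 2, 2, 3, 3, 4, 5, 12]

def z_score(x, sample):
    """Standardise x against the sample mean and standard deviation."""
    return (x - mean(sample)) / pstdev(sample)

def normal_percentile(x, sample):
    """Percentile implied by the z-score if the sample were normally distributed."""
    return 100 * NormalDist().cdf(z_score(x, sample))

def empirical_percentile(x, sample):
    """Percentage of sample scores at or below x."""
    return 100 * sum(s <= x for s in sample) / len(sample)

print(mean(scores), median(scores))  # mean and median differ
print(normal_percentile(3, scores), empirical_percentile(3, scores))
```

In this sample a raw score of 3 lies below the mean, so its z-score implies a percentile under 50, yet 70% of the sample scores are at or below it.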

A … sample size will produce a … normal distribution, but only if the underlying characteristic is normally distributed in the population

Larger, more

When can a truncated distribution (where the starting point is not 0) occur (2)?

When scores are restricted at one side of the distribution
When specific subgroups are purposefully excluded from inclusion in the normative sample

A truncated distribution of scores can lead to (3)?

Identification of normal individuals as low functioning
Difficulty estimating the severity of impaired performance
An increase in number of persons identified as impaired

When is it useful to compare scores between tests (2)?

The raw score distributions for tests that are being compared are approximately normal in the population
The scores that are being compared are derived from similar samples

When comparing test scores, it is important to consider the … of two measures and their …

Reliability, intercorrelation
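One way to see why both quantities matter is the classical formula for the reliability of a difference score. The sketch below assumes the two tests have equal score variances; the function name and the example values are hypothetical.

```python
def difference_score_reliability(r_xx, r_yy, r_xy):
    """Reliability of the difference X - Y under classical test theory.

    r_xx, r_yy: reliabilities of the two measures
    r_xy: correlation between the two measures
    Assumes equal score variances for X and Y.
    """
    if r_xy >= 1:
        raise ValueError("intercorrelation must be below 1")
    return ((r_xx + r_yy) / 2 - r_xy) / (1 - r_xy)

# Two fairly reliable tests that correlate highly with each other
print(difference_score_reliability(0.9, 0.8, 0.7))
```

With reliabilities of 0.9 and 0.8 and an intercorrelation of 0.7, the difference score has a reliability of only about 0.5, so a discrepancy between the two tests is much less dependable than either score alone.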

The relationship between normative scores and percentiles is linear/non-linear

Non-linear
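A short sketch of the non-linearity, assuming normally distributed scores and using the standard normal distribution from Python's standard library (the helper name is hypothetical):

```python
from statistics import NormalDist

def percentile_from_z(z):
    """Percentile rank of a z-score under a standard normal distribution."""
    return 100 * NormalDist().cdf(z)

# Equal one-SD steps in z correspond to unequal steps in percentile rank
for z in (0, 1, 2, 3):
    print(z, round(percentile_from_z(z), 1))
# prints 50.0, 84.1, 97.7, 99.9 for z = 0, 1, 2, 3
```

Moving from z = 0 to z = 1 gains about 34 percentile points, whereas moving from z = 2 to z = 3 gains only about 2.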

What is defined as the presence of truncated tails in the context of limitations in range of item difficulty?

Ceiling and floor effects

What does a high floor in scores mean?

A large proportion of the examinees obtain raw scores at or near the lowest possible score

What indicates a high floor in test scores?

That the test lacks a sufficient number and range of easier items

Floor and ceiling effects can lead to?

Misinterpretation of results

What does extrapolation entail?

The action of estimating or concluding something by assuming that existing trends will continue. This technique is often used when norms fall short in terms of range

Comparison of performance across tests is affected by: (5)

Measurement error
Score magnitude
Extreme scores
Ceiling and floor effects
Extrapolation/interpolation of derived scores

It is important to carefully consider how to interpret isolated low scores. The likelihood of obtaining low scores increases when (3)?

The number of tests increases
The cut-off for defining low scores becomes more liberal
The level of baseline cognitive functioning is lower
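The first two factors can be sketched with simple probability. The example assumes the tests are statistically independent, which real test batteries are not (scores are usually correlated), so treat the numbers as a rough illustration only. The function name is hypothetical, and the third factor (baseline functioning) is not modelled here.

```python
def prob_at_least_one_low(n_tests, p_low):
    """Chance of one or more low scores across n_tests independent tests.

    p_low: probability that a healthy person falls below the cut-off
           on a single test (e.g. 0.05 for a 5th-percentile cut-off).
    Assumes independence between tests (a simplification).
    """
    return 1 - (1 - p_low) ** n_tests

for n in (1, 5, 20):
    # Compare a strict (5th percentile) and a more liberal (16th percentile) cut-off
    print(n, round(prob_at_least_one_low(n, 0.05), 2),
          round(prob_at_least_one_low(n, 0.16), 2))
```

Under these assumptions, both adding tests and loosening the cut-off raise the chance that a healthy person shows at least one isolated low score.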

The degree of agreement between different people that are observing or assessing the same thing = (Inter-rater reliability/Test-retest reliability/Parallel-forms reliability/Internal consistency reliability)

Inter-rater reliability

Measures the consistency of results when the same measurement is repeated at a different point in time = (Inter-rater reliability/Test-retest reliability/Parallel-forms reliability/Internal consistency reliability)

Test-retest reliability

Measures the correlation between two equivalent versions of a test. This can help to avoid practice effects, but the versions should be equivalent = (Inter-rater reliability/Test-retest reliability/Parallel-forms reliability/Internal consistency reliability)

Parallel-forms reliability

The correlation between items within a test that are meant to measure the same construct = (Inter-rater reliability/Test-retest reliability/Parallel-forms reliability/Internal consistency reliability)

Internal consistency reliability
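Internal consistency is often summarised with Cronbach's alpha. The card does not name a specific index, so the choice of alpha here is an assumption, and the function name and data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    items: list of per-item score lists, each with one score per respondent
           (all lists must have the same length and respondent order).
    """
    k = len(items)
    # Total score per respondent across all items
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Three hypothetical items answered by four respondents
items = [[1, 2, 3, 4], [2, 2, 3, 5], [1, 3, 3, 4]]
print(cronbach_alpha(items))
```

Items that rise and fall together across respondents push alpha towards 1; unrelated items push it towards 0.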

What is validity?

Validity is the degree to which a test measures what it was intended to measure

What is reliability?

The consistency of a measure (whether the results can be reproduced under the same conditions)

What is sensitivity?

Sensitivity is the probability of a positive test, given that the person is affected

What is specificity?

The probability of a negative test, given that a person is healthy
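Both definitions can be computed directly from classification counts. A minimal sketch with made-up counts (the function names and numbers are hypothetical):

```python
def sensitivity(true_pos, false_neg):
    """P(positive test | person is affected)."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """P(negative test | person is healthy)."""
    return true_neg / (true_neg + false_pos)

# Hypothetical study: 50 affected and 100 healthy people
print(sensitivity(true_pos=45, false_neg=5))    # 45 of 50 affected test positive
print(specificity(true_neg=80, false_pos=20))   # 80 of 100 healthy test negative
```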

What does a p-value NOT measure (3)?

It does not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone
It does not, by itself, provide a good measure of evidence regarding a model or hypothesis
It does not measure the size of an effect or the importance of a result

What does a p-value measure?

The probability, under a specified statistical model, that a statistical summary of the data would be equal to or more extreme than its observed value
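As a worked illustration, the sketch below computes a one-sided exact binomial p-value. The scenario (9 correct answers out of 10 two-choice items, with guessing as the assumed model) and the function name are hypothetical.

```python
from math import comb

def one_sided_binomial_p(successes, trials, p_null=0.5):
    """P(X >= successes) when X ~ Binomial(trials, p_null).

    This is the probability, under the assumed model (p_null), of a result
    at least as extreme as the one observed.
    """
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# 9 or more correct out of 10 if the person were purely guessing
print(one_sided_binomial_p(9, 10))  # 11/1024, about 0.011
```

The result describes how surprising the data are under the guessing model; it is not the probability that the person was guessing.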

What is circular analysis?

Circular analysis is any form of analysis that retrospectively selects features of the data to characterise the dependent variables, which distorts the resulting statistical test. In other words, the analysis is based on data that were selected for showing the effect of interest or a related effect

What is p-hacking?

The misreporting of true effect sizes in published studies. It occurs when researchers try out several statistical analyses and then selectively report those that produce significant results

What is a spurious correlation?

Occurs when two factors appear causally related to one another but are not. Spurious correlations most commonly arise if one or several outliers are present for one of the two variables

Is the test fully representative of what it aims to measure, refers to which validity?

Content validity

Evaluates how accurately a test measures the outcome it was designed to measure, for now or in the future, refers to?

Criterion-related validity

Which two types of criterion-related validity are there?

Concurrent validity (how well test scores agree with a criterion measured at the same time) and predictive validity (how well test scores predict some event or outcome in the future)

Does the test measure the concept that it is intended to measure, refers to which validity?

Construct validity

(36 cards)