Wk 2 - Standardisation Flashcards

1
Q

Why do we bother engaging in the scientific method to study psychology (x2),
And what does human measurement have to do with this? (x1)

A

Because ‘natural psych’/intuition is often wrong
Need to verify which is right/wrong
Measurement is fundamental to all science, therefore human measurement is fundamental to psych

2
Q

Some twat collars you at a party and says psychology is crap. Generate your own withering put down to make them feel inconsequential and stupid, using three examples of counter-intuitive psychology findings.

A

Marilyn vos Savant (famously high IQ): said that once you have chosen one envelope and the presenter opens another, you have a better chance of winning if you swap (intuition says it should be 50/50, but it isn’t, because the presenter always knows where the cheque is and won’t open that one)
Milgram’s obedience experiments: psych students said 1/100 would zap other to death, psychiatrists said 1/1000 - reality was 7/10
Zechmeister/Shaughnessy’s massed vs distributed practice: most believe the former will give better results, but it’s actually the latter
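
The envelope-swapping result is easier to believe once every case is counted. The sketch below assumes three envelopes with one cheque (the card doesn't give the number) and a presenter who only ever opens an empty envelope that isn't yours. The function name is made up for illustration.

```python
from fractions import Fraction

def envelope_odds():
    """Exact win probabilities for staying vs swapping (assumes 3 envelopes)."""
    envelopes = range(3)
    stay = Fraction(0)
    swap = Fraction(0)
    for prize in envelopes:        # where the cheque is (equally likely)
        for pick in envelopes:     # your first choice (equally likely)
            # Presenter never opens your envelope or the one with the cheque.
            openable = [e for e in envelopes if e != pick and e != prize]
            for opened in openable:
                weight = Fraction(1, 9) / len(openable)
                # The single envelope you could swap to.
                other = next(e for e in envelopes if e not in (pick, opened))
                if pick == prize:
                    stay += weight
                if other == prize:
                    swap += weight
    return stay, swap

stay, swap = envelope_odds()
print(stay, swap)  # 1/3 2/3
```

Swapping wins twice as often as staying, because the presenter's choice of envelope carries information.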

3
Q

Define psychometrics (x1)

A

The science of human measurement

4
Q

Define ‘psychological test’ (x1)

A

a measuring device or procedure designed to measure psychology-related variables

5
Q

What are the four key assumptions behind psychological tests?

A

People differ on traits – e.g. if most Ps get similar scores in test, test is meaningless, and/or perhaps construct doesn’t exist
Traits are measurable – but perhaps we don’t yet have good tool for measuring it
Traits are relatively stable over time - other factors, e.g. fatigue, vary, but many psych things, e.g. IQ, are pretty stable
Traits relate to actual behaviour – no point if it doesn’t predict anything about real world behaviour

6
Q

Describe how you might go about finding information on a particular psychological test (x4)

A

Look up library website for info on e.g. Mental Measurements Yearbook (Buros) and EST collection
Search library for books on individual tests, or texts on psych testing
Academic journals
Publishers’ catalogues – found via Buros listing

7
Q

What is a raw score? (x2)

A

The value given in response to a test -

Before any kind of processing that determines meaning

8
Q

What are the advantages of standardising a measure?

A

Allows interpretation - was score high or low?
Gives ability for contextualised interpretation - standardised scores can be compared against appropriate sample of other people

9
Q

Give some examples of the sort of populations that could make up a standardisation sample under different contexts (x 3)

A

Can’t use Mark’s obs chart for kids - was designed for adults, with e.g. different blood pressure norms
May also compare to similar age, nation, state, global, uni, school
Or specific categories, e.g. Australian drivers, occupation, females, fathers

10
Q

Explain what a norm and a normative sample are (x2)

A

The norm is the distribution of scores obtained from a large number of test Ps
The normative sample is that group of Ps - the reference against which we standardise our raw scores

11
Q

What issues might we need to consider when recruiting a standardisation sample? (x4)

A

Stability doesn’t guarantee representativeness!
May also need same:
Male/female ratio, age/socio-economic/educational distribution, pattern of geographic origins, etc.
So that it is representative of the population of interest

12
Q

What’s the advantage of having a big standardisation sample? (x2)

A

In order to give:
Stability - outliers get swamped, so mean of any subsample doesn’t jump around all over the place
Representativeness - mean and SD more likely to be accurate

13
Q

Why is the normal curve useful in psychology? (x3)

A

Because many psych test scores approximate a normal curve,
Meaning we need just the mean and SD to define any point on the curve (see how a score compares with all others), and
Allowing powerful parametric tests, over crude non-parametric ones

14
Q

By what other names might you hear the normal curve described? (x2)

A

Bell curve

Laplace-Gauss curve

15
Q

What is the general definition of cognitive impairment – and what is notable about the properties of this definition?

A

I.Q. of 2 s.d. or more below the mean (= 100, s.d. = 15; so it’s an I.Q. of less than 70).
Which makes cognitive impairment purely relative - it’s a comparison with the rest of the population
ie if population starts to score higher, those at the lower end have to try harder to avoid classification of impairment

16
Q

Why might we like the distribution of a measure to be normal? (x2)

A

Can do more powerful statistical tests, e.g. ANOVA, t test

Makes scales more comparable

17
Q

What are the different strategies we can use to make a skewed distribution normal? (x2)

A

Redesign test items - e.g. to remove floor/ceiling effects (e.g. use precise wording to change from all drivers saying they drive fast, to differing proportions declaring different driving speeds)
Use non-linear transformations - square roots, logarithms
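
To see a non-linear transformation pulling in a long right tail, here is a small sketch. The scores are made up, and `skewness` is a plain population-skewness helper written for this example, not a library function.

```python
import math

def skewness(values):
    """Population skewness: mean cubed deviation divided by SD cubed."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)

raw = [1, 2, 4, 8, 16, 32, 64]            # made-up, positively skewed scores
rooted = [math.sqrt(v) for v in raw]      # square-root transform
logged = [math.log(v) for v in raw]       # log transform

# Skew shrinks from raw to square root to log for this data set.
print(round(skewness(raw), 2), round(skewness(rooted), 2), round(skewness(logged), 2))
```

The log works best here only because these particular scores double each time; with real data you would try a transform and re-check the histogram.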

18
Q

What are standard scores and why do we use them? (x5)

A
z scores (also called normal score)
Linear transformations that are
Used to anchor mean and SD, so we can:
Know meaning of score, without knowledge of original scale, and
Compare different scales
19
Q

What are the means and standard deviations of a z score?

A

Mean = 0

SD = 1

20
Q

What are the means and standard deviations of a T score?

A

Mean = 50

SD = 10

21
Q

What are the means and standard deviations of an IQ score?

A

Mean = 100

SD = 15

22
Q

How do you calculate a z score? (x1)

A

Raw score, minus the mean, divided by the standard deviation
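
As a minimal sketch of that calculation (the function name and the example numbers are illustrative only):

```python
def z_score(raw, mean, sd):
    """Raw score minus the norm-group mean, divided by the norm-group SD."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (raw - mean) / sd

print(z_score(130, 100, 15))  # 2.0
```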

23
Q

Describe what linear transforms are (x1)

Plus two examples

A

Those that don’t change the shape of the histogram

z and T scores

24
Q

Describe what non-linear transforms are (x2)

Plus two examples

A

Those that change the shape of the histogram -
Stretch some areas more than others
Square roots and logarithms

25
Q

When might you want to use linear or non-linear transformations over the other? (x2)

A

If your data is normal, stick to linear to keep it that way/allow powerful parametric tests
If skewed, non-linear to try and correct (and avoid using non-parametric tests)

26
Q

What are the advantages and disadvantages of using percentile ranks? (x2 and x4)

A

Easy for laypeople to understand,
Easy to calculate, but…
Don’t confuse with test percentage – it’s relative to everyone else, not absolute
It’s not an equal-interval scale: percentiles close to 50 each cover a small range of raw scores, while percentiles in the tails each cover a much wider range
ie the 99th percentile only starts at about 2.33 SD above the mean, whereas roughly the 50th to 98th percentiles all fall within the first 2 SD
So, eg, maybe losing data resolution when rounding in the middle

27
Q

What are the properties of the stanine scale? (x3)

Where is it commonly used? (x1)

A

It’s the standard curve, divided into nine
Each division is 0.5 SD wide (except stanines 1 and 9, which take in the tails)
Mean is Stanine 5, which covers -.25 to .25 SD
In school based tests
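
A rough sketch of turning a z score into a stanine, following the 0.5 SD bands above. The function name is made up, and putting a score that sits exactly on a boundary into the higher band is an assumption; real tests publish their own cut-offs.

```python
import math

def stanine(z):
    """Map a z score to a stanine (1-9); each middle band is 0.5 SD wide."""
    band = math.floor(2 * z + 5.5)   # stanine 5 covers -0.25 up to 0.25
    return max(1, min(9, band))      # stanines 1 and 9 absorb the tails

print(stanine(0), stanine(-0.5), stanine(3))  # 5 4 9
```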

28
Q

How can we create a narrative report from raw test scores (or vice versa)? (x1 plus e.g.)

A

We can convert numbers to words, or other way round, as long as we have a reliable strategy for doing so
eg HD = 7, D = 6

29
Q

What are the advantages and disadvantages of using a norm-referenced test (x3) versus a criterion-referenced (x5) test?

A

Norm-referenced:
Score protected from possibility that test may be easier/harder.
Yields good distribution (= discrimination between good/bad performers)
Score affected if norm changes (e.g. disadvantaged in a smart cohort)
Criterion-referenced:
Scores affected by difficulty
Not affected if the norm changes.
Can set absolute standards based on what people can actually do
More likely to get a skewed distribution (failure to discriminate between people).
Possible for everyone to fail or pass

30
Q

Name and describe 5 types of psychological tests

A

Mental abilities – eg IQ, memory, vocab, spatial ability
Achievement – specific things/specific domains; educational tests, course/competence assessments; e.g. can drive, are ready to become a doctor
Personality – no competencies, no correct answers, tapping into a trait; eg extroversion, sensation-seeking, driving speed, empathy
Interests and attitudes – eg vocational tests, social psych questionnaires
Neuropsych – also include other categories; eg memory, psychomotor co-ordination, abstract thinking, IQ, personality, education

31
Q

Why is random sampling not necessarily an effective method of gaining a representative normative/standardisation sample? (x4)

A

Need huge numbers to be confident -
Could be easily skewed by eg recruitment techniques
Eg older drivers for study – respondents tend to be high functioning/education/socioeconomic status, and the less able worry that you might take their licence
These kinds of issues skew the sample, and need dealing with

32
Q

What are two solutions that facilitate recruitment of a representative standardisation/normative sample?

A

Stratified cluster sampling – tweaking recruitment strategies, targeting the ratios you want
Weighting would change the overall resulting mean to be more representative
Eg if sample has 40% females but the population is 50% female – counting each female as 1.25 people/each male as 0.83
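
A hedged sketch of how such weights can be worked out: each group's weight is its population share divided by its sample share. The 50% female population figure and the group means are assumptions made up for the example, and the function names are illustrative.

```python
def group_weights(sample_share, population_share):
    """Weight per person in each group = population share / sample share."""
    return {g: population_share[g] / sample_share[g] for g in sample_share}

def weighted_mean(group_means, sample_share, weights):
    """Overall mean after re-weighting each group's contribution."""
    total = sum(sample_share[g] * weights[g] for g in group_means)
    return sum(group_means[g] * sample_share[g] * weights[g]
               for g in group_means) / total

sample = {"female": 0.40, "male": 0.60}
population = {"female": 0.50, "male": 0.50}   # assumed target, e.g. from census data
weights = group_weights(sample, population)
print({g: round(w, 2) for g, w in weights.items()})  # {'female': 1.25, 'male': 0.83}

means = {"female": 10.0, "male": 20.0}        # made-up group means
print(round(weighted_mean(means, sample, weights), 2))  # 15.0
```

Unweighted, the same sample would give a mean of 16, pulled towards the over-represented group.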

33
Q

Where would you acquire baseline population data for a standardisation/normative sample? (x1)

A

e.g. from census data

34
Q

What factors contribute to an increasingly normal curve? (x2)

A

Larger samples

Wider range of things measured

35
Q

Despite controversies/arguments about the line drawn to define cognitive impairment, why is such definition still useful? (x3)

A

Important in eg educational psych –
ie in Qld schools, a test score of less than 70 immediately opens access to many special resources and programs,
So much debate about those who fall on the cutoff line

36
Q

What do we know if we know scores are normally distributed? (x4)
And what happens if the distribution starts to skew? (x1)

A
Mean = median = mode therefore 50% of people are below/above the mean
68% of scores +/- 1 s.d. around mean
95% of scores +/- 2 s.d. 
Tails are 2 to 3 s.d. from the mean
Mean, median and mode start to separate
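
The 68% and 95% figures can be checked from the normal curve itself. This sketch uses the error function from Python's `math` module; `share_within` is a name invented for the example.

```python
import math

def share_within(k_sd):
    """Proportion of a normal distribution within k_sd SDs either side of the mean."""
    return math.erf(k_sd / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(share_within(k), 3))
```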
37
Q

How do you transform a z-score into a T score or IQ score (or other standard score, defined by the same method)? (x1)

A

z x SD + mean (using the SD and mean of the scale you are converting to)
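
The same rule as a small sketch (function name is illustrative; the means and SDs are the T and IQ values from the earlier cards):

```python
def from_z(z, new_mean, new_sd):
    """Convert a z score to any standard score defined by a mean and SD."""
    return z * new_sd + new_mean

print(from_z(2, 50, 10))     # 70 (T score)
print(from_z(-2, 100, 15))   # 70 (IQ score)
```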

38
Q

Why might you use a T score over a z? (x3)

A

Mostly aesthetic decision -
Removes any negatives, and
Less likely to get decimals/fraction

39
Q

What are two classic psych tests that utilise T scores?

A

Minnesota Multiphasic Personality Inventory (MMPI)

Stroop test

40
Q

Define percentile rank (x1)

How to calculate? (x2)

A

The percentage of people in the norm group falling below a certain raw score
Calculate z, and look up in table of standard normal distribution
(if z is positive, it’s the larger portion, if negative it’s the smaller)
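
Instead of a printed table, the lookup can be sketched with the standard normal CDF. This assumes the scores really are normally distributed; `percentile_rank` is a made-up helper name.

```python
import math

def percentile_rank(z):
    """Percentage of a normal norm group falling below a given z score."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

for z in (-2, 0, 1, 1.68):
    print(z, round(percentile_rank(z), 1))
```

A positive z always lands above 50 and a negative z below it, which matches the larger/smaller portion rule.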

41
Q

Describe positive skew (x2)

A

It has a longer tail on the right

And more scores grouped at the left/low end

42
Q

Describe negative skew (x2)

A

It has a longer tail on the left

And more scores grouped on the right/higher end

43
Q

Define norm- and criterion-referenced tests (x1 each), and give examples of each (x4 and x 3)

A

Norm-referenced is an individual score calculated relative to other people’s
IQ tests, personality tests, neuropsych measures, typical attitudinal measures
Criterion-referenced is not, it’s absolute – ie your exam score is irrespective of how other people do
UQ exams, driving licence test, typical competency/skills tests

44
Q

What is the difference between older and more recent 3020 final results distributions? (x2)
And three possible reasons for change?

A

First is positively skewed – very few 2/3s, Mode = 4, and gradually decreasing numbers towards 7
While second is closer to normal, Mean = 5
Students were smarter OR
Examinations/assessments were easier OR
Teaching was better (OR a combination)

45
Q

True or false?
When percentage scores are converted to a grade score at the end of a course, this is an example of a linear transformation?
And why?

A

False

Because the grading bands aren’t equal intervals - some contain 15%, others only 10%

46
Q

True or false?
A test designed to yield info about whether or not a student has mastered the ability to multiply two-digit numbers to a specified level of competency could be described as criterion-referenced
And why?

A

True

Because it’s being compared to a criterion, rather than a reference sample

47
Q

True or false?
If a distribution has a substantial negative skew, then we need to transform it before using parametric statistical tests on the data
And Why?

A

True

Because otherwise it violates the assumption of a normal distribution that is required for parametric tests to be valid

48
Q

True or false?

If a distribution is normally distributed, then we cannot use non-parametric tests on the data

A

False

We could, but wouldn’t because they’re not as powerful

49
Q

A man completes an intelligence test and his raw score is converted into a z score of –2. If his raw score were instead converted into an IQ score, what would it be?

A

70

50
Q

If a child scores 4 on an age-normed test that uses a Stanine scale (higher scores = better reading), then what does this mean? (x1)
And why? (x1)

A

They are worse than the average child their own age.

A stanine has 9 categories, where 5 includes the average (assuming a normal distribution)

51
Q

True or false?

The significance test you get with a correlation coefficient tells you whether it is significantly DIFFERENT from zero.

A

True

52
Q

True or false, and why?
The greater the degree of scatter in a scatterplot between two variables (both with Gaussian distributions), the larger the correlation coefficient between those two variables will be.

A

False

Because the greater the scatter, the SMALLER the correlation coefficient will be.

53
Q

True or false, and why? (x2)

The confidence interval associated with a correlation coefficient decreases as the sample size increases.

A

True
Because the estimate becomes more accurate the more people you sample, so the confidence interval is smaller with larger samples
(i.e. there’s a smaller margin of error in our estimate of the population correlation)
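
To see the interval shrink, here is a sketch using the Fisher z approximation for a 95% confidence interval. The card doesn't name a method, so this is one common choice rather than the course's own, and the function name and the r = 0.5 example are made up.

```python
import math

def correlation_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation via the Fisher z transformation."""
    if n <= 3:
        raise ValueError("need more than 3 people for this approximation")
    centre = math.atanh(r)                 # Fisher z of the sample correlation
    half = z_crit / math.sqrt(n - 3)       # margin of error on the z scale
    return math.tanh(centre - half), math.tanh(centre + half)

for n in (20, 100, 500):
    low, high = correlation_ci(0.5, n)
    print(n, round(low, 2), round(high, 2))
```

The interval narrows as n grows, even though the sample correlation stays at 0.5.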

54
Q

True or false, and why? (x2)
If we’re expecting a certain population correlation to be large then there would never be any point in sampling more than a small number of people when we attempt to estimate it

A

False
Even if expecting a large correlation coefficient, a larger sample will allow us to estimate its magnitude more accurately
(even though we might not need the larger sample to determine whether the correlation was significantly greater than zero)

55
Q

True or false, and why? (x3)
A seven-year-old boy completes a test of reading comprehension and his raw score is converted into a z score of +1.68 compared with other children his own age (higher score = better comprehension). This means he is performing within the middle 68% of children (assuming a normal distribution)

A

False
About two thirds (68%) of people fit within plus or minus one SD.
So, to be within the middle 68% of the population, the boy would have to be somewhere between -1 and +1 SD from the mean.
If his z score is over 1, this can’t be true

56
Q

True or false and why? (x2)
A six-year-old girl completes a test of reading comprehension and her raw score is converted into a z score of +2 compared with other children her own age (higher score = better comprehension). This means her T score on this test would be 70.

A

True
A z score of +2 means she is 2 standard deviations above the mean.
Given a T score has a mean of 50 and a SD of 10, this means her T score will be 50 + 10 + 10 = 70.

57
Q

True or false and why? (x2)

In psychology, raw scores must always be converted into standard scores for ease of interpretation.

A

False
Not always necessary to convert raw scores into standardised scores,
for example when the raw scores are already in an interpretable form (e.g. reaction times)

58
Q

True or false and why? (x2)
If a blood test used to diagnose a particular disease can consistently produce the same diagnosis across multiple patients, then we would consider the test to be valid.

A

False
The ability to produce a CONSISTENT reading indicates the test has RELIABILITY but not necessarily VALIDITY
i.e. just because it is consistent doesn’t mean it is necessarily giving you the correct reading

59
Q

True or false and why? (x2)
If a validity coefficient is large enough to be of practical importance, then it is not necessary for it to be statistically significant.

A

False
If a correlation coefficient from a sample is not statistically significant then we can’t be sure the population correlation isn’t zero, no matter how big the coefficient is
(this might happen if there were, for example, three people in your sample).

60
Q

True or false and why? (x3)

Spearman’s Rho is a type of correlation coefficient appropriate for use on nominal data.

A

False
Spearman’s Rho is suitable for ordinal data not nominal data:
you can’t do a correlation on nominal data (the very concept is meaningless).
Instead you’d do a chi square test or similar.

61
Q

True or false and why? (x4)
In an art competition, there is one winner decided by the judges, one second place decided by the judges, and a separate “people’s choice” winner, decided on by the public. The rest of the entries are designated “losers”. This is an example of an ordinal scale.

A

False
To be an ordinal scale there has to be a clear rank order of the categories
In this example, it’s not explained where “people’s choice” would sit (is it better than second place as decided by the judges?)
Means you can’t put the categories into a rank order, so can’t analyse them as an ordinal scale
Therefore need to treat them as nominal

62
Q

True or false and why? (x2)
If someone has a positive z score on some measure then we can convert this into a percentile rank by referring to the column marked “smaller portion” on a typical standard normal distribution table.

A

False
If the z score is positive, you’d need to refer to the “larger portion” column
(the percentile rank MUST be greater than 50 if the z score is positive).

63
Q

True or false and why? (x2)
There will be more people in the 60% to 70% percentile rank range than in the 20% to 30% percentile rank range (assuming a normal distribution)

A

False
There would be the same number of people in each of these bands
(i.e. 10% of the sample in each case)