Measurement process and measuring behaviour Flashcards
What is Measurement?
Measurement is the assignment of values to outcomes
What are the 3 measurement principles?
- An Outcome variable belongs to one of four levels
- The qualities of one level are also characteristic of the next level
- The higher the level, the more precise the measurement
Why are levels of measurement important?
Your IV and DV each need to be defined at one of the four levels. Then you can determine the method by which you will measure them. Every variable studied must be operationally defined.
What are the 4 levels of measurement? Explain each
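- Nominal (categories with no inherent order)
- Ordinal (categories that can be ranked, but the gaps between ranks are not necessarily equal)
- Interval (equal intervals between values, but no true zero)
- Ratio (equal intervals and a true zero)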
What is a discrete variable?
Discrete variables take values that have definite boundaries, with nothing possible in between two adjacent values (e.g. number of students enrolled in a unit). All qualitative variables are discrete and are referred to as categorical variables (e.g. male and female).
What is a continuous variable?
Continuous variables can assume any value on some scale, and it is always theoretically possible for two values to have something in between (e.g. time, weight, height).
A measure has high internal consistency reliability when
each of the items correlates with other items on the measure.
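Internal consistency is often summarised with Cronbach's alpha. The cards do not name this index, so treat it as one common choice rather than the method intended here. A minimal sketch, assuming a made-up table of responses where each row is a respondent and each column is an item:

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per respondent)."""
    k = len(scores[0])                              # number of items
    items = list(zip(*scores))                      # transpose: one tuple per item
    item_var_sum = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical data: 5 respondents x 4 items, each scored 1-5
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

The more strongly the items move together across respondents, the closer alpha gets to 1.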
More information can increase the _____ and ______ of your results
Power and utility
Always consider defining your variables in ways that maximise the utility of the information
In terms of information, higher level measurements have what properties?
They have more information about the true outcome of interest along the info/complexity scale
While behavioural and social science deals mostly with nominal and ordinal level data, most test scores yield ____ level data?
Interval (caution)
How you choose to measure an outcome defines the ______
outcome's level of measurement (e.g. preference for a product can be measured in multiple ways)
What is measurement error?
The discrepancy between the observed value and the true value of what is being measured
What could account for measurement error?
Method error (the method, tools used)
Trait error (person themselves, the participants)
Temporary individual factors (fatigue, motivation, health)
Test administration (conditions, interaction between participant and examiner)
Luck
How can we decrease measurement error?
Increase reliability
How can we increase measurement reliability?
- Increase number of items/observations
- Eliminate ambiguity
- standardise conditions
- moderate difficulty
- minimise effects of external events
- standardise instructions
- standardise scoring
What is a correlation coefficient? Also known as Pearson correlation coefficient, Pearson’s r, the Pearson product-moment correlation coefficient (PPMCC)
The correlation coefficient is a statistical measure of the strength and direction of the linear relationship between two variables. It can be used as an estimate of reliability, e.g. by correlating two sets of scores from the same people.
It is represented by a number between -1 and +1
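As a worked illustration, Pearson's r can be computed directly from its definition: the sum of cross-products of deviations divided by the square root of the product of the two sums of squares. A minimal sketch; the two score lists are hypothetical test-retest data, not taken from the cards:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical scores for the same 5 people tested on two occasions
time1 = [10, 12, 15, 18, 20]
time2 = [11, 13, 14, 19, 21]
r = pearson_r(time1, time2)
print(round(r, 3))
```

A value close to +1 here would suggest that people kept roughly the same rank order across the two occasions.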
What are 4 types of reliability?
- Test-retest (measure of stability over time)
- Parallel forms (different forms of same test given to same participants)
- Inter-rater reliability (multiple raters agree in their observations of the same thing)
- Internal consistency (responses at one time, focusses on consistency of items)
What are 4 types of validity?
- Face validity - extent to which items on a test appear to measure the construct
- Content validity - extent to which the content of the measure compares with the universe of content that defines the construct
- Criterion-related validity - (predictive OR concurrent) extent to which a score indicates a level of performance on a criterion against which it is compared
- Construct validity - extent to which an assessment corresponds to other variables as predicted by a theory
What are two types of Criterion-related validity?
Predictive and concurrent
What are two types of construct validity?
Convergent validity and discriminant validity
What does internal validity refer to?
Internal validity refers to whether an experimental treatment / condition makes a difference or not, and whether there’s sufficient evidence to support the claim. It refers to the amount of control and accuracy in concluding that the outcome of an experiment is due to the independent variable.
What does external validity refer to?
External validity refers to how far the results can be generalised beyond the study. It requires that variables have been operationalised and defined and that the sample is representative of the population.
What are some threats to internal validity?
- history
- maturation
- testing
- instrumentation
- statistical regression
- selection of subjects
- mortality
- experimenter bias
- demand characteristics
* Remember John Henry effect
What are some threats to external validity?
- multiple treatments interference - treatments occur simultaneously
- reactive arrangements (participants' knowledge of the experiment)
- experimenter effects
- pretest sensitisation
How can we improve internal validity?
Randomly select individuals
Randomly assign them to groups
Use a control group
How can we improve external validity?
Careful adherence to good experimental process and practices
Improve the research design
random assignment
Attempt to normalise testing procedures and environment as much as possible
Validation studies
What are two types of sampling strategies?
Probability sampling and non-probability sampling
What are the four types of probability sampling strategies?
- Simple random
- Systematic
- Stratified
- Cluster
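The four strategies can be sketched in a few lines each. This is only an illustration under stated assumptions: the population of 100 numbered students, the two strata, the four classes, and all function names are hypothetical.

```python
import random

def simple_random(population, n, rng):
    """Every member has an equal chance of selection."""
    return rng.sample(population, n)

def systematic(population, n, rng):
    """Random start, then every step-th member of the list."""
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]

def stratified(strata, fraction, rng):
    """Sample the same fraction from each stratum (subgroup)."""
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

def cluster(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep everyone in them."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)                      # seeded so the run is repeatable
population = list(range(100))                # hypothetical student IDs 0-99
strata = {"first_year": population[:60], "second_year": population[60:]}
classes = {f"class_{i}": population[i * 25:(i + 1) * 25] for i in range(4)}

srs = simple_random(population, 10, rng)
sys_sample = systematic(population, 10, rng)
strat = stratified(strata, 0.10, rng)
clus = cluster(classes, 2, rng)
print(len(srs), len(sys_sample), len(strat), len(clus))
```

In each case the chance of any one member being selected can be worked out in advance, which is what makes these probability strategies.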
What differentiates probability from non-probability sampling?
Probability sampling - likelihood of any one member of the population being selected is known
Non-probability sampling - likelihood of selecting one member from the population is NOT known
What are the two types of non-probability sampling strategies?
Convenience
Quota
Why is sample size important?
Your sample needs to be representative of the population: the less representative it is, the larger the margin of error and the less precise your test of the null hypothesis.
Sample size also has implications for the power and integrity of your test, i.e. its sensitivity in detecting a significant result.
What is the John Henry effect?
The tendency for members of a control group to adopt a competitive attitude towards the experimental group, thereby negating their status as controls
What is Simpson’s paradox?
A trend or result that is present within separate groups of data but reverses or disappears when the groups are combined.
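A small numeric example makes the reversal easier to see. The counts below are invented for illustration (two hypothetical teaching methods, A and B, each used before an easy and a hard exam); they are not from any real study.

```python
# (passed, total) for each method within each subgroup; hypothetical counts
data = {
    "easy exam": {"A": (9, 10), "B": (80, 100)},
    "hard exam": {"A": (30, 100), "B": (2, 10)},
}

def pass_rate(passed, total):
    return passed / total

# Within each subgroup, method A has the higher pass rate
within = {
    group: {m: pass_rate(*counts) for m, counts in methods.items()}
    for group, methods in data.items()
}

# Pooling the subgroups reverses the ordering
combined = {}
for method in ("A", "B"):
    passed = sum(data[group][method][0] for group in data)
    total = sum(data[group][method][1] for group in data)
    combined[method] = pass_rate(passed, total)

for group, rates in within.items():
    print(group, {m: round(r, 2) for m, r in rates.items()})
print("combined", {m: round(r, 2) for m, r in combined.items()})
```

The reversal happens because the two methods are spread very unevenly across the easy and hard exams, so the pooled rates mostly reflect which exam each method was paired with.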
What is the Hawthorne effect?
Type of reactivity in which individuals modify aspects of their behaviour in response to being observed. Can undermine integrity of research particularly the relationship between variables
The items in a personality test correlate strongly with one another. What kind of reliability or validity does this imply?
Internal consistency
What is incremental validity?
A type of validity that is used to determine whether a new psychometric assessment will increase the predictive ability beyond that provided by an existing method of assessment.
Why is your research question so important?
The way you frame it determines the way in which you go about measuring your variables. ie the intent of the research.
What is a test?
A tool that assesses behaviour, albeit generally.
It measures the extent of individual differences.
A good test will differentiate people from one another reliably, based on their true score
Why is the interpretation of a score more important than the score itself?
E.g. a score of 10 on an exam in which all the items are simple vs a score of 10 where everyone else in the group scored below 5. A score has to be compared with something, or analysed, before it has meaning
What are some ways in which tests can be administered?
Group vs individual
Paper-and-pencil vs performance
Speed vs Power
What are achievement tests?
Achievement tests measure knowledge of a specific area; most commonly where learning is the outcome
What are the two types of achievement tests?
Standardised (WAIS)
Researcher-generated (custom made for a research problem)
Both types of achievement tests can be either _____ referenced or _____ referenced.
Norm-referenced (compared to others, Raven’s progressive matrices)
Criterion-referenced (driving test - need to obtain __% to pass, not compared to others)
Multiple-choice achievement items comprise what elements?
Stem
Distractors
Alternatives
List some advantages and disadvantages of multiple choice tests
Advantages:
Ideal for assessing level of knowledge about a specific content domain
Can assess any content
Easy to score
Tests for knowledge
Good distractors help in diagnosis
Disadvantages:
Does not test for writing skills
Presence of test anxiety
Why is using varied approaches to measuring behaviour beneficial?
Varied approaches assist in trying to falsify a theory or strengthen it. They can yield convergent (or divergent) evidence.
What is Item Discrimination?
The ability of an item to differentiate among students on the basis of how well they know the material being tested.
ie how well does it discriminate?
What is item analysis?
Why is this important?
Item analysis generates two indices which assess the effectiveness of a multiple choice test.
It helps test authors assess the value of each item and decide whether it should be retained or replaced
What are the two indices we can analyse our test items on? How are they measured?
- Difficulty index (range of difficulty): the proportion of test takers who got the item correct
- Discrimination index (how well does it discriminate between people?): the proportion of test takers in the upper group who got the item correct, compared with the proportion in the lower group who did
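Both indices can be computed from a table of item responses and total scores. A minimal sketch, with two assumptions not stated in the cards: the comparison is taken as a simple difference (upper minus lower), and the upper and lower groups are the top and bottom 27% of total scores, a conventional cut-off. The data are hypothetical.

```python
def difficulty_index(item_correct):
    """Proportion of test takers who answered the item correctly."""
    return sum(item_correct) / len(item_correct)

def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Proportion correct in the upper group minus proportion correct in the lower group."""
    n = len(total_scores)
    k = max(1, round(n * fraction))                        # size of each group
    order = sorted(range(n), key=lambda i: total_scores[i])  # lowest to highest
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower

# Hypothetical data for one item: 1 = correct, 0 = incorrect, for 10 test takers
item_correct = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
total_scores = [48, 45, 41, 39, 36, 30, 28, 22, 19, 15]

p = difficulty_index(item_correct)
d = discrimination_index(item_correct, total_scores)
print(round(p, 2), round(d, 2))
```

Under this definition, a positive value means high scorers got the item right more often than low scorers; a value near zero or below suggests the item should be reviewed.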
What happens when difficulty (within tests) increases?
As difficulty increases, discrimination is constrained
What are some types of tests?
Achievement (assessing content knowledge)
Attitude
Personality
Intelligence
Aptitude
What are 3 types of attitude tests?
Thurstone (favourable to unfavourable ranking of statements)
Likert-type scale (statements assessed from strongly agree to strongly disagree, neutral in middle)
Guttman scale (responses ordered from weaker to stronger)
What are two types of personality tests?
Projective (Thematic Apperception Test, Kinetic figure drawing, Rorschach technique)
Structured (Big 5 personality)
What are three general ways we can measure behaviour?
via tests
via observing behaviour
via questionnaires
What underlying notion does observing behaviour rely upon?
That given the same context, most people will behave in same/similar ways
What are two early examples of pioneering research into observing human behaviour?
Social Psychology - The Bystander Effect
Developmental Psychology - Mary Ainsworth - the strange situation
List some types of observational methods?
Systematic (full structure with coding system)
Naturalistic (field) observation
What are three types of naturalistic observation?
Full participant (researcher doesn’t disclose)
Participant as observer (researcher isn’t a secret but kept quiet)
Observer as participant (reliant on group members accepting observer present and over time they exhibit ‘normal’ behaviour)
What are some advantages/disadvantages to systematic observation?
Advantages - systematic, replication, high degree of reliability, more controlled
Disadvantages - lack of ecological validity and behavioural spontaneity/realism
What are some advantages/disadvantages of naturalistic (field) observation?
Advantages - ecological validity, less subject to demand characteristics
Disadvantages - poor control and replication difficult, greater potential for observer bias, ethics
Prior to observing behaviour, what are some key things you need to determine?
Decide on behavioural categories
Define behaviours (ie on basis of form or consequence)
What aspect are you measuring? Latency, frequency, duration
Who will you observe and when? (continuous - for individual, or time-sampling - for groups)
What is measurement reliability?
The extent to which measurements differ from occasion to occasion as a function of measurement error
What is a Reliability Coefficient?
A reliability coefficient is a measure of the consistency of a test, obtained by measuring the same individuals twice and computing the correlation between the two sets of measures.
What are two examples of measures of the DEGREE of reliability?
Reliability coefficient
Index of Concordance
Correlation, reliability and observers - comment…
We can compare observers to see whether they recorded the same behavioural categories with the same frequency. If their records correlate highly (0.7-0.8), we can say the observation is reliable.
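Agreement between two observers can also be expressed as an index of concordance. A minimal sketch, assuming the index is taken as the proportion of observation intervals in which both observers recorded the same category; the category labels and codings are hypothetical.

```python
def index_of_concordance(obs_a, obs_b):
    """Proportion of intervals on which two observers recorded the same category."""
    if len(obs_a) != len(obs_b):
        raise ValueError("Both observers must code the same number of intervals")
    agreements = sum(a == b for a, b in zip(obs_a, obs_b))
    return agreements / len(obs_a)

# Hypothetical codings of the same 10 intervals by two observers
observer_1 = ["play", "play", "rest", "fight", "rest", "play", "rest", "play", "fight", "rest"]
observer_2 = ["play", "rest", "rest", "fight", "rest", "play", "rest", "play", "play", "rest"]

concordance = index_of_concordance(observer_1, observer_2)
print(concordance)
```

Unlike a correlation of category frequencies, this checks agreement interval by interval, so two observers with identical totals can still score low if they disagree on when each behaviour occurred.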
Why are questionnaires valuable?
They allow collection of data from large numbers of people
They can measure a theoretical construct that is not directly observable (e.g. cognitive or personality constructs)
Captures beliefs, opinions etc
What is the jingle-jangle fallacy?
Jingle-jangle fallacies refer to the erroneous assumptions that two different things are the same because they bear the same name or that two identical or almost identical things are different because they are labeled differently
What should you be aware of if you want to devise your own questionnaire?
be aware of jingles and jangles
Be aware of response acquiescence and social desirability
What are the two types of questions in Questionnaires?
Open and closed