Week 4 - Reliability Flashcards
What does quantitative research necessitate?
Measurement, whether of physical or abstract things
What is the assumption with quantitative research?
Everything we want to measure has a true value, or a true score
What is the goal of quantitative research?
Measurement will give us a pure score
What is the reality of quantitative research?
Measurement always includes some amount of “random noise”.
Measured score = true score + noise
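A minimal sketch of this idea in Python (hypothetical numbers, assuming NumPy is available): repeated measurements scatter around the true score because each one picks up random noise, while their average stays close to the true score.

```python
import numpy as np

rng = np.random.default_rng(0)

true_score = 50.0                     # the value assumed to really exist
noise = rng.normal(0, 2, size=10)     # random error on each measurement
measured = true_score + noise         # measured score = true score + noise

print(measured.round(1))              # individual scores vary around 50
print(measured.mean().round(1))       # the average sits close to the true score
```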
What does reliable mean in a quantitative study?
Tools used for measurement must be consistent and accurate (precise and unbiased)
What does valid mean in quantitative research?
Descriptions, relations and explanations (causation) must be truthful
What does reliability equal?
Consistency
What are the three broad types of consistency in research?
Consistency:
Over time (test-retest reliability)
Across items (internal consistency)
Across researchers (inter-rater reliability)
What is consistency measured by?
Variability
What is an inverse relationship?
A relationship in which one quantity increases as the other decreases. Variability and reliability are inversely related. Therefore:
High variability = low reliability
Low variability = high reliability
A valid measure must also be a what?
A reliable measure (reliability is necessary, but not sufficient, for validity)
Can any measure that is not consistent provide trustworthy evidence?
No
Is a measure that is reliable always valid?
Not necessarily, no.
What are the two reasons variability occurs?
- Causal relationship between an IV and a DV (effect)
- when the IV changes, the DV also changes
- Variability due to random factors (noise)
- unknown (possibly unknowable) factors affect the DV
Careful research designs are needed to separate effect from noise
Noise can be reduced, but not eliminated
What is variability due to noise called?
Error
What does error mean in research design and statistics?
Variability whose cause is unknown; it does not mean MISTAKE.
What might short term change in results indicate?
Problems with reliability (and therefore validity).
Caution: some measures are inherently variable (e.g. reaction time)
Might need multiple measurements
What might long term change (with short term stability) indicate in research?
More likely to reflect a real effect, i.e. real changes in the variable, not merely error.
What are two things body temperature can be used to track with an illness?
To indicate development of illness following exposure
Track progression of illness during sickness.
But the measurement device must be reliable.
What is test-retest reliability?
Administering a measure on two occasions, then calculating the correlation between the two occasions. High test-retest reliability results in a strong positive correlation.
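A minimal illustration of test-retest reliability, assuming hypothetical scores for five participants measured on two occasions (not real data): the Pearson correlation between the two occasions is the test-retest reliability.

```python
import numpy as np

# Hypothetical scores for the same 5 participants on two occasions
occasion_1 = np.array([12, 18, 9, 15, 20])
occasion_2 = np.array([13, 17, 10, 14, 21])

r = np.corrcoef(occasion_1, occasion_2)[0, 1]
print(round(r, 2))   # a strong positive r indicates high test-retest reliability
```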
What is internal consistency?
Consistency of responses across test items
What do assessments of psychological, behavioural and health related constructs often use?
Questionnaires.
Why do researchers use different phrasings (positive and negative phrasings) to assess items against the same construct?
To capture the construct fully. Therefore if participants are consistent in their responding to the questions, the different items should be correlated.
What is reverse scoring?
Negatively phrased items in a questionnaire are scored in reverse, so that high scores on every item indicate the same direction of the construct.
What is an example of a questionnaire which needs reverse scoring?
Rosenberg's Self-Esteem Scale (1989): every second question is reverse scored, and its consistency can be assessed with split-half reliability.
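A sketch of reverse scoring and split-half reliability, assuming a hypothetical 4-item questionnaire on a 1-5 scale where items 2 and 4 are negatively phrased (illustrative data only, not the actual Rosenberg items).

```python
import numpy as np

# Hypothetical responses (rows = participants) on a 1-5 scale;
# items 2 and 4 (columns 1 and 3) are negatively phrased.
responses = np.array([
    [5, 1, 4, 2],
    [4, 2, 5, 1],
    [2, 4, 1, 5],
    [3, 3, 2, 4],
], dtype=float)

reverse_items = [1, 3]
responses[:, reverse_items] = 6 - responses[:, reverse_items]   # 1<->5, 2<->4, 3 stays 3

# Split-half reliability: correlate totals from the odd items with totals from the even items
odd_half = responses[:, ::2].sum(axis=1)
even_half = responses[:, 1::2].sum(axis=1)
print(round(np.corrcoef(odd_half, even_half)[0, 1], 2))   # high positive r = internally consistent
```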
What is Cronbach's alpha?
Related to split-half reliability: it reflects all possible split-half correlations for a set of items. The 10-item Rosenberg Self-Esteem Scale has 126 unique split-half correlations.
Does Cronbach's α require every individual correlation to be separately calculated?
No. Cronbach's α is equivalent to the mean of all possible split-half correlations, without calculating each one separately.
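A minimal sketch of how Cronbach's α can be computed directly from item scores, using the standard formula α = k/(k−1) × (1 − Σ item variances / variance of total scores) and hypothetical data.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: rows = participants, columns = (already reverse-scored) items."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical 4-item data for 5 participants
data = np.array([
    [5, 4, 5, 4],
    [4, 4, 3, 4],
    [2, 3, 2, 2],
    [3, 3, 3, 2],
    [5, 5, 4, 5],
])
print(round(cronbach_alpha(data), 2))   # values near 1 indicate high internal consistency
```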
What is parallel forms reliability?
Two versions of a test of the same construct are administered. Scores on the two versions should be highly and positively correlated.
What is interrater reliability?
Many assessments rely on judgements by different raters/researchers. Important that these scores are consistent.
How is interrater reliability assessed?
Using a statistic called Cohen's kappa (κ), which is analogous to Cronbach's α.
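A sketch of Cohen's κ in plain Python, assuming hypothetical categorical ratings from two raters (illustrative data only): κ corrects raw agreement for the agreement expected by chance.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical judgements of the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category independently
    expected = sum((counts_a[c] / n) * (counts_b[c] / n)
                   for c in set(rater_a) | set(rater_b))
    return (observed - expected) / (1 - expected)

# Hypothetical codings of 10 video clips as "A" (aggressive) or "N" (not aggressive)
rater_1 = ["A", "A", "N", "N", "A", "N", "A", "A", "N", "N"]
rater_2 = ["A", "A", "N", "A", "A", "N", "A", "N", "N", "N"]
print(round(cohens_kappa(rater_1, rater_2), 2))   # 1 = perfect agreement, 0 = chance level
```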
What is validity?
The extent to which measures represent what they are intended to represent (loosely, the truthfulness of a measure)
What are the three major forms of validity?
Construct validity
Internal validity
External validity
What is construct validity?
Extent to which manipulated variables (IVs) and measured variables (DVs) represent what they are intended to.
What are some examples of construct validity?
Are IQ tests a measure of intelligence?
Can intelligence be assessed by measuring head circumference?
Can reaction time be a measure of decision making speed?
What is face validity?
Do IVs and DVs appear to reflect or measure what they're intended to measure?
Is face validity an important form of validity for valid research?
No. But research is often unfairly challenged when it lacks face validity. Reliance on face validity might explain some unusual medicines.
What is internal validity?
Extent to which causal statements about the relations among variables can be made.
What is internal validity dependent on?
Study design.
Designs that effectively control extraneous variables have high internal validity.
Designs that poorly control extraneous variables have low internal validity.
Campbell (1957) identified what seven threats to internal validity?
History, maturation, instrumentation decay, statistical regression, selection, testing, mortality.
How does history affect internal validity?
Measurements that occur over time are susceptible to effects from factors other than the IV.
Risk increases as the time between measures increases (longitudinal studies are particularly at risk).
How does maturation affect internal validity?
Rather than change in the DV being caused by the IV, change may be a result of natural maturation processes. Age-related biological change is one example of maturation (this needs to be considered in developmental psychology).
What is regression to the mean and how does this affect internal validity?
When groups are formed by selecting extreme scores at pretest, those scores tend to become less extreme at posttest.
Give an example of regression to the mean affecting internal validity.
People with the shortest and the longest reaction times are given fake medication: the longest group is told it is a stimulant, the shortest group that it is a relaxant. After it “works”, the longest group's times get quicker and the shortest group's times get slower, but these changes reflect regression to the mean rather than the medication.
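A small simulation of regression to the mean (hypothetical numbers): even with no treatment at all, groups selected for extreme pretest scores drift back toward the overall mean at posttest, which can make a fake medication look like it works.

```python
import numpy as np

rng = np.random.default_rng(1)

true_rt = rng.normal(300, 20, size=1000)            # stable "true" reaction times (ms)
pretest = true_rt + rng.normal(0, 15, size=1000)    # pretest = true score + noise
posttest = true_rt + rng.normal(0, 15, size=1000)   # independent noise at posttest

slowest = pretest > np.percentile(pretest, 90)      # "longest reaction time" group
fastest = pretest < np.percentile(pretest, 10)      # "shortest reaction time" group

# With no treatment, both extreme groups move toward the overall mean
print(pretest[slowest].mean().round(), posttest[slowest].mean().round())
print(pretest[fastest].mean().round(), posttest[fastest].mean().round())
```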
What is instrumentation and how does it affect internal validity?
Changing the measurement instrument during a study can affect measurement of the DV.
Instrument is a broad term and can mean any of the following:
- mechanical devices
- questionnaires
- researchers
What is selection and how does it affect internal validity?
When groups are not selected through random assignment, there is potential bias in group formation. Bias could be introduced unintentionally.
How does mortality affect internal validity and what is it?
In studies with more than one group, any non-random factor that results in a greater drop-out (attrition) rate in one group than another results in differential attrition.
What is testing and how does it affect internal validity?
Some studies require a pretest (a baseline measure, or for task familiarisation).
After that, participants are tested again using the same test. The pretest itself might influence the posttest outcomes.
What is reactivity in research?
Experiments are intended to measure behaviour, but may also produce or alter behaviours.
Is your behaviour at a party the same when you know you're being filmed?
What are the different types of subject roles when it comes to participants in research?
Good subject roles
Faithful subject roles
Negativistic subject roles
Apprehensive subject roles
What is external validity?
The extent to which the findings of a study can be generalised beyond the study itself. Externally invalid research occurs when conclusions are limited to the original setting.
External validity can only be confirmed by what?
Replication.
Why might external validity not necessarily be too important?
Because students are humans, so findings from studying only students will often translate to other settings. Sample size does not necessarily affect external validity.
What is convergent validity?
Tests whether constructs or measures that should be related are related.
What is discriminant validity?
Tests whether constructs/measures that should be unrelated are unrelated.
What is predictive validity?
Tests whether current performance on some measure predicts future performance on a different measure.
What is concurrent validity?
Tests whether performance on a new test correlates with performance on an already validated test of the same construct.
Construct validity can be tested with what two outlets?
Convergent and discriminant (sometimes called divergent) validity
Example: affection towards one's partner. Find measures that should and should not relate to affection.
Convergent validity would be demonstrated by strong, positive correlations with related measures (e.g. number of physical touches).
Discriminant validity would be demonstrated by weak correlations with measures assumed to be unrelated (e.g. number of pairs of shoes owned).
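A toy check of convergent and discriminant validity with hypothetical data (illustration only): the affection questionnaire should correlate strongly with daily physical touches (convergent) and only weakly with pairs of shoes owned (discriminant).

```python
import numpy as np

# Hypothetical measures for 6 participants
affection_score  = np.array([20, 35, 28, 40, 15, 32])
physical_touches = np.array([4, 9, 6, 11, 2, 8])    # should be related (convergent)
pairs_of_shoes   = np.array([7, 7, 4, 6, 6, 6])     # should be unrelated (discriminant)

print(round(np.corrcoef(affection_score, physical_touches)[0, 1], 2))  # strong positive
print(round(np.corrcoef(affection_score, pairs_of_shoes)[0, 1], 2))    # near zero
```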