Mid Tri Exam Flashcards
Reliability
The consistency or repeatability of measures.
Validity
Whether we are measuring what we are trying to measure.
Inter-rater or inter-observer reliability
Used to assess the degree to which different raters/observers give consistent estimates of the same phenomenon - agreement between the scores of two or more independent observers or judges.
Especially important when measures are subjective
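Agreement between two observers can be put into numbers. The sketch below is only an illustration: the ratings are made up, and Cohen's kappa is a standard chance-corrected agreement statistic that is not mentioned on the cards themselves.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of cases on which the two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance alone.

    Undefined (division by zero) if chance agreement is exactly 1,
    e.g. when both raters use a single category for every case.
    """
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: product of each rater's category proportions, summed
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two observers watching the same 10 behaviours
rater_a = ["agg", "agg", "calm", "calm", "agg", "calm", "calm", "agg", "calm", "calm"]
rater_b = ["agg", "calm", "calm", "calm", "agg", "calm", "agg", "agg", "calm", "calm"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(round(agreement, 2), round(kappa, 2))
```

Kappa comes out lower than raw agreement because some matches would happen by chance even if the observers were guessing.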
Test-retest reliability
Used to assess the consistency of a measure from one time to another.
The correlation between scores across two administrations of the measure (take one test, then take it again at a later time)
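As a rough illustration of test-retest reliability, the sketch below correlates scores from two sittings of the same test. The scores are invented, and `pearson_r` is a helper written for this example rather than anything from the course.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same six people at two points in time
time_1 = [12, 15, 9, 20, 17, 11]
time_2 = [13, 14, 10, 19, 18, 10]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 2))
```

A value close to 1 means people kept roughly the same rank order across the two administrations.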
Parallel-forms reliability
Used to assess the consistency of the results of two tests constructed in the same way from the same content domain
- Split-half reliability (splitting a test into 2 halves. If the test is reliable, scores on the two halves should be similar)
- Item-total correlation (correlating each item with the total score; e.g. comparing the mid-tri to the overall mark might not be the best check, because people might be bad at exams but good at lab reports)
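The split-half idea can be sketched as follows, assuming an odd/even split of the items and made-up responses. The Spearman-Brown step is a standard adjustment for the fact that each half is only half as long as the full test; it is an addition here, not something stated on the cards.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per participant."""
    # Totals for items 1, 3, 5, ... and for items 2, 4, 6, ...
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    r_halves = pearson_r(odd_totals, even_totals)
    # Spearman-Brown: estimate reliability of the full-length test
    corrected = 2 * r_halves / (1 + r_halves)
    return r_halves, corrected


# Hypothetical responses: 5 participants x 4 items, each rated 1-5
responses = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 1],
]

r_halves, corrected = split_half_reliability(responses)
print(round(r_halves, 2), round(corrected, 2))
```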
Internal consistency reliability
Used to assess the consistency of results across items within a test
Cronbach’s alpha - based on the average correlation among all possible pairs of items and the number of items (.80 or above is generally considered good)
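For reference, Cronbach's alpha is usually computed from the item variances and the variance of the total score. The sketch below uses that standard formula with invented responses; the function name and data are illustrative only.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores holds one list of item scores per participant."""
    k = len(item_scores[0])                   # number of items
    columns = list(zip(*item_scores))         # scores regrouped by item
    sum_item_vars = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    # alpha = k / (k - 1) * (1 - sum of item variances / total variance)
    return (k / (k - 1)) * (1 - sum_item_vars / total_var)


# Hypothetical responses: 5 participants x 4 items, each rated 1-5
responses = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 1],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

When the items move together, the total-score variance is much larger than the summed item variances, and alpha approaches 1.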
Reliability puts a ceiling on validity
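If reliability is .70, validity is limited as well: in classical test theory a validity coefficient cannot exceed the square root of the reliability (about .84 here)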
It can be reliable without being valid
It can’t be valid without being reliable
Construct validity
measurement validity - A construct refers to a behaviour or process that we are interested in studying
Both measures and manipulations must be valid!
Manipulations can be
instructional
environmental
stooges
instructional manipulations
experimental conditions defined by what you tell participants
environmental manipulations
Stage an event, present a stimulus, induce a state
stooge manipulations
Use fake participants (confederates) to alter the experimental condition
Convergent validity
Do scores on the measure correlate with scores on other similar measures related to the construct
Relates to the degree to which the measure converges on (is similar to) other constructs that it theoretically should be similar to
Discriminant (divergent) validity
Do scores on the measure have low correlations with scores on other different measures that are unrelated to the construct
Relates to the degree to which the measure diverges from (is dissimilar to) other constructs that it should not be similar to
Face validity
On its face value, does the measure seem to be a good translation of the construct
Does it make sense?
Ask experts in the field
Content validity
Does the measure assess the entire range of characteristics that are representative of the construct it is intending to measure
Criterion validity
concurrent - Do scores on the measure distinguish participants on other variables that we would expect to be related to it (depressives from non-depressives, criminals from non-criminals)
predictive - Are scores on the measure able to predict future outcomes (attitudes, behaviours, performance)
How to Correct manipulations
Reduce random error (replicate procedure)
Reduce experimenter bias
Reduce participant bias
Ensure manipulation has construct validity
Do a manipulation check - ask participants about various aspects
External validity
extent to which the results can be generalised to other relevant populations, settings or times
Studies have good external validity when results can be replicated
Ecological validity
Population generalisation
Environmental generalisation
Temporal generalisation
Ecological validity
The extent to which the results can be generalised to real-life settings
Population generalisation
Applying the results from an experiment to a group of participants that is different and more encompassing than those used in the original experiment
Environmental generalisation
Applying the results from an experiment to a situation or environment that differs from that of the original experiment
Temporal generalisation
Applying the results from an experiment to a time that is different from the time when the original experiment was conducted
Internal Validity
ability to draw conclusions about causal relationships from the results of a study
The extent to which we can say that any effects on the DV were caused by the IV
The elimination of alternative explanations for the observed relationships
Inferences of cause and effect require 3 elements for strong internal validity
Co-variation
Temporal precedence
Elimination of alternative explanations
Threats to internal validity
Selection bias
Maturation
Statistical Regression
Mortality
History
Testing
Practice Effect
Instrumentation
observer reactivity
Social desirability
Controlling these threats
Randomly allocate participants
Treat all conditions equally except for intended IV manipulations
Use appropriate control conditions
Use double blind studies where possible
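Random allocation can be sketched in a few lines. This is a minimal illustration with hypothetical participant IDs and condition names; the `seed` argument is only there so the example is repeatable.

```python
import random


def randomly_allocate(participants, conditions, seed=None):
    """Shuffle participants, then deal them out evenly across conditions."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        # Dealing in turn keeps group sizes equal (or within one of each other)
        groups[conditions[i % len(conditions)]].append(person)
    return groups


# Hypothetical participant IDs and conditions
participants = [f"P{i:02d}" for i in range(1, 13)]
groups = randomly_allocate(participants, ["control", "treatment"], seed=42)
print({name: len(members) for name, members in groups.items()})
```

Because chance alone decides who ends up in which condition, pre-existing differences should be spread evenly across groups on average, which is what guards against selection bias.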
Experimenter bias
errors in a research study due to the notions or beliefs of the experimenter
Selection bias
A threat to internal validity that can occur if participants are chosen in such a way that groups are not equal before the experiment
Differences after the experiment may reflect differences that existed before the experiment began plus a treatment effect
Maturation
Changes in participants during an experiment or between measurements of the DV due to the passage of time (age, cognitive development)
Permanent - (age, biological growth, cognitive development)
Temporary - (fatigue, boredom, hunger)
Most commonly naturally occurring developmental processes (e.g. in children)
Statistical Regression
regression towards the mean
Participants with extreme scores on the first measurement of the DV tend to have scores closer to the mean on the second measurement
Subsequent scores are still likely to be extreme in the same direction but not as extreme
When you have extreme scores, it is difficult to maintain that degree of extremity over repeated measures
If participants are selected on the basis of extreme scores, regression to the mean is always going to be a possible explanation for higher or lower scores on a repeated test
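Regression towards the mean can be seen in a small simulation. The sketch below assumes each observed score is a stable "true" score plus independent random error at each testing; all numbers are invented and no treatment is applied between the two measurements.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the illustration is repeatable

# Each person has a stable true score; each testing adds fresh random error
true_scores = [random.gauss(100, 10) for _ in range(5000)]
time_1 = [t + random.gauss(0, 10) for t in true_scores]
time_2 = [t + random.gauss(0, 10) for t in true_scores]

# Select the top 10% of scorers on the first measurement
cutoff = sorted(time_1)[int(0.9 * len(time_1))]
extreme = [i for i, score in enumerate(time_1) if score >= cutoff]

grand_mean = mean(time_1)
mean_1 = mean([time_1[i] for i in extreme])
mean_2 = mean([time_2[i] for i in extreme])

print(round(grand_mean, 1), round(mean_1, 1), round(mean_2, 1))
```

The selected group still scores above the overall mean at the second testing, but less extremely than at the first, even though nothing was done to them in between.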