WEEK 4 L7.2 data collection: case selection Flashcards
two crucial elements of data selection
- identifying the full set of units (the universe of cases)
- selecting a subset/sample of units from that universe
key difference between case selection and sampling
case selection is done deliberately and involves a small number of cases (controlled by the researcher)
sampling: probabilistic (e.g. random) / non-probabilistic
two good approaches to small-n case selection
- most similar systems design: cases are similar on most characteristics, but different on the outcome (what causes the outcome?)
- most different systems design: cases are different on most characteristics, but the outcome is similar
=> selection starts with a causal factor.
a bad approach to small-n case selection
selection by outcome: general characteristics are different, and cause and effect should both be similar
- is based on successful cases, which leads to selection bias
- evidence for the causal mechanism isn’t conclusive because other cases aren’t considered
summary of three approaches to small-n case selection
- most similar systems design:
  - gc: similar
  - iv, dv: different
- most different systems design:
  - gc: different
  - iv, dv: similar
- selection by outcome:
  - gc: different
  - iv, dv: similar
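The summary above can be turned into a small check on a pair of cases. This is only a rough sketch in Python: the function name `classify_pair`, the `min_share` cut-off for "most characteristics", and the three cases `p`, `q` and `r` are all invented for illustration. Note that the check cannot tell most different systems design apart from selection by outcome, because the two differ in where selection starts (causal factor vs. outcome), not in the resulting pattern.

```python
def classify_pair(case_a, case_b, background, cause, outcome, min_share=0.75):
    """Label a pair of cases by the small-n design its pattern fits.

    min_share is an assumed cut-off for "similar on most characteristics".
    """
    # Share of general characteristics (gc) on which the two cases agree.
    shared = sum(case_a[k] == case_b[k] for k in background) / len(background)
    same_cause = case_a[cause] == case_b[cause]        # iv
    same_outcome = case_a[outcome] == case_b[outcome]  # dv

    if shared >= min_share and not same_cause and not same_outcome:
        return "most similar systems design"
    if shared <= 1 - min_share and same_cause and same_outcome:
        return "most different systems design"
    return "neither"


# Hypothetical cases: four general characteristics, one cause, one outcome.
gc = ["region", "income", "regime", "size"]
p = {"region": "north", "income": "high", "regime": "democracy",
     "size": "small", "cause": True, "outcome": True}
q = {"region": "north", "income": "high", "regime": "democracy",
     "size": "small", "cause": False, "outcome": False}
r = {"region": "south", "income": "low", "regime": "autocracy",
     "size": "large", "cause": True, "outcome": True}

label_pq = classify_pair(p, q, gc, "cause", "outcome")
label_pr = classify_pair(p, r, gc, "cause", "outcome")
label_qr = classify_pair(q, r, gc, "cause", "outcome")
```

Here `p` and `q` fit the most similar pattern, `p` and `r` fit the most different pattern, and `q` and `r` fit neither.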
challenges of case selection (6)
1. selection bias
2. outliers (if not recognized, they might lead to an incorrect picture)
3. heterogeneity of cases: comparing cases that are very different, for example Qatar and Russia
4. joint history/historical contingency: cases in a geographical area might share the experience of events, for example world wars, which may lead to incorrect inferences
5. path dependency: once a system is chosen, it can’t be easily reversed; for example, anyone studying health care systems should take this path dependency into account
other techniques of case selection (5)
- typical case: representative of the universe of cases (lies on or close to a simple regression line)
- diverse cases: capture the full variation across cases (putting a grid over the plot and selecting from each square)
- extreme cases: choosing cases from the outer margins
- deviant cases: different from other cases, but not necessarily extreme (they lie far from the regression line)
- influential cases: difficult to show with a plot; not typical, but they represent a unique combination of factors that has a large influence on the observed relationship
- most similar/different cases
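Several of these techniques can be read off a simple regression of the outcome on the causal factor. The sketch below is a minimal Python illustration, not a standard procedure. The cases A to G and their scores are invented, and `fit_line` and `select_cases` are hypothetical helper names. It also makes two simplifying assumptions: "extreme" is measured as distance from the mean of x (it could equally be measured on y), and "influential" is approximated by how much the slope changes when a case is left out.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def select_cases(names, xs, ys):
    """Pick one case per technique from a simple regression."""
    a, b = fit_line(xs, ys)
    # Absolute distance of each case from the regression line.
    resid = {n: abs(y - (a + b * x)) for n, x, y in zip(names, xs, ys)}
    # Assumption: "extreme" means far from the mean on x.
    mx = mean(xs)
    dist = {n: abs(x - mx) for n, x in zip(names, xs)}
    # Assumption: influence = change in slope when the case is dropped.
    infl = {}
    for i, n in enumerate(names):
        _, b_without = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        infl[n] = abs(b - b_without)
    return {
        "typical": min(resid, key=resid.get),     # closest to the line
        "deviant": max(resid, key=resid.get),     # furthest from the line
        "extreme": max(dist, key=dist.get),       # outer margin on x
        "influential": max(infl, key=infl.get),   # moves the line most
    }


# Invented scores for seven hypothetical cases.
names = ["A", "B", "C", "D", "E", "F", "G"]
xs = [-5, -2, -1, 0, 1, 3, 4]
ys = [-5, -2, -1, 3, 1, 3, 0]
picks = select_cases(names, xs, ys)
```

With this made-up data the four picks are four different cases, which illustrates that a deviant case need not be extreme and an influential case need not be typical.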
when is ‘typical’ case selection useful?
useful for theory testing
representativeness: yes
when is ‘diverse cases’ useful?
useful for theory generating and theory testing
representativeness: maybe
- for example, busy squares (those containing many cases) can be underrepresented
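The grid idea, and why busy squares end up underrepresented, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function name `diverse_cases`, the `cell_size` parameter, the rule of taking the alphabetically first case in each square, and the five invented cases.

```python
from collections import defaultdict
from math import floor


def diverse_cases(cases, cell_size=1.0):
    """cases maps a name to (x, y); returns (picks, cells).

    picks holds one case per occupied grid square,
    cells holds all cases that fall into each square.
    """
    cells = defaultdict(list)
    for name, (x, y) in cases.items():
        square = (floor(x / cell_size), floor(y / cell_size))
        cells[square].append(name)
    # Assumed rule: take the alphabetically first case in each square.
    picks = {square: sorted(members)[0] for square, members in cells.items()}
    return picks, dict(cells)


# Invented scores: A, B and C crowd into one square, D and E sit alone.
cases = {
    "A": (0.2, 0.3),
    "B": (0.5, 0.6),
    "C": (0.8, 0.1),
    "D": (1.5, 1.5),
    "E": (2.5, 0.5),
}
picks, cells = diverse_cases(cases)
selected = sorted(picks.values())
```

The busy square holds three of the five cases but contributes only one of the three selected cases, so the selection covers the variation without mirroring how the cases are distributed.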
when is ‘extreme cases’ useful?
theory generating (adding an explanation of why they’re unusual)
not representative
when is ‘deviant cases’ useful?
theory generating (like extreme cases)
usually not representative
what is influential case selection?
difficult to show with a plot; the cases are not typical, but they represent a unique combination of factors that has a large influence on the observed relationship.
when is ‘influential cases’ used?
theory testing (influential cases should confirm a theory), but since they’re not typical, they’re not representative.
for what purposes can mdsd and mssd be used?
theory generating and theory testing
may be representative
focus is more on testing (external validity isn’t the priority)
two other techniques for case selection
- crucial: cases that are exceptionally relevant for confirming or disconfirming a theory
  - so mainly useful for theory testing
- pathway: cases that allow the researcher to isolate a particular causal mechanism from alternative mechanisms, and to clearly demonstrate that this cause and nothing else is sufficient to cause the effect