Study Designs Flashcards
What are case reports and case series?
A case report is the most basic type of descriptive study and documents an individual’s medical experience. A clinician may see an interesting case and describe what he or she has seen.
A case series is an extension of a case report and is based on a small group of individuals. Case reports and case series are useful for generating hypotheses, but because there is no comparison group, their use is limited and statistical relationships between exposure and outcome cannot be assessed.
What are the advantages of case reports and series?
Advantages:
- allows reporting on unusual medical cases
- results used to generate hypotheses
- generates evidence of possible new diseases
- rapid feedback of current events in the medical community
What are the disadvantages of case reports and series?
Disadvantages:
- cannot be used to assess statistical associations
- could just be reporting a medical oddity
The rest of the study designs can all look at associations between exposure (or treatment) and outcome/disease by comparing outcome between different levels of exposure.
What are ecological studies?
Ecological studies look at the association between exposure and disease on a population or area level rather than on an individual level. They look at questions like ‘do populations or areas with high levels of exposure have high rates of disease?’, rather than ‘do individuals with higher exposure have a higher risk of disease?’
For example, to look at the association between smoking and cardiovascular disease, data were collected for every state in the US on ‘average number of cigarettes sold per person’ and ‘rate of cardiovascular mortality’. It was found that death rates from cardiovascular disease are highest in the states where the greatest number of cigarettes are sold.
The benefit of ecological studies is that data on a population level are often readily available and published routinely (eg death rates, national food consumption data, cancer statistics, hospital admission data, census data etc), and such studies can therefore be done quickly and inexpensively.
A drawback is that confounding can be a major problem, and data on potential confounders are often not available. For example, states with high rates of cardiovascular mortality may share many characteristics other than high levels of smoking, eg high social deprivation, poor diet, high proportion of elderly etc, any one of which could be the true explanation for the high rates of cardiovascular mortality.
What is meant by ecological fallacy?
With ecological studies any associations seen are on a population level, and we cannot assume that this transfers to an individual level. To assume that associations seen on a population level apply to the individual level is called the ecological fallacy. The unit of analysis is a population and it is at this level that the interpretation should be conducted. In the smoking and cardiovascular disease example, we cannot assume the individuals who smoke are the ones who are more likely to die from cardiovascular disease.
How should you go about analysing data from ecological studies?
Data from ecological studies are analysed on a group level and generally presented in a scatter diagram with exposure on the x-axis (horizontal axis) and outcome on the y-axis (vertical axis). Each point on the scatter diagram represents an area/population. A correlation coefficient (r), which can vary between +1 (perfect positive correlation) and -1 (perfect negative correlation), is then computed.
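As a rough illustration of the correlation coefficient described above, here is a minimal Python sketch that computes Pearson's r from area-level data. The figures are made up for illustration (they are not the US state data mentioned earlier), and the function name pearson_r is just a label chosen here.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: each position is one area/population, not one person.
cigarettes_per_person = [1200, 1500, 1800, 2100, 2400, 2700]   # exposure (x-axis)
cvd_deaths_per_100k = [180, 210, 205, 250, 260, 300]           # outcome (y-axis)

r = pearson_r(cigarettes_per_person, cvd_deaths_per_100k)
print(f"r = {r:.2f}")
```

A value of r close to +1 here would only say that areas with higher sales tend to have higher death rates; it says nothing about which individuals within an area are affected.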
Ecological studies are a good first step in investigating possible exposure-disease relationships and in generating hypotheses, especially when there are constraints on time or money. They are also good when investigating an exposure which has little variation between individuals within a population/area, but large variation between populations/areas (eg some dietary factors). However, any relations seen need to be investigated further in an individual-based study where data on confounders are collected.
What are the advantages of ecological studies?
Advantages:
- inexpensive and quick to conduct
- exposure and disease information is often more readily available by area
- differences in exposure are often greater between areas than between individuals in one area
What are the disadvantages of ecological studies?
Disadvantages:
- results cannot be extrapolated to the individual level
- systematic differences between areas in recording disease frequency can occur (eg in quality of diagnosis, classification and completeness of reporting)
- sampling on the population can distort the results of an individual
- data often not available on confounders
What is a cross-sectional study?
A cross-sectional study collects observations on individuals at one point in time, thereby providing a ‘snapshot’ of the health of the population. These may be observations on disease status to measure prevalence of disease, or some continuous measure such as blood pressure, level of protein in serum etc. As people are surveyed at one time point only, cross-sectional studies are relatively cheap but only provide information on disease prevalence and not incidence. Study subjects should be selected so they are representative of the target population, eg if the target population is adults in Nottingham, the study population may be defined as all adults registered with a GP in Nottingham, and a 1 in 4 random sample may be taken from the GP registers to provide the sampling population. Data on exposure variables are usually collected as well so that associations between exposures and disease can be explored. Confounding can occur in this study design, but as long as data on potential confounders are collected, they can be dealt with at the analysis stage (stratification or multiple regression).
Cross-sectional studies only consider prevalent cases of disease (ie current cases), so any risk factors identified will be determinants of ‘having the disease’, of which survival as well as incidence are components. For example, a cross-sectional study showing that deprived people have a lower prevalence of heart disease than more affluent people does not necessarily imply they get less disease. It may be that they develop heart disease at the same rate but the deprived people don’t survive as long.
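The 1 in 4 sampling step and a prevalence estimate could be sketched in Python as follows. The register contents, the count of 60 people with the disease and the function names are all hypothetical, and the confidence interval uses the simple normal approximation, which is an assumption that works best with reasonably large samples.

```python
import random
from math import sqrt

def one_in_four_sample(register, seed=None):
    """Simple random sample of a quarter of the register, without replacement."""
    rng = random.Random(seed)
    return rng.sample(register, len(register) // 4)

def prevalence_with_ci(n_with_disease, n_surveyed, z=1.96):
    """Prevalence and an approximate 95% CI (normal approximation, an assumption)."""
    p = n_with_disease / n_surveyed
    se = sqrt(p * (1 - p) / n_surveyed)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# Hypothetical GP register of 2000 adults
register = [f"patient_{i}" for i in range(2000)]
sample = one_in_four_sample(register, seed=1)

# Suppose (hypothetically) 60 of the sampled adults have the disease
p, low, high = prevalence_with_ci(n_with_disease=60, n_surveyed=len(sample))
print(f"sample size = {len(sample)}, prevalence = {p:.1%} ({low:.1%} to {high:.1%})")
```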
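To see how survival feeds into prevalence, the sketch below uses the standard steady-state approximation for an uncommon disease, prevalence ≈ incidence × mean duration. The incidence and duration figures are invented purely to illustrate the heart disease example above.

```python
def approx_prevalence(incidence_per_year, mean_duration_years):
    """Steady-state approximation for an uncommon disease:
    prevalence is roughly incidence multiplied by mean disease duration."""
    return incidence_per_year * mean_duration_years

# Hypothetical: both groups develop heart disease at the same rate...
incidence = 0.005  # 5 new cases per 1000 people per year

# ...but survival with the disease differs (made-up durations)
affluent = approx_prevalence(incidence, mean_duration_years=10)
deprived = approx_prevalence(incidence, mean_duration_years=6)

print(f"affluent prevalence ~ {affluent:.1%}, deprived prevalence ~ {deprived:.1%}")
```

Under these assumptions the deprived group shows the lower prevalence even though the incidence is identical, which is exactly the pattern a cross-sectional study could misread.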
How should you go about analysis of cross-sectional studies?
The outcome variables should be summarised, using statistics appropriate to the type of variable. Associations can be initially assessed by computing an appropriate measure of effect (odds ratio, mean difference etc) and 95% confidence interval, and statistical significance determined from the appropriate test (eg chi-squared test, t-test, non-parametric test etc).
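For a continuous outcome, one possible measure of effect is the difference in means with a 95% confidence interval. The Python sketch below uses a normal approximation (z = 1.96) for simplicity, which is an assumption; with samples as small as these hypothetical ones, a t-based interval would be somewhat wider. The blood pressure values and the function name are illustrative only.

```python
from math import sqrt
from statistics import mean, variance

def mean_difference_ci(group_a, group_b, z=1.96):
    """Difference in means (a minus b) with an approximate 95% CI.

    Uses unpooled sample variances and a normal approximation (an assumption).
    """
    diff = mean(group_a) - mean(group_b)
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return diff, diff - z * se, diff + z * se

# Hypothetical systolic blood pressure (mmHg) by exposure group
exposed = [142, 138, 150, 147, 135, 144, 152, 140]
unexposed = [130, 128, 136, 133, 125, 131, 138, 127]

diff, low, high = mean_difference_ci(exposed, unexposed)
print(f"mean difference = {diff:.1f} mmHg (95% CI {low:.1f} to {high:.1f})")
```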
What kinds of things are cross-sectional studies appropriate for?
This method is appropriate for investigating some health outcome of interest eg prevalence of a disease, and as a first step in identifying risk factors for a disease.
It is not suitable when the outcome or exposure of interest is rare as you may end up with very few (or even no) people in your sample with the outcome/exposure and therefore cannot determine associations. More than one disease or exposure can be assessed in the same study, although as with any study, an a priori primary hypothesis should be stated. Changes in prevalence can be assessed by carrying out a series of cross-sectional studies. This study design is not suitable for looking at incidence or natural history of a disease.
What are the advantages of cross-sectional studies?
Advantages:
- can be used to examine how much disease there is in a population, and look at cross-sectional associations between exposure and disease
- can look at more than one disease and more than one exposure
- can be relatively inexpensive and quick to conduct, and drop-out is not a problem as there is no follow-up
What are the disadvantages of cross-sectional studies?
Disadvantages:
- disease and exposure are measured at the same time therefore no temporal association can be made
- not suitable for studying rare exposures or rare outcomes
- high possibility of recall/reporting bias
Describe what case-control studies are.
The case-control study is a useful study design as it is suitable for looking at risk factors for rare diseases. However, it cannot look at how much disease there is (prevalence/incidence), only at whether associations with exposures exist.
In a case-control study subjects are selected on the basis of the presence or absence of disease. A group of individuals with the disease (cases) are selected, along with a comparison group of individuals without the disease (controls). This method of sampling reduces the number of disease-free people needed to be studied (hence good for rare diseases). The exposure of interest is then measured in the two groups (this may be past or current exposure) and compared. The effect of exposure on the risk of disease is estimated using the odds ratio.
Disease frequency cannot be measured because subjects are chosen or sampled according to their disease status. To obtain a prevalence estimate, a cross-sectional study is needed, and to obtain an incidence estimate a cohort study should be conducted.
When is a case-control study appropriate?
A case-control study is appropriate when you have a single disease of interest that may be rare, and you want to look at associations with one or more exposure(s) that are relatively common.
Describe what things need to be considered when choosing your cases for a case-control study.
When choosing your cases, the following need to be considered:
- what are the case definition criteria? In other words what criteria will be used to define the outcome / disease of interest, eg clinical diagnostic criteria, laboratory results, coding on a death certificate etc. A poor choice could result in misclassification (ie cases not all truly having the outcome or controls including some diseased).
- what are the eligibility criteria for the selection of individuals for the study, eg are you interested in just adults or just children? The criteria may be to restrict the study to those potentially at risk of the outcome of interest - for example if you were studying cervical cancer, then you would only want to include women. Criteria may also be chosen to restrict to individuals who were potentially at risk of exposure - for example, if your exposure was oral contraceptive use, you may exclude women who are post-menopausal or pregnant.
- what is the source of the cases? Case-control studies are usually either hospital-based or population-based. In a hospital-based study we might choose our cases from those attending a particular hospital. In a population-based study cases are taken from a defined population (eg geographical area) over a fixed period of time. Population-based studies are often more difficult to conduct, and to identify all the cases in a particular population it may be necessary to use more than one source. The choice of hospital-based or population-based will often depend on the severity of the disease of interest. Remember that the cases in your study should be representative of all those with the disease in your target population. So if you want to study the association between an exposure and disease in Nottingham city, choosing cases from one health centre will not be representative of all cases in Nottingham.
- are prevalent or incident cases to be included? Prevalent cases are all those with disease at a particular point (or short period) in time. Incident cases are new cases that arise within a fixed period of time. Prevalent cases differ from incident cases in that they will include individuals who have had the condition for some time. Such cases are likely to be the ones that survive longer or take longer to recover. Furthermore they may have changed their exposure (eg diet, exercise, smoking habit) because of the disease diagnosis, leading to incorrect ascertainment of exposure.
Describe what needs to be considered when choosing your controls for a case control study.
When choosing your controls, the following need to be considered:
- controls should be free from the disease of interest
- controls should fulfil the same eligibility criteria as cases. For example, if cases were restricted to pre-menopausal women aged 18 to 45, then controls should also be pre-menopausal women in this age group.
- the source of the control group, which is not always obvious, is dependent on the source of the cases. Controls are often chosen wrongly and this is where selection bias is introduced. The basic rule is that controls should be drawn from populations that gave rise to the cases. In other words, if the control had developed the disease, he or she would have been included in the study as a case.
- population-based controls: if the cases are a population-based sample of all incident cases over a specific time period, then controls should be selected from this same population during this time period. If the cases are identified through hospital admissions or another health facility, and the hospital has a defined population base and sees all cases in that population, then again, that population would provide the controls.
- hospital controls: often a hospital does not have a defined population base from which the cases it sees come. If there is some selection process that affects whether a case reaches the hospital or not, then it may be more appropriate to choose your controls from other hospital patients. When choosing hospital controls, special care needs to be taken. The purpose of controls is to represent exposure in the population the cases came from, but it is easy to inadvertently pick a group of hospital patients as controls who have a particularly high or low prevalence of exposure. For example, if we were looking at the association between alcohol consumption and liver disease, choosing controls from A and E (who are likely to be higher alcohol consumers than the general population) would give us a biased result. As a general rule, the selection of hospital controls should exclude those individuals who are identified by medical conditions (or backgrounds) that are known to be associated with the exposure of interest.
- sometimes researchers include more than one control group, eg a community-based control group and a hospital-based control group. This can be tempting to do when it is not obvious who the controls should be. However this can lead to problems with interpretation if the two analyses produce different results.
What are the possible problems that may be associated with case-control studies?
Because in case-control studies the disease status is already known at the time exposure data are collected, information bias can be a problem. As described for cross-sectional studies, recall or reporting bias can be introduced. Another type of information bias common in case-control studies is observer or interviewer bias. This arises when the investigator knows who the cases are, which influences the way in which data are collected. For example, the interviewer may probe more deeply for information or prompt the respondent if the subject is a case. To overcome this, the investigator should be blind to the hypothesis under study and the case/control status of each subject, and the same forms/questions should be used for cases and controls.
As with cross-sectional studies, reverse causation is a possibility. Often data on past exposure are collected which helps eliminate the possibility, but without data on the timing of exposure and disease onset, it is difficult to eliminate completely.
Describe how you should go about analysing case-control data.
Data from case control studies are initially analysed by cross-tabulating the outcomes (case-control status) against the exposure.
When computing percentages previously, we computed the percentage diseased in each exposure group. Whilst this makes sense for data sets in which the subjects have been randomly selected (eg cross-sectional surveys or trials), case-control studies are different because of the way in which the diseased and disease-free are selected. Therefore in case-control studies, we actually want to compare the percentage exposed amongst cases with the percentage exposed amongst controls, not the percentage diseased amongst exposed and unexposed. To test statistical significance the chi-squared test is used since the variables are categorical.
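The cross-tabulation and chi-squared test for a 2x2 case-control table can be sketched in Python as below. The counts are hypothetical, the function names are chosen here for illustration, and the test shown is the Pearson chi-squared without continuity correction (statistical packages may apply a correction by default, so results can differ slightly).

```python
from math import erfc, sqrt

def percent_exposed(exposed, unexposed):
    """Percentage exposed within one group (cases or controls)."""
    return 100 * exposed / (exposed + unexposed)

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table:

                  exposed  unexposed
        cases        a         b
        controls     c         d
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-squared distribution with 1 degree of freedom
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical case-control data
a, b = 60, 40   # cases: exposed, unexposed
c, d = 40, 60   # controls: exposed, unexposed

pct_cases = percent_exposed(a, b)
pct_controls = percent_exposed(c, d)
chi2, p_value = chi_squared_2x2(a, b, c, d)
print(f"exposed: {pct_cases:.0f}% of cases vs {pct_controls:.0f}% of controls")
print(f"chi-squared = {chi2:.2f}, p = {p_value:.4f}")
```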
The only measure of effect suitable for case-control data is the odds ratio. This is because the design of case-control studies means that the risk of disease, and hence the risk ratio, cannot be estimated. The 95% confidence interval around the odds ratio should also be computed to tell us how precise this estimate is likely to be. This can be done in SPSS, but remember to always compute the odds ratio first by hand and check it matches that displayed in SPSS (and if necessary recode to get the cross-tab table in the right format, or take reciprocals of the values displayed). Odds ratios can be interpreted in the same way as risk ratios.
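A hand calculation of the odds ratio can be mirrored in a few lines of Python. This sketch uses the common log-based (Woolf) confidence interval, which is one of several available methods and not necessarily the one a given package reports; the counts and the function name are hypothetical.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI (log/Woolf method) for the table:

                  exposed  unexposed
        cases        a         b
        controls     c         d
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = exp(log(odds_ratio) - z * se_log_or)
    high = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high

# Hypothetical case-control data
odds_ratio, low, high = odds_ratio_ci(a=60, b=40, c=40, d=60)
print(f"OR = {odds_ratio:.2f} (95% CI {low:.2f} to {high:.2f})")

# If the table is laid out with the exposure columns swapped,
# the result is the reciprocal of the odds ratio above.
swapped, _, _ = odds_ratio_ci(a=40, b=60, c=60, d=40)
print(f"swapped layout gives {swapped:.3f} = 1 / {odds_ratio:.2f}")
```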
In most studies there are potential confounders that need to be considered since they may be distorting the relationship between the exposure and disease. So what we would actually like is an estimate of the odds ratio which has had the effect of the confounder(s) removed. In other words, we want an adjusted odds ratio. There are sophisticated statistical methods for obtaining an adjusted odds ratio, called multivariate models. When the outcome is binary, such as in case-control studies, multiple logistic regression is the appropriate multivariate method.
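Logistic regression needs a statistics package, but the idea of an adjusted odds ratio can be illustrated with a simpler stratification-based technique, the Mantel-Haenszel pooled odds ratio. Note that this is a different method from the multiple logistic regression named above, shown here only because it fits in a few lines. The two strata (levels of a hypothetical confounder) and all counts are invented.

```python
def crude_odds_ratio(tables):
    """Odds ratio from the single table formed by adding all strata together."""
    a = sum(t[0] for t in tables)
    b = sum(t[1] for t in tables)
    c = sum(t[2] for t in tables)
    d = sum(t[3] for t in tables)
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel odds ratio pooled across strata.

    Each table is (a, b, c, d) = (exposed cases, unexposed cases,
    exposed controls, unexposed controls) within one stratum.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Hypothetical data stratified by a confounder with two levels
strata = [
    (40, 10, 20, 10),   # confounder present
    (10, 40, 10, 80),   # confounder absent
]

crude = crude_odds_ratio(strata)
adjusted = mantel_haenszel_or(strata)
print(f"crude OR = {crude:.2f}, adjusted (Mantel-Haenszel) OR = {adjusted:.2f}")
```

In this made-up data the odds ratio within each stratum is the same, so the gap between the crude and adjusted values reflects the distortion caused by the confounder.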