Week 7 - Selection Bias, sampling methods and information bias Flashcards

1
Q

What is random error?

A

Random error is error introduced solely by chance and is
inherent in the sampling process

2
Q

What is systematic error?

A

Also called bias
Systematic error is introduced via manmade actions relating to the conduct of a study

3
Q

What is the sample vs. true population?

A

We do not measure the true population measure (mean,
%, etc) but an estimate of that based on representative
sampling

4
Q

How can we decrease the random error in epidemiological studies?

A
  • Random error (chance) decreases as the sample size
    increases
  • Sampling error goes down to zero if the total population
    is included
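A minimal simulation sketch of this point, assuming a hypothetical population of 2,000 measurements (the values, sample sizes and function name are illustrative, not from the course material). It draws repeated random samples and reports how much the sample mean varies from sample to sample.

```python
import random
import statistics

def spread_of_sample_means(population, sample_size, repeats=200, seed=0):
    """Standard deviation of the sample mean across repeated random samples."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.pstdev(means)

# Hypothetical population: 2,000 measurements centred on 50
rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(2000)]

small = spread_of_sample_means(population, 20)    # small samples
large = spread_of_sample_means(population, 500)   # large samples
# "Sampling" everyone always returns the same mean, so the spread vanishes
census = spread_of_sample_means(population, len(population), repeats=5)

print(f"n=20: {small:.3f}  n=500: {large:.3f}  whole population: {census:.3f}")
```

The spread for n=500 should come out well below the spread for n=20, and the whole-population case should show no spread at all.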
5
Q

What is a confidence interval of sample estimates?

A
  • A confidence interval indicates the level of uncertainty
    around the estimated measure
  • Most studies report the 95% confidence interval (95% CI)
  • The 95% CI is a range within which we can be 95%
    confident that the true population measure lies; strictly,
    if the study were repeated many times, about 95% of the
    intervals calculated this way would contain the true value
  • The larger the sample size, the narrower the 95% CI
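As a rough illustration, the sketch below computes a 95% CI for a sample proportion using the normal (Wald) approximation. The choice of this approximation, the function name and the example counts are assumptions made for illustration; other interval methods exist and may be preferable for small samples.

```python
import math

def proportion_ci_95(cases, n):
    """Approximate 95% CI for a proportion (normal/Wald approximation)."""
    p = cases / n
    standard_error = math.sqrt(p * (1 - p) / n)
    margin = 1.96 * standard_error  # 1.96 is the z-value for 95% confidence
    return p - margin, p + margin

# Same observed proportion (30%) in a small and a large hypothetical sample
small_low, small_high = proportion_ci_95(30, 100)
large_low, large_high = proportion_ci_95(300, 1000)

print(f"n=100:  {small_low:.3f} to {small_high:.3f}")
print(f"n=1000: {large_low:.3f} to {large_high:.3f}")
```

Both intervals are centred on 0.30, but the interval from the larger sample is narrower.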
6
Q

How can we lower systematic error?

A
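  • Systematic error (bias) is not influenced by sample size, so
    a larger sample will not reduce it; it has to be addressed
    through the design and conduct of the study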
7
Q

What is selection bias?

A
  • Selection bias is systematic error resulting from the fact
    that the participants included in the study are not
    representative of the population from where they were
    selected (source population)
  • Selection bias leads to a biased sample, which will almost
    always give rise to biased estimates
  • The sampling method of choice plays a major role in the
    representativeness of the sample
8
Q

What is a representative sample?

A
9
Q

What is a non-representative sample?

A
10
Q

What are the three sampling methods?

A
  1. Probability (random) sampling: sample selected by
    probabilistic methods; involves random selection,
    allowing you to make strong statistical inferences about
    the whole group
  2. Systematic sampling: sample selected according to some
    simple, systematic rule
  3. Non-probability sampling: sample selected by easily
    employed (convenient) methods; involves non-random selection
    based on convenience or other criteria, allowing you to
    easily collect data.
11
Q

Sampling methods summary

A
12
Q

What is simple random sampling?

A
  • Often referred to simply as ‘random sampling’
  • The most straight-forward of all random sampling methods
  • All individuals in the sampling frame have the same
    probability of being selected independently of all others
  • It is mainly used in quantitative research.
  • Given a large sample size, random sampling ensures the
    chosen individuals are representative of the source
    population
    – Demography (e.g. age, sex, ethnicity)
    – Other important factors (e.g., clinical history, current disease status,
    lifestyle factors, etc.)
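A minimal sketch of simple random sampling, assuming the sampling frame is available as a plain list (the frame contents and the function name are hypothetical). Python's `random.sample` draws without replacement, so every individual in the frame has the same chance of selection.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n individuals from the frame, each with equal probability."""
    rng = random.Random(seed)
    return rng.sample(sampling_frame, n)  # sampling without replacement

# Hypothetical sampling frame of 1,000 individuals
frame = [f"person_{i}" for i in range(1, 1001)]
sample = simple_random_sample(frame, 50, seed=1)

print(len(sample), "individuals selected, e.g.", sample[:3])
```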
13
Q

What are the advantages and disadvantages of Simple Random Sampling?

A

Advantages
* Ensures a representative
sample from the source
population
– Provided that the sample size is
large enough
* Less costly and less time
consuming than other more
sophisticated sampling
methods
* Ideal for quantitative studies
& test of hypothesis
Disadvantages
* If the sampling frame is too
large and/or the population
is geographically diverse it
may be impractical to
perform
* If a large sample is required,
simple random sampling
may be time consuming and
costly

14
Q

What is Stratified Random Sampling?

A
  • Same principles as simple random sampling but
    within strata (subgroups) of the population
    – in terms of key demographic characteristics
  • The size of the random sample should be proportional
    to the specific stratum size in the population
15
Q

An example of stratified random sampling.

A
  • The company has 800 female employees and
    200 male employees.
  • You need a sample of 100
  • You sort the population into two strata based
    on gender.
  • You want to ensure that the sample reflects
    the gender balance of the company so you use
    random sampling on each group, selecting 80
    women and 20 men, which gives you a
    representative sample of 100 people.
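The company example can be sketched in code as follows. The employee identifiers and the function name are made up for illustration; the allocation simply rounds each stratum's proportional share, which works out exactly here but could be off by one or two in other settings.

```python
import random

def proportional_stratified_sample(strata, total_n, seed=None):
    """Random sample within each stratum, sized in proportion to the stratum."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation: the stratum's share of the population
        n_stratum = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, n_stratum)
    return sample

# Hypothetical employee lists: 800 women and 200 men
strata = {
    "female": [f"F{i}" for i in range(800)],
    "male": [f"M{i}" for i in range(200)],
}
sample = proportional_stratified_sample(strata, 100, seed=1)

print({name: len(members) for name, members in sample.items()})
```

With these inputs the sample contains 80 women and 20 men, matching the gender balance of the company.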
16
Q

What is the procedure for Stratified Random Sampling?

A
17
Q

What are the advantages and disadvantages of Stratified Random Sampling?

A

Advantages
* It allows you to draw more
precise conclusions by
ensuring that every
subgroup is properly
represented in the sample.
* Enables the comparison of
population sub-groups
Disadvantages
* More time-consuming than
simple random sampling
* Higher complexity might
give rise to errors (e.g.
stratification not conducted
properly)

18
Q

What is cluster sampling?

A
  • Based on the hierarchical structure of natural clusters
    (groups) of individuals within the population
    – Natural clusters may be hospitals, schools, streets, city
    districts, etc.
  • Involves taking a random sample of these natural clusters,
    and then selecting all individuals in the selected clusters
  • The sampling frame is a list of all clusters.
  • If it is practically possible, you might include every
    individual from each sampled cluster. If the clusters
    themselves are large, you can also sample individuals from
    within each cluster using one of the techniques above
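A minimal sketch of one-stage cluster sampling, assuming the clusters are schools stored in a dictionary (the school and pupil names are hypothetical). Note that the random draw is made from the list of clusters, not from the list of individuals.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every individual in them."""
    rng = random.Random(seed)
    # The sampling frame is the list of cluster names
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population: 10 schools with 30 pupils each
clusters = {
    f"school_{s}": [f"s{s}_pupil_{p}" for p in range(30)]
    for s in range(10)
}
sample = cluster_sample(clusters, 3, seed=1)

print(sorted(sample), sum(len(pupils) for pupils in sample.values()))
```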
19
Q

What are cluster sampling?

A
20
Q

What are the advantages and disadvantages of cluster sampling?

A

Advantages
* Good for dealing with large
and dispersed populations
* Less costly and less time
consuming
Disadvantages
* Substantial differences between
clusters can cause errors
* It’s difficult to guarantee that the
sampled clusters are really
representative of the whole
population
* Representativeness may be
compromised if
– Too few clusters are selected and/or
– Clusters are too specific and/or
– Clusters contain too few individuals

21
Q

What is multi-stage sampling?

A
  • Utilizes the hierarchical structure of natural clusters (groups)
    of individuals within the population
    – Similarly to cluster sampling
  • After randomly selecting clusters, there is a random
    selection of individuals within the cluster
  • May involve several random sampling stages:
    – Stage 1: Random selection of large clusters e.g. schools
    – Stage 2: Random selection of smaller clusters within large clusters
    e.g. class
    – Stage 3: Random selection of individuals within smaller clusters
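The three stages can be sketched as nested random draws. The school, class and pupil structure below is hypothetical, and the numbers selected at each stage are arbitrary choices for illustration.

```python
import random

def multi_stage_sample(schools, n_schools, n_classes, n_pupils, seed=None):
    """Stage 1: schools; stage 2: classes within schools; stage 3: pupils."""
    rng = random.Random(seed)
    selected = []
    for school in rng.sample(sorted(schools), n_schools):            # stage 1
        classes = schools[school]
        for class_name in rng.sample(sorted(classes), n_classes):    # stage 2
            selected.extend(rng.sample(classes[class_name], n_pupils))  # stage 3
    return selected

# Hypothetical population: 6 schools, 4 classes each, 25 pupils per class
schools = {
    f"school_{s}": {
        f"class_{c}": [f"s{s}_c{c}_p{p}" for p in range(25)]
        for c in range(4)
    }
    for s in range(6)
}
sample = multi_stage_sample(schools, n_schools=2, n_classes=2, n_pupils=5, seed=1)

print(len(sample), "pupils selected")
```

Here 2 schools x 2 classes x 5 pupils gives a sample of 20 pupils.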
22
Q

What are the advantages and disadvantages of Multi-stage Sampling?

A

Advantages
* Multi-stage sampling may
improve sample
representativeness (compared to
simple random sampling)
– Especially if the population is
geographically diverse and/or the
sample is too small
* Less costly and less time
consuming (depending on the
number of stages however)
Disadvantages
* The representativeness of the
sample may be compromised if
– Too few clusters are selected
and/or
– Clusters are too specific and/or
– Clusters contain too few
individuals

23
Q

What is Systematic Sampling?

A
  • Sample selected according to some simple, systematic rule,
    but not randomly
  • Sample may end up being equivalent to a simple random
    sample, provided there was no biasing pattern in the system
    of selection
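One common systematic rule is to take every k-th person from an ordered list. The sketch below assumes that rule (the lists and the function name are hypothetical) and also shows how a pattern in the list can bias the result.

```python
def systematic_sample(ordered_list, step, start=0):
    """Select every `step`-th individual, beginning at position `start`."""
    return ordered_list[start::step]

# Hypothetical clinic list of 100 patients in order of arrival
patients = [f"patient_{i}" for i in range(1, 101)]
sample = systematic_sample(patients, step=10)

# Biasing pattern: if the list alternates M, F, M, F, ... then an even
# step starting at position 0 only ever lands on "M"
alternating = ["M", "F"] * 50
biased = systematic_sample(alternating, step=10)

print(sample)
print(set(biased))
```

The first sample contains 10 patients spread evenly through the list, while the second contains only men because the selection rule lines up with the pattern in the list.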
24
Q

What is the Systematic Sampling procedure?

A
25
Q

What are the advantages and disadvantages of Systematic Sampling?

A

Advantages
* An acceptable, more
convenient, alternative
approach if for some reason
random sampling is not
possible
* Faster and possibly also
cheaper
Disadvantages
* The representativeness of
the sample may be
compromised if the system
of choice selects individuals
in a non-random fashion

26
Q

What is Proportional Quota Sampling?

A
  • Same principle as stratified random sampling
    – The sample is selected in a weighted manner based on
    predefined strata (distinct population subgroups)
  • Instead of being filled by random sampling, the strata
    are filled by non-random sampling (systematic or
    other)
    – For example, if a total sample size of 1000 is required and
    the population consists of 40% women and 60% men, then
    (non-random) sampling will continue until these percentages
    are obtained and the overall sample quota met
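The 40% / 60% example can be sketched as filling quotas in order of arrival. The arrival list below is hypothetical, and taking people in the order they turn up is just one possible non-random rule.

```python
def quota_sample(arrivals, quotas):
    """Accept people in arrival order until every stratum quota is filled."""
    remaining = dict(quotas)
    selected = []
    for person_id, group in arrivals:
        if remaining.get(group, 0) > 0:
            selected.append((person_id, group))
            remaining[group] -= 1
        if not any(remaining.values()):  # all quotas met
            break
    return selected

# Hypothetical stream of volunteers: every third arrival is a woman
arrivals = [(i, "woman" if i % 3 == 0 else "man") for i in range(5000)]
# Total sample of 1000 with quotas of 40% women and 60% men
sample = quota_sample(arrivals, {"woman": 400, "man": 600})

women = sum(1 for _, group in sample if group == "woman")
men = sum(1 for _, group in sample if group == "man")
print(women, men)
```

The quotas guarantee the 400 / 600 split, but nothing guarantees that the early arrivals resemble the rest of each subgroup.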
27
Q

What are the advantages and disadvantages of Proportional Quota Sampling?

A

Advantages
* An acceptable, more
convenient, alternative
approach if for some reason
stratified random sampling
is not possible
* Compared to simple
systematic sampling, could
ensure the original
population structure as it
uses predefined population
strata
Disadvantages
* The representativeness of
the sample may be
compromised as individuals
are selected in a nonrandom fashion

28
Q

What is convenience sampling?

A
  • Convenience sampling is the most frequent example of
    non-probability sampling
  • Individuals are selected in a non-random fashion, solely
    based on convenience (i.e. they are easy to access)
29
Q

What is the Convenience Sampling procedure?

A
30
Q

What are the advantages and disadvantages of Convenience Sampling?

A

Advantages
* Cheap, fast and convenient
Disadvantages
* The representativeness of
the sample will definitely be
compromised as individuals
are selected in a nonrandom fashion

31
Q

How do you know which sampling method to choose?

A
  • Depends on:
    – The aim of the study
    – The nature of the source population
    – The sample size
    – Other practical issues (i.e. financial resources, time availability, etc.)
  • When no financial and time constraints exist:
    – Always strongly advised to use probability (random) sampling techniques
    in order to minimize selection bias
    – Stratified random sampling is the ideal method if the sample is small
  • When non-random sampling techniques have been used:
    – The representativeness of the sample is always questionable
    – Assume that selection bias is operating to some extent
32
Q

How does sampling method affect descriptive research?

A

In descriptive research (e.g. investigating the prevalence of a
disease in a population):
– Extremely important to have a perfectly representative sample, as
selection bias will greatly influence the findings

33
Q

How does sampling method affect analytic research?

A

In analytic research (i.e. investigating exposure-outcome
associations):
– Minor deviations from a perfectly representative sample may be
acceptable
* Minor selection bias may not affect the findings to a large extent

34
Q

Which sampling method is not preferred?

A

Convenience Sampling

35
Q

What are the 2 types of systematic error (bias)?

A
  1. Selection bias: Systematic error arising from
    mistakes made during the selection of the
    study sample.
  2. Information bias: Systematic error arising from
    mistakes made during the measurement of
    key study variables (exposure and outcome).
36
Q

What is information bias?

A
  • Information bias arises from wrong / inaccurate
    assessment of either the exposure or the outcome
    variables
  • Such mistakes may arise from the researchers’ part
    (unintentionally) or from the participants’ part
    (unintentionally or intentionally)
  • There is also instrument bias (fault of the instrument)
    which falls under researcher’s part
37
Q

What is assessor bias?

A
  • Wrong/inaccurate diagnosis due to a clinical error
  • May occur when researchers are not “blinded” to exposure or
    outcome status of participants
  • Wrong/inaccurate measurements due to a faulty
    instrument/machine
  • Wrong/inaccurate measurements due to poor training of
    assessor
  • Mistakes during recording of the data and transferring data
    from paper form into electronic form
38
Q

How can information bias arise from participant action / misinterpreting?

A
  • Wrong/inaccurate answers from participants due to
    misinterpretation of a question
  • Wrong/inaccurate answers from participants due to a
    sensitive issue relating to the question
  • Wrong/inaccurate answers from participants due to poor
    recall (recall bias)
  • Wrong/inaccurate answers from participants intentionally
  • Overall, information bias arising from participant actions
    is called response bias
39
Q

What are the 6 types of information bias?

A
  1. Recall Bias
  2. Interviewer Bias
  3. Observer bias
  4. Hawthorne effect
  5. Surveillance bias
  6. Misclassification bias
40
Q

What is recall bias?

A

Participants with a particular outcome or exposure
may remember events more clearly or amplify their recollections.
Very common in case-control studies, where the primary difference
arises more from under-reporting of exposures in the control group
than from over-reporting in the case group

41
Q

What is interviewer bias?

A

A researcher’s knowledge may influence the
structure of questions and the manner of presentation, which may
influence responders – any study design (especially if they are not
blinded to exposures)

42
Q

What is observer bias?

A

Researchers may have preconceived expectations of
what they should find in an examination (especially if they are not
blinded to exposures or medical history)

43
Q

What is the Hawthorne effect?

A

Participants act differently if
they know they are being watched.

44
Q

What is Surveillance bias?

A

The group with the known
exposure or outcome may be followed more closely
or longer than the comparison group (researcher’s
bias).

45
Q

What is Misclassification bias?

A

Errors are made in
classifying either disease or exposure status
(instrument).

46
Q

What are the two types of errors?

A
  1. Systematic error:
    a. Information error
    b. Selection error
  2. Random error
47
Q

How can you minimize bias?

A
  • Be purposeful in the study design to minimize the chance
    for bias; Example: use more than one control group
  • Define, a priori, who is a case or what constitutes
    exposure so that there is no overlap; Define categories
    within groups clearly (age groups, aggregates of person
    years)
  • Set up strict guidelines for data collection
    – Train observers or interviewers to obtain data in the
    same fashion
    – It is preferable to use more than one observer or
    interviewer, but not so many that they cannot be
    trained in an identical manner
    – Optimize questionnaire
48
Q

How does information bias affect study results?

A
  1. Fundamental principle of research:
    If you want to investigate any association between two
    factors, first make sure you measure these two factors
    accurately!
  2. Information bias can be introduced in the assessment of
    both the main exposure and the main outcome, thus the
    association between them will definitely be distorted
  3. Information bias arising from participant actions is much
    more common compared to information bias arising
    from researcher actions
  4. Information bias affects mainly studies that rely on self-reports (i.e. questionnaire-based data collection)
    – In outcome assessment (measurement), in studies where self-reported disease status is used, there is usually double-checking
    (confirmation) with the personal GP of the participant or
    through medical records
    – Similarly, while assessing exposures (diet, physical activity,
    smoking, educational attainment, etc.), the most valid and
    reliable instruments have to be used
  5. If a study relies solely on self-reports, then it should be
    assumed that information bias (measurement error) is
    operating to some extent
  6. The presence of information bias always compromises
    the validity of the study results and in such a case,
    findings have to be interpreted with great caution
49
Q

What should all assessment tools have?

A
  1. Validity
  2. Reliability
50
Q

What is validity?

A

The extent to which an assessment tool (e.g.
questionnaire, instrument, etc.) measures accurately what it is
intended to measure

51
Q

What is criterion validity?

A

Criterion validity is the most common type of validity used in
medical research. In such a case, the results from the
assessment tool of interest are compared with those of an
established (known as gold standard) assessment tool
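One way to summarise such a comparison for a yes/no measure is with sensitivity and specificity. These two summary measures, the function name and the ten results below are assumptions added for illustration; they are not taken from the course material.

```python
def sensitivity_specificity(tool_results, gold_results):
    """Compare a tool's yes/no results with those of the gold standard."""
    pairs = list(zip(tool_results, gold_results))
    true_pos = sum(1 for tool, gold in pairs if tool and gold)
    true_neg = sum(1 for tool, gold in pairs if not tool and not gold)
    false_pos = sum(1 for tool, gold in pairs if tool and not gold)
    false_neg = sum(1 for tool, gold in pairs if not tool and gold)
    sensitivity = true_pos / (true_pos + false_neg)  # diseased correctly found
    specificity = true_neg / (true_neg + false_pos)  # healthy correctly cleared
    return sensitivity, specificity

# Hypothetical results for 10 participants (True = disease present)
gold = [True, True, True, True, False, False, False, False, False, False]
tool = [True, True, True, False, False, False, False, False, True, False]

sensitivity, specificity = sensitivity_specificity(tool, gold)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Values close to 1 for both measures would suggest the tool agrees well with the gold standard.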

52
Q

What is reliability?

A

The overall consistency of a measure, as regards
producing the same results when administered under the
same conditions in the same group of people. Also known as
reproducibility or repeatability

53
Q

What are the two main types of reliability?

A
  1. Inter-observer reliability: The degree of agreement between
    the results when two or more researchers (observers)
    administer the assessment tool on the same people under the
    same conditions
  2. Intra-observer reliability: Describes the agreement between
    results when the assessment tool is used by the same
    researcher (observer) on two or more occasions (under the
    same conditions and in the same test population)
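A very simple way to quantify agreement is the proportion of people on whom two sets of ratings match. This measure, the function name and the ratings below are illustrative assumptions; it does not correct for agreement expected by chance.

```python
def percent_agreement(ratings_a, ratings_b):
    """Proportion of people for whom two sets of ratings are identical."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both observers must rate the same people")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)

# Hypothetical blood pressure classifications of the same 8 people
observer_1 = ["high", "normal", "normal", "high", "normal", "high", "normal", "normal"]
observer_2 = ["high", "normal", "high", "high", "normal", "high", "normal", "normal"]

agreement = percent_agreement(observer_1, observer_2)
print(f"inter-observer agreement: {agreement:.3f}")
```

The same function could be applied to two occasions of measurement by one observer to look at intra-observer reliability.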
54
Q

What is internal validity?

A

If a determination is made that the findings of a study were
not due to any of the three main sources of error (chance, bias
and confounding), then the study is considered internally valid.
In other words, the conclusions reached are likely to be
correct for the circumstances of that particular study.

55
Q

What is external validity?

A

External validity is the extent to which the findings can be
generalized to other circumstances. A study being internally valid
does not necessarily mean that it is also externally valid

56
Q

NB!

A

DO NOT COMPROMISE INTERNAL VALIDITY IN THE GOAL OF GENERALISATION