The Metagenome Flashcards

1
Q

What is Metagenomics?

A

Metagenomics is the study of genetic material recovered directly from environmental or biological systems/compartments

2
Q

Define microbiome and microbiota

A

Microbiome:
“a characteristic microbial community occupying a reasonably well-defined habitat which has distinct physio-chemical properties. The term thus not only refers to the microorganisms involved but also encompasses their theatre of activity” [1]
Microbiota:
The ecological community of commensal and pathogenic microorganisms. It includes bacteria, archaea, protists, fungi and viruses.

3
Q

Give some examples of environmental and human microbiomes

A
Environmental:
•	Deep sea microbiome
•	Soil microbiome
•	Hospital microbiome
•	Subway microbiome
Human:
•	Gut microbiome
•	Skin microbiome
•	Oral microbiome
•	Vaginal microbiome
4
Q

How is the Microbiome unique to each individual?

A
  • The microbiome is unique to each individual, even between twins, which suggests that host genetics is not the main driver
  • Changes in the microbiome have been associated with multiple human illnesses, e.g. Irritable Bowel Syndrome, depression, cancer
  • The gut microbiome can classify individuals as lean or obese with >90% accuracy
  • Early-life gut microbiomes are linked to the development of allergic conditions, e.g. asthma
  • Example: Clostridium difficile infection (CDI)
    o The stool microbiome during CDI is quite different from a healthy stool microbiome
    o CDI has a greater effect on the stool microbiome than host genetic factors
    o Faecal microbiota transplant is able to cure CDI
    o Restoration of the stool microbiome to a healthy state is rapid following transplantation
  • It is not the combination of species that is relevant but the combination of genes. The metabolic pathways present in the microbiome are what affect the patient.
5
Q

What are the technological approaches to sequencing the metagenome?

A

Targeted PCR amplification
- Uses a single marker gene that is known to vary between species
- The marker is used as a proxy to identify which organisms are in a sample
  o Bacteria: 16S rRNA gene
    - Largely conserved apart from the variable regions
    - The variable regions are used to try to separate species
  o Eukaryotes such as yeasts and other fungi: internal transcribed spacer (ITS) or 18S rRNA
Whole genome shotgun sequencing

6
Q

Describe 16S targeted PCR amplification workflow

A
  1. Collect a sample, which will contain a mixed population
  2. Extract the DNA from all the bacteria
  3. Carry out the 16S PCR amplification. Ideally, the amount of product for each bacterium reflects its abundance in the sample
  4. Load the PCR products onto the sequencing machine
  5. Analyse
    - Be careful of bias
      o Some bacteria will purify better than others
      o Some sequences will amplify better than others
    - Compare the sequences against a reference database. Common ones are:
      o Greengenes
      o SILVA
      o RDP
    - Generate a list of which taxa are present
    - Plot the result on a stacked bar chart
    - It is hard to reach species level using 16S, so results are usually limited to genus level
    - Multiple open source analysis pipelines are available:
      o QIIME
      o mothur
      o DADA2
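The "list of taxa, then stacked bar chart" step can be sketched as below. This is a minimal illustration only, assuming the reads have already been assigned to genera by one of the pipelines above; the function name `relative_abundance` and the example read labels are hypothetical and do not come from any pipeline.

```python
from collections import Counter

def relative_abundance(genus_assignments):
    """Convert per-read genus labels into fractions that sum to 1."""
    counts = Counter(genus_assignments)
    total = sum(counts.values())
    if total == 0:
        return {}  # no reads, nothing to report
    return {genus: n / total for genus, n in counts.items()}

# Hypothetical genus-level assignments for eight reads from one sample.
reads = ["Bacteroides"] * 4 + ["Faecalibacterium"] * 3 + ["Escherichia"]
profile = relative_abundance(reads)

# Each fraction becomes the height of one segment in that sample's stacked bar.
for genus, fraction in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{genus}: {fraction:.3f}")
```

Because of the extraction and amplification biases noted above, these fractions describe the sequenced reads, which may not match the true abundances in the sample.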
7
Q

Which variable region do we choose?

A
  • Phylogenetic signal
    o Does that variable region contain enough information to separate the species?
    o E.g. V4 is used for the gut microbiome, but it sometimes separates taxa less clearly than V1-V3
  • Amplicon length
    o How large can the PCR product be?
8
Q

How large can the PCR product be?

A
  • V1-V3 is roughly 500-600 bases long
    o With 2 x 300 bp paired-end sequencing, the two reads overlap only a little in the middle, by about 50-100 bases
    o Merging the two reads gives a single sequence of ~500 bp
    o BUT all sequencing technologies make random errors
      - Errors in the overlapping middle can be corrected, because that stretch is sequenced in both directions
      - Errors on the left and right are sequenced in one direction only, so they are not corrected and end up in the final data
  • A smaller region such as V1-V2 or just V4 is much shorter
    o The two reads overlap across nearly the whole amplicon, so sequencing errors can be corrected along its full length
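The overlap arithmetic above can be checked with a short sketch. It assumes two reads of equal length that start at opposite ends of the amplicon; the function name `paired_end_coverage` is hypothetical, and the 250 bp figure used for V4 is an assumed round number rather than a value from these notes.

```python
def paired_end_coverage(amplicon_len, read_len):
    """Return (overlap, single_covered) in bases for two equal-length reads
    that start at opposite ends of an amplicon."""
    # Bases read in both directions: errors here can be corrected on merging.
    overlap = max(0, min(amplicon_len, 2 * read_len - amplicon_len))
    # Bases read in one direction only: errors here stay in the final data.
    single_covered = min(amplicon_len, 2 * read_len) - overlap
    return overlap, single_covered

for name, length in [("V1-V3", 550), ("V4 (assumed length)", 250)]:
    overlap, single = paired_end_coverage(length, read_len=300)
    print(f"{name}: {length} bp amplicon, {overlap} bp overlap, "
          f"{single} bp covered by one read only")
```

With 2 x 300 bp reads, a 550 bp amplicon has a 50 bp overlap and 500 bp covered by only one read, whereas a 250 bp amplicon is covered in both directions along its whole length.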
9
Q

What controls to use?

A
  • The 16S rRNA gene is found in all bacteria, so any contaminating bacterial DNA will be amplified as well
  • The method is very sensitive to contamination from:
    o Environment
    o Operator (have only one person process the samples)
    o Reagents (most biology reagents may not be sterile)
  • Controls are especially important for low biomass samples
  • “Kitome”: we all use kits for the sequencing, so we could be sequencing contaminants in the molecular biology reagents rather than the samples
10
Q

How do we mitigate potential contamination?

A
  • Randomise samples
  • Note batch numbers of reagents
  • Sequence negative controls
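One way to combine randomisation with negative controls is sketched below. This is an illustrative layout only, not a prescribed protocol: the function name `randomise_into_batches`, the sample IDs, the batch size and the `NEG_CTRL_` labels are all hypothetical.

```python
import random

def randomise_into_batches(sample_ids, batch_size, seed=0):
    """Shuffle samples, split them into processing batches, and add one
    negative control to every batch."""
    rng = random.Random(seed)  # fixed seed so the layout can be reproduced
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    batches = []
    for start in range(0, len(shuffled), batch_size):
        batch = shuffled[start:start + batch_size]
        # The control is processed with the same reagents as the batch,
        # so reagent contamination shows up in its sequence data.
        batch.append(f"NEG_CTRL_{len(batches) + 1}")
        batches.append(batch)
    return batches

samples = [f"S{i}" for i in range(10)]
batches = randomise_into_batches(samples, batch_size=4)
for number, batch in enumerate(batches, start=1):
    print(f"Batch {number}: {batch}")
```

Recording the reagent batch numbers alongside each processing batch makes it possible to trace a contaminant seen in a negative control back to a specific kit lot.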
11
Q

Describe Whole genome shotgun workflow

A
  • The same workflow as 16S PCR amplification, except that all the DNA in the sample is sequenced rather than one amplified gene
    o There is no marker-gene bias here, as we are not just looking at bacteria
      - We are looking at everything, e.g. hosts, viruses, yeasts
  • Once we have the shotgun reads, we reassemble them into the genomes we think they came from
  • After assembly, we look at the taxonomic diversity, but we can also try to predict genes
    o By predicting genes, we can perhaps see which metabolic pathways are present
12
Q

What are the problems with Whole genome shotgun workflow?

A
  • Host cells are often in excess in the sample
  • There is no amplification step to enrich for bacterial DNA, so all the host DNA is sequenced too
  • The proportion of contaminating human reads depends on the sample type. Typical values:
    o Faecal: <10% human reads
    o Saliva, nasal, skin samples: >90% human reads
  • How to enrich without amplification?
    o Pre-extraction: differential lysis of mammalian cells
      - Enriches for intact microbial cells
      - Potential bias towards gram-positive bacteria
    o Post-extraction: enzymatic degradation of methylated nucleotides
      - Tends to target mammalian DNA
      - However, it introduces a bias against AT-rich bacterial genomes
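The effect of host contamination on usable sequencing depth is simple arithmetic, sketched below. The run size of 10 million reads and the function name `microbial_reads` are hypothetical; the 10% and 90% fractions are the boundary values quoted above, used purely for illustration.

```python
def microbial_reads(total_reads, human_fraction):
    """Estimate how many reads remain once human reads are discarded."""
    if not 0 <= human_fraction <= 1:
        raise ValueError("human_fraction must be between 0 and 1")
    return round(total_reads * (1 - human_fraction))

total = 10_000_000  # hypothetical sequencing run
faecal = microbial_reads(total, 0.10)
saliva = microbial_reads(total, 0.90)
print(f"Faecal sample: {faecal:,} non-human reads")
print(f"Saliva sample: {saliva:,} non-human reads")
```

For the same sequencing effort, the saliva sample here yields about a ninth of the non-human reads of the faecal sample, which is why host depletion is considered for host-rich sample types.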
13
Q

Give a summary of 16S PCR amplification and whole genome shotgun sequencing

A
Targeted 16S PCR amplification
-	Assess taxonomic diversity in sample
-	Biased, only bacteria
Whole genome shotgun sequencing
-	Assess taxonomic diversity in sample
-	Assess composite gene functions in sample
-	Unbiased, all micro-organisms