The Metagenome Flashcards
What is Metagenomics?
Metagenomics is the study of genetic material recovered directly from environmental or biological systems or compartments.
Define microbiome and microbiota
Microbiome
“a characteristic microbial community occupying a reasonably well-defined habitat which has distinct physio-chemical properties. The term thus not only refers to the microorganisms involved but also encompasses their theatre of activity” [1]
Microbiota
The ecological community of commensal and pathogenic microorganisms. Includes bacteria, archaea, protists, fungi and viruses.
Give some examples of environmental and human biomes
Environmental:
• Deep sea microbiome
• Soil microbiome
• Hospital microbiome
• Subway microbiome
Human:
• Gut microbiome
• Skin microbiome
• Oral microbiome
• Vaginal microbiome
How is the Microbiome unique to each individual?
- The microbiome is unique to each individual, even between twins, which suggests host genetics play only a limited part
- Changes in the microbiome have been associated with multiple human illnesses, e.g. Irritable Bowel Syndrome, depression, cancer.
- Gut microbiome can classify individuals as lean or obese with >90% accuracy
- Early-life gut microbiomes linked to development of allergic conditions e.g. asthma
- The stool microbiome during Clostridium difficile infection (CDI) is quite different from a healthy stool microbiome
- CDI has a greater effect on the stool microbiome than host genetic factors
- Faecal microbiota transplant is able to cure CDI
- Restoration of the stool microbiome to a healthy state is rapid following transplantation
- It is not the combination of species that is relevant, but the combination of genes. The metabolic pathways present in those microbiomes are what actually affect the patients.
What are the technological approaches to sequencing the metagenome?
Targeted PCR amplification
- Uses a single marker gene that is known to vary between species
o We use this as a proxy to try and identify what kind of organisms are in a sample
- For bacteria, we use the 16S rRNA gene
o Largely conserved apart from the variable regions
o We use the variable regions to try and separate species
- The internal transcribed spacer (ITS) and 18S rRNA are used for eukaryotes such as yeast and fungi
Whole genome shotgun sequencing
Describe 16S targeted PCR amplification workflow
- Collect a sample which will be a mixed population
- Extract the DNA from all the bacteria
- Then do the 16S PCR amplification. The amount of amplified product should reflect the abundance of each bacterium
- Then take the PCR products and load them onto the sequencing machine
- ANALYSE!
- Have to be careful of bias
o Some bacteria will purify better than others
o Also some sequences will amplify better than others
Take the sequences and compare them against a known database
- Common ones are:
o Greengenes
o Silva
o RDP
- Try and generate a list of which species are present
- Plot on a stacked bar chart
Hard to get to species level using 16S so we are stuck at the genus level
- Multiple open source analysis pipelines are available:
o QIIME
o Mothur
o DADA2
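A rough illustration of the step between "generate a list" and "plot on a stacked bar chart": the sketch below turns per-read genus assignments into relative abundances, which are the values a stacked bar chart would show for one sample. The genus labels and read counts are made up, and `relative_abundance` is a hypothetical helper, not a function from QIIME, Mothur or DADA2.

```python
from collections import Counter


def relative_abundance(assignments):
    """Convert a list of per-read genus labels into fractions that sum to 1."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {genus: n / total for genus, n in counts.items()}


# Hypothetical genus-level assignments for the reads of one sample
reads = ["Bacteroides"] * 6 + ["Faecalibacterium"] * 3 + ["Escherichia"] * 1
profile = relative_abundance(reads)

# Print the genera from most to least abundant
for genus, fraction in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{genus}: {fraction:.0%}")
```

Because of the extraction and amplification biases noted above, these fractions are best read as estimates rather than true cell proportions.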
Which variable region do we choose?
- Phylogenetic signal
o Does that VR contain enough information to separate the species?
o E.g. V4 is used for the gut microbiome but can sometimes separate species less clearly than V1-V3
- Amplicon length
o How large can the PCR product be?
How large can the PCR product be?
- If you choose V1-V3, the amplicon is roughly 500-600 bases long.
o If you sequence it as 2 x 300 bp paired-end reads
o You will find that the two reads overlap a little in the middle
Only by about 50-100 bases
o When you merge the two reads into one you get the ~500 bp sequence
BUT all sequencing technologies make random errors
If you had three sequencing errors (left, middle and right), the one in the middle can be corrected using the overlap
• But the errors on the left and right are not sequenced in both directions, so they won't be corrected and will end up in the final data.
- If you took a smaller region such as V1-V2 or just V4
o You get perfect overlap, and all the sequencing errors would be corrected.
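The overlap arithmetic can be checked with a small sketch. It assumes each read starts at one end of the amplicon and runs inwards, so the overlap is the total number of bases sequenced minus the amplicon length. The function names are hypothetical, and the amplicon lengths (500 and 550 bp for V1-V3, 250 bp for V4) are illustrative assumptions rather than measured values.

```python
def paired_end_overlap(amplicon_length, read_length):
    """Bases covered by BOTH reads of a pair (errors here can be cross-checked)."""
    overlap = 2 * read_length - amplicon_length
    # Overlap cannot be negative, and cannot exceed the amplicon itself
    return max(0, min(overlap, amplicon_length))


def single_coverage(amplicon_length, read_length):
    """Bases covered by only ONE read (errors here stay in the final data)."""
    covered = min(amplicon_length, 2 * read_length)
    return covered - paired_end_overlap(amplicon_length, read_length)


# Assumed amplicon lengths, sequenced as 2 x 300 bp
for name, length in [("V1-V3 (short)", 500), ("V1-V3 (long)", 550), ("V4", 250)]:
    both = paired_end_overlap(length, 300)
    one = single_coverage(length, 300)
    print(f"{name}: {both} bp sequenced twice, {one} bp sequenced once")
```

Under these assumptions the long region leaves most of its bases covered by a single read, while the short region is covered twice along its whole length.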
What controls to use?
- The 16S rRNA gene is found in all bacteria, so any contaminating bacterial DNA will be amplified as well
- Method very sensitive to contamination
o Environment
o Operator (ideally only one person handles the samples)
o Reagents (molecular biology reagents may not be sterile)
- Especially important for low biomass samples
- “Kitome” – we all use kits for the sequencing, so we could be sequencing contaminants from the molecular biology reagents rather than the samples.
How do we mitigate potential contamination?
- Randomise samples
- Note batch numbers of reagents
- Sequence negative controls
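One way these three points could be combined in practice is sketched below: samples are shuffled into batches, each batch records its reagent lot, and each batch gets its own negative control. This is only an illustration. `plan_batches`, the sample IDs and the lot names are all hypothetical.

```python
import random


def plan_batches(sample_ids, batch_size, reagent_lots, seed=42):
    """Randomly assign samples to batches, one negative control per batch.

    The reagent lot is recorded for every batch so that contamination
    can be traced back to a batch or a lot later on.
    """
    rng = random.Random(seed)  # fixed seed keeps the plan reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)

    batches = []
    for start in range(0, len(shuffled), batch_size):
        number = len(batches) + 1
        batches.append({
            "batch": number,
            "reagent_lot": reagent_lots[(number - 1) % len(reagent_lots)],
            "samples": shuffled[start:start + batch_size] + [f"NEG_CTRL_{number}"],
        })
    return batches


# Hypothetical study: four cases and four controls, processed four at a time
samples = [f"case_{n}" for n in range(1, 5)] + [f"control_{n}" for n in range(1, 5)]
batches = plan_batches(samples, batch_size=4, reagent_lots=["LOT_A", "LOT_B"])
for batch in batches:
    print(batch)
```

Shuffling before batching means cases and controls are not systematically processed with different reagent lots, so a contaminated lot is less likely to look like a biological difference.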
Describe Whole genome shotgun workflow
- The same workflow as 16S, except there is no targeted PCR step: we sequence all the DNA in the sample
o There is no amplification bias here as we are not just looking at bacteria
We are looking at everything, e.g. host, viruses, yeasts
- Once we have the shotgun reads, we assemble them back into the genomes we think they came from
- After assembly, we look at the taxonomic diversity but we can also try and predict genes.
o By predicting genes, we can perhaps see which metabolic pathways are present.
What are the problems with Whole genome shotgun workflow?
- Host cells often in excess in the sample
- There is no amplification step to enrich for bacterial DNA, so all the host DNA is sequenced too
- The proportion of contaminating human reads is sample dependent. Typical values:
o Faecal: <10% human reads
o Saliva, nasal, skin samples: >90% human reads
- How to enrich without amplification?
o Pre-extraction
Differential lysis of mammalian cells
• Enriches for intact microbial cells
• Potential bias towards gram-positive bacteria
o Post-extraction
Enzymatic degradation of methylated nucleotides
• Tends to target mammalian DNA
• However, this introduces a bias against AT-rich bacterial genomes
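To see why the host fraction matters, here is a minimal sketch of the arithmetic. It assumes every non-human read is microbial, and `reads_needed` is a hypothetical helper. The 10% and 90% figures are the boundary values quoted above, and the target of one million microbial reads is an arbitrary example.

```python
def reads_needed(target_microbial_reads, host_percent):
    """Total reads to sequence so that, on average, the target number is microbial.

    host_percent is the percentage of reads expected to come from the host.
    Integer arithmetic is used so the result is rounded up exactly.
    """
    if not 0 <= host_percent < 100:
        raise ValueError("host_percent must be in the range 0-99")
    microbial_percent = 100 - host_percent
    # Ceiling division: (a + b - 1) // b
    return (target_microbial_reads * 100 + microbial_percent - 1) // microbial_percent


# Arbitrary target of one million microbial reads per sample
for sample_type, host_percent in [("faecal", 10), ("saliva/nasal/skin", 90)]:
    total = reads_needed(1_000_000, host_percent)
    print(f"{sample_type}: sequence about {total:,} reads in total")
```

At 90% human reads the same microbial depth costs roughly nine times more sequencing than at 10%, which is the motivation for the enrichment steps listed above.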
Give a summary of 16S PCR amplification and whole genome shotgun sequencing
Targeted 16S PCR amplification
- Assess taxonomic diversity in sample
- Biased, only bacteria
Whole genome shotgun sequencing
- Assess taxonomic diversity in sample
- Assess composite gene functions in sample
- Unbiased, all micro-organisms