Meta-Omics Flashcards
What is amplicon sequencing (16S gene) for?
What is shotgun sequencing for?
Amplicon - Metataxonomics
Shotgun - Metagenomics
What was done before high-throughput sequencing to do microbiome research?
Amplify the 16S gene by PCR, then run the DNA product across a gradient gel, where it migrates to form bands
Separates gene copies that differ in sequence (mutations)
- Banding pattern gives a measure of diversity
What did scientists do when they sailed around the world?
Sampled the oceans for microbial DNA for shotgun sequencing
Drastically increased the number of proteins in genomic databanks
How did marine microbiome research differ from soil microbiome research?
Marine microbiology – Frontier of microbial ecology and technology; Used a metagenomics approach
Soil microbiology – Lagged behind due to the complexity/diversity of soil microbiome and contaminating substances
What advancements allowed for the rise of metagenomics?
Prices of sequencing dropped massively
Computational power, bioinformatics, sequencing technology all improving
Amplicon vs Metagenomic Sequencing (hint - single vs all)
Amplicon sequencing involves sequencing many copies of 1 target gene from a mix of many fragments
- Use primers to do PCR of gene of interest
Metagenomics involves sequencing short reads from all the DNA in an environmental sample
What is meant by “no genomic context” as a disadvantage for amplicon sequencing?
Certain genes move around on plasmids and mobile genetic elements; Not part of chromosomes
If you sequence these out of the environment, you have no genomic context for where these things are
These genes move around via horizontal gene transfer; Sequence phylogenies won’t match organisms they’re coming out of
Disadvantages of metagenomics? (2 main ones)
Requires more computational power
Mis-annotation of functional genes; Assigns a gene a function which is incorrect
Amplicon sequencing uses degenerate primers. What are these?
Degenerate primers target regions of high conservation like active sites or specific folding regions; Regions encoding function
Within this region there is a degree of variability; Degenerate primers are mixtures of similar primer sequences that take this variability in specific regions into account
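A minimal Python sketch of what "a mixture of similar primer sequences" means in practice: each degenerate position is written with a standard IUPAC ambiguity code, and the primer expands to every concrete sequence it covers. The primer below is hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Hypothetical primer with two degenerate positions (Y = C/T, R = A/G).
example_primer = "ACYGGRT"
variants = expand_degenerate(example_primer)
print(len(variants))  # 2 options x 2 options = 4 sequences
print(variants)
```

The number of sequences in the mixture grows multiplicatively with each degenerate position, so a primer with several N positions quickly becomes a very large pool.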
What are the 2 problems with amplicon sequencing? (hint - primers)
Primer bias; Primers may favour some target sequences over others, e.g. amplifying 50,000 sequences and not picking up the other 50,000
Need to know what you’re looking for so you can develop specific primers
What is the top-down approach for metagenomics?
Not looking at specific function
Sequence everything and see where differences are
Then generate general hypotheses
This method generates lots of new data
What is the bottom-up approach for metagenomics?
Work out novel function for a novel protein
If it’s in an environmental bacterium, you want to know what this protein is doing in the environment, its distribution, whether it associates with any environmental niches etc.
Can recycle available data in public databases
What can comparing primer-based (16S) and shotgun approaches show?
Variations in the data; There are areas of overlap, but also groups that appear more enriched in either the metagenome or the 16S data
What was TARA Oceans and what meta- methods did it utilise?
What did all this data help us do?
Huge sampling effort of the oceans using metagenomic (DNA) and metatranscriptomics (RNA) data
Allowed us to understand how the oceans are working on a molecular level
What did the metagenomic data from TARA lead to?
Creation of a freely available web tool; Ocean Gene Atlas
What is the method used for assembling and quantifying gene abundance? (hint - 2 levels of assembly; Inter- and intra-site)
Assemble contigs and identify genes and then take all the small reads that made up the contig and map them back to the contig
- The more reads assigned to a contig/specific gene, the more abundant that contig/gene is in environment; Generates abundance profiles
Then for each site, you can look for a specific gene and see which site has more reads mapped to it to see its abundance at a site
- Can also compare different genes within 1 site and see which gene is more abundant
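The read-mapping idea above can be sketched in Python as follows. The counts, gene lengths and site names are hypothetical, and normalising by gene length and sequencing depth (reads per kilobase per million mapped reads) is an assumption added here so that sites and genes are comparable; the notes themselves only say that more mapped reads means higher abundance.

```python
def abundance_profile(read_counts, gene_lengths_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads, per gene."""
    per_million = total_mapped_reads / 1_000_000
    profile = {}
    for gene, count in read_counts.items():
        kilobases = gene_lengths_bp[gene] / 1000
        profile[gene] = count / kilobases / per_million
    return profile

# Hypothetical gene lengths (bp) and mapped-read counts for two sites.
gene_lengths = {"phoX": 2000, "phoA": 1000}
site_a = abundance_profile({"phoX": 300, "phoA": 50}, gene_lengths, 1_000_000)
site_b = abundance_profile({"phoX": 100, "phoA": 400}, gene_lengths, 2_000_000)

# Between sites: which site has more phoX?
richer_site = "A" if site_a["phoX"] > site_b["phoX"] else "B"
# Within one site: which gene is most abundant at site B?
top_gene_b = max(site_b, key=site_b.get)
print(site_a, site_b, richer_site, top_gene_b)
```

Without the normalisation step, a long gene or a deeply sequenced site would look more abundant simply because more reads land on it.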
How can the method/algorithm for database searching impact conclusions?
Can show differing results, e.g. one gene appearing more widely distributed or abundant than another when the other method finds the opposite
Explain Basic Local Alignment Search Tool (BLAST) (hint - E)
Set a stringency (cut-off); Expect (E) value – Lower E means better match
Position by position comparison
- Does 1:1 matching of query protein sequence
Insertions and deletions reduce the alignment score (a lower score gives a higher E, not a lower one)
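A rough Python sketch of how score and E relate, using the standard relationship E = m × n × 2^(−S'), where S' is the bit score, m the query length and n the database length. The lengths and scores below are hypothetical, and real BLAST uses corrected "effective" lengths, so treat this as an illustration rather than a reimplementation.

```python
def expect_value(bit_score, query_len, db_len):
    """Expected number of chance hits scoring at least this well."""
    return query_len * db_len * 2.0 ** (-bit_score)

def passes_cutoff(bit_score, query_len, db_len, e_cutoff):
    """True if the hit is at least as stringent as the chosen E cut-off."""
    return expect_value(bit_score, query_len, db_len) <= e_cutoff

# Hypothetical search: 400-residue query against a 1e9-residue database.
QUERY_LEN, DB_LEN = 400, 1_000_000_000

e_ungapped = expect_value(50, QUERY_LEN, DB_LEN)  # alignment with no indels
e_gapped = expect_value(40, QUERY_LEN, DB_LEN)    # indels cost 10 bits of score
print(e_ungapped, e_gapped)
```

Each bit of score lost doubles E, so a penalty of 10 bits makes the same hit about a thousand times more likely to be a chance match.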
Explain Profile Hidden Markov Modelling (pHMM) in relation to BLAST
Using pHMM more PhoA hits were found than using BLAST. What does this say about BLAST? (hint - false)
Takes regions of conservation of protein into account; Can be more confident
Same stringency as the BLAST search (E cut-off of 1e-60)
BLAST alone is not sufficient as it doesn’t fully uncover diversity
- False positives and/or false negatives
Functional genes for P cycling? (4 genes)
phoX
phoD
phoA
glpQ
How does the abundance of PhoX (and its phylogeny) differ across regions and sites?
Their abundance varies across regions and sites
Explain STrain Resolution ON assembly Graphs (STRONG)?
Assembly method that allows different strains to be distinguished
- Relies on building contigs
- Iterative approach
Get short reads from sequencing and co-assemble reads via algorithms to make contigs; Then map them to build up the genome and resolve different strain genomes
How do phosphonates differ from most organic phosphorus compounds?
Phosphonates contain a direct C-P bond, which is stronger than the C-O-P ester bonds found in most organic phosphorus compounds, so specialised enzymes are needed to break them down
What molecules can C-P lyase break down?
What does it have associated? (hint - transporter)
What is meant by it being promiscuous?
Phosphonates (C-P bond)
Has associated ABC transporter
Promiscuous; Works on multiple different substrates (phosphonates)
What is an oligotroph?
Are they the least abundant group of organisms on the planet?
Organisms adapted to low-nutrient environments
No - Most abundant group of organisms on the planet
What is the marine methane paradox?
How does methyl phosphonate breakdown link? (hint - limited phosphorus)
Large amounts of methane coming from oxygenated waters; Confusing as methanogenesis is an anaerobic process
In regions of the ocean where inorganic phosphate (Pi) is limited, bacteria (e.g. SAR11) switch to breaking down methyl phosphonate; This releases methane, a greenhouse gas
In regions of the ocean where phosphorus is available, they don’t touch methyl phosphonate
Why do some bacteria have 2AEP (2-aminoethylphosphonate) degradation pathways?
Microbes can degrade 2AEP and use it as a sole nitrogen source
What did metagenomics and MAGs enable? (hint - phosphonate)
Identification of many putative (predicted) phosphonate catabolism genes located near phosphonate transporters
What does metagenomics only provide a snapshot of and how can metaproteomics help?
Metagenomics only gets a snapshot of genetic potential
Metaproteomics helps link structure and function
When is metagenomics useful?
Metatranscriptomics can take us a step further through what?
Why is metaproteomics better than these?
Analysing long-term shift in microbiome composition and function
Analysing how post-transcriptional regulation occurs in bacteria
Investigates the ‘functional entities’ of cell (protein); Machinery that drives reactions and causes effects
Difficulties with metatranscriptomics and metaproteomics? (1 for each)
RNA is hard to handle, as once you sample bacteria their environment changes, so they change their transcriptome
Large computational requirements for metaproteomics
Using metaproteomics what can we identify? (hint - nutrient stress)
Identify proteins (and their corresponding genes) that are induced under different nutrient stress conditions
e.g. Phosphatases
What is bulk soil and rhizosphere?
Difference in metagenomic and metaproteomic profiles?
Bulk soil - Soil away from plant roots
Rhizosphere - Thin soil layer around plant root
No large difference in their metagenomic profiles, but a huge separation in their metaproteomic profiles
How would rhizosphere-adapted bacteria such as Pseudomonas spp. be adapted to this environment?
Key trait of bulk soil adapted bacteria?
Produce rhizosphere specialist (plant) proteins with more pathways relating to P cycling
More oligotrophic due to low nutrient availability; Can use a wider variety of nutrients
When else is metaproteomics useful? (hint - growth)
Give an example
Identify pathways enriched during growth on various compounds e.g. complex polysaccharides which require many enzymes to hydrolyse each type of bond
Using microbiota inside anaerobic bioreactors, what can we identify? Give an example
Distinct proteins like CAZymes which break down sugars
What is the cause of the large computational cost of metaproteomics? (hint - data)
Combination of 2 datasets - DNA sequencing data with mass spectrometry data
Problems with using mass spec. in metaproteomics?
Not all the peptides can be separated, so not all the peptides get sequenced
Mass spec. doesn’t tell us everything about what peptide it is; Just m/z ratios
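A small Python sketch of why an m/z ratio alone does not identify a peptide: different sequences can have exactly the same mass. It uses standard monoisotopic residue masses; the peptide sequences are hypothetical examples, and real metaproteomics workflows rely on fragment spectra and database searching rather than this calculation alone.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide (free N- and C-termini)
PROTON = 1.007276   # mass of each charge-carrying proton

def peptide_mz(sequence, charge=1):
    """m/z of a peptide ion carrying `charge` protons."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge

# Hypothetical peptides: an I/L swap and a reordering of the same residues.
for peptide in ("PEPTIDE", "PEPTLDE", "EPPTIDE"):
    print(peptide, round(peptide_mz(peptide), 4), round(peptide_mz(peptide, 2), 4))
```

All three sequences give the same m/z at a given charge state, which is one reason the intact-peptide mass alone cannot say which peptide was measured.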