Metagenomics Flashcards

1
Q

Why are internal standards important when comparing metagenomic samples across spatial and temporal scales?

A

Because they allow absolute quantification of the organisms, genes, or transcripts in a sample (e.g., genes/L or average transcripts/cell), so samples can be compared directly.
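
As a rough illustration of how a spiked-in standard turns read counts into absolute numbers, here is a minimal sketch. The function name, parameter names, and example values are hypothetical, and the published equations may include terms not shown here; it only assumes that the standard and the sample DNA are recovered with the same efficiency.

```python
def gene_copies_per_liter(gene_reads, standard_reads, standard_copies_added, volume_l):
    """Estimate gene copies per liter from an internal standard (illustrative only).

    gene_reads            -- reads assigned to the gene of interest
    standard_reads        -- reads recovered from the internal standard
    standard_copies_added -- copies of the standard spiked into the sample
    volume_l              -- liters of water filtered for the sample
    """
    if standard_reads <= 0 or volume_l <= 0:
        raise ValueError("standard_reads and volume_l must be positive")
    # Each recovered standard read represents this many copies in the sample.
    copies_per_read = standard_copies_added / standard_reads
    return gene_reads * copies_per_read / volume_l


# Hypothetical numbers: 200 gene reads, 10,000 standard reads,
# 1e9 standard copies added, 2 L filtered.
example = gene_copies_per_liter(200, 10_000, 1e9, 2.0)
```

With these made-up inputs each read stands for 100,000 copies, so the gene is estimated at 1e7 copies per liter.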

2
Q

If an internal standard isn’t used, how do most studies collect meta-omics data?

A

In a relative framework, in which the abundance of each gene is calculated as a percentage of the sequence library
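
To make the relative framework concrete, the sketch below computes a gene's share of the library. The function name and the counts are hypothetical. Note that the same percentage can come from very different absolute abundances, which is the limitation internal standards address.

```python
def percent_of_library(gene_reads, total_reads):
    """Relative abundance of a gene as a percentage of all reads in the library."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * gene_reads / total_reads


# Two hypothetical libraries of different depth give the same relative
# abundance, although nothing is known about copies per liter.
sample_a = percent_of_library(200, 1_000_000)
sample_b = percent_of_library(1_000, 5_000_000)
```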

3
Q

How are internal standard reads recovered and quantified after sequencing? (2 steps)

A

1. Run a BLASTn homology search for the template sequence against the reference genome sequence of the internal standard.
2. Take the initial BLASTn hits and run a BLASTx search against the RefSeq database to identify all the protein-encoding reads.

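
The two steps can be sketched as a filter over BLAST tabular output (`-outfmt 6`, where column 1 is the query ID, column 2 the subject ID, and column 12 the bit score). This is only an illustration: the function names, the bit score cutoff, and the rule "keep a read if its best BLASTx hit is a protein of the standard" are assumptions, not the published procedure.

```python
def _parse_tab(lines):
    """Yield (query_id, subject_id, bitscore) from BLAST -outfmt 6 lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        yield fields[0], fields[1], float(fields[11])


def recover_standard_reads(blastn_lines, blastx_lines, standard_protein_ids,
                           min_bitscore=50.0):
    """Return read IDs that pass both BLAST steps (hypothetical criteria)."""
    # Step 1: candidate reads that hit the standard's genome in BLASTn.
    candidates = {q for q, _, score in _parse_tab(blastn_lines)
                  if score >= min_bitscore}
    # Step 2: best BLASTx hit per candidate read against the protein database.
    best = {}  # read ID -> (bitscore, subject ID)
    for q, s, score in _parse_tab(blastx_lines):
        if q in candidates and (q not in best or score > best[q][0]):
            best[q] = (score, s)
    # Keep reads whose best protein hit belongs to the internal standard.
    return {q for q, (_, s) in best.items() if s in standard_protein_ids}


# Hypothetical tabular records for three reads.
blastn = [
    "r1\tstd_genome\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-50\t180.0",
    "r2\tstd_genome\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-20\t60.0",
    "r3\tstd_genome\t80.0\t40\t8\t0\t1\t40\t1\t40\t1e-3\t30.0",
]
blastx = [
    "r1\tP_std_1\t98.0\t33\t1\t0\t1\t99\t1\t33\t1e-40\t150.0",
    "r1\tP_other\t70.0\t33\t9\t0\t1\t99\t1\t33\t1e-10\t100.0",
    "r2\tP_other\t95.0\t33\t2\t0\t1\t99\t1\t33\t1e-30\t120.0",
    "r2\tP_std_1\t75.0\t33\t8\t0\t1\t99\t1\t33\t1e-12\t80.0",
]
kept = recover_standard_reads(blastn, blastx, {"P_std_1"})
```

Here `r3` fails the BLASTn cutoff and `r2` matches another organism's protein better than the standard's, so only `r1` is kept.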
4
Q

Why is a second BLAST step needed to recover and identify reads from an internal standard after sequencing?

A

To account for false positives in the BLASTn homology search

5
Q

What amount of internal standard DNA should be added to a metagenomic sample?

A

Enough to quantify reliably but not so much that it dominates the reads: about 0.5% of the expected yield, which should return 0.1-5% of total reads.
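
A minimal sketch of the two checks implied here: how much standard to add for a given expected yield, and whether the recovered standard reads land in the 0.1-5% window. Function names and example numbers are hypothetical; the 0.5% target and the 0.1-5% window come from the card.

```python
def spike_in_mass_ng(expected_yield_ng, fraction=0.005):
    """Mass of internal standard DNA to add, as a fraction of expected yield."""
    return expected_yield_ng * fraction


def recovery_in_range(standard_reads, total_reads, low=0.001, high=0.05):
    """True if the standard makes up 0.1-5% of all reads after sequencing."""
    return low <= standard_reads / total_reads <= high


# Hypothetical sample expected to yield 2000 ng of DNA.
mass_to_add = spike_in_mass_ng(2000)          # about 10 ng
ok = recovery_in_range(20_000, 1_000_000)     # standard is 2% of reads
too_high = recovery_in_range(200_000, 1_000_000)  # standard is 20% of reads
```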

6
Q

What type of DNA should be used as an internal standard?

A

DNA from a sequenced, cultured microbe that is not present in the sampled environment. Example: a thermophile such as Thermus thermophilus.

7
Q

In which paper are the equations for metagenome normalization using internal standards found?

A

Satinsky et al. 2013, Chapter 12 in Methods in Enzymology Vol. 531, “Use of Internal Standards for Quantitative Metatranscriptome and Metagenome Analysis”

8
Q

How many hypervariable regions are in the bacterial SSU (16S) rRNA gene?

A

9 (V1-V9)

9
Q

Which hypervariable regions are best for distinguishing pathogenic bacteria?

A

V2, V3, and V6.

V1 is better for Gram-positive bacteria, and V4-V9 are less discriminatory (Chakravorty et al. 2008, which looked at 110 blood-borne pathogens).

10
Q

When creating a 16S library, what is the next step after PCR amplification?

A

Adding Illumina adapters

11
Q

How many reads are generated by an Illumina MiSeq run?

A

~25 million reads/flowcell

12
Q

What is the max read size using MiSeq?

A

300 bp (paired-end or single-end)

13
Q

Assuming 96 indexed samples, how many reads can you get per sample on a MiSeq?

A

>100,000 reads per sample (~25 million reads / 96 samples ≈ 260,000)

14
Q

How many reads can be generated by a HiSeq 4000?

A

2.5 billion reads per flow cell (312.5 million reads per lane across 8 lanes)

15
Q

What is the max read size for a HiSeq 4000?

A

150 bp (paired-end or single-end)

16
Q

How many bp of sequence data are estimated to be needed for the de novo assembly of a marine metagenome?

A

~1e13 bp (one HiSeq 4000 flow cell yields ~3.75e11 bp)
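
The arithmetic behind these figures can be checked in a few lines. This sketch uses the numbers from the cards (2.5 billion reads per flow cell, 150 bp reads, ~1e13 bp target) and assumes each read is counted once at full length; the variable names are illustrative.

```python
import math

READS_PER_FLOW_CELL = 2_500_000_000   # HiSeq 4000, from the card above
READ_LENGTH_BP = 150                  # maximum read length
TARGET_BP = 10**13                    # estimated need for de novo assembly

# Total bases from one flow cell: 2.5e9 reads x 150 bp = 3.75e11 bp.
bp_per_flow_cell = READS_PER_FLOW_CELL * READ_LENGTH_BP

# Whole flow cells required to reach the target, rounded up.
flow_cells_needed = math.ceil(TARGET_BP / bp_per_flow_cell)
```

Under these assumptions the target corresponds to roughly 27 flow cells of sequencing.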