Next Generation Sequencing Flashcards
Typical design of a gene panel for NGS
- Entire exonic sequence of the genes
- +10 base pairs into intronic sequences (NOT deep intronic sequences)
- Promoters are NOT covered (eg, TERT promoter)
- Large indels (about 100 bp or more) are usually missed due to insufficient priming
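The design rule above (exons plus 10 bp of flanking intron) can be sketched as a small interval calculation. This is only an illustration: the coordinates are made up, the half-open interval convention is an assumption, and the helper names `pad_and_merge` and `is_covered` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed convention: half-open [start, end) on one chromosome

def pad_and_merge(exons: List[Interval], flank: int = 10) -> List[Interval]:
    """Extend each exon by `flank` bp into the introns, then merge overlapping targets."""
    padded = sorted((max(0, start - flank), end + flank) for start, end in exons)
    merged: List[Interval] = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # This padded exon overlaps the previous target, so fuse them.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def is_covered(pos: int, targets: List[Interval]) -> bool:
    """True if a position falls inside any target interval."""
    return any(start <= pos < end for start, end in targets)

# Made-up exon coordinates for a hypothetical gene.
exons = [(100, 200), (215, 300), (1000, 1100)]
targets = pad_and_merge(exons)   # [(90, 310), (990, 1110)]

print(is_covered(205, targets))  # position just inside an intron, within the flank
print(is_covered(500, targets))  # deep intronic position, outside the design
```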
Hotspot Panels
Focus on hotspot regions that are frequently associated with SNVs and small indels
Hotspot panels are not faster, but can be run on poorer-quality DNA or smaller amounts of DNA
NGS sequencing is run in. . .
. . . batches, to reduce costs.
Meta-mutational data
For example, MSI or UV signature – patterns of mutation
Require a larger amount of sequenced DNA, since these are effectively statistical assays that need a large N.
Overestimation of tumor percentage risks . . .
. . . a false negative.
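A rough sketch of why overestimating tumor percentage risks a false negative. The model is an assumption, not part of the card: a clonal heterozygous variant in a diploid region (expected VAF = tumor fraction / 2) and binomial read sampling. The depth, read threshold, and tumor fractions are made-up numbers.

```python
from math import comb

def expected_vaf(tumor_fraction: float) -> float:
    """Expected VAF for a clonal heterozygous variant in a diploid region (assumed model)."""
    return tumor_fraction / 2.0

def detection_probability(depth: int, vaf: float, min_alt_reads: int) -> float:
    """Binomial probability of seeing at least `min_alt_reads` variant reads."""
    p_below = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below

DEPTH = 500          # hypothetical read depth at the variant position
MIN_ALT_READS = 25   # hypothetical calling threshold (5% of reads)

estimated_tf = 0.30  # tumor fraction as read off the slide
true_tf = 0.06       # actual tumor fraction in the extracted DNA

# If the estimate were right, the variant would almost always be called.
p_if_estimate_right = detection_probability(DEPTH, expected_vaf(estimated_tf), MIN_ALT_READS)
# At the true tumor fraction, the variant usually falls below the threshold.
p_actual = detection_probability(DEPTH, expected_vaf(true_tf), MIN_ALT_READS)

print(f"{p_if_estimate_right:.3f} vs {p_actual:.3f}")
```

A negative result is then trusted because the sample looked adequate, even though the variant had little chance of being detected.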
Evidence Tiers
Evidence tiers are primarily determined by. . .
. . . evidence type, not necessarily evidence quality
Assessing VAF
Sample isolation techniques
Emulsion PCR
PCR in which the aqueous phase is broken up into many individual droplets within an oil emulsion.
Enables many parallel reactions to occur simultaneously.
This is the technique that produces the massively parallel component of bead-based next generation sequencing (such as 454): many reactions are run in tiny emulsion droplets, each of which can in theory contain a different template, allowing numerous separate but simultaneous PCRs.
Amplification in emulsion PCR
(454 method)
The genome is fragmented through one of numerous possible techniques.
3’ overhangs are digested and 5’ overhangs are filled in to create a library of blunt dsDNA fragments. Then, A and B dsDNA adaptor sequences are added to the ends of each DNA fragment. A and B adaptors have 3’ hydroxyls, but lack 5’ phosphates (to prevent the adaptors from ligating to each other).
The non-ligated half of each dsDNA adaptor is melted off and the resulting overhangs are filled in by polymerase extension. PCR enrichment then follows with A’ and B’ primers, which selectively amplify the library fragments that carry both adaptors (A-A and B-B fragments form lariats, and fragments with only A or only B extend linearly, so only A-B fragments amplify exponentially).
Illumina method for adapter ligation in library preparation
Rather than using separate A and B adapters which both lack a 5’ phosphate, Illumina uses a single Y-shaped adapter with a region of homology at the library dsDNA interface, which then branches out into nonhomologous strands carrying the A and B’ sequences.
After the first round of PCR, the full adaptor sequences are attached to each product (A and B’ on one strand, B and A’ on its complement), and subsequent PCR cycles amplify these products.
Emulsion PCR setup in 454 NGS (after library preparation)
Ideally, you have created a library fragment:emulsion bubble ratio such that each bubble contains one fragment at most, minimizing the number of bubbles that contain multiple fragments.
Each bubble also contains a magnetic bead with the A’ primer for PCR, as well as a B’ primer which is free floating in solution.
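The fragment:bubble ratio trade-off can be sketched numerically. The model here is an assumption that the cards do not state: fragments land in bubbles independently, so the number per bubble follows a Poisson distribution with mean `lam` (average fragments per bubble). The function names are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a bubble holds exactly k fragments under Poisson loading."""
    return lam**k * exp(-lam) / factorial(k)

def bubble_occupancy(lam: float) -> dict:
    """Fractions of bubbles that are empty, singly loaded, or multiply loaded."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    multiple = 1.0 - empty - single
    return {
        "empty": empty,
        "single": single,
        "multiple": multiple,
        # Share of fragment-containing bubbles that would give a mixed bead.
        "mixed_among_occupied": multiple / (1.0 - empty),
    }

for lam in (0.1, 0.5, 1.0):
    occ = bubble_occupancy(lam)
    print(lam, {name: round(value, 4) for name, value in occ.items()})
```

Under this model, diluting the library lowers the share of mixed beads at the cost of leaving most bubbles empty.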
Sequencing step in 454 NGS
454 NGS is a pyrosequencing-based approach. After library preparation and amplification, beads with attached library amplification product are singly isolated into picoliter wells.
Pyrosequencing via a flow-based sequencing by synthesis is performed in each well.
When the correct nucleotide is flowed in, it is added to the strand and a pyrophosphate is released. The pyrophosphate is then used by ATP sulfurylase to make ATP, which powers firefly luciferase to oxidize luciferin and create a flash of light, indicating that this was the correct nucleotide. This occurs simultaneously across millions of library amplification products bound to the same bead.
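A minimal sketch of flow-based sequencing by synthesis in an idealized well. It assumes perfect incorporation, no noise, and a repeating T, A, C, G flow order (an assumption for illustration). `read` is the strand being synthesized, and the signal for each flow is simply the number of bases added. Function names are hypothetical.

```python
from typing import List, Tuple

def flowgram(read: str, flow_order: str = "TACG", n_cycles: int = 4) -> List[Tuple[str, int]]:
    """Return (flowed base, number of bases incorporated) for each flow."""
    pos = 0
    signals: List[Tuple[str, int]] = []
    for flow in range(n_cycles * len(flow_order)):
        base = flow_order[flow % len(flow_order)]
        run = 0
        # A whole run of identical bases is incorporated during one flow.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))  # light intensity scales with `run`
        if pos == len(read):
            break
    return signals

def decode(signals: List[Tuple[str, int]]) -> str:
    """Rebuild the read from ideal flow signals."""
    return "".join(base * count for base, count in signals)

signals = flowgram("TTACGG")
print(signals)          # [('T', 2), ('A', 1), ('C', 1), ('G', 2)]
print(decode(signals))  # TTACGG
```

A flow with zero signal means the flowed nucleotide was not the next base, and the well must be washed before the next nucleotide is flowed in.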
What is the rate limiting step of the 454 NGS sequencing phase?
The speed at which nucleotides are flowed into the picoliter wells.
What is the biggest challenge in the 454 NGS sequencing phase?
Quickly washing the wells to ensure that only one nucleotide is present at a time for sequencing by synthesis.
Why does the signal:noise ratio decrease with read position in 454 NGS?
Not every strand on the bead will incorporate a nucleotide at every flow, so more and more strands will be synthesizing out of phase with the rest as sequencing proceeds.
This mostly creates problems with runs of more than about 5 identical nucleotides, since the declining signal:noise ratio makes it difficult to estimate precisely how many identical nucleotides were incorporated in a single flow.
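A toy calculation of both effects. It is only a sketch under stated assumptions: each strand independently succeeds at each incorporation step with a fixed efficiency (0.995 is a made-up value), and the signal for a run of n identical nucleotides is proportional to n.

```python
def in_phase_fraction(efficiency: float, n_steps: int) -> float:
    """Fraction of strands on a bead still in phase after n_steps incorporations."""
    return efficiency ** n_steps

def relative_step(n: int) -> float:
    """Relative signal increase when a run of n bases becomes n + 1 bases."""
    return ((n + 1) - n) / n

EFFICIENCY = 0.995  # hypothetical per-step incorporation efficiency

# The in-phase signal decays with read position.
for position in (50, 100, 200, 400):
    print(position, round(in_phase_fraction(EFFICIENCY, position), 3))

# The gap between neighbouring run lengths shrinks as runs get longer.
for n in (1, 2, 5, 8):
    print(n, round(relative_step(n), 3))
```

Going from 1 to 2 bases doubles the signal, while going from 5 to 6 raises it by only 20%, so a modest amount of noise is enough to blur the longer runs.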