Next Generation DNA Sequencing Flashcards
What is “Next Generation Sequencing” (NGS)?
In general, the phrase “Next Generation Sequencing” refers to a methodology that:
- Uses a ‘flowcell’ to enable ‘massively parallel sequencing’ of millions of DNA molecules at the same time.
- Requires ‘clonal amplification’ of individual DNA molecules into a cluster or on a bead to enable visualisation of the reaction.
- Relies on the chemical process of ‘sequencing by synthesis’, i.e. the DNA sequence is decoded by the stepwise synthesis of a DNA strand one nucleotide at a time.
What else is NGS commonly referred to as?
Also known as “Second Generation Sequencing”, because it came after Sanger sequencing, which is “First Generation”.
What are the main workflow stages of performing NGS?
- DNA extraction
- Library preparation*
- Sequencing*
- Bioinformatic analysis
- Data Interpretation
*Varies based on platform/system used.
What is “library prep”?
- NGS requires the preparation of “DNA libraries” ready for loading onto the NGS instrument
- Library prep is the process of converting genomic DNA into a sample suitable for loading onto an NGS instrument.
- The main objective of library prep is to end with a sample of DNA fragments fused with adapters (which carry sample-specific barcodes), so that molecules from separate samples can be pooled together on a single run.
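To illustrate why adapters let samples be pooled, here is a minimal Python sketch of sorting pooled reads back into samples by barcode. The barcode sequences, the sample names, and the assumption that the barcode sits at the start of the read are all hypothetical; on real instruments the barcode is typically read separately and matching usually tolerates mismatches.

```python
from collections import defaultdict

# Hypothetical sample barcodes (illustrative sequences, not from any real kit).
BARCODES = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}

def demultiplex(reads, barcodes=BARCODES):
    """Group pooled reads by the barcode at the start of each read.

    Assumes every barcode has the same length, sits at the start of the
    read, and must match exactly. Unmatched reads go to 'undetermined'.
    """
    barcode_len = len(next(iter(barcodes)))
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        sample = barcodes.get(tag, "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)

pooled = ["ACGTACGGGTTT", "TGCATGAAACCC", "ACGTACCCCAAA", "NNNNNNGATTACA"]
result = demultiplex(pooled)
```

Here `result` maps each sample name to the reads that carried its barcode, with the barcode itself trimmed off.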
What different strategies can be taken towards library prep?
Two broad categories:
Targeted: Specific regions of the genome, ranging from a handful of genes up to the entire human exome, are enriched during the library prep process so that only these regions are sequenced.
Non-targeted: The DNA is processed into fragments with the adapters required for sequencing, but no enrichment is performed, resulting in the entire genome of a given sample being sequenced.
What are the advantages of targeted NGS vs non-targeted NGS?
- Much cheaper per-sample costs, as far less sequencing is performed and many patients can be loaded onto a single sequencing run.
- Coverage can be better, as enrichment assays can be optimised to fill difficult-to-sequence regions.
- Less chance of incidental findings.
- Less chance of variants of uncertain significance (VUSs), as genes in the target region should be clearly linked to the indication.
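The cost argument comes down to simple arithmetic: the sequence needed per sample scales with target size multiplied by depth. The sketch below shows that calculation in Python. Every figure in it (run output, panel size, depths) is a hypothetical round number chosen for illustration, not a specification of any instrument or assay.

```python
def samples_per_run(run_output_bases, target_size_bases, mean_depth,
                    on_target_fraction=1.0):
    """Estimate how many samples fit on one sequencing run.

    bases needed per sample = target size x mean depth / on-target fraction
    """
    bases_per_sample = target_size_bases * mean_depth / on_target_fraction
    return int(run_output_bases // bases_per_sample)

# All figures below are hypothetical round numbers for illustration only.
RUN_OUTPUT = 120_000_000_000  # 120 Gb of sequence per run

panel = samples_per_run(RUN_OUTPUT, 500_000, 500)         # 0.5 Mb panel at 500x
genome = samples_per_run(RUN_OUTPUT, 3_000_000_000, 30)   # ~3 Gb genome at 30x
```

Under these assumed numbers, hundreds of panel samples fit on the run that holds a single genome, which is where the per-sample saving comes from. Lowering `on_target_fraction` models enrichment that is less than perfectly efficient.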
What two methods are available for targeted NGS strategies?
- PCR-based (amplicon) Methods
- Hybridisation (capture) Methods
Describe the principle of PCR-based methods for target enrichment.
- Sequencing ready libraries can be produced from amplifying fragments with PCR primers.
- PCR primers are designed to contain the adapters for flow-cell attachment and barcodes.
- The fragments are then purified and sequenced.
- There are many commercially available platforms based on amplicon sequencing with most companies offering predesigned off-the-shelf panels in addition to custom panels
What is one of the most popular amplicon-based library prep kits on the market?
- Illumina’s TruSeq methodology represents an alternative to PCR but is based upon similar principles.
- Rather than relying on error-prone PCR, TruSeq uses a single primer extension methodology to generate target regions flanked by appropriate sequencing adaptors and molecular identifiers (MIDs)/barcodes.
What are the advantages of using amplicon library prep?
- Is a relatively cheap method.
- Can focus on small regions and multiplex many samples per run.
- technically simple
- Can utilise long range PCR (LR-PCR) to amplify large genomic regions containing multiple exons - very useful for diseases where there is a common pseudogene (PKD, CAH)
What are the disadvantages of using amplicon library prep?
- The big drawback of amplicon selection assays is PCR duplicates. Every read from an amplicon shares the same primer-defined start and end positions, so duplicates cannot be told apart from unique reads.
- PCR duplicates should not be used to generate accurate read depth (vertical coverage) metrics and cannot be used to provide an estimate of copy number across the target area.
- Only unique reads should be used to generate coverage data, so it is recommended to avoid relying on read depth for amplicon-based assays.
Describe the principle of hybridisation-based methods for target enrichment.
- Relies on solution-based capture of target regions.
- Genomic DNA must first be fragmented.
- The fragments are then tagged with sequence-ready adaptors and barcodes.
- The targeted regions are captured using RNA- or DNA-based oligonucleotide ‘baits’ containing biotin.
- These oligos anneal to specific regions of the genome to result in a tiling of captured fragments representative of the entire region of interest.
- Magnetic beads coupled to Streptavidin can then be used to physically separate the fragments bound to the baits from the rest of the input DNA.
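The idea of baits tiling a region can be sketched in Python as below. The 120-base bait length follows the SureSelect figure quoted on the next card; the 60-base step (2x tiling) and the coordinates are hypothetical, and real bait design also accounts for factors such as GC content and repeats that are ignored here.

```python
def tile_baits(start, end, bait_length=120, step=60):
    """Return half-open (start, end) intervals of baits tiled across a region.

    The final bait is placed so that it ends exactly at the region end,
    ensuring the whole region is covered. Assumes the region is at least
    one bait long.
    """
    if end - start < bait_length:
        raise ValueError("region shorter than one bait")
    baits = []
    pos = start
    while pos + bait_length < end:
        baits.append((pos, pos + bait_length))
        pos += step
    baits.append((end - bait_length, end))
    return baits

# Hypothetical 300-base region of interest.
baits = tile_baits(1000, 1300)
```

A smaller `step` places more overlapping baits over the same region, which is one way a difficult-to-capture region could be given extra bait density.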
What is one of the most popular hybridisation-based library prep kits on the market?
- Agilent’s SureSelect kit.
- Starting from gDNA, a shearing step produces small fragments
- Prepare library with sequencer specific adaptors and sample-specific barcodes
- Hybridise sample with biotinylated RNA library baits. Agilent uses ultra long 120mer RNA baits for the highest specificity.
- Select targeted regions using magnetic streptavidin beads
- Amplify and load on the sequencer
What are the advantages of using hybridisation-based library prep?
- The big advantage is that PCR duplicates can be removed bioinformatically. Because the DNA is randomly fragmented before adaptors are added, each original molecule has its own start and end positions, so reads with identical positions can be identified as duplicates.
- This means that analysis of read depth can give insight into copy number, although this remains a challenge.
- Gives better horizontal coverage than amplicon sequencing, as difficult-to-sequence regions can be targeted with many more baits than the average for the rest of the panel.
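A minimal Python sketch of coordinate-based duplicate removal follows, assuming each read is reduced to a hypothetical `(chrom, start, end)` tuple. It is a simplification: established duplicate-marking tools also consider details such as strand and read pairing.

```python
def unique_fragments(reads):
    """Collapse reads sharing (chrom, start, end) into one fragment.

    Reads with identical coordinates are treated as PCR duplicates of a
    single original molecule.
    """
    return sorted(set(reads))

# Hybrid capture: random shearing gives each original molecule its own ends,
# so only the repeated (chr1, 100, 350) read is collapsed.
capture_reads = [
    ("chr1", 100, 350),
    ("chr1", 100, 350),
    ("chr1", 140, 410),
    ("chr1", 95, 330),
]

# Amplicon: every read spans the same primer-defined interval, so unique
# molecules and PCR duplicates look identical.
amplicon_reads = [("chr1", 100, 350)] * 4

capture_unique = unique_fragments(capture_reads)
amplicon_unique = unique_fragments(amplicon_reads)
```

With these toy inputs the capture reads collapse to three distinct fragments, giving a usable depth estimate, whereas the four amplicon reads collapse to one regardless of how many original molecules were present.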
What are the disadvantages of using hybridisation-based library prep?
- The big disadvantage is that DNA must be sheared before beginning. This often requires sonication of gDNA, which can be very expensive and time-consuming.
- Can be more expensive
- Can require extensive optimisation to improve coverage
- Technically more complicated than amplicon sequencing requiring more manual handling time
What major library prep innovation facilitated the uptake of enrichment-based library preparation without the drawback of expensive DNA shearing?
Illumina Nextera “tagmentation” of DNA simultaneously fragments and tags DNA without the need for mechanical shearing.
Tagmentation uses a transposase enzyme to simultaneously fragment gDNA and insert sequencing adapters onto the dsDNA fragments.
Enzymatic fragmentation can introduce more bias than physical shearing methods, but it has been shown to be consistent in the long run and is now widely used.
What levels of targeted NGS analysis are commonly performed in diagnostic laboratories?
- Targeted panel: Assay targets disease-specific genes only.
- Clinical Exome: All genes with a known disease association are included (~7,000).
- Whole Exome: All protein coding regions of the genome are included.
What are the advantages and disadvantages of choosing a Targeted panel approach?
Advantages
- More samples per run = cheaper per patient
- Less compute power and storage required
- Assay optimised for 100% coverage
Disadvantages
- Inflexibility: cannot incorporate novel disease loci without redesigning the capture design
- Need high referral rate or runs could be delayed whilst waiting to ‘batch’
What are the advantages and disadvantages of choosing a Clinical Exome approach?
Advantages
- Virtual panels enable complete flexibility to use the same assay for many diseases
- Enables efficient wet-lab processes, e.g. no need to batch samples for disease-specific assays = cost effective
- Sequencing not wasted on non-disease associated genes
Disadvantages
- Cannot optimise the assay for every gene, so there could be gaps in coverage
- Newly discovered disease genes may not be included
- Vast majority of the sequence data not used = wasted sequencing, compute and storage
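The ‘virtual panel’ idea can be sketched in Python as a simple filter: the same clinical exome data is sequenced for every patient, and only variants in the genes relevant to the referral are analysed. The variant records, the dictionary layout, and the choice of panel genes below are hypothetical placeholders.

```python
def apply_virtual_panel(variants, panel_genes):
    """Keep only variants whose gene is in the virtual panel."""
    panel = {gene.upper() for gene in panel_genes}
    return [v for v in variants if v["gene"].upper() in panel]

# Hypothetical variant calls from one clinical exome; the variant labels
# are placeholders, not real variants.
variants = [
    {"gene": "PKD1", "variant": "variant_1"},
    {"gene": "BRCA2", "variant": "variant_2"},
    {"gene": "PKD2", "variant": "variant_3"},
]

# Hypothetical virtual panel for a polycystic kidney disease referral.
pkd_panel = ["PKD1", "PKD2"]

reported = apply_virtual_panel(variants, pkd_panel)
```

Changing the referral only means passing a different gene list to the same function, with no change to the wet-lab assay. The variants filtered out are still sequenced and stored, which is the wasted sequencing, compute and storage noted above.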