SU DNA sequencing Flashcards
Sanger sequencing
First generation sequencing
Principle: DNA polymerase copies the template and incorporates chain-terminating dideoxynucleotides (ddNTPs), which stop DNA synthesis and generate many DNA strands of varying lengths. Each fragment ends with a labeled nucleotide; the fragments are separated by length with electrophoresis and the terminal label of each is read. While this method produces highly accurate sequences (error rate <0.1%), it is limited to short read lengths (up to 900 base pairs) and relatively low throughput.
- Reagents: target DNA, ddNTPs of all four nucleotides, polymerase, dNTPs and primers
- Primer annealing and chain extension
- ddNTP incorporation and chain termination
- Fluorescently labelled DNA fragments
- Capillary gel electrophoresis and fluorescence detection
- Sequence analysis and reconstruction
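The chain-termination steps above can be sketched as a small simulation. This is a toy model, not a description of real base-calling software: the function names, the ddNTP fraction and the number of molecules are illustrative assumptions, and the template is assumed to be written in the order the polymerase reads it.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sanger_fragments(template, n_molecules=2000, dd_fraction=0.1, seed=0):
    """Simulate chain termination on many copies of one template.

    At every extension step the polymerase adds either a normal dNTP
    (synthesis continues) or, with probability dd_fraction, a labelled
    ddNTP (synthesis stops). Each terminated strand is recorded as
    (fragment length, labelled terminal base).
    """
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_molecules):
        for position, base in enumerate(template, start=1):
            if rng.random() < dd_fraction:
                fragments.add((position, COMPLEMENT[base]))
                break  # the ddNTP blocks further extension
    return fragments

def read_sequence(fragments):
    """Mimic electrophoresis: sort by length, then read the end labels."""
    return "".join(label for _, label in sorted(fragments))

if __name__ == "__main__":
    fragments = sanger_fragments("TACGGATC")
    print(read_sequence(fragments))
```

Sorting the fragments by length plays the role of the capillary gel, and reading the terminal labels in that order gives the newly synthesized strand, which is the complement of the template.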
Illumina sequencing
Second generation sequencing
In Illumina sequencing, you get short reads, typically 50 to 300 base pairs long. These reads are produced in large quantities and are highly accurate. Longer fragments are not possible because they can break during bridge amplification. Reads can be single-end (read from one end of the DNA fragment) or paired-end (read from both ends, which provides more information and effectively allows longer reads).
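The difference between single-end and paired-end reads can be shown with a minimal sketch. The function names and the example fragment are hypothetical; the only assumption is that the second read comes from the opposite strand, so it is the reverse complement of the far end of the fragment.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Return the sequence of the opposite strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def single_end_read(fragment, read_length):
    """Read only from one end of the fragment."""
    return fragment[:read_length]

def paired_end_reads(fragment, read_length):
    """Read from both ends: read 1 from the forward strand,
    read 2 from the start of the reverse strand."""
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment)[:read_length]
    return read1, read2

if __name__ == "__main__":
    fragment = "AACCGGTTAC"
    print(single_end_read(fragment, 4))
    print(paired_end_reads(fragment, 4))
```

Because both reads come from the same fragment, their expected distance and orientation are known, which is the extra information paired-end data provides.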
Process:
1. Sample Prep: Fragment DNA and add adapters.
2. Cluster Generation: DNA binds to the flow cell, bridge amplification creates clusters of the same sequence.
3. Sequencing by Synthesis (SBS): Fluorescent nucleotides are added and detected one by one.
4. Data Collection: Fluorescence is recorded, building sequence reads.
5. Read Assembly: Reads are aligned or assembled de novo.
6. Data Analysis: Bioinformatics tools analyze the sequences for biological insights.
Both first and second generation sequencing require DNA amplification prior to sequencing.
Third generation sequencing
- Amplification limits the read length: PCR is limited to thousands of nucleotides, and after hundreds of synthesis cycles clusters lose sync (sequencing out of phase)
- Not all DNA amplifies with equal efficiency (e.g. bias against CG-rich DNA)
- Third generation sequencing does not need amplification beforehand; it uses single molecules
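Why clusters lose sync after many cycles can be illustrated with a toy calculation. It assumes each strand in a cluster independently falls out of phase with a fixed probability per cycle; the per-cycle rate used here is a hypothetical value for illustration, not a measured one.

```python
def in_phase_fraction(cycles, dephasing_rate):
    """Fraction of strands in a cluster still in sync after `cycles`
    synthesis cycles, assuming each strand falls out of phase with
    probability `dephasing_rate` in every cycle (toy model)."""
    return (1.0 - dephasing_rate) ** cycles

if __name__ == "__main__":
    rate = 0.005  # hypothetical per-cycle dephasing probability
    for cycles in (50, 150, 300, 600):
        fraction = in_phase_fraction(cycles, rate)
        print(f"{cycles} cycles: {fraction:.2f} of strands in phase")
```

Under this assumption the in-phase signal decays exponentially with cycle number, so the read length of amplified clusters cannot grow indefinitely.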
PacBio
- Long reads, older
PacBio single-molecule, real-time (SMRT) sequencing is based on the principle of sequencing while synthesizing, with read lengths up to 30 kb and throughputs up to 20 Gb. It happens in small wells that provide a stable environment.
Polymerase kinetics can be used to detect base modifications like methylation.
- Ligation of adapters to double-stranded template
- DNA polymerase added
- Sequencing takes place in millions of tiny wells (zero-mode waveguides)
- DNA polymerase adds complementary, fluorescently labelled bases to the DNA strand.
- With incorporation of base, fluorescence signal emitted: sequence reading in real time.
- Sample Prep: Fragment DNA and add adapters for circularization.
- SMRTbell Formation: DNA fragments form circular templates, called SMRTbells.
- Real-Time Sequencing: DNA polymerase replicates the circular DNA, adding fluorescently labeled nucleotides.
- Data Collection: Fluorescent signals are detected in real-time as the polymerase incorporates nucleotides.
- Long Reads: Generates continuous long reads (CLR), often 10-50 kb in length, with high coverage.
- Data Analysis: Bioinformatics tools assemble long reads and correct errors for accurate genome assembly or variant detection.
This method is ideal for sequencing long, repetitive regions and complex genomes.
Nanopore
Single DNA strands pass through a tiny nanopore, and changes in electrical current are measured to identify the sequence of nucleotides in real time. It produces ultra-long reads and allows immediate data analysis, making it ideal for complex genomic regions and real-time applications.
- Sample Prep: Fragment DNA and add sequencing adapters.
- Pore Binding: DNA is guided through a protein nanopore embedded in a membrane.
- Real-Time Sequencing: As DNA passes through the nanopore, changes in electrical current are measured.
- Data Collection: The electrical signal corresponds to the sequence of nucleotides.
- Long Reads: Nanopore sequencing produces ultra-long reads (sometimes over 100 kb).
- Data Analysis: Signals are decoded into base sequences, which can be assembled or analyzed directly.
Nanopore sequencing typically has lower accuracy compared to Illumina sequencing
Summary
- Sanger sequencing is useful for checking a single clone
- Illumina is the NGS standard, and is used for quantitative applications
- PacBio and nanopore for genome sequencing and structural studies
Sanger: dideoxy chain termination, read length <900 bp, run time a few hours, error rate <0.1%
Illumina: sequencing by synthesis, read length 2x150 bp, run time 13-48 hours, >85% of bases at Q30 (1 error in 1,000)
PacBio: single-molecule real-time (SMRT), read length 25 kbp HiFi, run time under 30 hours, Q30 HiFi
Nanopore: nanopore sensing, read length 10 kbp-4 Mbp, run time <72 hours, Q20 (1 error in 100)
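The Q values in the comparison above are Phred quality scores, defined as Q = -10 * log10(P), where P is the probability that a base call is wrong. A minimal sketch of the conversion (the function names are illustrative):

```python
import math

def error_probability(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)

def phred_quality(p):
    """Convert an error probability to a Phred quality score."""
    return -10 * math.log10(p)

if __name__ == "__main__":
    for q in (20, 30):
        p = error_probability(q)
        print(f"Q{q}: about 1 error in {round(1 / p)} bases")
```

This is why Q30 corresponds to 1 error in 1,000 bases and Q20 to 1 error in 100 bases.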
If you want to specifically determine the sequence of a single 500 bp fragment (for instance, a gene you cloned), which sequencing technology would be the wisest to use? Sanger, as it will yield the highest quality for the lowest price.
Which sequencing technologies rely on the action of DNA polymerase? Sanger, Illumina and PacBio, but NOT Oxford Nanopore.