1. Sequencing intro Flashcards
What is the biggest currently running genome project in the UK?
Darwin Tree of Life (part of the Earth BioGenome Project) - aims to sequence all eukaryotic species in Britain and Ireland
- so far ~500 species - mostly insects, because their smaller genomes are easier to sequence
What are the uses of DNA sequencing?
Why sequence DNA:
- detect new species
- genotype individuals
- identify the presence of organisms (e.g. take air / water samples and detect organisms from the DNA found)
- determine epigenetic patterns - gene expression regulation
- determine gene expression patterns
What are the classifications of sequencing based on read length? What methods are used for each?
Short-read:
- Illumina: 150-300 bp, paired-end (fragment read from both ends)
Long-read:
- Sanger sequencing 1000 bp
- Oxford Nanopore Technologies (ONT)
- Pacific Biosciences (PacBio)
Explain the classical sequencing mechanism
Sanger sequencing:
- based on synthesis + base-specific termination
- ssDNA template - add a specific primer - the sequence at the primer site must already be known
- add radioactively / **fluorescently** labelled bases - synthesis terminated by ddNTPs (ddATP/ddGTP/ddCTP/ddTTP) - these **lack the 3’ OH** - no further nucleotide can be added
- for random incorporation use 99% dNTPs + 1% ddNTPs of the specific base A/T/C/G - produces fragments of different lengths - fragment length gives the position of the ddNTP
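The chain-termination idea can be sketched in a few lines of Python. This is a simplified illustration, not a model of a real reaction: it ignores the primer, assumes every possible termination position is represented by at least one fragment, and writes the template 3’->5’ so the new strand reads 5’->3’. The names `sanger_fragments` and `read_sequence` are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def sanger_fragments(template):
    """Map fragment length -> terminating ddNTP base for every possible stop."""
    # The newly synthesised strand is complementary to the template.
    new_strand = "".join(COMPLEMENT[base] for base in template)
    fragments = {}
    for dd_base in "ACGT":  # one reaction (or one dye colour) per ddNTP
        for length, base in enumerate(new_strand, start=1):
            if base == dd_base:
                # A ddNTP incorporated here stops synthesis at this length.
                fragments[length] = dd_base
    return fragments

def read_sequence(fragments):
    """Order fragments from shortest to longest, as on a gel, and read the bases."""
    return "".join(fragments[length] for length in sorted(fragments))

print(read_sequence(sanger_fragments("TACGGATC")))  # ATGCCTAG
```

Sorting by fragment length plays the role of electrophoresis: the shortest fragment reveals the first base after the primer, the next shortest the second base, and so on.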
What is manual and automated Sanger sequencing?
Manual: radioactively labelled ddNTPs - manually work out the sequence of bases from the distance each fragment travelled
Automated: fluorescently labelled ddNTPs - a detector records the fluorescence of each fragment - sequencing chromatogram (peaks for each base)
Compare all sequencing technologies based on read lengths and error rates
- Illumina: short read, low error rate - uses universal adaptor (no primer)
- Sanger: medium length reads, low error rate, requires a sequence-specific primer
- ONT: very long reads, high error rate + MinION portable sequencer, uses universal adaptor (no primer)
- PacBio: long reads, low error rate (because of HiFi), uses universal adaptor (no primer)
What measure evaluates error rates in sequencing?
Q value - Phred quality score
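The Phred score is defined as Q = -10 * log10(P), where P is the probability that a base call is wrong, so Q20 means 1 error in 100 and Q30 means 1 error in 1000. A minimal sketch of the conversion follows. The function names are hypothetical, and the FASTQ decoding assumes the common Phred+33 ASCII encoding, which is not mentioned in these cards.

```python
import math

def phred_to_error_prob(q):
    """Probability that a base call with quality Q is wrong."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred quality score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

def decode_fastq_quality(quality_string, offset=33):
    """Turn a FASTQ quality string into Q values (assumes Phred+33 encoding)."""
    return [ord(ch) - offset for ch in quality_string]

print(phred_to_error_prob(30))      # about 0.001
print(decode_fastq_quality("I5!"))  # [40, 20, 0]
```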
Explain Illumina sequencing
Illumina NGS:
1) Sample preparation: generating a DNA library by sonication (DNA fragmented)
2) Cluster generation: ligation to 2 adaptors - ‘bridge amplification’ (cluster amplification) - when enough bridges have formed - denaturation of one strand => high-density clusters
3) Sequencing by synthesis: sequencing using fluorescently labelled dNTPs (dATP, dGTP, dTTP, dCTP) with a reversible 3’ block - universal primer annealed - DNA pol - sequencing of all sites started at once - imaging records the fluorescent colour at each position - after imaging the dye is cleaved => cycle repeated many times until all bases are sequenced
4) Data analysis: overlapping reads aligned - data analysed
How are next generation sequencing (NGS) technologies different to original DNA sequencing methods?
NGS technologies compared to original methods:
- Sequence DNA directly
- DNA cut into small fragments ~200 bp (e.g. by sonication)
- DNA fragments immobilised onto a solid support - DNA molecules physically separated
Describe the physical platform used in Illumina NGS
Illumina NGS uses a glass flowcell - short ss oligonucleotides (P5, P7) bound to the surface or in nanowells - they form a dense lawn with free 3’ OH ends, to which the adaptors (ligated to the sample fragments) bind
The bound oligonucleotides act as primers for DNA polymerisation - the bound fragment with adaptors acts as the template strand
Explain the process of sonication
Sonication: using high-frequency sound waves to fragment DNA sequence into smaller pieces
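As a rough illustration only, random fragmentation can be mimicked by breaking a sequence at randomly chosen positions. The sketch below assumes uniformly random break points and a target mean fragment length of about 200 bp; real sonication gives a broader size distribution and is usually followed by size selection. The function name `sonicate` and its parameters are hypothetical.

```python
import random

def sonicate(sequence, mean_fragment_length=200, seed=0):
    """Break a sequence at random positions into roughly mean-length pieces."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    n_breaks = max(len(sequence) // mean_fragment_length - 1, 0)
    # Distinct break points strictly inside the sequence, so no fragment is empty.
    breaks = sorted(rng.sample(range(1, len(sequence)), n_breaks))
    edges = [0] + breaks + [len(sequence)]
    return [sequence[start:end] for start, end in zip(edges, edges[1:])]

genome = "ACGT" * 500            # 2000 bp toy sequence
fragments = sonicate(genome)
print(len(fragments))            # 10
print(sum(len(f) for f in fragments) == len(genome))  # True
```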
Explain cluster generation in Illumina NGS
When sonicated DNA added:
1) sonicated DNA fragments with ligated adaptors bind to embedded oligonucleotides
2) density of attached DNA adjusted - a single DNA molecule in each separate well
3) Initial extension: DNA pol adds dNTPs to the 3’ end to make ds DNA - oligonucleotides P5 and P7 act as primers - sonicated DNA acts as the template strand
4) Denaturation performed - original sonicated DNA washed off - ss copy left
5) Cluster generation: renaturation conditions created - the free (non-bound) adaptor end binds to another embedded oligonucleotide - bridge formed - DNA pol - another round of DNA synthesis = bridge amplification
=> at each step the two strands are separated to act as templates for the next round of strand synthesis
Steps 3)-5) repeated 35 times to create a cluster of identical sequences in close proximity
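Because each round of bridge amplification can at most double the number of strands, the cluster size after n rounds is bounded by 2^n. The sketch below computes this idealised bound. The `efficiency` parameter (the fraction of strands copied per round) is a hypothetical knob added for illustration; in practice growth is expected to be limited by the supply of nearby surface oligonucleotides, so real clusters are far smaller than the bound.

```python
def cluster_size(cycles, efficiency=1.0):
    """Idealised strand count in one cluster, starting from a single strand."""
    # Each round, a fraction `efficiency` of the strands is copied once.
    return (1 + efficiency) ** cycles

print(cluster_size(35))        # upper bound: 2**35 strands
print(cluster_size(35, 0.2))   # much smaller if only 20% of strands are copied per round
```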
Explain the sequencing part of Illumina NGS
Illumina sequencing (sequencing of all DNA fragments at once):
- universal sequencing primer annealed to adapter sequences
- DNA pol uses dNTPs (dATP, dGTP, dTTP, dCTP), each with a different fluorescent group + a reversible 3’ block
- incorporation of one fluorescent dNTP temporarily blocks further extension - detector reads the fluorescence at each DNA fragment
- the fluorophore + block are removed - new 3’ OH free for the next polymerization step - next fluorescent, blocked dNTP added = repeated in cycles until the whole fragment is recorded (the nucleotide stays, only the fluorophore + block are removed)
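The cycle described above (incorporate one blocked base, image, cleave, repeat) can be sketched as a loop in which every cluster gains exactly one base per cycle. This is a toy model: templates are written 3’->5’ so reads come out 5’->3’, each template stands for a whole cluster, and the dye colours in `DYE` are hypothetical placeholders rather than the real Illumina chemistry.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}  # hypothetical colours

def sequence_by_synthesis(templates, cycles):
    """Return the read for each cluster and the colours imaged in each cycle."""
    reads = ["" for _ in templates]
    images = []
    for cycle in range(cycles):
        image = []
        for i, template in enumerate(templates):
            if cycle >= len(template):
                image.append(None)  # template fully copied, nothing to image
                continue
            # The 3' block allows exactly one base to be added this cycle.
            base = COMPLEMENT[template[cycle]]
            image.append(DYE[base])  # imaging step: record the colour
            reads[i] += base         # dye + block cleaved, nucleotide stays
        images.append(image)
    return reads, images

reads, images = sequence_by_synthesis(["TACG", "GGCA"], cycles=3)
print(reads)      # ['ATG', 'CCG']
print(images[0])  # ['green', 'blue']
```

The number of cycles sets the read length, which is why all reads from one run have the same maximum length.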
Explain adapter ligation to sample DNA fragments in Illumina
Adapter ligation: adapters ligated to both ends of each DNA fragment - a different adapter on each end -> on the glass flowcell the adapters bind (base pair) to oligonucleotides P5 and P7 - which act as primers for DNA polymerization (making the bound ss sequence ds)
What is the difference between the primers and oligonucleotides bound to glass flowcells in Illumina?
Primers: bind to sonicated DNA sequences - allow binding to oligos embedded on glass flowcells
Oligonucleotides (P5 and P7): embedded in glass flowcells (the surface) - after binding act as primers for ds DNA synthesis