Genomics - Next Generation Sequencing Flashcards
What are the 4 nucleotides for DNA?
- Adenine
- Guanine
- Cytosine
- Thymine (replaced by uracil in RNA)
What does DNA consist of?
Base
Pentose (deoxyribose)
Phosphate (a triphosphate on free dNTPs, a single phosphate once incorporated)
How does DNA extension occur?
Through the attack of the 5’-(alpha-)phosphate of the free nucleotide by the 3’-OH of the pentose at the end of the growing strand.
A phosphodiester bond is formed and a diphosphate (pyrophosphate) is released.
What are the three stages of PCR?
1- Denaturation
2- Anneal primer
3- Extend new strand by incorporating dNTPs
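The three stages above repeat every cycle, and in the ideal case each cycle doubles the number of copies. A minimal sketch of that arithmetic, assuming a hypothetical per-cycle efficiency parameter (1.0 means perfect doubling); the function name is illustrative only:

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Estimate copy number after a number of PCR cycles.

    efficiency is the fraction of templates copied per cycle
    (1.0 = every strand is copied, so the amount doubles each cycle).
    This is an idealised model, not a description of a real reaction.
    """
    return initial_copies * (1 + efficiency) ** cycles


print(pcr_copies(1, 30))       # ideal doubling for 30 cycles
print(pcr_copies(1, 30, 0.9))  # a less efficient, hypothetical reaction
```

Real reactions plateau as primers and dNTPs run out, so treat this as an upper bound.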
How were the first bacterial and human genomes sequenced?
Sanger Sequencing
What is the process for library preparation in Sanger sequencing?
- DNA extraction
- DNA fragmentation
- Clone into vectors, which provide known sites to prime the DNA
- Transform bacteria, grow and isolate vector DNA
- Sequence the library
What is a negative of Sanger sequencing library preparation?
It is very labor intensive - up to 700bp per read
What are the two types of mechanical DNA shearing?
Sonication
G-tube
Describe the process of sonication. What are the positives?
- A focused beam of ultrasound is applied to the DNA
- The longer the beam is focused on the DNA, the more fragmented the DNA becomes
- It is highly controllable
- Can shear multiple samples at a time - parallel processing - up to 96 samples
- Can shear to a range of fragment sizes
Describe the process of G-tube shearing
- Uses centrifugal force to shear the DNA
- Smaller range of fragment sizes than sonication
- Low throughput - 12 samples
Describe the process of enzymatic DNA shearing
- Nick the DNA with an enzyme
- The DNA is then cleaved at the single-stranded (nicked) site by an endonuclease
- Fragment length depends on the reaction time
What is the read length of Sanger sequencing?
Up to 700bp per sequence
For a given template DNA, Sanger sequencing is the same as PCR except...
- Uses only a single primer and polymerase to make the new ssDNA pieces
- Includes regular nucleotides for extension but also dideoxynucleotides (ddNTPs)
What are dideoxynucleotides?
Nucleotides that lack the 3’-OH group, so they terminate the chain once incorporated. In Sanger sequencing they are labelled with fluorophores
Why use fluorophores in Sanger sequencing?
Each of the four ddNTPs carries a different fluorophore that fluoresces at a different wavelength, so a laser-excited signal shows which base is at which position
What is the output of Sanger sequencing?
- Chromatogram for approx 600-1000bp.
- The blue bars below the chromatogram tell us the confidence level that the base identified is correct.
Max output = 1kb = 1000bp
What is the raw read accuracy of sanger sequencing?
99.99% (Q40)
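The Q values quoted on these cards are Phred quality scores. A minimal sketch of the conversion, assuming the standard definition Q = -10 * log10(error probability); the function names are illustrative only:

```python
import math


def phred_to_accuracy(q):
    """Return per-base accuracy (0 to 1) for a Phred quality score q."""
    error_probability = 10 ** (-q / 10)
    return 1 - error_probability


def accuracy_to_phred(accuracy):
    """Return the Phred score for a per-base accuracy between 0 and 1."""
    return -10 * math.log10(1 - accuracy)


for q in (10, 20, 30, 40):
    print(f"Q{q}: {phred_to_accuracy(q):.2%} accurate")
```

Each step of 10 in Q means ten times fewer errors, so Q40 is one wrong base in 10,000.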
What are the negatives of Sanger Sequencing?
Expensive, low throughput - single read output
Labor intensive
Low sensitivity - a mutation in cancer needs to be present in more than 30% of cells to be detected
What are some examples of Next Generation Sequencing platforms?
Illumina
Oxford nanopore
Roche
Life technologies
What is NGS?
- Technologies that enable you to sequence millions to billions of short sequences in a single run
- Massively parallel sequencing, either of amplified groups of molecules together or of single molecules
- Essentially sequences lots of DNA at once, unlike Sanger sequencing
Describe the library preparation for 454
- DNA is sheared and ends are polished by removing any unpaired bases
- Adapters A and B are added to each end. At this point the DNA is made single stranded
- One adapter contains biotin, which binds to a coated bead.
- Oil is added to the beads and an emulsion is created
- PCR is performed - each droplet becomes its own micro reactor
- Each bead ends up with about a million identical copies of the original DNA
What are adapters A and B in 454 library prep?
Short pieces of known DNA sequence that make the fragments compatible with the downstream steps
Why is the ratio of beads to DNA important in 454 library prep?
So that most beads only get a single molecule of DNA attached to them.
Describe the sequencing process of 454
After PCR the oil is removed and beads are added to the picotiter plate.
Each well is just big enough to hold a single bead
Pyrosequencing enzymes are attached to much smaller beads and added to the wells
The plate is washed with each of the 4 dNTPs
The plate is coupled to a fiber optic chip
A CCD camera records the light flashes from each well.
Describe the chemistry of 454
- When a base is incorporated a pyrophosphate (PPi) is released
- ATP sulfurylase combines this pyrophosphate with APS to create ATP
- A second enzyme, luciferase, uses the ATP to convert luciferin into a light signal
What is the output of 454
A flowgram
Starts with a key sequence - TCAG for calibration
- Peak height gives the number of bases incorporated in a flow: a 1-mer is a single base, a 2-mer is two identical bases in a row, etc.
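Reading a flowgram can be sketched as rounding each light intensity to a whole number of bases. This is a minimal illustration, assuming a cyclic flow order of T, A, C, G and intensities already normalised so that about 1.0 means one base; the function name and the signal values are made up:

```python
def decode_flowgram(signals, flow_order="TACG"):
    """Turn per-flow light intensities into a base sequence.

    signals: one intensity per nucleotide flow, normalised so that
             ~1.0 means a 1-mer, ~2.0 a 2-mer, and ~0.0 no incorporation.
    flow_order: assumed cyclic order in which the dNTPs are washed over the plate.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        count = int(round(signal))  # nearest whole number of incorporated bases
        bases.append(nucleotide * count)
    return "".join(bases)


# Hypothetical noisy intensities for the key sequence TCAG
key_signals = [1.02, 0.05, 0.97, 0.03,  # T, (A), C, (G)
               0.04, 1.01, 0.02, 0.99]  # (T), A, (C), G
print(decode_flowgram(key_signals))
```

Because the key sequence is known in advance, its four 1-mer peaks show what intensity one base produces in that well.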
What are some negatives to 454
- Only 1 million reads per run
- Homopolymers (e.g. AAAA) are a big issue
A vs AA is a 100% difference in signal, but AAAA vs AAAAA is only a 25% difference, so long runs are hard to count
Raw read accuracy is 99% (Q20), i.e. a 1% error rate
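The homopolymer problem is plain arithmetic: the longer the run, the smaller the relative jump in light from one extra base. A minimal sketch, assuming the signal is directly proportional to the number of bases incorporated and measuring the change relative to the shorter run; the function name is illustrative only:

```python
def relative_signal_increase(n):
    """Fractional increase in signal going from an n-mer to an (n+1)-mer,
    assuming light output is proportional to the number of bases added."""
    return ((n + 1) - n) / n


for n in (1, 2, 4, 10):
    print(f"{n}-mer vs {n + 1}-mer: {relative_signal_increase(n):.0%} more signal")
```

Once the step size approaches the noise in the measurement, the run length can no longer be called reliably.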
What is the current market leader?
Illumina
What are the different types of Illumina sequencer?
MiniSeq
MiSeq
NextSeq 500
HiSeq 4000
NovaSeq
What are the positives of enzymatic DNA shearing in illumina sequencing?
- Rapid prep - 15 min manual - 90 min total
- Optimized for small genomes, PCR amplicons and plasmids
- Innovative sample normalization - no library quantification needed
- Fastest time to result - < 8 hours
- Ultra low input - single nanogram of DNA needed
Describe the steps of enzymatic library preparation in Illumina sequencing.
1- Tagmentation of template DNA
Transposase enzymes cut pieces of DNA out of one area and put them somewhere else
They cut at random positions
2 - PCR to add adapters and indices
3 - Clean up and sequence
Indexes are DNA sequences that are already known
Why are indexes used in Illumina sequencing?
Indexes are short pieces of DNA of known sequence. They let you separate the samples after the run, so multiple samples can be sequenced at once.
Each sample gets its own unique index
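Separating samples by index after the run (demultiplexing) can be sketched as a simple lookup. This is a minimal illustration that assumes exact index matches only; the index sequences, sample names and reads are hypothetical:

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample):
    """Group reads by sample using the index sequence attached to each read.

    reads: iterable of (index_sequence, insert_sequence) tuples.
    index_to_sample: dict mapping each known index to its sample name.
    Reads whose index is not recognised are collected under "undetermined".
    """
    by_sample = defaultdict(list)
    for index_seq, insert_seq in reads:
        sample = index_to_sample.get(index_seq, "undetermined")
        by_sample[sample].append(insert_seq)
    return dict(by_sample)


# Hypothetical indexes and reads, for illustration only
index_to_sample = {"ATCACG": "sample_1", "CGATGT": "sample_2"}
reads = [
    ("ATCACG", "GGCTTAACGT"),
    ("CGATGT", "TTAGGCATCC"),
    ("ATCACG", "ACGTACGTAA"),
    ("NNNNNN", "CCCCGGGGTT"),
]
result = demultiplex(reads, index_to_sample)
for sample, seqs in result.items():
    print(sample, len(seqs))
```

Real pipelines usually tolerate a mismatch or two in the index; that is left out here to keep the idea visible.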
What is the purpose of locus specific primers in illumina?
Allow for specific amplification
What is the purpose of the P5/7 tail in illumina?
Bind the sample to the flow cell
What are the advantages of indexing/barcoding?
Reduced reagent cost
Quicker turnaround time
What are the disadvantages of indexing/barcoding?
Reduced read number per sample
Introduces a normalization step to minimise variation in read number per sample
Describe cluster generation and its purpose in illumina
- Adapters allow DNA to bind to the flow cell
- P5/7 act as primers for PCR
- Clonal amplification of DNA
- Purpose - generate enough clusters to get a detectable signal
- Have to manage how many DNA strands are added to ensure clusters don't form too close together
What is bridge amplification in Illumina?
A bound strand bends over and pairs with a neighbouring primer on the flow cell, forming a bridge. Polymerase copies the strand, and repeated cycles build a cluster bright enough for the camera to detect the fluorescence
Describe the chemistry of illumina sequencing
- Uses dye-terminator synthesis of the second DNA strand
- Starting from the primer, new bases are added 1 at a time with fluorescent tags
- The tagged nucleotides have a blocked 3’-OH, so the next base can only be added once the block is removed
- Unlike pyrosequencing you don't have to worry about homopolymer sequences
- The reversible terminator is what stops the reaction, and it can be removed
Describe the process of sequencing in illumina sequencing
- Sequencing primer binds. Reversible terminator dNTPs bind
- 4 images taken at different wavelengths
- Reversible terminator removed and next base added
- Only ever reads 1 base at a time. After each base the reaction stops, the fluorescence is read, then the terminator is removed and the next base is added
- Max 300 cycles
What is paired end index sequencing? - illumina
- Required for discovery of genome variation
- Better coverage uniformity
- Indel events can be detected between pairs
- Can read in forward and reverse to ensure complementary reads, which increases confidence
What are the positives of illumina sequencing?
Scalable - 1 million to 3 billion reads per run
Market leader
Accuracy 99.9% (Q30), i.e. a 0.1% error rate
Environmentally friendly technology
What are the different types of PacBio sequencers?
Sequel I
Sequel II
Revio System
Describe the 5 steps of sample preparation for PacBio
1 Fragment DNA to desired size
2 Repair ends
3 Ligate adapters (known sequences), which form hairpins at the ends
4 Purify DNA
5 Sequencing - bind primer and DNA polymerase; the polymerase runs around the sample until the reagents run out
What are the 2 sequencing modes of PacBio?
1 - Standard (LS - long sequence reads)
Large insert sizes
Generates one pass of each molecule sequenced
2 - Circular consensus (CCS - high quality sequencing reads)
Small insert sizes
Generates multiple passes of each molecule
Describe the Chemistry of PacBio
- Uses a triphosphate-linked fluorophore, unlike Sanger and Illumina, which use base-linked fluorophores
- Reduces steric hindrance (doesn't slow down the reaction)
- Allows sequencing in real time
- ZMWs (zero-mode waveguides, the wells) are a specific shape to amplify the signal: you amplify the signal rather than the DNA
Describe the sequencing process of PacBio
- Diffusion loading into ZMWs
- Single polymerase and DNA fragment per ZMW
- The fluorescent nucleotide is held in place by the polymerase while it is incorporated
- Laser used to excite the fluorophore and measure the fluorescence
- 10 bp/sec
- Interpulse duration (IPD) - the time between 2 bases being incorporated - this increases when a base is methylated
- Single molecule resolution in real time - short waiting time and simple workflow
- no amplification required
- Direct observation
- Long reads - identify repeats and structural variants
What is the raw read and consensus accuracy of PacBio?
Raw read = 90% accuracy (Q10)
Consensus
4 pass = Q20
10 pass = Q30
20 pass = Q40
>25 pass = Q50
Each pass is one trip of the polymerase around the DNA molecule
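Why more passes give a higher consensus quality can be sketched with a majority vote: random errors fall in different places on each pass, so they get outvoted. This is a simplified illustration, not PacBio's actual consensus algorithm; it assumes the passes are already aligned and differ only by substitutions, and the sequences are made up:

```python
from collections import Counter


def majority_consensus(passes):
    """Call a consensus base at each position by majority vote.

    passes: list of equal-length strings, one per pass over the same insert.
    Simplification: assumes the passes are already aligned and contain only
    substitution errors (real data also has insertions and deletions).
    """
    consensus = []
    for column in zip(*passes):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Hypothetical passes of one molecule, each with a different random error
passes = [
    "ACGTTGCA",
    "ACGATGCA",  # error at position 4
    "ACGTTGCA",
    "TCGTTGCA",  # error at position 1
    "ACGTTGCC",  # error at position 8
]
print(majority_consensus(passes))
```

With an odd number of passes and errors at different positions, every error is outvoted and the consensus matches the true sequence.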
What are the different types of Oxford Nanopore sequencers?
Flongle
MinION
PromethION
GridION
Describe the sequencing process of Oxford Nanopore
- Unlike all the other technologies, it doesn't use a polymerase
- Protein pores are embedded in the membrane of the flow cell
- Current passes through the membrane and nanopore
- Tether protein attaches to the pore
- The motor protein unzips the DNA and feeds a single strand through the nanopore
- How much the DNA strand disrupts the current as it goes through the pore correlates with which base is travelling through the pore at that time
- 400 base/sec
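To put the speed in context, here is a small arithmetic sketch comparing the 400 bases/sec on this card with the 10 bp/sec on the PacBio card. The 20 kb fragment length is a hypothetical example, and constant speed is assumed:

```python
def seconds_to_read(length_bases, bases_per_second):
    """Time needed to read a fragment at a constant speed."""
    return length_bases / bases_per_second


fragment_length = 20_000  # hypothetical 20 kb fragment

nanopore_seconds = seconds_to_read(fragment_length, 400)  # 400 bases/sec (this card)
pacbio_seconds = seconds_to_read(fragment_length, 10)     # 10 bp/sec (PacBio card)

print(f"Nanopore: {nanopore_seconds:.0f} s, PacBio: {pacbio_seconds:.0f} s")
```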
What are the advances in oxford nanopore technologies?
Single pass read = Q20
Duplex sequencing (Q30) - motor protein on both ends
Methylation detection via current variances