Recap Next generation sequencing technologies Flashcards
Explain Sanger sequencing
In Sanger sequencing, DNA strands are added to a tube along with a primer, dNTPs (deoxynucleotide triphosphates), ddNTPs (dideoxynucleotide triphosphates) and DNA polymerase. The polymerase copies the DNA using dNTPs and occasionally incorporates a ddNTP instead. A ddNTP lacks the 3'-OH group needed to attach the next nucleotide, so its incorporation terminates that strand.
Because termination happens at random positions, the mix ends up containing copies of every length, which are separated by size using electrophoresis. Each of the four ddNTPs carries a fluorescent dye of a different colour, so detecting the fluorescence shows which base sits at the end of each fragment. Reading the terminal bases in order of fragment length gives the sequence.
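The chain-termination idea can be illustrated with a toy simulation. This is only a sketch: the function names, the termination probability and the number of molecules are made up for illustration, and real reactions involve far more molecules and a separate base-calling step.

```python
import random

def sanger_fragments(synthesised, n_molecules=5000, p_terminate=0.2, seed=0):
    """Simulate random chain termination while copying a strand.

    Each fragment is recorded as (length, terminal base), where the
    terminal base stands in for the colour of the incorporated ddNTP.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_molecules):
        for length, base in enumerate(synthesised, start=1):
            # With a small probability a ddNTP is incorporated and the copy stops.
            if rng.random() < p_terminate:
                fragments.add((length, base))
                break
    return fragments

def read_gel(fragments):
    """Sort fragments by length (as electrophoresis does) and read the end colours."""
    return "".join(base for _, base in sorted(fragments))

sequence = "ATGCCGTAAGCT"
print(read_gel(sanger_fragments(sequence)))
```

With enough molecules every length is represented, so reading the terminal bases from shortest to longest fragment recovers the copied sequence.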
What are the characteristics of Sanger sequencing?
The electrophoresis step limits the read length; about 1000 bp is the upper limit.
Can only read one DNA sequence at a time.
Quality of base calls is generally high (Q > 30).
Explain base call quality
High quality is a strong peak in one colour on the electropherogram with no overlapping peaks.
Measured on the Phred scale, a logarithmic scale that expresses the probability that a base call is wrong.
What is the purpose of the Phred scale and how does it work?
It expresses the probability of error and thus the quality or certainty of each base call.
A quality score of 10 means there is a 1 in 10 (10%) probability that the base call is wrong.
20 = 1/100 (1%)
30 = 1/1000 (0.1%)
40 = 1/10,000 (0.01%)
i.e. for scores that are multiples of 10, the first digit of the quality score equals the number of zeros after the 1 in the denominator. The larger the score, the higher the quality.
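The values above follow from the standard Phred definition, Q = -10 * log10(P), where P is the probability that the base call is wrong. A minimal sketch of the conversion in both directions (the function names are chosen here for illustration):

```python
import math

def phred_to_error_prob(q):
    """Error probability for a Phred quality score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred quality score for an error probability: Q = -10 * log10(P)."""
    return -10 * math.log10(p)

for q in (10, 20, 30, 40):
    print(q, phred_to_error_prob(q))
```

The same functions also handle scores between the multiples of 10, where the "count the zeros" shortcut no longer applies.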
What are the requirements for a sequencing method?
For both first and next-generation methods:
1. We need to know the position along the sequence
2. We need to know what base is in a given position
For next-generation sequencing:
3. We need to know which sequence is being read (i.e. where it came from, since we’re reading a lot at once)
Explain 454 “Pyrosequencing”
(First widely used NGS method to read multiple DNA molecules in parallel in the early 2000s)
As each dNTP is incorporated into the growing strand it releases PPi (pyrophosphate). The PPi is converted into ATP (by ATP sulfurylase), which is then used to produce light (by luciferase). This light emission is measured in real time.
Sequencing happens on a bead that has one type of DNA bound to it (in many copies, to amplify the light signal). Each bead is placed in a well at a fixed position, so we know which piece of DNA is where when we take photos.
The four dNTPs are washed over the beads one type at a time and the light emitted is recorded, which tells us the identity and position of the bases. A photo is taken for each wash, and the intensity of the light is related to how many bases were incorporated.
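The cycle of washes for a single well can be sketched as a "flowgram": one signal per wash, with the signal equal to the number of bases incorporated. This is an idealised illustration; the flow order `TACG`, the function names and the noise-free integer signals are assumptions made for the example, not details taken from the notes above.

```python
def flowgram(read, flow_order="TACG"):
    """Idealised light signals for one well.

    `read` is the strand being synthesised. Each wash adds one dNTP type;
    the signal is the number of consecutive bases of that type incorporated.
    """
    if set(read) - set(flow_order):
        raise ValueError("read contains a base that is never flowed")
    signals = []
    pos = 0
    flow = 0
    while pos < len(read):
        base = flow_order[flow % len(flow_order)]
        count = 0
        # A homopolymer run is incorporated in a single wash.
        while pos < len(read) and read[pos] == base:
            count += 1
            pos += 1
        signals.append((base, count))
        flow += 1
    return signals

def decode(signals):
    """Rebuild the sequence from (base, signal) pairs."""
    return "".join(base * count for base, count in signals)

signals = flowgram("TTAGGGC")
print(signals)
print(decode(signals))
```

A wash with signal 0 means that base was not next in the sequence, and a signal of 3 means three identical bases were added at once.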
Characteristics of 454 sequencing
Can sequence thousands to hundreds of thousands of different DNA molecules per run.
Read length is limited to 400-500 bp.
Potential for the copies of DNA on one bead to get out of sync, which reduces the signal as the run proceeds.
The major issue is homopolymer errors: when the same base is repeated more than 3-4 times, it is difficult to tell how many bases are in the repeat (e.g. AAAA looks very similar to AAAAA).
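A rough way to see why long homopolymers are hard to call: if the light signal is assumed to be directly proportional to the number of bases incorporated (a simplifying assumption for this sketch), the relative difference between a run of n and a run of n + 1 bases shrinks as n grows.

```python
def relative_signal_step(n):
    """Fractional signal increase when a homopolymer grows from n to n + 1 bases.

    Assumes light intensity is proportional to the number of bases
    incorporated, which is an idealisation.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 / n

for n in (1, 2, 4, 8):
    print(f"{n} -> {n + 1} bases: signal rises by {relative_signal_step(n):.1%}")
```

Going from A to AA doubles the signal, while going from AAAA to AAAAA raises it by only a quarter, so the same amount of measurement noise is more likely to cause a miscount.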