Sequencing DNA Flashcards
primer
provides a free 3’ OH for synthesis to begin from (synthesis runs 5’-3’ and requires a free 3’ OH for addition of the next nucleotide)
sanger sequencing primer
need to know primer sequence to begin sanger sequencing
many organisms share primer sequences
can then infer the sequence after the primer
sometimes more difficult than this
deoxynucleotides
added to growing chain during in vivo DNA synthesis
pyrophosphate (PPi) is released and the nucleotide is covalently added, leaving a free 3’ OH available for the next base to be added (further synthesis)
Terminating ddNT
lack 3’ OH
stop synthesis when they are added to growing chain
sanger dideoxy sequencing
supply mix of 3 deoxy nucleotides and one dideoxynucleotide
(only one of bases is dideoxy)
synthesis reaction will terminate when a dideoxynucleotide is incorporated
measure length of terminated fragment and this tells us where the termination happened
this is not so useful, as it only gives the first position where this base appears
instead supply all 4 dNT and one type of ddNT
radiolabel the NT
run fragments on gel
fluorescent label dideoxy sanger sequencing
radiolabelled ones not ideal as they involve a radioactive substance
instead use 99% dNT and 1% ddNT, with a different fluorescent tag for each ddNT base
can then run on gel
measure length and colour of the terminated fragments
this tells us:
-where the terminations occurred in the sequence
-what base was involved
can infer sequence from this
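A minimal sketch of the length-and-colour read-out described above. The colour assignment and function names are hypothetical (real dye sets differ), and it assumes there are enough copies that a ddNT terminates at least one chain at every position.

```python
# Hypothetical dye colours, one per ddNT base (assumption for illustration).
DYE_COLOUR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
COLOUR_BASE = {colour: base for base, colour in DYE_COLOUR.items()}


def terminated_fragments(synthesised):
    """Return (length, colour) for a chain terminated at each position.

    The colour is that of the ddNT which ended the fragment.
    """
    return [(i + 1, DYE_COLOUR[base]) for i, base in enumerate(synthesised)]


def read_gel(fragments):
    """Sort fragments shortest first (as a gel does) and read off the colours."""
    return "".join(COLOUR_BASE[colour] for _, colour in sorted(fragments))


fragments = terminated_fragments("GATTACA")
# The order of fragments in the tube does not matter; the gel sorts by length.
print(read_gel(fragments[::-1]))
```

Length gives the position, colour gives the base, so sorting by length recovers the synthesised strand.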
automating sanger dideoxy sequencing
run many reactions in parallel capillaries
automatically recording the fluorescence signal
parallel decoding of fluorescence signals
gives us a sequencing chromatogram
with the fluorescence signal at each location and its corresponding base in sequence
sanger dideoxy error rates
use PCR to amplify DNA samples for sequencing (need millions of copies to produce detectable signal)
PCR can introduce errors (about 1 in 10^4)
-meaning a base is misincorporated about 1 time in 10^4
base call quality (Q score) - reported on a log scale from the probability P of the call being an error (Q = -10 log10 P)
10 - 1 in 10 - 90% base call accuracy
20 - 1 in 100 - 99%
30 - 1 in 1000 - 99.9%
40 - 1 in 10,000 - 99.99%
50 - 1 in 100,000 - 99.999%
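The table above follows the Phred scale, Q = -10 log10(P), where P is the probability that the base call is wrong. A small sketch of the conversion in both directions (function names are illustrative only):

```python
import math


def phred_to_error_probability(q):
    """Probability that a base call with quality score q is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Quality score for an error probability p."""
    return -10 * math.log10(p)


for q in (10, 20, 30, 40, 50):
    p = phred_to_error_probability(q)
    print(f"Q{q}: 1 in {round(1 / p):,} errors, {100 * (1 - p):.3f}% accuracy")
```

Each step of 10 in Q is a tenfold drop in the error probability.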
sanger sequencing reads usually 300-1000 bases in length of >Q30 base calls
- so there is a probability of base calls being an error
-and it is slow, serial, expensive
illumina sequencing
short read sequencing
is NGS
invention of NGS tech (mainly illumina) caused insane drop in cost per megabase of sequencing
similar to sanger - as in it uses terminators
BUT it is REVERSIBLE terminator sequencing
uses fluorescently labelled “reversibly-blocked” nucleotides
allows the sequence to be read one base at a time
-incorporate fluorescently labelled base
-read fluorescence signal (different depending on base)
-remove block on 3’OH
-remove fluorophore
-then incorporate the next fluorescently labelled nucleotide
commercialised by Illumina
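The cycle above (incorporate, read, unblock, remove fluorophore, repeat) can be sketched as a loop. This is a toy model under stated assumptions: the colour assignment is hypothetical, the template is given in the order the polymerase meets it, and every cycle incorporates exactly one base with no errors.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Hypothetical dye colours (assumption for illustration).
DYE_COLOUR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
COLOUR_BASE = {colour: base for base, colour in DYE_COLOUR.items()}


def sequence_by_synthesis(template, cycles):
    """Read one base per cycle from a template strand."""
    read = []
    for position in range(min(cycles, len(template))):
        # 1. incorporate the complementary, reversibly blocked, labelled base
        incorporated = COMPLEMENT[template[position]]
        # 2. read the fluorescence signal (colour depends on the base)
        colour = DYE_COLOUR[incorporated]
        read.append(COLOUR_BASE[colour])
        # 3. and 4. removing the 3' block and the fluorophore is modelled
        # simply by moving on to the next position in the next cycle
    return "".join(read)


print(sequence_by_synthesis("TACG", cycles=3))
```

The block is what limits each cycle to one base; removing it is what lets the next cycle proceed.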
Illumina set up
sequencing takes place in many flow cells
doesn't require knowledge of a primer sequence in the sample
instead adapters of known sequence are attached to the fragments, and primers are used for those adapters
lawn of adapters stuck to surface of slide act as primers to amplify the DNA fragments
BRIDGE AMPLIFICATION
clusters grow clonally from same individual fragment
need to do this as need to make many copies of sequence so signal can be seen by illumina machine detectors
each cluster identified by physical location on the slide
sequence is read from the order of fluorescence colours emitted by the cluster
gain a sequence for each cluster
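A rough sketch of how one read per cluster could be assembled from per-cycle images, keyed by the cluster's location on the slide. The image format (a dict from (x, y) to a colour name) and the colour assignment are assumptions for illustration, not the instrument's real data format.

```python
# Hypothetical colour-to-base mapping (assumption for illustration).
COLOUR_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}


def reads_from_images(images):
    """Build one read per cluster from a list of per-cycle images.

    images: one dict per cycle, mapping cluster location (x, y) to the
    colour seen at that location in that cycle.
    """
    reads = {}
    for image in images:
        for location, colour in image.items():
            reads[location] = reads.get(location, "") + COLOUR_BASE[colour]
    return reads


images = [
    {(0, 0): "green", (5, 2): "red"},     # cycle 1
    {(0, 0): "blue", (5, 2): "red"},      # cycle 2
    {(0, 0): "yellow", (5, 2): "green"},  # cycle 3
]
print(reads_from_images(images))
```

Because each cluster stays at a fixed location, its colours across cycles can be strung together into one sequence.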
illumina benefits
don't need to know primer sequences
Illumina NovaSeq can generate up to 3 terabases (3x 10^12) per run
up to 20 billion reads (2x 10^10) per run
150 bases per read each way - 300 total max length
average Q>=30
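The figures above are consistent with each other, assuming the 20 billion figure counts each 150-base read separately (an assumption, the card does not say). A quick arithmetic check:

```python
reads_per_run = 20_000_000_000  # up to 2 x 10^10 reads
bases_per_read = 150            # read length in each direction

total_bases = reads_per_run * bases_per_read
print(f"{total_bases:,} bases = {total_bases / 1e12:g} terabases")

# A fragment read from both ends gives at most 2 x 150 bases of sequence.
max_bases_per_fragment = 2 * bases_per_read
print(f"max bases per fragment: {max_bases_per_fragment}")
```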
Illumina drawbacks
other machines can produce longer reads - which are more useful for genome sequencing
so illumina is not as good for that
errors occur
sometimes systematic - due to underlying properties of sample sequence
long read technologies
2 main players:
-Pacific Biosciences single molecules long read - PacBio SMRT/ SEQUEL
-Oxford nanopore technologies (ONT). Synthetic nanopores and minION, PromethION instruments
promethION = multiple minION put together
difference from sanger and illumina - no terminators (PacBio uses fluorescently labelled dNT)
PacBio SMRT / SEQUEL
single molecule sequencing using
FLUORESCENTLY LABELLED DEOXYNUCLEOTIDES
fluorescent label is on the PPi which is removed from dNT when incorporated into DNA chain
ssDNA input
dsDNA out
process of incorporating this fluorescently labelled dNT releases light which can be detected by the machine
-zero mode waveguide illumination of the polymerase
-real time monitoring of nucleotide incorporation
DNA pol is fixed to bottom of the well
light of diff wavelength released for each base
can detect this peak and infer sequence from the order
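A minimal sketch of inferring sequence from the order of light pulses. The pulse format (time, channel) and the channel-to-base mapping are hypothetical; real signal processing on the movie data is far more involved.

```python
# Hypothetical detector channels, one per wavelength (assumption).
CHANNEL_BASE = {1: "A", 2: "C", 3: "G", 4: "T"}


def call_bases(pulses):
    """Order pulses by time and translate each channel into a base.

    pulses: iterable of (time_in_seconds, channel) tuples, one per
    detected incorporation event.
    """
    return "".join(CHANNEL_BASE[channel] for _, channel in sorted(pulses))


pulses = [(0.9, 3), (0.2, 1), (0.5, 4)]
print(call_bases(pulses))
```

The wavelength of each pulse gives the base, and the time order of the pulses gives the sequence.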
outputs of the three types of sequencing so far
sanger - labelled bars
illumina - pictures
PacBio - movie - lots of data to analyse - requires powerful computer