Lecture 1: Sanger Sequencing Flashcards
What are the four different approaches to DNA sequencing?
Chemical degradation of DNA
Sequencing by synthesis (SBS)
Ligation-based
Nanopore
Talk about the chemical degradation approach to sequencing
The first method of DNA sequencing
Not carried out anymore
Uses too many dangerous chemicals
Talk about the sequencing by synthesis (SBS) method of DNA sequencing
The most common approach to sequencing
Pioneered by Sanger; the classic version is known as Sanger sequencing
Works by primer extension through the use of DNA polymerase and nucleotides
DNA must first be denatured into single strands, then a primer is annealed and nucleotides (dNTPs) are added
Talk about ligation-based methods of DNA sequencing
Sequencing using short probes that hybridise to the template
Talk about using nanopores as a sequencing method
The newest way of sequencing DNA
Inferring sequence by change in electrical current as ssDNA is pulled through a nanopore
Briefly describe the structure of a DNA nucleotide
Consists of a 5 carbon sugar with a nitrogenous base attached at the 1’ C and a phosphate group attached to the 5’ carbon
A hydroxyl group is present on the 3’ carbon
What are the different nitrogenous bases in DNA vs RNA
Adenine, Guanine, Cytosine
+ Thymine in DNA
+ Uracil in RNA
How does uracil differ from thymine?
Uracil lacks a methyl group
Talk about the hydrogen bonds between G and C and A and T and why they are important to consider
Adenine and Thymine form two hydrogen bonds to hold together
Guanine and cytosine form three hydrogen bonds to hold together
This is important to consider because a GC-rich segment of DNA may need to be heated to a higher temperature before it denatures
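A minimal sketch of the counting behind this card: two hydrogen bonds per A-T pair, three per G-C pair. The function names hydrogen_bonds and gc_content are made up for illustration, and the code only counts bonds; it does not predict an actual melting temperature.

```python
# Hydrogen bonds contributed by each base when paired with its partner.
BONDS_PER_BASE = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(seq):
    """Total Watson-Crick hydrogen bonds in a duplex containing this strand."""
    return sum(BONDS_PER_BASE[base] for base in seq.upper())


def gc_content(seq):
    """Fraction of bases in the strand that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Two strands of equal length: the GC-rich one has more bonds to break.
print(hydrogen_bonds("AATT"), gc_content("AATT"))
print(hydrogen_bonds("GGCC"), gc_content("GGCC"))
```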
How does ribose differ from deoxyribose
The 2’C in ribose has a hydroxyl group attached
The 2’ C in deoxyribose lacks an oxygen (hence deoxy) and only has a H attached
Describe the directional structure of DNA and why this is important
DNA is an anti-parallel structure, i.e. its two strands run in opposite directions: one strand in the 5’ to 3’ direction and the other in the 3’ to 5’ direction
DNA polymerase can only work in the 5’ to 3’ direction, i.e. a new strand can only be built in the 5’ to 3’ direction
What bonds are formed to join nucleotides into a strand of DNA and how are they formed
Phosphodiester bonds
These bonds are formed between the 5’ phosphate of one nucleotide and the 3’ hydroxyl group of another
A hydrogen ion is released when a phosphodiester bond forms (this H+ release can be detected and is used in some other forms of sequencing)
How does modern day Sanger sequencing differ from when it was first used
When Sanger sequencing was first used to sequence the human genome, plasmid clones were used to make many copies of a sequence; now we use PCR to amplify the DNA before sequencing
What strand acts as the template for Sanger?
The reverse strand acts as the template for Sanger
The template is read in the 3’ to 5’ direction so that complementary nucleotides can be added to the new strand in the 5’ to 3’ direction
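The direction rule above can be sketched in a few lines: the new strand is the reverse complement of the template when both are written 5’ to 3’. The function names are hypothetical and the example sequence is invented.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Base-by-base complement, keeping the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq):
    """Complementary strand written in its own 5' to 3' direction."""
    return complement(seq)[::-1]


template_5_to_3 = "ATGCC"  # invented template, written 5' to 3'
new_strand_5_to_3 = reverse_complement(template_5_to_3)
print(new_strand_5_to_3)
```

Reading the template from its 3’ end (right to left here) and pairing each base gives the same new strand, which is why the polymerase can build it 5’ to 3’.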
What are chain terminators
Dideoxy nucleotides i.e. DNA nucleotides which lack their 3’ OH and therefore cannot be further extended (cannot form phosphodiester bonds)
They are used to terminate a sequence chain
How do chain terminators work?
Removing or modifying the 3’ hydroxyl group needed to form phosphodiester bonds prevents DNA polymerase from adding any further nucleotides, so the chain ends
Dideoxynucleotides are the chain terminators used in Sanger sequencing; they have only a hydrogen (H) on the 3’ carbon instead of a hydroxyl (OH), which terminates the reaction
How was Sanger sequencing originally carried out using dideoxynucleotides
Sanger set up 4 individual reactions each containing a mixture of DNA nucleotides as well as a small concentration of dideoxynucleotides (either ddTTP, ddATP, ddGTP or ddCTP)
The low concentration of a ddNTP meant there was a small chance of the chain terminating at each position where that base is added, which produces DNA fragments of different lengths
A polyacrylamide gel was then used to separate the DNA fragments according to their size (length)
From a single fragment of DNA we know only the identity of the last nucleotide in that fragment, but when all fragments are run together on a gel we can read the order of the nucleotides to determine the sequence
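A toy sketch of why the four reactions together give the sequence. It assumes an idealised reaction in which every possible terminated fragment is present, and it ignores the primer length. The function names and the example strand are invented for illustration.

```python
def terminated_fragments(new_strand):
    """For each ddNTP reaction (lane), list the fragment lengths produced.

    A fragment of length n appears in the lane of base X when the n-th
    base added to the new strand is X.
    """
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand.upper(), start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read bands from shortest to longest fragment, noting each lane."""
    bands = [(length, base) for base, lengths in lanes.items() for length in lengths]
    return "".join(base for _, base in sorted(bands))


lanes = terminated_fragments("GATTACA")
print(lanes)
print(read_gel(lanes))
```

Each lane alone only says where one base occurs; sorting all bands by length across the four lanes recovers the order of the bases.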
How did Sanger visualise his sequence
Sanger used radioactive signals -> he used 32P as a radioactive label
The gel was exposed to X-ray film to make an autoradiograph
Why use polyacrylamide gel and not just agarose gel?
Polyacrylamide gel can resolve DNA fragments that differ by a single base, e.g. it can distinguish a 99-base chain from a 100-base chain
What kind of primer must be used in Sanger sequencing?
A primer with a nucleotide sequence complementary to the 3’ end of the region to be copied (reference)
What do you need to set up a Sanger sequencing reaction?
Template DNA
Nucleotides
Dideoxynucleotides
Buffer
DNA polymerase
Forward or reverse primers
What happens when a dideoxynucleotide is added?
Replication of the strand will stop i.e. terminate
What is the difference between a dideoxynucleotide and a deoxynucleotide?
A dideoxynucleotide is missing the 3’ hydroxyl group on its sugar, whereas a deoxynucleotide still has it
What were some areas that needed improvement in Sanger sequencing?
Had to be run on a very large gel -> preferred if it could all be done in one lane to save
Would be better if signals could be read at the same point in the gel -> you're unable to read the sequence near the top as the bands get closer and closer together
Fluorescent labels would be safer and easier to read than radioactive labels -> can also use fluorescent labels of different colours which can be read by a laser
What was the improved Sanger sequencing called?
Fluorescent Sanger sequencing (“dye terminators”)
How does Fluorescent Sanger Sequencing differ from normal Sanger?
Each of the 4 ddNTPs is labelled with a different fluorescent dye instead of a radioactive label
Fluorophores added to ddNTPs -> this made the ddNTPs much bigger though so a modified DNA polymerase had to be used
What improvement was made to fluorescent Sanger sequencing?
Caltech automated the process in the mid-1980s by adding a gel in a capillary and a laser
The process is run as normal except that, instead of a large gel, a mini gel contained inside a capillary is used
A laser is then used to read the fluorescent signals
How is fluorescent sanger sequencing interpreted?
You get four signals of four different colours, one for each type of ddNTP that has been incorporated
The laser reads the position and colour of each signal; this is interpreted by a base caller to give us a readable trace (sequence)
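A deliberately simplified sketch of the idea: at each peak position, name the base whose dye channel is brightest. The input layout (one dict of four intensities per peak) and the function name call_bases are assumptions made for illustration; real base callers such as Phred also handle peak spacing, noise and quality estimation.

```python
def call_bases(peaks):
    """Call one base per peak by picking the brightest dye channel.

    peaks: list of dicts mapping "A", "C", "G", "T" to a signal intensity.
    """
    return "".join(max(peak, key=peak.get) for peak in peaks)


# Invented intensities for three consecutive peaks.
example_peaks = [
    {"A": 900, "C": 30, "G": 20, "T": 10},
    {"A": 15, "C": 850, "G": 25, "T": 40},
    {"A": 10, "C": 20, "G": 780, "T": 35},
]
print(call_bases(example_peaks))
```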
What is Sanger Base Calling?
The process, carried out by software, of naming the base at each peak and estimating the quality of a Sanger sequence, i.e. the probability that a base has been called incorrectly
Talk about the quality of a sanger sequence
(2)
Poor quality in the first 15-40 bases of a sequence due to primer binding
Deteriorating quality of sequence traces after 700-900 bases due to the polymerase gradually losing steam, i.e. losing its ability to add on nucleotides
What base calling software do we use?
Phred
Explain in your own words what a base calling software like Phred does?
Software which interprets the fluorescence and names the base it corresponds with
Software which calculates how accurate each base call is, i.e. how likely it is that a fluorescent signal has been given the correct base
What is the name of the score we use to determine how much we can trust a sanger sequence read?
Q scores
How do you interpret a Q-score?
You want a Q score as close to 99 as possible
Don't trust any Q score less than 20
Q20 = a 1 in 100 chance the base is incorrect
Q99 = roughly a 1 in 10^10 chance the base is incorrect
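The numbers on this card follow from the Phred definition Q = -10 * log10(P), where P is the probability that the base call is wrong. A short sketch of the conversion in both directions (the function names are made up for illustration):

```python
import math


def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def q_score(p):
    """Phred score for an error probability p."""
    return -10 * math.log10(p)


for q in (10, 20, 30, 99):
    p = error_probability(q)
    print(f"Q{q}: P(error) = {p:.3g}, about 1 in {1 / p:,.0f}")
```

Every extra 10 points of Q makes an error ten times less likely, which is why Q20 (1 in 100) is a common minimum cut-off.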
What is CHOPS syndrome?
Cognitive impairment
Heart defects
Obesity
Pulmonary Involvement
Short Stature
Skeletal dysplasia
Mutations of what gene are responsible for CHOPS?
AFF4 mutations
Explain how you would read fluorescent Sanger results when looking for a mutation e.g. AFF4 mutations in CHOPS
Three known dominant mutations of AFF4 cause CHOPS
Sequence these regions
Look for two overlapping peaks of different colours at a single position -> this indicates one normal copy and one mutated copy (heterozygous)
*If the condition were recessive you would need two copies of the mutated gene, or a combination of two different mutated copies
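A rough sketch of the "two peaks at one position" check. It assumes each position is given as four dye intensities and flags a position when the second-tallest peak is at least half the tallest. The 0.5 cut-off, the data layout and the function name are illustrative assumptions, not a validated calling rule.

```python
def double_peak_positions(peaks, min_ratio=0.5):
    """Return 1-based positions that look like a possible heterozygous site.

    peaks: list of dicts mapping "A", "C", "G", "T" to a signal intensity.
    min_ratio: assumed cut-off for second-tallest / tallest peak height.
    """
    flagged = []
    for position, peak in enumerate(peaks, start=1):
        heights = sorted(peak.values(), reverse=True)
        if heights[0] > 0 and heights[1] >= min_ratio * heights[0]:
            flagged.append(position)
    return flagged


# Invented trace: position 2 shows both a C and a T peak of similar height.
trace = [
    {"A": 900, "C": 20, "G": 10, "T": 15},
    {"A": 10, "C": 480, "G": 15, "T": 450},
    {"A": 5, "C": 10, "G": 870, "T": 20},
]
print(double_peak_positions(trace))
```

A flagged position is only a candidate; in practice it would be checked by eye on the trace and ideally confirmed by sequencing the opposite strand.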
How has sanger sequencing improved?
Moved from polyacrylamide gels and radioactive labels to capillary gels and fluorescent labels
Used to be read by hand but are now automatically read by a laser
Old methods had low throughput and would have taken years to sequence a whole genome; now you can sequence 1-2 million bp/day using Sanger