Next Gen Sequencing Flashcards
Steps of NGS?
1) DNA
2) fragmentation
3) separate fragments
4) amplification
5) sequencing reaction
Isolate the DNA, break it into small fragments, separate out those fragments, THEN amplify, then run the sequencing reaction and SIMULTANEOUSLY identify the nucleotides.
Steps of Sanger?
1) DNA
2) amplification
3) sequencing reaction
4) separate fragments
5) identify labelled molecules
Explain the differences between Sanger and NGS?
The main difference is due to the number of sequencing reactions we can carry out simultaneously. For Sanger sequencing it is usually 96, whereas it is millions for NGS
NGS key features?
- It is massively parallel – we’re sequencing literally billions of DNA fragments on one instrument simultaneously
- It is fast because of sequencing by synthesis (i.e. we’re identifying bases as we’re synthesising replicate DNA strands).
- NGS generates huge amounts of data – A single run is several Terabytes of data that is then processed down to hundreds of Gigabytes.
Illumina NGS Key technologies?
Isothermal bridge amplification – this creates the large number of single sequence spots
Sequencing by synthesis (SBS) – bases are added and detected one at a time
how is DNA fragmentation done?
typically done by sonication (applying sound energy to agitate particles).
randomly fragment genomic DNA and ligate adapters to both ends of the fragments
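The fragmentation and adapter-ligation step can be pictured with a small simulation. This is only a sketch: the adapter sequences, the `fragment` and `ligate_adapters` helpers, and the `mean_len` parameter are hypothetical names chosen for illustration, and real shearing and ligation are physical and enzymatic processes, not string operations.

```python
import random

# Hypothetical adapter sequences, chosen only for illustration.
ADAPTER_5 = "ACGTAC"
ADAPTER_3 = "GTCAGT"


def fragment(dna, mean_len=10, rng=None):
    """Cut dna into consecutive pieces of random length (mimics random shearing)."""
    rng = rng or random.Random(0)
    fragments = []
    pos = 0
    while pos < len(dna):
        # Piece sizes vary around mean_len; the last piece may be shorter.
        size = rng.randint(mean_len // 2, mean_len * 3 // 2)
        fragments.append(dna[pos:pos + size])
        pos += size
    return fragments


def ligate_adapters(fragments):
    """Attach an adapter to both ends of every fragment."""
    return [ADAPTER_5 + f + ADAPTER_3 for f in fragments]


# A toy 100-base "genome" generated from a fixed seed.
genome = "".join(random.Random(1).choice("ACGT") for _ in range(100))
frags = fragment(genome)
library = ligate_adapters(frags)
```

Every entry in `library` now starts and ends with a known sequence, which is what lets fragments of unknown sequence attach to the flow cell.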
how are fragments separated?
bind ss fragments randomly to the inside surface of the flow cell channels.
how does amplification take place?
add unlabeled nucleotides and enzyme to initiate solid-phase bridge amplification (isothermal bridge amplification)
the enzyme incorporates nucleotides to build double stranded bridges on solid-phase substrate
denaturation leaves ss templates anchored to the substrate
several million dense clusters of double stranded DNA are generated in each channel of the flow cell
explain sequencing reaction?
the first sequencing cycle begins by adding four labelled reversible terminators, primers and DNA polymerase.
after laser excitation, the emitted fluorescence from each cluster is captured and the first base is identified
next cycle repeats the incorporation of four labeled reversible terminators, primers and DNA polymerase
After laser excitation, the image is captured as before and the identity of the second base is recorded
the seq cycles are repeated to determine the seq of bases in a fragment, one base at a time
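The cycle-by-cycle logic above can be sketched as a toy simulation. This rests on several assumptions: the dye colours are made up, each template string is written in the order the polymerase reads it (strand direction is ignored), and every incorporation and image is treated as error-free. The function name `sequence_by_synthesis` is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical dye colours, one per labelled terminator.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {colour: base for base, colour in DYE.items()}


def sequence_by_synthesis(templates, cycles):
    """Read every cluster in parallel, one base per cycle."""
    reads = ["" for _ in templates]
    for cycle in range(cycles):
        # "Image" the flow cell: one colour per cluster for this cycle.
        image = []
        for template in templates:
            if cycle < len(template):
                # The terminator complementary to the template base is incorporated.
                incorporated = COMPLEMENT[template[cycle]]
                image.append(DYE[incorporated])
            else:
                image.append(None)  # this cluster's template is exhausted
        # Base calling: translate each colour back into a base.
        for i, colour in enumerate(image):
            if colour is not None:
                reads[i] += BASE_FOR_DYE[colour]
    return reads


clusters = ["ACGT", "TTGA", "GC"]
reads = sequence_by_synthesis(clusters, cycles=4)
print(reads)  # ['TGCA', 'AACT', 'CG']
```

The outer loop is the sequencing cycle and the inner loop covers every cluster, so each cycle adds one base to all reads at once.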
explain data analysis?
Once the sequencing reactions are completed, the raw data (3-5 TB) has to be processed and then the millions of short sequences (100-150 bp) have to be aligned to produce an overall sequence.
Sequencing differences at specific sites are identified, if appropriate to the study, as these may be disease-related
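A minimal sketch of the align-then-compare idea, assuming a known reference sequence and reads that differ from it only by substitutions. The brute-force placement used here is for illustration only and would not scale to millions of reads; the helper names and the toy sequences are hypothetical.

```python
from collections import Counter


def best_position(read, reference):
    """Return (offset, mismatches) for the placement with the fewest mismatches."""
    best = None
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if best is None or mismatches < best[1]:
            best = (offset, mismatches)
    return best


def call_differences(reads, reference):
    """Pile up aligned reads and report positions where the consensus differs."""
    pileup = [Counter() for _ in reference]
    for read in reads:
        offset, _ = best_position(read, reference)
        for i, base in enumerate(read):
            pileup[offset + i][base] += 1
    differences = {}
    for pos, counts in enumerate(pileup):
        if counts:
            consensus = counts.most_common(1)[0][0]
            if consensus != reference[pos]:
                differences[pos] = (reference[pos], consensus)
    return differences


# Toy data: the sample differs from the reference at position 8 (A -> T).
reference = "ACGTTGCAAGCTTAGC"
reads = ["ACGTTGCA", "TGCATGCT", "CATGCTTA", "TGCTTAGC"]
differences = call_differences(reads, reference)
print(differences)  # {8: ('A', 'T')}
```

Three overlapping reads cover position 8 and all show T, which is why the difference is reported with some support and is not based on a single read.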
What is the exome?
The exome is strictly defined as all the protein-coding sequences in the genome
The exome is <1.5% of the genome
Contains 85% of known disease-causing mutations
why exome seq?
Much smaller than whole genome
For a given amount of money/sequencing output, you get 81x more exome sequence compared to whole genome
The exome is better understood
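The 81x figure can be checked with simple arithmetic, assuming that a fixed sequencing output is spread evenly over whatever is targeted and that capture is perfectly efficient (neither is strictly true in practice). The function names below are hypothetical.

```python
def fold_gain(target_fraction):
    """Fold increase in sequence per target base when only a fraction of the genome is sequenced."""
    return 1.0 / target_fraction


def implied_fraction(fold):
    """Genome fraction that would give the stated fold increase."""
    return 1.0 / fold


upper_bound_gain = fold_gain(0.015)          # exome taken as exactly 1.5% of the genome
implied_percent = implied_fraction(81) * 100  # genome percentage implied by an 81x gain

print(round(upper_bound_gain, 1))  # 66.7
print(round(implied_percent, 2))   # 1.23
```

Under these assumptions an 81x gain corresponds to a target of about 1.23% of the genome, which is consistent with the exome being <1.5% of the genome.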
Data challenges with NGS?
Large volumes of data
Sanger Centre has 40 sequencers, generates 15 terabases of sequence per year and has >4 Petabytes of disk storage
Fast networks are required
High-performance computing is essential
EXOME CAPTURE?
http://www.genomics.agilent.com/files/Media/SS_Halo/Magnet584.jpg
Name 2 software tools used?
PolyPhen
SIFT