Next Gen Sequencing Flashcards
Steps of NGS?
1) DNA
2) fragmentation
3) separate fragments
4) amplification
5) sequencing reaction
Isolate the DNA, break it into small fragments, separate out those fragments, THEN amplify, then run the sequencing reaction and SIMULTANEOUSLY identify nucleotides.
Steps of Sanger?
1) DNA
2) amplification
3) sequencing reaction
4) separate fragments
5) identify labelled molecules
Explain the differences between Sanger and NGS?
The main difference is the number of sequencing reactions that can be carried out simultaneously. For Sanger sequencing it is usually 96, whereas for NGS it is millions.
NGS key features?
- It is massively parallel – we’re sequencing literally billions of DNA fragments on one instrument simultaneously
- It is fast because of sequencing by synthesis (i.e. we’re identifying bases as we’re synthesising replicate DNA strands).
- NGS generates huge amounts of data – A single run is several Terabytes of data that is then processed down to hundreds of Gigabytes.
Illumina NGS Key technologies?
Isothermal bridge amplification – this creates the large number of single sequence spots
Sequencing by synthesis (SBS) – bases are added and detected one at a time
How is DNA fragmentation done?
Typically by sonication (applying sound energy to agitate particles).
Randomly fragment genomic DNA and ligate adapters to both ends of the fragments.
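Illustrative sketch only: random fragmentation followed by adapter ligation can be mimicked in a few lines of Python. The adapter sequences, the fragment length and the function names below are hypothetical placeholders, not real Illumina values.

```python
import random

# Hypothetical adapter sequences, chosen only for illustration.
ADAPTER_5 = "AAAA"
ADAPTER_3 = "TTTT"

def fragment(dna, mean_len, rng):
    """Cut dna at random positions so fragments average roughly mean_len bases."""
    n_cuts = max(len(dna) // mean_len - 1, 0)
    cuts = sorted(rng.sample(range(1, len(dna)), n_cuts))
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]

def ligate(fragments):
    """Attach an adapter to both ends of every fragment."""
    return [ADAPTER_5 + f + ADAPTER_3 for f in fragments]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(200))  # made-up "genomic DNA"
fragments = fragment(genome, 50, rng)
library = ligate(fragments)
```

Joining the fragments back together in order gives the original sequence, which is a quick check that nothing was lost when cutting.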
How are fragments separated?
Single-stranded (ss) fragments bind randomly to the inside surface of the flow cell channels.
How does amplification take place?
Add unlabelled nucleotides and enzyme to initiate solid-phase bridge amplification (isothermal bridge amplification).
The enzyme incorporates nucleotides to build double-stranded bridges on the solid-phase substrate.
Denaturation leaves ss templates anchored to the substrate.
Several million dense clusters of double-stranded DNA are generated in each channel of the flow cell.
Explain the sequencing reaction?
The first sequencing cycle begins by adding four labelled reversible terminators, primers and DNA polymerase.
After laser excitation, the emitted fluorescence from each cluster is captured and the first base is identified.
The next cycle repeats the incorporation of four labelled reversible terminators, primers and DNA polymerase.
After laser excitation, the image is captured as before and the identity of the second base is recorded.
The sequencing cycles are repeated to determine the sequence of bases in a fragment, one base at a time.
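Illustrative sketch only: the cycle-by-cycle idea above (one image per cycle, one base per cluster per image) can be pictured as choosing the brightest fluorescence channel at each cycle. The channel order and the intensity values are made up for the example; real base callers are far more involved.

```python
# Hypothetical mapping of fluorescence channel order to bases.
CHANNELS = "ACGT"

def call_base(intensities):
    """Return the base whose channel has the highest intensity in this cycle."""
    brightest = max(range(len(CHANNELS)), key=lambda i: intensities[i])
    return CHANNELS[brightest]

def call_read(cycles):
    """cycles: one 4-channel intensity tuple per sequencing cycle, for one cluster."""
    return "".join(call_base(c) for c in cycles)

# Made-up intensities for a single cluster over three cycles.
cluster_cycles = [
    (0.9, 0.1, 0.0, 0.1),  # cycle 1: channel A brightest
    (0.1, 0.0, 0.8, 0.2),  # cycle 2: channel G brightest
    (0.0, 0.1, 0.1, 0.7),  # cycle 3: channel T brightest
]
read = call_read(cluster_cycles)  # "AGT"
```

In NGS the same calculation runs for every cluster in the image at once, which is where the massive parallelism comes from.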
Explain data analysis?
Once the sequencing reactions are completed, the raw data (3-5 TB) has to be processed, and then the millions of short sequences (100-150 bp) have to be aligned to produce an overall sequence.
Sequence differences at specific sites are identified, if appropriate to the study, as these may be disease-related.
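Illustrative sketch only: aligning short reads and spotting a sequence difference can be shown with a toy reference. This is a naive "slide the read along and count mismatches" approach, assumed here purely for illustration; real pipelines use indexed aligners and statistical variant callers. All sequences and function names are made up.

```python
from collections import Counter

def best_offset(read, reference):
    """Return the reference offset where the read has the fewest mismatches."""
    def mismatches(offset):
        window = reference[offset:offset + len(read)]
        return sum(r != g for r, g in zip(read, window))
    return min(range(len(reference) - len(read) + 1), key=mismatches)

def call_differences(reads, reference):
    """Pile reads onto the reference; report positions where the majority base differs."""
    pileup = [Counter() for _ in reference]
    for read in reads:
        start = best_offset(read, reference)
        for i, base in enumerate(read):
            pileup[start + i][base] += 1
    diffs = {}
    for pos, counts in enumerate(pileup):
        if counts:
            consensus = counts.most_common(1)[0][0]
            if consensus != reference[pos]:
                diffs[pos] = (reference[pos], consensus)
    return diffs

reference = "ACGTTGCAAGCTTAGC"
# Made-up reads from a sample that carries T instead of A at position 8 (0-based).
reads = ["ACGTTGCA", "TTGCATGC", "CATGCTTA", "GCTTAGC"]
differences = call_differences(reads, reference)  # {8: ("A", "T")}
```

Two overlapping reads support the T at position 8, so it is reported as a difference from the reference.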
What is the exome?
The exome is strictly defined as all the protein-coding sequences in the genome
The exome is <1.5% of the genome
Contains 85% of known disease-causing mutations
Why exome sequencing?
Much smaller than whole genome
For a given amount of money/sequencing output, you get 81x more exome sequence compared to whole genome
The exome is better understood
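Worked arithmetic (a sketch, assuming the 81x figure is simply the ratio of genome size to exome size): the implied exome fraction agrees with the "<1.5% of the genome" figure on the exome card.

```latex
\frac{\text{genome size}}{\text{exome size}} = 81
\quad\Longrightarrow\quad
\frac{\text{exome size}}{\text{genome size}} = \frac{1}{81} \approx 0.0123 = 1.23\%
```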
Data challenges with NGS?
Large volumes of data
Sanger Centre has 40 sequencers, generates 15 terabases of sequence per year and has >4 Petabytes of disk storage
Fast networks are required
High Performance Computing is essential
EXOME CAPTURE?
http://www.genomics.agilent.com/files/Media/SS_Halo/Magnet584.jpg
Name 2 software tools used?
PolyPhen
SIFT
What do PolyPhen/SIFT do?
Predict possible impact of amino acid substitutions on structure and function
Considers: location in protein, current known substitutions, likely effect
Outcomes: benign, possibly damaging, probably damaging
What do we do to validate NGS and why?
SANGER SEQ
to ensure NGS is correct
provide more data to support findings
Key Applications of NGS?
RNA-seq
DNA Methylation Sequencing
ChIP-Seq
Single Molecule Sequencing (also called next-next-generation sequencing, NNGS, or third-generation sequencing, 3GS)
Steps of RNA-seq?
1) Library preparation
2) Isolate RNA
- Total RNA
- Deplete rRNA (95% of the RNA from a cell)
- Use oligo-dT to isolate poly-A positive mRNA
3) Reverse transcribe into cDNA
Why is RNA-seq done?
Sequencing of RNA is carried out to determine gene expression in a sample
It is more accurate, reproducible and sensitive than using a microarray
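Illustrative sketch only: one common way to turn per-gene read counts into comparable expression values is counts per million (CPM). CPM is brought in here as an example normalisation and is not named in these notes; the gene names and counts are made up.

```python
def counts_per_million(counts):
    """Scale raw read counts so that they sum to one million across the sample."""
    total = sum(counts.values())
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}

# Made-up read counts per gene for one sample.
read_counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
cpm = counts_per_million(read_counts)
# geneA: 50000.0, geneB: 150000.0, geneC: 800000.0
```

Scaling by the sample total lets samples sequenced to different depths be compared.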
What is bisulfite-seq?
Bisulfite sequencing is carried out to determine methylation patterns
Methylation is strongly correlated with gene expression levels
Usually, hypermethylation results in gene silencing and hypomethylation in gene transcription
Bisulfite modification changes unmethylated C to Uracil, and this reads as a T.
This conversion reduces the sequence complexity, making sequence assembly difficult
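Illustrative sketch only: the C-to-T logic above can be simulated on a toy sequence. Unmethylated C is read as T, methylated C stays C, and comparing the converted read with the reference gives the methylation call at each C. The sequence, the methylated position and the function names are made up.

```python
def bisulfite_convert(seq, methylated_positions):
    """Read unmethylated C as T; methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )

def call_methylation(reference, converted):
    """For each C in the reference, True if it still reads as C (methylated)."""
    return {
        i: converted[i] == "C"
        for i, base in enumerate(reference)
        if base == "C"
    }

reference = "ACGTCCGA"
# Assume only the C at position 1 (0-based) is methylated.
converted = bisulfite_convert(reference, methylated_positions={1})  # "ACGTTTGA"
calls = call_methylation(reference, converted)  # {1: True, 4: False, 5: False}
```

The converted read is mostly A, G and T, which shows why the reduced sequence complexity makes placing reads harder.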
What is ChIP-Seq?
Chromatin immunoprecipitation sequencing (ChIP-Seq) is carried out to determine recognition sites for specific DNA-binding proteins, such as transcription factors
Absolute requirement for a specific antibody for the protein to be analysed
What is single molecule sequencing?
Alternative to the standard NGS techniques
DNA is sequenced unfragmented
Sequencing is by measuring incorporation of DNA bases into a newly-synthesised strand
Sequencing reads are far longer (>10kbp)
However, sequencing is more error-prone
What is SMRT sequencing?
The PacBio sequencer detects a fluorescent signal as each base in turn is incorporated into a DNA strand that is being extended by an immobilised DNA polymerase
What is nanopore sequencing?
As DNA passes through the nanopore each DNA base produces a characteristic change in the current across a membrane
Clinical applications of NGS?
Rare Genetic Disease
Cancer
Genomic changes
Gene expression profiles
Epigenetic profiles
Infectious disease tracking
Microbiome Analysis