Next Generation Sequencing Flashcards
Briefly describe the Polymerase Chain Reaction
- Fundamental principle for any DNA sequencing application
- Used to amplify a specific region of DNA; primers flank the region you want to amplify
- Each cycle doubles the number of copies of your target sequence
- Amplifies enough DNA molecules to give sufficient material for sequencing
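The doubling per cycle can be written as a short calculation. This is a minimal sketch that assumes ideal amplification by default (real reactions plateau in later cycles); the function name `pcr_copies` and its `efficiency` parameter are illustrative and not part of the flashcards.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected number of target copies after a given number of PCR cycles.

    efficiency=1.0 models perfect doubling (an idealisation); lower values
    model cycles in which only a fraction of templates are copied.
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    # Each cycle multiplies the copy number by (1 + efficiency).
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (1, 10, 30):
        print(n, pcr_copies(1, n))
```

With perfect doubling, one starting molecule gives 2^10 = 1,024 copies after 10 cycles and 2^30 (about a billion) after 30.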
Briefly describe Sanger sequencing
- Invented by Fred Sanger in 1977
- Cycle sequencing
- Based on PCR (also needs a PCR product as an input)
- Needs primers
- Modified nucleotides: chain terminators, each carrying a nucleotide-specific colour tag
What does Sanger sequencing identify?
- Single nucleotide polymorphisms (SNPs) or mutations
- Can identify mutations that cause monogenic disease
- Often used for single gene tests
What are the benefits of next generation DNA sequencing?
- Decrease in the cost of DNA sequencing
- Since the end of 2007, the cost has dropped at a faster rate than that of Moore’s law
When was next generation DNA sequencing developed?
- Development of NGS methods began 13 years ago with 454 pyrosequencing
- DNA sequencing throughput jumped 10 orders of magnitude
- Solexa sequencing by synthesis (SBS) was developed at the end of 2005
What happened as a result of the development of next generation sequencing?
- Replaced Sanger sequencing for almost all sequencing in the lab
- Whole genome sequencing
- Whole Exome sequencing
What are the four steps of NGS?
- DNA library construction
- Cluster generation
- Sequencing by synthesis
- Data analysis
Describe the steps in DNA library construction (PART 1)
- Takes place in the wet lab
- DNA is first chopped into small fragments (typically 300 bp); this is called shearing
- Can be achieved chemically, enzymatically or physically (sonication)
Describe the steps in DNA library construction (PART 2)
- The ends of the sheared DNA fragments have to be repaired
- Adenine nucleotide overhangs are added to the ends of the fragments
- Adapters with thymine overhangs can then be ligated to the DNA fragments
Describe the steps in DNA library construction (PART 3)
- Adapters contain the components that allow the library fragments to be sequenced:
- Sequencing primer binding sites
- P5 and P7 anchors for attachment of library fragments to the flow cell
Describe the steps in cluster generation (PART 1)
- Hybridise DNA library fragments to flowcell
- But we can’t visualise single molecules of our DNA library: they are too small
- We need to amplify each fragment into many copies for a stronger signal
Describe the steps in cluster generation (PART 2)
- Perform bridge amplification to generate clusters
- Many billions of clusters, each originating from a single DNA library molecule
- Clusters are now big enough to be visualised
- Flow cell is now ready to be loaded
Describe the steps in Sequencing By Synthesis (PART 1)
The 4 bases (A, T, C, G) are modified with:
- Chain terminators
- A different fluorescent colour dye for each base
- Sequence each single nucleotide 1 cycle at a time in a controlled manner
Describe the steps in Sequencing By Synthesis (PART 2)
- Single Nucleotide incorporation (DNA polymerase)
- Flowcell wash
- Image the 4 bases (digital photograph)
- Cleave terminator chemical group and dye with enzyme
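The four steps of one cycle can be sketched as a loop. This is a toy model under stated assumptions: the template is given in the order the polymerase reads it, the dye colours in `DYE` are made up for illustration, and the wash and cleavage steps are represented only by comments.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Hypothetical colour per base; real instruments use their own dye chemistry.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}


def sequence_by_synthesis(template: str, cycles: int) -> list[tuple[str, str]]:
    """Return the (base, colour) observed in each cycle for one template strand."""
    observations = []
    for position in range(min(cycles, len(template))):
        # 1. Incorporation: the terminator blocks further extension,
        #    so exactly one complementary base is added this cycle.
        base = COMPLEMENT[template[position]]
        # 2. Wash: unincorporated nucleotides are removed (nothing to model).
        # 3. Imaging: the dye colour identifies the base that was added.
        observations.append((base, DYE[base]))
        # 4. Cleavage: terminator and dye are removed, freeing the strand
        #    for the next cycle.
    return observations


if __name__ == "__main__":
    for cycle, (base, colour) in enumerate(sequence_by_synthesis("TACG", 4), start=1):
        print(f"cycle {cycle}: {base} ({colour})")
```

Because the terminator allows only one incorporation per cycle, the number of cycles sets the read length.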
Describe the steps in Sequencing By Synthesis (PART 3)
- Camera sequentially images all 4 bases on the surface of the flowcell each cycle
- Each cycle image is converted to a nucleotide base call
- The number of cycles is anywhere between 50 and 600, with one nucleotide read per cycle
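Converting each cycle's image into a base can be sketched as picking the brightest of the four channels for a cluster. This is a simplified illustration: the intensity values are invented, and real base callers also correct for effects such as channel crosstalk and assign quality scores, which this sketch ignores.

```python
BASES = ("A", "C", "G", "T")


def call_base(intensities: dict[str, float]) -> str:
    """Pick the base whose channel is brightest in one cycle's image."""
    return max(BASES, key=lambda base: intensities[base])


def call_read(cycle_intensities: list[dict[str, float]]) -> str:
    """Call one base per cycle and join them into a read for one cluster."""
    return "".join(call_base(cycle) for cycle in cycle_intensities)


if __name__ == "__main__":
    # Made-up intensities for a single cluster over three cycles.
    cluster = [
        {"A": 0.90, "C": 0.10, "G": 0.05, "T": 0.00},
        {"A": 0.10, "C": 0.10, "G": 0.70, "T": 0.20},
        {"A": 0.00, "C": 0.20, "G": 0.10, "T": 0.80},
    ]
    print(call_read(cluster))
```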
Describe the last stage: Analysis of NGS data
- Short read sequences from the sequencing machine need to be pieced together like a jigsaw
- Mapping locations of our sequence reads on the reference genome sequence
- This generates a consensus sequence of the original DNA sample library
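The jigsaw idea can be sketched with a brute-force mapper and a majority vote per position. This is a teaching sketch only: it assumes short, ungapped reads and a tiny made-up reference, whereas production aligners use indexed, gap-aware algorithms. The names `map_read` and `consensus` are illustrative.

```python
from collections import Counter


def map_read(read: str, reference: str) -> int:
    """Return the 0-based start of the ungapped placement with fewest mismatches."""
    best_pos, best_mismatches = -1, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(r != g for r, g in zip(read, window))
        if mismatches < best_mismatches:
            best_pos, best_mismatches = pos, mismatches
    return best_pos


def consensus(reads: list[str], reference: str) -> str:
    """Pile reads up on the reference and take the most common base per position."""
    piles = [Counter() for _ in reference]
    for read in reads:
        start = map_read(read, reference)
        if start < 0:
            continue  # read is longer than the reference, so it cannot be placed
        for offset, base in enumerate(read):
            piles[start + offset][base] += 1
    # Positions with no coverage are reported as "N".
    return "".join(pile.most_common(1)[0][0] if pile else "N" for pile in piles)


if __name__ == "__main__":
    reference = "ACGTTGCAAT"
    # Reads from a sample that differs from the reference at position 4 (T -> G).
    reads = ["ACGTG", "GTGGC", "TGGCA", "GCAAT"]
    print(consensus(reads, reference))
```

Several overlapping reads support the same base at each position, which is why the consensus is more robust than any single read.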
Compare and contrast NGS and Sanger sequencing
- NGS produces a digital readout
- Sanger produces an analogue readout
- Sanger is one sequence read
- NGS is a consensus sequence of many reads
How many genes are there in the human genome?
21,000 Genes
What part of the gene are we usually interested in?
The protein-coding exons of genes, or the ‘exome’, which represents 1-2% of the genome
What percentage of pathogenic mutations do protein-coding mutations account for?
80% of pathogenic mutations are protein coding
What can we do with whole exome sequencing to be more efficient?
It is more efficient to sequence only the bits we are interested in rather than the whole genome
What is Whole Exome Sequencing used for?
- Target enrichment
- Capture target regions of interest with baits
- Potential to capture several Mb genomic regions of interest
- The exome is about 50 Mb in size
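The efficiency gain from targeting the exome can be checked with simple arithmetic. The 50 Mb exome size comes from the card above; the genome size of roughly 3,100 Mb and the 100x depth in the example are assumptions added here, and the calculation ignores off-target reads and uneven capture.

```python
EXOME_MB = 50        # exome size from the card above
GENOME_MB = 3_100    # assumption: approximate size of the human genome in Mb


def fraction_of_genome(target_mb: float, genome_mb: float = GENOME_MB) -> float:
    """Share of the genome taken up by the target region."""
    return target_mb / genome_mb


def bases_needed(target_mb: float, mean_depth: float) -> float:
    """Sequenced bases required to cover a target at a given mean depth."""
    return target_mb * 1e6 * mean_depth


if __name__ == "__main__":
    print(f"exome share of genome: {fraction_of_genome(EXOME_MB):.1%}")
    print(f"bases for 100x exome:  {bases_needed(EXOME_MB, 100):.2e}")
    print(f"bases for 100x genome: {bases_needed(GENOME_MB, 100):.2e}")
```

Under these assumptions the exome falls inside the 1-2% range quoted earlier, so the same depth needs around sixty times fewer sequenced bases than the whole genome.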
How is Exome Data Analysis carried out?
Sequence read alignment -> Target Coverage Reporting -> ? -> Variant Annotation
How is Whole Exome Sequencing used?
- The aim is to look for protein-coding mutations in the exons
- The patient DNA sample is subjected to exome sequencing
- This should show a snippet of the consensus sequence of the sequenced sample
- This reveals a heterozygous mutation in the CFTR gene
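A heterozygous site shows up as roughly half of the reads carrying the variant base. The sketch below classifies a site from read counts; the 0.2 and 0.8 cut-offs are illustrative assumptions, not validated thresholds, and real variant callers use statistical models that also weigh base and mapping quality.

```python
def genotype_from_counts(ref_reads: int, alt_reads: int,
                         het_range: tuple[float, float] = (0.2, 0.8)) -> str:
    """Classify a site by the fraction of reads supporting the alternate allele.

    het_range holds illustrative cut-offs (an assumption for this sketch).
    """
    depth = ref_reads + alt_reads
    if depth == 0:
        return "no coverage"
    alt_fraction = alt_reads / depth
    low, high = het_range
    if alt_fraction < low:
        return "homozygous reference"
    if alt_fraction > high:
        return "homozygous alternate"
    return "heterozygous"


if __name__ == "__main__":
    # Made-up counts: 48 reads match the reference, 52 carry the variant.
    print(genotype_from_counts(48, 52))
```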
What are the applications of Exome Sequencing?
- Collecting disease affected individuals and their families
- Use of NGS in disease gene identification
- Perform Exome sequencing
- Compare variant profiles of affected individuals
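Comparing variant profiles can be sketched with set operations. The variant labels below are placeholders, and the filter against unaffected relatives assumes a model in which carriers are always affected; for a recessive disease that filter would remove true carrier variants, so treat it as an assumption to adjust.

```python
def shared_candidate_variants(affected: list[set[str]],
                              unaffected: list[set[str]]) -> set[str]:
    """Variants found in every affected individual and in no unaffected one."""
    if not affected:
        return set()
    shared = set.intersection(*affected)
    # Assumption: a causal variant is absent from all unaffected individuals.
    seen_in_unaffected = set().union(*unaffected)
    return shared - seen_in_unaffected


if __name__ == "__main__":
    # Placeholder variant identifiers, not real data.
    affected = [{"v1", "v2", "v3"}, {"v2", "v3", "v4"}]
    unaffected = [{"v3"}]
    print(shared_candidate_variants(affected, unaffected))
```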
What is RNA sequencing?
RNA-seq experiments use the total RNA (or mRNA) from a collection of cells or tissue
What occurs in RNA-sequencing?
- RNA is converted to cDNA before library construction
- NGS of RNA samples is used to determine which genes are actively expressed
- A single experiment can capture the expression levels of thousands of genes
What occurs in RNA-sequencing? (PART 2)
- The number of sequencing reads produced from each gene can be used as a measure of gene abundance
- Quantification of the expression levels
- Calculation of the differences in expression of all genes between the experimental conditions
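Turning read counts into expression differences can be sketched in two steps: scale counts to counts per million (CPM) so samples of different sequencing depth are comparable, then take a log2 ratio between conditions. The gene names and counts are invented, the pseudocount of 1 is an assumption to avoid dividing by zero, and a real analysis would use replicates and a statistical model rather than a single ratio.

```python
import math


def counts_per_million(counts: dict[str, int]) -> dict[str, float]:
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_fold_change(control: dict[str, int], treated: dict[str, int],
                     pseudocount: float = 1.0) -> dict[str, float]:
    """log2(treated / control) per gene on CPM values, for genes in both samples."""
    cpm_control = counts_per_million(control)
    cpm_treated = counts_per_million(treated)
    return {
        gene: math.log2((cpm_treated[gene] + pseudocount)
                        / (cpm_control[gene] + pseudocount))
        for gene in control
        if gene in treated
    }


if __name__ == "__main__":
    # Made-up read counts for two hypothetical genes.
    control = {"geneA": 100, "geneB": 900}
    treated = {"geneA": 400, "geneB": 600}
    for gene, lfc in log2_fold_change(control, treated).items():
        print(gene, round(lfc, 2))
```

A positive value means the gene has relatively more reads in the treated sample; a negative value means fewer.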