Next generation sequencing Flashcards
describe the human genome project
1990-2003
3 billion base pairs long
done with Sanger sequencing
PCR
main principle behind DNA sequencing applications
PCR used to amplify specific region of DNA
each cycle doubles amount of DNA copies of your target sequence
amplify enough DNA molecules to have sufficient material to sequence or for other DNA applications
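The doubling per cycle can be put into numbers. A minimal sketch, assuming ideal amplification; the `efficiency` parameter is an illustrative addition (1.0 = perfect doubling), not something stated on the card.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a given number of PCR cycles.

    efficiency = 1.0 is perfect doubling each cycle (an idealisation);
    lower values model a reaction in which not every molecule is copied.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # One starting molecule, ideal doubling
    for n in (1, 10, 30):
        print(n, "cycles:", pcr_copies(1, n), "copies")
```

With perfect doubling, n cycles give 2^n copies per starting molecule, which is why a few dozen cycles yield enough material to sequence.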
Sanger sequencing
invented by Fred Sanger
cycle sequencing
based on PCR
modified (chain-terminating) nucleotides
a small proportion of the free nucleotides are chain terminators, so synthesis stops at every possible position, allowing every base in the sequence to be read
one reaction = one sequence
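Why terminating at every position lets the sequence be read can be shown with a toy model. A sketch only, assuming an idealised reaction in which exactly one fragment of each length is produced and the last base of each fragment is labelled; the function names are hypothetical.

```python
def terminated_fragments(new_strand: str) -> list[str]:
    """Every prefix of the synthesised strand: one fragment per possible stop point."""
    return [new_strand[:i] for i in range(1, len(new_strand) + 1)]


def read_by_size(fragments: list[str]) -> str:
    """Order fragments by length (as size separation does) and read each final base."""
    return "".join(frag[-1] for frag in sorted(fragments, key=len))


if __name__ == "__main__":
    frags = terminated_fragments("GATTACA")
    print(frags)
    print(read_by_size(frags))
```

Sorting by length stands in for size separation; reading the terminal base of each fragment in that order reconstructs the sequence.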
what is Sanger sequencing used for
identifying single nucleotide polymorphisms or mutations
identify monogenic disease-causing mutations
used for single gene tests
next generation of DNA sequencing 1
technological advances since the end of the Human Genome Project
decrease in the cost of DNA sequencing
NGS 2
development of new NGS methods began 13 years ago with 454 pyrosequencing
DNA sequencing throughput jumped 10 orders of magnitude
Solexa sequencing by synthesis (SBS) developed at the end of 2005
the sequencing market is still dominated by Illumina SBS sequencing
four steps of NGS sequencing
- DNA library construction
- cluster generation
- sequencing by synthesis
- data analysis
step 1: DNA library construction
in the wet lab we need to prepare the DNA sample for sequencing
essentially DNA is chopped into small fragments = shearing
can be achieved chemically, enzymatically or physically
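Shearing can be imitated in a few lines. A rough sketch, assuming a single DNA molecule cut at random points; `mean_len`, `jitter` and `seed` are made-up illustrative parameters, and real shearing acts on many genome copies, so real fragments overlap.

```python
import random


def shear(dna: str, mean_len: int = 300, jitter: int = 100, seed: int = 0) -> list[str]:
    """Cut one DNA string into consecutive fragments of random length."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    fragments = []
    pos = 0
    while pos < len(dna):
        size = max(1, rng.randint(mean_len - jitter, mean_len + jitter))
        fragments.append(dna[pos:pos + size])
        pos += size
    return fragments


if __name__ == "__main__":
    sample = "ACGT" * 500  # 2,000-base toy sequence
    pieces = shear(sample)
    print(len(pieces), "fragments; lengths:", [len(p) for p in pieces])
```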
step 1: DNA library construction II
repair the ends of the sheared DNA fragments
adenine nucleotide overhangs are added to the ends of the fragments
adapters with a thymine overhang can then be ligated to the DNA fragments
the end result is DNA library of billions of small stable random fragments, representative of our original DNA sample
step 1: DNA library construction III
adapters contain essential components to allow library fragments to be sequenced
sequencing primer binding sites
P5 and P7 anchors for attachment of library fragments to the flow cell
step 2: cluster generation
hybridise DNA library fragments to the flow cell
hybridisation to the flow cell is a random process
but we can't measure individual single molecules of our DNA library: they are too small
we need to amplify the fragments so that the signal is large enough to measure
step 2: cluster generation II
perform bridge amplification to generate clusters
many billions of clusters originating from single DNA library molecules
clusters are now big enough to be visualised
flow cell is now ready to be loaded on to the sequencing platform to perform the sequencing
step 3: sequencing by synthesis I
each of the 4 bases (A, T, C, G) is modified with a chain terminator and a different-coloured fluorescent dye
sequence one nucleotide at a time
step 3: sequencing by synthesis II
single nucleotide incorporation
flow cell wash
image the 4 bases
cleave terminator chemical group and dye with enzyme
step 3: sequencing by synthesis III
camera sequentially images all 4 bases on the surface of the flow cell each cycle
each cycle image is converted to a nucleotide base call
number of cycles can be anywhere between 50 and 600, with one nucleotide base read per cycle
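Turning each cycle's image into a base call can be sketched for a single cluster. This assumes four intensity values per cycle in the order A, C, G, T and simply picks the brightest channel; the channel order and the numbers are invented for illustration, and real base callers also correct for cross-talk and assign quality scores.

```python
BASES = "ACGT"  # assumed channel order for this sketch


def call_bases(intensities: list[tuple[float, float, float, float]]) -> str:
    """Call one base per cycle: the channel with the strongest signal wins."""
    calls = []
    for cycle in intensities:
        strongest = max(range(4), key=lambda i: cycle[i])
        calls.append(BASES[strongest])
    return "".join(calls)


if __name__ == "__main__":
    # Three cycles of made-up intensities for one cluster
    cluster = [
        (0.9, 0.1, 0.0, 0.1),
        (0.1, 0.1, 0.8, 0.2),
        (0.0, 0.1, 0.1, 0.7),
    ]
    print(call_bases(cluster))
```

The read length equals the number of cycles, since each cycle contributes exactly one base call.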
analysis of NGS data
short read sequences from the sequencing machine need to be re-assembled
mapping location of our sequence reads on the reference genome sequence
to generate consensus sequence of our original DNA sample library
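Mapping and consensus building can be shown on a toy scale. A naive sketch, assuming short reads, a tiny reference and at most one mismatch per read; `best_position` and `consensus` are hypothetical helpers, and real aligners use indexed, gap-aware algorithms instead of this brute-force scan.

```python
from collections import Counter


def best_position(read: str, reference: str, max_mismatches: int = 1) -> int:
    """Return the reference offset with the fewest mismatches, or -1 if none qualifies."""
    best_pos, best_mm = -1, max_mismatches + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches < best_mm:
            best_pos, best_mm = pos, mismatches
    return best_pos


def consensus(reads: list[str], reference: str) -> str:
    """Majority base at each reference position; 'N' where no read covers it."""
    piles = [Counter() for _ in reference]
    for read in reads:
        start = best_position(read, reference)
        if start < 0:
            continue  # unmapped reads are ignored in this sketch
        for offset, base in enumerate(read):
            piles[start + offset][base] += 1
    return "".join(p.most_common(1)[0][0] if p else "N" for p in piles)


if __name__ == "__main__":
    reference = "GATTACAGGCTA"
    # Reads from a sample that carries T instead of A at reference position 6
    reads = ["GATTAC", "TACTGG", "ACTGGC", "CTGGCT", "TGGCTA"]
    print(consensus(reads, reference))
```

Because several reads overlap each position, the consensus reflects the sample rather than the reference: the shared T at position 6 shows up as a variant.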
NGS v Sanger sequencing
NGS produces a digital readout
Sanger produces an analogue readout
NGS is a consensus sequence of many reads
describe whole exome sequencing
21,000 genes in human genome
80% of pathogenic mutations are in protein-coding regions
more efficient to only sequence the bits we are interested in, rather than the entire genome
what does whole exome sequencing do
target enrichment
capture target regions of interest with baits
potential to capture several Mb genomic regions of interest
exome would be 50Mb in size
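The efficiency gain is simple arithmetic. A sketch using the figures from these cards (3 billion bp genome, 50 Mb exome); the depth and read length in the example are hypothetical, and the read estimate uses the standard relation depth = reads x read length / target size, assuming perfectly even, on-target coverage.

```python
import math

GENOME_BP = 3_000_000_000  # genome size from the Human Genome Project card
EXOME_BP = 50_000_000      # 50 Mb exome capture target


def reads_needed(target_bp: int, depth: int, read_len: int) -> int:
    """Reads required for a given average depth (depth = reads * read_len / target)."""
    return math.ceil(target_bp * depth / read_len)


if __name__ == "__main__":
    print(f"exome as a fraction of the genome: {EXOME_BP / GENOME_BP:.2%}")
    # Hypothetical run: 100x average depth with 150-base reads
    print("exome reads:", reads_needed(EXOME_BP, depth=100, read_len=150))
    print("genome reads:", reads_needed(GENOME_BP, depth=100, read_len=150))
```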
application of exome sequencing
collecting disease affected individuals and their families
use of NGS in disease gene identification
perform exome sequencing
compare variant profiles of affected individuals
try to identify the variant or mutation shared by the affected individuals
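Comparing variant profiles comes down to set operations. A simplified sketch, assuming each individual's exome is reduced to a set of variant identifiers; the identifiers below are invented, and real studies also account for inheritance pattern and population frequency.

```python
def shared_candidates(affected: list[set[str]], unaffected: list[set[str]]) -> set[str]:
    """Variants carried by every affected individual and by no unaffected relative."""
    if not affected:
        return set()
    shared = set.intersection(*affected)
    seen_in_unaffected = set().union(*unaffected)
    return shared - seen_in_unaffected


if __name__ == "__main__":
    # Hypothetical variant identifiers: chromosome:position:change
    affected = [
        {"chr1:1000:A>G", "chr2:500:C>T", "chr7:42:G>A"},
        {"chr1:1000:A>G", "chr7:42:G>A", "chr9:77:T>C"},
    ]
    unaffected = [{"chr1:1000:A>G", "chr9:77:T>C"}]
    print(shared_candidates(affected, unaffected))
```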
RNA-seq I
NGS is not just for studying DNA
RNA-seq experiments use the total RNA from a collection of cells or tissue
RNA is first converted to cDNA prior to library construction
NGS of RNA samples determines which genes are actively expressed
single experiment can capture the expression of thousands of genes
RNA-seq II
number of sequencing reads produced from each gene can be used as a measure of gene abundance
quantification of the expression levels
calculation of the difference in gene expression of all genes in the experimental conditions
with appropriate analysis, RNA-seq can be used to discover distinct isoforms of genes that are differentially regulated and expressed
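Read counts become expression differences in two steps: normalise, then compare. A minimal sketch, assuming one sample per condition, counts-per-million (CPM) normalisation and a pseudocount of 1 to avoid dividing by zero; the gene names and counts are invented, and real analyses use replicates and statistical testing.

```python
import math


def cpm(counts: dict[str, int]) -> dict[str, float]:
    """Scale raw read counts to counts per million mapped reads."""
    total = sum(counts.values())
    return {gene: n / total * 1_000_000 for gene, n in counts.items()}


def log2_fold_change(control: dict[str, int], treated: dict[str, int],
                     pseudocount: float = 1.0) -> dict[str, float]:
    """log2(treated / control) per gene, computed on CPM values."""
    c, t = cpm(control), cpm(treated)
    return {g: math.log2((t[g] + pseudocount) / (c[g] + pseudocount)) for g in c}


if __name__ == "__main__":
    control = {"geneA": 100, "geneB": 900}
    treated = {"geneA": 400, "geneB": 600}
    for gene, lfc in log2_fold_change(control, treated).items():
        print(gene, round(lfc, 2))
```

A positive value means higher expression in the treated condition, a negative value lower expression.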