Next Generation Sequencing Flashcards

1
Q

Describe the Human Genome Project:

How long is it?

How is it done?

Cost?

Nowadays?

A
  • Human Genome Project (1990 - 2003)
  • 3 billion base pairs long
  • All done with traditional Sanger Sequencing
  • Unravelled the first Human Genome Sequence to drive genetics research
  • Cost about 3 billion dollars
  • We can now achieve this amount of sequencing in as little time as one day!
2
Q

What is the fundamental principle of DNA sequencing?

What does PCR achieve?

How many copies does it produce?

When is enough produced?

A
  • PCR (polymerase chain reaction) is a fundamental principle for any DNA sequencing application
  • PCR is used to amplify a specific region of DNA; primers flank the region you want to amplify.
  • Each cycle doubles the number of copies of your target sequence
  • Amplify enough DNA molecules so that we have sufficient material to sequence or for other DNA applications
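The doubling per cycle can be put into numbers. This is a minimal sketch that assumes ideal amplification (every copy is duplicated in every cycle); the function names and the `efficiency` parameter are illustrative and not from the card.

```python
import math

def copies_after(cycles, start_copies=1, efficiency=1.0):
    """Copies present after `cycles` PCR cycles.

    efficiency=1.0 means perfect doubling; real reactions fall short of this.
    """
    return start_copies * (1 + efficiency) ** cycles

def cycles_needed(target_copies, start_copies=1):
    """Smallest number of perfect-doubling cycles that reaches target_copies."""
    return math.ceil(math.log2(target_copies / start_copies))

print(copies_after(30))    # 1073741824.0, about a billion copies from one molecule
print(cycles_needed(1e9))  # 30
```

So roughly 30 ideal cycles turn a single target molecule into about a billion copies.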
3
Q

Describe Sanger Sequencing

A
  • Invented by Fred Sanger in 1977
  • Cycle Sequencing
  • Based on PCR
  • Modified nucleotides
    ◦ Chain Terminators
    ◦ Nucleotide specific colour tag
  • A small proportion of the free nucleotides are modified this way to allow every base in the sequence to be read
  • One reaction = one sequence
  • Up to 800 bp per reaction
  • Accurate (99.99%), slow and low-throughput
  • Used predominantly until late 2000s
  • Costly ££££
  • Identify single nucleotide polymorphisms (SNPs), or mutations
  • We can identify monogenic disease-causing mutations
  • Usually for single gene tests
  • E.g. CFTR in cystic fibrosis
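The chain-terminator idea can be sketched in a few lines. This is an idealised toy model, assuming exactly one terminated fragment per position and perfect size separation; `sanger_read` is a hypothetical helper, not a real tool.

```python
import random

def sanger_read(synthesised_strand, seed=0):
    """Recover a sequence from its chain-terminated fragments (idealised)."""
    # One terminated product per position: each is a prefix of the strand
    # whose last base carries the nucleotide-specific colour tag.
    fragments = [synthesised_strand[:n]
                 for n in range(1, len(synthesised_strand) + 1)]
    random.Random(seed).shuffle(fragments)  # products are mixed together in the tube
    fragments.sort(key=len)                 # separation orders them by size
    # Reading the colour of the last base of each fragment, shortest first,
    # spells out the sequence.
    return "".join(fragment[-1] for fragment in fragments)

print(sanger_read("GATTACA"))  # GATTACA
```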
4
Q

What are the benefits of Next Generation DNA Sequencing (NGS)?

A
  • Technological advances since the end of the human genome project
  • Decrease in the cost of DNA sequencing
  • Since the end of 2007, the cost has dropped at a rate faster than that of Moore’s law
5
Q

Describe the history of Next Generation DNA Sequencing (NGS)

A
  • Development of new NGS methods began in the mid-2000s with 454 pyrosequencing
  • DNA sequencing throughput jumped 10 orders of magnitude
  • Solexa sequencing-by-synthesis (SBS) was developed at the end of 2005
  • The sequencing market is still dominated by Illumina SBS sequencing
  • Next Generation Sequencing has replaced Sanger sequencing for almost all sequencing tests in the lab:
    ◦ Whole genome sequencing
    ◦ Whole exome sequencing

6
Q

What are the four steps in NGS?

A

Four step process

  1. DNA library Construction
  2. Cluster Generation
  3. Sequencing-by-synthesis
  4. Data analysis
7
Q

What occurs in the first step: DNA library Construction 1?

What is a DNA library?

Where is it normally derived from?

What occurs in this stage?

A

A DNA library is a collection of random DNA fragments of a specific sample to be used for further study; in our case next generation sequencing

The DNA can come from just about anywhere, but in human genetic research it is generally derived from patients' blood.

  • In the wet lab – first we need to prepare the DNA sample for sequencing
  • Essentially the DNA is chopped into small fragments (typically 300 bp). This is called shearing
  • This can be achieved chemically, enzymatically or physically (sonication)
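Shearing can be imitated in software to get a feel for the fragment sizes. This is a rough sketch: the normal length distribution, the 30 bp spread and the random 10 kb sequence are assumptions for illustration, not properties of any real shearing method.

```python
import random

def shear(sequence, mean_len=300, sd=30, seed=0):
    """Cut `sequence` into consecutive fragments of roughly `mean_len` bases."""
    rng = random.Random(seed)
    fragments, pos = [], 0
    while pos < len(sequence):
        size = max(1, int(rng.gauss(mean_len, sd)))  # never cut a zero-length piece
        fragments.append(sequence[pos:pos + size])
        pos += size
    return fragments

# A made-up 10 kb stretch of DNA standing in for the sample.
rng = random.Random(1)
genome = "".join(rng.choice("ACGT") for _ in range(10_000))

frags = shear(genome)
print(len(frags), "fragments")
print("".join(frags) == genome)  # True: nothing is lost, the DNA is only cut
```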
8
Q

What occurs in the first step: DNA library Construction 2?

A
  • We have to repair the ends of the sheared DNA fragments
  • Adenine (A) nucleotide overhangs are added to the ends of the fragments
  • Adapters with Thymine (T) overhangs can then be ligated to the DNA fragments
  • The end result is the DNA library of literally billions of small, stable random fragments representative of our original DNA sample
9
Q

What occurs in the first step: DNA library Construction 3?

A
  • Adapters contain the essential components to allow the library fragments to be sequenced:
    ◦ Sequencing primer binding sites
    ◦ P5 and P7 anchors for attachment of library fragments to the flowcell
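The layout of a finished library molecule can be pictured as a string. This is a simplified sketch: the linear order shown is an assumption, and the bracketed labels are placeholders, not real adapter sequences.

```python
# Placeholder labels, NOT real adapter sequences (assumptions for illustration).
P5_ANCHOR = "[P5]"
READ1_PRIMER_SITE = "[R1-primer-site]"
READ2_PRIMER_SITE = "[R2-primer-site]"
P7_ANCHOR = "[P7]"

def build_library_molecule(insert):
    """Flank a sheared DNA fragment (the insert) with adapter components."""
    return P5_ANCHOR + READ1_PRIMER_SITE + insert + READ2_PRIMER_SITE + P7_ANCHOR

def extract_insert(molecule):
    """Strip the adapter components again to recover the insert."""
    left = P5_ANCHOR + READ1_PRIMER_SITE
    right = READ2_PRIMER_SITE + P7_ANCHOR
    if not (molecule.startswith(left) and molecule.endswith(right)):
        raise ValueError("molecule lacks the expected adapters")
    return molecule[len(left):len(molecule) - len(right)]

molecule = build_library_molecule("ACGTTGCA")
print(molecule)
print(extract_insert(molecule))  # ACGTTGCA
```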
10
Q

What occurs in the second step: Cluster Generation 1?

A

Step 2: cluster generation

  • Hybridise DNA library fragments to the flowcell
  • Hybridisation to the flowcell is a random process
  • But we can’t measure individual single molecules of our DNA library: they are too small
  • We need to amplify each fragment into many copies, giving a signal big enough to measure

Cluster Generation II

  • Perform bridge amplification to generate clusters
  • Many billions of clusters, each originating from a single DNA library molecule
  • Clusters are now big enough to be visualised
  • The flowcell is now ready to be loaded onto the sequencing platform to perform the sequencing

11
Q

How do we sequence the library?

A

DNA libraries deposited on flowcell
-> bridge amplification
-> amplified to form ‘clusters’
  • The sequencing machine processes a flowcell containing lanes
  • Each lane may contain multiple samples (indexed with a DNA barcode contained in the adapters)
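Sorting the reads of a lane back into samples by barcode (demultiplexing) might look like the sketch below. The barcodes, sample names and the exact-match rule are assumptions for illustration; real pipelines usually tolerate barcode mismatches.

```python
from collections import defaultdict

def demultiplex(reads, barcode_to_sample):
    """Group reads by sample.

    reads: iterable of (barcode, sequence) pairs from one lane.
    barcode_to_sample: dict mapping each barcode to a sample name.
    Reads with an unknown barcode are collected under "undetermined".
    """
    by_sample = defaultdict(list)
    for barcode, sequence in reads:
        sample = barcode_to_sample.get(barcode, "undetermined")
        by_sample[sample].append(sequence)
    return dict(by_sample)

# Hypothetical barcodes and reads.
samples = {"ACGT": "patient_1", "TGCA": "patient_2"}
lane = [("ACGT", "GGATC"), ("TGCA", "TTAGC"), ("ACGT", "CCATG"), ("NNNN", "AAAAA")]

for sample, seqs in demultiplex(lane, samples).items():
    print(sample, seqs)
```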

12
Q

What occurs in Step 3: sequencing-by-synthesis?

A

  • The 4 bases (A, T, C, G) are modified with:
    ◦ Chain terminators
    ◦ A different fluorescent colour dye for each base
  • Sequence one nucleotide per cycle in a controlled manner

Sequencing-By-Synthesis II

  • Single nucleotide incorporation (DNA polymerase)
  • Flowcell wash
  • Image the 4 bases (digital photograph)
  • Cleave terminator chemical group and dye with enzyme
  • Repeated N times

Sequencing-By-Synthesis III

  • Camera sequentially images all 4 bases on the surface of the flowcell each cycle
  • Each cycle image is converted to a nucleotide base call (A, C, G or T)
  • The number of cycles, and so the read length, is anywhere between 50 and 600 nucleotides
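Turning the images of one cluster into base calls can be sketched as "pick the brightest colour in each cycle". This is a deliberately naive model: the four-channel intensity tuples and their A, C, G, T order are assumptions, and real base callers also correct for signal overlap and assign quality scores.

```python
def call_bases(cycle_intensities):
    """Convert per-cycle intensities of one cluster into a read.

    cycle_intensities: one (A, C, G, T) intensity tuple per sequencing cycle.
    The brightest channel in each cycle is taken as the incorporated base.
    """
    bases = "ACGT"
    read = []
    for intensities in cycle_intensities:
        brightest = max(range(4), key=lambda i: intensities[i])
        read.append(bases[brightest])
    return "".join(read)

# Three cycles of made-up intensities for a single cluster.
cluster = [
    (0.9, 0.1, 0.0, 0.1),  # cycle 1: A channel brightest
    (0.1, 0.1, 0.8, 0.0),  # cycle 2: G channel brightest
    (0.0, 0.2, 0.1, 0.7),  # cycle 3: T channel brightest
]
print(call_bases(cluster))  # AGT
```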

13
Q

How do we analyse NGS data?

A
  • Short read sequences from the sequencing machine need to be re-assembled like a jigsaw
  • Map the locations of our sequence reads on the reference genome sequence
  • This generates a consensus sequence of our original DNA sample library
  • Compare this consensus sequence against the human genome reference and look for genetic variants
  • Dedicated software and bioinformatics tools will achieve this
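The consensus and comparison steps can be illustrated on a toy scale. This sketch assumes the reads are already mapped (a 0-based start position is given for each), that alignments contain no gaps, and that a simple majority vote is good enough; the reference and reads are made up, and a majority vote cannot represent heterozygous sites.

```python
from collections import Counter

def consensus(reference, aligned_reads):
    """Majority-vote consensus from reads mapped onto `reference`.

    aligned_reads: list of (start, sequence), 0-based, gap-free alignments.
    Positions without any read are reported as "N".
    """
    piles = [Counter() for _ in reference]
    for start, sequence in aligned_reads:
        for offset, base in enumerate(sequence):
            piles[start + offset][base] += 1
    return "".join(pile.most_common(1)[0][0] if pile else "N" for pile in piles)

def find_variants(reference, consensus_seq):
    """List (position, reference_base, consensus_base) where the two differ."""
    return [(i, ref, con)
            for i, (ref, con) in enumerate(zip(reference, consensus_seq))
            if con != "N" and con != ref]

reference = "ACGTACGT"
reads = [(0, "ACGA"), (1, "CGAA"), (2, "GAAC"), (4, "ACGT")]

cons = consensus(reference, reads)
print(cons)                            # ACGAACGT
print(find_variants(reference, cons))  # [(3, 'T', 'A')]
```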
14
Q

Compare NGS v Sanger Sequencing

A
  • NGS produces a digital readout. Sanger produces an analogue readout
  • Sanger is one sequence read
  • NGS is a consensus sequence of many reads
15
Q

What occurs in target enrichment?

A
  • Target enrichment
  • Capture target regions of interest with baits
  • Potential to capture several Mb of genomic regions of interest
  • The exome is about 50 Mb in size
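Why enrichment pays off can be seen from a quick coverage estimate (mean coverage = number of reads × read length ÷ target size). The run size of 40 million reads of 150 bases is a hypothetical figure, and the sketch assumes every read lands on target, which real captures do not achieve.

```python
GENOME_SIZE = 3_000_000_000  # about 3 billion base pairs
EXOME_SIZE = 50_000_000      # about 50 Mb

def mean_coverage(n_reads, read_length, target_size):
    """Average number of times each target base is sequenced."""
    return n_reads * read_length / target_size

# Hypothetical run: 40 million reads of 150 bases each.
n_reads, read_length = 40_000_000, 150

print(EXOME_SIZE / GENOME_SIZE)                         # the exome is under 2% of the genome
print(mean_coverage(n_reads, read_length, GENOME_SIZE))  # 2.0
print(mean_coverage(n_reads, read_length, EXOME_SIZE))   # 120.0
```

The same amount of sequencing gives 60 times deeper coverage when it is focused on the exome.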
16
Q

Give an example of what target enrichment could show

A
  • We are looking for protein coding mutations in the exons
  • Patient DNA sample subjected to exome sequencing
  • Example on the right shows a snippet of the consensus sequence of that sequenced sample
  • Reveals a heterozygous mutation in the CFTR gene
17
Q

How can we explore genetic diseases with NGS?

A
  • Collecting disease affected individuals and their families
  • Use of NGS in disease gene identification
  • Perform exome sequencing
  • Compare variant profiles of affected individuals
  • Try to identify the variant or mutation shared by the affected individuals
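Comparing variant profiles boils down to set operations. This is a minimal sketch, assuming each profile is a set of variant identifiers and at least one affected individual is given; the identifiers are made up, and subtracting the variants of unaffected relatives is an added assumption not stated on the card.

```python
def candidate_variants(affected, unaffected=()):
    """Variants present in every affected individual and in no unaffected one.

    affected: non-empty list of variant collections, one per affected individual.
    unaffected: optional list of variant collections from unaffected relatives.
    """
    shared = set.intersection(*(set(profile) for profile in affected))
    for profile in unaffected:
        shared -= set(profile)
    return shared

# Made-up variant identifiers.
affected = [
    {"varA", "varB", "varC"},
    {"varB", "varC", "varD"},
    {"varB", "varC", "varE"},
]
unaffected = [{"varC", "varF"}]

print(candidate_variants(affected))              # varB and varC are shared
print(candidate_variants(affected, unaffected))  # {'varB'}
```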
18
Q

Describe RNA-seq I

What does it determine?

What can the number of sequencing reads be used for?

A
  • NGS is not just for studying DNA. RNA-seq experiments use the total RNA (or mRNA) from a collection of cells or tissue
  • RNA is first converted to cDNA prior to library construction
  • NGS of RNA samples determines which genes are actively expressed
  • Single experiment can capture the expression levels of thousands of genes
  • The number of sequencing reads produced from each gene can be used as a measure of gene abundance
  • Quantification of the expression levels
  • Calculation of the differences in gene expression of all genes in the experimental conditions
  • With appropriate analysis, RNA-seq can be used to discover which distinct isoforms of genes are differentially regulated and expressed
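Using read counts as a measure of expression can be sketched as follows. The gene names and counts are made up, counts-per-million (CPM) is only one simple normalisation that ignores gene length, and both conditions are assumed to list the same genes; real analyses use dedicated statistical tools and replicates.

```python
import math

def cpm(counts):
    """Scale raw read counts to counts per million reads in the sample."""
    total = sum(counts.values())
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}

def log2_fold_change(control, treated, pseudocount=1.0):
    """log2(treated / control) per gene on CPM values.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    control_cpm, treated_cpm = cpm(control), cpm(treated)
    return {gene: math.log2((treated_cpm[gene] + pseudocount) /
                            (control_cpm[gene] + pseudocount))
            for gene in control_cpm}

# Made-up read counts for two genes in two conditions.
control = {"geneA": 100, "geneB": 900}
treated = {"geneA": 400, "geneB": 600}

for gene, lfc in log2_fold_change(control, treated).items():
    print(gene, round(lfc, 2))  # positive = higher in treated, negative = lower
```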