Next Generation Sequencing Flashcards

1
Q

As a recap, describe PCR.

A

It is fundamental for any DNA sequencing application.
PCR is used to amplify a specific region of DNA: primers will flank the region you want to amplify.

Each cycle doubles the number of copies of your target sequence.

We amplify the DNA until we have sufficient material for sequencing or for other applications.
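
The doubling described above can be written as a short calculation. This is only a sketch: it assumes perfect efficiency (every copy is duplicated in every cycle), which real reactions only approximate, and the function name `copies_after_cycles` is made up for illustration.

```python
def copies_after_cycles(starting_copies: int, cycles: int) -> int:
    """Idealised PCR: every cycle doubles the number of target copies."""
    if starting_copies < 0 or cycles < 0:
        raise ValueError("starting_copies and cycles must be non-negative")
    return starting_copies * 2 ** cycles


# One template molecule after 30 idealised cycles
print(copies_after_cycles(1, 30))  # 1073741824
```

So a single template molecule would give roughly a billion copies after 30 ideal cycles.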

2
Q

Describe Sanger Sequencing.

A

It is a slightly modified version of PCR in which chain-terminating nucleotides labelled with four coloured dyes are incorporated. The products are separated by size on a gel, and the order of the colours gives the sequence.

It’s very accurate, and can sequence up to about 800 base pairs per reaction. However, it is very slow.

3
Q

Describe how next generation sequencing has progressed with the advent of technology.

A

Technological advances since the end of the Human Genome Project have greatly decreased the cost of DNA sequencing.

Pyrosequencing has now been discontinued, and about 95% of sequencing is now done by sequencing-by-synthesis, a technique developed by a company called Solexa.

4
Q

What are the four steps of next generation sequencing?

A
  1. DNA library construction
  2. Cluster generation
  3. Sequencing-by-synthesis
  4. Data analysis
5
Q

What is a DNA library?

A

A DNA library is a collection of random DNA fragments from a specific sample, prepared for further study; in our case, next generation sequencing.

The DNA can come from just about anywhere, but in human genetic research it’s generally derived from patients’ blood.

6
Q

Describe the first step of NGS.

A

The first step is DNA library construction.

It starts in a wet lab, where we first need to prepare the DNA sample for sequencing.
Essentially, the DNA is chopped up into small fragments of about 300 bp. This is called shearing, and it can be achieved chemically, enzymatically or physically (sonication).

We then have to repair the ends of the sheared DNA fragments. We do this with a polymerase. Adenine (A) nucleotide overhangs are then added to the ends of the fragments. This is known as A-tailing. Adapters with thymine (T) overhangs can then be ligated to the DNA fragments.

The end result is a DNA library of literally billions of small, stable, random fragments that are representative of our original DNA sample.

Every fragment in the final DNA library is flanked by the adapters. The adapters contain the P5 and P7 anchors, and they also contain the primer binding sites, which allow the fragments to be sequenced.
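
The shearing and adapter steps above can be sketched as a toy simulation. It makes several simplifying assumptions: the DNA is a single strand represented as a string, it is cut into consecutive pieces of roughly 300 bp, and end repair and A-tailing are not modelled. The adapter strings are placeholder labels rather than real adapter sequences, and the function names are hypothetical.

```python
import random

# Placeholder labels, NOT real adapter sequences.
P5_ADAPTER = "<P5+primer_site>"
P7_ADAPTER = "<primer_site+P7>"


def shear(dna: str, mean_length: int = 300, spread: int = 30, seed: int = 0) -> list[str]:
    """Cut `dna` into consecutive fragments of roughly `mean_length` bases."""
    rng = random.Random(seed)
    fragments, position = [], 0
    while position < len(dna):
        # Fragment lengths vary around the mean; never shorter than 1 base.
        length = max(1, round(rng.gauss(mean_length, spread)))
        fragments.append(dna[position:position + length])
        position += length
    return fragments


def ligate_adapters(fragments: list[str]) -> list[str]:
    """Flank every fragment with the two placeholder adapters."""
    return [P5_ADAPTER + fragment + P7_ADAPTER for fragment in fragments]


# A made-up 3,000-base "sample" for demonstration.
rng = random.Random(1)
sample = "".join(rng.choice("ACGT") for _ in range(3000))
library = ligate_adapters(shear(sample))
print(len(library), "library molecules")
```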

7
Q

Describe the second step of NGS.

A

The second step is cluster generation.

Flow cells are like microscope slides that are sandwiched together. They have lanes, and they’re flooded with DNA fragments which attach to the surface of the flow cell as single molecules.

We need to hybridise the DNA library fragments to the flow cell. This is a random process.
However, we cannot see the individual single molecules of our DNA library, as they are too small. We need to use PCR to amplify each fragment into a group of copies that is large enough to see.

Traditionally, PCR is done in a tube, but here it is done on the surface of the flow cell in a process called bridge PCR.
The DNA fragments remain attached to the surface of the flow cell whilst we do PCR. Each fragment is amplified into a cluster, which is big enough to be seen. The flow cell is now ready to be loaded onto the sequencing platform.

8
Q

Describe the third step of NGS.

A

The third step is sequencing-by-synthesis.

We need to synthesise new DNA to read the sequence. The clusters are sequenced in a controlled manner, one base at a time.

Each of the four modified bases carries a different fluorescent dye colour, so that we can see which base is present, and a chain terminator, which ensures that the polymerase adds only one base at a time. The terminator has to be removed before the next base can be sequenced, which gives us a lot of control.

To expand on that last part: 1) we use DNA polymerase to incorporate a single nucleotide; 2) we wash the flow cell; 3) we image the 4 bases (as a digital photograph); and finally, 4) we cleave the terminator chemical group and the dye with an enzyme. We repeat this cycle until we have sequenced as many bases as we need.

A camera sequentially images all 4 bases on the surface of the flow cell. Each cycle’s image is converted to a nucleotide base call (A, C, G or T). The number of cycles, and so the read length, can be anywhere between 50 and 250 bases.
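
The four-step cycle above can be sketched as a loop for a single cluster. This is a toy model: the dye colours and the function name are made up for illustration, the template is given in the order the polymerase reads it, and imaging is assumed to be error-free.

```python
# Hypothetical dye colours; real instruments use their own dye chemistry.
DYE_FOR_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Return the base calls for the strand synthesised against `template`."""
    calls = []
    for position in range(min(cycles, len(template))):
        # 1) The polymerase incorporates one terminated, dye-labelled base.
        incorporated = COMPLEMENT[template[position]]
        # 2) Unincorporated bases are washed away (nothing to model here).
        # 3) The camera records the dye colour of the cluster.
        observed_colour = DYE_FOR_BASE[incorporated]
        # The image is converted into a base call.
        calls.append(BASE_FOR_DYE[observed_colour])
        # 4) Terminator and dye are cleaved, so the next cycle can proceed.
    return "".join(calls)


print(sequence_by_synthesis("TACGGA", cycles=4))  # ATGC
```

The read length equals the number of cycles, which is why running more cycles gives longer reads.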

9
Q

Describe the fourth step of NGS.

A

The fourth step is data analysis.

The short read sequences from the machine need to be re-assembled like a jigsaw to generate a consensus sequence of our original DNA sample. To do this, the short reads are aligned and mapped against a reference sequence.

We can then compare this consensus sequence against the human genome reference and look for genetic variants. Dedicated software and bioinformatics tools are available to do this.
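
The jigsaw idea above can be illustrated with a deliberately naive sketch: each read is placed at the position where it best matches the reference, the bases piled up at each position are combined by majority vote, and positions where the consensus differs from the reference are reported. It assumes there are no insertions or deletions, and it uses brute-force matching, which is not how production aligners work. The sequences and function names are made up.

```python
from collections import Counter


def best_offset(read: str, reference: str) -> int:
    """Offset at which `read` matches `reference` with the fewest mismatches."""
    def mismatches(offset: int) -> int:
        window = reference[offset:offset + len(read)]
        return sum(a != b for a, b in zip(read, window))
    return min(range(len(reference) - len(read) + 1), key=mismatches)


def consensus_and_variants(reads: list[str], reference: str):
    """Pile reads up on the reference, then call the consensus and variants."""
    piles = [Counter() for _ in reference]
    for read in reads:
        start = best_offset(read, reference)
        for i, base in enumerate(read):
            piles[start + i][base] += 1
    # Majority vote per position; "N" marks positions no read covered.
    consensus = "".join(
        pile.most_common(1)[0][0] if pile else "N" for pile in piles
    )
    variants = [
        (position, ref_base, alt_base)
        for position, (ref_base, alt_base) in enumerate(zip(reference, consensus))
        if alt_base not in ("N", ref_base)
    ]
    return consensus, variants


reference = "GATTACAGGCTTCA"
reads = ["GATTACAA", "TACAAGCT", "CAAGCTTCA"]  # sample carries G>A at position 7
consensus, variants = consensus_and_variants(reads, reference)
print(consensus)  # GATTACAAGCTTCA
print(variants)   # [(7, 'G', 'A')]
```

Positions are zero-based here. Because three reads overlap position 7 and all agree, the variant is supported by the consensus rather than by a single read.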

10
Q

What is the most common application of NGS?

Describe it.

A

The most common application of NGS is exome sequencing.

There are ~21,000 genes in the human genome. Often, we are only interested in the protein-coding exons of those genes, or ‘exome’, which represents 1-2% of the genome. Thus, it’s more efficient to sequence only the bits we are interested in, rather than the entire genome. It costs about £1,000 to sequence a genome, but only £200-£300 to sequence an exome.

So, you can quickly sequence patients’ samples and find out if they have variants or mutations in their exome (around 80% of pathogenic mutations are in protein-coding sequence).

11
Q

How do we achieve exome sequencing?

A

You want to enrich the DNA library with just the exons.

1) You use baits to capture the regions of interest from the library; the baits are made of RNA.
2) You incubate the DNA library with RNA baits that are complementary to the exons.
3) After the hybridisation step, you select the bait-bound fragments with magnetic beads coated with streptavidin.
4) You wash away the unbound fragments; the enriched library containing the exons remains.
5) You can then remove the RNA baits with an RNase enzyme, which digests the RNA and leaves the targeted DNA library.
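
The selection logic of the capture steps above can be sketched as a filter. This toy model replaces hybridisation with a simple coordinate check: a fragment is "captured" if it overlaps an exon that a bait targets. All coordinates and function names are made up for illustration.

```python
Interval = tuple[int, int]  # half-open (start, end) position on a chromosome


def overlaps(fragment: Interval, exon: Interval) -> bool:
    """True if the two intervals share at least one base."""
    return fragment[0] < exon[1] and exon[0] < fragment[1]


def capture(fragments: list[Interval], exons: list[Interval]) -> list[Interval]:
    """Keep fragments that a bait would bind; the rest are washed away."""
    return [f for f in fragments if any(overlaps(f, e) for e in exons)]


exons = [(1000, 1200), (5000, 5150)]
fragments = [(100, 400), (950, 1250), (1200, 1500), (4900, 5010), (8000, 8300)]
print(capture(fragments, exons))  # [(950, 1250), (4900, 5010)]
```

Only two of the five fragments survive, which mirrors why the enriched library is so much smaller than the whole-genome library.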

12
Q

List some differences between NGS and Sanger sequencing.

A
  • NGS produces a digital readout, while Sanger produces an analogue readout
  • Sanger produces a single sequence read, while NGS produces a consensus sequence from many reads
13
Q

What are the main drivers for this research?

A

You can explore genetic diseases with NGS.

First, you would collect samples from the disease-affected individuals and their families. You would perform exome sequencing on them, and then use the NGS data for disease gene identification.

We are looking for a mutation that is shared by the different affected individuals.
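
The search for a shared mutation can be sketched as a set intersection. The variant labels below are invented, and subtracting the variants of unaffected relatives is an extra assumption added here for illustration rather than something stated above.

```python
def shared_variants(affected, unaffected=()):
    """Variants present in every affected individual and in no unaffected one."""
    affected_sets = [set(variants) for variants in affected]
    if not affected_sets:
        return set()
    shared = set.intersection(*affected_sets)
    for variants in unaffected:
        shared -= set(variants)
    return shared


# Made-up variant calls from three affected individuals and one unaffected relative
affected = [
    {"chr1:1000A>G", "chr2:500C>T", "chr7:42G>A"},
    {"chr1:1000A>G", "chr2:500C>T"},
    {"chr1:1000A>G", "chr2:500C>T", "chr9:77T>C"},
]
unaffected = [{"chr2:500C>T"}]
print(shared_variants(affected, unaffected))  # {'chr1:1000A>G'}
```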

14
Q

Describe RNA-sequencing.

A

NGS is not just used for studying DNA. RNA-sequencing experiments use the total RNA (or mRNA) from a collection of cells or tissue.

The RNA is first converted to cDNA prior to library construction. After converting the RNA into cDNA, you follow the same method of fragmentation, library construction and sequencing as for DNA.

NGS of RNA samples determines which genes are actively expressed. A single experiment can capture the expression levels of thousands of genes.

The number of sequencing reads produced from each gene can be used as a measure of that gene’s expression level.
We calculate the differences in expression of all genes between the experimental conditions.

With appropriate analysis, RNA-sequencing can be used to discover how distinct isoforms of genes are differentially regulated and expressed.
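
The read-count comparison described above can be sketched with two small functions. This is a simplified illustration: it scales counts to counts per million (CPM) so that samples of different depth are comparable, then takes a log2 fold change with a pseudocount to avoid dividing by zero. Real differential-expression analyses use replicates and statistical models, and the gene names and counts here are made up.

```python
import math


def counts_per_million(counts: dict[str, int]) -> dict[str, float]:
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


def log2_fold_changes(control, treated, pseudocount: float = 1.0):
    """log2(treated / control) per gene, computed on CPM values."""
    cpm_control = counts_per_million(control)
    cpm_treated = counts_per_million(treated)
    return {
        gene: math.log2(
            (cpm_treated[gene] + pseudocount) / (cpm_control[gene] + pseudocount)
        )
        for gene in control
    }


# Made-up read counts for three hypothetical genes
control = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 600}
treated = {"GENE_A": 400, "GENE_B": 300, "GENE_C": 300}
for gene, change in log2_fold_changes(control, treated).items():
    print(gene, round(change, 2))
```

A value near 2 means about a fourfold increase, 0 means no change, and a value near -1 means the expression has roughly halved.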

15
Q

Describe third-generation sequencing.

A

A well-known example is Oxford Nanopore sequencing.

Nanopores are protein pores, like those found in cell membranes. DNA is forced through the nanopore, which generates an electrical signal that is translated into the sequence.
We can use them to sequence much longer reads, up to 10 Mbp.

It is single-molecule sequencing, and there is no PCR involved.
