Shane 2: Sanger Sequencing vs NGS Flashcards

Question 1

Q

Give the read length, no of reads/run, throughput, SNP error rate, Indel error rate and costs of Sanger Sequencing.

Answer

A

Read Length: 800bp
No of reads/run: 96 [<1 day]
Throughput: 6Mb/day
SNP error rate: low
Indel error rate: low
Costs: 500 euro/Mb

Question 2

Q

Give the read length, no of reads/run, throughput, SNP error rate, Indel error rate and costs of Illumina.

Answer

A

Read Length: 2x150bp
No of reads/run: 400,000,000 [<1 day]
Throughput: 120Gb/day
SNP error rate: high (approx 0.5%)
Indel error rate: low
Costs: <0.05 euro/Mb
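
A rough arithmetic check on the figures in Questions 1 and 2. This is only a sketch: it assumes the throughput figures are in megabases and gigabases, a 3,000 Mb haploid genome read once (1x coverage), and it takes the Illumina cost at its upper bound of 0.05 euro/Mb. The variable names are made up for illustration.

```python
# Figures copied from the two cards above; everything else is an assumption.
GENOME_MB = 3_000              # assumed haploid human genome, ~3 billion bp

sanger_cost_per_mb = 500       # euro/Mb
illumina_cost_per_mb = 0.05    # euro/Mb (upper bound, the card says less than this)

sanger_mb_per_day = 6          # 6 Mb/day
illumina_mb_per_day = 120_000  # 120 Gb/day

# Illumina throughput check: reads per run x bases per read pair (2 x 150bp)
illumina_bases_per_run = 400_000_000 * (2 * 150)

sanger_cost = GENOME_MB * sanger_cost_per_mb
illumina_cost = GENOME_MB * illumina_cost_per_mb
sanger_days = GENOME_MB / sanger_mb_per_day
illumina_days = GENOME_MB / illumina_mb_per_day

print(f"Illumina bases per run: {illumina_bases_per_run:,}")
print(f"Sanger:   {sanger_cost:,.0f} euro, {sanger_days:,.0f} machine-days")
print(f"Illumina: {illumina_cost:,.0f} euro, {illumina_days:.3f} days")
```

The read count times the paired read length comes to 120 billion bases, which lines up with the Illumina throughput figure on the card.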

Question 3

Q

Talk about the efforts to sequence the first human genome

Answer

A

The human genome was first sequenced in 2003
At this time many different institutes were all working at the same time to sequence different chromosomes
It took about 13 years (1990 to 2003) and roughly 3 billion US dollars to do
Sanger -> many Sanger sequencers were run in parallel, all day, every day
Private industry (looking to patent the human genome) vs the publicly funded effort

Question 4

Q

What is meant by SNP error rate?

Answer

A

This is how often a system calls the wrong base at a single position, i.e. reports a SNP that is not really there

A low error rate means a high likelihood that the sequence is correct
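
To make the idea concrete, here is a small sketch of how an error rate translates into expected miscalled bases. The 0.5% figure is the approximate Illumina SNP error rate from Question 2. The Phred-style quality score, Q = -10 * log10(p), is a standard way of expressing the same probability and is not taken from these notes.

```python
import math

def expected_miscalls(error_rate: float, n_bases: int) -> float:
    """Expected number of wrongly called bases among n_bases."""
    return error_rate * n_bases

def phred_quality(error_rate: float) -> float:
    """Phred-scaled quality: Q = -10 * log10(error probability)."""
    return -10 * math.log10(error_rate)

illumina_rate = 0.005  # approx 0.5% per base

print(expected_miscalls(illumina_rate, 150))   # errors expected in one 150bp read
print(round(phred_quality(illumina_rate), 1))  # quality score for that error rate
```

At 0.5% per base, a 150bp read is expected to contain a little under one wrong base on average.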

Question 5

Q

What are indels?

Answer

A

Insertions and deletions

Everyone has these; they are part of the normal variation of the human genome. The majority are harmless, but some can be pathogenic

Question 6

Q

How large is the human genome?

Answer

A

About 3 billion base pairs of DNA in a single (haploid) set of chromosomes -> x2 copies => about 6 billion base pairs in the whole diploid genome

Question 7

Q

Why is Illumina NextSeq sequencing not done anymore and what is used instead?

Answer

A

Illumina can only do short reads

Paired end sequencing is now done -> both the forward and the reverse strand of each fragment are sequenced in the same run
-> if you used Sanger you would have to design separate forward and reverse primers to do this etc

Question 9

Q

What are the four main next gen sequencing technologies available?

Answer

A

Illumina -> most prevalent
SOLiD (Life Technologies)
Ion Torrent (Life Technologies)
Pacific Biosciences

Question 10

Q

What are the two main 3rd generation approaches to sequencing?

Answer

A

Oxford Nanopore (commercially available)

Illumina Nanopore (licensed an alternative Nanopore technology)

Question 11

Q

What is the basis of ion torrent technology?

Answer

A

Uses semi-conductors to tell when hydrogen ions are released

Question 12

Q

Why are there so many different sequencing technologies?

Answer

A

Companies have to keep developing new sequencing technologies, as you can't get a patent on an existing method such as Sanger etc

Question 13

Q

What are the two main sequencing template approaches to sequencing?

Answer

A

Clonal amplification of single molecules

Single DNA Molecule as a sequencing template

Question 14

Q

What is meant by the clonal amplification of single molecules as an approach to sequencing, give two examples

Answer

A

Single molecule only briefly needed as a template
— Thousands of identical molecules boost signal
— Two different methods:
• Bridge amplification of molecules immobilized on surface - Illumina
• Emulsion PCR — Ion Torrent

Question 15

Q

What is meant by the single DNA molecule as a sequencing template as an approach to sequencing, give two examples

Answer

A

— Challenge of keeping single molecules stable during sequencing
— Avoid amplification biases
— Pacific Biosciences, Oxford Nanopore

Question 16

Q

How does clonal amplification signaling work?

Answer

A

Makes use of amplification (by PCR) to make many copies of a sequence before its signal is detected

Used in the emulsion PCR (Ion Torrent) and bridge amplification (Illumina) methods

-> generally the signal from a single molecule is too weak to detect, so you have to amplify beforehand, e.g. in Ion Torrent you have to increase the number of DNA molecules to boost the signal so we can detect enough hydrogen ions

Question 17

Q

What is the main benefit of single DNA molecule sequencing?

Answer

A

This allows us to sequence much longer continuous strands of DNA

Question 18

Q

What is the history behind the Illumina: Flow Cell method of sequencing?

Answer

A

Illumina: Flow Cells with “Molecular Colonies”

The original research was done in the chemistry department of the University of Cambridge; the company was known as Solexa

Sold to Illumina in 2006

Question 19

Q

How does the Illumina Flow cell sequencing method work?

Answer

A

Makes use of flow cells, a type of slide

Short oligonucleotide sequences, only a few nucleotides long, are spread across the entire surface of the flow cell

These are used to bind DNA onto the flow cell

Each cluster on the flow cell is formed of the same sequence of DNA -> each cluster started off as one DNA molecule that first bound and was then amplified by PCR to produce many copies in close proximity to it

Question 20

Q

What are the main pros and cons of Illumina sequencing?

Answer

A

Can sequence millions of cluster reactions at once i.e. on the same flow cell

You have to measure (take an image) every cycle i.e. the sequence is built up one nucleotide at a time

It requires specially designed chemistry using reversible dye-terminators and a polymerase

Termination is a reversible process, unlike in Sanger -> this allows us to stop the reaction after each nucleotide is added, image it, and then continue the reaction

Question 21

Q

What kind of nucleotides are used in Illumina sequencing

Answer

A

Fluorescently labelled reversible terminator nucleotides

Question 22

Q

How are the fluorescent nucleotides used in Illumina reversible?

Answer

A

You can chemically cleave off the fluorescent group of the nucleotide and wash it away at any point -> it has to be washed away to prevent background fluorescence when you go to add the next nucleotide

The 3’OH group stays blocked until you are ready to add the next nucleotide - this gives us time to image the nucleotide that was just added (temporarily blocked 3’ OH)

When the OH group is freed you can add the next nucleotide

Question 23

Q

What is the main con of Illumina sequencing?

Answer

A

Takes a lot of time due to many washing steps

To sequence a 150bp read there will be 150 wash cycles -> add reversible nucleotide, block 3’OH, read fluorescent signal, cleave off/wash signal off, add next nucleotide

150bp length can take 12 hours or longer
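
As a quick worked example of the figures above, this sketch works out the average time per cycle. It assumes one cycle per base and treats the 12 hours as a lower bound; the function name is made up.

```python
def minutes_per_cycle(read_length_bp: int, run_hours: float) -> float:
    """Average time per cycle, assuming one cycle per base."""
    return run_hours * 60 / read_length_bp

# 150bp in 12 hours, the figures from the card (12 hours is a lower bound)
print(f"{minutes_per_cycle(150, 12):.1f} minutes per cycle")
```

So each add/image/cleave/wash cycle takes close to five minutes, which is why the run is slow.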

Question 24

Q

What kind of elongation is used on the Illumina?

Answer

A

Illumina Paired End Sequencing

Question 25

Q

What is Illumina paired end sequencing?

Answer

A

A method of sequencing both strands of each DNA fragment in the one run

Genomic DNA is purified, denatured with heat, and fragmented into short sequences using enzymes or sonicators

Adapter sequences are ligated onto the forward and reverse short sequences using DNA ligase

Each sequence now has an adapter 1 site (A1 site) and a priming site (SP1); the complementary reverse strand has A2 and SP2 sites

The A1 site allows binding onto a flow cell

The priming site is a sequence for which you can design primers, i.e. it allows you to prime the sequencing reaction; the reverse has a complementary design, i.e. use primer 1 for the forward sequence and primer 2 for the reverse etc

Question 26

Q

Give a brief run down of Illumina paired end sequencing

Answer

A

Fragment genomic DNA
Ligate adapters
Generate clusters - bind to flow cell
Sequence first end
Regenerate clusters and sequence paired end
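
The last two steps above can be sketched in code. This is a simplified illustration, not how the instrument works internally: adapters are left out, the fragment is made up, and the function names are hypothetical. Read 1 comes from one end of the forward strand, and read 2 from the start of the reverse strand, i.e. the reverse complement of the other end.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]

def paired_end_reads(fragment: str, read_length: int) -> tuple[str, str]:
    """Return (read 1, read 2) for one fragment.

    Read 1 is the start of the forward strand; read 2 is the start of the
    reverse strand, i.e. the reverse complement of the fragment's other end.
    """
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment)[:read_length]
    return read1, read2

fragment = "ATGCGTACCGTTAGC"   # made-up 15bp fragment, adapters left out
r1, r2 = paired_end_reads(fragment, read_length=5)
print(r1, r2)
```

With a real 2x150bp run the same idea applies, with 150 bases read from each end of every fragment.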

Question 28

Q

Talk a little about the Illumina flow cells

Answer

A

There are several hundred million clusters on a flow cell

Hundreds of millions of reactions on the one plate

Clusters are only a micron in size - smaller than a bacterial cell

Sequencing one nucleotide at a time -> creates a sequence about 150bp in length => 150 images put together to determine sequence
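
The idea of putting the images together can be sketched as follows. This is a deliberately simplified illustration: it assumes each cycle yields one intensity per base for a cluster and simply takes the brightest one. The intensities are made up, and real base calling is considerably more involved.

```python
BASES = "ACGT"

def call_bases(cycle_intensities):
    """Call one base per cycle: the channel with the strongest signal wins."""
    read = []
    for intensities in cycle_intensities:
        brightest = max(range(len(BASES)), key=lambda i: intensities[i])
        read.append(BASES[brightest])
    return "".join(read)

# Made-up intensities for one cluster over 4 cycles, ordered A, C, G, T.
cluster = [
    (0.9, 0.1, 0.0, 0.1),
    (0.1, 0.1, 0.8, 0.0),
    (0.0, 0.2, 0.1, 0.7),
    (0.1, 0.9, 0.1, 0.1),
]
print(call_bases(cluster))
```

For a 150bp read the list would simply have 150 entries, one per image.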

Question 29

Q

Give some examples of Illumina sequencers

Answer

A

MiniSeq System -> for targeted sequencing
MiSeq series -> small genome and targeted sequencing
HiSeq X Series -> population and production scale whole genome sequencing
NovaSeq Series -> population and production scale genome, exome, transcriptome sequencing and more -> can cost up to a million euros -> none in Ireland

Question 30

Q

Illumina Sequencing
(2 pros + 3 cons)

Answer

A

Pros:
- Very high throughput - can do millions of clusters per cell
- Best price/bp, although the machines can be very expensive (some are affordable)

Cons:
- relatively long run time - 12 hours plus for a run of 150bp
- Sequencing quality decreases towards the end -> polymerase struggles to incorporate large fluorescent molecules near 150bp
- Imaging interference in low diversity libraries -> fluorescence interference in highly repetitive sequences

Question 31

Q

How does Ion Torrent sequencing work?

Answer

A

Developed in 2010 by Life Technologies
A form of emulsion PCR using magnetic beads

Question 32

Q

How does Ion Torrent work?

Answer

A

Uses magnetic beads coated in short oligonucleotides
Each bead is in an oil droplet along with DNA polymerase and nucleotides
Chips have millions of wells, each with an ion sensor
Each bead fits into a well
Semiconductor sensors detect the hydrogen ions released when a nucleotide is incorporated by the polymerase

Question 33

Q

Why do we need to amplify the signal for Ion Torrent?

Answer

A

It's done to increase the amount of H+ signal produced so that its release can be detected by an ion sensor

Question 34

Q

How do ion sensors work?

Answer

A

They detect H+ released upon incorporation of nucleotides by polymerase

They borrow the technology used in semi-conductors

Question 35

Q

Ion torrent sequencing is based on what kind of sequencing?

Answer

A

Semi-conductor sequencing

The sequencing is carried out on the chip, no imaging is required

Question 36

Q

How does H+ detection work in ion torrent sequencing?

Answer

A

H+ is released with the formation of phosphodiester bonds

This brings about a pH change

Slightly acidic

pH is measured with a sensor

Question 37

Q

Explain in your own way how ion torrent sequencing works?

Answer

A

Cycle 1: add an A, if the A is not incorporated i.e. no H+ released then wash away
Cycle 2: add a G, if not incorporated wash away
Cycle 3: add a C etc etc
Cycle 4: add a T, if incorporated then H+ released

Therefore we know there is a T at position 1 of the new strand, then we go on to the next cycle and keep repeating until we get the desired length
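
The add-one-nucleotide-at-a-time logic above can be sketched as a small simulation. It is only an illustration: the flow order A, G, C, T follows the example on this card and is an assumption, and the signal is idealised as the number of bases incorporated in that flow, standing in for the H+ released. The function name is made up.

```python
FLOW_ORDER = "AGCT"  # order used in the card above; an assumption here

def simulate_flows(synthesised: str, n_rounds: int):
    """Return (nucleotide, signal) per flow for the strand being built.

    Signal is idealised: the number of bases incorporated in that flow,
    which stands in for the amount of H+ released (0 means washed away).
    """
    flows = []
    pos = 0
    for _ in range(n_rounds):
        for nuc in FLOW_ORDER:
            count = 0
            # Keep incorporating while the next base matches this nucleotide.
            while pos < len(synthesised) and synthesised[pos] == nuc:
                count += 1
                pos += 1
            flows.append((nuc, count))
    return flows

for nuc, signal in simulate_flows("TAAG", n_rounds=2):
    print(nuc, signal)
```

For the made-up strand "TAAG", the first three flows give no signal, the T flow gives a signal of 1, and the following A flow gives a signal of 2 because two As are added in the same flow.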

Question 38

Q

If the signal detected is twice as strong upon addition of a nucleotide in ion torrent sequencing, what does this indicate?

Answer

A

If the signal is twice as strong you know two of the same nucleotide have been added on in a row

Question 39

Q

What are two examples of ion torrent sequencers?

Answer

A

Ion PGM

Ion Proton

Question 40

Q

Talk about the Ion PGM Ion torrent sequencer

Answer

A

Personal genome machine

3 different types of chips

Can do 200 or 400bp reads

Can run up to 5.5million reads/Ion 318 chip i.e. over 5 million wells per chip

4-7 hour run time

Much quicker than Sanger, no need for fluorescence etc

Question 41

Q

Talk about the Ion Proton ion torrent sequencer

Answer

A

Newer ion torrent sequencer
Up to 200bp reads
Up to 60-80 million reads (way more than ion PGM)
2-4 hour run time -> short run time hence its use in hospital labs
Useful for targeted gene sequencing e.g. for cancers or genetic disorders

Question 42

Q

What are the three main pros of ion torrent sequencing and the two main cons

Answer

A

Pros:
- fast
- relatively cheap
- scalable (can buy different chips depending on need)

Cons:
- relatively high error rate
- emulsion PCR

Question 43

Q

Talk about the high error rate of ion torrent sequencing

Answer

A

High error rate seen where there are repetitive sequences

e.g. 3 As in a row -> you would think the signal would be three times higher but this is not always the case

There was a very high error rate associated with this in the beginning but it has since gotten better
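
A rough sketch of why repeats are error-prone. It assumes the number of bases in each flow is estimated by rounding the measured signal to the nearest whole number, which is a simplification, and the signal values are invented for illustration.

```python
def decode_flows(flows):
    """Turn (nucleotide, measured signal) pairs back into a sequence.

    The number of bases per flow is estimated by rounding the signal,
    so a signal that is too low or too high gives an indel in a repeat.
    """
    return "".join(nuc * round(signal) for nuc, signal in flows)

# Made-up signals for a true sequence of "TAAAG".
clean = [("T", 1.02), ("A", 2.95), ("G", 0.97)]
noisy = [("T", 1.02), ("A", 2.40), ("G", 0.97)]  # the AAA signal came out low

print(decode_flows(clean))
print(decode_flows(noisy))
```

In the second case the run of three As is read as two, i.e. a one-base deletion, even though every single-base flow was called correctly.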

Question 44

Q

Talk about the difficulties of emulsion PCR in Ion torrent sequencing

Answer

A

Emulsion PCR is very technically challenging and can take a while to get it to work in the lab

Question 45

Q

What is PacBio Sequencing and who set it up?

Answer

A

Single Molecule Real Time Sequencing (SMRT)

Commercially launched in 2011 by Pacific Biosciences

The company is based in California; the underlying technology came out of research at Cornell University

Question 46

Q

How does PacBio Sequencing Work

Answer

A

Single Molecule Real Time Sequencing SMRT
Chips have individual wells
One copy of DNA sequence in each well
DNA in single strand form
Inside each well there is an immobilised DNA polymerase i.e. stuck to bottom of well
There are fluorescently labelled nucleotides in well floating around
Polymerase will incorporate complementary nucleotides
A fluorescence pulse occurs every time a nucleotide is added on
The fluorescent pulse is measured in real time - this happens very quickly
The polymerase cannot move hence how we know the exact position where the nucleotide is being added on

Question 47

Q

Why is the DNA polymerase immobilised in PacBio Sequencing, and how?

Answer

A

Immobilised by fixing it to the bottom of the well

This stops the polymerase randomly inserting nucleotides

Allows us to know the start point of DNA synthesis

Question 48

Q

Talk about the wells used in PacBio Sequencing, why is this done?

Answer

A

The wells in the chips used are very shallow

This stops any fluorescent nucleotides from floating out

Question 49

Q

Talk about the pros and cons of PacBio Sequencing

Answer

A

No PCR i.e. don't need to amplify DNA prior to sequencing

Can do very long reads - read lengths averaging 10-15kb and a max of 40kb

Can be used to observe DNA modifications

Throughput per run is low -> approximately 1 million reads

Run time is short

Error rate is high - same nucleotide repeats cause issues

Question 50

Q

Talk about Oxford Nanopore, what is the principle behind it

Answer

A

Makes use of bacterial nanopore proteins which drag DNA through small perforations in a chip ‘nanopore’

DNA is ‘sequenced’ as it is dragged through the nanopore

Question 51

Q

How does PacBio sequencing allow for observation of DNA modifications

Answer

A

Anytime a cytosine is methylated a different pulse is seen

This allows us to identify any points of DNA cytosine methylation

Question 52

Q

Explain how we identify base pairs using Oxford Nanopore sequencing

Answer

A

Ions flow through the nanopore constantly, giving a steady current

Each base blocks the current to a different degree

Each nucleotide blocks the ion current by a very specific amount, so the change in current tells us which base is passing through
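
The principle above can be sketched as a nearest-level lookup. This is a toy model only: the current levels are invented, one measurement per base is assumed, and real nanopore base calling is far more complex because several bases sit in the pore at once.

```python
# Hypothetical current levels (arbitrary units) for each base; not real values.
LEVELS = {"A": 50.0, "C": 60.0, "G": 70.0, "T": 80.0}

def call_base(current: float) -> str:
    """Pick the base whose reference level is closest to the measured current."""
    return min(LEVELS, key=lambda base: abs(LEVELS[base] - current))

def call_sequence(currents):
    """Call one base per measured current value."""
    return "".join(call_base(c) for c in currents)

measured = [51.2, 78.9, 69.5, 61.0, 49.1]  # made-up trace, one value per base
print(call_sequence(measured))
```

Each measured value is matched to the closest reference level, so small noise in the current does not change the call as long as the levels are well separated.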

Question 53

Q

Talk about nanopore proteins and explain how we use them

Answer

A

Nanopore proteins play a role in bacterial cells - normally they take up DNA

They transport DNA into the bacterial cell from outside the cell

They take up double stranded DNA

A motor protein (enzyme) transports the DNA across the nanopore one strand at a time - forward strand first then the reverse strand -> can sequence both strands this way

Question 54

Q

What are two commercially available Oxford Nanopore sequencers?

Answer

A

MinION
GridION

Question 55

Q

Talk about some of the portable Oxford Nanopore technologies

Answer

A

Flongle and SmidgION
For use in the field
Often used in microbiology e.g. for COVID-19 detection