Introduction to NGS and Library Construction Basics Flashcards

Question 1

Q

slide 1

General Timeline of Sequencing Methods
What are the three generations of sequencing?

Answer

A

First Generation - Sanger Sequencing

Second Generation - Illumina (and other NGS methods/platforms)
Third Generation - Long Read (PacBio/Nanopore)

Question 2

Q

notes on slide 1

Answer

A

Only 2nd and 3rd generation sequencing are used
Sanger is very basic; Novogene does not use it
Sanger sequencing produces short reads, each covering only one section of the DNA, so it is not very useful at scale
In the 1990s the whole genome was assembled using Sanger sequencing, one individual section at a time

Question 3

Q

slide 3
Sanger sequencing

Answer

A

Early sequencing method developed by Frederick Sanger in 1977
Still used by researchers today (clone verification, general CRISPR screens, etc.)

Cons:
- Works best on smaller templates (PCR products, plasmids; gDNA is too large)
- General read lengths are around 500 bp to 800 bp
- Cost per base is very high ($1,000+ per Mb)

Pros:
- Fast turnaround time
- Longer reads (relative to Illumina)
- Low error rate and low cost per run (~$5 per sequencing reaction)
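
The two cost figures on this card (cheap per reaction, expensive per Mb) can be reconciled with a quick calculation. The sketch below is only back-of-the-envelope arithmetic using the card's own numbers (~$5 per reaction, 500-800 bp reads); the function name is made up for illustration.

```python
# Back-of-the-envelope Sanger cost per megabase.
# Inputs are the card's figures: ~$5 per reaction, reads of roughly 500-800 bp.

def cost_per_mb(cost_per_reaction: float, read_length_bp: int) -> float:
    """Return the cost in dollars to sequence 1 Mb (1,000,000 bases)."""
    reactions_needed = 1_000_000 / read_length_bp
    return reactions_needed * cost_per_reaction

for read_length in (500, 800):
    print(f"{read_length} bp reads: ${cost_per_mb(5.0, read_length):,.0f} per Mb")
```

Each reaction is cheap, but a megabase needs well over a thousand of them, which is why the per-base cost lands in the "$1,000+ per Mb" range.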

Question 4

Q

slide 3 notes

Question 5

Q

slide 4 steps in sequencing

Answer

A

1 - PCR with fluorescent, chain-terminating ddNTPs
- Take the original DNA sequence, PCR-amplify and denature it (template is usually a PCR product or cloned plasmid, not gDNA)
- Mix with dNTPs and fluorescently labelled ddNTPs

2 - size separation by capillary gel electrophoresis

3 - laser excitation & detection by sequencing machine
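
The three steps above can be mimicked with a toy simulation. This is a simplified sketch, not a model of real chemistry: it assumes every extension step has a fixed chance of incorporating a labelled ddNTP, ignores strand orientation, and uses invented function names and parameters (`copies`, `dd_fraction`, `seed`).

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sanger_fragments(template, copies=2000, dd_fraction=0.1, seed=0):
    """Step 1: extend many copies of the template. At each position there is
    a chance that a labelled, chain-terminating ddNTP is added, which stops
    extension. Returns (fragment_length, terminal_base_label) pairs."""
    rng = random.Random(seed)
    fragments = []
    for _ in range(copies):
        for position, base in enumerate(template, start=1):
            if rng.random() < dd_fraction:
                fragments.append((position, COMPLEMENT[base]))
                break
    return fragments

def read_sequence(fragments):
    """Steps 2 and 3: order fragments by size (as capillary electrophoresis
    does), then read the fluorescent label at each size."""
    label_by_length = {}
    for length, base in fragments:
        label_by_length[length] = base
    return "".join(label_by_length[n] for n in sorted(label_by_length))

template = "ATGCCGTAAGCT"
print(read_sequence(sanger_fragments(template)))
```

Because fragments of every length are produced, reading the terminal label from shortest to longest spells out the newly synthesized strand one base at a time.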

Question 6

Q

slide 5
second generation

Answer

A

No longer produced; didn’t really accomplish the goal of affordable large data outputs
Cost per Gb was high relative to machine cost ($500K machine)
Roche realized this was not their market space

Question 7

Q

slide 7
Illumina

Answer

A

Has become the ‘gold standard’ platform for NGS (using SBS, sequencing by synthesis, technology). Illumina's sequencing technology originated at Solexa (a start-up in the UK). Solexa launched its sequencer in 2006, and Illumina bought Solexa in 2007, which is where we start seeing major traction (in growth and applications) in the NGS world.

Question 8

Q

slide 7 notes

Question 9

Q

slide 8
Overall comparison between sequencing platforms

Question 10

Q

slide 8 notes

Question 11

Q

slide 9
cost per human genome

Answer

A

In 2001 it cost roughly $100,000,000; by 2020 it cost less than $1,000

Question 12

Q

slide 9 notes

Question 13

Q

slide 10
Sequencing Power for Every Scale

Answer

A

The HiSeq and NovaSeq sequencers are the two major platforms we use at Novogene. Other platforms, such as NextSeq and MiSeq, are available, but are generally used for specific reasons.

Question 14

Q

slide 10 notes

Question 15

Q

slide 11
Flow Cell Surface

Answer

A

Surface of the flow cell is coated with a lawn of oligo pairs

Different platforms will have different types of flow cells, which in turn will yield different outputs

Question 16

Q

slide 12
Sequencing by Synthesis (Basis of Illumina Technology)

Answer

A

DNA: 0.05-1.0ug

cluster growth

sequencing

Question 17

Q

slide 12 notes

Question 18

Q

slide 13
Illumina NGS Workflow
Enabling translation of research discoveries into potential clinical applications

Answer

A

extract: DNA/RNA extraction from original sample (tissue, cells, blood, etc.) -> Generally done by the client but can be done by Novogene

library prep: Take the DNA/RNA and make ‘libraries’ (sample material that can be loaded on the sequencer) using commercial kits.

sequence: Put libraries on the sequencer to create raw data (FASTQ files)

data analysis: Take raw data (FASTQ files) and analyze them using various software to make meaningful conclusions
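
Since FASTQ is the hand-off point between sequencing and data analysis, it helps to see what a record looks like. The sketch below assumes the common four-line FASTQ layout with Phred+33 quality encoding; the two reads are invented for illustration and the helper names are hypothetical.

```python
import io

def parse_fastq(handle):
    """Yield (read_id, sequence, quality_string) from 4-line FASTQ records."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line
        quality = handle.readline().rstrip("\n")
        yield header[1:], sequence, quality

def mean_phred(quality, offset=33):
    """Average base quality, assuming Phred+33 ASCII encoding."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)

# Two made-up reads standing in for a real FASTQ file.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!II\n")
records = list(parse_fastq(example))
for read_id, sequence, quality in records:
    print(read_id, sequence, mean_phred(quality))
```

Real pipelines use dedicated tools for this step; the point here is only that each read carries its bases plus a per-base quality score.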

Question 19

Q

Illumina Based Sequencing

Answer

A

Library Construction
- Starting w/ DNA or RNA and turning it into Illumina-compatible ‘material’

Cluster Generation
- Add to flow cell
- Bridge amplification

Sequencing
- Single base at a time, with imaging

Data Analysis
- Images converted into usable information
- Basecalls and ‘reads’ -> raw data

Question 20

Q

Library Preparation

Answer

A

Main purpose – get ideal insert sizes attached to adapter regions to make them usable on the Illumina platform

The DNA/cDNA sequences will be flanked by adapter sequences

These adapter sequences/regions will include:
- P5 and P7 sequences -> help the library bind to the flow cell
- Index sequences, i7 (Index 1) and i5 (Index 2) -> allow for multiplexing (loading multiple samples on a single lane with a known sequence “divider”)
- R1 and R2 binding sequences -> allow the R1/R2 sequencing primers to bind (standard for most libraries but can be custom)

Question 21

Q

Basic Illumina Compatible Library Template

Answer

A

P5 & P7 oligos (required)
- Needed to bind to the flow cell

index 2 (optional)
- Also referred to as the i5 index
- Not always used; only present when dual-index or UDI (unique dual index) kits are used
- Allows more samples to be multiplexed, and UDIs help mitigate index hopping

read1 primer (required)
- Binding site for the Read 1 sequencing primer (part of the sequencing process)
- Ideally Illumina compatible, but a custom R1 binding site is possible
- A custom site requires an additional custom primer (and changes how lanes can be purchased)

insert DNA (required)
- Where most libraries will hold/fail QC (pre-made library service)
- Insert sizes need to be of “ideal length”, but may be longer or shorter depending on the protocol
- 250-300 bp for RNA-Seq; 300-350 bp for WGS -> other services vary

read2 primer (required)
- Binding site for the Read 2 sequencing primer (part of the sequencing process)
- Ideally Illumina compatible, but a custom R2 binding site is possible
- A custom site requires an additional custom primer (and changes how lanes can be purchased)

index 1 (required)
- Also referred to as the i7 index
- Required if multiple samples are on the same lane
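
The template on this card can be written down as a small data structure, which makes the element order and the required/optional split easier to remember. This is an illustrative sketch only: the class and field names are invented, the sequences are placeholders rather than real adapter sequences, and the insert windows are the ones quoted on the card.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Insert size windows quoted on the card (bp); other services vary.
IDEAL_INSERT_BP = {"RNA-Seq": (250, 300), "WGS": (300, 350)}

@dataclass
class LibraryTemplate:
    p5: str
    read1_site: str
    insert: str
    read2_site: str
    index1_i7: str
    p7: str
    index2_i5: Optional[str] = None  # only with dual-index / UDI kits

    def layout(self) -> List[Tuple[str, str]]:
        """Return the elements in order from the P5 end to the P7 end."""
        parts = [("P5", self.p5)]
        if self.index2_i5 is not None:
            parts.append(("Index 2 (i5)", self.index2_i5))
        parts += [
            ("Read 1 primer site", self.read1_site),
            ("Insert", self.insert),
            ("Read 2 primer site", self.read2_site),
            ("Index 1 (i7)", self.index1_i7),
            ("P7", self.p7),
        ]
        return parts

    def insert_in_range(self, service: str) -> bool:
        """Check the insert length against the card's window for a service."""
        low, high = IDEAL_INSERT_BP[service]
        return low <= len(self.insert) <= high

# Placeholder sequences (hypothetical, NOT real Illumina adapter sequences).
library = LibraryTemplate(
    p5="N" * 20, read1_site="N" * 30, insert="ACGT" * 80,
    read2_site="N" * 30, index1_i7="N" * 8, p7="N" * 20, index2_i5="N" * 8,
)
print([name for name, _ in library.layout()])
print(library.insert_in_range("WGS"), library.insert_in_range("RNA-Seq"))
```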

Question 22

Q

anatomy of a library

Answer

A

P5 & P7 ends of adapters bind to flow cell

DNA insert typically ranges 200-600bp (1kb)

different methods of indexing
- inline (part of the insert) - any level of multiplexing
- single index read (<96)
- dual index reads (384+)

Inline indexes are not part of our regular demultiplexing pipeline and will require an additional evaluation/charge.
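
To make "demultiplexing" concrete, here is a minimal sketch of assigning reads to samples by their index reads. It assumes exact matching on an (i7, i5) pair; the sample sheet, index sequences, and function name are all hypothetical, and production demultiplexers also handle mismatches and other details not shown here.

```python
from collections import defaultdict

def demultiplex(reads, sample_sheet):
    """Group read IDs by sample.

    reads: iterable of (read_id, i7, i5) tuples.
    sample_sheet: dict mapping (i7, i5) -> sample name.
    Reads whose index pair is not in the sheet go to 'undetermined'.
    """
    bins = defaultdict(list)
    for read_id, i7, i5 in reads:
        sample = sample_sheet.get((i7, i5), "undetermined")
        bins[sample].append(read_id)
    return dict(bins)

# Hypothetical dual-index sample sheet and reads.
sample_sheet = {("AAAACCCC", "GGGGTTTT"): "sample_A",
                ("CCCCAAAA", "TTTTGGGG"): "sample_B"}
reads = [
    ("r1", "AAAACCCC", "GGGGTTTT"),
    ("r2", "CCCCAAAA", "TTTTGGGG"),
    ("r3", "AAAACCCC", "TTTTGGGG"),  # mixed pair, e.g. from index hopping
]
print(demultiplex(reads, sample_sheet))
```

With unique dual indexes, a read carrying a mixed pair matches no sample and is set aside instead of being assigned to the wrong one.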

Question 23

Q

Cluster Generation

Answer

A

1 - attach DNA onto flow cell

2 - DNA folds over into bridge-like shape

3 - attach primer onto DNA

4 - Complementary (reverse) strand is made

5 - Bridge is denatured, leaving a reverse strand and a forward strand

6 - clonal copies of both forward and reverse strand in a cluster

Question 24

Q

Cluster Generation

Answer

A

When using the HiSeq platform, cluster generation happens on a separate machine, called the cBot.

When using the NovaSeq platform, cluster generation happens on the same machine as sequencing.

Question 25

Q

the importance of cluster density

Answer

A

Illumina reports an “optimal” cluster density for each platform

Picomolar (pM) amounts of library are loaded for sequencing

Accurate QC and quantification are essential
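
Quantification matters because the loading concentration is derived from it. The sketch below converts a measured concentration and fragment size into molarity, assuming the commonly used approximation of 660 g/mol per base pair of double-stranded DNA. The example values and the 200 pM target are hypothetical, not platform recommendations.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # common approximation for one dsDNA base pair

def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert ng/uL and mean fragment length (bp) to nanomolar (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * mean_fragment_bp)

def dilution_factor(stock_nm, target_pm):
    """Fold dilution needed to take a stock in nM down to a target in pM."""
    return stock_nm * 1000.0 / target_pm

# Hypothetical library: 6.6 ng/uL with a 500 bp mean fragment size.
stock_nm = library_molarity_nm(6.6, 500)
print(f"stock: {stock_nm:.1f} nM")
print(f"dilute {dilution_factor(stock_nm, 200.0):.0f}-fold to reach 200 pM")
```

A small error in either the concentration or the fragment size carries straight through to the loading amount, and from there to cluster density.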

Question 26

Q

2-dye vs 4-dye chemistry

Answer

A

Older Illumina platforms use a 4-channel chemistry. Newer platforms use a 2-channel chemistry.

Some researchers might want to use a 2-dye Chemistry vs. 4-dye Chemistry (“better accuracy”).
Not something we want to start discussing with clients, unless they bring it up.

Overall Theme: The more complicated you make the conversation, the more complicated the sale becomes

Question 27

Q

done!

Answer

A

Once sequencing is done, it is time to analyze the data! This can be done by Novogene, or our clients may have their own software/workflow (pipeline) in place.