sequencing only - part 1 & 2 (ppt + video notes) Flashcards

1
Q

Training Outline – Part 1/2

A

Library Construct

QC Steps – Qubit, qPCR, and Fragment Analyzer

Sample Requirements and Sample QC

Sequencing Platforms + Sequencing Workflows

Data Calculations

2
Q

library structure

A

5’ to 3’ end
in order: P5 oligo (required), index 2 (optional), read 1 primer (required), insert DNA (required), read 2 primer (required), index 1 (required), P7 oligo (required)

The order of the elements in the library should match the order shown in the picture above.
* If not, the client should provide the library structure for us to evaluate and assume all the risk.

P5 and P7 oligos are used for binding to the flow cell (FC).
 They should be exactly the same as our sequences:
P5 Oligo: AATGATACGGCGACCACCGAGATCTACAC
P7 Oligo: ATCTCGTATGCCGTCTTCTGCTTG
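
A minimal sketch of the "exactly the same as our sequences" check, assuming the library is written 5’ to 3’ as a plain string. The function name and the toy construct are hypothetical and only illustrate the idea; this is not a real QC tool.

```python
# P5 and P7 sequences as listed on this card.
P5_OLIGO = "AATGATACGGCGACCACCGAGATCTACAC"
P7_OLIGO = "ATCTCGTATGCCGTCTTCTGCTTG"


def has_flow_cell_ends(library_seq: str) -> bool:
    """Return True if the 5'->3' sequence starts with P5 and ends with P7."""
    seq = library_seq.upper()
    return seq.startswith(P5_OLIGO) and seq.endswith(P7_OLIGO)


# Hypothetical toy construct: the middle is a placeholder, not a real library.
toy_library = P5_OLIGO + "N" * 40 + P7_OLIGO
print(has_flow_cell_ends(toy_library))   # True
print(has_flow_cell_ends("ACGT" * 20))   # False
```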

  • So here’s a library construct; let’s go through it 5 prime to 3 prime.
  • It’s got directionality, and it also has different components.
  • So at the ends you’ll see we have the P5 and P7. These will bind to the flow cell.
  • 3:18
    And so these are standard sequences that all Illumina-compatible libraries will have.
  • 3:22
    If you don’t have the P5 and P7, your library won’t bind and it’ll get washed off in that first cycle.
  • 3:29
    Right adjacent to the P5 and P7 you have index 1 and index 2.
  • 3:33
    So index 1 is also commonly referred to as i7 because it’s adjacent to the P7.
  • 3:41
    Index 2 is commonly referred to as i5 because it’s right next to the P5.
  • 3:47
    Now with these high throughput sequencers, we have lots and lots of samples not only going on the flow cell, but the individual lanes.
  • 3:57
    Now, a client may occupy the entire lane, but they also may not have a data requirement that matches the entire output of a lane.
  • 4:07
    So they might buy a part of a lane.
  • 4:08
    And so you’re going to have either multiple client samples going on the same lane or a client that has multiple samples going on the same lane.
  • 4:16
    And so the only way to sort of piece out the different data sets from each respective sample is to have these barcodes or indexes attached.
  • 4:23
    So as long as you’ve got more than one sample going on a lane, you’re going to need an index 1 (i7); that’s why it’s required.
  • So you can have a sequence on one side, which is a single index library where you just have index 1 (i7), but you could also have a sequence on the opposite end, index 2 (i5).
  • 4:48
    And so by having more sequences or combinations available, you essentially have more options for how many samples can go on a lane, but there’s a little bit more about that in training part 3.
  • 4:57
    At this point, you need to have at least one index; that’s what we call a single index library.
  • 5:03
    If you’ve got both, they’re called dual index libraries.
  • 5:06
    And then within the dual index libraries or the dual indexing method, there’s a few variants that you can add as well.
  • 5:13
    And then right adjacent to that, you’ve got the read 1 primer and the read 2 primer.
  • 5:18
    So in order to attach these bases, you need that 3 prime OH (hydroxyl) end.

5:22
Those come in from the primers.

5:24
And so by having the sequencing primer binding sites, you can actually attach bases that are detected through fluorescence and know which base is added at which position.
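
To make the indexing idea concrete, here is a rough sketch of how reads from a shared lane could be sorted back to their samples by index. The index sequences, sample names, and the exact-match rule are all hypothetical assumptions for illustration; real demultiplexing software typically also tolerates mismatches.

```python
from collections import defaultdict

# Hypothetical sample sheet: (index 1 / i7, index 2 / i5) -> sample name.
# sample_B and sample_C share an i7 and are only separable by their i5.
SAMPLE_SHEET = {
    ("ATCACG", "AGGCTA"): "sample_A",
    ("CGATGT", "AGGCTA"): "sample_B",
    ("CGATGT", "GCCAAT"): "sample_C",
}


def demultiplex(reads, sample_sheet):
    """Group read IDs by sample using exact matches on the index pair."""
    bins = defaultdict(list)
    for read_id, i7, i5 in reads:
        sample = sample_sheet.get((i7, i5), "undetermined")
        bins[sample].append(read_id)
    return dict(bins)


# Hypothetical reads: (read ID, observed i7, observed i5).
reads = [
    ("read1", "ATCACG", "AGGCTA"),
    ("read2", "CGATGT", "GCCAAT"),
    ("read3", "CGATGT", "AGGCTA"),
    ("read4", "TTTTTT", "AGGCTA"),  # index pair not in the sheet
]
print(demultiplex(reads, SAMPLE_SHEET))
```

With a single index, sample_B and sample_C could not share a lane; the second index is what adds the extra combinations.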

3
Q

Sample QC Process

A

Library volume (μL): micropipette

Concentration (ng/μL): Qubit

Fragment size (bp): Fragment Analyzer

Molarity (nmol/L): qPCR

Workflow: sample receiving → Qubit → Fragment Analyzer → qPCR → library QC report

4
Q

qubit

A

A Qubit is used to obtain a concentration (ng/ul) by using a selective dye, which will bind to dsDNA (similar to Qubit assay used for WGS)

There are different kits used for quantification (based off what you would like to measure and amount available) – we use the Qubit™ dsDNA HS Kit.

This value gives us a reference point for comparison (will make more sense later in the presentation)

Keep in mind that the dye will not bind 100% specifically to dsDNA (no assay/technique is ever 100% perfect in science); there might be nonspecific binding, which could impact the concentration provided by the Qubit.

Overall, this is still a good assay/technique for measuring concentration for a specific nucleic acid/marker.

5
Q

Nanodrop vs Qubit

A

Some clients may not have access to a Qubit for quantification and may use a Nanodrop to obtain a concentration

Although this is not the most accurate – it is still acceptable for an initial check – keeping in mind, the Qubit derived concentration most likely will be lower

Why?
Nanodrop is a spectrophotometer, which will use the principle of absorbance: nucleic acids will have a peak absorbance at ~260nm.

Other contaminants within the sample may absorb near this wavelength, which can throw off the quantification

6
Q

Nanodrop vs Qubit

A

Acceptable for a ‘quick check’ but need to keep in mind, most likely higher than the actual concentration.

sample A:
- 10ul with nanodrop concentration: 10ng/ul
- 10ul with qubit concentration: 6ng/ul
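
A quick sketch of what the sample A numbers mean in total mass, using only the values on this card. The function name is hypothetical.

```python
def total_mass_ng(volume_ul: float, conc_ng_per_ul: float) -> float:
    """Total DNA mass in ng = volume (ul) x concentration (ng/ul)."""
    return volume_ul * conc_ng_per_ul


nanodrop_ng = total_mass_ng(10, 10)  # 100 ng according to the Nanodrop
qubit_ng = total_mass_ng(10, 6)      # 60 ng according to the Qubit

# How much higher the Nanodrop estimate is relative to the Qubit estimate.
overestimate = nanodrop_ng / qubit_ng
print(f"Nanodrop: {nanodrop_ng} ng, Qubit: {qubit_ng} ng")
print(f"Nanodrop reads {overestimate:.2f}x the Qubit value")
```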

7
Q

qPCR (quantitative PCR)

A

Based off a popular lab technique, PCR (polymerase chain reaction), but instead of just amplifying a particular target/amplicon, it has a quantitative goal.

Involves primers binding to specific targets within a product, which in turn yields a concentration

A great tool to measure the concentration of an Illumina library, which contains a highly specific/conserved sequence -> P5 and P7 regions.

Dyes (e.g. SYBR® Green) bind to the PCR product, providing a fluorescence signal for detection by the machine. The cycle at which the signal crosses the threshold (the Ct value) can be compared against a standard curve to yield concentrations.
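
The standard-curve step can be sketched as a straight-line fit of Ct against log10(concentration), followed by reading an unknown sample off that line. The standard concentrations and Ct values below are made-up, idealized numbers, not the values of any particular kit, and the function names are hypothetical.

```python
import math


def fit_standard_curve(concs, cts):
    """Least-squares fit of Ct = slope * log10(conc) + intercept."""
    xs = [math.log10(c) for c in concs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def conc_from_ct(ct, slope, intercept):
    """Invert the fitted line to get a concentration from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standards: ten-fold dilutions with idealized Ct values
# (each ten-fold dilution shifts Ct by 3.32 cycles in this toy data).
STANDARD_CONCS = [20.0, 2.0, 0.2, 0.02]
STANDARD_CTS = [8.0, 11.32, 14.64, 17.96]

slope, intercept = fit_standard_curve(STANDARD_CONCS, STANDARD_CTS)
unknown = conc_from_ct(13.0, slope, intercept)
print(f"slope={slope:.2f}, unknown sample ~ {unknown:.3f} (same units as standards)")
```

A lower Ct means the threshold was reached earlier, which corresponds to more starting template.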

8
Q

qPCR (quantitative PCR)

A

Kit we use for qPCR: KAPA Library Quantification Kit

qPCR is the ‘gold standard’ for quantifying and pooling libraries; however,
it requires special equipment and time (qPCR set-up + run can take >1 hour.)

Most clients are not able to quantify their libraries through qPCR (too expensive and too time-consuming) -> so they may resort to Qubit

At the end -> qPCR will allow us to obtain a molarity in nanomolar (nM)

9
Q

Official Library QC Report - Example

A

lib qubit (ng/ul) - 1.86

qpcr mol (nmol/L) - 13.22

10
Q

Quantifying Libraries: Method #1

A

Fragment Analyzer (or equivalent) and Qubit

(concentration in ng/ul) / (660 g/mol x average library size in bp) x 10^6 = concentration in nM

If the concentration in ng/ul is inflated, the calculated nM will be inflated by the same proportion.

Easier/Quick – but not as accurate
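
The conversion formula on this card can be written as a small function. The 450bp average size in the example is an assumed value for illustration; in practice it would come from the Fragment Analyzer (or equivalent) trace.

```python
def ng_per_ul_to_nm(conc_ng_per_ul: float, avg_size_bp: float,
                    g_per_mol_per_bp: float = 660.0) -> float:
    """Convert a dsDNA concentration in ng/ul to molarity in nM.

    nM = (ng/ul) / (660 g/mol per bp x average library size in bp) x 10^6
    """
    return conc_ng_per_ul / (g_per_mol_per_bp * avg_size_bp) * 1e6


# Assumed example: 3 ng/ul library with a 450bp average size.
print(round(ng_per_ul_to_nm(3.0, 450), 2))  # 10.1

# An inflated ng/ul reading inflates the nM by the same factor.
print(round(ng_per_ul_to_nm(6.0, 450), 2))  # 20.2
```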

11
Q

Quantifying Libraries: Method #2

A

Kit we use for qPCR: KAPA Library Quantification Kit

At the end -> qPCR will allow us to obtain a molarity in nanomolar (nM)

‘Gold Standard’ but requires special equipment/more time

12
Q

qPCR and Qubit

A

Purpose: To understand amount of library that is present (can understand a little about the library quality)

The Qubit will tell us how much ‘DNA’ is present in the sample but not all DNA is going to be library
(if a poor library construction is done, there might be DNA fragments that are not adapter ligated.)

The qPCR will use the P5/P7 regions, which are specific to an Illumina library, to understand the amount of library present.

Comparing the qPCR and Qubit can give us an idea of if the sample we have received is mainly Illumina library (good) or fragments that don’t have the adapters ligated (which is bad).

It is important to load the appropriate amount of library on a sequencer, as this will impact cluster generation and ultimately the amount of data produced.

13
Q

qPCR and Qubit

A

After converting Qubit to nM (using conversion formula)

Qubit > qPCR: some library fragments may not have adapter ligated.

Qubit < qPCR: there might be library fragments that are single-stranded due to accidental or intentional denaturation. Our Qubit assay only measures double stranded DNA (due to the kit we use).

Qubit ~ qPCR: sample present is mainly library
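
The three cases above can be sketched as a simple comparison of the two nM values. The 20% tolerance used to decide "roughly equal" is a hypothetical threshold chosen for illustration; the card does not define one.

```python
def compare_qubit_qpcr(qubit_nm: float, qpcr_nm: float,
                       tolerance: float = 0.2) -> str:
    """Interpret converted Qubit (nM) against qPCR (nM).

    `tolerance` is a hypothetical cutoff for treating the values as similar.
    """
    if qpcr_nm <= 0:
        raise ValueError("qPCR molarity must be positive")
    ratio = qubit_nm / qpcr_nm
    if ratio > 1 + tolerance:
        return "Qubit > qPCR: some fragments may not have adapters ligated"
    if ratio < 1 - tolerance:
        return "Qubit < qPCR: some library may be single-stranded"
    return "Qubit ~ qPCR: sample is mainly library"


print(compare_qubit_qpcr(4.2, 0.8))
print(compare_qubit_qpcr(5.0, 9.0))
print(compare_qubit_qpcr(10.0, 10.5))
```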

14
Q

Why Quantification is Important – Method #1 (Qubit)

A

Client did a QC on their end using Qubit + Bioanalyzer; visually we can see there is other “stuff” in the tube, which can be primers, leftover reagents, etc., and which might cause the Qubit to give a higher reading

Volume: 10ul
Qubit: 3ng/ul
Converted Qubit -> 4.2 nM

15
Q

Why Quantification is Important – Method #2 (qPCR)

A

The sample was submitted to Novogene – below are some of the QC metrics:

Volume: 10ul
Qubit: 3ng/ul
qPCR: 0.8nM

16
Q

Why Quantification is Important – Comparison

A

Same sample but different QC metrics,
based off what quantification
method was used

Volume: 10ul
Qubit: 3ng/ul
Converted Qubit -> 4.2 nM

Volume: 10ul
Qubit: 3ng/ul
qPCR: 0.8nM
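
To put a number on the gap between the two methods, here is a small sketch using only the values on this card. The back-calculated average size is an inference from the 3 ng/ul and 4.2 nM figures via the 660 g/mol conversion formula; the card itself does not state the fragment size.

```python
def implied_size_bp(conc_ng_per_ul: float, molarity_nm: float,
                    g_per_mol_per_bp: float = 660.0) -> float:
    """Average size (bp) implied by a ng/ul value and its converted nM."""
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * molarity_nm)


converted_qubit_nm = 4.2  # Method #1 result from the card
qpcr_nm = 0.8             # Method #2 result from the card

# How many times higher the Qubit-derived molarity is than the qPCR value.
fold_difference = converted_qubit_nm / qpcr_nm
print(f"Qubit-derived molarity is {fold_difference:.2f}x the qPCR value")

# Inferred, not stated on the card: the size that makes 3 ng/ul equal 4.2 nM.
print(f"Implied average size: ~{implied_size_bp(3.0, 4.2):.0f}bp")
```

If the tube were loaded based on the 4.2 nM figure, the amount of sequenceable library would be overestimated roughly five-fold.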

17
Q

you have created your library…

A

but how do you know it has the right insert size? How do you check if the library was actually generated? How do you determine if you have dimers or other contaminants present within the sample?

18
Q

Bioanalyzer and TapeStation: similar pieces of equipment, but they use different methodologies to obtain the final result

A

May not be readily accessible to clients, but helpful in understanding what you have produced at the end

19
Q

Fragment Analyzer - what novogene uses

A

Very similar to the Bioanalyzer, but the Fragment Analyzer allows for more high-throughput processing (up to 96 samples), versus the Bioanalyzer's up to 12 samples per run. The FA allows for integration with automation as well.

20
Q

What is a Fragment Analyzer (FA)?

A

As the name implies, it will analyze the fragments present within a sample

Similar to running a sample on a gel (electrophoresis) but done more rapidly, taking into account the # of samples being processed

The returned graph (sometimes referred to as a trace) is clearer and more comprehensible for our analysis when determining the quality/profile of a library

The FA can produce ‘computer generated’ gels for review.

21
Q

Fragment Analyzer

A

Purpose: To understand the fragment size and distribution of the library provided by the client (as well as general quality)

The sample length is not just the insert itself but
includes the length of the adapters as well.

What we are looking for is generally a single, bell-curve-shaped peak with an insert around 300-350bp, but in reality most libraries don’t follow this profile and, as a result, their samples do not pass our QC (this is not a bad thing)

22
Q

frag analyzer

A

Some libraries have a unique trace profile, but that does not mean the library is bad/poor quality.

Ignoring the adapter dimer, which we will talk about shortly, this is the profile expected of a 10X scATAC library. Multimodal peaks will cause the library to not pass QC

23
Q

Adapter Dimers

A

Sometimes, during the library preparation process (either due to low quality samples, low starting material, or skipping a bead clean-up), you may see an adapter dimer on the trace.

These dimer peaks are a result of the adapter regions (see below) having homology (creating the dimer) and generally show up between 120bp and 160bp.

Dimers can be problematic, as they contain FC binding sequences (P5/P7) but no actual insert; in addition, as they are shorter than the target product, they tend to cluster more efficiently on the FC

If possible, removing the dimers is recommended – this can be offered by Novogene through a library purification service.

This involves using AMPure XP Beads to remove fragments smaller than ~200bp

Risk: Generally, involves 50% sample loss through the process
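
As a rough sketch, the 120bp to 160bp dimer window and the ~50% clean-up loss mentioned above could be applied to a list of peak sizes like this. The peak list, the starting amount, and the function names are hypothetical; real traces need visual review.

```python
# Size window where adapter dimers generally show up, per this card.
DIMER_RANGE_BP = (120, 160)


def find_dimer_peaks(peak_sizes_bp):
    """Return the peaks that fall inside the adapter-dimer size window."""
    low, high = DIMER_RANGE_BP
    return [size for size in peak_sizes_bp if low <= size <= high]


def expected_after_cleanup(amount: float, loss_fraction: float = 0.5) -> float:
    """Amount left after bead clean-up, assuming ~50% loss by default."""
    return amount * (1 - loss_fraction)


# Hypothetical trace with a dimer peak at 135bp and a library peak at 450bp.
peaks = [135, 450]
print(find_dimer_peaks(peaks))        # [135]

# Hypothetical 30 ng of library going into the purification.
print(expected_after_cleanup(30.0))   # 15.0
```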

24
Q

Unqualified Library Treatment

A

abnormality
- adaptor/primer contamination
- smaller insert size
- broad peak/multimodal peaks
- qPCR verified concentration <3nM

potential risks
- cluster generation abnormal
- reads with adaptor sequences
- low data output

suggested solution
- AMPure beads
- re-do lib prep
- AMPure beads or re-do lib prep
- concentration

25
Q

sample requirements - seq. only

A
26
Q

Sample QC Results

A

Project Coordinator or CSS will provide QC reports to clients before moving forward officially with the sequencing

Pass: Samples met our guidelines and will be guaranteed on the output requested

Hold: Samples do not meet our guidelines but historically speaking, the risk to move forward with these samples is low
The output variation, if any, seems to be negligible or within range (most libraries will fall into this category)

Fail: Libraries show a profile that is of concern (small fragment content, low amounts, etc.)
This profile might be a characteristic of the library itself

27
Q

Sample QC Results

A

Unlike the pass/hold/fail we assign to nucleic acid projects (i.e. RNA or DNA), a hold/fail for pre-made libraries is not as risky, since our guidelines are based on the ‘picture perfect’ library to put on the sequencer. Because the bar is set that high, some clients get a little worried/concerned when they see their libraries are put on hold/fail.

Educate your clients, especially new ones, during the sales cycle:

Does the Fragment Analyzer profile match what you are expecting for your library type?
Is there enough library to put on the sequencer to achieve the desired output?

28
Q

Sequencing Options

A

NovaSeq X Plus – 10B/25B – Full Lanes
NovaSeq X Plus – Partial Lane Sequencing

29
Q

Flow Cell Types - NovaSeq X Plus

A

The NovaSeq X Plus has 3 flow cell types – 1.5B, 10B, and 25B.

There are different configurations available within each flow cell type

The name of the FC indicates the # of reads that can be produced off the entire FC
- The 10B FC produces 10B paired reads

We have the 10B and 25B FCs available
Output on the 10B is ~375Gb per lane (~1.25B paired reads)
Output on the 25B is ~950Gb per lane (~3.1B paired reads)
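
The per-lane figures follow from paired reads x 2 reads per pair x read length. A small sketch, assuming PE150 and using ~3.125B paired reads per 25B lane (the unrounded value behind the "~3.1B" above):

```python
def lane_output_gb(paired_reads_billions: float, read_length_bp: int = 150) -> float:
    """Gb per lane = paired reads (billions) x 2 reads per pair x read length."""
    return paired_reads_billions * 2 * read_length_bp


def lanes_per_flow_cell(total_paired_reads_billions: float,
                        per_lane_paired_reads_billions: float) -> float:
    """Number of lanes implied by the flow cell total and the per-lane reads."""
    return total_paired_reads_billions / per_lane_paired_reads_billions


print(lane_output_gb(1.25))    # 375.0 Gb per 10B lane
print(lane_output_gb(3.125))   # 937.5 Gb per 25B lane, quoted as ~950Gb

print(lanes_per_flow_cell(10, 1.25))    # 8.0
print(lanes_per_flow_cell(25, 3.125))   # 8.0
```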

30
Q

NovaSeq X Plus Platform – Full Lanes

A

The newest (high throughput) sequencer, available at Novogene (China and Davis.) This sequencer uses a 2-color chemistry (combination of two dyes will produce signals for 4 bases)

The hypothetical output on the NovaSeq X Plus 10B (PE150) lane is 375Gb (or ~1.25B paired reads) and NovaSeq X Plus 25B (PE150) lane is 950Gb (or ~3.125B paired reads)
Important to know that this is hypothetical output and depending on the library type and quality (main two factors), the output could be more reads or less reads.
Illumina states the output on the 25B is ~937.5Gb per lane; we initially rounded to ~950Gb, but on the quotes you will see 1,000Gb (this flow cell tends to overproduce; however, it is not consistent and, of course, depends on the library type/quality).

The cost for the 10B lane is $1,799 and the 25B lane is $3,149 – both can come down with volumes.
Cost per base/read is lower on the 25B than on the 10B, assuming you can fill up most of the lane.
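
A quick check of the cost-per-Gb claim, using the list prices and hypothetical lane outputs on this card (before any volume discounts). The function name is hypothetical.

```python
def cost_per_gb(lane_price_usd: float, lane_output_gb: float) -> float:
    """List price of a lane divided by its hypothetical output in Gb."""
    return lane_price_usd / lane_output_gb


cost_10b = cost_per_gb(1799, 375)
cost_25b = cost_per_gb(3149, 950)

print(f"10B lane: ${cost_10b:.2f} per Gb")   # $4.80 per Gb
print(f"25B lane: ${cost_25b:.2f} per Gb")   # $3.31 per Gb
```

The advantage only holds if the lane is mostly filled; a half-used 25B lane costs more per Gb than a full 10B lane.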

31
Q

NovaSeq X Plus Platform – Partial Lane

A

All characteristics of the NovaSeq full lanes will apply (since it is the same sequencer), but this workflow allows you to buy a percentage/part of a lane at a discounted rate, without having to worry about filling up the lane.

Important to note that these are going to be hypothetical outputs based off pooling, but the same principle applies as in any sequencing project (the type of library and the quality of the library are the driving factors for how well the library clusters on the FC and ultimately how much data is produced.)

A good alternative when you cannot fill up an entire lane, have libraries with duplicate indices, or do not have enough diversity (within the index sequences – more on this later).

Client will have to purchase “data buckets”; with the minimum amount being 50Gb (and no upper limit, although depending on the circumstances, it may make sense to buy a full lane)
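
Duplicate indices are one reason libraries cannot share a lane. A minimal sketch of spotting them in a submission list; the sample names and index sequences are hypothetical.

```python
from collections import Counter


def find_duplicate_indexes(samples):
    """Return sorted names of samples whose (i7, i5) pair is used more than once."""
    counts = Counter((i7, i5) for _, i7, i5 in samples)
    duplicated = {pair for pair, n in counts.items() if n > 1}
    return sorted(name for name, i7, i5 in samples if (i7, i5) in duplicated)


# Hypothetical submission: (sample name, index 1 / i7, index 2 / i5).
submission = [
    ("lib1", "ATCACG", "AGGCTA"),
    ("lib2", "CGATGT", "AGGCTA"),
    ("lib3", "ATCACG", "AGGCTA"),  # same pair as lib1
]
print(find_duplicate_indexes(submission))  # ['lib1', 'lib3']
```

Libraries flagged this way would have to go on different lanes or in different pools, since their reads could not be told apart.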

32
Q

NovaSeq X Plus – Partial Lane – Pricing Structure

A
33
Q

NovaSeq X Plus – Partial Lane - Disclaimer

A

One thing I wanted to bring up is when we quote for the data amounts for our partial lane service – it is based off the ratio of your samples in the final pool. As you might know, the library fragment size/protocol, quality, as well as other factors, do influence the output so we might see some variations in the final output (either higher or lower than expected). We will provide all the data that correlates to your indexes.

On that note, it will be important to provide your indexes accurately and in the correct orientation. You will have an opportunity to review your indexes with the technical specialist during the sample submission process.
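
Two small sketches related to this disclaimer: the expected data per library when output is split by pool ratio, and the reverse complement of an index, which is what "orientation" comes down to. The ratios, the 200Gb purchase, and the index sequence are hypothetical, and actual output will vary as described above.

```python
def expected_gb_per_sample(pool_ratios: dict, total_gb: float) -> dict:
    """Split the purchased data amount according to each library's pool ratio."""
    total_ratio = sum(pool_ratios.values())
    return {name: total_gb * ratio / total_ratio
            for name, ratio in pool_ratios.items()}


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(index_seq: str) -> str:
    """Return the reverse complement of an index sequence."""
    return index_seq.upper().translate(_COMPLEMENT)[::-1]


# Hypothetical pool: lib3 is pooled at twice the ratio of lib1 and lib2.
print(expected_gb_per_sample({"lib1": 1, "lib2": 1, "lib3": 2}, 200))
# {'lib1': 50.0, 'lib2': 50.0, 'lib3': 100.0}

# The same hypothetical index written in the opposite orientation.
print(reverse_complement("AGGCTA"))  # TAGCCT
```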
34
Q

Data Calculations

A

The following calculations will hold true only when the sequencing strategy is PE150.

1Gb of data is ~3.3M paired reads (or 6.6M total reads, combining R1 and R2)

# of Gb = # of paired reads (in Millions) x 0.3 (conversion factor for PE150 projects)

Ex) Client would like 240Gb of data, how many reads (hypothetical) would he/she receive?

# of Gb = # of paired reads (in Millions) x 0.3 (conversion factor for PE150 projects)
240Gb/0.3 = # of reads
# of reads = 800M paired reads (or 1.6B total reads)
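
The PE150 conversion can be sketched as two helper functions. The 0.3 factor comes from 2 reads per pair x 150bp = 300bp per pair, i.e. 0.3Gb per million pairs; the function names are hypothetical.

```python
# PE150: each pair is 2 x 150 = 300 bases, so 1M pairs = 0.3Gb.
GB_PER_MILLION_PAIRED_READS = 2 * 150 / 1000


def gb_to_paired_reads_m(gb: float) -> float:
    """Millions of paired reads for a given amount of data in Gb (PE150 only)."""
    return gb / GB_PER_MILLION_PAIRED_READS


def paired_reads_m_to_gb(paired_reads_m: float) -> float:
    """Gb of data for a given number of paired reads in millions (PE150 only)."""
    return paired_reads_m * GB_PER_MILLION_PAIRED_READS


pairs_m = gb_to_paired_reads_m(240)
print(f"{pairs_m:.0f}M paired reads, {2 * pairs_m:.0f}M total reads")
# 800M paired reads, 1600M total reads

print(f"1Gb is ~{gb_to_paired_reads_m(1):.1f}M paired reads")
# 1Gb is ~3.3M paired reads
```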
