Chapter 2 - Next Generation Sequencing Flashcards

Question

NGS produces a fixed number of reads from a pool of DNA fragments. What kind of information can this provide?

Answer 1

Information about biological conditions, by comparing read counts across those conditions

Answer 2

Comparing the number of copies of each mRNA transcript by sequencing cDNA from different cell types or samples (PCR is performed before sequencing)

Answer 3

Overrepresentation of some sequences > affects the final library and the quantification of DNA/RNA abundance

Answer 4

Short random nucleotide sequences that can be used as an absolute counting method (unique for each molecule, e.g. DNA, protein)

Answer 5

Each molecule in the population is made unique by adding a UMI before PCR and sequencing > UMIs end up in the library > molecular memory of the number of molecules in the starting sample > count each UMI once to avoid PCR bias > distinguish identical copies produced by PCR amplification from separate molecules >> determine PCR errors
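A minimal sketch of UMI-based counting, assuming each read has already been reduced to a (gene, UMI) pair. The function name `count_molecules`, the gene labels, and the 4 nt UMIs are hypothetical and chosen only for brevity; sequencing errors inside the UMI itself are ignored here.

```python
from collections import defaultdict

def count_molecules(reads):
    """Count distinct UMIs per gene; PCR duplicates share a UMI and collapse to one."""
    umis_per_gene = defaultdict(set)
    for gene, umi in reads:
        umis_per_gene[gene].add(umi)
    return {gene: len(umis) for gene, umis in umis_per_gene.items()}

# Illustrative reads: geneA has two original molecules, one of them amplified 3x.
reads = [
    ("geneA", "ACGT"), ("geneA", "ACGT"), ("geneA", "ACGT"),
    ("geneA", "TTGC"),
    ("geneB", "GGCA"), ("geneB", "GGCA"),
]
molecule_counts = count_molecules(reads)
print(molecule_counts)  # {'geneA': 2, 'geneB': 1}
```

Counting raw reads would give 4 for geneA and 2 for geneB; counting UMIs recovers the number of starting molecules.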

Answer 6

RNA-seq > improved detection of low-frequency molecules

Answer 7

UMIs are 22 nt long > 4^22 possible UMIs
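To give a sense of scale, the size of the UMI space can be computed directly. This assumes all four bases are allowed at every one of the 22 positions.

```python
UMI_LENGTH = 22  # nt, as stated on the card
possible_umis = 4 ** UMI_LENGTH  # four possible bases per position
print(possible_umis)  # 17592186044416, roughly 1.8 x 10^13
```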

Answer 8

1. Extension of primers carrying UMIs and P5/P7 adapters, to produce barcoded libraries for Illumina
2. PCR, 2 cycles > P5/P7-tagged amplicons including the UMI (the adapters bind to the flow cell)
3. Final library amplification (second PCR)
4. Illumina sequencing, yielding Illumina reads

Answer 9

Nucleotides measured incorrectly by the detector

Answer 10

- During library prep: unintended ligation or other polymerization errors
- During sequencing (more common): chance of incorporating a wrong nucleotide or no nucleotide > degradation of the light signal as molecules go out of sync

Answer 11

The sequencing error rate > read length stays at or below 2 x 300 nt

Answer 12

Estimation of likelihood of sequencing error depending on light signal

Answer 13

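0.8% > 8 in 1000 bases > a single error is indistinguishable from a SNP > overcome the problem by increasing the number of reads > with 8 reads, the chance that all 8 carry an error at the same position is 0.008^8, about 1.7 x 10^-17 (assuming independent errors)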
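The effect of extra reads can be checked numerically. This sketch assumes errors are independent between reads and asks how likely it is that every read covering a position is wrong there; the variable names are illustrative.

```python
per_read_error = 0.008  # 0.8% error rate per base, from the card
n_reads = 8             # reads covering the same position

# Probability that all reads carry an error at this position (independence assumed).
all_wrong = per_read_error ** n_reads
print(f"{all_wrong:.2e}")  # 1.68e-17
```

A true SNP shows up in every read from that allele, while a random error is very unlikely to recur in all of them, which is why depth helps separate the two.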

Answer 14

Feature              | NGS                                     | Sanger
Library construction | Reads from fragment libraries           | Cloning and amplification
Parallelism          | Millions of reads processed in parallel | 96 reads at a time (gel electrophoresis)
Read length          | 50-300 nt                               | Up to 1000 nt
Accuracy             | 85-99%                                  | 99.999%
Cost                 | $0.0002 per kb                          | $0.50 per kb

Answer 15

Identifying structural variants, repetitive elements, copy-number alterations

Answer 16

Longer than 1 kb > useful for transcriptomic research > covers entire mRNA transcripts > removes ambiguity in the positions or sizes of genomic elements

Answer 17

Newer sequencing approaches such as nanopores

Answer 18

- Single-molecule real-time sequencing
- Synthetic approaches

Answer 19

- Does not rely on a clonal population of amplified DNA
- Asynchronous signal detection: fluorescent signal during polymerization of single DNA molecules
- No sequencing cycles > every signal from every molecule is captured on its own > no limit to read length other than the availability of nucleotides
- Higher error rate than short reads (weaker signal from a single molecule and chance of sequencer error)

Answer 20

Sequence the same piece of DNA multiple times: but this reduces throughput
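As a rough illustration of why repeated passes help, the sketch below takes a per-position majority vote. It assumes the passes are already aligned and of equal length, which real long-read data is not; `consensus` and the example sequences are hypothetical.

```python
from collections import Counter

def consensus(reads):
    """Majority base at each position across equal-length reads of one molecule."""
    return "".join(
        Counter(column).most_common(1)[0][0]  # most frequent base in this column
        for column in zip(*reads)
    )

# Four passes over the same molecule; two passes each contain one random error.
passes = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTTC"]
consensus_seq = consensus(passes)
print(consensus_seq)  # ACGTAC
```

Each extra pass spends sequencing capacity on a molecule that has already been read, which is the throughput cost the card mentions.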

Answer 21

Membrane with many pores, through which the DNA strand is pulled.
- Electrical potential across the membrane
- Measures the current flux through the pores, which is specific to the sequence
- Recognizes short DNA sequences
- No limit to the length of the DNA molecule except the mechanical stability of the DNA
- Base-calling is harder and error rates are higher than with light-based base-calling

Answer 22

- Data conversion
- Quality score and trimming
- Sequence alignment / mapping
- Coverage / read depth

Answer 23

Q = -10 * log10(p)
Q: Phred score
p: probability of a sequencing error

Answer 24

10 > 1 in 10 incorrect base calls > 90% accuracy
40 > 1 in 10,000 incorrect base calls > 99.99% accuracy
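The two example scores follow from the standard Phred definition, Q = -10 * log10(p). A small sketch of the conversion in both directions, with illustrative function names:

```python
import math

def phred_to_error_prob(q):
    """Error probability p for a Phred score Q, from Q = -10 * log10(p)."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred score Q for an error probability p."""
    return -10 * math.log10(p)

for q in (10, 20, 30, 40):
    p = phred_to_error_prob(q)
    print(f"Q{q}: error probability {p:g}, accuracy {(1 - p) * 100:g}%")
```

Every 10 points of Q corresponds to a tenfold drop in the error probability.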

Answer 25

The quality decreases the further downstream (towards 3'-end) in the read

Answer 26

The coverage will decrease (read depth at certain nucleotides)

Answer 27

So that the largest number of matches is made in the nucleotide/amino acid sequence

Answer 28

- Unique
- Non-unique
- Low confidence: unique but low quality score, lots of mismatches
- No alignment

Answer 29

Unique alignments

Answer 30

Low complexity region, pseudogene, repetitive region

Answer 31

The (average) number of reads representing a given nucleotide in the reconstructed/reference sequence

Answer 32

Increased reliability of the results, such as the identification of SNPs

Answer 33

Percentage of the DNA/RNA covered by all the reads
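Read depth (reads per position) and the percentage of the sequence covered can both be derived from aligned read positions. A minimal sketch, assuming each alignment is given as a hypothetical (start, length) pair on a short reference:

```python
def read_depth(reference_length, alignments):
    """Number of reads covering each reference position (0-based starts)."""
    depth = [0] * reference_length
    for start, length in alignments:
        for pos in range(start, min(start + length, reference_length)):
            depth[pos] += 1
    return depth

# Illustrative data: three reads aligned to a 10 nt reference.
depth = read_depth(10, [(0, 5), (3, 5), (4, 4)])
percent_covered = 100 * sum(1 for d in depth if d > 0) / len(depth)

print(depth)            # [1, 1, 1, 2, 3, 2, 2, 2, 0, 0]
print(percent_covered)  # 80.0
```

The last two positions have depth 0, so 80% of the reference is covered even though some positions are covered three times.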

Answer 34

N*L/G
N: number of reads
L: average read length
G: length of the original genome
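A worked example of the formula, with made-up numbers (one million 150 nt reads on a 5 Mb genome) chosen only to illustrate the arithmetic:

```python
def average_coverage(n_reads, read_length, genome_length):
    """Average coverage N*L/G."""
    return n_reads * read_length / genome_length

example = average_coverage(n_reads=1_000_000, read_length=150, genome_length=5_000_000)
print(example)  # 30.0, i.e. 30x average coverage
```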

Answer 35

Because less DNA is needed for analysis

Answer 36

- Single-cell processes in cancer: transformation, clonal evolution, metastasis, chemoresistance (transcriptomics analysis)
- Study of micro-organisms that cannot be cultured directly (microbiome)

Reader Ch.2 (61 cards)