Next Generation Sequencing (NGS) - Euskirchen Flashcards

Question 1

Q

list the steps in Sanger sequencing

Answer

A

reaction mixture - contains template DNA, primer, DNA polymerase, normal dNTPs and fluorescently labelled terminators (ddNTPs)
Primer elongation and chain termination -
a. the polymerase elongates the primer with dNTPs until a fluorescent ddNTP is incorporated; the ddNTP lacks a 3'-OH group, so elongation stops and the labelled ddNTP is always the last base of the fragment
b. because termination happens at a random position in each copy, the reaction yields fragments of every possible length, each ending in the labelled ddNTP that corresponds to the base at that position
Capillary gel electrophoresis - the fluorescent strands are pulled through a gel-filled glass capillary –> gel electrophoresis separates the DNA fragments by size
Laser detection of fluorescence and computational analysis - an optical laser detector reads out the fluorescence –> results in peaks in different colours which correspond to the DNA sequence
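
The chain-termination idea can be sketched as a small simulation. This is a toy model, not a real base caller: the function names, the number of template copies and the ddNTP fraction are assumptions made for illustration, and strand direction is ignored for simplicity.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sanger_fragments(template, n_copies=2000, dd_fraction=0.2, seed=0):
    """Simulate chain termination on many copies of one template.

    At each position the polymerase adds either a normal dNTP (extension
    continues) or, with probability dd_fraction, a labelled ddNTP
    (extension stops). Returns {fragment_length: labelled_last_base}.
    All numbers here are illustrative assumptions.
    """
    rng = random.Random(seed)
    fragments = {}
    for _ in range(n_copies):
        for length, base in enumerate(template, start=1):
            if rng.random() < dd_fraction:
                # ddNTP incorporated: this copy cannot be elongated further.
                fragments[length] = COMPLEMENT[base]
                break
    return fragments

def read_sequence(fragments):
    """Electrophoresis orders fragments by length; the dye on the last base
    of each fragment spells out the synthesised strand."""
    return "".join(fragments[length] for length in sorted(fragments))

print(read_sequence(sanger_fragments("TACGGATC")))
```

Sorting the fragments by length plays the role of the capillary, and reading the labelled last base of each fragment plays the role of the laser detector.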

Question 2

Q

in Sanger sequencing it is not possible to fill up the entire DNA sequence in one go by different terminators, so one has to incorporate 1 labelled ddNTP at a time
T/F

Answer

A

FALSE!!
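In Sanger sequencing, all four differently labelled ddNTPs are present in one reaction together with normal dNTPs. A ddNTP is incorporated at the last position of a fragment and terminates it; because this happens at every position across the many template copies, the whole sequence can be read from the resulting fragments in a single run.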

Question 3

Q

what are the steps of Illumina sequencing?

Answer

A

A. library preparation -
1. fragmentation - preparation of DNA by fragmenting the chromosomes into small fragments (75-100 bp)
2. adapter ligation - adapter sequences are added to the ends of the DNA fragments, so that all fragments carry the same known sequence at their ends
3. PCR amplification - the adapter-ligated fragments are amplified. The library is then poured over a glass slide (flow cell) coated with immobilised primers that are complementary to the adapters, so the adapters bind to the immobilised primers

B. cluster generation -
4. bridge amplification - adapters bound to primers –> polymerase makes a double strand from the single-stranded DNA. The strand can only bend over to a neighbouring primer, so repeated rounds form a cluster of identical copies around each bound fragment.

C. sequencing by synthesis -
5. priming - a sequencing primer binds to the adapter, so the complementary strand can be synthesised and read
6. synthesis with fluorescence - incorporation of a fluorescently labelled terminator nucleotide to determine the first base. The fluorescently labelled nucleotide is always the last base on the complementary strand –> identification of the sequence
7. imaging - the slide is imaged from above; each cluster lights up in the colour of the base it has just incorporated
8. cleavage of terminators - the label and the terminator are cleaved, which makes the termination reversible –> the next cycle can add the next base
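
The cycle in part C (incorporate, image, cleave) can be sketched for a single cluster as follows. This is a simplified model with hypothetical names; real instruments image millions of clusters in parallel and call bases from fluorescence intensities.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sequencing_by_synthesis(template, n_cycles):
    """Return the bases called for one cluster, one base per cycle."""
    calls = []
    for cycle in range(min(n_cycles, len(template))):
        # Synthesis: exactly one blocked, labelled nucleotide is added.
        incorporated = COMPLEMENT[template[cycle]]
        # Imaging: the colour of the cluster identifies the base.
        calls.append(incorporated)
        # Cleavage: label and block are removed, so the next cycle can extend.
    return "".join(calls)

print(sequencing_by_synthesis("GATTACA", n_cycles=5))  # prints CTAAT
```

The number of cycles sets the read length, which is why the reads are short.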

Question 4

Q

Illumina sequencing…

a. uses reversible terminators and is thus quite revolutionary
b. is called short-read sequencing
c. is time-efficient
d. allows for generation of large data sets in a single experiment
e. all of the above

Answer

A

e. all of the above

Question 5

Q

which enzyme is used for bridge amplification in Illumina sequencing?

Answer

A

DNA polymerase

Question 6

Q

what is the principle of nanopore sequencing?

Answer

A

a protein forms a pore which is inserted into a membrane
the pore is only big enough for a single DNA strand
the pore is equipped with a motor protein which translocates the DNA through the pore
voltage applied across the membrane –> ion flow through the pore

–> depending on the size of the molecule in the pore, the amount of current that can flow varies

measure the current across the pore –> modulated by the identity of the bases at the sensing region
the ion current changes over time as the DNA is translocated, every time there is a different base at the sensing region

Question 7

Q

what is the coverage track? how is it calculated?

Answer

A

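Coverage is the % of a given genome covered by sequenced data, i.e. the % of positions of a known reference sequence that are covered by at least one read. The coverage track shows this along the genome: for every reference coordinate, it counts the aligned reads that overlap that position.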
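
A minimal sketch of how a coverage track could be computed, assuming the reads are already aligned and given as 0-based, half-open (start, end) intervals on the reference. The function names and the toy reads are made up for illustration.

```python
def coverage_track(reads, genome_length):
    """Count, for every reference position, the reads overlapping it."""
    track = [0] * genome_length
    for start, end in reads:
        for pos in range(max(start, 0), min(end, genome_length)):
            track[pos] += 1
    return track

def percent_covered(track):
    """% of reference positions covered by at least one read."""
    covered = sum(1 for depth in track if depth > 0)
    return 100.0 * covered / len(track)

reads = [(0, 4), (2, 6), (8, 10)]      # hypothetical alignments
track = coverage_track(reads, genome_length=10)
print(track)                  # [1, 1, 2, 2, 1, 1, 0, 0, 1, 1]
print(percent_covered(track)) # 80.0
```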

Question 8

Q

how could deletion be detected in WGS?

Answer

A

genome viewer - software that shows the raw data (reads) and their alignment to a reference genome along the genomic coordinates.
coverage - at any given coordinate, the individual reads are stacked in a column –> if there is a ‘gap’ in the reads at certain coordinates –> coverage is low in this area –> indicates a deletion
split reads (structural variants) - the aligner tries to fit each read within a range of the reference –> a read that spans the deletion does not fit as a whole; instead, half of the sequence fits at position a and the other half at position b, on the other side of the deletion
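
The coverage-gap idea can be sketched as a scan for runs of low depth in a per-position coverage list. The threshold, the minimum run length and the example depths are assumptions chosen for illustration; real callers also use split reads and read-pair information.

```python
def low_coverage_regions(track, threshold=0, min_length=3):
    """Return half-open (start, end) runs where depth <= threshold."""
    regions = []
    run_start = None
    for pos, depth in enumerate(track):
        if depth <= threshold:
            if run_start is None:
                run_start = pos          # a gap begins here
        else:
            if run_start is not None and pos - run_start >= min_length:
                regions.append((run_start, pos))
            run_start = None
    # A gap may extend to the end of the track.
    if run_start is not None and len(track) - run_start >= min_length:
        regions.append((run_start, len(track)))
    return regions

track = [12, 11, 13, 0, 0, 0, 0, 12, 10, 0, 11]   # hypothetical depths
print(low_coverage_regions(track))                # [(3, 7)]
```

The single uncovered position at index 9 is ignored because it is shorter than `min_length`; such isolated drops are more likely noise than a deletion.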

Question 9

Q

a dip is NOT…

a. a deletion of a wide range of the genome
b. a focal deletion
c. a point mutation
d. a potential cause of a disease

Answer

A

a. a deletion of a wide range of the genome

Question 10

Q

in poly A RNA seq…

a. we use mRNA sequences
b. we use cDNA reverse-transcribed from mRNA
c. we use tRNA
d. we use DNA regions that contain a start codon

Answer

A

b. we use cDNA reverse-transcribed from mRNA

Question 11

Q

why do we use cDNA in RNA seq and not the normal DNA?

Answer

A

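because in RNA sequencing, the main goal is to measure gene expression. Genomic DNA is the same whether or not a gene is expressed, and most of it is non-coding, so one cannot study gene expression from the normal DNA.
The mature mRNA from which the cDNA is made contains only exonic information (the introns are spliced out) and is present in proportion to how strongly each gene is expressed, and therefore can be used for this purpose. It is reverse-transcribed into cDNA because standard platforms such as Illumina sequence DNA.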

Question 12

Q

how can one extract gene expression levels using RNA sequencing?

Answer

A

Mature mRNA contains only the exonic regions of expressed genes.
Therefore, the coverage track from RNA-seq shows ‘drops’: only the regions that correspond to these expressed exons are covered. Instead of the roughly uniform coverage of WGS, the number of reads per gene correlates with the expression level (abundance) of each individual gene.
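
A minimal sketch of turning aligned reads into expression values: count the reads that start inside each gene and scale to counts per million (CPM). The gene coordinates and read positions are invented, and real pipelines additionally handle exons, strands and gene length.

```python
def count_reads_per_gene(read_starts, genes):
    """genes: {name: (start, end)} half-open; read_starts: alignment starts."""
    counts = {name: 0 for name in genes}
    for pos in read_starts:
        for name, (start, end) in genes.items():
            if start <= pos < end:
                counts[name] += 1
                break                  # reads outside all genes are ignored
    return counts

def counts_per_million(gene_counts):
    """Scale counts so that samples of different sequencing depth compare."""
    total = sum(gene_counts.values())
    return {gene: 1_000_000 * n / total for gene, n in gene_counts.items()}

genes = {"geneA": (0, 100), "geneB": (200, 300)}   # hypothetical annotation
counts = count_reads_per_gene([5, 10, 50, 250, 150], genes)
print(counts)                      # {'geneA': 3, 'geneB': 1}
print(counts_per_million(counts))  # {'geneA': 750000.0, 'geneB': 250000.0}
```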

Question 13

Q

Multi-dimensional data is a problem of RNA sequencing. How can one overcome this problem?

a. dimension reduction methods such as t-SNE/PCA
b. two-way ANOVA test
c. calculation of each variable separately
d. all of the above

Answer

A

a. dimension reduction methods such as t-SNE/PCA

Question 14

Q

Why is t-SNE more informative than PCA in RNA-seq analysis?

Answer

A

Both PCA and t-SNE are methods for reducing the dimensionality of data.
PCA is a linear dimensionality reduction technique, and as such, it tries to preserve the GLOBAL structure of the data, and maps the data as a whole.

t-SNE, on the other hand, is a non-linear dimensionality reduction technique, so it gives up the GLOBAL structure of the data and rather preserves its LOCAL structure, so data points with similar features (e.g. cells with similar expression profiles) get clustered together.

Therefore, when looking for differences between cells, t-SNE is more useful, as local similarities and differences are not lost, and the differences in features of different cell types can be mapped and visualised more intuitively.
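
To make "linear" concrete, here is a small sketch of PCA's first component using only the standard library (power iteration on the covariance matrix). It is an illustration under simplifying assumptions, not how analysis packages implement PCA, and t-SNE is not sketched here. Rows stand for cells and columns for genes; the data are invented.

```python
import math

def first_principal_component(rows, n_iter=100):
    """Return (direction, scores) of the first principal component.

    Assumes the data have non-zero variance; otherwise the
    normalisation below would divide by zero.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    vec = [1.0] * d
    for _ in range(n_iter):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Each score is a weighted sum of the gene values: a linear projection.
    scores = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, scores

cells = [[1, 2], [2, 4], [3, 6], [4, 8]]   # gene 2 is always 2 x gene 1
direction, scores = first_principal_component(cells)
print(direction)
print(scores)
```

Every cell is placed by the same weighted sum of its gene values, which is why PCA keeps the overall layout of the data but can blur small local groups.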

Question 15

Q

If you are interested in structure of chromosomes, which technique would you use?

a. RNA seq
b. WGS
c. ATAC seq
d. variant profiling

Answer

A

c. ATAC seq

Question 16

Q

what is a library?

Answer

A

the collection of DNA fragments prepared from a sample (fragmented and ligated to adapters) so that they are compatible with sequencing

Question 17

Q

what is the difference between coverage and depth?

Answer

A

coverage is the % of a given genome covered by sequenced data. It is the % of positions of a known reference sequence that are covered by at least one read.

Depth is the ratio between the total number of sequenced bases (number of reads × read length) and the size of the genome –> how many reads cover each position of the genome on average
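
As a worked example of the depth calculation (the numbers are made up, and the function name is only illustrative):

```python
def mean_depth(n_reads, read_length, genome_size):
    """Average number of reads covering each position of the genome."""
    return n_reads * read_length / genome_size

# 1 million reads of 150 bp on a 5 Mb genome
print(mean_depth(n_reads=1_000_000, read_length=150, genome_size=5_000_000))  # 30.0
```

A mean depth of 30 does not guarantee 100% coverage: some regions can still have no reads at all.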

Question 18

Q

what is a contig?

Answer

A

a set of overlapping DNA segments (reads) that together represent a consensus region of the DNA –> the contigs are the contiguous stretches that cover the genome

when all contigs are put together, we get an assembly
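
A toy sketch of how overlapping reads merge into one contig. It assumes error-free reads that are already given in left-to-right order, which real assemblers cannot assume; the names and sequences are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def build_contig(reads, min_len=3):
    """Merge ordered, overlapping reads into a single contig."""
    contig = reads[0]
    for read in reads[1:]:
        k = overlap(contig, read, min_len)
        if k == 0:
            raise ValueError("no overlap: the read belongs to another contig")
        contig += read[k:]     # append only the part that is new
    return contig

print(build_contig(["ATGGCGT", "GCGTACC", "ACCTTGA"]))  # ATGGCGTACCTTGA
```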

Question 19

Q

what is trimming?

Answer

A

cutting the ends of the reads to remove the adapter sequence (and low-quality bases)
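
A minimal sketch of adapter trimming at the 3' end of a read, using exact matching only. The adapter string is a placeholder for illustration, and dedicated trimming tools also tolerate mismatches and use base qualities.

```python
def trim_adapter(read, adapter, min_overlap=3):
    """Remove a full adapter, or a partial adapter at the read end."""
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]                  # full adapter found
    # The read may end in the middle of the adapter.
    for k in range(min(len(adapter), len(read)) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read                            # no adapter detected

ADAPTER = "AGATCGG"                        # placeholder adapter sequence
print(trim_adapter("ACGTACGTAGATCGG", ADAPTER))  # ACGTACGT
print(trim_adapter("ACGTACGTAGAT", ADAPTER))     # ACGTACGT
print(trim_adapter("ACGTACGT", ADAPTER))         # ACGTACGT
```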

Question 20

Q

what is barcoding?

Answer

A

adding a short, known sequence (index) to every fragment of a library –> serves as a barcode/identifier for the sample that the sequence we are analysing came from

Question 21

Q

what is (De-)Multiplexing?

Answer

A

Multiplexing: pooling the barcoded sequences of several samples together and sequencing them in one run.
Demultiplexing: using the barcode info to know which sequence came from which sample
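
Demultiplexing can be sketched as sorting reads by their barcode. This toy version assumes the barcode is the first few bases of each read and must match exactly; on real instruments the index is often read separately. Barcodes and sample names are invented.

```python
def demultiplex(reads, barcode_to_sample, barcode_length):
    """Split pooled reads by sample and strip the barcode."""
    by_sample = {sample: [] for sample in barcode_to_sample.values()}
    unassigned = []
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned.append(read)        # unknown barcode
        else:
            by_sample[sample].append(insert)
    return by_sample, unassigned

barcodes = {"ACGT": "sample1", "TTAG": "sample2"}   # hypothetical barcodes
pooled = ["ACGTGGGCCC", "TTAGAAATTT", "ACGTTTTAAA", "CCCCGGGGAA"]
by_sample, unassigned = demultiplex(pooled, barcodes, barcode_length=4)
print(by_sample)   # {'sample1': ['GGGCCC', 'TTTAAA'], 'sample2': ['AAATTT']}
print(unassigned)  # ['CCCCGGGGAA']
```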

Question 22

Q

what is alignment?

Answer

A

Alignment is finding the exact differences between 2 sequences (sample and ref).
It is the process of searching for the location of a given sequence on the genome –> mapping the reads to the reference genome.
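
Both aspects (finding the location and listing the differences) show up in this naive sketch, which slides the read along the reference and keeps the position with the fewest mismatches. It ignores gaps and is far too slow for real genomes, where indexed aligners are used; the sequences are made up.

```python
def best_alignment(read, reference):
    """Return (position, mismatches) of the best ungapped placement.

    mismatches is a list of (reference_position, ref_base, read_base).
    Returns (None, None) if the read is longer than the reference.
    """
    best_pos, best_mismatches = None, None
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = [(pos + i, ref_base, read_base)
                      for i, (ref_base, read_base) in enumerate(zip(window, read))
                      if ref_base != read_base]
        if best_mismatches is None or len(mismatches) < len(best_mismatches):
            best_pos, best_mismatches = pos, mismatches
    return best_pos, best_mismatches

print(best_alignment("CCGAAA", "TTGACCGTAAGC"))  # (4, [(7, 'T', 'A')])
```

The read maps to position 4 of the reference and differs from it at reference position 7 (T in the reference, A in the read).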

Question 23

Q

what is a consensus?

Answer

A

“polishing raw alignment” –> the calculated order of the most frequent residues found at each position in a sequence alignment.
It represents the result of a multiple sequence alignment in which related sequences are compared and similar motifs are calculated
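
The "most frequent residue at each position" rule can be sketched as a majority vote per alignment column. This assumes the sequences are already aligned and of equal length; ties go to the residue seen first in the column. The example sequences are invented.

```python
from collections import Counter

def consensus(aligned_sequences):
    """Most frequent residue in each column of an alignment."""
    columns = zip(*aligned_sequences)
    return "".join(Counter(column).most_common(1)[0][0] for column in columns)

print(consensus(["ACGTA", "ACGAA", "TCGTA"]))  # ACGTA
```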

(23 cards)