Next Generation Sequencing (NGS) - Euskirchen Flashcards
list the steps in Sanger sequencing
- Reaction mixture - contains template DNA, primer, DNA polymerase, normal dNTPs and fluorescently labelled terminators (ddNTPs)
- Primer elongation and chain termination -
a. the polymerase elongates the primer with normal dNTPs
b. when a fluorescent ddNTP is incorporated instead, elongation stops, so every fragment ends in a labelled ddNTP that corresponds to the base at its last position
- Capillary gel electrophoresis - the fluorescent strands are then pulled through a glass capillary filled with gel, which separates the DNA fragments by size
- Laser detection of fluorescence and computational analysis - an optical laser detector reads out the fluorescence –> results in peaks in different colours which correspond to the DNA sequence
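To make the read-out logic concrete, here is a toy sketch of the steps above. It assumes exactly one terminated fragment per position and ignores strand orientation; the function name `sanger_read` and the `seed` parameter are made up for illustration and do not come from any sequencing software.

```python
import random

# Watson-Crick pairing used to build the synthesized strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_read(template: str, seed: int = 0) -> str:
    """Toy model: recover the synthesized strand from terminated fragments."""
    # Chain termination: one fragment per position, described by its length
    # and the labelled ddNTP (base) sitting at its last position.
    fragments = [
        (length, COMPLEMENT[template[length - 1]])
        for length in range(1, len(template) + 1)
    ]
    # In the tube the fragments are mixed, so shuffle them.
    random.Random(seed).shuffle(fragments)
    # Capillary electrophoresis: shortest fragments reach the detector first.
    fragments.sort(key=lambda fragment: fragment[0])
    # Laser detection: read the colour (base) of each fragment in order.
    return "".join(base for _, base in fragments)


print(sanger_read("ATGC"))
```

The point of the sketch is that the sequence comes from sorting fragments by size, not from adding one terminator at a time.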
in Sanger sequencing it is not possible to fill up the entire DNA sequence in one go by different terminators, so one has to incorporate 1 labelled ddNTP at a time
T/F
FALSE!!
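In Sanger sequencing, all four labelled ddNTPs are present in the same reaction together with normal dNTPs. A ddNTP can be incorporated at any position and stops elongation there, so the reaction yields fragments of every length, each with a labelled ddNTP at its last position. Reading the fragments in order of size then gives the sequence step by step.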
what are the steps of illumina sequencing?
A. library preparation -
1. fragmentation - preparation of DNA by fragmenting the chromosomes into small fragments (75-100 bp)
2. adapter ligation - adapter sequences are added to the ends of the DNA fragments so that all fragments start with the same known bases
3. PCR amplification and loading - the library is amplified and poured over a glass slide that carries immobilised primers complementary to the adapters, so the adapters bind to the immobilised primers
B. cluster generation -
4. bridge amplification - adapters bound to primers –> polymerase makes a double strand from the single-stranded DNA. A strand can only reach a neighbouring primer, so repeated copying forms a local cluster of identical DNA around each bound fragment.
C. sequencing by synthesis -
5. priming - a sequencing primer binds to the adapter so that the complementary strand can be read
6. synthesis with fluorescence - incorporation of fluorescently labelled nucleotides, one per cycle, starting with the first base. The fluorescently labelled nucleotide is always the last base on the complementary strand –> identification of the sequence
7. imaging - the slide is imaged from above to see the colour of each cluster
8. cleavage of terminators - the label and terminator are cleaved, which makes the termination reversible –> the next base can be added
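The sequencing-by-synthesis cycle can be sketched as a simple loop. This is only a toy model of one cluster, assuming a perfect signal and one base per cycle; `sequence_by_synthesis` and its parameters are hypothetical names, not part of any Illumina software.

```python
# Watson-Crick pairing used to pick the incorporated nucleotide.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Toy model of one cluster: one labelled base is read per cycle."""
    read = []
    for cycle in range(min(cycles, len(template))):
        # Synthesis: a labelled reversible terminator pairs with the template
        # base; the terminator blocks any further extension in this cycle.
        incorporated = COMPLEMENT[template[cycle]]
        # Imaging: the colour of the cluster identifies the new last base.
        read.append(incorporated)
        # Cleavage: label and terminator are removed (nothing to do in this
        # model), so the next loop iteration can add the next base.
    return "".join(read)


print(sequence_by_synthesis("GATTACA", cycles=4))
```

The number of cycles sets the read length, which is one reason the reads are short.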
illumina sequencing…
a. is reversible and thus quite revolutionary
b. is called short-read sequencing
c. is time effective
d. allows for generation of large databases in a single experiment
e. all of the above
e. all of the above
which enzyme is used for bridge amplification in illumina sequencing?
polymerase
what is the principle of nanopore sequencing?
- protein forms a pore which is inserted into a membrane
- pore is only big enough for a single DNA strand
- pore is equipped with a motor protein which translocates the DNA through the pore
- voltage applied to the membrane –> ion flow through the pore
–> depending on the size of the molecule in the pore (strand size) the amount of current that can flow varies
- measure the current across the pore –> modulated by the identity of the base at the sensing region
- ion current changes over time as the DNA is translocated and every time there is a different base at the sensing region
what is the coverage track? how is it calculated?
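The coverage track shows, for every coordinate of a known reference sequence, how many sequenced reads align over that position (read depth). It is calculated by aligning the reads to the reference and counting the reads that cover each position. The % of the genome covered by at least one read gives the breadth of coverage.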
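As a rough sketch, coverage can be computed from aligned read positions like this. It assumes each read is already aligned and given as a 0-based, half-open `(start, end)` interval on the reference; the function names are made up for illustration.

```python
def coverage_track(reference_length, reads):
    """Count how many reads overlap each reference position."""
    depth = [0] * reference_length
    for start, end in reads:
        # Clip the read to the reference so out-of-range reads are ignored.
        for position in range(max(start, 0), min(end, reference_length)):
            depth[position] += 1
    return depth


def breadth_of_coverage(depth):
    """Percentage of positions covered by at least one read."""
    covered = sum(1 for d in depth if d > 0)
    return 100.0 * covered / len(depth)


def mean_depth(depth):
    """Average number of reads per position."""
    return sum(depth) / len(depth)


depth = coverage_track(10, [(0, 5), (3, 8)])
print(depth)
print(breadth_of_coverage(depth), mean_depth(depth))
```

Real tools work on alignment files rather than plain intervals, but the counting idea is the same.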
how could deletion be detected in WGS?
- genome viewer - software that shows the raw reads and their alignment to a reference genome across many coordinates. The individual reads are stacked in columns over each coordinate –> if there is a ‘gap’ of reads at certain coordinates –> coverage is low in this area –> indicates a deletion
- structural combinations - try to fit each read sequence within the expected range –> if it doesn’t fit as a whole, part of the sequence aligns somewhere else instead of the expected position, possibly on the other side of the deletion –> split read (half of the sequence fits in position a and the other half in position b)
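The coverage-gap idea can be sketched as a scan for stretches of low read depth. This is a minimal illustration, not a real variant caller: the `threshold` and `min_length` parameters and the function name are assumptions, and split reads are not modelled.

```python
def low_coverage_regions(depth, threshold, min_length):
    """Return half-open (start, end) stretches where depth stays below threshold."""
    regions = []
    start = None
    for position, value in enumerate(depth):
        if value < threshold:
            if start is None:
                start = position  # a candidate gap opens here
        else:
            # The gap closes; keep it only if it is long enough.
            if start is not None and position - start >= min_length:
                regions.append((start, position))
            start = None
    # Handle a gap that runs to the end of the reference.
    if start is not None and len(depth) - start >= min_length:
        regions.append((start, len(depth)))
    return regions


depth = [30, 31, 29, 2, 0, 0, 1, 28, 30, 0, 27]
print(low_coverage_regions(depth, threshold=5, min_length=3))
```

The minimum length filters out single low positions, which are more likely noise than a deletion.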
a dip is NOT…
a. a deletion of a wide range of the genome
b. a focal deletion
c. a point mutation
d. a potential cause of a disease
a. a deletion of a wide range of the genome
in poly A RNA seq…
a. we use mRNA sequences
b. we use cDNA reverse-transcribed from mRNA
c. we use tRNA
d. we use DNA regions that consist of a start codon
b. we use cDNA reverse-transcribed from mRNA
why do we use cDNA in RNA seq and not the normal DNA?
because in RNA sequencing, the main goal is to measure gene expression. Genomic DNA is essentially the same in every cell regardless of which genes are active, and most of it does not code for genes, so one cannot study gene expression from the normal DNA.
the mature mRNA from which the cDNA is made contains only exonic information (the introns are spliced out) and is present in proportion to how strongly each gene is expressed, and therefore can be used for this purpose.
how can one extract gene expression levels using RNA sequencing?
mRNA contains only the transcribed, exonic regions of the DNA.
Therefore, the coverage track from RNA-seq shows ‘drops’ over introns and unexpressed regions, and the covered regions correspond only to expressed exons. Instead of full coverage as in WGS, the number of reads over each gene correlates with its expression level (the abundance of its transcripts).
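A minimal sketch of turning reads into expression levels: count the reads that fall inside each gene, then scale by the total so samples of different sequencing depth can be compared (counts per million, CPM). The gene names, coordinates and function names are hypothetical, each gene is simplified to a single interval, and CPM does not correct for gene length.

```python
def count_reads_per_gene(genes, read_positions):
    """Count reads whose position falls inside each gene's (start, end) interval."""
    counts = {name: 0 for name in genes}
    for position in read_positions:
        for name, (start, end) in genes.items():
            if start <= position < end:
                counts[name] += 1
    return counts


def counts_per_million(counts):
    """Scale raw counts so that they sum to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads were assigned to any gene")
    return {name: count * 1_000_000 / total for name, count in counts.items()}


# Hypothetical genes; the read at position 150 is intergenic and is not counted.
genes = {"geneA": (0, 100), "geneB": (200, 300)}
counts = count_reads_per_gene(genes, [10, 20, 50, 250, 150])
print(counts)
print(counts_per_million(counts))
```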
Multi-dimensional data is a problem of RNA sequencing. How can one overcome this problem?
a. dimension reduction methods such as t-SNE or PCA
b. two-way ANOVA test
c. calculation of each variable separately
d. all of the above
a. dimension reduction methods such as t-SNE or PCA
Why is t-SNE more informative than PCA in RNA-seq analysis?
Both PCA and t-SNE are methods for reducing the dimensionality of data.
PCA is a linear dimensionality reduction technique, and as such it tries to preserve the GLOBAL structure of the data and maps the data as a whole.
t-SNE, on the other hand, is a non-linear dimensionality reduction technique, so it gives up the GLOBAL structure of the data and instead preserves its LOCAL structure, so data points with shared features get clustered together.
Therefore, when looking for differences between cell types, t-SNE is often more useful, as local similarities are not lost, and the differences in features of different cell types can be mapped and visualised more intuitively.
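To show what "linear" means for PCA, here is a small pure-Python sketch that finds the first principal component by power iteration on the covariance matrix. It is a teaching sketch under simple assumptions (small data, at least two samples); the function name is made up, and in practice both PCA and t-SNE are run through a numerical library. t-SNE is not sketched because it needs an iterative optimisation that does not fit in a few lines.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component."""
    n, d = len(rows), len(rows[0])
    # Centre each feature (column) on its mean.
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix between features.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeated multiplication converges to the direction
    # of largest variance (the top eigenvector of the covariance matrix).
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance in the data
        v = [x / norm for x in w]
    # Each score is a weighted sum of the centred features: a linear map.
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, scores


direction, scores = first_principal_component([[1, 2], [2, 4], [3, 6], [4, 8]])
print(direction, scores)
```

Because every score is just a weighted sum of the original features, PCA keeps the overall (global) layout of the data, which is the contrast with t-SNE described above.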
If you are interested in structure of chromosomes, which technique would you use?
a. RNA seq
b. WGS
c. ATAC seq
d. variant profiling
c. ATAC seq