High throughput sequencing - concepts+methods Flashcards
What are the three main high-throughput sequencing methods?
PacBio, Illumina and Oxford Nanopore
(also Ion Torrent)
What is the main difference between SNP genotyping and DNA sequencing?
DNA sequencing: usually aims to determine the exact order of the bases across a complete genome or region, giving a full view of the genetic code.
SNP genotyping: a targeted approach in which pre-determined SNPs are analyzed, often to compare samples or populations. These SNPs are usually known to vary widely between groups.
Why is an automated post-sequencing analysis crucial after NGS?
Because high-throughput NGS methods generate very large amounts of data (millions of sequences), which cannot be analyzed with a simple BLAST search. Other, automated tools are needed.
What is de novo sequencing?
Sequencing a genome for which there is no reference genome. This requires a lot of computing power, since more data is usually needed to get the bigger picture and the reads have to be aligned with each other instead of with a reference.
What is a contig?
A longer continuous sequence pieced together from several overlapping short reads.
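To make the idea of a contig concrete, here is a minimal sketch that merges reads by their overlaps. It assumes the reads are error-free, come from the same strand and are already ordered left to right; real assemblers do not have these luxuries. The function names and the example reads are made up for illustration.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(reads: list[str], min_overlap: int = 3) -> str:
    """Merge reads (assumed to be ordered left to right) into one contig."""
    contig = reads[0]
    for read in reads[1:]:
        size = overlap_length(contig, read, min_overlap)
        if size == 0:
            raise ValueError(f"no overlap of at least {min_overlap} bases with {read!r}")
        # Append only the part of the read that extends past the overlap.
        contig += read[size:]
    return contig


reads = ["ATGGCGT", "GCGTACCT", "ACCTGAA"]  # toy reads, each overlapping the next by 4 bases
contig = merge_reads(reads)
print(contig)
```

The three 7-8 base reads collapse into a single 14-base contig because each read shares its first four bases with the end of the previous one.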
Explain how Sanger sequencing works.
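Based on chain termination by dideoxynucleotides (ddNTPs). Regular dNTPs are mixed with a small amount of fluorescently labeled ddNTPs. When the polymerase incorporates a ddNTP, the chain cannot be extended any further, so across many copies of the template the ongoing chains are eventually terminated at each position (need homogeneous sample with high sequence abundance). The fragments are then separated by size, and the label on the terminating ddNTP shows which base sits at each position.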
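The chain-termination principle can be illustrated with a toy simulation. This is only a sketch under strong assumptions: one terminated fragment per position, perfect size separation, and the last base of each fragment standing in for the labeled ddNTP. The function name and the example sequence are hypothetical.

```python
import random


def sanger_read(synthesized: str, seed: int = 0) -> str:
    """Recover a sequence from its chain-terminated fragments."""
    # Every terminated product is a prefix of the synthesized strand;
    # its final base is the ddNTP that stopped extension.
    fragments = [synthesized[:end] for end in range(1, len(synthesized) + 1)]

    # In the reaction tube the fragments are mixed in no particular order.
    random.Random(seed).shuffle(fragments)

    # Electrophoresis sorts the fragments by length, shortest first.
    by_size = sorted(fragments, key=len)

    # Reading the terminal base of each fragment gives the sequence.
    return "".join(fragment[-1] for fragment in by_size)


read = sanger_read("ATGCCGTA")
print(read)
```

Sorting by length is what turns a mixture of terminated chains into an ordered readout, which is why the method needs many copies of one homogeneous template.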
What is the difference between single end reads and paired end reads?
Single-end sequencing produces reads from only one end of each fragment, while paired-end reads are produced from both ends of the same fragment.
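A small sketch of what the two read types look like for one fragment. It assumes the common convention that the second read of a pair is reported as the reverse complement of the fragment's other end; the function names and the example fragment are made up for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def paired_end_reads(fragment: str, read_len: int) -> tuple[str, str]:
    """Return (read 1, read 2) for one fragment.

    Read 1 alone is what single-end sequencing would give.
    Read 2 starts at the opposite end, on the opposite strand.
    """
    read1 = fragment[:read_len]
    read2 = reverse_complement(fragment)[:read_len]
    return read1, read2


fragment = "ATGGCGTACCTGAAGT"
read1, read2 = paired_end_reads(fragment, 5)
print(read1, read2)
```

The middle of the fragment is not read at all, but because both reads are known to come from the same fragment, their approximate distance and orientation are known.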
When is it best to use single end reads and paired end reads, respectively?
Single-end reads are easier and cheaper to generate. Paired-end reads are better when assembling a genome with the aim of reconstructing larger chunks of sequence, for example in de novo sequencing: accuracy degrades along a read as sequencing proceeds from one end, so reading the fragment from both ends gives more high-quality sequence, and the two reads of a pair are known to come from the same fragment. Single-end reads are beneficial in applications where large sequencing depth matters most (for example RNA-seq, analyzing the amount of transcript).
What is coverage?
Coverage = total number of sequenced bases/total number of bases in the genome
High coverage is usually needed in de novo sequencing.
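As a worked example of the formula, the sketch below computes coverage from the number of reads and the read length, assuming all reads have the same length. The function name and the numbers are illustrative only.

```python
def coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average coverage = total sequenced bases / bases in the genome."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    total_sequenced_bases = num_reads * read_length
    return total_sequenced_bases / genome_size


# 1 million reads of 150 bases on a 5 Mb genome
depth = coverage(num_reads=1_000_000, read_length=150, genome_size=5_000_000)
print(f"{depth:.0f}x")
```

For paired-end data, both reads of each pair count towards `num_reads`.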
What is a sequencing library?
A pool of DNA fragments carrying adaptors that are compatible with the sequencing method. The library is usually also prepared in other ways to make the sequencing as clean and effective as possible.
Why is library preparation before sequencing so important?
You need to know exactly what you’re sequencing and that the input is suitable for the chosen sequencing method. Otherwise, unwanted fragments will be sequenced and the end results might be difficult to analyze.
What are some steps involved in library preparation?
Target enrichment: enrich only for the molecules that you want and deplete others. For example: polyA selection enriches for mRNA and depletes other RNAs such as rRNA (which is very abundant in most samples).
Fragmentation and size selection: DNA needs to be fragmented for methods that generate short reads (e.g. Illumina).
Adding adapters: can be done with either PCR or ligation.
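The fragmentation, size selection and adapter steps can be sketched as a small pipeline. This is a toy model only: real fragmentation is random (mechanical or enzymatic) rather than in fixed-size pieces, and the adapter strings below are placeholders, not real adapter sequences. All function names are hypothetical.

```python
ADAPTER_5 = "[A5]"  # placeholder tag, NOT a real adapter sequence
ADAPTER_3 = "[A3]"  # placeholder tag, NOT a real adapter sequence


def fragment_dna(dna: str, size: int) -> list[str]:
    """Cut DNA into consecutive pieces of at most `size` bases (simplification)."""
    return [dna[start:start + size] for start in range(0, len(dna), size)]


def size_select(fragments: list[str], min_len: int, max_len: int) -> list[str]:
    """Keep only fragments whose length lies within [min_len, max_len]."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


def add_adapters(fragments: list[str]) -> list[str]:
    """Attach an adapter to each end of every fragment."""
    return [ADAPTER_5 + f + ADAPTER_3 for f in fragments]


def prepare_library(dna: str, size: int, min_len: int) -> list[str]:
    """Fragment, size-select, then add adapters."""
    fragments = fragment_dna(dna, size)
    selected = size_select(fragments, min_len, size)
    return add_adapters(selected)


library = prepare_library("ATGGCGTACCTGAAGTCC", size=5, min_len=5)
print(library)
```

The trailing 3-base piece is removed by size selection, so only fragments of the wanted length end up carrying adapters.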