Reference mapping SNPs Flashcards
What Phred quality score requires the experiment to be restarted?
A score below 20. Q20 is the minimum required for a read not to be considered low quality, and it corresponds to 99% base-call accuracy (a 1 in 100 chance of an incorrect base). Low-quality reads are removed during quality control to prevent bias during analysis.
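The link between a Phred score and accuracy comes from Q = -10 * log10(P), where P is the probability that a base call is wrong. A minimal sketch of that conversion follows; the function names are illustrative, and the decoder assumes the common Phred+33 ASCII encoding of FASTQ quality strings.

```python
def phred_to_error_prob(q: float) -> float:
    """Probability that a base call is wrong, from Q = -10 * log10(P)."""
    return 10 ** (-q / 10)


def phred_to_accuracy(q: float) -> float:
    """Base-call accuracy implied by a Phred score."""
    return 1 - phred_to_error_prob(q)


def decode_phred33(quality_string: str) -> list:
    """Decode a FASTQ quality string (assumption: Phred+33 offset)."""
    return [ord(ch) - 33 for ch in quality_string]


def mean_quality(quality_string: str) -> float:
    """Arithmetic mean of the decoded scores, a crude per-read summary."""
    scores = decode_phred33(quality_string)
    return sum(scores) / len(scores) if scores else 0.0
```

For example, phred_to_accuracy(20) is approximately 0.99 and phred_to_accuracy(30) is approximately 0.999, so each step of 10 in Q cuts the error rate tenfold.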
What can FastQC check in a genome sequence?
Per base quality distribution: a poor distribution is usually due to quality degrading over long runs; QUALITY TRIMMING helps.
Per tile sequence quality: average Phred scores per tile.
Quality scores: Low-quality reads must be REMOVED/TRIMMED as they could introduce BIAS during downstream analysis.
Per Sequence and per base GC Content: GC content across the sequences is compared to a normal GC distribution, where SHARP PEAKS indicate CONTAMINANTS or DIVERSITY.
Adapter content: detects adapter sequences in the reads, which should then be REMOVED by trimming.
Per Base Sequence Content: compares the proportion of each base at every position in the reads; OVERREPRESENTED SEQUENCES may cause BIAS in the overall composition.
Sequence duplication: low duplication levels may indicate high coverage of the target sequence, but high duplication levels indicate ENRICHMENT BIAS (e.g. PCR over-amplification).
Per Base N Content: positions where a base could not be called with sufficient confidence (POOR QUALITY) are indicated as ‘N’.
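Two of the checks above, GC content and N content, are simple proportions. The sketch below computes them for a single read; it is illustrative only, and FastQC's own calculations and thresholds may differ.

```python
def gc_fraction(read: str) -> float:
    """Fraction of called bases (A, C, G, T) that are G or C."""
    called = [base for base in read.upper() if base in "ACGT"]
    if not called:
        return 0.0  # no called bases, so GC content is undefined; report 0
    return sum(base in "GC" for base in called) / len(called)


def n_fraction(read: str) -> float:
    """Fraction of positions in the read that are uncalled ('N')."""
    if not read:
        return 0.0
    return read.upper().count("N") / len(read)
```

For example, gc_fraction("GGCCAATT") gives 0.5 and n_fraction("ACGN") gives 0.25.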
What is de novo assembly?
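It is the reconstruction of a DNA sequence from reads without a reference genome, commonly using a k-mer approach in which reads are broken into overlapping subsequences of length k. Repetitive sequences will cause assembly errors.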
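One common way k-mers are used in assembly is a de Bruijn graph, where each k-mer links its first k-1 bases to its last k-1 bases. The sketch below builds such a graph from reads; the function names are illustrative, and real assemblers also handle sequencing errors, coverage and reverse complements.

```python
from collections import defaultdict


def kmers(seq: str, k: int) -> list:
    """All overlapping substrings of length k, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def de_bruijn_edges(reads: list, k: int) -> dict:
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def branching_nodes(graph: dict) -> list:
    """Nodes with more than one successor, where the path is ambiguous."""
    return sorted(node for node, nxt in graph.items() if len(nxt) > 1)
```

With the read "ATGATGC" and k = 3, the repeated "ATG" gives the node "TG" two successors ("GA" and "GC"), which is the kind of ambiguity that repetitive sequences introduce.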
What is reference mapping used for in next-generation sequencing?
Next-generation sequencing reads are mapped onto a reference genome to identify SNPs, insertions and deletions.
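As a toy illustration of how a mapped read reveals candidate SNPs, the sketch below compares a read with the reference at a known start position. It assumes the read is already placed and has no gaps; real mappers perform the alignment, insertions and deletions need gapped alignment, and variant callers combine evidence from many reads. The function name is hypothetical.

```python
def find_mismatches(reference: str, read: str, start: int) -> list:
    """Return (reference position, reference base, read base) for each mismatch.

    Positions are 0-based. Uncalled bases ('N') in the read are skipped.
    """
    mismatches = []
    for offset, read_base in enumerate(read):
        ref_pos = start + offset
        if ref_pos >= len(reference):
            break  # read runs past the end of the reference
        ref_base = reference[ref_pos]
        if read_base != "N" and read_base != ref_base:
            mismatches.append((ref_pos, ref_base, read_base))
    return mismatches
```

For example, find_mismatches("ACGTACGT", "GTTC", 2) reports a single candidate SNP at position 4, where the reference has A and the read has T.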
What are the types of Single-nucleotide polymorphism (SNP)?
SNPs are single-nucleotide variations in a DNA sequence between two or more organisms. They can occur in the coding sequence or in intergenic regions of DNA.
Synonymous SNPs: no change in the amino acid sequence.
Non-synonymous SNPs: change the amino acid sequence, either because the altered codon codes for a different amino acid, which can affect protein function (missense), or because it creates a premature stop codon, resulting in loss of gene function (nonsense).
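The SNP types above can be told apart by translating the codon before and after the change. The sketch below does this for a single codon, assuming the standard genetic code; the function name and interface are illustrative and not taken from any particular annotation tool.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(ref_codon: str, position: int, alt_base: str) -> str:
    """Classify a single-base change at `position` (0, 1 or 2) of a codon."""
    alt_codon = ref_codon[:position] + alt_base + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"  # premature stop codon
    # Any other amino acid change; a lost stop codon also lands here in this sketch.
    return "missense"
```

For example, in the codon GAA (glutamate), changing the third base to G gives "synonymous", changing the second base to T gives "missense", and changing the first base to T gives "nonsense".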
What is TRAMS: Tool For Rapid Annotation of Microbial SNPs?
TRAMS is a program that annotates SNPs as synonymous, non-synonymous or nonsense.