Alignment and de novo Flashcards
What does alignment in NGS refer to?
Matching DNA sequences to a reference genome
Which algorithm is commonly used in sequence alignment?
Needleman-Wunsch
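
As an illustration of the card above, here is a minimal sketch of Needleman-Wunsch global alignment scoring. The function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for the example, not values from the flashcards.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of strings a and b."""
    # Row 0: aligning the empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned to a gap
            left = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

Because the alignment is global, both sequences are scored end to end, so a length difference always costs at least one gap.
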
What is the purpose of using a seed in alignment algorithms?
To initiate the alignment process efficiently
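
A rough sketch of the seed-and-extend idea: a short exact match (the seed) narrows the search to a few candidate positions, and only those are checked in full. The function names, the choice of the read's first k bases as the seed, and the mismatch threshold are all assumptions made for illustration; real aligners use more elaborate seeding and gapped extension.

```python
from collections import defaultdict

def build_seed_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def seed_and_extend(read, reference, index, k, max_mismatches=1):
    """Return (position, mismatches) for each seed hit that extends well."""
    hits = []
    seed = read[:k]  # assumption: use the first k bases of the read as the seed
    for pos in index.get(seed, []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # the read would run off the end of the reference
        mismatches = sum(1 for x, y in zip(read, window) if x != y)
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits
```

For example, with `reference = "ACGTACGTTAGC"` and `k = 4`, the seed `ACGT` occurs at positions 0 and 4, but the read `ACGTTA` only extends within the mismatch limit at position 4.
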
What is a ‘read’ in the context of NGS alignment?
A fragment of DNA sequenced in an NGS run
What challenge does repetitive DNA pose in alignment?
A read can align equally well to more than one location
What does the term ‘coverage’ mean in NGS?
The number of times a base is sequenced
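
To make the definition concrete, the sketch below counts per-base depth from read intervals and estimates mean coverage as (number of reads × read length) / genome length. The function names and the 0-based, end-exclusive interval convention are assumptions for this example.

```python
def per_base_coverage(read_intervals, genome_length):
    """Count how many reads cover each position (0-based, end exclusive)."""
    depth = [0] * genome_length
    for start, end in read_intervals:
        # Clip each interval to the genome boundaries before counting.
        for pos in range(max(start, 0), min(end, genome_length)):
            depth[pos] += 1
    return depth

def mean_coverage(num_reads, read_length, genome_length):
    """Expected average depth: total sequenced bases over genome size."""
    return num_reads * read_length / genome_length
```
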
What is ‘mapping’ in NGS data analysis?
Aligning reads to a reference genome
What role does a reference genome play in alignment?
Provides a template to compare against sequenced reads
What is an advantage of paired-end sequencing in alignment?
Helps place reads in repeated regions, because the two mates are a known distance apart
What does the Burrows-Wheeler Transform (BWT) do in sequence alignment?
Rearranges the reference into a compressible form that can be indexed for fast, memory-efficient read lookup
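
A naive sketch of the transform itself, assuming a `$` terminator that sorts before every other character. It sorts all rotations explicitly, which is fine for a flashcard-sized example but far too slow for a genome; production aligners build the index from a suffix array instead. The function names are hypothetical.

```python
def bwt(text, terminator="$"):
    """Burrows-Wheeler Transform: last column of the sorted rotations."""
    s = text + terminator
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

def inverse_bwt(last_column, terminator="$"):
    """Rebuild the original text by repeatedly prepending and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(c + row for c, row in zip(last_column, table))
    # The row that ends with the terminator is the original text.
    original = next(row for row in table if row.endswith(terminator))
    return original[:-1]
```

The transform tends to group identical characters into runs, which is what makes the result easy to compress, and it is reversible, so no information is lost.
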
What is the significance of gap penalties in sequence alignment?
To penalize insertions and deletions
What is a primary challenge in aligning short reads in NGS?
They may align to multiple locations
What is the Smith-Waterman algorithm primarily used for?
Local sequence alignment
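
A minimal sketch of Smith-Waterman local alignment scoring. Scores are floored at zero so that an alignment can start and end anywhere in either sequence. The function name and scoring values (match +2, mismatch -1, gap -2) are assumptions for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b."""
    best = 0
    prev = [0] * (len(b) + 1)  # first row is all zeros for local alignment
    for i in range(1, len(a) + 1):
        curr = [0]             # first column is zero as well
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 lets a new local alignment start at this cell.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```
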
What does ‘consensus sequence’ mean in the context of alignment?
The sequence made up of the most common base at each aligned position
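
As a small illustration, a consensus can be called by majority vote in each column of a set of aligned reads. This sketch assumes the reads are already aligned and of equal length; the function name is hypothetical, and ties are broken arbitrarily by whichever base was seen first.

```python
from collections import Counter

def consensus(aligned_reads):
    """Pick the most frequent base in each column of equal-length reads."""
    columns = zip(*aligned_reads)
    return "".join(Counter(column).most_common(1)[0][0] for column in columns)
```

For instance, `consensus(["ACGT", "ACGA", "TCGT"])` votes column by column and outvotes the single disagreeing base in the first and last columns.
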
How does base quality scoring impact alignment?
Higher quality scores mean fewer base-calling errors, which improves alignment accuracy
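
For reference, a Phred quality score Q corresponds to an error probability of 10^(-Q/10), so Q20 means a 1 in 100 chance that the base call is wrong. The sketch below assumes the common Phred+33 ASCII encoding used in FASTQ files; the function names are hypothetical.

```python
def decode_phred33(quality_string):
    """Convert a Phred+33 encoded quality string to integer scores."""
    return [ord(character) - 33 for character in quality_string]

def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)
```
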
What is the role of ‘alignment score’ in NGS?
To assess the quality of the alignment
Why are reference genomes updated periodically?
To correct errors and incorporate new genetic information
What does ‘chimeric alignment’ refer to in NGS?
A single read aligning to two different parts of the genome
What advantage does ‘splice-aware’ alignment offer for RNA-seq data?
It aligns reads across exon junctions
In NGS, what is ‘indel realignment’ used for?
To correct misalignments around insertions and deletions that would otherwise appear as false mismatches
What is the primary goal of de novo assembly in NGS?
Assemble sequences without a reference genome
What does “de novo” mean in the context of genome assembly?
From the beginning
Which graph type is commonly used in de novo assembly?
De Bruijn graph
What are ‘contigs’ in de novo assembly?
Contiguous sequences built by joining overlapping reads
What is the purpose of using k-mers in de novo assembly?
To break down reads into smaller parts for assembly
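
A small sketch of how k-mers feed a De Bruijn graph: each k-mer becomes an edge from its first k-1 bases to its last k-1 bases. The function names are hypothetical, and the sketch ignores reverse complements, edge multiplicities, and error correction, all of which a real assembler would need to handle.

```python
from collections import defaultdict

def kmers(read, k):
    """List every overlapping substring of length k in the read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes it leads to."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph
```

With reads `ACGT` and `CGTA` and `k = 3`, the shared k-mer `CGT` links the two reads into a single path AC → CG → GT → TA, which spells `ACGTA`.
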
What challenge do repeats in DNA pose for de novo assembly?
They make it difficult to accurately assemble sequences
What is the role of mate-pair information in de novo assembly?
To provide long-range continuity
What does the term ‘coverage’ refer to in de novo assembly?
The depth of sequencing of a particular region
Why is high coverage important in de novo assembly?
To ensure accuracy and completeness of the assembly
What is a ‘scaffold’ in genome assembly?
A set of contigs ordered and oriented with gaps between them
Which technology provides longer read lengths beneficial for de novo assembly?
Sanger sequencing (reads of several hundred bases, longer than typical short-read NGS reads)
How does the choice of k-mer size impact de novo assembly?
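Smaller k-mers connect more reads but tangle the graph at repeats; larger k-mers resolve more repeats but need higher coverage and are more sensitive to errors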
What is a common problem in assembling highly repetitive DNA regions?
Uncertainty in placing repeats accurately
In de novo assembly, what is the significance of ‘N50’?
It is the contig length such that contigs of that length or longer contain at least half of the total assembled bases
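
N50 is a length-weighted statistic: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly size. A minimal sketch, with a hypothetical function name:

```python
def n50(contig_lengths):
    """Length of the contig at which the running total reaches half the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly
```

For contigs of 80, 70, 50, 40, 30, and 20 bases (290 in total), the two longest contigs already hold 150 bases, so the N50 is 70, whereas the median contig length would be 45.
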
What does the ‘greedy algorithm’ do in de novo assembly?
It merges fragments with the largest overlap first
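
A toy sketch of the greedy strategy: repeatedly find the pair of fragments with the longest suffix-to-prefix overlap and merge them until no overlaps remain. The function names are hypothetical, and the sketch assumes error-free fragments on a single strand; it also shows why greedy assembly can go wrong at repeats, since it always commits to the locally best overlap.

```python
def overlap(a, b, min_len=1):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(fragments, min_len=1):
    """Merge the pair with the largest overlap until none overlap."""
    frags = list(fragments)
    while len(frags) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no remaining overlaps, leave separate contigs
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags
```
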
How do assemblers handle sequencing errors in de novo assembly?
They correct errors based on coverage and consensus
Why is iterative k-mer analysis used in de novo assembly?
To find the optimal k-mer size for assembly
What is the outcome of successful de novo assembly?
A complete and accurate representation of the sequenced genome
How do paired-end reads aid in de novo assembly?
By providing information about read orientation and distance