alignment_assembly Flashcards
What is the alignment process in bioinformatics?
The alignment process compares sequences (DNA, RNA, or protein) to identify regions of similarity, helping to understand functional, structural, and evolutionary relationships.
What is a bit score?
A raw alignment score normalized for the scoring system used, so that scores from different searches can be compared; it is independent of database size, and higher bit scores indicate more significant alignments.
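As a rough illustration of the normalization, the sketch below applies the Karlin-Altschul conversion S' = (lambda * S - ln K) / ln 2 to a raw score. The function name and the lambda and K values are illustrative assumptions, not values taken from these flashcards; in practice both parameters depend on the scoring matrix and gap penalties.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score (Karlin-Altschul form)."""
    return (lam * raw_score - math.log(k)) / math.log(2)

# Illustrative statistical parameters (assumed for this sketch only).
LAMBDA = 0.267
K = 0.041

for raw in (50, 100, 200):
    print(raw, round(bit_score(raw, LAMBDA, K), 1))
```

Because lambda and K absorb the details of the scoring system, two bit scores can be compared even when the underlying searches used different matrices.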
What is an E-value?
The Expect value describes the number of hits one can expect to see by chance when searching a database of a particular size; lower E-values indicate more significant matches.
How does the length of the query sequence affect the E-value?
Shorter sequences tend to have higher E-values, even for near-identical matches, because a short sequence can only reach a low alignment score and is more likely to appear in the database by random chance.
How does database size influence the E-value?
A larger database offers more opportunities for chance matches, so an alignment with the same score receives a higher (less significant) E-value.
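The relationship between bit score, search space, and E-value can be sketched with the simplified formula E = m * n * 2^(-S'), where m is the query length, n is the total database length, and S' is the bit score. This is a minimal sketch under stated assumptions: the function name and the example lengths are hypothetical, and BLAST itself uses corrected "effective" lengths rather than the raw ones shown here.

```python
def expect_value(bit_score, query_len, db_len):
    """Approximate E-value from a bit score and the size of the search space."""
    return query_len * db_len * 2.0 ** (-bit_score)

# Same alignment score, database ten times larger (hypothetical sizes).
e_small_db = expect_value(40.0, query_len=200, db_len=1_000_000)
e_large_db = expect_value(40.0, query_len=200, db_len=10_000_000)

print(e_small_db, e_large_db)
```

With the score held fixed, the E-value scales linearly with database size, while each additional bit of score halves it. Short queries end up with high E-values in practice because they cannot accumulate a high score in the first place.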
What is BLASTN used for?
Comparing nucleotide sequences against nucleotide databases.
What is BLASTP used for?
Comparing protein sequences against protein databases.
What is BLASTX used for?
Comparing a nucleotide sequence translated into all six reading frames against a protein database.
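To make "all six reading frames" concrete, here is a minimal sketch of six-frame translation using the standard genetic code: three offsets on the forward strand and three on the reverse complement. The function names are hypothetical, and this is not BLAST's own implementation; trailing partial codons are dropped and codons containing ambiguous bases are written as X.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )

def six_frame_translations(seq):
    """Map frame labels (+1..+3, -1..-3) to translated protein strings."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames

for frame, protein in six_frame_translations("ATGGCCTAA").items():
    print(frame, protein)
```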
What is TBLASTN used for?
Comparing protein sequences against a nucleotide database translated in all six reading frames.
What is TBLASTX used for?
Comparing nucleotide sequences translated into all six reading frames against a nucleotide database that is also translated in all six reading frames.
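The five programs above differ only in the query type, the database type, and whether nucleotide sequences are translated before comparison. The helper below is a small sketch of that decision; the function and its parameters are hypothetical conveniences for study purposes, not part of BLAST.

```python
def choose_blast_program(query_type, db_type, translated=False):
    """Pick a BLAST program from query/database types.

    `translated` only matters for nucleotide vs. nucleotide searches,
    where it selects a protein-level comparison (tblastx).
    """
    pair = (query_type, db_type)
    if pair == ("nucleotide", "nucleotide"):
        return "tblastx" if translated else "blastn"
    if pair == ("protein", "protein"):
        return "blastp"
    if pair == ("nucleotide", "protein"):
        return "blastx"   # query is translated in six frames
    if pair == ("protein", "nucleotide"):
        return "tblastn"  # database is translated in six frames
    raise ValueError(f"unsupported combination: {pair}")

print(choose_blast_program("nucleotide", "protein"))
```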
What factors should you consider when choosing alignment software?
Type of sequences/experiment, sequencing platform, planned further analysis, and computational infrastructure.
Why is it important to know your sequencing platform when choosing alignment software?
Different platforms (e.g., Illumina, Ion Torrent, PacBio) have unique read lengths and error types that affect compatibility with mapping tools.
How can downstream analysis influence your choice of alignment software?
The aligner's output must work with the tools that come next, so check that it reports the alignment types and file formats those tools require.
Why is computational infrastructure important in selecting alignment software?
Some tools may require significant computational power or memory resources; knowing your available resources helps in making an appropriate choice.
Aspect | Short Read Alignment
Read Length and Characteristics | Reads typically range from 36 to 600 bp and are generated by platforms such as Illumina; the data are cost-effective and high quality, but the short read length can complicate assembly of complex genomes.