Dot Plot Flashcards
Dot plot
Used to determine the similarity and variability between sequences
Compares two sequences at a time (pairwise sequence comparison); comparing more than two sequences requires multiple sequence alignment
Similarity of the two sequences depends on
- The number and length of matching segments in the matrix
- The longer the diagonal line, the higher the similarity between the sequences
(Insertions and deletions break or shift the diagonal line)
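A minimal sketch of the idea above, assuming the simplest possible rule (a dot wherever two single characters match, no window or threshold). The function names `dot_matrix`, `render`, and `longest_diagonal` are illustrative and not taken from any dot plot tool.

```python
def dot_matrix(seq_a, seq_b):
    """Rows follow seq_a, columns follow seq_b; True marks a matching pair."""
    return [[a == b for b in seq_b] for a in seq_a]


def render(matrix):
    """Draw the matrix as text: '*' for a match, '.' for a mismatch."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


def longest_diagonal(matrix):
    """Length of the longest unbroken run of matches on any forward diagonal."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    # run[i + 1][j + 1] = length of the diagonal run ending at cell (i, j)
    run = [[0] * (cols + 1) for _ in range(rows + 1)]
    best = 0
    for i in range(rows):
        for j in range(cols):
            if matrix[i][j]:
                run[i + 1][j + 1] = run[i][j] + 1
                best = max(best, run[i + 1][j + 1])
    return best


identical = dot_matrix("GATTACA", "GATTACA")
one_deletion = dot_matrix("GATTACA", "GATACA")  # second sequence lacks one T

print(render(identical))
print(longest_diagonal(identical))     # full-length principal diagonal
print(longest_diagonal(one_deletion))  # the deletion breaks the diagonal
```

The deletion splits the single long diagonal into shorter segments on neighbouring diagonals, which is the disruption described on this card.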
Diagonal lines
Principal diagonal
Sub diagonal
Forward subdiagonal
Backward subdiagonal
The direction of the sequences on the axes will determine
The direction of the line on the dot plot
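A small illustration of this card, assuming a toy sequence in which every letter is different so that only one line appears. `match_cells` is a hypothetical helper that lists the (row, column) positions of the dots.

```python
def match_cells(seq_x, seq_y):
    """Set of (i, j) positions where seq_x[i] equals seq_y[j]."""
    return {
        (i, j)
        for i, a in enumerate(seq_x)
        for j, b in enumerate(seq_y)
        if a == b
    }


seq = "MKVLAT"  # toy sequence with no repeated letters
n = len(seq)

same_direction = match_cells(seq, seq)            # both axes read left to right
opposite_direction = match_cells(seq, seq[::-1])  # one axis reversed

print(sorted(same_direction))      # dots where row index == column index
print(sorted(opposite_direction))  # dots where row index + column index == n - 1
```

Reversing the sequence on one axis moves the line from the principal diagonal to the opposite (backward) diagonal.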
What causes multiple lines to be plotted
Frameshifts
Inverted repeat sequences
Software used to create dot plots
Anacon
D-Genies
Dotlet
Dotmatcher
Dotplot (archived)
Frameshift mutation
Also called a framing error or reading frame shift
Caused by insertions or deletions (indels) of a number of nucleotides not divisible by three in a DNA sequence
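A short sketch of why the size of the indel matters, assuming a made-up 12-base sequence and a hypothetical helper `codons` that splits DNA into triplets from a given reading frame.

```python
def codons(dna, frame=0):
    """Split dna into complete triplets, starting at offset `frame`."""
    return [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]


original = "ATGGCCAAAGGG"                          # toy sequence, 4 codons
one_base = original[:3] + "T" + original[3:]        # 1-base insertion
three_bases = original[:3] + "TTT" + original[3:]   # 3-base insertion

print(codons(original))     # ['ATG', 'GCC', 'AAA', 'GGG']
print(codons(one_base))     # every codon after the insertion changes
print(codons(three_bases))  # one extra codon, downstream codons unchanged
```

An insertion of one base shifts every downstream codon, while an insertion of three bases adds a codon but leaves the reading frame intact.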
Inverted Repeat Sequences
Copies of nucleic acid sequence arranged in opposing orientation
May lie in tandem (directly adjacent to each other)
Separated by some sequence that is not part of the repeat (hyphenated)
Palindromic repeats
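A minimal sketch of these cases, assuming the usual DNA convention that the second copy is the reverse complement of the first on the same strand. The helpers `reverse_complement`, `inverted_repeat`, and `is_palindromic` are illustrative names.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complement each base, then reverse the result."""
    return dna.translate(COMPLEMENT)[::-1]


def inverted_repeat(arm, spacer=""):
    """Build arm + spacer + reverse complement of arm."""
    return arm + spacer + reverse_complement(arm)


def is_palindromic(dna):
    """True when the sequence equals its own reverse complement."""
    return dna == reverse_complement(dna)


adjacent = inverted_repeat("TTACG")          # no spacer between the copies
separated = inverted_repeat("TTACG", "GGG")  # copies separated by a spacer

print(adjacent, is_palindromic(adjacent))
print(separated, is_palindromic(separated))
print(is_palindromic("GAATTC"))  # a well-known palindromic site
```

With no spacer the whole sequence reads the same as its reverse complement, which is the palindromic case; with a spacer that is not itself palindromic, only the two arms mirror each other.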
Advantages of dot plot
- Reveals the presence of insertions/deletions
- Reveals direct and inverted repeats that are difficult to find by other methods
Disadvantages of dot plot
- Dot plot programs do not show an actual alignment
- Does not return a score indicating how optimal a given alignment is
Applications of dot plot
- Sequence alignment: used to identify regions of similarity and dissimilarity between two sequences
- Genome assembly: used to compare two genomes or different regions of the same genome to identify structural variations
- Repeat analysis: used to identify repetitive elements within a sequence which is useful for genome annotation and analysis
- Identification of conserved domains: used to identify conserved domains within a protein sequence providing insights into the function of the protein.
- Phylogenetic analysis: used to compare the similarity between sequences from different organisms, which can be useful for constructing phylogenetic trees and inferring evolutionary relationships.
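The repeat-analysis application can be sketched as a self-comparison: when a sequence is plotted against itself, repeated words show up as dots away from the principal diagonal. The example below assumes exact matching of fixed-length words and indexes them in a dictionary instead of drawing the plot; `self_repeats` is a hypothetical helper.

```python
def self_repeats(seq, k):
    """Map each k-letter word that occurs more than once to its start positions."""
    positions = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    return {word: starts for word, starts in positions.items() if len(starts) > 1}


repeats = self_repeats("ACGTTTACGT", k=4)
print(repeats)  # {'ACGT': [0, 6]}
```

Each pair of start positions, such as (0, 6) here, corresponds to an off-diagonal dot in the self dot plot.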
Limitations of dot plot
- Sensitivity to sequence length and complexity: for sequences that are highly repetitive or have a complex structure, the plot contains so many dots that clear patterns are hard to see.
- Sensitivity to sequence alignment: The interpretation of a dot plot can be highly dependent on the alignment of the sequences being compared. If the alignment is poor or incorrect, the resulting dot plot may be difficult to interpret or misleading.
- Limited scalability: dot plots become unwieldy for large sequences or datasets, because the plot size grows with the product of the two sequence lengths (the square of the length when both are equally long). This makes large datasets difficult to visualize and analyze.
- Limited ability to identify subtle similarities: While dot plots can be useful for identifying regions of high similarity between sequences, they may not be sensitive enough to identify more subtle similarities or differences between sequences.
- Dependence on chosen window size and threshold: The interpretation of a dot plot can be highly influenced by the choice of window size and threshold used to generate the plot. Different window sizes and thresholds may result in different patterns in the dot plot, leading to different interpretations of the data.
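The effect of window size and threshold can be sketched as follows. This assumes a common filtering rule (place a dot at (i, j) only if at least `threshold` of the `window` positions starting there match); the function name `windowed_dots` and its parameters are illustrative, and real tools may score windows differently.

```python
def windowed_dots(seq_a, seq_b, window=3, threshold=3):
    """Return (i, j) positions whose windows share at least `threshold` matches."""
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            if matches >= threshold:
                dots.add((i, j))
    return dots


seq = "GATTACA"
single_letter = windowed_dots(seq, seq, window=1, threshold=1)  # no filtering
strict = windowed_dots(seq, seq, window=3, threshold=3)         # exact 3-letter words
lenient = windowed_dots(seq, seq, window=3, threshold=2)        # one mismatch allowed

print(len(single_letter), len(strict), len(lenient))
```

The same pair of sequences gives a noisy plot with single-letter matching and a clean diagonal with a strict window, so any interpretation should state which settings were used.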
Sequence alignment
Sequence alignment is a fundamental technique in bioinformatics that is used to compare two or more sequences of DNA, RNA, or protein.
The goal of sequence alignment is to identify regions of similarity and difference between the sequences, which can provide insights into their evolutionary relationships, functional similarities, and structural features.
Pairwise alignment
Pairwise alignment is the comparison of two sequences to identify regions of similarity and difference.
E.g., the Needleman-Wunsch algorithm (global alignment)
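A compact sketch of Needleman-Wunsch scoring, shown because, unlike a dot plot, it returns a numeric score for the best global alignment. The scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration, and the function returns only the optimal score, not the alignment itself.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of a and b by dynamic programming."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First column and first row: a prefix aligned entirely against gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            up = score[i - 1][j] + gap    # gap in b
            left = score[i][j - 1] + gap  # gap in a
            score[i][j] = max(diagonal, up, left)

    return score[-1][-1]


print(needleman_wunsch_score("GATTACA", "GATTACA"))  # 7 matches
print(needleman_wunsch_score("GATTACA", "GATACA"))   # 6 matches and 1 gap
```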
Multiple sequence alignment
Multiple sequence alignment is the comparison of three or more sequences to identify regions of similarity and difference.
E.g., progressive alignment algorithms