Bioinformatics 1 Flashcards
What are two types of pairwise alignments?
Global alignment and local alignment
What is a global alignment?
Finds the best alignment across the entire length of both sequences - key for building an MSA
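A minimal sketch of how a global alignment score can be computed, using the classic Needleman-Wunsch dynamic programme. The function name and the match/mismatch/gap values are illustrative assumptions, not taken from these cards.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best end-to-end alignment of a and b (Needleman-Wunsch).

    The scoring values are arbitrary illustrative defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Leading gaps are penalised: both sequences must be covered in full.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    # The bottom-right cell covers the whole of both sequences.
    return score[-1][-1]
```

The answer is read from the last cell of the table, which is what forces the alignment to span both sequences from start to end.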
What are global alignments used for?
Similar proteins, e.g. the same protein from different species.
What is a local alignment?
Tries to identify any common domains/regions between sequences and aligns them - these will be surrounded by unaligned regions.
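A minimal sketch of a local alignment score, using the Smith-Waterman dynamic programme. The function name and scoring values are illustrative assumptions. Compared with a global alignment, cell scores are never allowed to drop below zero and the result is the best cell anywhere in the table, so unaligned flanking regions cost nothing.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Score of the best-matching region shared by a and b (Smith-Waterman).

    The scoring values are arbitrary illustrative defaults.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 option lets an alignment restart anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

# Two sequences that share only the region GATTACA.
shared_region_score = local_alignment_score("CCCCGATTACACCCC", "TTTTGATTACATTTT")
```

Here `shared_region_score` reflects only the seven matching residues of the shared region; the differing flanks are left unaligned.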
What happens if you use a local alignment on very similar proteins?
It will effectively just do a global alignment.
What are local alignments used for?
Protein sequences with common domains and aligning cDNA to the genomic sequence.
Why do you need to use a local alignment for aligning cDNA to the genome and not a global alignment?
A global alignment would try to force an end-to-end alignment because the exons match, but the result would be wrong because the genomic sequence also contains introns that are absent from the cDNA.
A local alignment gives a separate alignment for each exon match in the genome - this identifies the exons' locations in the genome.
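A toy sketch of the idea that each exon produces its own hit in the genome. To keep it short, exact substring matching stands in for a real local alignment, and the sequences, the function name `locate_exons`, and the `min_len` cutoff are all made up for illustration. A real tool would tolerate mismatches and would place exon boundaries more carefully than this greedy search does.

```python
def locate_exons(cdna, genome, min_len=4):
    """Greedy toy search: repeatedly find the longest prefix of the remaining
    cDNA that occurs exactly in the genome after the previous hit.

    Returns a list of (start, end) genome coordinates, end exclusive.
    """
    hits = []
    c_pos, g_pos = 0, 0
    while c_pos < len(cdna):
        length = len(cdna) - c_pos
        while length >= min_len:
            idx = genome.find(cdna[c_pos:c_pos + length], g_pos)
            if idx != -1:
                break
            length -= 1
        else:
            break  # no remaining stretch of at least min_len matches
        hits.append((idx, idx + length))
        c_pos += length
        g_pos = idx + length
    return hits

# Hypothetical sequences: two exons separated by an intron in the genome.
exon1, intron, exon2 = "ATGGCC", "GTAAGTTTTCAG", "TTCGAA"
genome = "CC" + exon1 + intron + exon2 + "GG"
cdna = exon1 + exon2  # the intron is spliced out of the cDNA
exon_hits = locate_exons(cdna, genome)
```

The gap between the two hits in `exon_hits` corresponds to the intron, which is exactly the region a global alignment would be forced to align against.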
What database does blastn use?
nucleotide
What database does blastp use?
amino acid
What is the query type and database for tblastn?
query = amino acid; database = translated nucleotide
What is the query type and database for blastx?
query = translated nucleotide; database = amino acid
What is the query type and database for tblastx?
query = translated nucleotide; database = translated nucleotide
Which one of the BLAST programs does not compare sequences at the protein level?
blastn
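The query/database pairings from these cards can be collected in a small lookup table. This is only a study aid sketch; the dictionary and the helper name are illustrative and are not part of any BLAST software.

```python
# program -> (query type, database type), as listed on the cards above
BLAST_PROGRAMS = {
    "blastn": ("nucleotide", "nucleotide"),
    "blastp": ("amino acid", "amino acid"),
    "blastx": ("translated nucleotide", "amino acid"),
    "tblastn": ("amino acid", "translated nucleotide"),
    "tblastx": ("translated nucleotide", "translated nucleotide"),
}

def compares_at_protein_level(program):
    """True if either side is amino acid or translated nucleotide."""
    query, database = BLAST_PROGRAMS[program]
    return query != "nucleotide" or database != "nucleotide"

nucleotide_only = [p for p in BLAST_PROGRAMS if not compares_at_protein_level(p)]
```

Only blastn compares untranslated nucleotides on both sides, so it is the only entry in `nucleotide_only`.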
Why can a non-exact match be as informative as a perfect match in an alignment?
Because amino acids mutate during evolution - a mutated amino acid is more likely to be maintained if it preserves the chemical/physical properties of the original amino acid. This means that substitutions between sequences can be scored.
How do you build a substitution matrix?
By choosing a group of similar proteins, aligning them, and scoring each substitution based on how often one amino acid is observed in place of another.
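One common way to turn observed substitution frequencies into scores is a log-odds ratio: how often a pair is seen in aligned columns versus how often it would be expected by chance. The sketch below assumes that approach; the function name and the toy alignment columns are made up, and real matrices add steps such as sequence weighting, scaling and rounding.

```python
import math
from collections import Counter
from itertools import combinations

def log_odds_scores(columns):
    """Log2 odds score for each residue pair observed in the alignment columns.

    columns: iterable of strings, each holding the residues of one aligned column.
    """
    pair_counts = Counter()
    residue_counts = Counter()
    for column in columns:
        for x, y in combinations(column, 2):
            pair_counts[tuple(sorted((x, y)))] += 1
            residue_counts[x] += 1
            residue_counts[y] += 1
    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())
    scores = {}
    for (x, y), n in pair_counts.items():
        observed = n / total_pairs
        px = residue_counts[x] / total_residues
        py = residue_counts[y] / total_residues
        # Chance expectation: p*p for identical pairs, 2*p*q for mixed pairs.
        expected = px * px if x == y else 2 * px * py
        scores[(x, y)] = math.log2(observed / expected)
    return scores

# Toy alignment of two sequences, four columns.
toy_scores = log_odds_scores(["KK", "KR", "LL", "LL"])
```

Pairs seen more often than chance get positive scores; pairs never observed in the toy data (such as K with L) get no score at all here, whereas a real matrix would assign them a negative value.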
What was the first substitution matrix and when was it developed?
PAM (point accepted mutation), developed in the 1970s.
Note: by Margaret Dayhoff
How was PAM developed?
Did a global alignment on closely related proteins and looked at the sequence differences between them - used this to derive a scoring matrix for how often, at each position, one amino acid mutated to another.
What type of proteins does PAM only work with?
Only works with closely related proteins.
What is the difference between PAM30 and PAM70?
The number denotes evolutionary distance: PAM70 reflects lower sequence similarity and a larger evolutionary distance than PAM30.
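Higher-numbered PAM matrices are obtained by extrapolating the PAM1 mutation probabilities, multiplying the PAM1 matrix by itself n times. The sketch below shows the effect with a made-up two-residue "alphabet"; the probabilities in `TOY_PAM1` are invented for illustration and are not Dayhoff's values. Real PAM scoring matrices are then derived as log-odds from such probabilities.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(power):
        result = mat_mul(result, m)
    return result

# Invented toy matrix: 1% chance that a residue changes per PAM unit.
TOY_PAM1 = [[0.99, 0.01],
            [0.01, 0.99]]

toy_pam30 = mat_pow(TOY_PAM1, 30)
toy_pam70 = mat_pow(TOY_PAM1, 70)
```

The diagonal of `toy_pam70` is smaller than that of `toy_pam30`: at a larger evolutionary distance a residue is less likely to still be unchanged, which is why the higher number goes with lower sequence similarity.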
What does BLOSUM stand for?
BLOcks SUbstitution Matrix
What kind of proteins does BLOSUM work with?
Evolutionarily divergent proteins.
Explain the basis of BLOSUM.
Does an MSA of evolutionarily divergent proteins and looks at their conserved regions (blocks). If regions are conserved, this normally means they are functionally important, so there is more pressure on amino acids that mutate there to maintain similar properties.
How does a BLOSUM62 matrix work?
Scores how likely it is that the two aligned amino acids substituted for one another. Summing these scores over every aligned pair gives a score for each alignment, so you can see which alignment is the best.
What does a positive BLOSUM62 score mean?
Conservative substitution - likely to happen
e.g. Lys/Arg = +2
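A minimal sketch of scoring an ungapped alignment by summing per-pair scores. Only a small excerpt of BLOSUM62 is included, and the dictionary layout, function names and example sequences are illustrative assumptions.

```python
# Small excerpt of BLOSUM62; keys are alphabetically sorted residue pairs.
BLOSUM62_EXCERPT = {
    ("K", "K"): 5, ("R", "R"): 5, ("K", "R"): 2,
    ("L", "L"): 4, ("I", "I"): 4, ("I", "L"): 2,
    ("D", "D"): 6, ("E", "E"): 5, ("D", "E"): 2,
    ("W", "W"): 11, ("G", "G"): 6, ("G", "W"): -2,
}

def pair_score(x, y):
    """Look up the score for one aligned pair, in either order."""
    return BLOSUM62_EXCERPT[tuple(sorted((x, y)))]

def alignment_score(a, b):
    """Sum of pair scores for two equal-length, ungapped aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(pair_score(x, y) for x, y in zip(a, b))

identical = alignment_score("KLDW", "KLDW")      # every position matches
conservative = alignment_score("KLDW", "RIEW")   # K/R, L/I, D/E substitutions
```

The conservative substitutions (Lys/Arg, Leu/Ile, Asp/Glu) still contribute positive scores, so `conservative` stays well above zero even though three of the four positions differ, while a pair such as Gly/Trp scores negatively.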