Bioinformatics 4 Flashcards
What are the benefits of predicting the protein fold?
It benefits medicine for drug design and biotechnology for design of novel enzymes.
What assesses these programs for protein fold prediction and how?
CASP (Critical Assessment of Techniques for Protein Structure Prediction)
Each program is given the sequence of a protein whose structure has been determined experimentally but not yet released, and its predicted structure is then compared with the real one.
How are the structures often predicted?
Through homology - similar sequences tend to fold in similar ways.
BLAST can only identify homologues with >40% identity. What other programs can be used to find homology?
PSI-BLAST and hidden Markov models (HMMs).
What program was developed by the Sternberg group at Imperial?
Phyre.
How does Phyre work?
Phyre first uses PSI-BLAST to search the 10 million known sequences for homologues of the query. This captures the mutational changes at each position in the protein and creates an evolutionary fingerprint.
The sequence of every known protein structure (65,000) is also run through PSI-BLAST, and an HMM is built from the results, giving an HMM database of all the sequences with a known structure.
Finally, because the query sequence has already been run through PSI-BLAST, an HMM is created for it as well. The query HMM is then compared to the HMM database of all known protein structures. When a good match is found, a 3D model is produced along with a confidence value.
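The "evolutionary fingerprint" idea can be illustrated with a much simpler object than an HMM: a table of residue frequencies for each alignment column. The sketch below is only an illustration under that assumption; it is not Phyre's code, and the function name `column_profile` and the toy sequences are made up for the example.

```python
from collections import Counter


def column_profile(aligned_seqs):
    """Return per-column residue frequencies for a list of aligned sequences.

    Gaps ("-") are ignored when computing the frequencies of a column.
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must all have the same aligned length")

    profile = []
    for column in zip(*aligned_seqs):
        residues = [r for r in column if r != "-"]
        counts = Counter(residues)
        total = len(residues)
        # An all-gap column yields an empty frequency table.
        profile.append({r: c / total for r, c in counts.items()} if total else {})
    return profile


# Hypothetical aligned hits for a four-residue query (toy data).
toy_hits = ["MKTA", "MRTA", "MKSA", "MK-A"]
profile = column_profile(toy_hits)
for position, freqs in enumerate(profile, start=1):
    print(position, freqs)
```

Columns where one residue dominates are conserved positions, while columns with several residues show where mutations are tolerated. A real profile HMM additionally models insertions and deletions, which this sketch leaves out.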
What is a phylogenetic tree?
A prediction of the ancestry of a protein.
What are the 3 main tree building algorithms?
Neighbour Joining
Maximum Parsimony
Maximum Likelihood
What do these trees identify?
Phylogenetic trees identify the closest related protein to the one you are working with.
What is the first step to building a tree? (common to all algorithms)
The first step to building a tree is to produce a multiple sequence alignment (MSA).
What are the 3 major categories of tree building methods and which algorithms do they include?
Distance based methods - neighbour joining.
Character based methods - Maximum Parsimony and Maximum Likelihood.
Bayesian - method similar to maximum likelihood.
How does a distance based method work?
Distance methods use an MSA to calculate the pairwise distance, that is, the number of changes between each pair of sequences in a group.
This creates a distance matrix which can be used to produce a phylogenetic tree.
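A minimal sketch of the distance-matrix step, assuming the simplest possible measure: the proportion of differing sites (p-distance) between each pair of aligned sequences, with gap columns skipped. The function names and the toy alignment are invented for illustration; real programs usually apply a substitution model to correct for multiple changes at the same site.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ("-") are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(msa):
    """Build a symmetric distance matrix (dict of dicts) from a name -> sequence MSA."""
    names = list(msa)
    matrix = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = p_distance(msa[n], msa[m])
        matrix[n][m] = matrix[m][n] = d
    return matrix


# Toy alignment (hypothetical sequences).
toy_msa = {
    "seqA": "MKTAYIAK",
    "seqB": "MKTAYLAK",
    "seqC": "MRTA-LSK",
}
dist = distance_matrix(toy_msa)
for name, row in dist.items():
    print(name, {other: round(d, 3) for other, d in row.items()})
```

Here seqA and seqB differ at one of eight sites, so they are the closest pair, and a distance method would join them first.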
What are the advantages of the Neighbour Joining method?
Fast and can handle many sequences.
Neighbour Joining does not assume an ultrametric tree. What is this?
An ultrametric tree is a special kind of additive tree in which the "tips", or terminal nodes, are all equidistant from the root. Ultrametric trees can thus depict evolutionary time.
What are the limitations of Neighbour Joining?
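The definition can be checked directly: sum the branch lengths from the root to every tip and see whether they are all equal. The sketch below assumes a tree stored as a dictionary mapping each internal node to a list of (child, branch length) pairs; that representation, the function names, and both toy trees are hypothetical.

```python
import math


def tip_depths(tree, node, depth=0.0):
    """Return {tip name: distance from the root} for the subtree under node."""
    children = tree.get(node, [])
    if not children:
        # A node with no children is a tip (terminal node).
        return {node: depth}
    depths = {}
    for child, length in children:
        depths.update(tip_depths(tree, child, depth + length))
    return depths


def is_ultrametric(tree, root, tol=1e-9):
    """True if every tip is the same distance from the root (within tol)."""
    depths = list(tip_depths(tree, root).values())
    return all(math.isclose(d, depths[0], abs_tol=tol) for d in depths)


# Every tip is 3.0 from the root, so this tree is ultrametric.
clock_like = {
    "root": [("n1", 1.0), ("C", 3.0)],
    "n1": [("A", 2.0), ("B", 2.0)],
}

# Tip B is only 1.5 from the root, so this tree is additive but not ultrametric.
uneven = {
    "root": [("n1", 1.0), ("C", 3.0)],
    "n1": [("A", 2.0), ("B", 0.5)],
}

print(is_ultrametric(clock_like, "root"))
print(is_ultrametric(uneven, "root"))
```

Neighbour Joining can return trees like the second one, where lineages have changed at different rates.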
Lacks any sort of tree search and optimality criterion and so there is no guarantee that the tree produced is the best fit for the data.