Bioinformatics 4 Flashcards

1
Q

What are the benefits of predicting the protein fold?

A

It benefits medicine for drug design and biotechnology for design of novel enzymes.

2
Q

What assesses protein fold prediction programs, and how?

A

CASP (Critical Assessment of Techniques for Protein Structure Prediction)
Each program is given the sequence of a protein whose structure has been solved experimentally but not yet released, and its prediction is then compared with the experimental structure.

3
Q

How are the structures often predicted?

A

Through homology - similar sequences tend to fold in similar ways.

4
Q

BLAST can only identify homologues with >40% identity. What other programs can be used to find homology?

A

PSI-BLAST and hidden Markov models (HMMs).

5
Q

What program was developed by the Sternberg group at Imperial College London?

A

Phyre.

6
Q

How does Phyre work?

A

Phyre first searches the 10 million known sequences for homologues of the query using PSI-BLAST. This captures the mutational changes at each position in the protein and creates an evolutionary fingerprint.
The sequence of every known protein structure (65,000) is also run through PSI-BLAST, and an HMM is built for each one, giving an HMM database of all known structures.
Finally, an HMM is built from the query's PSI-BLAST results and compared with the HMM database of known protein structures. When a good match is found, a 3D model is produced with a confidence value.

7
Q

What is a phylogenetic tree?

A

A prediction of the ancestry of a protein.

8
Q

What are the 3 main tree building algorithms?

A

Neighbour Joining
Maximum Parsimony
Maximum Likelihood

9
Q

What do these trees identify?

A

Phylogenetic trees identify the protein most closely related to the one you are working with.

10
Q

What is the first step to building a tree? (common to all algorithms)

A

The first step to building a tree is to produce a multiple sequence alignment (MSA).

11
Q

What are the 3 major categories of tree building methods and which algorithms do they include?

A

Distance based methods - neighbour joining.
Character based methods - Maximum Parsimony and Maximum Likelihood.
Bayesian methods - similar to Maximum Likelihood.

12
Q

How does a distance based method work?

A

Distance methods use an MSA to calculate the pairwise distance, that is, the number of changes between each pair of sequences in a group.
This creates a distance matrix which can be used to produce a phylogenetic tree.
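
The distance matrix step can be sketched in a few lines. This is a minimal illustration, assuming the simplest distance (the proportion of differing sites, often called the p-distance) and a made-up toy alignment; the names `p_distance`, `distance_matrix` and `toy_alignment` are hypothetical, and real tools usually apply a correction for multiple substitutions.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ("-") are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Build a symmetric pairwise distance matrix from {name: aligned sequence}."""
    names = list(alignment)
    matrix = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = p_distance(alignment[n], alignment[m])
        matrix[n][m] = matrix[m][n] = d
    return matrix


# Toy alignment invented for illustration only.
toy_alignment = {
    "seqA": "ACGTACGT",
    "seqB": "ACGTACGA",
    "seqC": "ACGAACTA",
}
matrix = distance_matrix(toy_alignment)
```

A distance-based method such as Neighbour Joining would take a matrix like `matrix` as its input.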

13
Q

What are the advantages of the Neighbour Joining method?

A

Fast and can handle many sequences.

14
Q

Neighbour Joining does not assume an ultrametric tree. What is this?

A

An ultrametric tree is a special kind of additive tree in which the "tips" or terminal nodes are equidistant from the root. Ultrametric trees can thus depict evolutionary time.
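
The "equidistant tips" property can be checked directly. The sketch below assumes a hypothetical tree representation (a dict mapping each internal node to a list of `(child, branch_length)` pairs) and two invented example trees; it is only meant to illustrate the definition.

```python
import math


def tip_depths(tree, node, depth=0.0):
    """Return {tip name: distance from the starting node} for every tip below node."""
    children = tree.get(node, [])
    if not children:
        # A node with no children is a tip (terminal node).
        return {node: depth}
    depths = {}
    for child, length in children:
        depths.update(tip_depths(tree, child, depth + length))
    return depths


def is_ultrametric(tree, root, tol=1e-9):
    """True if every tip is the same distance from the root (within tol)."""
    depths = list(tip_depths(tree, root).values())
    return all(math.isclose(d, depths[0], abs_tol=tol) for d in depths)


# Invented trees: in the first, tips A, B and C are all 3.0 from the root.
clock_like = {"root": [("n1", 1.0), ("C", 3.0)], "n1": [("A", 2.0), ("B", 2.0)]}
# In the second, tip B sits on a shorter branch, so the tips are not equidistant.
uneven = {"root": [("n1", 1.0), ("C", 3.0)], "n1": [("A", 2.0), ("B", 0.5)]}
```

Here `is_ultrametric(clock_like, "root")` holds and `is_ultrametric(uneven, "root")` does not. Neighbour Joining can return trees of the second kind.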

15
Q

What are the limitations of Neighbour Joining?

A

It lacks any tree search or optimality criterion, so there is no guarantee that the tree produced is the best fit for the data.

16
Q

Explain Maximum Parsimony method.

A

Builds the tree that requires the minimum number of mutations to explain the differences between the sequences.
It begins by performing an MSA and identifying the informative sites.
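
Counting the minimum number of mutations on a given tree can be sketched with Fitch's algorithm, a standard way to score one candidate tree under parsimony. This is a hedged illustration: it only scores trees that are supplied to it (it does not search for the best one), and the nested-tuple tree format, the names `fitch` and `parsimony_score`, and the toy alignment are assumptions made for the example.

```python
def fitch(tree, states):
    """Score one alignment column on a rooted binary tree.

    tree is either a tip name (str) or a pair (left, right).
    Returns (set of possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    lset, lcost = fitch(left, states)
    rset, rcost = fitch(right, states)
    shared = lset & rset
    if shared:
        # The children can agree on a state, so no extra change is needed here.
        return shared, lcost + rcost
    # The children disagree, so at least one more change is required.
    return lset | rset, lcost + rcost + 1


def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over every column of the alignment."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(length)
    )


# Toy data invented for illustration.
alignment = {"s1": "AAGT", "s2": "AAGA", "s3": "ACTA", "s4": "ACTC"}
tree_a = (("s1", "s2"), ("s3", "s4"))
tree_b = (("s1", "s3"), ("s2", "s4"))
score_a = parsimony_score(tree_a, alignment)
score_b = parsimony_score(tree_b, alignment)
```

On this toy data `tree_a` needs fewer changes than `tree_b`, so parsimony would prefer `tree_a`.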

17
Q

What is an informative site?

A

An informative site is one where there are at least two different kinds of nucleotides at the site, each of which is represented in at least two of the sequences under study.
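
The definition translates directly into a check on each alignment column. A minimal sketch, assuming gap characters ("-") are ignored and using a made-up alignment; `is_informative`, `informative_sites` and `toy` are hypothetical names, and the site indices are 0-based.

```python
from collections import Counter


def is_informative(column):
    """True if at least two different characters each occur in at least two sequences."""
    counts = Counter(c for c in column if c != "-")
    return sum(1 for n in counts.values() if n >= 2) >= 2


def informative_sites(sequences):
    """Return the 0-based indices of informative columns in a list of aligned sequences."""
    return [i for i, col in enumerate(zip(*sequences)) if is_informative(col)]


# Toy alignment invented for illustration.
toy = ["AAGT", "AAGA", "ACTA", "ACTC"]
sites = informative_sites(toy)
```

In `toy`, the first column is constant and the last column has only one character (A) that appears twice, so neither counts as informative.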

18
Q

Explain the Maximum Likelihood method.

A

Like Maximum Parsimony, it considers the possible trees, but it also uses a model of evolution in which different rates of mutation can be used.
For example, GAU -> UGU is in fact 2 changes, not 1; the method uses this prior knowledge.

19
Q

Why is Maximum Likelihood a more realistic tree estimation?

A

It does not assume equal mutation probabilities for all branches.

20
Q

What are the only sequences suitable for Maximum Parsimony?

A

Sequences which are very similar.

21
Q

What are the only sequences suitable for Maximum Likelihood?

A

Small numbers of sequences that are quite similar.