Bioinformatics 3 Flashcards

Question 1

Q

How do hidden Markov models (HMMs) expand on scoring matrices (BLOSUM) and profiles (PSSMs)?

Answer

A

They include Markov chains: the amino acid at one position can influence what comes next, so the probability of the next amino acid can be calculated from the previous one.
They also model gaps.
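
The Markov chain idea can be illustrated with a minimal sketch that estimates P(next residue | previous residue) by counting adjacent pairs. The function name and the toy sequences are assumptions made for illustration, not part of any HMM package.

```python
from collections import Counter, defaultdict

def transition_probabilities(sequences):
    """Estimate P(next residue | previous residue) from example sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count every adjacent pair (previous residue, next residue).
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    probs = {}
    for prev, nxt_counts in counts.items():
        total = sum(nxt_counts.values())
        probs[prev] = {nxt: n / total for nxt, n in nxt_counts.items()}
    return probs

# Toy sequences, invented for this example.
toy_sequences = ["ACAD", "ACCA", "ADAC"]
probs = transition_probabilities(toy_sequences)
print(probs["A"])  # probabilities of each residue following A
```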

Question 2

Q

How do you create an HMM?

Answer

A

Build an MSA of homologous sequences (using e.g. ClustalO).
At each alignment position there is a probability of a match, mismatch, deletion or insertion.
The HMM is then built by traversing the alignment and calculating the probability of each possible transition between alignment positions.
Each possible transition has a probability score.
The overall score is calculated by multiplying the transition scores together.
The overall score is then converted to an E-value.
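
The multiplication step can be sketched as follows. The state labels (M = match, I = insert, D = delete) and all probabilities are hypothetical values chosen for illustration. Emission probabilities and the conversion to an E-value are left out.

```python
import math

# Hypothetical transition probabilities between states; the numbers are made up.
TRANSITIONS = {
    ("M", "M"): 0.90, ("M", "I"): 0.05, ("M", "D"): 0.05,
    ("I", "M"): 0.70, ("I", "I"): 0.30,
    ("D", "M"): 0.80, ("D", "D"): 0.20,
}

def path_probability(states):
    """Multiply the transition probabilities along a path of states."""
    prob = 1.0
    for a, b in zip(states, states[1:]):
        prob *= TRANSITIONS[(a, b)]
    return prob

def path_log_probability(states):
    """Same score as a sum of logs, which avoids underflow on long paths."""
    return sum(math.log(TRANSITIONS[(a, b)]) for a, b in zip(states, states[1:]))

path = ["M", "M", "I", "M", "D", "M"]
p = path_probability(path)
log_p = path_log_probability(path)
print(p, log_p)
```

Working in log space is a common choice because the product of many small probabilities quickly becomes too small to represent.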

Question 3

Q

What program can be used to create and/or search HMM databases?

Question 4

Q

Which experimental methods can be used for structure determination?

Answer

A

X-ray crystallography
NMR
Cryo-EM

Question 5

Q

What is the aim of secondary structure prediction?

Answer

A

To identify local structure: alpha helix, beta sheet and random coil.

Question 6

Q

What is the name given to 3 state prediction (alpha, beta, coil)?

Question 7

Q

Why is it possible to predict structure from sequence?

Answer

A

Because, to a large extent, the local sequence is thought to determine the local structure.

Question 8

Q

Name a secondary structure prediction program.

Question 9

Q

How is the accuracy (Q3) of secondary structure prediction calculated?

Answer

A

Accuracy = (no. of residues correctly predicted)/(total no. of residues)
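
The formula above can be sketched in a few lines. The one-letter state codes (H = helix, E = strand, C = coil) and the example strings are assumptions made for illustration.

```python
def q3(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)

# Invented example: 8 of the 10 residues are predicted correctly.
observed = "HHHHEEEECC"
predicted = "HHHCEEEECH"
score = q3(predicted, observed)
print(score)
```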

Question 10

Q

What range of values can Q3 take?

Answer

A

Q3 takes a value between 0 and 1.

Question 11

Q

What does Q3=1 indicate?

Answer

A

A perfect prediction.

Question 12

Q

What does the Q3 result of a random prediction depend on?

Answer

A

The percentage of residues in each of the different states.

E.g. with equal amounts in each state, Q3 = 33%.
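
As a rough sketch of this dependence, assume the random predictor draws each state with the same frequency as it occurs in the true structure. Under that assumption (which the card does not state), the expected Q3 is the sum of the squared state fractions. The function name and the skewed composition are invented for illustration.

```python
def expected_random_q3(state_fractions):
    """Expected Q3 when states are guessed at their true frequencies."""
    total = sum(state_fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("state fractions must sum to 1")
    # A residue in state s (probability f) is guessed as s with probability f.
    return sum(f * f for f in state_fractions.values())

equal = expected_random_q3({"H": 1 / 3, "E": 1 / 3, "C": 1 / 3})
skewed = expected_random_q3({"H": 0.3, "E": 0.2, "C": 0.5})
print(equal, skewed)
```

With equal fractions this gives about 0.33, matching the example above. A coil-rich composition gives a higher baseline.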

Question 13

Q

Why is Q3=1 unrealistic?

Answer

A

Because secondary structure assignment from a protein structure is itself uncertain by up to about 10%, so the best realistic Q3 is about 0.9.

Question 14

Q

What is the similarity between Jpred software and SNP prediction software?

Answer

A

Both of their algorithms are trained.
Jpred - trained on sequences with known structures.
SNP prediction software - trained on sequences with known SNPs.

Question 15

Q

What does the Jpred algorithm use to predict secondary structure?

Answer

A

It uses PSI-BLAST, an MSA and an HMM.

Question 16

Q

How is the Jpred algorithm refined?

Answer

A

Known sequences are analysed multiple times, and the algorithm is modified each time to find the best prediction method.

Question 17

Q

Explain the Jpred algorithm.

Answer

A

The query is searched against the UniProt database with PSI-BLAST for 3 iterations.
The resulting alignment is used to generate the following parameters:
PSI-BLAST profile frequency
PSI-BLAST PSSM
MSA - scored using BLOSUM62
HMM from the aligned sequences.
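
To give a feel for what a frequency profile derived from an alignment looks like, here is a minimal sketch that computes per-column residue frequencies. It is not Jpred's or PSI-BLAST's code. The function name and the toy alignment are assumptions, and a real PSSM also involves background frequencies and pseudocounts, which are omitted here.

```python
from collections import Counter

def column_frequencies(alignment):
    """Per-column residue frequencies of an alignment, ignoring gaps ('-')."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        counts = Counter(residues)
        total = len(residues)
        # An all-gap column yields an empty dictionary.
        profile.append({r: n / total for r, n in counts.items()} if total else {})
    return profile

# Toy alignment, invented for this example.
toy_alignment = ["ACD-", "ACE-", "GCDK"]
profile = column_frequencies(toy_alignment)
print(profile[0])  # residue frequencies in the first column
```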

Question 18

Q

The alignment produced by PSI-BLAST in the Jpred algorithm is modified by post-processing, what does this mean?

Answer

A

The gaps in the query and aligned sequences are removed.
This improves prediction accuracy because regions with gaps in the query sequence are most likely in the coil state, and this state has no effect on the prediction.
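
One reading of this step is that every alignment column in which the query has a gap is dropped from all sequences. The sketch below implements that reading only. The function name and the toy alignment are assumptions, and the exact rule Jpred applies may differ.

```python
def drop_query_gap_columns(query, others):
    """Remove every alignment column in which the query has a gap ('-')."""
    keep = [i for i, residue in enumerate(query) if residue != "-"]

    def trim(seq):
        # Keep only the columns where the query has a residue.
        return "".join(seq[i] for i in keep)

    return trim(query), [trim(seq) for seq in others]

# Toy alignment, invented for this example.
query = "AC-DE-F"
others = ["ACGDEKF", "A-GDE-F"]
new_query, new_others = drop_query_gap_columns(query, others)
print(new_query, new_others)
```

After trimming, every remaining column corresponds to one residue of the query, so a prediction can be reported per query residue.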

Question 19

Q

What are solvent accessibility predictions?

Answer

A

Predictions of the extent to which the van der Waals surface of each amino acid residue is exposed to the solvent surrounding the protein.

(19 cards)