Bioinformatics 3 Flashcards

1
Q

How do hidden Markov models (HMMs) expand on scoring matrices (BLOSUM) and profiles (PSSMs)?

A

They include Markov chains: the amino acid at one position can influence what comes next, so the probability of the next amino acid can be calculated from the previous one.
They also introduce gaps (insertions and deletions) into the model.

2
Q

How do you create an HMM?

A

Build an MSA of homologous sequences (using e.g. ClustalO).
At each alignment position there is a probability for a match, mismatch, deletion or insertion.
The HMM is then built by traversing the alignment and calculating the probability of each possible transition between alignment positions.
Each possible transition has a probability score.
The overall score of a sequence is calculated by multiplying the transition scores along its path together.
The overall score is then converted to an E-value.
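
The multiplication step can be illustrated with a minimal sketch. The state names (M = match, I = insertion, D = deletion) and every probability value below are hypothetical, chosen only for illustration; emission probabilities and the conversion to an E-value are left out.

```python
import math

# Hypothetical transition probabilities between alignment states
# (M = match, I = insertion, D = deletion). Values are illustrative only.
transitions = {
    ("M", "M"): 0.90, ("M", "I"): 0.05, ("M", "D"): 0.05,
    ("I", "M"): 0.70, ("I", "I"): 0.30,
    ("D", "M"): 0.80, ("D", "D"): 0.20,
}

def path_probability(path):
    """Multiply the transition probabilities along a path of states."""
    prob = 1.0
    for prev, nxt in zip(path, path[1:]):
        prob *= transitions[(prev, nxt)]
    return prob

def path_log_probability(path):
    """Same product, computed as a sum of logs to avoid numeric underflow."""
    return sum(math.log(transitions[(a, b)]) for a, b in zip(path, path[1:]))

path = ["M", "M", "I", "M", "D", "M"]
p = path_probability(path)          # 0.9 * 0.05 * 0.7 * 0.05 * 0.8
log_p = path_log_probability(path)  # log of the same product
```

Summing logarithms instead of multiplying raw probabilities is a common choice for long sequences, because the product of many small numbers quickly becomes too small to represent.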

3
Q

What program can be used to create and/or search HMM databases?

A

HMMER.

4
Q

Which methods can be used for experimental structure determination?

A

X-ray crystallography
NMR
Cryo-EM.

5
Q

What is the aim of secondary structure prediction?

A

To identify local structure: alpha helix, beta sheet and random coil.

6
Q

What is the name given to three-state prediction (alpha, beta, coil)?

A

Q3

7
Q

Why is it possible to predict structure from sequence?

A

Because, in theory, the local sequence determines the local structure to a large extent.

8
Q

Name a secondary structure prediction program.

A

Jpred

9
Q

How is the accuracy (Q3) of secondary structure prediction calculated?

A

Accuracy = (no. of residues correctly predicted)/(total no. of residues)
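
As a minimal sketch of this formula, the function below compares a predicted and an observed three-state string residue by residue. The function name `q3` and the example strings (H = helix, E = strand, C = coil) are made up for illustration.

```python
def q3(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)

# Hypothetical 16-residue example: two residues are predicted wrongly.
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEECCC"
score = q3(predicted, observed)  # 14 correct out of 16 residues
```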

10
Q

What range of values can Q3 take?

A

Q3 is given a value between 0 and 1.

11
Q

What does Q3=1 indicate?

A

A perfect prediction.

12
Q

What does the Q3 result of a random prediction depend on?

A

The percentage of residues in each of the different states.

e.g. if the three states occur in equal amounts, a random prediction gives Q3 = 33%.
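
The dependence on state composition can be sketched under one stated assumption: the random predictor assigns each state with the same frequency at which that state really occurs. In that case the expected Q3 is the sum of the squared state fractions. The function name and the unequal example fractions are hypothetical.

```python
def expected_random_q3(fractions):
    """Expected Q3 of a random predictor that guesses each state with the
    same frequency at which it occurs (an assumption of this sketch)."""
    total = sum(fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("state fractions must sum to 1")
    # A residue is correct when the true state and the guess coincide,
    # which for state s happens with probability f_s * f_s.
    return sum(f * f for f in fractions.values())

equal = expected_random_q3({"H": 1 / 3, "E": 1 / 3, "C": 1 / 3})
skewed = expected_random_q3({"H": 0.3, "E": 0.2, "C": 0.5})
```

With equal fractions the result is one third, matching the 33% quoted above; with the skewed fractions the baseline is higher because guessing the common state is right more often.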

13
Q

Why is Q3=1 unrealistic?

A

Because secondary structure assignment in a protein structure is itself uncertain by up to about 10%, the best achievable Q3 in practice is about 0.9.

14
Q

What is the similarity between Jpred software and SNP prediction software?

A

Both use trained algorithms.
Jpred: trained on sequences with known structures.
SNP prediction software: trained on sequences with known SNPs.

15
Q

What does the Jpred algorithm use to predict secondary structure?

A

It uses PSI-BLAST, an MSA and an HMM.

16
Q

How is the Jpred algorithm refined?

A

Sequences with known structures are analysed multiple times, and the algorithm is modified each time to find the best prediction method.

17
Q

Explain the Jpred algorithm.

A

The query is searched against the UniProt database with PSI-BLAST for 3 iterations.
The resulting alignment generates the following parameters:
PSI-BLAST profile frequency
PSI-BLAST PSSM
MSA, scored using BLOSUM62
HMM built from the aligned sequences.

18
Q

The alignment produced by PSI-BLAST in the Jpred algorithm is modified by post-processing. What does this mean?

A

The gaps in the query and aligned sequences are removed.
This improves prediction accuracy because regions with gaps in the query sequence are most likely in the coil state, and this state has no effect on the prediction.
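
One way to picture this post-processing is the sketch below, which assumes that "removing gaps" means dropping every alignment column where the query has a gap. That reading, the function name and the toy sequences are assumptions made for illustration, not Jpred's actual code.

```python
def remove_query_gap_columns(query, aligned, gap="-"):
    """Drop every alignment column in which the query has a gap character."""
    if any(len(seq) != len(query) for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")
    # Column indices where the query has a real residue.
    keep = [i for i, ch in enumerate(query) if ch != gap]

    def strip(seq):
        return "".join(seq[i] for i in keep)

    return strip(query), [strip(seq) for seq in aligned]

# Hypothetical toy alignment; the query has gaps in three columns.
query = "MK-TAY--IAK"
aligned = ["MKQTAYGGIAK", "MR-TAF-SIGK"]
new_query, new_aligned = remove_query_gap_columns(query, aligned)
```

After this step every remaining column corresponds to exactly one query residue, so each residue of the query gets one prediction.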

19
Q

What are solvent accessibility predictions?

A

Predictions of the extent of the van der Waals surface of each amino acid residue that is exposed to the solvent surrounding the protein.