ab initio structure prediction Flashcards

Question 1

Q

ab initio methods

Answer

A

template-free
no template available or can’t be found
3 methods:
- all atom molecular dynamics
  - simulate structure as it folds
- fragment approach
- contact prediction from multiple sequences

Question 2

Q

rosetta

Answer

A

fragment approach
match query sequence to small sections of proteins
- 9 residue segments
- template based search algorithm
fit sections together into 3D structure
- build up overlapping fragments with predicted structure
- create trial model
if structure doesn’t work remodel it
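
A minimal sketch of the windowing step only, assuming fragments are taken as every overlapping 9-residue window along the query. The function name and the example sequence are hypothetical, and the lookup of each window against known protein structures is not shown.

```python
def overlapping_windows(sequence, size=9):
    """Return (start_index, window) pairs for every overlapping window."""
    return [(i, sequence[i:i + size])
            for i in range(len(sequence) - size + 1)]


# Hypothetical 16-residue query sequence, for illustration only.
query = "MKTAYIAKQRQISFVK"
windows = overlapping_windows(query)

print(len(windows))   # number of 9-residue windows in the query
print(windows[0])     # first window and its start position
```

Each window would then be matched to short structural segments, and the overlaps between neighbouring windows are what let the pieces be assembled into one trial model.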

Question 3

Q

rosetta

identification of good trial structures

Answer

A

initially low resolution energy function
side chains represented by single centroid pseudoatom
major contributions:
- hydrophobic burial, beta strand pairing, steric overlap, specific residue interactions
form a coarse structure
refine with rotamer based side chain representations
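
A small sketch of the centroid idea, assuming the pseudoatom is placed at the unweighted mean position of the side-chain atoms. The coordinates are made up, and the actual Rosetta definition may differ in detail.

```python
def centroid(atom_coords):
    """Unweighted mean position of a list of (x, y, z) atom coordinates."""
    n = len(atom_coords)
    return tuple(sum(atom[k] for atom in atom_coords) / n for k in range(3))


# Hypothetical side-chain atom positions (arbitrary units).
side_chain = [(1.0, 0.0, 0.0), (3.0, 2.0, 0.0), (2.0, 1.0, 3.0)]
pseudoatom = centroid(side_chain)
print(pseudoatom)
```

Replacing every side chain by one point like this is what makes the initial energy function cheap enough to score many trial structures before the rotamer-based refinement.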

Question 4

Q

rosetta

model choosing

Answer

A

many possible models created in ab initio
can choose most popular:
- calculate distance between each model
- correct one has largest number of similar structures with smaller distances
- some aspects usually correct but large regions can be wrong
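
The "most popular" choice can be sketched as a neighbour count, assuming a pairwise distance matrix between models (for example RMSD) has already been calculated. The matrix values, the cutoff, and the function name are illustrative assumptions.

```python
def most_popular_model(dist, cutoff):
    """Return the index of the model with the most neighbours within cutoff,
    together with the neighbour count of every model."""
    counts = [
        sum(1 for j, d in enumerate(row) if j != i and d <= cutoff)
        for i, row in enumerate(dist)
    ]
    best = max(range(len(dist)), key=lambda i: counts[i])
    return best, counts


# Hypothetical symmetric distance matrix for four models.
dist = [
    [0.0, 1.0, 2.5, 8.0],
    [1.0, 0.0, 1.2, 7.5],
    [2.5, 1.2, 0.0, 9.0],
    [8.0, 7.5, 9.0, 0.0],
]
best, counts = most_popular_model(dist, cutoff=2.0)
print(best, counts)
```

With ties, `max` keeps the first model it meets, so a real selection step would need an explicit tie-break.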

Question 5

Q

contact prediction

Answer

A

create contact map of inter-residue distances
x or 1 where 2 atoms closer than set cutoff distance
better if using MSA to find common pattern of complementary changes
- many sequences needed for strong signal
- aligning thousands of sequences helps remove noise
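
A minimal sketch of the contact map itself, assuming one representative coordinate per residue and a cutoff of 8.0 (the notes only say "set cutoff distance", so the value is an assumption). The coordinates are invented.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Binary matrix: 1 where two residues are closer than cutoff, else 0."""
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) < cutoff else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical residues spaced 3.8 units apart along a straight line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(coords)
for row in cmap:
    print(row)
```

In prediction the coordinates are unknown, so the map has to be inferred from complementary changes in the MSA. This sketch only shows what the target representation looks like.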

Question 6

Q

contact prediction

anti-parallel beta strand

Answer

A

chains connected giving leading diagonal
if e.g. residue 1 and 12 are close there is an indication of 3D space
- produces off diagonal (feature of 3D space)
use these terms to build up structure
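
An idealised sketch of this pattern, assuming a 12-residue hairpin in which residue 1 pairs with 12, 2 with 11, and so on (0-based indices in the code). Function names and the minimum sequence separation are assumptions. Real maps are noisier.

```python
def long_range_contacts(cmap, min_separation=4):
    """Contacts (i, j) with j - i >= min_separation, i.e. off the main diagonal."""
    n = len(cmap)
    return [(i, j)
            for i in range(n)
            for j in range(i + min_separation, n)
            if cmap[i][j]]


def looks_antiparallel(pairs):
    """True if all pairs share the same i + j, so they lie on one line
    running perpendicular to the leading diagonal."""
    return len(pairs) > 1 and len({i + j for i, j in pairs}) == 1


# Idealised hairpin: chain neighbours give the leading diagonal,
# strand pairing gives contacts where i + j == n - 1.
n = 12
cmap = [[1 if abs(i - j) <= 1 or i + j == n - 1 else 0 for j in range(n)]
        for i in range(n)]

pairs = long_range_contacts(cmap)
print(pairs)
print(looks_antiparallel(pairs))
```

A parallel pairing would instead keep j - i constant, giving a line parallel to the leading diagonal.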

Question 7

Q

contact prediction

limitations

Answer

A

limited to proteins with large numbers of homologues
- >5000 ideally
- often means template is available anyway
can still be useful for solving parts of a structure
should improve as databases grow
- good for membrane proteins (difficult to crystallise)

Question 8

Q

future of structure prediction

Answer

A

increasing number of determined structures
- more templates
- more fragments for ab initio
increasing number of sequences
- better MSAs and profiles
- better information capture for template based modelling
- increasing viability of contact prediction

Question 9

Q

empirical prediction algorithms

Answer

A

establish training dataset
- e.g. sequences with known structures
ensure no duplicates due to homology
- non-redundant set, prevents bias
learn rules and parameters
evaluate on testing set not used in training
- no homology important

Question 10

Q

jack-knifing

Answer

A

cross-validation
split database into training and testing sets
start with set of non-homologous data
- take out one/several to form testing set
- learn on rest and evaluate test data
- repeat with different test data
get mean and variation of accuracy
- statistical analysis to compare methods (t test)
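
The loop above can be sketched as follows, assuming the data set is already non-homologous. The majority-class "learner", the labels, and all function names are toy stand-ins, not a real prediction method.

```python
from collections import Counter
from statistics import mean, stdev


def jackknife(data, train, evaluate, test_size=1):
    """Hold out each block of test_size items in turn, train on the rest,
    and return the mean and standard deviation of the test scores."""
    scores = []
    for start in range(0, len(data), test_size):
        test = data[start:start + test_size]
        rest = data[:start] + data[start + test_size:]
        model = train(rest)
        scores.append(evaluate(model, test))
    return mean(scores), stdev(scores)


def train_majority(examples):
    """Toy learner: always predict the most common label in the training set."""
    return Counter(label for _, label in examples).most_common(1)[0][0]


def accuracy(model, examples):
    """Fraction of test examples whose label equals the predicted label."""
    return sum(1 for _, label in examples if label == model) / len(examples)


# Hypothetical (protein id, class label) pairs.
data = [("p1", "H"), ("p2", "H"), ("p3", "E"), ("p4", "H"), ("p5", "H")]
avg, spread = jackknife(data, train_majority, accuracy)
print(avg, spread)
```

The per-fold scores from two methods run on the same splits are what a t test would then compare.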

Question 11

Q

pitfalls of jack-knifing

Answer

A

multiple methods available
with proteins, testing and training data often have homology
more difficult to detect bias as algorithms become more complex
- best to test on new data not available during development


(11 cards)