ab initio structure prediction Flashcards

1
Q

ab initio methods

A
  • template-free
  • no template available or can’t be found
  • 3 methods:
    • all atom molecular dynamics
      • simulate structure as it folds
    • fragment approach
    • contact prediction from multiple sequences
2
Q

rosetta

A
  • fragment approach
  • match query sequence to small sections of proteins
    • 9 residue segments
    • template based search algorithm
  • fit sections together into 3D structure
    • build up overlapping fragments with predicted structure
    • create trial model
  • if structure doesn’t work remodel it
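The first step of the fragment approach, cutting the query into overlapping 9-residue windows, can be sketched as follows. This is only an illustration: the function name `overlapping_fragments` and the example sequence are hypothetical and not part of Rosetta, which goes on to match each window against a library of fragments from known structures.

```python
def overlapping_fragments(sequence, size=9):
    """Return (start_index, fragment) pairs for every window of `size` residues."""
    if len(sequence) < size:
        return []
    return [(i, sequence[i:i + size]) for i in range(len(sequence) - size + 1)]


# Hypothetical 16-residue query sequence, used only for illustration.
query = "MKTAYIAKQRQISFVK"
fragments = overlapping_fragments(query)

for start, frag in fragments:
    print(start, frag)
```

Each window overlaps its neighbour by 8 residues, which is what lets predicted fragment structures be stitched into a continuous trial model.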
3
Q

rosetta

identification of good trial structures

A
  • initially low resolution energy function
  • side chains represented by single centroid pseudoatom
  • major contributions:
    • hydrophobic burial, beta strand pairing, steric overlap, specific residue interactions
  • form a coarse structure
  • refine with rotamer based side chain representations
4
Q

rosetta

model choosing

A
  • many candidate models are generated in ab initio prediction
  • can choose most popular:
    • calculate distance between each model
    • correct one has largest number of similar structures with smaller distances
    • some aspects usually correct but large regions can be wrong
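Choosing the "most popular" model can be sketched as counting, for each model, how many other models lie within a distance cutoff. This is a minimal sketch that assumes the pairwise distances (for example RMSD values) have already been computed. The function name `most_popular_model`, the distance values and the cutoff are all hypothetical.

```python
def most_popular_model(distances, cutoff):
    """Return (index, neighbour_count) of the model with the most neighbours.

    `distances` is a symmetric matrix (list of lists) of pairwise distances
    between models; a neighbour is any other model at distance <= cutoff.
    """
    best_index, best_count = -1, -1
    for i, row in enumerate(distances):
        count = sum(1 for j, d in enumerate(row) if j != i and d <= cutoff)
        if count > best_count:
            best_index, best_count = i, count
    return best_index, best_count


# Hypothetical pairwise distances for four trial models.
distances = [
    [0.0, 1.5, 2.0, 9.0],
    [1.5, 0.0, 1.8, 8.5],
    [2.0, 1.8, 0.0, 9.5],
    [9.0, 8.5, 9.5, 0.0],
]
centre, neighbours = most_popular_model(distances, cutoff=1.9)
print(centre, neighbours)
```

Ties are resolved in favour of the first model encountered, which is an arbitrary choice made for this sketch.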
5
Q

contact prediction

A
  • create contact map of inter-residue distances
  • mark x or 1 in the map where 2 atoms are closer than a set cutoff distance
  • better if using MSA to find common pattern of complementary changes
    • many sequences needed for strong signal
    • aligning thousands of sequences helps remove noise
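A contact map built from known coordinates can be sketched as below. This shows only the map itself, not the prediction of contacts from an MSA. The function name `contact_map`, the 8.0 cutoff and the toy coordinates (one representative atom per residue, spaced 3.8 units apart on a line) are assumptions made for illustration.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Return an n x n matrix with 1 where two points are closer than `cutoff`."""
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) < cutoff else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical coordinates: four residues along a straight line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(coords)

for row in cmap:
    print("".join("x" if c else "." for c in row))
```

The map is symmetric and its diagonal is always filled, since every residue is at distance zero from itself.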
6
Q

contact prediction

anti-parallel beta strand

A
  • residues adjacent in the chain are connected, giving the leading diagonal
  • if e.g. residues 1 and 12 are close, this indicates proximity in 3D space
    • produces off diagonal (feature of 3D space)
  • use these terms to build up structure
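The pattern described above can be drawn for an idealised 12-residue hairpin in which residue 1 pairs with residue 12, residue 2 with residue 11, and so on. This is a toy illustration, not a prediction method. The function name `hairpin_contacts` and the perfect pairing scheme are assumptions.

```python
def hairpin_contacts(n):
    """Contact map for an idealised hairpin of n residues (0-based indices).

    Contacts: each residue with itself and its chain neighbour (leading
    diagonal), plus residue i with residue n-1-i (anti-parallel pairing).
    """
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        cmap[i][i] = 1
        if i + 1 < n:
            cmap[i][i + 1] = cmap[i + 1][i] = 1
        cmap[i][n - 1 - i] = 1
    return cmap


hairpin = hairpin_contacts(12)
for row in hairpin:
    print("".join("x" if c else "." for c in row))
```

In the printed map the anti-parallel pairing appears as a line perpendicular to the leading diagonal.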
7
Q

contact prediction

limitations

A
  • limited to proteins with large numbers of homologues
    • >5000 ideally
    • often means template is available anyway
  • can still be useful for solving parts of a structure
  • should improve as databases grow
    • good for membrane proteins (difficult to crystallise)
8
Q

future of structure prediction

A
  • increasing number of determined structures
    • more templates
    • more fragments for ab initio
  • increasing number of sequences
    • better MSAs and profiles
    • better information capture for template based modelling
    • increasing viability of contact prediction
9
Q

empirical prediction algorithms

A
  • establish training dataset
    • e.g. sequences with known structures
  • ensure no duplicates due to homology
    • non-redundant set, prevents bias
  • learn rules and parameters
  • evaluate on testing set not used in training
    • no homology to the training set is important
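Building a non-redundant set can be sketched as a greedy filter that keeps a sequence only if it is not too similar to any sequence already kept. This sketch assumes the sequences are already aligned to equal length and uses simple percent identity as a stand-in for homology detection. The function names and the 0.3 threshold are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def non_redundant(sequences, max_identity=0.3):
    """Greedily keep sequences whose identity to every kept one is <= max_identity."""
    kept = []
    for seq in sequences:
        if all(identity(seq, k) <= max_identity for k in kept):
            kept.append(seq)
    return kept


# Hypothetical toy sequences; the second is a near-duplicate of the first.
seqs = ["ACDEFGHIKL", "ACDEFGHIKV", "MNPQRSTVWY"]
subset = non_redundant(seqs)
print(subset)
```

The same filter can be applied between training and testing sets so that no test sequence resembles a training sequence.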
10
Q

jack-knifing

A
  • cross-validation
  • split database into training and testing sets
  • start with set of non-homologous data
    • take out one/several to form testing set
    • learn on rest and evaluate test data
    • repeat with different test data
  • get mean and variation of accuracy
    • statistical analysis to compare methods (t test)
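The hold-out, train, evaluate and repeat loop above can be sketched as follows. The predictor is a deliberately trivial threshold classifier and the data are invented. The names `jackknife`, `train_threshold` and `accuracy` are hypothetical.

```python
from statistics import mean, stdev


def jackknife(data, train, evaluate, n_folds):
    """Hold out each fold in turn, train on the rest, score the held-out fold."""
    scores = []
    for k in range(n_folds):
        test = data[k::n_folds]
        training = [x for i, x in enumerate(data) if i % n_folds != k]
        scores.append(evaluate(train(training), test))
    return mean(scores), stdev(scores)


def train_threshold(training):
    """Toy 'learning': threshold halfway between the two class means."""
    neg = [v for v, label in training if label == 0]
    pos = [v for v, label in training if label == 1]
    return (mean(neg) + mean(pos)) / 2


def accuracy(threshold, test):
    """Fraction of test items whose predicted class matches the label."""
    return sum((v > threshold) == bool(label) for v, label in test) / len(test)


# Hypothetical (value, label) pairs standing in for a non-homologous data set.
data = [(0.1, 0), (0.9, 1), (0.2, 0), (0.8, 1),
        (0.3, 0), (0.7, 1), (0.4, 0), (0.6, 1)]
mean_acc, sd_acc = jackknife(data, train_threshold, accuracy, n_folds=4)
print(mean_acc, sd_acc)
```

The per-fold scores collected inside `jackknife` are what a t test would use when comparing two methods on the same folds.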
11
Q

pitfalls of jack-knifing

A
  • multiple methods available
  • with proteins, testing and training data often have homology
  • more difficult to detect bias as algorithms become more complex
    • best to test on new data not available during development