protein structure prediction Flashcards
1
Q
motivation for structure prediction
A
- inform about function
- guide rational drug design
- mutagenesis
- solve structures from experimental data
- fundamental understanding of chemistry of protein structure
2
Q
CASP
A
- critical assessment of protein structure prediction
- blind trial to evaluate different approaches
- sequences sent to predictors prior to revealing experimental coordinates
- manual evaluation every 2 years
- combined with server-only predictions
3
Q
ab initio energy calculations
A
- original idea to describe interactions between atoms
- search for conformation of lowest energy
- energy minimisation methods, followed by molecular dynamics
- from first principles
- energy function needed first
4
Q
energy function
A
- potential energy of a protein in a particular conformation
- V = bond length + bond angle + bond dihedral rotation + VDW + electrostatic interactions
- molecular dynamics adds water molecules
- energy minimisation adds ad hoc terms for hydrophobicity
- or works in vacuo
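A minimal sketch of how the terms in V could be summed, assuming the common textbook functional forms (harmonic bonds and angles, a cosine dihedral term, Lennard-Jones for VDW, Coulomb for electrostatics). The card does not name a specific force field, so every function name, parameter and unit here is an assumption for illustration only.

```python
import math

# Assumed conversion factor: charges in units of e, distances in angstroms,
# energies in kcal/mol.
COULOMB_CONSTANT = 332.06


def bond_energy(r, r0, k):
    """Harmonic penalty for stretching a bond away from its ideal length r0."""
    return 0.5 * k * (r - r0) ** 2


def angle_energy(theta, theta0, k):
    """Harmonic penalty for bending a bond angle away from theta0 (radians)."""
    return 0.5 * k * (theta - theta0) ** 2


def dihedral_energy(phi, k, n, delta=0.0):
    """Periodic term for rotation about a bond (phi in radians)."""
    return k * (1.0 + math.cos(n * phi - delta))


def lennard_jones(r, epsilon, sigma):
    """VDW term: short-range repulsion plus weak attraction."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def coulomb(r, q1, q2, dielectric=1.0):
    """Electrostatic term between two point charges."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


def potential_energy(bonds, angles, dihedrals, pairs):
    """Sum all terms. Each argument is a list of parameter tuples."""
    v = sum(bond_energy(*b) for b in bonds)
    v += sum(angle_energy(*a) for a in angles)
    v += sum(dihedral_energy(*d) for d in dihedrals)
    for r, epsilon, sigma, q1, q2 in pairs:
        v += lennard_jones(r, epsilon, sigma) + coulomb(r, q1, q2)
    return v
```

In this sketch the hydrophobicity and solvent terms mentioned above are left out, so it corresponds to the in vacuo case.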
5
Q
energy minimisation
A
- x, y, z obtained for each atom
- calculate energy
- make small positional changes to find path to lowest energy conformation (deltaG is minimal)
- some success with small proteins
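The loop described on this card can be sketched on a toy system. The example below assumes three atoms that interact only through a Lennard-Jones term, and it accepts a small random shift only when the energy drops. The function names, step size and starting coordinates are made up for illustration; real minimisers use gradients and a full energy function.

```python
import math
import random


def pair_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy for one pair of atoms (toy parameters)."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def total_energy(coords):
    """Sum the pair energy over every pair of atoms."""
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            energy += pair_energy(math.dist(coords[i], coords[j]))
    return energy


def minimise(coords, steps=5000, max_shift=0.05, seed=0):
    """Try small random shifts of one atom; keep only downhill moves."""
    rng = random.Random(seed)
    coords = [list(atom) for atom in coords]
    energy = total_energy(coords)
    for _ in range(steps):
        i = rng.randrange(len(coords))
        old = coords[i][:]
        coords[i] = [c + rng.uniform(-max_shift, max_shift) for c in old]
        new_energy = total_energy(coords)
        if new_energy < energy:
            energy = new_energy   # downhill: keep the move
        else:
            coords[i] = old       # uphill or flat: undo the move
    return coords, energy


start = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]
final, final_energy = minimise(start)
print(total_energy(start), final_energy)
```

Because only downhill moves are accepted, this kind of search cannot climb out of a well once it is inside one.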
6
Q
issues with energy minimisation
A
- can get stuck in local minimum
- think it is the lowest point but there is a global minimum
- just can’t get there
- solve with molecular dynamics
- simulate protein as moving object
- has momentum to overcome energy barriers
- energy landscape is difficult to define
- unsure if you are going up or down
- energy terms are difficult to define - calculation can be wrong
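The local-minimum problem can be illustrated with a made-up one-dimensional energy curve that has a shallow well near x = 0.96 and a deeper one near x = -1.04. This is a toy stand-in, not a protein energy function: the curve, step sizes and start points are all assumptions. Downhill-only minimisation ends in whichever well it starts in, while a particle that carries momentum (simple velocity Verlet dynamics, unit mass, no thermostat) crosses the barrier between the wells.

```python
def energy(x):
    """Toy double-well curve: local minimum on the right, global on the left."""
    return x ** 4 - 2.0 * x ** 2 + 0.3 * x


def gradient(x):
    """Derivative of the toy energy with respect to x."""
    return 4.0 * x ** 3 - 4.0 * x + 0.3


def descend(x, steps=5000, rate=0.01):
    """Plain downhill minimisation: always step against the gradient."""
    for _ in range(steps):
        x -= rate * gradient(x)
    return x


def simulate(x, steps=2000, dt=0.01):
    """Velocity Verlet dynamics for a unit-mass particle starting at rest."""
    v = 0.0
    force = -gradient(x)
    trajectory = [x]
    for _ in range(steps):
        x += v * dt + 0.5 * force * dt * dt
        new_force = -gradient(x)
        v += 0.5 * (force + new_force) * dt
        force = new_force
        trajectory.append(x)
    return trajectory


x_trapped = descend(1.5)     # starts in the right-hand well and stays there
x_global = descend(-1.5)     # starts in the left-hand well
visited = simulate(1.8)      # same side as x_trapped, but with momentum
best_visited = min(visited, key=energy)
print(x_trapped, x_global, best_visited)
```

The dynamics run does not settle anywhere by itself; in this sketch the lowest-energy point visited along the trajectory is simply recorded.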
7
Q
secondary structure prediction
A
- identify local structures
- alpha, beta, coil, sometimes turn
- 3 or 4 state prediction
- determines local 3D structure to an extent
- doesn’t work with 7 residue sequences
- same 7 residue sequence in different proteins can produce completely different structures
- algorithms look at window of ~15
- long range effects involved
8
Q
secondary structure prediction
accuracy measure
A
- number of residues correctly predicted / number of residues considered
- Q3 = accuracy measure of 3 state prediction
- random result with equal numbers of each state = 33%
- in a protein dominated by helices (80:20), best random prediction would say all helical = 80%
- typical mix of 3 states, random result = 40%
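A short sketch of the Q3 calculation and of the "predict the most common state everywhere" baseline from this card. The state letters (H, E, C) and the example strings are assumptions chosen for illustration.

```python
from collections import Counter


def q3(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty strings of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


def majority_baseline(observed):
    """Q3 obtained by assigning the most common state to every residue."""
    counts = Counter(observed)
    return max(counts.values()) / len(observed)


observed = "HHHHHHHHCC"               # helix-dominated toy protein, 80:20
print(q3("HHHHHHHHHH", observed))     # all-helix guess
print(majority_baseline(observed))
```

A method is only informative if its Q3 beats this baseline for the proteins it is tested on.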
9
Q
old single sequence methods
A
- simpler
- used to derive newer methods
- mainly based on obtaining rules from counting frequencies of residues in known structures
- empirical
- e.g. chou-fasman
10
Q
chou-fasman
A
- numerical residue scores derived from data and ad hoc rules
- based on secondary structure propensity
- score > 1 implies residue occurs in helix more frequently than by chance
- create matrix for alpha and beta propensities of all amino acids
- pro/gly = helix breakers
- some residues have similar scores
- scores can be greater than 1 (they are not probabilities)
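One way to read "score > 1 means more frequent than by chance" is as a ratio of two fractions: how often a residue type is found in a helix, divided by how often any residue is found in a helix. The sketch below assumes that definition; the sequence and structure strings are made up and are not real data.

```python
from collections import Counter


def propensities(sequences, structures, state="H"):
    """Propensity of each residue type for one secondary structure state."""
    total = Counter()
    in_state = Counter()
    for seq, ss in zip(sequences, structures):
        for aa, s in zip(seq, ss):
            total[aa] += 1
            if s == state:
                in_state[aa] += 1
    n_state = sum(in_state.values())
    if n_state == 0:
        raise ValueError("state never occurs in the data")
    background = n_state / sum(total.values())
    # (fraction of this residue type in the state) / (fraction of all residues)
    return {aa: (in_state[aa] / total[aa]) / background for aa in total}


toy = propensities(["AAAAGGGG"], ["HHHCHCCC"])
print(toy)
```

In the toy data A sits in a helix three times out of four while half of all residues are helical, so its score is above 1 even though it is not a probability.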
11
Q
helix breakers
A
- helices need H bonds between NH and CO
- pro:
- side chain bends back and bonds covalently to the backbone N
- no NH hydrogen left to donate, so no H bond
- gly:
- small residue
- makes a cavity
- packs poorly against the rest of the helix
12
Q
rules of chou-fasman
A
- helix if:
- run of 4 out of 6 residues favouring a helix
- average helix propensity > 1 and > average beta strand propensity
- extend helix until pro is found, or run of 4 residues with helix propensity <1
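The helix rules above can be sketched as a small scan over the sequence. Several points are assumptions made for this illustration: the propensity tables cover only a subset of residues and should be treated as approximate values, "run of 4 residues with helix propensity < 1" is read as a window of 4 whose mean helix propensity is below 1, and a 6-residue nucleus that contains pro is skipped.

```python
# Illustrative, approximate propensities for a subset of residues only.
HELIX = {"E": 1.51, "M": 1.45, "A": 1.42, "L": 1.21, "K": 1.16, "Q": 1.11,
         "V": 1.06, "T": 0.83, "S": 0.77, "Y": 0.69, "N": 0.67,
         "P": 0.57, "G": 0.57}
STRAND = {"E": 0.37, "M": 1.05, "A": 0.83, "L": 1.30, "K": 0.74, "Q": 1.10,
          "V": 1.70, "T": 1.19, "S": 0.75, "Y": 1.47, "N": 0.89,
          "P": 0.55, "G": 0.75}


def mean_score(table, residues):
    """Average propensity of a stretch of residues."""
    return sum(table[aa] for aa in residues) / len(residues)


def find_helices(seq, helix=HELIX, strand=STRAND):
    """Return (start, end) index pairs, end exclusive, for predicted helices."""
    segments = []
    floor = 0   # left extension may not run back into an earlier segment
    i = 0
    while i + 6 <= len(seq):
        window = seq[i:i + 6]
        formers = sum(helix[aa] > 1.0 for aa in window)
        if formers < 4 or "P" in window:   # nucleus needs 4 of 6 helix formers
            i += 1
            continue
        start, end = i, i + 6
        # Extend right until pro, or until the last 4 residues average below 1.
        while (end < len(seq) and seq[end] != "P"
               and mean_score(helix, seq[end - 3:end + 1]) >= 1.0):
            end += 1
        # Extend left under the same conditions.
        while (start > floor and seq[start - 1] != "P"
               and mean_score(helix, seq[start - 1:start + 3]) >= 1.0):
            start -= 1
        segment = seq[start:end]
        h = mean_score(helix, segment)
        if h > 1.0 and h > mean_score(strand, segment):
            segments.append((start, end))
        floor = i = end
    return segments


print(find_helices("GGSPEAMLKEALGGSP"))
```

For the made-up sequence above the scan finds one segment, starting just after the pro and ending where the glycines pull the 4-residue average below 1.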
13
Q
stereochemical methods
A
- recognise patterns of hydrophobic residues that favour secondary structures
- empirical
- enhanced by inspection of structures
- no longer used but pattern concept still important
- difficult to program
- original Q3 ~ 60%
- improved by ML and neural networks
14
Q
stereochemical methods
alpha vs beta
A
- alpha:
- 3.6 res per turn
- amphipathic pattern consistent with helix
- helical wheel plot
- one side hydrophobic, other hydrophilic
- beta:
- can be buried
- sheet with helices either side, run of hydrophobic residues
- can be surface
- stacked pair of beta sheets (Ig fold)
- bottom sheet alternates
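The pattern idea can be made numerical. With 3.6 residues per turn, each residue is rotated 100 degrees from the previous one on a helical wheel; for an alternating strand, 180 degrees per residue is assumed here. The sketch sums a unit vector for every hydrophobic residue and reports how strongly they point to one side (0 = evenly spread, 1 = all on one face). The hydrophobic residue set and the example sequences are assumptions for illustration.

```python
import math

HYDROPHOBIC = set("AVLIMFW")   # assumed classification, for illustration only


def sidedness(seq, degrees_per_residue=100.0):
    """How strongly hydrophobic residues cluster on one face (0 to 1)."""
    x = y = 0.0
    count = 0
    for i, aa in enumerate(seq):
        if aa in HYDROPHOBIC:
            angle = math.radians(i * degrees_per_residue)
            x += math.cos(angle)
            y += math.sin(angle)
            count += 1
    if count == 0:
        return 0.0
    return math.hypot(x, y) / count


helix_like = "LKKLLKKLLKKL"    # hydrophobic at i, i+3, i+4, i+7, ...
strand_like = "VKVKVKVK"       # hydrophobic at every second position

print(sidedness(helix_like, 100.0), sidedness(helix_like, 180.0))
print(sidedness(strand_like, 100.0), sidedness(strand_like, 180.0))
```

Each sequence scores high only at the rotation angle of the structure its pattern fits, which is the signal these methods look for.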
15
Q
artificial neural networks
A
- simulates computation of brains
- input signal and set of nodes
- weight nodes so that input signal gives an output signal of alpha/beta
- input and answer known - only need to find weights
- once weights are known, output for a new sequence can be produced
- improved with MSAs
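A very reduced sketch of "input and answer known, only need to find weights": a single logistic output node over a one-hot encoded window of three residues, trained by gradient descent to say alpha (H) or beta (E). Real predictors use hidden layers, wider windows and MSA profiles; the training sequences here are made up and every name is hypothetical.

```python
import math

AMINO = "ACDEFGHIKLMNPQRSTVWY"


def encode(seq, centre, half_window):
    """One-hot encode the window around one residue; off-end positions stay zero."""
    features = []
    for pos in range(centre - half_window, centre + half_window + 1):
        one_hot = [0.0] * len(AMINO)
        if 0 <= pos < len(seq):
            one_hot[AMINO.index(seq[pos])] = 1.0
        features.extend(one_hot)
    return features


def helix_probability(weights, bias, features):
    """Output node: weighted sum of the inputs squashed to the range 0 to 1."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, half_window=1, epochs=200, rate=0.5):
    """Adjust weights so known sequences give their known H/E labels."""
    weights = [0.0] * ((2 * half_window + 1) * len(AMINO))
    bias = 0.0
    for _ in range(epochs):
        for seq, labels in examples:
            for i, label in enumerate(labels):
                x = encode(seq, i, half_window)
                target = 1.0 if label == "H" else 0.0
                error = helix_probability(weights, bias, x) - target
                for k, f in enumerate(x):
                    if f:
                        weights[k] -= rate * error * f
                bias -= rate * error
    return weights, bias


def predict(weights, bias, seq, half_window=1):
    """Apply the trained weights to a new sequence."""
    return "".join(
        "H" if helix_probability(weights, bias, encode(seq, i, half_window)) >= 0.5
        else "E"
        for i in range(len(seq))
    )


# Made-up training data: one all-helix and one all-strand sequence.
TRAINING = [("AELLAEAL", "HHHHHHHH"), ("VIYYVIVY", "EEEEEEEE")]
weights, bias = train(TRAINING)
print(predict(weights, bias, "LAEA"), predict(weights, bias, "IVYV"))
```

With data this small the node only learns which residue types appeared in which class, so it shows the training loop rather than any realistic accuracy.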