Key notes Flashcards

1
Q

What type of systems cannot be simulated using MD? Give examples of biological systems we can simulate and how the free energy there is related to the processes.

A
  • In MD, classical forcefields can’t be used for reactions involving bond breaking/making.
  • Protein folding and ligand binding are ideal applications for MD
  • ΔE and ΔS together set the free energy change, which defines whether a protein fold is favourable enough to form and how strongly a ligand binds to a protein.
2
Q

Discuss changes in free energy, binding and structure as well as solvent reorganisation in a ligand binding system.

A
  • Binding of the ligand will cause a change in enthalpy/internal energy as a result of intermolecular interactions (e.g. electrostatic and vdW interactions)
  • Loss of conformational freedom in the binding site causes a decrease in entropy, which is counterbalanced by an increase as water previously ordered around the free ligand gains more freedom.
  • The total free energy is a net effect of all these different changes
3
Q

Experimentally we can find the free energy of a system using ΔA = -RT ln K_bind, where K_bind is the ratio of the time a ligand is bound compared to unbound. Why does this approach not work for simulation and what is an alternative approach?

A
  • To sufficiently sample a system in both states and directly calculate the free energy would require unrealistically long simulation times.
  • Instead relate the free energy to a microscopic description of the system through statistical mechanics
  • A = -k_B T ln Q_NVT (Q_NVT is the canonical partition function)
4
Q

What are some problems with using a partition function approach to calculate free energy directly? What is an alternative approach?

A
  • Sampling entire phase space and integrating QNVT directly is impractical
  • Alternatively, could take an ensemble average and give A in terms of the potential energy, U: A = k_B T ln⟨exp(+U(r)/k_B T)⟩ + constant
  • However, the low-energy configurations that are sampled most often contribute very little to this average, while the high-energy configurations that dominate it take a long time to reach.
  • This leads us to not being able to calculate A directly.
5
Q
  • QNVT is a function of the …, which depends on the sum of … (…) and … (…) energy as a function of … and …
  • … can be solved analytically, leaving … as an excess which can be calculated in a simulation at each step giving overall …
A
  • QNVT is a function of the Hamiltonian, which depends on the sum of kinetic (K) and potential (U) energy as a function of position and momenta.
  • K can be solved analytically, leaving U as an excess which can be calculated in a simulation at each step giving overall ΔU.
7
Q

How is our method of calculating ΔA as a function of ΔU implemented?

A
  • Carry out simulation of state 0, calculating PE at each step (U0)
  • At each step also, take configuration (snapshot of trajectory) and apply PE function corresponding to state 1 to calculate U1, resulting in ΔU.
  • This is known as thermodynamic perturbation theory
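The steps above can be sketched in a few lines. This is a minimal illustration, assuming the standard exponential-averaging form ΔA = -k_B T ln⟨exp(-ΔU/k_B T)⟩ over state-0 snapshots; the function name, the kcal/mol units and the default temperature are assumptions, not taken from the cards.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); the unit choice is an assumption


def free_energy_perturbation(u0, u1, temperature=300.0):
    """Estimate dA(0 -> 1) = -k_B T ln < exp(-dU / k_B T) >_0.

    u0, u1: potential energies of the SAME snapshots (sampled in state 0),
    evaluated with the state-0 and state-1 energy functions respectively.
    """
    kt = K_B * temperature
    delta_u = [b - a for a, b in zip(u0, u1)]
    # Subtract the smallest dU before exponentiating for numerical stability.
    shift = min(delta_u)
    mean_exp = sum(math.exp(-(du - shift) / kt) for du in delta_u) / len(delta_u)
    return shift - kt * math.log(mean_exp)
```

If every snapshot has the same ΔU, the estimate reduces to that ΔU; otherwise low-ΔU snapshots dominate the average.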
8
Q

How might we compute the difference in hydration free energy between two ions?

A
  • Define forcefield parameters for lithium in a box of water, as well as for rubidium in water.
  • Water terms are identical, Li and Rb will have different LJ/coulombic terms
  • Run MD simulation of system in state 0 (Li+(aq)), calculating U0 via PE eqn and terms defined in FF.
  • Simultaneously, take the same configuration and calculate PE of system in state 1 (Rb+(aq)), U1
  • Same coordinates, only difference is parameters used to calculate U1 and therefore ΔU at that step.
  • At the end of the simulation take the ensemble average of exp(-ΔU/k_B T) over all steps and use it to find ΔA = -k_B T ln⟨exp(-ΔU/k_B T)⟩.
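As a rough sketch of the "same coordinates, different parameters" step, the snippet below re-evaluates one snapshot with two Lennard-Jones parameter sets. Only the ion-water LJ term is shown, and the parameter values are hypothetical placeholders, not real Li+ or Rb+ force-field values.

```python
def lj_energy(distances, sigma, epsilon):
    """Sum of 12-6 Lennard-Jones terms between the ion and each water oxygen."""
    return sum(4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6) for r in distances)


# Hypothetical placeholder parameters (NOT real force-field values).
PARAMS_STATE0 = {"sigma": 2.0, "epsilon": 0.1}  # stands in for the small ion (Li+)
PARAMS_STATE1 = {"sigma": 3.0, "epsilon": 0.1}  # stands in for the large ion (Rb+)


def delta_u_for_snapshot(ion_oxygen_distances):
    """dU = U1 - U0 for one snapshot: identical coordinates, different parameters."""
    u0 = lj_energy(ion_oxygen_distances, **PARAMS_STATE0)
    u1 = lj_energy(ion_oxygen_distances, **PARAMS_STATE1)
    return u1 - u0
```

Waters sitting close to the small ion give a large positive ΔU when re-evaluated with the larger ion's parameters, while distant waters contribute almost nothing.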
9
Q

What is the problem ignored so far in free energy calculation using thermodynamic perturbation theory with no windows? Relate your answer to the free energy of the system.

A
  • If states 0 and 1 are very different (configurations sampled in state 0 have a low probability of occurring in state 1), then ΔU is large
  • In the case of Li/Rb this would occur because water molecules sit close to the small Li+ ion, so the larger Rb+ overlaps them unfavourably, causing a high-energy LJ term
  • A large ΔU makes the exponential term exp(-ΔU/k_B T) negligibly small, giving that configuration low weight in the ensemble average
  • This causes ΔA to converge slowly, meaning the errors from our finite simulation will be large.
10
Q

What is an example of applying the λ-coupling solution in free energy perturbation theory?

A
  • The free energy change in mutating the amino acid glycine to alanine might be an important system to study for active site manipulation.
  • As λ increases from 0 to 1 we switch off interaction of glycine and turn on interaction of alanine.
  • This shows the power of simulations as this is impossible experimentally
  • A technical issue is that, as the LJ interactions and charges are switched on/off gradually, atoms can shift to unfavourable locations.
11
Q

What is the solution to our free energy perturbation theory problem?

A
  • Break down calculation into windows where there is good overlap between states and ΔU is small.
  • This is done through a coupling parameter, λ, which gradually increases from 0 to 1 through multiple simulations (equation), then sum the free energy changes output
  • There is an increase in cost for these additional simulations
  • In Li/Rb case first window would be change from Li to 10% Rb etc
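A toy sketch of the windowing idea follows. It assumes a linear coupling U(λ) = (1-λ)U0 + λU1, and uses a 1D harmonic well whose force constant changes from k0 to k1 so that each window can be sampled exactly and the answer checked against the analytical ΔA = ½ k_B T ln(k1/k0). The model system and all names are illustrative assumptions.

```python
import math
import random


def windowed_fep(k0, k1, n_windows=10, n_samples=20000, kt=1.0, seed=0):
    """Sum perturbation estimates over lambda windows for U = 0.5 * k(lambda) * x^2."""
    rng = random.Random(seed)
    lambdas = [i / n_windows for i in range(n_windows + 1)]
    total = 0.0
    for lam_a, lam_b in zip(lambdas[:-1], lambdas[1:]):
        # Linear coupling of the two end states (an assumed choice).
        k_a = (1.0 - lam_a) * k0 + lam_a * k1
        k_b = (1.0 - lam_b) * k0 + lam_b * k1
        sigma = math.sqrt(kt / k_a)  # exact Boltzmann sampling of a harmonic well
        acc = 0.0
        for _ in range(n_samples):
            x = rng.gauss(0.0, sigma)
            delta_u = 0.5 * (k_b - k_a) * x * x  # same x, neighbouring window's energy
            acc += math.exp(-delta_u / kt)
        total += -kt * math.log(acc / n_samples)  # free energy change for this window
    return total
```

Each window only perturbs to its neighbour, so ΔU stays small; the price is one simulation per window.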
12
Q

What is biased sampling?

A
  • Where a biasing potential is used to force the system to explore unfavourable configurations, leading to enhanced sampling of phase space
  • This means we are more likely to overcome kinetic barriers that trap us in local minima of our PES for our entire simulation time
13
Q

What is a key condition of biased sampling?

A

Need to know something about the pathway as a starting point, as the pathway is what the bias is defined along

14
Q

What are reaction coordinates and briefly give two examples? (also known as collective variables and order parameters)

A
  • Characterise a process in terms of a small set of properties of a system that are a function of atomic coordinates.
  • Also known as collective variables and order parameters
  • Distance/separation, r
  • Dihedral angle
15
Q
  • Give an example of how distance can be used as a reaction coordinate
A
  • The potential of mean force (PMF), which is the free energy along a chosen reaction coordinate, can be simulated using the distance between an Na+-Cl- ion pair in electrolyte solution.
16
Q
  • Give an example of how radius of gyration (Rgyr) can be used as a reaction coordinate. What must be assumed?
A
  • Rgyr gives an indication of the expansion/contraction of a globular structure through the root-mean-square distance of the atoms from the centre of mass. More expanded = higher Rgyr
  • The free energy landscape for the transformation of a β-hairpin peptide to an unfolded random-coil state can be investigated
  • Choosing an appropriate set of reaction coordinates is difficult, so generally an educated guess must be made
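For reference, a minimal sketch of Rgyr as a function of atomic coordinates, assuming the usual mass-weighted root-mean-square definition (equal masses if none are given). The function name is hypothetical.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted RMS distance of atoms from the centre of mass.

    coords: list of (x, y, z) tuples; masses: optional list of atomic masses.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    com = [sum(m * r[d] for m, r in zip(masses, coords)) / total for d in range(3)]
    sq = sum(m * sum((r[d] - com[d]) ** 2 for d in range(3))
             for m, r in zip(masses, coords))
    return math.sqrt(sq / total)
```

Many different structures can share one Rgyr value, which is why it is usually paired with a second coordinate.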
17
Q

Give the advantages and disadvantages of using multiple reaction coordinates

A

Pros

  • A combination of variables allows important structures across high energy barriers to be sampled, giving a fuller picture of the free energy landscape.
  • If our single reaction coordinate output poorly maps experimental results, a second coordinate can be introduced to form a 2D plot that may give a different minimum energy pathway to before.

Cons

  • However, in combination, outputs of these reaction coordinates can lead to many different structures which must all be considered
  • Certain structures may even be restricted via the specific choice of a given set of coordinates
  • Large computational cost
18
Q
  • Give an example of how root mean square deviation (RMSD) can be used as a reaction coordinate
  • What must one be careful of?
A
  • RMSD is the root-mean-square difference between atomic positions at time t and the starting positions of the simulation, t0.
  • Can be averaged over all atoms of interest, e.g. carbons in a protein backbone chain
  • As with Rgyr, must be careful with the choice of reaction coordinate to pair it with, as it may not be a unique function of r^N.
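A minimal sketch of the RMSD calculation described above. It assumes the two frames list the same atoms in the same order and does no structural superposition (alignment) first, which real analyses often do; the function name is hypothetical.

```python
import math


def rmsd(coords_t, coords_0):
    """Root-mean-square deviation between positions at time t and at t0.

    coords_t, coords_0: lists of (x, y, z) tuples for the atoms of interest.
    No alignment is performed here (a simplifying assumption).
    """
    if len(coords_t) != len(coords_0):
        raise ValueError("both frames must contain the same atoms")
    sq = sum((a[d] - b[d]) ** 2
             for a, b in zip(coords_t, coords_0) for d in range(3))
    return math.sqrt(sq / len(coords_t))
```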
19
Q
  • Free energy is a … function, so the free energy change is independent of the …. This means we can create unrealistic … if …/… are not of importance to us.
A
  • Free energy is a state function, so the free energy change is independent of the path. This means we can create unrealistic pathways if mechanism/kinetics are not of importance to us.
20
Q
  • Outline the basic principles of umbrella sampling
A
  • System is restrained (through tethering to a spring) to a small region along the reaction coordinate ξ using a biasing potential.
  • If the system deviates too far from this small region, an energy penalty restores it to the region.
  • This is repeated at different target values of ξ. The system is forced to explore small unfavourable regions along a certain channel until the full reaction coordinate is explored.
  • All simulations are stitched together to produce the unbiased underlying free energy profile.
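The restraint in each window can be sketched as a harmonic "spring" in ξ. This is an illustrative sketch only: the function names are hypothetical and evenly spaced targets are an assumed choice.

```python
def window_centres(xi_min, xi_max, n_windows):
    """Evenly spaced target values of the reaction coordinate (assumed spacing scheme)."""
    step = (xi_max - xi_min) / (n_windows - 1)
    return [xi_min + i * step for i in range(n_windows)]


def harmonic_bias(xi, xi_target, force_constant):
    """Spring-like penalty w(xi) = 0.5 * k * (xi - xi_target)^2."""
    return 0.5 * force_constant * (xi - xi_target) ** 2


def biased_energy(u_unbiased, xi, xi_target, force_constant):
    """Energy actually sampled in one window: force-field energy plus the bias."""
    return u_unbiased + harmonic_bias(xi, xi_target, force_constant)
```

One simulation is run per target value, each using `biased_energy` in place of the plain potential.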
22
Q

What factors control the sampling overlap between adjacent simulations and how finely grained our sampling of the free energy profile is in umbrella sampling?

A
  • Force constant
    • Too low: biasing insufficient to explore high energy regions (wide harmonic)
    • Too high: insufficient overlap between windows (narrow harmonic)
  • Window spacing (how many windows are placed along ξ)
  • Choosing these values is largely trial and error
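To get a feel for the trade-off before running anything, one rough estimate treats each window's sampled distribution as a Gaussian of width sqrt(k_B T / k), which is only strictly true on a flat underlying free energy surface (an assumption). The overlap of two equal-width Gaussians separated by d is then exp(-d²/8σ²) (the Bhattacharyya coefficient). Names are hypothetical.

```python
import math


def window_width(force_constant, kt=1.0):
    """Std. dev. of xi in one window, assuming a flat underlying free energy surface."""
    return math.sqrt(kt / force_constant)


def adjacent_overlap(spacing, force_constant, kt=1.0):
    """Overlap (0 to 1) of two neighbouring windows modelled as equal-width Gaussians."""
    sigma = window_width(force_constant, kt)
    return math.exp(-spacing ** 2 / (8.0 * sigma ** 2))
```

A stiffer spring or wider spacing both lower the overlap; on a real landscape this is only a starting guess before trial runs.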
23
Q

What is the WHAM algorithm?

A
  • The weighted histogram analysis method is used to stitch simulations together iteratively in umbrella sampling.
  • The unbiased distribution is first solved using arbitrary starting values for the free energy constant associated with each biasing potential.
  • The two sets of values are fed back into each other until the free energy profile is converged and the best estimate of the unbiased distribution is obtained.
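A compact sketch of the self-consistent iteration for a 1D reaction coordinate is given below, assuming the standard WHAM equations on a fixed histogram grid. The function signature and the fixed iteration count (rather than a convergence test) are simplifying assumptions.

```python
import math


def wham(histograms, n_samples, bias, kt=1.0, n_iter=2000):
    """Iterate the WHAM equations on binned umbrella-sampling data.

    histograms[i][b]: counts in bin b from window i
    n_samples[i]:     total number of samples in window i
    bias[i][b]:       biasing potential of window i evaluated at bin b
    Returns (p, f): normalised unbiased probabilities and per-window free energy constants.
    """
    n_win, n_bins = len(histograms), len(histograms[0])
    f = [0.0] * n_win  # arbitrary starting values
    counts = [sum(histograms[i][b] for i in range(n_win)) for b in range(n_bins)]
    p = [1.0 / n_bins] * n_bins
    for _ in range(n_iter):
        # Step 1: best unbiased distribution given the current constants f.
        p = [counts[b] / sum(n_samples[i] * math.exp(-(bias[i][b] - f[i]) / kt)
                             for i in range(n_win))
             for b in range(n_bins)]
        norm = sum(p)
        p = [v / norm for v in p]
        # Step 2: update each window's constant from the new distribution.
        f = [-kt * math.log(sum(p[b] * math.exp(-bias[i][b] / kt) for b in range(n_bins)))
             for i in range(n_win)]
    return p, f
```

The free energy profile then follows as A(ξ) = -k_B T ln P(ξ), up to an additive constant.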
24
Q

What is metadynamics and how does it differ from umbrella sampling?

A
  • In umbrella sampling we forced the system to explore unfavourable regions of phase space with a biasing potential, which restrained it in places that are difficult to sample and penalised it for straying from them
  • Metadynamics instead adds biasing potentials to penalise the system for visiting already sampled regions (i.e. low energy phase space), forcing it to move to less favourable positions
25
Q
  • Where umbrella sampling used … as its … potentials, instead, metadynamics uses … …, which are added to the potential as the simulation proceeds. Metadynamics is …, meaning we don’t need to estimate the underlying … … (and biasing potential) in advance with metadynamics as we did in … …. However, we do still need a reaction coordinate.
A
  • Where umbrella sampling used harmonics as its biasing potentials, instead, metadynamics uses Gaussian functions, which are added to the potential as the simulation proceeds. Metadynamics is adaptive, meaning we don’t need to estimate the underlying energy landscape (and biasing potential) in advance with metadynamics as we did in umbrella sampling. However, we do still need a reaction coordinate.
26
Q

How does metadynamics allow sampling of the full reaction coordinate?

A
  • Start at some configuration, depositing Gaussians as we sample
  • As the Gaussians fill the current minimum, the system will eventually be pushed out into a new local minimum
  • We can tweak how often these depositions occur as well as the height and width of them.
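A toy illustration of Gaussian deposition on a 1D double well is sketched below. To keep it short the walker is moved with Metropolis MC steps rather than MD, and the potential, step size and Gaussian height/width/stride are all hypothetical choices.

```python
import math
import random


def potential(x):
    """Hypothetical double well with minima at x = -1 and +1 and a barrier of 5 k_B T."""
    return 5.0 * (x * x - 1.0) ** 2


def bias(x, centres, height, width):
    """History-dependent bias: sum of Gaussians deposited at previously visited x."""
    return sum(height * math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centres)


def run_metadynamics(n_steps=5000, stride=25, height=0.5, width=0.2,
                     kt=1.0, max_step=0.2, seed=1):
    rng = random.Random(seed)
    x = -1.0  # start in the left-hand minimum
    centres, trajectory = [], []
    for i in range(1, n_steps + 1):
        x_new = x + rng.uniform(-max_step, max_step)
        e_old = potential(x) + bias(x, centres, height, width)
        e_new = potential(x_new) + bias(x_new, centres, height, width)
        if e_new < e_old or rng.random() < math.exp(-(e_new - e_old) / kt):
            x = x_new
        if i % stride == 0:
            centres.append(x)  # deposit a Gaussian at the current position
        trajectory.append(x)
    return trajectory, centres
```

`stride`, `height` and `width` correspond to the three tweakable settings in the card: how often Gaussians are deposited, and how tall and wide they are.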
27
Q
  • Metadynamics can be … … … but is useful for getting a quick scan of the … …
A
  • Metadynamics can be slow to converge but is useful for getting a quick scan of the energy landscape.
28
Q

Describe the Monte Carlo (MC) method and how does it contrast to MD method?

A
  • MC simulations are an alternative method for sampling the accessible microstates of the ensemble, generally used for smaller systems. The technique is orthogonal (complementary) to MD
  • MD uses Newton's equations of motion to predict positions of atoms at a future time, taking a time average to find the property of interest
  • MC generates configurations through random numbers; these are not connected in time, and an ensemble average is used to investigate a property
29
Q

What is the general principle in Metropolis MC?

A
  • The probability of a transition from configuration 1 to 2 is a function of the change in potential energy, ΔU, between those states.
  • The probability of a configuration can be written in terms of the Boltzmann distribution, where the partition function is a normalizing factor (it cancels when the ratio of two probabilities is taken)
31
Q

What if the energy of the new configuration is greater than or equal to (≥) original configuration?

A
  • The Boltzmann factor, p(new) = exp(-ΔU/k_B T), is compared to a random number between 0 and 1
  • If #rand < p(new), the configuration is accepted, allowing an increase in energy
  • If #rand > p(new), the new configuration is rejected and another copy of the original microstate is added to the set
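The acceptance test above can be written as a single function. This is a minimal sketch; the function name and the reduced units (k_B T = 1 by default) are assumptions.

```python
import math
import random


def metropolis_accept(u_old, u_new, kt=1.0, rng=random):
    """Return True if the trial configuration should be accepted."""
    delta_u = u_new - u_old
    if delta_u < 0.0:
        return True  # downhill moves are always accepted
    # Uphill (or equal) moves: compare the Boltzmann factor with a random number in [0, 1).
    return rng.random() < math.exp(-delta_u / kt)
```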
32
Q

How is MC implemented?

A
  • Move a random particle in a predefined way (e.g along z-axis), with an acceptable dr
  • Calculate U of new configuration. If lower than original, new configuration is accepted and replaces old.
  • A trajectory-esque profile forms as we move to lower energy configurations.
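Put together, a full MC run looks roughly like the loop below. This is a sketch for a single particle in a 1D harmonic well (an assumed toy system, chosen because ⟨x²⟩ = k_B T / k is known exactly); uphill trial moves are handled with the usual Metropolis test.

```python
import math
import random


def run_mc(n_steps=50000, max_step=1.0, k=1.0, kt=1.0, seed=0):
    """Metropolis MC for one particle with U(x) = 0.5 * k * x^2."""
    rng = random.Random(seed)
    x = 0.0
    u = 0.5 * k * x * x
    samples = []
    for _ in range(n_steps):
        x_trial = x + rng.uniform(-max_step, max_step)  # random displacement dr
        u_trial = 0.5 * k * x_trial * x_trial
        if u_trial < u or rng.random() < math.exp(-(u_trial - u) / kt):
            x, u = x_trial, u_trial  # accept: new configuration replaces the old
        samples.append(x)  # on rejection the old configuration is counted again
    return samples
```

The ensemble average of any property is then a plain average over `samples`.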
34
Q

How does the size of ΔU affect the acceptance ratio?

A
  • Unew > Uold: Boltzmann factor of the energy difference is closer to 1, so it is more likely to be greater than a random number between 0 and 1. High probability that the uphill (likely favourable) move is accepted and added to the growing ensemble of microstates
  • Unew >> Uold (e.g. a steric clash): fewer random numbers are likely to be lower than the Boltzmann factor of ΔU. Low probability that the large uphill move is accepted.
  • Low energy states are generally preferred in this algorithm
35
Q

What are some advantages of MC? Give examples and comparisons to MD

A
  • MC does not need to follow realistic pathways, so can explore conformations more rapidly. E.g. a protein folding event in MD would have to physically fold realistically, which is a slow process. MC could randomly rotate an important dihedral to quickly generate states of interest
  • MC doesn’t require calculation of forces, whereas in MD the potential must be differentiated to obtain the force (F = -dU/dr) at every integration step (e.g. Verlet), which can be very costly. In MC this isn’t required, allowing for more unphysical models to be used.
36
Q

What is a disadvantage of MC?

A
  • Sampling efficiency depends strongly on move set choice. Poor choice may limit transfer to other unknown configurations, preventing access to all regions of phase space (poor sampling).
  • For example, we may choose a certain dihedral as the move to sample around, but this choice may in fact prevent certain configurations from being explored due to unknown factors.
37
Q

What is an example of an MD-MC combined simulation? Why must they be used together? How? What is the advantage of combining both methods?

A
  • A mixed lipid membrane of chains differing by four carbons in their tails.
  • MD would be too slow to see lipid diffusion
  • MC would converge too slowly with a system this complex
  • Instead a trial MC move follows each MD step, removing/adding four carbon atoms and evaluating the energy of the exchange to see if it is favourable
  • Sampling is more efficient as the dependency on the starting configuration, which can trap the system in very low energy configurations surrounded by high barriers, is removed.
38
Q

Why is it difficult to enhance sampling with a biasing potential?

A
  • The timescale problem in MD makes it difficult to overcome kinetic barriers in our simulation time.
  • To enhance the sampling of this phase space we could use a biasing potential to explore unfavourable configurations, but this requires prior knowledge of the important features of the pathway.
39
Q

What is Replica Exchange Molecular Dynamics (REMD)?

A
  • REMD is an alternative method of overcoming (or removing) kinetic barriers, allowing more rapid sampling of phase space.
  • Can either change the potential via alteration of the forcefield to change the curve we are sampling (H-REMD)
  • Or change the temperature we sample at (T-REMD).
40
Q

What is the problem with simply changing the temperature we sample at?

A
  • The conditions we have chosen to investigate are now different, which can complicate the system of interest if an event is temperature dependent (e.g. a phase change)
41
Q
  • How can we change the temperature more suitably in T-REMD?
A
  • Use Temperature Replica Exchange Molecular Dynamics
  • Run an MD simulation of different replicas of the system at different temperatures in parallel (parallel tempering = replicas are generated through MC instead)
  • Our temperature of interest is the lowest.
  • Exchange configurations between replicas using MC (in both methods)
  • Continue simulation
  • Repeat the exchange and continue steps until converged.
42
Q

What is the process of the T-REMD method?

A
  • As in MC, the swap decision is based on the energy difference between the states.
  • If E is lower, make the swap
  • If not, a random number decides whether it is accepted or not
  • If accepted, exchange the coordinates of the high-temperature simulation with those of the low-temperature one
  • Over time, periodic exchanges between replicas make regions of phase space that were initially reachable only at high T accessible at low T (this takes a long time to occur, though)

N.B. These are not unfavourable regions; they are merely blocked from being sampled by high E barriers and may be lower energy minima
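The swap test can be sketched as below, assuming the standard Metropolis-style exchange criterion p = min(1, exp[(β_low - β_high)(U_low - U_high)]) with β = 1/k_B T. The function name and kcal/mol units are assumptions.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); the unit choice is an assumption


def exchange_probability(u_low, u_high, t_low, t_high):
    """Probability of swapping configurations between two temperature replicas.

    u_low, u_high: current potential energies of the low-T and high-T replicas.
    """
    beta_low = 1.0 / (K_B * t_low)
    beta_high = 1.0 / (K_B * t_high)
    exponent = (beta_low - beta_high) * (u_low - u_high)
    # If the high-T replica holds the lower-energy configuration, always swap.
    return 1.0 if exponent >= 0.0 else math.exp(exponent)
```

The swap is then accepted if a random number in [0, 1) falls below this probability.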

43
Q

What is a disadvantage of T-REMD?

A
  • States are not connected in time
  • Can’t be used to calculate time dependent properties (e.g. diffusion coefficients) as the timescale used is unrealistic
44
Q

Discuss the practical considerations that need to be taken in a T-REMD experiment in terms of computational power.

A
  • Many more processors are required to run many replicas in parallel. This along with the size of the system will be tied to the computational resource available.
45
Q

Discuss the practical considerations that need to be taken in a T-REMD experiment in terms of temperatures used.

A
  • Set of temperatures used such that largest temperature will enable rapid exploration of phase space.
  • Finding this temperature can be difficult when we don’t know the PES
  • Spacing between temperatures must also be optimised to minimise convergence time
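One common starting point for the temperature set (an assumption here, not something the cards prescribe) is a geometric ladder, where each temperature is a constant multiple of the one below. The spacing is then tuned by checking exchange acceptance in test runs.

```python
def geometric_ladder(t_min, t_max, n_replicas):
    """Temperatures spaced by a constant ratio between t_min and t_max."""
    if n_replicas < 2:
        raise ValueError("need at least two replicas")
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]
```

For example, `geometric_ladder(300.0, 600.0, 8)` gives eight temperatures from 300 K (the temperature of interest) up to 600 K.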
46
Q

What is the process of H-REMD? Give a biological example of where this may be useful

A
  • Temperature across simulations is constant
  • Instead a soft-core potential is used across replicas, where the Lennard-Jones potential is gradually softened, meaning that atoms can sit on top of one another while still remaining in certain conformations
  • Useful in ligand binding, where certain bound conformations can be locked in deep minima.
  • H-REMD allows ligand to rotate in pocket and explore different orientations while still being bound.
  • These can then be swapped into the replica with the correct (unsoftened) LJ potential, giving an indication of the transitions in between them
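As an illustration of what "softening" means, the sketch below uses one commonly used soft-core functional form, in which a term α(1-λ)² is added to (r/σ)⁶ so the energy stays finite at r = 0 for λ < 1. The exact form and the value α = 0.5 are assumptions; simulation packages differ in detail.

```python
def soft_core_lj(r, lam, sigma=1.0, epsilon=1.0, alpha=0.5):
    """Soft-core Lennard-Jones energy.

    lam = 1 recovers the ordinary 12-6 potential (which diverges at r = 0);
    lam < 1 gives a softened potential that is finite even at r = 0.
    """
    s = alpha * (1.0 - lam) ** 2 + (r / sigma) ** 6
    return 4.0 * epsilon * lam * (1.0 / s ** 2 - 1.0 / s)
```

Each replica would use a different `lam`, with only the `lam = 1` replica corresponding to the physical system.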
47
Q

Discuss the practical considerations that need to be taken in a T-REMD experiment in terms of acceptance probability.

A
  • Must run a test to see if exchange is frequent/probable enough to reach convergence.
  • Otherwise, may just switch back and forth between two states of similar energy.
48
Q

Discuss the practical considerations that need to be taken in a T-REMD experiment in terms of the chemistry of the system.

A
  • If one is simulating a system involving a phase change due to increased temperature, this will result in a large energy change (e.g. a protein unfolding).
  • Must have many replicas at temperatures around the phase transition, where configurations suddenly differ greatly.