Key topics bottom up Flashcards
- What is mechanistic/small-scale systems biology?
In mechanistic/small-scale systems biology, we use the current knowledge about a biological
system to build mathematical models. We are usually interested in the understanding of key
mechanisms that give rise to behaviors we observe in data. Common approaches in mechanistic
systems biology are either to keep the model as small as possible and test one mechanism/one
hypothesis at a time, or, the other way around, to start with all known mechanisms and reduce
the model until only the necessary components are there. Even though we use the term “small-scale”, the methods can be scaled to handle rather large models.
Mechanistic modeling can be more hypothesis driven or more data driven. In a hypothesis
driven approach, we start with both data and knowledge about the system when we formulate
the models, while in a data driven approach we do not use/need the full knowledge about the
biological system to formulate the models. Instead we let data decide which biological interactions to include. The outcome of the analysis in a hypothesis driven approach is described
in ”the modeling cycle” (see 8. The modeling cycle below). In a data driven approach, the
outcome is usually new hypotheses/ideas about the biological system under study. In both hypothesis driven modeling and data driven modeling, we need to test and validate the
findings of the model-based approach with new data. In this part of the course, we focus on
hypothesis driven modeling.
What formula do we use for model formulation, and what do the variables stand for?
dx/dt = ẋ = f(x, u, p)
x = model states
u = model input, which can be either constant or time-varying
p = parameters, which are constant over time
f is a function of x, u, and p
How do we describe the rate of reactions?
We will use v1 = k1 · x1
In this course, we need to be able to go both from a drawing of the
biological system, a so-called interaction graph, to the fully specified model ODEs, and vice
versa. What is the general recipe to follow?
- Identify model states, x
- Identify reaction rates, v, including assumptions / what we know about parameters
- Formulate ODEs, d/dt(x)
- Identify what is measured, ŷ
- Include all parameters, p = (k, x(0), ky), and their values
Describe states [x] more thoroughly
- States, x, are differentiated with respect to time, d/dt(x), and therefore usually change
with time.
Describe parameters [p] more thoroughly
Parameters, p, are constant with respect to time, and we use the following parameters:
rate constants, usually denoted k1, k2, etc, initial conditions, x(0), and measurement parameters, ky. The values for parameters are usually not known, and have to be guessed or
estimated based on data.
Describe reaction rates, v
- Reaction rates, v, determine the rate of reactions. In this course we only use the simplest
form of kinetics (mass-action kinetics), e.g. v1 = k1 · x1 · x2.
Describe the measurement equation
- Measurement equation, e.g. ŷ = ky · x2, and the meaning of this equation: we cannot
measure x2 directly, but only something proportional to x2.
Suppose you have a system of three proteins: protein A interacts with protein B, protein B interacts with protein C, and protein C interacts with protein A. Glucose is outside this cycle and interacts with protein A. Use the recipe to write the model for this system:
- Identify model states:
x1 = [ProteinA]
x2 = [ProteinB]
x3 = [ProteinC]
- Identify reaction rates, including assumptions / what we know about parameters, where glucose is the input u:
v1 = k1 · x1 · u
v2 = k2 · x2
v3 = k3 · x3
- Formulate ODEs:
d/dt(x1) = −v1+v3
d/dt(x2) = v1−v2
d/dt(x3) = v2−v3
- What is measured?
ŷ = ky · x2
- Parameters and their values:
k1 = 3, k2 = 1, k3 = 2
ky = 0.5
x1(0) = 0, x2(0) = 100, x3(0) = 10
All parameter values are assumed, since they are not given. Note that we need these
values to be able to simulate the model. Also, we need to assume a value for the input
strength: u = 1.
The full model formulation is given below:
d/dt(x1) = −v1+v3
d/dt(x2) = v1−v2
d/dt(x3) = v2−v3
v1 = k1 · x1 · u
v2 = k2 · x2
v3 = k3 · x3
ŷ = ky · x2
x1(0) = 0, x2(0) = 100, x3(0) = 10
k1 = 3, k2 = 1, k3 = 2
ky = 0.5
u = 1
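As a minimal sketch, the full model formulation above can be written as a few Python functions. The function and variable names (reaction_rates, f, measurement) are our own choices for illustration, not something prescribed by the course. The sketch only evaluates the right-hand side at t = 0 and checks that the three derivatives sum to zero, which follows from every rate appearing once with a plus sign and once with a minus sign.

```python
# Sketch of the three-protein example; all names are illustrative assumptions.

def reaction_rates(x, u, k):
    """Mass-action reaction rates v1, v2, v3 for states x = (x1, x2, x3)."""
    x1, x2, x3 = x
    k1, k2, k3 = k
    v1 = k1 * x1 * u   # Protein A -> Protein B, driven by the input u (glucose)
    v2 = k2 * x2       # Protein B -> Protein C
    v3 = k3 * x3       # Protein C -> Protein A
    return v1, v2, v3


def f(x, u, k):
    """Right-hand side of the ODEs, d/dt(x) = f(x, u, p)."""
    v1, v2, v3 = reaction_rates(x, u, k)
    return (-v1 + v3, v1 - v2, v2 - v3)


def measurement(x, ky):
    """Measurement equation: y_hat = ky * x2."""
    return ky * x[1]


# Assumed parameter values from the model formulation
x0 = (0.0, 100.0, 10.0)   # x1(0), x2(0), x3(0)
k = (3.0, 1.0, 2.0)       # k1, k2, k3
ky = 0.5
u = 1.0

dxdt0 = f(x0, u, k)
y_hat0 = measurement(x0, ky)
print("d/dt(x) at t = 0:", dxdt0)
print("y_hat at t = 0:", y_hat0)
print("sum of derivatives:", sum(dxdt0))
```

Keeping the reaction rates in their own function mirrors the recipe: the rates are defined first, and the ODEs are then built only from sums and differences of those rates.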
Based on a model formulation like this, how do we draw the corresponding system?
To instead go from a model formulated as ODEs to an interaction graph or model reactions, look
at all the terms on the right-hand side of the ODEs, including their sign (negative or positive).
The terms represent the reaction rates (v1, v2, etc).
For example,
d/dt(x1) = −v1+v2
d/dt(x2) = v1−v2
d/dt(x3) = −v3+v4
d/dt(x4) = v3−v4
Here we see that v1 goes from x1 to x2 and v2 in the other direction, and that v3 goes from x3 to
x4 and v4 in the other direction. To know more, we need to know the equations for the reaction
rates:
v1 = k1 · x1
v2 = k2 · x2
v3 = k3 · x3 · x2
v4 = k4 · x4
Here we see that v3 contains both x3 and x2 and therefore x2 must be involved in the transition
from x3 to x4. We therefore conclude that we have these reactions:
x1 → x2
x2 → x1
x3+x2 → x4
x4 → x3
Which is equivalent to this interaction graph:
x1 and x2 convert into each other. x2 is involved in the transition from x3 to x4. x3 and x4 convert into each other.
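The reasoning above can be sketched in a few lines of Python. The two dictionaries are a hand-written encoding of the example ODEs and rate equations (the encoding and the function name reactions are assumptions made for this illustration). A state with a negative sign for a rate is consumed, a state with a positive sign is produced, and any other state in the rate expression is involved in the transition.

```python
# Signs of each reaction rate on the right-hand side of the ODEs
ode_terms = {
    "x1": {"v1": -1, "v2": +1},
    "x2": {"v1": +1, "v2": -1},
    "x3": {"v3": -1, "v4": +1},
    "x4": {"v3": +1, "v4": -1},
}

# States that appear in each reaction rate expression
rate_states = {
    "v1": ["x1"],
    "v2": ["x2"],
    "v3": ["x3", "x2"],
    "v4": ["x4"],
}


def reactions(ode_terms, rate_states):
    """Return one reaction string per reaction rate."""
    result = {}
    for rate in sorted(rate_states):
        consumed = [s for s, terms in ode_terms.items() if terms.get(rate) == -1]
        produced = [s for s, terms in ode_terms.items() if terms.get(rate) == +1]
        # States in the rate expression that are not consumed still take part
        involved = [s for s in rate_states[rate] if s not in consumed]
        left = " + ".join(consumed + involved)
        right = " + ".join(produced)
        result[rate] = left + " -> " + right
    return result


found = reactions(ode_terms, rate_states)
for rate, reaction in found.items():
    print(rate, ":", reaction)
```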
What is the Euler method?
The simplest numerical solver for ODEs is called the Euler
method or the forward Euler method. The Euler method uses the following formula to compute
the values for x after one time step (∆t):
x(∆t) = x(0) + d/dt(x(0))·∆t
i.e. we take the initial value for x and add the time-derivative of x at time = 0 multiplied by
the time step (∆t). In this way we take a step in the direction of the slope of the ODE.
Let us look at an example of how to use the simple Euler method. If we have this model,
d/dt(x1) = −v1+v2
v1 = k1 · x1
v2 = k2
x1(0) = 4
k1 = 0.5
k2 = 1
and want to calculate x1 after a time-step of 0.1, x1(0.1) using the Euler method, how does one do that?
x1(∆t) = x1(0) + d/dt(x1(0))·∆t = x1(0) + (−k1·x1(0) + k2)·∆t = 4 + (−0.5·4 + 1)·0.1 =
4 + (−1)·0.1 = 3.9
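The same calculation can be sketched in Python. The function names (dxdt, euler_step, euler_simulate) are our own and only illustrate the idea. Repeating the step many times gives a simulated trajectory, which for this model should approach the steady state k2/k1 = 2, where d/dt(x1) = 0.

```python
def dxdt(x1, k1, k2):
    """Right-hand side of the example model: d/dt(x1) = -v1 + v2."""
    v1 = k1 * x1
    v2 = k2
    return -v1 + v2


def euler_step(x1, dt, k1, k2):
    """One forward Euler step: x(t + dt) = x(t) + d/dt(x(t)) * dt."""
    return x1 + dxdt(x1, k1, k2) * dt


def euler_simulate(x1_0, dt, n_steps, k1, k2):
    """Repeat the Euler step n_steps times and return all values."""
    xs = [x1_0]
    for _ in range(n_steps):
        xs.append(euler_step(xs[-1], dt, k1, k2))
    return xs


k1, k2 = 0.5, 1.0
first = euler_step(4.0, 0.1, k1, k2)
print(round(first, 6))           # the hand calculation above gives 3.9

trajectory = euler_simulate(4.0, 0.1, 300, k1, k2)
print(round(trajectory[-1], 3))  # close to the steady state k2 / k1
```

A smaller time step generally gives a more accurate simulation, at the price of more steps.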
Model parameters, especially the kinetic parameters that decide the rate of reactions, are usually not possible to measure experimentally within biology. Therefore, we use methods to
estimate parameter values based on the available data. We use a cost function (also known as
objective function or loss function) to evaluate the agreement between model simulations and
data for each set of parameter values that we simulate. Write how a cost function can look and explain the function:
v(p) = ∑ (y(t) − ŷ(t, p))² / SEM(t)²
where the sum is over all measured time points, t; p are the parameters; y(t) is the measured data
and ŷ(t, p) is the model simulation that corresponds to the data; SEM(t) is the uncertainty given
as standard error of the mean for the data.
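A minimal sketch of such a cost function in Python, assuming the usual form where each residual is divided by SEM(t) and then squared before summing. The numbers below are made up purely for illustration and do not come from any experiment.

```python
def cost(y, y_hat, sem):
    """Sum of squared residuals, each weighted by the data uncertainty (SEM)."""
    return sum(((yi - yhi) / si) ** 2 for yi, yhi, si in zip(y, y_hat, sem))


# Made-up numbers for illustration only
y = [4.0, 3.2, 2.7]       # measured data, y(t)
y_hat = [4.0, 3.0, 2.5]   # model simulation, y_hat(t, p)
sem = [0.2, 0.2, 0.1]     # standard error of the mean, SEM(t)

example_cost = cost(y, y_hat, sem)
print(round(example_cost, 6))
```

Here the last data point contributes most to the cost, even though its residual is the same size as the second one, because its SEM is smaller.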
The residuals are the difference between data and model simulations (y(t) − ŷ(t, p)) and we
want them to be as small as possible. We therefore minimize the cost function. To do so, there
are global and local minimization functions to use. Global minimization functions aim to find
the global minimum and therefore search both uphill and downhill in the landscape of possible
parameter values to not get stuck in local minima. Local minimization functions, on the other
hand, search only downhill, and need to be combined with global minimization or multiple
starting points to be effective.
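To illustrate what minimizing the cost function means, the sketch below uses a brute-force scan over candidate values of a single parameter. This is a deliberately simple stand-in, not one of the global or local minimization functions used in the course. It reuses the small model d/dt(x1) = −k1·x1 + k2 with k2 = 1 and x1(0) = 4, evaluates its analytic solution, and generates synthetic, noise-free "data" from an assumed true value k1 = 0.5. The SEM value and the time points are also assumptions.

```python
import math


def simulate(k1, times, k2=1.0, x0=4.0):
    """Analytic solution of d/dt(x1) = -k1*x1 + k2 with x1(0) = x0."""
    steady = k2 / k1
    return [steady + (x0 - steady) * math.exp(-k1 * t) for t in times]


def cost(y, y_hat, sem):
    """Sum of squared, SEM-weighted residuals."""
    return sum(((yi - yhi) / si) ** 2 for yi, yhi, si in zip(y, y_hat, sem))


times = [0.0, 1.0, 2.0, 4.0, 8.0]   # assumed measurement time points
true_k1 = 0.5                       # assumed value used to create the data
data = simulate(true_k1, times)     # synthetic, noise-free "data"
sem = [0.1] * len(times)            # assumed uncertainty

# Scan candidate values k1 = 0.1, 0.2, ..., 2.0 and keep the lowest cost
candidates = [i / 10 for i in range(1, 21)]
costs = {k1: cost(data, simulate(k1, times), sem) for k1 in candidates}
best_k1 = min(costs, key=costs.get)
print("best k1:", best_k1, "cost:", costs[best_k1])
```

A scan like this quickly becomes too expensive when there are many parameters, which is why dedicated minimization functions are used in practice.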
In this course, we practically try out parameter estimation in the computer exercise, and look at examples where we go from model simulations with big residuals, i.e. a bad fit with data, to
model simulations in agreement with data.
What is the purpose of statistical tests?
Statistical tests are used to evaluate the agreement between model simulations and data. Usually,
a visual inspection where you look at model simulations and data in the same graph, gives a hint
on which test to use to see if you can reject the model/hypothesis or not.
What statistical tests do we need to know?
- χ2-test for the size of the residuals – is the model in good enough agreement with data
when you account for the data uncertainty?
- Whiteness test for correlation between residuals – is there a systematic error in the model
that gives rise to correlated residuals?
- Likelihood ratio test to compare models that all are in agreement with data – is one of the
models significantly better than the others?
- Cross validation for model complexity – is the model too complex in relation to data and
therefore over-fitted to the data used in parameter estimation?
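As a rough sketch of the first test in the list, the cost can be compared with a threshold from the χ2 distribution. Two things are assumptions here: the significance level of 0.05, and the choice of degrees of freedom (one common choice is the number of data points, another subtracts the number of estimated parameters). The thresholds are standard table values for 1 to 5 degrees of freedom, and the cost values passed in are made up for illustration.

```python
# Critical values of the chi-square distribution at significance level 0.05,
# indexed by degrees of freedom (standard table values)
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi2_reject(cost_value, dof):
    """Return True if the model is rejected at the 0.05 level."""
    return cost_value > CHI2_CRIT_05[dof]


# Made-up cost values for illustration, with dof assumed to be 3
good_fit_rejected = chi2_reject(5.0, 3)
bad_fit_rejected = chi2_reject(12.0, 3)
print(good_fit_rejected)   # cost below the threshold: model not rejected
print(bad_fit_rejected)    # cost above the threshold: model rejected
```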