Chapter 25 - Brownian motion and Ito Flashcards
define a stochastic process
a stochastic process is a random process that is a function of time
Define brownian motion
Like with everything else, the book is extremely close to being entirely wrong with the terminology.
Brownian motion refers to a specific kind of path, or stochastic movement pattern, that occurs when a particle moves as a result of many varying impacts in continuous time.
Mathematically, Brownian motion is modeled by a Wiener process.
The wiener process has the following characteristics that define it:
1) Z(0) = 0
2) Z(t+s) - Z(t) is normally distributed with mean 0 and variance s.
3) Non overlapping increments are independently distributed.
4) Z(t) is continuous.
The book says that Brownian motion has these characteristics, but this is informal. It may be accepted, but it is more precise to say that the Wiener process has those characteristics.
Name an outcome of the characteristics of brownian motion
Martingale.
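E[Z(t+s) | Z(t)] = Z(t)
The best forecast of the future value is the current value, since the increment Z(t+s) - Z(t) has mean 0.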
elaborate on the intuition behind:
Z(t+h)-Z(t) = Y(t+h)sigma sqrt(h)
We model a single step of movement in Z by considering a step length h. If we let h become very small, we approach continuous motion. In the simplified discrete case, we can use h=k, where k is some fixed step size. We are effectively saying that the difference between the Z value before and after the step, which gives the movement during the step, is given by the stochastic expression “Y(t+h)sigma sqrt(h)”. Y(t+h) could be any random variable, but we assume it is binary. Therefore the argument “t+h” doesn’t make much of a difference: with any argument we would draw a value from the same distribution. The argument mainly serves to say which step the random draw belongs to.
We draw from a binary distribution with only two outcomes, -1 and 1, each with probability 0.5. This represents the direction of our movement in the corresponding time step. Then we need the volatility. The volatility, or standard deviation, depends on the particle or whatever we are observing. It is found by estimating the variance and taking its square root. The volatility is measured with respect to some basic unit of time, so we need to scale it to fit the specific increment h we choose. Because variance scales linearly with time, the standard deviation scales with sqrt(h), which is why we multiply the standard deviation by sqrt(h).
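A minimal sketch of the single-step rule applied repeatedly, to make the construction concrete. The function name binomial_walk, the seed, and the default sigma=1.0 are illustrative assumptions, not something from the book.

```python
import math
import random

def binomial_walk(T, n_steps, sigma=1.0, seed=0):
    """One path of the discrete rule Z(t+h) - Z(t) = Y * sigma * sqrt(h)."""
    rng = random.Random(seed)
    h = T / n_steps                      # step length
    z = [0.0]                            # Z(0) = 0
    for _ in range(n_steps):
        y = rng.choice((-1, 1))          # direction of the move, probability 0.5 each
        z.append(z[-1] + y * sigma * math.sqrt(h))
    return z

path = binomial_walk(T=1.0, n_steps=100)
print(len(path), path[0], path[-1])
```

Every step has the same size, sigma*sqrt(h); only the sign is random. Increasing n_steps makes each step smaller and the path look more like continuous motion.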
elaborate on the role of binomial distribution in the equation earlier
It is more fitting to describe it as a Bernoulli trial. A Bernoulli trial is the special case of the binomial distribution where the number of trials is equal to 1. This gives the binary outcome.
how do we find the number of intervals of length “h” from the time interval 0 to T
Since T and h are in the same unit, we simply take:
T/h = #intervals
Generalize the earlier equation to apply to a broader time interval that encompass multiple time steps
we are looking at Z(T)-Z(0). We need the number of intervals, which was T/h.
Z(T)-Z(0) = sum of increments
Recall that the Brownian motion definition has these increments normally distributed and independent. Therefore, the sum of them is a random variable that is also normally distributed.
Z(T) - Z(0) = ∑ Y(ih) sigma sqrt(h) [i=1, T/h]
sigma and sqrt(h) are constant, as long as the time steps are equal sized.
Z(T)-Z(0) = sigma sqrt(h) ∑ Y(ih) [i=1, T/h]
So now we are talking about a sum of Bernoulli trials. The number x of +1 outcomes follows a binomial distribution b(x; T/h, 0.5), and the sum itself is 2x - T/h. In other words, we can replace the sum of trials with a single (shifted and scaled) binomially distributed variable.
BUT: To understand some of the properties, we go down a different path.
We have:
Z(T)-Z(0) = sigma sqrt(h) ∑ Y(ih) [i=1, T/h]
Taking the expected value, we get:
E[Z(T)-Z(0)] = sigma sqrt(h) E[∑ Y(ih)] = 0
Each Y(ih) has mean 0 and variance 1, so its standard deviation is also 1. The sum of T/h independent trials has variance T/h. Multiplying the sum by sigma sqrt(h) multiplies the variance by sigma^2 h, so the h cancels and we get Var[Z(T)-Z(0)] = sigma^2 T. With sigma = 1, the variance is equal to T.
We then use the central limit theorem: a scaled sum of independent Bernoulli-type random variables, with mean 0 and variance T, approaches a normal distribution as the number of terms grows large (h goes to 0).
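The mean-0, variance-T claim can be checked numerically. This is a rough sketch with sigma taken as 1; terminal_values, the seed, and the sample sizes are illustrative choices.

```python
import math
import random
import statistics

def terminal_values(T, n_steps, n_paths, seed=1):
    """Simulate Z(T) - Z(0) = sqrt(h) * sum of +/-1 draws for many paths."""
    rng = random.Random(seed)
    h = T / n_steps
    scale = math.sqrt(h)
    out = []
    for _ in range(n_paths):
        total = sum(1 if rng.random() < 0.5 else -1 for _ in range(n_steps))
        out.append(scale * total)
    return out

values = terminal_values(T=2.0, n_steps=100, n_paths=4000)
print("sample mean:", statistics.fmean(values))          # should be near 0
print("sample variance:", statistics.pvariance(values))  # should be near T = 2
```

A histogram of values would look roughly bell-shaped, which is the central limit theorem at work.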
What is the single step brownian motion equation when using infinitesimals
Since we typically work with sums, give the sum form of the infinitesimal variant brownian motion equation
Define quadratic variation
Defined as the sum of squared increments.
Use this equation to elaborate on some properties of brownian motion
If we compute the quadratic variation, we find that as n goes to infinity, it becomes T. Therefore, the quadratic variation of Brownian motion over [0, T] is not a random variable, but is fixed and finite.
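A small numerical sketch of this property. With the binomial approximation every squared increment equals h exactly, so the sum is n*h = T for any n; with normally distributed increments the sum is random but settles at T as n grows. The helper names and the seed are illustrative assumptions.

```python
import math
import random

def quadratic_variation(increments):
    """Sum of squared increments."""
    return sum(dz * dz for dz in increments)

def binomial_increments(T, n, rng):
    h = T / n
    return [rng.choice((-1, 1)) * math.sqrt(h) for _ in range(n)]

def normal_increments(T, n, rng):
    h = T / n
    return [rng.gauss(0.0, math.sqrt(h)) for _ in range(n)]

rng = random.Random(2)
T = 3.0
for n in (10, 1000, 100000):
    qv_bin = quadratic_variation(binomial_increments(T, n, rng))
    qv_norm = quadratic_variation(normal_increments(T, n, rng))
    print(n, qv_bin, qv_norm)
```

The binomial column stays at T (up to floating-point rounding), while the normal column wanders for small n and tightens around T for large n.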
In all of these cases, are we working with the actual brownian motion?
No, we are working with the binomial approximation of it
How can we generalize the simple binomial brownian motion formula to have non-zero mean and variance equal to something other than 1
We add whatever mean we want by shifting the function, and then we explicitly include the sigma term that was actually always included.
generally speaking, when we enhance the model we originally had for brownian motion, and used arithmetic brownian motion, what did we actually do?
We allow for the possibility of non-zero mean and arbitrary variance.
what is dZ(t)
The change given by the random variable at time step t:
dZ(t) = Y(t) sqrt(dt)
Recall that Z(T) is
the sum of all the increments
Given as integral
Questions that need answering:
1) Why is X(T)-X(0) normally distributed
2)
name positive properties of arithmetic brownian motion
name drawbacks of arithmetic brownian motion
It allows negative values. This is not necessarily bad, but it is DEFINITELY bad if we want to use Brownian motion to model stock prices, as they should not be negative.
ALSO: The current model does not connect volatility and dollar return.
define mean reversion
A process that tends to be pulled back toward a long-run mean: the further the current value is above or below the mean, the stronger the tendency to revert toward it.
how can we extend the arithmetic brownian motion
we make the so-called “Ornstein-Uhlenbeck process”.
In broad terms, it modifies the drift term so that the process is pulled toward the mean.
The lambda parameter represents how strong we want the mean reversion to be.
X(t) is the state. So we use the current state to determine which direction the drift should push.
Recall the definition of brownian motion
Brownian motion is defined from some properties:
1) Start at zero, Z(0) = 0 with certainty
2) Independent increments. The future movement does not depend on the past.
3) Normally distributed increments. The change over an interval (t, t+h] is n(x; 0, h). We can approximate it using the binomial distribution (a series of Bernoulli trials), as the sum of Bernoulli trial variables approaches the normal as the number of trials increases.
4) Continuous paths.
5) Martingale property. This ensures no drift. We can extend Brownian motion to include drift, but then it becomes a model based on Brownian motion, and not necessarily an exact Brownian motion.
Define an Ito process
when the drift and volatility depend on X(t) (the current state), it is called an Ito process.
Recall the arithmetic brownian motion equation
dX(t) = a dt + s dZ(t)
The arithmetic Brownian motion equation is given as:
dX(t) = a dt + s dZ(t)
Extend to Ito process, and make the next step
extending to Ito process by making sure that the drift and volatility depend on the current state:
dX(t) = a[X(t)]dt + s[X(t)]dZ(t)
Next step:
if we assume that the function is simply that the drift and volatility are proportional to the current state, we get:
dX(t) = aX(t)dt + sX(t)dZ(t)
Then we divide by X(t):
=> dX(t) / X(t) = adt + sdZ(t)
From the previous card, we got:
dX(t) / X(t) = adt + sdZ(t)
What do we have here?
We have geometric brownian motion.
Now, we have that the drift and volatility are proportional to the level of X(t).
Recall that a (alpha) represents the continuously compounded expected return on the stock.
what can we say about a variable that follow geometric brownian motion?
Log normally distributed.
dX(t) / X(t) = adt + sdZ(t)
What exactly in this process is said to follow something related to Brownian motion? Does it follow Brownian motion?
The equation I provided models the relative change of X(t), and X(t) follows geometric Brownian motion. Geometric Brownian motion is not the same as standard Brownian motion, but they have similarities.
X(t) follows the geometric Brownian motion.
elaborate on why it is so important to us that the Ito process is added, and that it is added by a simple proportional term
If we do not add it, we have:
dX(t) = adt + sdZ(t)
The problem with this is that the contribution of the drift is the same regardless of the state. The same goes for the standard deviation.
Since we are modelling the change in the state, and the drift represents a percentage return, it does not make sense to add the same small term to every step regardless of the price level. We need to scale it properly. If the stock price is 100 and the drift is 15%, the drift contribution should be 15% of 100 (per unit of time), not a flat 0.15. If we never add the proportional term, we always end up adding only 0.15 per unit of time to the stock movement, which is not how the drift should behave.
The same applies to the volatility. The swings should be larger in absolute terms for larger stock prices. Volatility is estimated from returns, which do not depend on the price level, so it represents the relative swing. Therefore, we must scale it by the price as well.
The outcome is that we model:
dX(t)/X(t) = adt + sdZ(t)
which represents the percentage change of whatever we are modeling.
what is the expected value of X(t) when modeled using geometric brownian motion?
X(0)e^(at)
This is exactly what we want, but it does not account for dividends? The book has not mentioned dividends yet.
Difference between arithmetic and geometric Brownian motion
Arithmetic is defined so that it can take negative values.
Geometric is always positive.
Recall the arithmetic brownian motion equation underlying binomial approximation, that consider the discrete case and not the continuous
X(t+h) - X(t) = ah + sY(t) sqrt(h)
X(t+h) - X(t) = ah + sY(t) sqrt(h)
What two components is there here?
We have the deterministic drift, given by ah, and then we have the uncertain movement, given by the volatility, the Bernoulli trial, and the scaling with respect to the unit of time.
We simply say:
1) Deterministic part
2) Random part
What can we say happens over “short periods of time” in regards to the arithmetic brownian process?
The character of the Brownian process is determined almost entirely by the random part.
Recall that the random part is:
sY(t)sqrt(h).
to understand why, we look at the ratio between the volatility and the drift.
Volatility is given by sigma sqrt(h).
Drift is given by ah.
The ratio is then given as:
Ratio = sigma sqrt(h) / a h
We can cancel h so that it does not appear in both the numerator and the denominator.
Ratio = sigma / [a h^(1-0.5)]
Ratio = sigma / [a sqrt(h)]
This ratio is a function of h. the larger h is, the smaller the ratio becomes.
For small values of h, the ratio is very large.
The same outcome is found by arguing that the drift component and the volatility component are of different orders in h: the drift is of order h and the noise is of order sqrt(h). As h grows large, this favors the drift, so the drift part takes over.
If h is small, the drift shrinks faster than the noise and becomes negligible.
In fact, the square root behaves this way whenever h<1: it produces values larger than h. For instance, sqrt(0.8)=0.89. Therefore, the smaller h is, the more weight falls on the random part rather than the deterministic part.
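To see the size of the effect, the ratio sigma / [a sqrt(h)] can be tabulated for a few step lengths. The values a = 0.10 and sigma = 0.30 and the chosen step lengths are made-up numbers for illustration only.

```python
import math

def noise_to_drift_ratio(a, sigma, h):
    """Ratio of the noise scale sigma*sqrt(h) to the drift scale a*h."""
    return (sigma * math.sqrt(h)) / (a * h)

a, sigma = 0.10, 0.30          # hypothetical annual drift and volatility
for label, h in (("1 year", 1.0), ("1 month", 1 / 12), ("1 day", 1 / 252)):
    print(label, round(noise_to_drift_ratio(a, sigma, h), 2))
```

The ratio grows as h shrinks, which is the sense in which short-horizon movements are almost entirely noise.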
elaborate on what happens when we square the arithmetic brownian motion equation and look at a very small time interval
The benefit here is that, because the squared increment over a very small interval is dominated by the variance term, we can use it to estimate the variance.
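A worked expansion of the squared discrete increment, using the equation X(t+h) - X(t) = ah + sY(t) sqrt(h) and the assumption that Y(t) only takes the values -1 and 1:

```latex
\begin{align*}
[X(t+h) - X(t)]^2 &= \left[a h + s\,Y(t)\sqrt{h}\right]^2 \\
 &= a^2 h^2 + 2 a s\, Y(t)\, h^{3/2} + s^2\, Y(t)^2\, h \\
 &= a^2 h^2 + 2 a s\, Y(t)\, h^{3/2} + s^2 h
   \qquad \text{since } Y(t)^2 = 1 \\
 &\approx s^2 h \qquad \text{for small } h
\end{align*}
```

The terms in h^2 and h^(3/2) shrink faster than h, so for a very small interval only the s^2 h term is left, and it contains no randomness.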
elaborate on variance estimation methods
The standard way vs brownian motion way.
The standard way also brings the drift (the mean) into the computations.
Brownian motion does not.
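A rough sketch comparing the two approaches on simulated data, assuming the discrete model X(t+h) - X(t) = ah + sY(t) sqrt(h) with a small h. The function names, parameter values, and seed are illustrative assumptions.

```python
import math
import random
import statistics

def simulate_abm_increments(a, s, h, n, seed=3):
    """Increments a*h + s*Y*sqrt(h) with Y = +/-1, probability 0.5 each."""
    rng = random.Random(seed)
    return [a * h + s * rng.choice((-1, 1)) * math.sqrt(h) for _ in range(n)]

def variance_with_mean(increments, h):
    """Standard estimator: subtract the sample mean, then scale to unit time."""
    return statistics.pvariance(increments) / h

def variance_without_mean(increments, h):
    """Squared-increment estimator: the drift is ignored entirely."""
    return sum(dx * dx for dx in increments) / (len(increments) * h)

h = 1 / 2520
inc = simulate_abm_increments(a=0.10, s=0.30, h=h, n=2520)
print(variance_with_mean(inc, h), variance_without_mean(inc, h))  # both near s^2 = 0.09
```

With a small h the two estimates are nearly identical, because the drift contributes almost nothing to each squared increment.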
How to make it follow geometric brownian motion
The drift function a[S(t), t] is made simply proportional to S(t), so that a[S(t),t] = aS(t).
Same with volatility. Proportional terms, like before.
Give the Ornstein-Uhlenbeck process.
Mean reversion strategy. The intuition is that we adjust the drift term with either a negative or a positive contribution, depending on where the current state lies in relation to the mean. We also have a factor/weight that determines how much mean reversion we want.
We define the change in X:
dX(t) = lambda [a - X(t)]dt + s dZ(t)
There is need for care when applying this formula. The units of alpha and X(t) must align in a way that makes sense, otherwise we will not obtain the mean reversion we want.
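A minimal simulation sketch of dX(t) = lambda [a - X(t)]dt + s dZ(t), using a simple Euler step with a normal draw for dZ. The function name simulate_ou, the parameter values, and the seed are hypothetical choices for illustration.

```python
import math
import random

def simulate_ou(x0, a, lam, s, T, n_steps, seed=4):
    """Euler steps: X += lam * (a - X) * h + s * sqrt(h) * standard normal draw."""
    rng = random.Random(seed)
    h = T / n_steps
    x = [x0]
    for _ in range(n_steps):
        drift = lam * (a - x[-1]) * h      # pulls X toward a
        noise = s * math.sqrt(h) * rng.gauss(0.0, 1.0)
        x.append(x[-1] + drift + noise)
    return x

# With s = 0 the path decays deterministically from 5 toward the mean a = 1.
no_noise = simulate_ou(x0=5.0, a=1.0, lam=2.0, s=0.0, T=10.0, n_steps=1000)
noisy = simulate_ou(x0=5.0, a=1.0, lam=2.0, s=0.5, T=10.0, n_steps=1000)
print(no_noise[1], no_noise[-1], noisy[-1])
```

When X is above a the drift is negative, and when X is below a it is positive; a larger lambda makes the pull stronger.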
Drawback of the pure form Ornstein-Uhlenbeck equation
As with regular Arithmetic brownian motion, the stock prices may become negative.
Define an Ito process
An Ito process is a generalization of Brownian motion. It allows for a drift component and a diffusion component, both of which may depend on the current state and on time.
As a result, it takes the general shape of:
dX(t) = u(X,t)dt + s(X,t)dZ(t)
In this course, we represent it like this:
dX(t) = a[X(t)]dt + s[X(t)]dZ(t)
this is basically the same.
Then we typically make the drift and diffusion functions simply proportional to X(t):
dX(t) = aX(t)dt + sX(t)dZ(t)
then we get:
dX(t)/X(t) = adt + sdZ(t)
Now we have an Ito process relevant for our purpose, which provides the relative change in X(t) over an infinitesimal change in time.
recall the equation we use and associate with geometric brownian motion
dX(t)/X(t) = adt + sdZ(t)
Name a property of the geometric brownian motion equation:
dX(t)/X(t) = adt + sdZ(t)
Follow log-normal distribution.
To be precise, it is the X(t) variable that follows a log-normal distribution.
what multiplication rules do we have?
dt x dZ = 0
dt ^2 = 0
(dZ)^2 = dt
elaborate on the taylor expansion used in differential numerics or whatever
We can consider it as a change of variable. We are still using the same old Taylor formula, but we can abstractly look at it as predicting f(y) around x, where y=x+h.
We take x, whatever it is, and find the Taylor expansion around x. To predict f at y, we use the Taylor approximation around x, evaluated at y=x+h.
using only words, describe what Ito’s lemma is all about
If we define a function, and one of the parameters/arguments of the function is a stochastic process (i.e. a function that has stochastic part) then the first function will become a stochastic process as well, AND the “new” stochastic process can be defined as consisting of a deterministic part (drift) and a random part (fluctuation, diffusion).
This is my interpretation (validated by ChatGPT), which gave the following answer:
Ito’s lemma states that when you apply a function to a stochastic process, the result is itself a new stochastic process with a specific structure: it consists of a deterministic part and a stochastic part. The deterministic part accounts for the drift (the expected rate of change), while the stochastic part captures the randomness introduced by the original process.
This structure arises because when taking small steps in time, the randomness has a squared term that does not disappear as it would in ordinary calculus. Instead, this term contributes to the deterministic drift. The lemma formalizes how to compute the total change in the function by combining first-order and second-order effects, ensuring that all randomness is properly accounted for.
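A worked example of how the squared term ends up in the drift, applying the lemma to f(X) = ln X where X follows dX = aX dt + sX dZ, and using the multiplication rules dt x dZ = 0, dt^2 = 0 and (dZ)^2 = dt:

```latex
\begin{align*}
d\ln X &= \frac{1}{X}\,dX - \frac{1}{2}\,\frac{1}{X^2}\,(dX)^2 \\
(dX)^2 &= \left(aX\,dt + sX\,dZ\right)^2
        = a^2X^2\,(dt)^2 + 2asX^2\,dt\,dZ + s^2X^2\,(dZ)^2
        = s^2 X^2\,dt \\
d\ln X &= \left(a\,dt + s\,dZ\right) - \frac{1}{2}s^2\,dt
        = \left(a - \tfrac{1}{2}s^2\right)dt + s\,dZ
\end{align*}
```

The second-order term does not vanish because (dZ)^2 = dt, and it shows up as the -0.5 s^2 adjustment in the drift of ln X.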
why is Ito’s lemma useful
The separation between the weight of the deterministic component and the weight of the random component is essential in determining trends and fluctuations.
elaborate on the relative importance of the drift term and the noise term
we consider the discrete counterpart for the arithmetic brownian motion.
we have:
X(t+h) - X(t) = ah + s Y(t) sqrt(h)
Then we consider what happens when h becomes very small. Since h enters the drift term and the noise term with different orders (h versus sqrt(h)), the two terms behave differently. Specifically, when h is small (smaller than 1), the significance of the drift term diminishes relative to the noise term. In the limit, it is basically all noise.
However, when h is large, the drift takes over.
elaborate on this. all of its terms
Assuming we have a stock price S_0 and S_t, we can model the price development.
We could use log-normal, and have that:
S_t = S_0e^{(a-∂-0.5s^2)t} and say that this represents our model. However, it is difficult to work with log-normals, so we work with the normal instead.
Therefore, to obtain the normal we divide by S_0 and take the natural log.
ln(S_t/S_0) = (a-∂-0.5s^2)t
IMPORTANT: Here we have done something wrong.
The expression above does not follow any distribution, as it has no random variable. It is simply the mean. Instead, our model looks like this:
S_t = S_0 e^{(a-∂-0.5s^2)t + s sqrt(t)Z(t)}
Z(t) is the random variable.
The way we obtain the expression:
We consider random movement according to a normal distribution (standard normal) and then un-standardize it by giving it the volatility we want and the mean we want. The outcome is the expression:
(a-∂-0.5s^2)t + s sqrt(t) Z(t)
This represents a step of a random process. It follows N((a-∂-0.5s^2)t, s^2 t).
The mean there would represent the continuously compounded returns, and we have a log normal distribution like this:
S_t/S_0 = e^(…)
if x is normally distributed, what is E[e^x]?
Assuming x is normally distributed with mean k and variance s^2,
the expected value is e^(k + 0.5s^2).
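A quick Monte Carlo check of this formula, as a sketch. The values k = 0.05 and s = 0.20, the sample size, and the seed are arbitrary illustrative choices.

```python
import math
import random

def mc_mean_exp(k, s, n, seed=5):
    """Average of e^x over n draws of x ~ N(k, s^2)."""
    rng = random.Random(seed)
    return sum(math.exp(rng.gauss(k, s)) for _ in range(n)) / n

k, s = 0.05, 0.20
estimate = mc_mean_exp(k, s, n=200_000)
closed_form = math.exp(k + 0.5 * s**2)   # e^(k + 0.5 s^2)
naive = math.exp(k)                      # e^(E[x]), which is too low
print(estimate, closed_form, naive)
```

The simulated average lands near e^(k + 0.5s^2) and above e^k, which is why the model later subtracts 0.5s^2 from the mean of the log return.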
Elaborate on the entire process that connects shit together. Basically, build the stock model from the ground up. Start by the foundational assumption, and end with the outcomes and results.
We start by assuming that continuously compounded returns are normally distributed. This is the foundational building block that allow us to continue.
Because of the assumption, we have that S_t/S_0 = e^(at). This assumes 100% certainty, no random component. In the case of perfect everything, we could probably end it there. However, because of uncertainty, we need to change it up.
It is common to assume that stocks follow Brownian motion with drift. The Brownian motion allows us to add in outcomes that do not align with the exact nature of the drift. This is useful in order to obtain probability information etc.
So, we want to model an uncertain stock price. We estimate its mean and its volatility. These are parameters that we use to shape our model.
But we start by considering a standard normal variable. We take this variable to represent a single “step” of arbitrary length. In good models, we let the step length go towards 0, which gives the best accuracy. Regardless, the variable represents a draw from the standard normal distribution. We multiply the result by sqrt(time_step_size) so that the variance matches the length of the time step.
Then we de-standardize the standard normal variable so that it follows our specific stock’s characteristics. This is done by multiplying it by the estimated volatility and adding the estimated mean.
Now we have a model that does some good things and some bad things. The bad things include the possibility of negative prices, and that the drift term (the mean that we added) and the diffusion term (the volatility and random component) do not scale with the current state. Therefore, we improve it by making it an Ito process, simply multiplying both terms by X(t) so that their contributions are proportional to the current state X(t).
Now we have something that follow geometric Brownian motion.
It would be tempting to say that we are sort of done, but there are still important things we need to do.
Currently, the stock price is log-normally distributed, but its expected value is no longer what we wanted it to be. Due to the workings of the log-normal distribution, the expected value becomes greater than e^(mean of the log return). We do not want this, so to offset it we subtract the balancing term 0.5*sigma^2 from the mean. Now the log-normally distributed variable X(t) has an expected value that grows at the rate we want.
So now we have a normal variable with a mean that is lower than it intuitively would be, because we know that when we exponentiate it, its expected value increases.
Specifically, we have that the continuously compounded returns are given by “(a-∂-0.5sigma^2)t+sigma sqrt(t) Z(t)”, which is an expression that is normally distributed with mean (a-∂-0.5sigma^2)t and variance tsigma^2.
When we send this through the function:
S_t/S_0 = e^x, we get that S_t/S_0 is log-normally distributed.
As a result, we obtain the model: S_t = S_0e^{(a-∂-0.5*sigma^2)t+sigma sqrt(t) Z(t)}
t can be any time period, and we obtain the likely stock price at time t relative to time=0. Since it is log-normally distributed, we should expect that running the same simulation many times gives us a shape of S_t prices that follow exactly this log-normal distribution.
If we are interested in the specific path, and not only the final outcomes, we use what we know of Brownian motion to simulate each time step, with smaller time steps and smaller contributions. The final price distribution is the same in the limit, but the path is jagged, showing how Brownian motion determines the paths that ultimately lead to a log-normal distribution.
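A small sketch of both uses of the model: drawing the terminal price directly, and building a path step by step. It assumes Z is a standard normal draw; the function names, the parameter values (S0 = 100, a = 0.10, ∂ = 0.02 written as delta, s = 0.30), and the seed are made up for illustration.

```python
import math
import random

def terminal_price(S0, a, delta, s, t, z):
    """S_t = S_0 * exp((a - delta - 0.5 s^2) t + s sqrt(t) z), z standard normal."""
    return S0 * math.exp((a - delta - 0.5 * s**2) * t + s * math.sqrt(t) * z)

def simulate_path(S0, a, delta, s, t, n_steps, rng):
    """Apply the same formula over n_steps short intervals of length t / n_steps."""
    h = t / n_steps
    path = [S0]
    for _ in range(n_steps):
        path.append(terminal_price(path[-1], a, delta, s, h, rng.gauss(0.0, 1.0)))
    return path

rng = random.Random(6)
S0, a, delta, s, t = 100.0, 0.10, 0.02, 0.30, 1.0
finals = [terminal_price(S0, a, delta, s, t, rng.gauss(0.0, 1.0)) for _ in range(100_000)]
mean_final = sum(finals) / len(finals)
path = simulate_path(S0, a, delta, s, t, n_steps=252, rng=rng)
print(mean_final, S0 * math.exp((a - delta) * t), min(path) > 0)
```

The average terminal price should come out near S0 e^((a-∂)t), and every price on the path stays positive because each step multiplies by an exponential.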
what is jensens inequality?
For a convex function f, E[f(X)] >= f(E[X]). With f(x) = e^x:
E[e^X] > e^(E[X]) whenever X is not constant.
Do we actually need Brownian motion, or could we model using only the log-normal and normality assumptions?
We actually build using only Brownian motion as the foundation. We of course model it so that the drift and volatility match our stock. But then we say that the change over the period, X(t)-X(0), is given by the integral. Solving the integral using a toolbox of mathematics gives us something that is clearly normally distributed: (a-∂-0.5s^2)t+s sqrt(t) Z. The variance is t s^2, and the mean is (a-∂-0.5*s^2)t. Once we exponentiate it, the expected value changes to match our wanted mean, and we also get the model that we are familiar with: S_t=S_0e^(something), but now “something” also includes a random component.