Introduction to Stochastic Calculus

A beginner-friendly introduction to stochastic calculus, focusing on intuition and calculus-based derivations instead of heavy probability theory formalism.

Posted Feb 22, 2025 · Updated Feb 24, 2025 · By Ji-Ha Kim · 40 min read

Notation and code for generating visuals are presented in the Appendix.

0. Introduction

This document is a brief introduction to stochastic calculus. Like, an actual introduction. Not the textbook "introductions" which immediately blast you with graduate-level probability theory axioms and definitions. The goal of this blog post is to focus on the physical intuition and derivation of Brownian motion, which is the foundation of stochastic calculus. I will avoid very technical formalisms such as probability spaces, measure theory, filtrations, etc. in favor of a more informal approach that considers only well-behaved cases. I also try to avoid introducing too many new concepts and vocabulary. I hope that a wider audience can feel inspired by how stochastic calculus emerges naturally from the physical world. Then, hopefully, more people can appreciate the beauty and meaning of the mathematics behind it, and decide to dig deeper into the subject.

Applications

Brownian motion and Ito calculus are a notable example of fairly high-level mathematics applied to model the real world. Stock prices jiggle erratically, molecules bounce in fluids, and noise partially corrupts signals. Stochastic calculus gives us tools to predict, optimize, and understand these messy systems in a simplified model.

* Physics: Einstein used Brownian motion to prove atoms exist--its jittering matched molecular collisions.
* Finance: Option pricing (e.g., the famous Black-Scholes equation) relies on stochastic differential equations like \(dS = \mu S \, dt + \sigma S \, dW\).
* Biology: Random walks model how species spread or neurons fire.

This is just the tip of the iceberg. More and more applications are emerging, notably in machine learning, as Song et al. (2021) have shown in their great paper "Score-Based Generative Modeling through Stochastic Differential Equations". They use a stochastic differential equation built on Ito calculus to model the evolution of noise over time, which they can then reverse in time to generate new samples. This framework generalizes previous ones and improves performance, allowing new paths of innovation to be explored.

1. Motivation

Pascal's triangle gives the number of paths that go either left or right at each step, up to a certain point:

\[\begin{array}{ccccccc} & & & 1 & & & \\ & & 1 & & 1 & & \\ & 1 & & 2 & & 1 & \\ 1 & & 3 & & 3 & & 1 \end{array}\]

Using 0-indexing, the number of ways to reach the \(k\)-th spot in the \(n\)-th row is \(\binom{n}{k} = \frac{n!}{k!(n-k)!}\). For example, in row 3, there are \(\binom{3}{2} = 3\) ways to hit position 2.

[2D image: All 3 paths to the 2nd position in the 3rd row of Pascal's triangle]

Why care? This setup powers the binomial distribution, which models repeated trials with two outcomes--win or lose, heads or tails.
Think of:

* A basketball player shooting free throws with probability \(p\) of success and \(q = 1 - p\) of failure.
* A gambler betting on dice rolls.

Pascal's triangle tells us there are \(\binom{n}{k}\) ways to get \(k\) wins in \(n\) trials. If the trials are independent, we can use the multiplication rule for probabilities:

\[P(A \text{ and } B \text{ and } C \dots) = P(A) P(B) P(C) \dots\]

Note that the independence assumption is strong. Real life isn't always so clean--winning streaks in games often tie to mentality or momentum, not just chance. Keep in mind that this model can and will be inaccurate, especially visibly so for very long streaks in phenomena like stock prices or sports. In more common scenarios, however, it usually approximates reality well.

For one sequence with \(k\) wins (probability \(p\) each) and \(n - k\) losses (probability \(q\) each), the probability is \(p^k q^{n-k}\). Multiply by the number of ways to arrange those wins, and we get:

\[P(k \text{ wins in } n \text{ trials}) = \binom{n}{k} p^k q^{n-k}\]

This is the binomial distribution--great for discrete setups. Now, let's zoom out. The real world often involves continuous processes, like:

* The motion of a falling object,
* Gas diffusing through a room,
* Stock prices jumping around,
* Molecules colliding in a liquid.

For these, the binomial model gets messy as trials pile up. Calculus, with its focus on continuous change, feels more natural. In the continuous case, points and sums (discrete tools) lead to infinities. We need intervals and integrals instead.

2. From Discrete Steps to Continuous Limits

It's actually known what happens to the binomial distribution as it becomes continuous. But what does that conversion mean mathematically? Let's dig in with examples and then formalize it.

In calculus, going from discrete to continuous means shrinking step sizes and cranking up the number of steps. For an interval \([a, b]\), we:

1. Split it into \(n\) chunks of size \(h = \frac{b - a}{n}\),
2. Sum up contributions (like a Riemann sum),
3. Let \(n \to \infty\) and \(h \to 0\), landing on an integral.

Can we adapt this to the binomial distribution? Let's try. Picture the \(n\)-th row of Pascal's triangle as a random walk: at each of \(n\) steps, we move \(+1\) (a win) or \(-1\) (a loss). We'll set the probability of winning to \(p = 0.5\) as a first example, since the symmetry makes each direction equally likely and simpler to work with. The number of ways to get \(k\) wins (and \(n - k\) losses) is \(\binom{n}{k}\). Let's plot this over \(k\) for different values of \(n\). (The code can be found in the Appendix.)

[2D image: Binomial distribution plots for n = 5, 10, 25, 50, 100]

That looks awfully familiar, doesn't it? It's a bell curve, so naturally, we might guess that the limit is a normal distribution (aka Gaussian distribution). Where does such a normal distribution arise from? The answer lies in the Central Limit Theorem, which states that the sum of a large number of independent random variables will be approximately normally distributed. So where's the sum happening here? Let's proceed to formalize our intuition. To accomplish this, let's define a random variable for a single step as:

\[X(t) = \begin{cases} 1 & \text{with probability } \frac{1}{2} \\ -1 & \text{with probability } \frac{1}{2} \end{cases}\]

Here, \(X(t)\) encodes our displacement at the \(t\)-th step, where \(t \in \{1,\dots,n\}\) is an indexing parameter.
As before, we assume that \(X(t_1)\) is independent of \(X(t_2)\) for \(t_1 \ne t_2\). At each step \(t\), \(X(t)\) has mean \(0\) and variance \(1\). Then, the overall displacement \(S(n)\) is:

\[S(n) = X(1) + X(2) + \dots + X(n) = \sum_{t=1}^n X(t)\]

So there it is! The central limit theorem states more precisely that given \(n\) independent and identically distributed random variables \(X_1, X_2, \dots, X_n\) with mean \(\mu\) and variance \(\sigma^2\), we have:

\[X_1 + \dots + X_n \approx N(n\mu, n\sigma^2) \text{ as } n \to \infty\]

This is precisely what we need. As we take \(n \to \infty\), we have

\[S(n) \approx N(0, n)\]

such that

\[\frac{1}{\sqrt{n}} S(n) \to N(0, 1) \text{ in distribution as } n \to \infty,\]

which is our desired limit. We have shown that a "continuous binomial distribution" is in fact a normal distribution. Here are some very nice 3D animations of sample paths with the distribution evolving over the number of steps:

[3D animation: Discrete random walk, 15 steps]

[3D animation: Discrete random walk, 100 steps over 5 seconds]

[2D animation: Normal distribution approximation by discrete random walks]

3. Defining Brownian motion (Wiener process)

Let's consider a scenario faced by Scottish botanist Robert Brown in the 1820s. Imagine a small particle, like dust or pollen, floating on a body of water. Brown realized that its movement was surprisingly erratic. The small scale of the setup makes the particle so sensitive to fluctuations that each new kick from external forces completely overwhelms the effect of the previous one. Hence, in a simplified mathematical model, we consider the events at different times as independent. In addition, there is positional symmetry: the average position of the particle at time \(t\) seemed to float approximately around the origin.

Motivated by these observations as well as our previous intuition on continuous random walks, let's first think about a simplified model for the 1-dimensional case. We'll list some properties that a continuous random walk should have.

1. Starting Point: As a mathematical convenience, we position our coordinate system to set the starting point of the walk to be zero.
2. Positional Symmetry: The walk has no directional bias. For each step, the expected displacement is zero, so the overall expected displacement is also zero.
3. Independence: Steps at different times are independent. The displacements over two disjoint intervals of time are independent.
4. Continuity: The walk is continuous, with no jumps or gaps.
5. Normality: As we established by taking discrete random walks to the continuous limit, the distribution of positions at any given time should be normal.

So let's write this mathematically. Such a random variable is usually denoted either by \(B_t\) for "Brownian motion", which is the physical phenomenon, or \(W_t\) for "Wiener process", in honor of the mathematician Norbert Wiener who developed a lot of its early theory. I will use \(W(t)\) to emphasize its dependence on \(t\). Let \(W(t)\) be the position of the Brownian motion at time \(t\), and let \(\Delta W(t_1,t_2)\) be the displacement of the Brownian motion from time \(t_1\) to time \(t_2\). Note that, unlike the discrete case, we cannot consider a single increment and use a single index \(t\) for displacements as we did with \(X(t)\).
As mentioned, the continuous case requires considering intervals instead of single steps. Then, we write some properties of Brownian motion:

1. \(W(0)=0\) almost surely
2. \(W(t)\sim N(0,t)\)
   * With the first condition, this is often written equivalently as \(\Delta W(s,t)\sim N(0,t-s)\) for all \(s < t\)
3. \(\Delta W(t_1,t_2)\) is independent of \(\Delta W(t_3,t_4)\) for disjoint intervals, i.e., whenever \(t_1 < t_2 \le t_3 < t_4\)

These conditions are enough to straightforwardly find:

1. \(E[W(t)]=0\) for all \(t\)
2. \(Var(W(t))=t\) for all \(t\)

This is analogous to the discrete case. But it also turns out that these conditions are sufficient to prove continuity, although the argument is more involved:

1. The sample path \(t \mapsto W(t)\) is almost surely uniformly Hölder continuous for each exponent \(\gamma < \frac{1}{2}\), but is nowhere Hölder continuous for \(\gamma \ge \frac{1}{2}\) (pp. 30, 33 of the source).
   * In particular, a sample path \(t \mapsto W(t)\) is almost surely nowhere differentiable.

So, \(W(t)\) is our mathematical model for Brownian motion: a continuous, random, zero-mean process with variance proportional to time. It's wild--globally somewhat predictable yet locally completely unpredictable. A plot of \(W(t)\) looks like a jagged mess, but it's got structure under the hood. (You can generate one yourself with the code in the Appendix.)

[2D image: Sample Brownian motion path]

[3D animation: Brownian motion with evolving distribution]

Now, let's take this beast and do something useful with it.

---------------------------------------------------------------------

4. Ito Calculus

Brownian motion \(W(t)\) is continuous but so irregular that it's nowhere differentiable. To see why, consider the rate of change over a small interval \(dt\):

\[\lim_{dt \to 0} \frac{W(t + dt) - W(t)}{dt} = \lim_{dt \to 0} \frac{\Delta W(t, t + dt)}{dt}\]

Since \(\Delta W(t, t + dt) \sim N(0, dt) = \sqrt{dt} \, N(0, 1)\):

\[\frac{\Delta W(t, t + dt)}{dt} = \frac{\sqrt{dt} \, N(0, 1)}{dt} = \frac{1}{\sqrt{dt}} N(0, 1)\]

As \(dt \to 0\), \(\frac{1}{\sqrt{dt}}\) grows without bound, and the expression becomes dominated by random fluctuations--it doesn't converge to a finite derivative. This rules out standard calculus for handling Brownian motion, but we still need a way to work with processes driven by it, like stock prices or particle diffusion. In the 1940s, Kiyosi Ito developed a framework to address this: Ito calculus. Rather than forcing Brownian motion into the rules of regular calculus, Ito built a new system tailored to its random nature, forming the foundation of stochastic calculus.

The Increment \(dW\) and Its Properties

Define the small change in Brownian motion over an interval \(dt\):

\[dW := W(t + dt) - W(t) = \Delta W(t, t + dt)\]

From Section 3, \(W(t + dt) - W(t) \sim N(0, dt)\), so:

\[dW = \sqrt{dt} \, N(0, 1)\]

Unlike the deterministic \(dx\) in regular calculus, \(dW\) is random--its magnitude scales with \(\sqrt{dt}\), and its sign depends on a standard normal distribution \(N(0, 1)\). It's a small but erratic step, with:

* \(E[dW] = 0\),
* \(Var(dW) = E[(dW)^2] = dt\).

Now consider \((dW)^2\). Its expected value is \(dt\), but what about its variability? The variance is \(Var[(dW)^2] = 2 \, dt^2\), which becomes negligible relative to \(dt\) as \(dt \to 0\). This stability allows us to treat \((dW)^2 \approx dt\) in Ito calculus (formally, in the mean-square sense--see the Appendix for details).
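These scaling claims are easy to check empirically. Here is a quick numerical sanity check--a minimal sketch, where the step size and sample count are arbitrary choices--that samples many increments \(dW\) and compares their moments against \(dt\):

import numpy as np

rng = np.random.default_rng(0)
dt = 1e-4
n = 1_000_000

# Sample increments dW ~ N(0, dt) = sqrt(dt) * N(0, 1)
dW = np.sqrt(dt) * rng.standard_normal(n)

print(dW.mean())                 # ~ 0:      E[dW] = 0
print(dW.var(), dt)              # ~ dt:     Var(dW) = dt
print((dW**2).mean(), dt)        # ~ dt:     E[(dW)^2] = dt
print((dW**2).var(), 2 * dt**2)  # ~ 2 dt^2: tiny relative to dt

The last line is the key point: the spread of \((dW)^2\) around \(dt\) is of order \(dt^2\), which is why replacing \((dW)^2\) by \(dt\) is harmless in the limit.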
In contrast to regular calculus, where \((dx)^2\) is too small to matter, \((dW)^2\) is on the same scale as \(dt\), which changes how we handle calculations.

The Ito Integral: Integrating Against Randomness

In regular calculus, \(\int_a^b f(x) \, dx\) approximates the area under a curve by summing rectangles, \(\sum f(x_i) \Delta x\), and taking the limit as \(\Delta x \to 0\). For Brownian motion, we want something like \(\int_0^t f(s) \, dW(s)\), where \(dW(s)\) replaces \(dx\). Here, the steps are random: \(\Delta W(s_i, s_{i+1}) \sim \sqrt{\Delta s} \, N(0, 1)\). We approximate:

\[\int_0^t f(s) \, dW(s) \approx \sum_{i=0}^{n-1} f(s_i) \, \Delta W(s_i, s_{i+1})\]

over a partition \(s_0, s_1, \dots, s_n\) of \([0, t]\), then let \(n \to \infty\). Unlike a deterministic integral, the result is a random variable, reflecting the randomness of \(W(t)\). Using \(f(s_i)\) from the left endpoint keeps the integral "non-anticipating"--we only use information up to time \(s_i\), which aligns with the forward-evolving nature of stochastic processes.

Ito's Lemma: A Chain Rule for Randomness

For a function \(f(t, W(t))\), regular calculus gives:

\[df = \frac{\partial f}{\partial t} dt + \frac{\partial f}{\partial W} dW\]

But Brownian motion's roughness requires a second-order term. Taylor-expand \(f(t, W(t))\):

\[df = \frac{\partial f}{\partial t} dt + \frac{\partial f}{\partial W} dW + \frac{1}{2} \frac{\partial^2 f}{\partial W^2} (dW)^2 + \text{smaller terms}\]

As \(dt \to 0\):

* \(dt^2\) and \(dt \, dW\) vanish,
* \((dW)^2 \approx dt\) stays significant.

This leaves:

\[df = \frac{\partial f}{\partial t} dt + \frac{\partial f}{\partial W} dW + \frac{1}{2} \frac{\partial^2 f}{\partial W^2} dt\]

This is Ito's Lemma. The extra \(\frac{1}{2} \frac{\partial^2 f}{\partial W^2} dt\) arises because \((dW)^2\) contributes at the \(dt\) scale, capturing the effect of Brownian motion's curvature.

Since we have the algebraic heuristic \(dW^2 = dt\), we could, in some sense, define everything in terms of powers of \(dW\) to expand things algebraically and implicitly compute derivative rules. This is precisely the idea behind my blog post on Automatic Stochastic Differentiation, where we use \(\mathbb{R}[\epsilon]/\epsilon^3\) in a similar fashion to the dual numbers \(\mathbb{R}[\epsilon]/\epsilon^2\) used for automatic differentiation in deterministic calculus. If you haven't already, I highly recommend checking it out.

Example: \(f(W) = W^2\)

Take \(f(t, W(t)) = W^2\):

* \(\frac{\partial f}{\partial t} = 0\),
* \(\frac{\partial f}{\partial W} = 2W\),
* \(\frac{\partial^2 f}{\partial W^2} = 2\).

Then:

\[d(W^2) = 0 \cdot dt + 2W \, dW + \frac{1}{2} \cdot 2 \cdot dt = 2W \, dW + dt\]

Integrate from \(0\) to \(t\) (with \(W(0) = 0\)):

\[W(t)^2 = \int_0^t 2W(s) \, dW(s) + t\]

The \(t\) term matches \(E[W(t)^2] = t\), and the integral is a random component with mean 0, consistent with Brownian motion's properties.

---------------------------------------------------------------------

5. Stochastic Differential Equations

Ito calculus gives us tools--integrals and a chain rule--to handle Brownian motion. Now we can model systems where randomness and trends coexist, using stochastic differential equations (SDEs). Unlike regular differential equations (e.g., \(\frac{dx}{dt} = -kx\)) that describe smooth dynamics, SDEs blend deterministic behavior with stochastic noise, fitting phenomena like stock prices or diffusing particles.
Defining an SDE

Consider a process influenced by both a predictable trend and random fluctuations:

\[dX(t) = a(t, X(t)) \, dt + b(t, X(t)) \, dW(t)\]

* \(X(t)\): The evolving quantity (e.g., position or price).
* \(a(t, X(t)) \, dt\): The "drift"--the systematic part, scaled by \(dt\).
* \(b(t, X(t)) \, dW(t)\): The "diffusion"--random perturbations from Brownian motion.

Here, \(a\) and \(b\) are functions of time and state, and \(dW(t) = \sqrt{dt} \, N(0, 1)\) brings the noise. Solutions to SDEs aren't fixed curves but random paths, each run producing a different trajectory with statistical patterns we can study.

Ito's Lemma Revisited

Ito's lemma actually applies to a function \(f(t, X(t))\) and its stochastic differential \(df(t, X(t))\) for a general \(dX(t) = a(t,X(t)) \, dt + b(t,X(t)) \, dW\), and this is done through the linearity of the Ito differential (as seen using the \(\mathbb{R}[\epsilon]/\epsilon^3\) formulation). Since \(dX = O(dW)\), we keep terms up to \(dX^2 = O(dW^2)\):

\[\begin{aligned} df &= f_t \, dt + f_X \, dX + \frac{1}{2} f_{XX} \, dX^2 \\ &= f_t \, dt + f_X \, (a \, dt + b \, dW) + \frac{1}{2} f_{XX} \, (a \, dt + b \, dW)^2 \\ &= \left(f_t + a f_X + \frac{1}{2} b^2 f_{XX}\right) dt + b f_X \, dW \end{aligned}\]

which is the general form typically presented.

Drift and Diffusion

The drift \(a(t, X)\) sets the average direction, like a current pushing a particle. The diffusion \(b(t, X)\) determines the random jitter's strength. If \(b = 0\), we get a standard ODE; if \(a = 0\), it's just scaled Brownian motion. Together, they model systems with both structure and uncertainty.

Take a simple case:

\[dX(t) = \mu \, dt + \sigma \, dW(t)\]

* \(\mu\): Constant drift.
* \(\sigma\): Constant noise amplitude.

Starting at \(X(0) = 0\), integrate:

\[X(t) = \int_0^t \mu \, ds + \int_0^t \sigma \, dW(s) = \mu t + \sigma W(t)\]

Since \(W(t) \sim N(0, t)\), we have \(X(t) \sim N(\mu t, \sigma^2 t)\)--a process drifting linearly with noise spreading over time. It's a basic model for things like a stock with steady growth and volatility.

[2D image: Sample SDE path with mu = 1.0, sigma = 0.5]

Geometric Brownian Motion

For systems where changes scale with size--like stock prices or certain physical processes--consider geometric Brownian motion (GBM):

\[dS(t) = \mu S(t) \, dt + \sigma S(t) \, dW(t)\]

* \(S(t)\): The state (e.g., stock price).
* \(\mu S(t)\): Proportional drift.
* \(\sigma S(t)\): Proportional noise.

The percentage change \(\frac{dS}{S} = \mu \, dt + \sigma \, dW\) has a trend and randomness. To solve, let \(f = \ln S\):

* \(\frac{\partial f}{\partial t} = 0\),
* \(\frac{\partial f}{\partial S} = \frac{1}{S}\),
* \(\frac{\partial^2 f}{\partial S^2} = -\frac{1}{S^2}\).

Using Ito's lemma:

\[d(\ln S) = \frac{1}{S} (\mu S \, dt + \sigma S \, dW) + \frac{1}{2} \left( -\frac{1}{S^2} \right) (\sigma^2 S^2 \, dt) = \left( \mu - \frac{1}{2} \sigma^2 \right) dt + \sigma \, dW\]

Integrate from \(0\) to \(t\):

\[\ln S(t) - \ln S(0) = \left( \mu - \frac{1}{2} \sigma^2 \right) t + \sigma W(t)\]

\[S(t) = S(0) \exp\left( \left( \mu - \frac{1}{2} \sigma^2 \right) t + \sigma W(t) \right)\]

The drift is adjusted by \(-\frac{1}{2} \sigma^2\) due to the second-order effect of noise, and \(\sigma W(t)\) adds random fluctuations. This form underlies the Black-Scholes model in finance.
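To build confidence in the closed form, here is a minimal sketch (parameters and step counts are arbitrary choices) that drives a standard Euler-Maruyama discretization of \(dS = \mu S \, dt + \sigma S \, dW\) and the exact solution above with the same Brownian path, then compares the endpoints:

import numpy as np

rng = np.random.default_rng(1)
mu, sigma, S0 = 0.15, 0.2, 100.0
T, N = 1.0, 100_000
dt = T / N

# One shared Brownian path
dW = np.sqrt(dt) * rng.standard_normal(N)
W = np.concatenate(([0.0], np.cumsum(dW)))
t = np.linspace(0.0, T, N + 1)

# Euler-Maruyama: S_{n+1} = S_n + mu S_n dt + sigma S_n dW_n
S_em = np.empty(N + 1)
S_em[0] = S0
for n in range(N):
    S_em[n + 1] = S_em[n] + mu * S_em[n] * dt + sigma * S_em[n] * dW[n]

# Exact solution from Ito's lemma
S_exact = S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)

# Relative endpoint error shrinks as dt -> 0
print(abs(S_em[-1] - S_exact[-1]) / S_exact[-1])

The \(-\frac{1}{2}\sigma^2\) correction is exactly what makes the two constructions agree; dropping it from the exponent produces a systematic bias rather than mere discretization error.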
[2D image: A sample path of a geometric Brownian motion with parameters mu = 0.15 and sigma = 0.2]

[3D animation: Geometric Brownian Motion drifting over time]

Beyond Analytics

Analytical solutions like GBM's are exceptions. Most SDEs require numerical simulation (e.g., stepping \(X(t + \Delta t) = X(t) + \mu \, \Delta t + \sigma \sqrt{\Delta t} \, N(0, 1)\)) or statistical analysis via equations like Fokker-Planck. See the Appendix for simulation code.

---------------------------------------------------------------------

6. Stratonovich Calculus

Recall Ito's lemma for \(dX = a \, dt + b \, dW\):

\[df = \left(\frac{\partial f}{\partial t} + \frac{1}{2} b^2 \frac{\partial^2 f}{\partial X^2}\right) dt + \frac{\partial f}{\partial X} dX\]

That second-derivative term is pretty annoying to deal with in calculations. Is there a way we can simplify it to the familiar chain rule of regular calculus?

\[df = \frac{\partial f}{\partial t} dt + \frac{\partial f}{\partial X} dX\]

The answer is yes, and it's called Stratonovich calculus. Let's explore a bit. First, the deterministic part clearly satisfies the regular chain rule, since we can directly apply it using linearity. The trouble arises in the stochastic part, which we need to analyze. This means we only need to consider a function \(f(X(t))\). Remember, for the Ito form, we chose to define the integral by choosing the left endpoint of each interval. In other words, it is this stochastic part that will vary. To eliminate this second-order term, we need to somehow absorb it into the stochastic part by defining some Stratonovich differential, typically denoted by \(\circ \, dW\). Going back to our Riemann sum definitions, our degrees of freedom lie in the choice of the evaluation point for each interval:

\[\int_{0}^{T} f(X(t)) \diamond dW = \lim_{n \to \infty} \sum_{i=0}^{n-1} f(X(t_i) + \lambda \Delta X(t_i,t_{i+1})) \, \Delta W(t_i, t_{i+1})\]

where \(\lambda \in [0,1]\) is a constant that linearly interpolates between the left and right endpoints of each interval, giving a corresponding differential \(\diamond dW\), and \(\Delta X(t_i,t_{i+1}) := X(t_{i+1}) - X(t_i)\). In the deterministic case, since we always have \(O(dX^2) \to 0\), it doesn't matter where we choose the evaluation point. However, in the stochastic case, remember that \(O(dW^2) \to O(dt)\), so we need a more careful choice of evaluation point. Mathematically, our goal is to define a new stochastic integral that preserves the standard chain rule:

\[df = f_X \circ dX\]

In the limiting discrete form, let's try setting the two sides equal on each interval:

\[f(X+\Delta X) - f(X) = f_X(X+\lambda \Delta X) \, \Delta X\]

In other words, our newly defined differential should make the derivative a linear approximation of the original function instead of a quadratic one:

\[\frac{f(X+\Delta X)-f(X)}{\Delta X} = f_X(X+\lambda \Delta X)\]

But watch what happens as we take the Taylor expansion on both sides about \(X\) (recalling that terms of order \(\Delta X^2\) vanish in the limit):

\[f_X + \frac{1}{2}f_{XX}\Delta X = f_X + \lambda f_{XX}\Delta X\]

Comparing coefficients, we must set \(\lambda = 1/2\) to preserve the chain rule.
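The effect of the evaluation point is easy to see numerically. Here is a minimal sketch computing \(\int_0^T W \, dW\) with left-endpoint (\(\lambda = 0\)) and midpoint (\(\lambda = 1/2\)) sums; the two converge to \((W(T)^2 - T)/2\) and \(W(T)^2/2\) respectively, differing by exactly \(T/2\):

import numpy as np

rng = np.random.default_rng(2)
T, N = 1.0, 100_000
dt = T / N
dW = np.sqrt(dt) * rng.standard_normal(N)
W = np.concatenate(([0.0], np.cumsum(dW)))

ito = np.sum(W[:-1] * dW)                    # left endpoint (lambda = 0)
strat = np.sum(0.5 * (W[:-1] + W[1:]) * dW)  # midpoint (lambda = 1/2)

print(ito, (W[-1]**2 - T) / 2)   # Ito:           (W(T)^2 - T)/2
print(strat, W[-1]**2 / 2)       # Stratonovich:   W(T)^2 / 2

Note that the Ito value matches the \(d(W^2) = 2W \, dW + dt\) example from Section 4, while the midpoint value obeys the classical chain rule \(d(W^2) = 2W \circ dW\).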
So Stratonovich integrals are defined by the midpoint evaluation rule:

\[\begin{aligned} \int_{0}^{T} f(X(t)) \circ dW &= \lim_{n \to \infty} \sum_{i=0}^{n-1} f\left(X(t_i) + \frac{1}{2} \Delta X(t_i,t_{i+1})\right) \Delta W(t_i, t_{i+1}) \\ &= \lim_{n \to \infty} \sum_{i=0}^{n-1} f\left(\frac{X(t_i)+X(t_{i+1})}{2}\right) \Delta W(t_i, t_{i+1}) \end{aligned}\]

Conversion Formula between Ito and Stratonovich

There is a formula to convert a Stratonovich SDE into a corresponding Ito SDE that depends on the Ito differential as well as the volatility function. Recall that Ito's lemma states that for \(dX = a \, dt + b \, dW\):

\[df = f_t \, dt + f_X \, dX + \frac{1}{2}f_{XX} \, dX^2 = \left(f_t + a f_X + \frac{1}{2} b^2 f_{XX}\right) dt + b f_X \, dW\]

In parallel, we defined Stratonovich's chain rule so that, for \(dX = \tilde a \, dt + \tilde b \circ dW\):

\[df = f_t \, dt + f_X \circ dX = (f_t + \tilde a f_X) \, dt + \tilde b f_X \circ dW\]

Hence, in both the Ito and Stratonovich forms, the stochastic differential is scaled by the volatility function of \(X\) times \(f_X\), but the drift function changes. Let's find a conversion formula between the two. Suppose we have:

\[dX = a \, dt + b \, dW = \tilde a \, dt + b \circ dW\]

Then, our objective is to find \(\tilde a\) in terms of \(a\). Recall from the integral definition that \(b(X) \circ dW = b(X+\frac{1}{2}dX) \, dW\). If we Taylor expand around \(X\), we have:

\[b(X+\tfrac{1}{2}dX) \, dW = b(X) \, dW + \frac{1}{2} b_X(X) \, dX \, dW + o(dt)\]

Now, if we plug in \(dX = a \, dt + b \, dW\), the \(dt \, dW\) cross term vanishes, leaving \(\frac{1}{2} b_X b \, dW^2 \sim \frac{1}{2} b_X b \, dt\) (where the arguments \(X\) are left implicit). Matching the drift terms then gives:

\[a = \tilde a + \frac{1}{2} b_X b.\]

Applications of Stratonovich Calculus

Stratonovich calculus, with its midpoint evaluation rule, adjusts how we handle stochastic integrals compared to Ito's left-endpoint approach. This shift makes it valuable in certain fields where its properties align with physical systems or simplify calculations. Below are some practical applications, each with a concrete mathematical example.

* Physics with Multiplicative Noise: In physical systems, noise often scales with the state--like a particle in a fluid where random kicks depend on its position. Consider a damped oscillator with position \(X(t)\) under state-dependent noise: \[dX = -k X \, dt + \sigma X \circ dW\] Here, \(k > 0\) is the damping constant, \(\sigma\) is the noise strength, and \(\circ \, dW\) denotes the Stratonovich differential. Using Stratonovich's chain rule, for \(f(X) = \ln X\): \[d(\ln X) = \frac{1}{X} (-k X \, dt + \sigma X \circ dW) = -k \, dt + \sigma \circ dW\] This integrates to \(X(t) = X(0) e^{-kt + \sigma W(t)}\), matching the expected exponential decay with noise. Stratonovich fits here because it preserves symmetries in continuous physical processes, unlike Ito, which adds a \(\frac{1}{2} \sigma^2 X \, dt\) drift term.

* Wong-Zakai Theorem and Smooth Noise: Real-world noise isn't perfectly white (uncorrelated like \(dW\))--it's often smoother. The Wong-Zakai theorem shows that approximating smooth noise (e.g., \(\eta(t)\) with correlation time \(\epsilon\)) as \(\epsilon \to 0\) yields a Stratonovich SDE. Take a simple system: \[\dot{x} = a x + b x \eta(t)\] As \(\eta(t) \to\) white noise, this becomes \(dX = a X \, dt + b X \circ dW\). In Stratonovich form, the solution is \(X(t) = X(0) e^{a t + b W(t)}\). This is useful in engineering, like modeling voltage in a circuit with thermal fluctuations, where noise has slight smoothness.
* Stochastic Control: In control problems, Stratonovich can simplify dynamics under feedback. Consider a system with control input \(u(t)\) and noise: \[dX = (a X + u) \, dt + \sigma X \circ dW\] For \(f(X) = X^2\), the Stratonovich rule gives: \[d(X^2) = 2X (a X + u) \, dt + 2X \cdot \sigma X \circ dW = (2a X^2 + 2X u) \, dt + 2\sigma X^2 \circ dW\] The lack of a second-derivative term (unlike Ito's \(+ \sigma^2 X^2 \, dt\)) aligns with classical control intuition, making it easier to design \(u(t)\) for, say, stabilizing a noisy pendulum or a drone in wind.

* Biological Diffusion: In biology, noise can depend on spatial gradients, like protein diffusion across a cell. Model this as: \[dX = \mu \, dt + \sigma(X) \circ dW, \quad \sigma(X) = \sqrt{2D (1 + k X^2)}\] where \(D\) is diffusivity and \(k\) adjusts noise with position. Stratonovich ensures the diffusion term reflects physical conservation laws, matching experimental data in systems like bacterial motility better than Ito, which alters the drift.

* Numerical Stability: For simulations, Stratonovich pairs well with midpoint methods. Take \(dX = -a X \, dt + \sigma \circ dW\). A Stratonovich discretization might use: \[X_{n+1} = X_n - a \left(\frac{X_n + X_{n+1}}{2}\right) \Delta t + \sigma \Delta W_n\] This implicit scheme leverages the midpoint rule, reducing numerical artifacts in models like chemical kinetics compared to Ito's explicit steps.

The choice between Stratonovich and Ito depends on context. Stratonovich suits systems where noise is tied to physical continuity or symmetry, while Ito dominates in finance for its non-anticipating properties. The conversion \(a = \tilde{a} + \frac{1}{2} b b_X\) lets you switch forms as needed.

Appendix

A.0. Further Reading

* "An Intuitive Introduction For Understanding and Solving Stochastic Differential Equations" - Chris Rackauckas (2017)
* "Stochastic Analysis" - Paul Bourgade (2010)
* "An Introduction to Stochastic Differential Equations", Version 1.2 - Lawrence C. Evans (2013)
* "Stochastic Differential Equations: An Introduction with Applications" - Bernt K. Øksendal (2003)
* Wikipedia: Stochastic calculus
* Wikipedia: Stochastic differential equation

A.1. Notation

Here is a list of notation used in this document:

* \(\binom{n}{k}=\frac{n!}{k!(n-k)!}\) is the binomial coefficient
* \(X: \Omega \to \mathbb{R}\) is a random variable from a sample space \(\Omega\) to a real number
* \(P(A)\) is the probability of event \(A\)
* \(E[X]=\int_{\omega \in \Omega} X(\omega) \, dP(\omega)\) is the expected value of \(X\)
* \(N(\mu, \sigma^2)\) is a normal distribution with mean \(\mu\) and variance \(\sigma^2\)
* \(W(t)\) is the position of a Brownian motion at time \(t\)
* \(\Delta W(t_1,t_2)\) is the displacement of a Brownian motion from time \(t_1\) to time \(t_2\)
* \(dt\) is an infinitesimal time increment
* \(dW := \Delta W(t,t+dt)\) is an infinitesimal increment of Brownian motion over time
* \((dW)^2 \sim dt\) denotes that \((dW)^2 = dt + o(dt)\) where \(\lim_{dt \to 0} \frac{o(dt)}{dt} = 0\), such that \((dW)^2\) is asymptotically equal to \(dt\) in the mean-square limit: \(\lim_{dt \to 0} \frac{E\left[\left((dW)^2-dt\right)^2\right]}{dt}=0\)
* \(f_t := \frac{\partial f}{\partial t}\) is the partial derivative of \(f\) with respect to \(t\)
* \(f_{xx} := \frac{\partial^2 f}{\partial x^2}\) is the second-order partial derivative of \(f\) with respect to \(x\)
B.1. Python Code for Binomial Plots

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom

n_values = [5, 10, 25, 50, 100]
p = 0.5

# Individual plots
for n in n_values:
    k = np.arange(0, n + 1)
    positions = 2 * k - n
    probs = binom.pmf(k, n, p)

    plt.figure(figsize=(6, 4))
    plt.bar(positions, probs, width=1.0, color='skyblue', edgecolor='black')
    plt.title(f'n = {n}')
    plt.xlabel('Position (# wins - # losses)')
    plt.ylabel('Probability')
    plt.ylim(0, max(probs) * 1.2)
    plt.savefig(f'random_walk_n_{n}.png', dpi=300, bbox_inches='tight')
    plt.close()

# Combined plot
fig, axes = plt.subplots(5, 1, figsize=(8, 12), sharex=True)
for i, n in enumerate(n_values):
    k = np.arange(0, n + 1)
    positions = 2 * k - n
    probs = binom.pmf(k, n, p)
    axes[i].bar(positions, probs, width=1.0, color='skyblue', edgecolor='black')
    axes[i].set_title(f'n = {n}')
    axes[i].set_ylabel('Probability')
    axes[i].set_ylim(0, max(probs) * 1.2)
axes[-1].set_xlabel('Position (# wins - # losses)')
plt.tight_layout()
plt.savefig('random_walk_combined.png', dpi=300, bbox_inches='tight')
plt.close()

B.2. Python Code for Brownian Motion Plot

import numpy as np
import matplotlib.pyplot as plt

# Simulate Brownian motion
np.random.seed(42)
t = np.linspace(0, 1, 1000)  # Time from 0 to 1
dt = t[1] - t[0]
dW = np.sqrt(dt) * np.random.normal(0, 1, size=len(t)-1)  # Increments
W = np.concatenate([[0], np.cumsum(dW)])  # Cumulative sum starts at 0

# Plot
plt.plot(t, W)
plt.title("Sample Brownian Motion Path")
plt.xlabel("Time t")
plt.ylabel("W(t)")
plt.grid(True)
plt.show()

B.3. Python Code for Basic SDE Simulation

import numpy as np
import matplotlib.pyplot as plt

# Simulate simple SDE: dX = mu dt + sigma dW
np.random.seed(42)
T = 1.0
N = 1000
dt = T / N
t = np.linspace(0, T, N+1)
mu, sigma = 1.0, 0.5
X = np.zeros(N+1)
for i in range(N):
    dW = np.sqrt(dt) * np.random.normal(0, 1)
    X[i+1] = X[i] + mu * dt + sigma * dW

plt.plot(t, X, label=f"m={mu}, s={sigma}")
plt.title("Sample Path of dX = m dt + s dW")
plt.xlabel("Time t")
plt.ylabel("X(t)")
plt.legend()
plt.grid(True)
plt.show()

B.4. Python Code for Geometric Brownian Motion Simulation

import numpy as np
import matplotlib.pyplot as plt

# Simulate Geometric Brownian Motion via its exact solution
np.random.seed(42)

T_gbm = 10.0   # Longer time to show exponential nature
N_gbm = 1000
t_gbm = np.linspace(0, T_gbm, N_gbm+1)
dt_gbm = T_gbm / N_gbm
S0 = 100.0             # Initial stock price
mu, sigma = 0.15, 0.2  # Slightly larger for visibility

# Build W(t) as a cumulative sum of independent N(0, dt) increments.
# (Drawing sqrt(t) * N(0,1) independently at each time point has the
# right marginals but is not a consistent Brownian path.)
dW = np.sqrt(dt_gbm) * np.random.normal(0, 1, N_gbm)
W = np.concatenate(([0.0], np.cumsum(dW)))
S = S0 * np.exp((mu - 0.5 * sigma**2) * t_gbm + sigma * W)

plt.figure(figsize=(8, 4))
plt.plot(t_gbm, S, label=f"m={mu}, s={sigma}")
plt.title("Sample Path: Geometric Brownian Motion")
plt.xlabel("Time t")
plt.ylabel("S(t)")
plt.legend()
plt.grid(True)
plt.savefig("gbm_path.png", dpi=300, bbox_inches="tight")
plt.show()
B.5. LaTeX Code for TikZ Diagram of Paths in Pascal's Triangle

\documentclass{standalone}
\usepackage{tikz}
\begin{document}

\begin{tikzpicture}[scale=0.8]
% Add a white background rectangle
\fill[white] (-12, 1) rectangle (10, -5);

% Row labels (only once, to the left of the first diagram)
\node[align=right] at (-11, 0) {Row 0};
\node[align=right] at (-11, -1) {Row 1};
\node[align=right] at (-11, -2) {Row 2};
\node[align=right] at (-11, -3) {Row 3};

% Diagram 1: Path RRL
\node at (-6, 0) {1};  % Row 0
\node at (-7, -1) {1}; % Row 1
\node at (-5, -1) {1};
\node at (-8, -2) {1}; % Row 2
\node at (-6, -2) {2};
\node at (-4, -2) {1};
\node at (-9, -3) {1}; % Row 3
\node at (-7, -3) {3};
\node at (-5, -3) {3};
\node at (-3, -3) {1};
\draw[->, red, thick] (-6, 0) -- (-5, -1) -- (-4, -2) -- (-5, -3); % RRL
\node at (-6, -4) {Right-Right-Left};

% Diagram 2: Path RLR
\node at (0, 0) {1};   % Row 0
\node at (-1, -1) {1}; % Row 1
\node at (1, -1) {1};
\node at (-2, -2) {1}; % Row 2
\node at (0, -2) {2};
\node at (2, -2) {1};
\node at (-3, -3) {1}; % Row 3
\node at (-1, -3) {3};
\node at (1, -3) {3};
\node at (3, -3) {1};
\draw[->, blue, thick] (0, 0) -- (1, -1) -- (0, -2) -- (1, -3); % RLR
\node at (0, -4) {Right-Left-Right};

% Diagram 3: Path LRR
\node at (6, 0) {1};  % Row 0
\node at (5, -1) {1}; % Row 1
\node at (7, -1) {1};
\node at (4, -2) {1}; % Row 2
\node at (6, -2) {2};
\node at (8, -2) {1};
\node at (3, -3) {1}; % Row 3
\node at (5, -3) {3};
\node at (7, -3) {3};
\node at (9, -3) {1};
\draw[->, green, thick] (6, 0) -- (5, -1) -- (6, -2) -- (7, -3); % LRR
\node at (6, -4) {Left-Right-Right};
\end{tikzpicture}

\end{document}

3D Visualizations
C.1. 3D Plot of Discrete Random Walks

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # for 3D plotting
import imageio.v3 as imageio  # using modern imageio v3 API
import os
from scipy.special import comb
from scipy.stats import norm

# Create a directory for frames
os.makedirs('gif_frames', exist_ok=True)

##############################################
# Part 1: Discrete Binomial Random Walk (N = 15)
##############################################

N = 15                # total number of steps (kept small for clear discreteness)
num_sample_paths = 5  # number of sample paths to overlay

# Simulate a few discrete random walk sample paths
sample_paths = []
for i in range(num_sample_paths):
    steps = np.random.choice([-1, 1], size=N)
    path = np.concatenate(([0], np.cumsum(steps)))
    sample_paths.append(path)
sample_paths = np.array(sample_paths)  # shape: (num_sample_paths, N+1)

frames = []
for t_step in range(N + 1):
    fig = plt.figure(figsize=(10, 7))
    ax = fig.add_subplot(111, projection='3d')

    # For each discrete time slice up to the current time, plot the PMF
    for t in range(t_step + 1):
        # For a random walk starting at 0, possible positions are -t, -t+2, ..., t
        x_values = np.arange(-t, t + 1, 2)
        if t == 0:
            p_values = np.array([1.0])
        else:
            # k = (x + t)/2 gives the number of +1 steps
            k = (x_values + t) // 2
            p_values = comb(t, k) * (0.5 ** t)
        # Plot the discrete PMF as blue markers (and connect them with a line)
        ax.scatter(x_values, [t]*len(x_values), p_values, color='blue', s=50)
        ax.plot(x_values, [t]*len(x_values), p_values, color='blue', alpha=0.5)

    # Overlay the sample random walk paths (projected at z=0)
    for sp in sample_paths:
        ax.plot(sp[:t_step + 1], np.arange(t_step + 1), np.zeros(t_step + 1),
                'r-o', markersize=5, label='Sample Path' if t_step == 0 else "")

    ax.set_xlabel('Position (x)')
    ax.set_ylabel('Time (steps)')
    ax.set_zlabel('Probability')
    ax.set_title(f'Discrete Binomial Random Walk: Step {t_step}')
    ax.set_zlim(0, 1.0)
    ax.view_init(elev=30, azim=-60)

    frame_path = f'gif_frames/discrete_binomial_{t_step:02d}.png'
    plt.savefig(frame_path)
    plt.close()
    frames.append(imageio.imread(frame_path))

# Per-frame durations: 0.25 sec for all frames except the last one (2 sec)
durations = [0.25] * (len(frames) - 1) + [2.0]

# Write the GIF with variable durations and infinite looping
imageio.imwrite('discrete_binomial.gif', frames, duration=durations, loop=0)

##############################################
# Part 2: Discrete Random Walk Normalizing (N = 50)
##############################################

N = 50                # total number of steps (increased to show gradual convergence)
num_sample_paths = 5  # number of sample paths to overlay

# Simulate a few discrete random walk sample paths
sample_paths = []
for i in range(num_sample_paths):
    steps = np.random.choice([-1, 1], size=N)
    path = np.concatenate(([0], np.cumsum(steps)))
    sample_paths.append(path)
sample_paths = np.array(sample_paths)  # shape: (num_sample_paths, N+1)

frames = []
for t_step in range(N + 1):
    fig = plt.figure(figsize=(10, 7))
    ax = fig.add_subplot(111, projection='3d')

    # Plot the PMFs for all time slices from 0 to the current step
    for t in range(t_step + 1):
        # For a random walk starting at 0, possible positions are -t, -t+2, ..., t
        x_values = np.arange(-t, t + 1, 2)
        if t == 0:
            p_values = np.array([1.0])
        else:
            # For each x, number of +1 steps is (x+t)/2
            k = (x_values + t) // 2
            p_values = comb(t, k) * (0.5 ** t)

        # Plot the discrete PMF as blue markers and lines
        ax.scatter(x_values, [t]*len(x_values), p_values, color='blue', s=50)
        ax.plot(x_values, [t]*len(x_values), p_values, color='blue', alpha=0.5)

        # For the current time slice, overlay the normal approximation in red
        if t == t_step and t > 0:
            x_cont = np.linspace(-t, t, 200)
            normal_pdf = norm.pdf(x_cont, 0, np.sqrt(t))
            ax.plot(x_cont, [t]*len(x_cont), normal_pdf, 'r-', linewidth=2, label='Normal Approx.')

    # Overlay the sample random walk paths (projected along the z=0 plane)
    for sp in sample_paths:
        ax.plot(sp[:t_step + 1], np.arange(t_step + 1), np.zeros(t_step + 1),
                'g-o', markersize=5, label='Sample Path' if t_step == 0 else "")

    ax.set_xlabel('Position (x)')
    ax.set_ylabel('Time (steps)')
    ax.set_zlabel('Probability')
    ax.set_title(f'Discrete Binomial Random Walk at Step {t_step}')
    ax.set_zlim(0, 1.0)
    ax.view_init(elev=30, azim=-60)

    frame_path = f'gif_frames/discrete_binomial_{t_step:02d}.png'
    plt.savefig(frame_path)
    plt.close()
    frames.append(imageio.imread(frame_path))

# Per-frame durations: 0.25 sec for all frames except the last one (2 sec)
durations = [0.25] * (len(frames) - 1) + [2.0]

# Write the GIF with variable durations and infinite looping
imageio.imwrite('discrete_binomial_normalizing.gif', frames, duration=durations, loop=0)

C.2. 3D Animation of Brownian Motion

Normal distribution sweeping and evolving across time according to Brownian motion.

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # for 3D plotting
from scipy.stats import norm
import imageio.v3 as imageio  # using modern API
import os

os.makedirs('gif_frames', exist_ok=True)

# Parameters for continuous Brownian motion
num_frames = 100  # more frames for smoother animation
t_values = np.linspace(0.1, 5, num_frames)
x = np.linspace(-5, 5, 200)  # increased resolution

num_sample_paths = 5
sample_paths = np.zeros((num_sample_paths, len(t_values)))
dt_cont = t_values[1] - t_values[0]
for i in range(num_sample_paths):
    increments = np.random.normal(0, np.sqrt(dt_cont), size=len(t_values)-1)
    sample_paths[i, 1:] = np.cumsum(increments)

frames = []
for i, t in enumerate(t_values):
    fig = plt.figure(figsize=(10, 7))
    ax = fig.add_subplot(111, projection='3d')

    mask = t_values <= t
    T_sub, X_sub = np.meshgrid(t_values[mask], x)
    P_sub = (1 / np.sqrt(2 * np.pi * T_sub)) * np.exp(-X_sub**2 / (2 * T_sub))
    ax.plot_surface(X_sub, T_sub, P_sub, cmap='viridis', alpha=0.7, edgecolor='none')

    for sp in sample_paths:
        ax.plot(sp[:i+1], t_values[:i+1], np.zeros(i+1), 'r-', marker='o', markersize=3)

    ax.set_xlabel('Position (x)')
    ax.set_ylabel('Time (t)')
    ax.set_zlabel('Density')
    ax.set_title(f'Continuous Brownian Motion at t = {t:.2f}')
    ax.view_init(elev=30, azim=-60)

    frame_path = f'gif_frames/continuous_3d_smooth_t_{t:.2f}.png'
    plt.savefig(frame_path)
    plt.close()
    frames.append(imageio.imread(frame_path))

imageio.imwrite('continuous_brownian_3d_smooth.gif', frames, duration=0.1)
C.3. 3D Animation of Geometric Brownian Motion

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # for 3D plotting
import imageio.v3 as imageio  # modern API
import os

os.makedirs('gif_frames', exist_ok=True)

# Parameters for geometric Brownian motion (GBM)
S0 = 1.0     # initial stock price
mu = 0.2     # drift rate (increased for noticeable drift)
sigma = 0.2  # volatility

num_frames = 100
t_values = np.linspace(0.1, 5, num_frames)  # avoid t=0 to prevent singularity in density
S_range = np.linspace(0.01, 5, 200)         # price range

# Simulate GBM sample paths
num_sample_paths = 5
sample_paths = np.zeros((num_sample_paths, len(t_values)))
dt = t_values[1] - t_values[0]
for i in range(num_sample_paths):
    increments = np.random.normal(0, np.sqrt(dt), size=len(t_values)-1)
    W = np.concatenate(([0], np.cumsum(increments)))
    sample_paths[i] = S0 * np.exp((mu - 0.5 * sigma**2) * t_values + sigma * W)

frames = []
for i, t in enumerate(t_values):
    fig = plt.figure(figsize=(10, 7))
    ax = fig.add_subplot(111, projection='3d')

    mask = t_values <= t
    T_sub, S_sub = np.meshgrid(t_values[mask], S_range)
    P_sub = (1 / (S_sub * sigma * np.sqrt(2 * np.pi * T_sub))) * \
        np.exp(- (np.log(S_sub / S0) - (mu - 0.5 * sigma**2) * T_sub)**2 / (2 * sigma**2 * T_sub))
    ax.plot_surface(S_sub, T_sub, P_sub, cmap='viridis', alpha=0.7, edgecolor='none')

    for sp in sample_paths:
        ax.plot(sp[:i+1], t_values[:i+1], np.zeros(i+1), 'r-', marker='o', markersize=3)

    ax.set_xlabel('Stock Price S')
    ax.set_ylabel('Time t')
    ax.set_zlabel('Density')
    ax.set_title(f'Geometric Brownian Motion at t = {t:.2f}')
    ax.view_init(elev=30, azim=-60)

    frame_path = f'gif_frames/geometric_brownian_drifted_3d_t_{t:.2f}.png'
    plt.savefig(frame_path)
    plt.close()
    frames.append(imageio.imread(frame_path))

imageio.imwrite('geometric_brownian_drifted_3d.gif', frames, duration=0.1)

C.4. Python Code for Normal Distribution Approximation by Random Walks

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
import imageio.v3 as imageio  # modern ImageIO v3 API
import os
from scipy.special import comb

# Create a directory for frames
os.makedirs('gif_frames', exist_ok=True)

# 1. Continuous Brownian Motion with Sample Paths
# Define time values and x range for density
t_values = np.linspace(0.1, 5, 50)  # Times from 0.1 to 5
x = np.linspace(-5, 5, 100)         # Range of x values

# Simulate a few sample Brownian motion paths
num_sample_paths = 5
dt_cont = t_values[1] - t_values[0]  # constant time step (~0.1)
sample_paths = np.zeros((num_sample_paths, len(t_values)))
sample_paths[:, 0] = 0
increments = np.random.normal(0, np.sqrt(dt_cont), size=(num_sample_paths, len(t_values)-1))
sample_paths[:, 1:] = np.cumsum(increments, axis=1)

frames = []
for i, t in enumerate(t_values):
    p = (1 / np.sqrt(2 * np.pi * t)) * np.exp(-x**2 / (2 * t))

    plt.figure(figsize=(12, 4))
    plt.subplot(1, 2, 1)
    plt.plot(x, p, 'b-', label=f't = {t:.2f}')
    plt.title('Brownian Motion Distribution')
    plt.xlabel('x')
    plt.ylabel('Density p(x,t)')
    plt.ylim(0, 0.8)
    plt.legend()
    plt.grid(True)

    plt.subplot(1, 2, 2)
    for sp in sample_paths:
        plt.plot(t_values[:i+1], sp[:i+1], '-o', markersize=3)
    plt.title('Sample Brownian Paths')
    plt.xlabel('Time')
    plt.ylabel('Position')
    plt.xlim(0, 5)
    plt.grid(True)

    frame_path = f'gif_frames/continuous_t_{t:.2f}.png'
    plt.tight_layout()
    plt.savefig(frame_path)
    plt.close()
    frames.append(imageio.imread(frame_path))

# Save the continuous Brownian motion GIF
# (duration in seconds per frame; adjust as desired)
imageio.imwrite('continuous_brownian.gif', frames, duration=0.1)

# 2. Discrete Random Walk with Sample Paths

def simulate_random_walk(dt, T, num_paths):
    """Simulate random walk paths with step size sqrt(dt)."""
    n_steps = int(T / dt)
    positions = np.zeros((num_paths, n_steps + 1))
    for i in range(num_paths):
        increments = np.random.choice([-1, 1], size=n_steps) * np.sqrt(dt)
        positions[i, 1:] = np.cumsum(increments)
    return positions

dt = 0.01          # Step size
T = 5.0            # Total time
num_paths = 10000  # For histogram
times = np.arange(0, T + dt, dt)
positions = simulate_random_walk(dt, T, num_paths)
sample_indices = np.arange(5)

frames = []
for i, t in enumerate(times):
    if i % 10 == 0:  # Use every 10th frame for the GIF
        current_positions = positions[:, i]
        x_vals = np.linspace(-5, 5, 100)
        p_theoretical = norm.pdf(x_vals, 0, np.sqrt(t) if t > 0 else 1e-5)

        plt.figure(figsize=(12, 4))
        plt.subplot(1, 2, 1)
        plt.hist(current_positions, bins=50, density=True, alpha=0.6, label=f't = {t:.2f}')
        plt.plot(x_vals, p_theoretical, 'r-', label='N(0,t)')
        plt.title('Discrete Random Walk Distribution')
        plt.xlabel('Position')
        plt.ylabel('Density')
        plt.ylim(0, 0.8)
        plt.legend()
        plt.grid(True)

        plt.subplot(1, 2, 2)
        for idx in sample_indices:
            plt.plot(times[:i+1], positions[idx, :i+1], '-o', markersize=3)
        plt.title('Sample Random Walk Paths')
        plt.xlabel('Time')
        plt.ylabel('Position')
        plt.xlim(0, T)
        plt.grid(True)

        frame_path = f'gif_frames/discrete_t_{t:.2f}.png'
        plt.tight_layout()
        plt.savefig(frame_path)
        plt.close()
        frames.append(imageio.imread(frame_path))

# Save the discrete random walk GIF with infinite looping
imageio.imwrite('discrete_random_walk.gif', frames, duration=0.1, loop=0)

This post is licensed under CC BY 4.0 by the author.