[HN Gopher] Introduction to Stochastic Calculus
       ___________________________________________________________________
        
       Introduction to Stochastic Calculus
        
       Author : ibobev
       Score  : 276 points
       Date   : 2025-02-24 15:40 UTC (7 hours ago)
        
 (HTM) web link (jiha-kim.github.io)
 (TXT) w3m dump (jiha-kim.github.io)
        
       | Daniel_Van_Zant wrote:
        | Is stochastic calculus something that requires a computer to
        | simulate many possible unfolding of events, or is there a more
       | elegant mathematical way to solve for some of the important final
       | outputs and probability distributions if you know the
       | distribution of dW? This is an awesome article. I've seen
       | stochastic calculus before but this is the first time I really
       | felt like I started to grok it.
        
         | LeonardoTolstoy wrote:
         | It has been a while since I studied along these lines
         | (stochastic chemical reaction simulations in my case) but I
          | think the answer is often yes, but not always. A random walk,
          | for example, will have a normal distribution (you know the
          | mean, and you know the variance grows without bound), so in
          | that case you do end up with an elegant analytical solution,
          | if I'm understanding correctly, as the inputs determine the
          | function the variance follows through time.
         | 
         | But often no, you need to run a stochastic algorithm (e.g.
         | Gillespie's algorithm in the case of simple stochastic chemical
         | kinetics) as there will be no analytical solution.
         | 
         | Again it has been a while though.
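A quick numeric check of that random-walk claim (the simulation setup and numbers are mine, not the commenter's): simulate many +/-1 walks and confirm the mean stays near zero while the variance grows linearly with the number of steps.

```python
import random

def random_walk(steps, rng):
    """Position after `steps` independent +1/-1 increments."""
    return sum(rng.choice((-1, 1)) for _ in range(steps))

rng = random.Random(0)
n_walks, steps = 20_000, 100
finals = [random_walk(steps, rng) for _ in range(n_walks)]

mean = sum(finals) / n_walks
var = sum((x - mean) ** 2 for x in finals) / n_walks

# For a +/-1 walk, E[X_t] = 0 and Var[X_t] = t exactly.
print(f"mean ~ {mean:.2f} (expect 0), variance ~ {var:.1f} (expect {steps})")
```

Doubling `steps` should roughly double the sample variance, which is the "variance follows a function through time" point in analytic form.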
        
           | yoyoma1234 wrote:
            | For normal distributions I think you do - Black-Scholes is
            | an analytical solution to option pricing. Been a while since
            | I studied stochastic calculus.
            | 
            | I question why this is the second highest article on Hacker
            | News currently; I can't imagine many people reading this
            | website are REALLY in this field or a related one. Maybe
            | it's just signaling, like saying you have a copy of Knuth's
            | books or that famous Lisp one.
        
             | PhilipRoman wrote:
             | This is one of those archetypal submissions on HN:
             | mathematics (preferably pure, using the word "calculus"
             | outside of integrals/derivatives gives additional points),
             | moderately high number of upvotes, very few comments.
             | Pretty much the opposite of political posts, where everyone
             | can "contribute" to the discussion.
        
              | nh23423fefe wrote:
              | I upvote good things even if I don't read them, because I
              | don't want to spend all my energy reacting to trash
              | politics posts. Cut away bad, promote good.
        
             | magicalhippo wrote:
             | I upvote so it sticks around longer, so it has a better
             | chance of generating interesting comments.
             | 
             | I also upvote because I find it interesting to learn about
             | stuff I didn't know about. I might not understand it, but I
             | do like the exposure regardless.
        
         | FabHK wrote:
         | Certain simple stochastic differential equations can be solved
         | explicitly analytically (like some integrals and simple
         | ordinary differential equations can be solved explicitly), for
         | example the classic Black Scholes equation. More complicated
         | ones typically can't be solved in that way.
         | 
         | What one often wishes to have is the expectation of a function
         | of a stochastic process at some point, and what can be shown is
         | that this expectation obeys a certain (deterministic) partial
         | differential equation. This then can be solved using numerical
         | PDE solvers.
         | 
         | In higher dimensions, though, or if the process is highly path-
         | dependent (not Markovian), one resorts to Monte Carlo
         | simulation, which does indeed simulate "many possible unfolding
         | of events".
        
         | kkylin wrote:
         | It depends a bit on exactly what you want to calculate, but in
          | general something like the probability density function of the
         | solution of a stochastic differential equation (SDE) at time t
         | satisfies a partial differential equation (PDE) that is first
         | order in time and second order in space [0]. (This PDE is known
         | to physicists as the Fokker-Planck equation and to
         | mathematicians as the Kolmogorov forward equation.) Except in
         | special examples, the PDE will not have exact analytical
         | solutions, and a numerical solution is needed. Such a numerical
         | solution will be very expensive in high dimensions, however, so
         | in high-dimensional problems it is cheaper to solve the SDE and
         | do Monte Carlo sampling, rather than try to solve the PDE.
         | 
         | Edit: sometimes people are interested in other types of
         | questions, for example the solution when certain random events
         | occur. Analogous comments apply. Also, while stochastic
         | calculus is very useful for working with SDEs, if your interest
         | is other types of Markov (or even non-Markov) processes you may
         | need other tools.
         | 
         | Edit again: as another commenter mentioned, in special cases
         | the SDE itself may also have exact solutions, but in general
         | not.
         | 
         | [0] This statement is specific to stochastic differential
         | equations, i.e., a differential equation with (gaussian) white
         | noise forcing. For other types of stochastic processes, e.g.,
          | Markov jump processes, the evolution equation for distributions
          | has a different form (but some general principles apply to
         | both, e.g., forms of the Chapman-Kolmogorov equation, etc).
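As a concrete sketch of the Monte Carlo route described above (the process and parameter choices are mine): Euler-Maruyama paths of an Ornstein-Uhlenbeck SDE, dX = -theta*X dt + sigma dW, whose Fokker-Planck equation has an exact Gaussian solution, so the sampled moments can be checked against closed-form values.

```python
import math
import random

def euler_maruyama_ou(x0, theta, sigma, t, n_steps, rng):
    """One Euler-Maruyama path of dX = -theta*X dt + sigma dW."""
    dt = t / n_steps
    x = x0
    for _ in range(n_steps):
        x += -theta * x * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x

rng = random.Random(1)
x0, theta, sigma, t = 1.0, 1.0, 0.5, 2.0
samples = [euler_maruyama_ou(x0, theta, sigma, t, 200, rng)
           for _ in range(20_000)]

mc_mean = sum(samples) / len(samples)
mc_var = sum((x - mc_mean) ** 2 for x in samples) / (len(samples) - 1)

# Exact solution of the (Gaussian) Fokker-Planck equation for OU:
exact_mean = x0 * math.exp(-theta * t)
exact_var = sigma**2 / (2 * theta) * (1 - math.exp(-2 * theta * t))
print(mc_mean, exact_mean)   # both close to 0.135
print(mc_var, exact_var)     # both close to 0.123
```

In one dimension the PDE route would be cheaper; the Monte Carlo route above is the one that survives in high dimensions, as the comment says.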
        
         | anvuong wrote:
         | Depends on what you want to know. If you want to get some
         | trajectories then simulation of the stochastic differential
         | equation is required. But if you just want to know the
         | statistics of the paths, then in many cases you can write and
         | try to solve the Fokker-Planck equation, which is a partial
         | differential equation, to get the path density.
        
         | sfpotter wrote:
         | In case the other responses to your question are a little
         | difficult to parse, and to answer your question a little more
         | directly:
         | 
         | - Usually, you will only get analytic answers for simple
         | questions about simple distributions.
         | 
         | - For more complicated problems (either because the question is
         | complicated, or the distribution is complicated, or both), you
         | will need to use numerical methods.
         | 
          | - This _doesn't_ necessarily mean you'll need to do many
         | simulations, as in a Monte Carlo method, although that can be a
         | very reasonable (albeit expensive) approach.
         | 
         | More direct questions about certain probabilities can be
         | answered without using a Monte Carlo method. The Fokker-Planck
         | equation is a partial differential equation which can be solved
         | using a variety of non-Monte Carlo approaches. The
         | quasipotential and committor functions are interesting objects
         | which come up in the simulation of rare events that can also be
         | computed "directly" (i.e., without using a Monte Carlo
         | approach). The crux of the problem is that applying standard
         | numerical methods to the computation of these objects faces the
         | curse of dimensionality. Finding good ways to compute these
         | things in the high-dimensional case (or even the infinite-
         | dimensional case) is a very hot area of research in applied
         | mathematics. Personally, I think unless you have a very clear
         | physical application where the mathematics map cleanly onto
         | what you're doing, all this stuff is probably a bit of a waste
         | of time...
        
           | Daniel_Van_Zant wrote:
            | Thanks for the explanation, this was very helpful. You've
            | given me a whole new list of stuff to Google. The
            | quasipotential/committor functions especially seem quite
            | interesting, although I'm having a bit of trouble finding
            | good resources on them.
        
       | EGreg wrote:
        | I remember studying stochastic calculus.
        | 
        | And I remember noting that "quadratic variation" was slightly
        | different from how variance is calculated in regular
        | statistics. Off by one, or squared, or whatever. I made a note
        | to eventually investigate why. Probably due to some stochastic
        | volatility.
        
         | FabHK wrote:
         | There is the fact that the variance of the entire population is
         | defined [0] as                 sum i=1..N (x_i - mu)^2 / N
         | 
         | while, given a sample of n iid [1] samples from a distribution,
         | the _best [2] estimate_ of the distribution variance is
         | sum i=1..n (x_i - a )^2 / (n-1)
         | 
         | Note that we replaced the mean _mu_ by the sample average _a,_
         | [3] and divided by (n-1) instead of N.
         | 
         | [0] with the mean mu := sum x_i / N being the actual mean of
         | the population
         | 
         | [1] independent and identically distributed
         | 
         | [2] best in the sense of being unbiased. It's a tedious, but
         | not very difficult calculation to confirm that the expectation
         | of that second expression (with n-1) is the population
         | variance.
         | 
         | [3] with the sample average a := sum x_i / n being an estimate
         | of the population mean
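A small numeric check of footnote [2] (the experiment design is mine): repeatedly draw n samples from N(0, 1), average the /n and /(n-1) variance estimates over many trials, and only the latter's average lands on the population variance of 1.

```python
import random

def sample_variances(xs):
    """Return (biased, unbiased) variance estimates of a sample."""
    n = len(xs)
    a = sum(xs) / n                      # sample average, not the true mean
    ss = sum((x - a) ** 2 for x in xs)
    return ss / n, ss / (n - 1)

rng = random.Random(2)
n, trials = 5, 100_000
biased, unbiased = 0.0, 0.0
for _ in range(trials):
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b, u = sample_variances(xs)
    biased += b / trials
    unbiased += u / trials

# Averages: biased -> (n-1)/n * pop_var = 0.8, unbiased -> 1.0
print(f"biased avg ~ {biased:.3f}, unbiased avg ~ {unbiased:.3f}")
```

The (n-1)/n factor that shows up in the biased average is exactly the correction discussed in the sibling comment.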
        
         | SeaGully wrote:
         | The other guy gives a solid explanation so don't use mine as a
         | replacement or to assume the other is wrong.
         | 
         | To me there are two ways to approach the problem I think you
         | are thinking of (sample variance I think).
         | 
         | (1) The sample variance depends on the sample mean which is
         | sum(x_i) / n. Given the first n-1 of n samples, you would then
         | know the final value (x_n = n * sample_mean - sum(x_i)_(n-1))
         | so at the very least n-1 could be understood as a "degrees of
         | freedom". There are only n-1 degrees of freedom. Other higher
         | sample moments can be roughly understood with the same degrees
         | of freedom argument. This could be wrong though, it was just
         | something I remember from somewhere.
         | 
         | (2) The more mathematically inclined way is that
         | biased_sample_variance = sum((x_i - sum(x_i) / n)^2) / n. The
         | mean of the biased_sample_variance (across many iterations of a
         | set of samples N), is not the population variance, but (n - 1)
         | / n * population_variance (i.e. it is biased). So you multiply
         | the biased_sample_variance by (n / (n - 1)) which gives the
         | unbiased sample_variance equation: sum((x_i - sum(x_i) / n)^2)
         | / (n - 1). The math is rather fun in my opinion, once you get
         | into the swing of things.
         | 
         | I sure do hope I understood your question correctly.
        
       | ForceBru wrote:
       | Seems like a great article. Having some prior experience with
       | stochastic calculus, I think I understand almost everything here.
       | Any other good introductory materials?
        
         | seanhunter wrote:
          | I've been planning to study this a bit, although I have some
          | background to cover first so haven't gotten to it. From what
         | I've found, the youtube channel "Mathematical Toolbox" has some
         | videos which are quite introductory but seem good. Some people
         | also recommend the book "An Informal Introduction to Stochastic
         | Calculus with Applications" by Calin as a good place to start.
         | Then Klebaner "Introduction to Stochastic Calculus with
         | Applications" and also Evans "An Introduction to Stochastic
         | Differential Equations" are apparently very good but harder and
         | more formal texts, but you need some analysis and measure
         | theoretic probability background first. The Evans is the same
          | Evans who wrote the definitive book about PDEs, fwiw. Klebaner
          | and Evans are apparently a lot harder than Calin, even though
          | they are all called introductions.
        
       | dmvdoug wrote:
       | Can someone please help me parse this sentence?
       | 
       | > _Brownian motion and Ito calculare a notable example of fairly
       | high-level mathematics that are applied to model the real world_
       | 
       | What is "Ito calculare" supposed to have been? I am stumped. "Its
       | calculation"?
        
         | karpierz wrote:
         | Ito calculus - https://en.wikipedia.org/wiki/It%C3%B4_calculus
        
         | luisfmh wrote:
         | Ito is the name of the type of calculus
         | (https://en.wikipedia.org/wiki/It%C3%B4_calculus) and calculare
         | I think is just the plural of calculus. So something like "all
         | the ito calculus are notable examples of fairly high level
         | mathematics ..."
        
           | dmvdoug wrote:
           | That makes so much more sense! Although the pedant in me
           | wants to argue that calculus plural is "calculi"/"calculuses"
           | (the dictionary gives me the latter, although I've never seen
           | it in the wild myself---but I won't pursue that because it's
           | beside the point!) Thanks for the help!
        
           | layer8 wrote:
           | The plural of calculus is calculi or calculuses. Calculare
           | might be an autocorrection for a different language
           | (https://en.wiktionary.org/wiki/calculare), though given that
           | the author has a Korean name, it's more likely just a weird
           | typo.
        
         | FabHK wrote:
         | Typo.
         | 
         | -> and Ito calculus are a notable
        
         | adgjlsfhk1 wrote:
         | the article goes into the details in https://jiha-
         | kim.github.io/posts/introduction-to-stochastic-... but the TLDR
         | is it's a way to define integration of random walks.
        
         | ricoxicano wrote:
         | I think it's a reference to Ito Calculus
         | 
         | https://en.wikipedia.org/wiki/It%C3%B4_calculus
        
         | incognito124 wrote:
         | It's a typo. "calculare" is supposed to be "calculus are"
        
       | whatshisface wrote:
       | Here's my understanding of Ito calculus if it helps anyone:
       | 
       | 1. The only random process we understand initially is Brownian
       | motion.
       | 
       | 2. Luckily, we can change coordinates.
        
         | max_ wrote:
         | Thanks, could you expand more on 2?
        
           | hrududuu wrote:
           | Ito's formula/lemma is like the chain rule from calculus. It
           | is a generalization, in that it uses a second order Taylor
           | series expansion, whereas the chain rule only needs a first
           | order expansion. Anyway, I think (2) is a reflection of this
           | fact, and how the chain rule lets us compute dynamics of a
           | derived process.
           | 
           | I sort of disagree with (1), since Ito's lemma is most
           | naturally applied to ~martingales, of which Brownian Motion
           | is an important special case.
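A numerical illustration of that second-order term (the example is mine): for f(W) = W^2 the ordinary chain rule alone would give d(W^2) = 2W dW, a mean-zero integral, but Ito's formula adds the correction (1/2) f''(W) dt = dt, which is why E[W_t^2] = t rather than 0.

```python
import math
import random

def simulate_w_squared(t, n_steps, rng):
    """Follow one Brownian path, returning the true W^2 and what
    the naive chain rule (no Ito correction) predicts for it."""
    dt = t / n_steps
    w, naive = 0.0, 0.0
    for _ in range(n_steps):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        naive += 2 * w * dw            # chain rule only: d(W^2) = 2W dW
        w += dw
    return w * w, naive

rng = random.Random(3)
t, n_paths = 1.0, 20_000
true_vals, naive_vals = [], []
for _ in range(n_paths):
    tv, nv = simulate_w_squared(t, 200, rng)
    true_vals.append(tv)
    naive_vals.append(nv)

# Ito: E[W_t^2] = t.  The naive integral of 2W dW is a martingale: mean 0.
print(sum(true_vals) / n_paths)    # ~ 1.0 (= t)
print(sum(naive_vals) / n_paths)   # ~ 0.0
```

The gap between the two averages is the accumulated dt correction, i.e. the quadratic variation of the path.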
        
       | bowsamic wrote:
       | I had to study quantum stochastic calculus for my PhD. Really
       | crazy because you get totally different results for the same
       | mathematical expression compared to normal calculus
        
         | ta8645 wrote:
         | Doesn't this mean that at least one of the results is wrong?
        
           | bowsamic wrote:
           | Kinda. The differential operator in quantum Ito calculus can
           | be applied to mathematical objects that the normal
           | differentials aren't properly defined on, such as stochastic
           | variables.
        
           | antognini wrote:
           | No, I think one of the fundamental insights of stochastic
           | calculus is that the addition of noise to a process changes
           | the trajectory in a non-trivial way.
           | 
           | In finance, for instance, it leads to the concept of a
           | "volatility tax." Naively, you might think that adding noise
           | to the process shouldn't change the expected return, it would
           | just add some noise to the overall return. But in fact adding
           | volatility to the process has the effect of reducing the
           | expected return compared to what you would have in the
           | absence of volatility. (This is one of the applications of
           | the result that the original article talks about in the
           | Geometric Brownian Motion section.)
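To make the "volatility tax" concrete (constants and setup are mine): for geometric Brownian motion the mean grows at rate mu, but the median only at mu - sigma^2/2, so with enough volatility the typical path loses money even while the expected value rises.

```python
import math
import random
import statistics

def gbm_final(s0, mu, sigma, t, n_steps, rng):
    """Simulate S_t for dS = mu*S dt + sigma*S dW via the exact
    log-space update: S * exp((mu - sigma^2/2) dt + sigma dW)."""
    dt = t / n_steps
    log_s = math.log(s0)
    for _ in range(n_steps):
        log_s += (mu - 0.5 * sigma**2) * dt \
                 + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return math.exp(log_s)

rng = random.Random(4)
s0, mu, sigma, t = 100.0, 0.05, 0.4, 10.0
finals = [gbm_final(s0, mu, sigma, t, 100, rng) for _ in range(20_000)]

# Mean grows like e^(mu t) (~165), but the median only like
# e^((mu - sigma^2/2) t) (~74): the typical path actually declines.
print(statistics.mean(finals), s0 * math.exp(mu * t))
print(statistics.median(finals), s0 * math.exp((mu - sigma**2 / 2) * t))
```

Here sigma^2/2 = 0.08 exceeds mu = 0.05, so the median drifts down while a few lucky paths drag the mean up.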
        
             | crdrost wrote:
              | Just to add to this, the reason the results are different
              | is that stochastics as a subject is trying to do calculus
              | in the presence of noise, and what noise does is make your
              | function nondifferentiable. You would think that you
              | cannot do calculus without smooth curves! But you can; we
              | just have to modify the chain rule and define exactly what
              | we mean by integration, etc.
             | 
              | So the idea is "smooth curves do X, but non-smooth noisy
              | curves do Y(ε), where ε in some sense is the noise input
              | into the system, and they aren't contradictory because
              | Y(0) = X." (At least usually... I think chaos theory has
              | some counterexamples where the time t for which you can
              | predict a system's results is, in the presence of exactly
              | 0 noise, t = infinity, but in the limit of nonzero noise
              | going to zero, some finite t = T.)
        
       | tsunego wrote:
       | still wild to me that diffusion models are fast becoming the
       | secret sauce behind ai image generation, but their roots are
       | buried deep in stochastic calculus
       | 
       | who knew brownian motion would eventually help create cat memes?
        
       | robwwilliams wrote:
       | Question for HN readers: We have defined about 50 spots (loci) in
       | the mouse genome that contain DNA differences that modulate
       | mortality rates. Most of them have complex age-dependent
       | "actuarial" effects. We would like to predict age at death.
       | 
       | Would stochastic calculus be a useful approach in actuarial
       | prediction of life expectancies of mice?
       | 
       | (And this is why I am pleased to see this high on HN.)
        
         | seanhunter wrote:
         | Can't speak about mice, but stochastic calculus is used in
         | modelling for life insurance for humans I believe.
         | 
         | eg https://www.soa.org/globalassets/assets/Files/static-
         | pages/r...
        
           | joe_the_user wrote:
           | Your link doesn't demonstrate the use of stochastic calculus
           | by life insurance companies or for life insurance. It's just
           | an undergraduate curriculum for actuarial students (that they
           | learn all this stuff doesn't imply that's what life insurance
           | companies use).
        
           | layer8 wrote:
            | This is rather
            | https://en.m.wikipedia.org/wiki/Stochastic_modelling_(insuran...
        
         | whatshisface wrote:
         | Stochastic calculus is like ordinary calculus in that it is
         | most useful when one time is like another except for a few
         | variables that describe a state, and least useful when one time
         | is unlike another.
         | 
         | Because you have as many questions (loci) as you have segments
         | that you can reasonably expect to divide time into (changing
         | the time of death by 1/50th of a mouse lifespan would be
         | impossible to detect unless I am wrong?), and because the time
         | intervals are not that numerous, and also because you wouldn't
         | really have a model for the interaction of the state variables
         | and would be using model-free statistical methods, I think you
         | would get all of the value there is to get out of noncontinuous
         | methods.
        
         | joe_the_user wrote:
         | (Just spitballing)
         | 
         | I think stochastic calculus looks at a system whose output
         | value is a smooth/real value. Basically, it is for modeling
         | systems like random walks where there is a little bit of random
         | up-and-down jumping in each interval. However, if you are
         | basically looking time versus dead-or-alive, your output is
         | binary and time-of-death is really all the info you get and you
         | wouldn't need/want a random walk model, just a more ordinary
         | statistical model. Maybe if there was some other variable
         | besides dead-or-alive you were measuring or aware of a
         | stochastic model could help then (which is a bit like saying
         | "if we had bacon, we could have bacon-and-eggs, if we had
         | eggs").
         | 
         | Also, if what you're saying is you have 50*X bytes of
         | information that all influence life expectancy, it sounds like
          | a challenging problem. But also it's kind of tailor-made for
          | neural networks; many discrete inputs versus a single smooth
         | output. You might try a neural network and linear model and see
         | how much better the neural network is - then you could
         | determine if more complex-than-linear interactions were
         | occurring.
        
         | bbminner wrote:
         | Just in case you missed it,
         | https://en.m.wikipedia.org/wiki/Survival_analysis exists to
         | answer specifically this question.
         | 
         | In more practical terms, if I were to approach this problem,
         | I'd discretize it in time and apply classical ml to predict
         | "chance to die during month X assuming you survived that long"
         | and fit it to data - that'd be much easier to spot errors and
         | potential issues with your data.
         | 
          | I'd go for the stochastic calculus or actual survival analysis
          | only if you wanted to prove/draw a connection between some pre-
          | existing mathematical property such as memorylessness and a
          | physical/biological property of a system such as the behavior
          | of certain proteins (that'd be insanely cool, but rather hard,
          | esp if data is limited). In my (very vague) understanding,
          | that's what finance papers that use stochastic analysis do -
          | they make a mathematical assumption about some universal
          | mathematical property of a system (if markets were always near
          | optimal with probability of deviation decaying as XYZ, the
          | world economy would react this way to these things), and then
          | prove that it actually fits the data.
         | 
         | Happy to chat more, sounds like a fun project :)
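A toy version of that discretize-in-time idea (the hazard curve below is made up, purely to show the bookkeeping): given a per-month death probability h[x], survival probabilities and expected lifetime come from simple products and sums, and any classifier's monthly predictions can be plugged in the same way.

```python
def life_expectancy(hazards):
    """Expected number of completed months lived, given hazards[x] =
    P(die during month x | alive at its start).  Assumes the hazard
    table is long enough that survival past it is negligible."""
    surv = 1.0
    expected = 0.0
    for h in hazards:
        surv *= 1.0 - h      # P(still alive after this month)
        expected += surv     # E[lifetime] = sum of survival probabilities
    return expected

# Hypothetical, made-up hazard curve: risk rising ~10% per month.
hazards = [0.01 * 1.1**x for x in range(48)]
print(life_expectancy(hazards))
```

With a constant hazard h this reduces to the geometric-distribution mean, which is the memoryless special case the survival-analysis literature starts from.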
        
         | etiam wrote:
         | I'm not prepared to say "no", and as has been noted already, it
         | depends on the application, but from your description it seems
         | to me more like a task for Bayesian statistics organized on
            | graphs (the nodes & edges kind).
        
           | btown wrote:
           | And going beyond this: my layman's understanding of biology
           | is that the way in which genes are expressed can be highly
           | nonlinear and modulated by all sorts of different pathways.
           | If you have some clarity on how these pathways work,
           | probabilistic programming might be a helpful tool here in a
           | Bayesian context.
           | 
           | It's been a number of years since I've looked at these
           | things, but https://www.theactuary.com/2024/04/04/bayesian-
           | revolution and https://arxiv.org/abs/2310.14888 are recent
           | articles that may be relevant.
        
         | nextos wrote:
         | I was coming here to say this is a survival analysis problem,
         | and thus a different branch of probability and statistics.
         | However, you can also frame it as a stochastic process if you
          | have extra epigenetic data that is associated with those 50 DNA
         | loci or some genes they regulate.
         | 
         | For example, your DNA loci of interest could have a state
         | (methylated or unmethylated). And you could come up with a
         | stochastic process where death occurs when a function of
         | methylation changes at those loci (e.g. a linear model) crosses
         | a threshold (first passage in stochastic process jargon).
         | 
         | Omer Karin & Uri Alon have published a similar concept to
         | explain how the decreased capacity of immune cells to remove
         | senescent cells leads to a Gompertz-like law of longevity,
         | something that originates from actuarial studies! Their model
         | is simpler as they deal with a univariate problem [1].
         | 
         | [1] https://www.nature.com/articles/s41467-019-13192-4
        
         | evanfrommaxar wrote:
         | I would apply an L1-regularized regression where the variables
         | are simple 0-1 for the presence of the gene. The
         | L1-regularization helps you deal with the high-dimensionality
         | of the problem.
         | 
         | https://en.wikipedia.org/wiki/Lasso_(statistics)
         | 
         | Since these are ages, I wouldn't assume an underlying Gaussian
         | distribution. Making that change isn't as hard as you think.
         | 
         | https://en.wikipedia.org/wiki/Generalized_linear_model
         | 
         | As Always: Consult your friendly neighborhood statistician
        
       | markisus wrote:
       | Here is a corresponding introduction I found very useful, for
       | readers with advanced undergraduate / graduate level math
       | knowledge.
       | 
       | https://almostsuremath.com/stochastic-calculus/
        
         | hrududuu wrote:
         | Great resource. This was my area of graduate study, and I would
         | say this material is quite hard, in the beginner to advanced
         | PhD range.
         | 
         | And this inspiring textbook I think has high overlap with these
         | topics: https://www.amazon.com/Stochastic-Integration-
         | Differential-E...
        
           | markisus wrote:
           | Yes, by advanced undergraduate, I meant _very_ advanced
           | undergraduate. But when I was in undergrad I always heard
           | about some students like this who were off in the graduate
           | classes. And then in grad school, there was even a high
           | school student in my Algebra course who managed to correct
           | the professor on some technical issue of group theory. So I
            | don't assume you have to be a PhD to work through this
           | material.
        
       | eachro wrote:
       | For those in quant finance, how much of this is useful in your
       | day to day?
        
         | mamonster wrote:
          | Day to day, not so much, unless you are in structured
          | products/exotics as a structurer, at which point yeah, it's
          | pretty important.
         | 
         | That said, already at masters level internships you could get
         | asked much harder questions than what this article touches on.
         | I got asked to prove the Cameron-Martin theorem once, I found
         | that to be extremely difficult in a job interview setting.
        
         | keithalewis wrote:
         | There is no need for it. Here is a simple replacement:
         | https://keithalewis.github.io/math/um1.html.
        
       | janalsncm wrote:
       | Here's an example where I ran into this recently.
       | 
       | Let's say we play a "game". Draw a random number A between 0 and
       | 1 (uniform distribution). Now draw a second number B from the
       | same distribution. If A > B, draw B again (A remains). What is
       | the average number of draws required? (In other words, what is
       | the average "win streak" for A?)
       | 
       | The answer is infinity. The reason is, some portion of the time A
       | will be extremely high and take millions of draws to beat.
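A simulation of the game (capped, since the true mean is infinite): the sample average of the streak length keeps climbing as you add games instead of settling down, which is the numerical footprint of E[1/(1-A)] = ∫ 1/(1-a) da diverging.

```python
import random

def streak_length(a, rng, cap=10_000_000):
    """Number of B-draws until one is at least a (capped to stay finite)."""
    draws = 1
    while rng.random() < a and draws < cap:
        draws += 1
    return draws

rng = random.Random(5)
running = 0
for i in range(1, 100_001):
    running += streak_length(rng.random(), rng)   # draw A, then play
    if i in (100, 1_000, 10_000, 100_000):
        print(f"average streak after {i:>6} games: {running / i:.1f}")
```

The running average grows roughly logarithmically with the number of games, driven by the occasional A drawn very close to 1.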
        
         | zzazzdsa wrote:
         | Does this really require stochastic calculus to prove? This
         | should just be a standard integration, based on the fact that
          | the expected number of samples required for fixed A is
          | 1/(1-A).
        
         | RandomBK wrote:
         | The way the question was framed, it was ambiguous whether "draw
         | again" only applied to B, or whether A would draw again as
         | well. I'm assuming the 'infinity' answer applies only to the
         | former case?
        
           | janalsncm wrote:
           | Sorry, we only draw B again.
        
         | drdeca wrote:
         | Showing the calculation you described:
         | 
          | If p is the value drawn for A, then each time B is drawn, the
          | probability that B > A is (1-p). So the chance that exactly n
          | draws of B are needed before one is at least A is p^(n-1)(1-p)
          | (a geometric distribution). The expected number of draws is
          | then 1/(1-p). Then E[draws] = E[E[draws|A=p]] = \int_0^1
          | E[draws|A=p] dp = \int_0^1 1/(1-p) dp, which diverges to
          | infinity (as you said).
         | 
         | (I wasn't doubting you, I just wanted to see the calculation.)
        
       | graycat wrote:
        | Own favorite source on stochastic calculus: Eugene Wong,
        | _Stochastic Processes in Information and Dynamical Systems_,
        | McGraw-Hill, New York, 1971.
        
       | paulfharrison wrote:
       | A further step is Langevin Dynamics, where the system has damped
       | momentum, and the noise is inserted into the momentum. This can
       | be used in molecular dynamics simulations, and it can also be
       | used for Bayesian MCMC sampling.
       | 
       | Oddly, most mentions of Langevin Dynamics in relation to AI that
       | I've seen omit the use of momentum, even though gradient descent
       | with momentum is widely used in AI. To confuse matters further,
       | "stochastic" is used to refer to approximating the gradient using
       | a sub-sample of the data at each step. You can apply both forms
       | of stochasticity at once if you want to!
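A minimal sketch of Langevin dynamics with momentum (the discretization and constants are mine): a simple Euler scheme for the underdamped equations with potential U(x) = x^2/2 and unit temperature, whose stationary distribution in x should be the standard normal.

```python
import math
import random

def underdamped_langevin(n_steps, dt, gamma, rng):
    """Euler discretization of:  dx = v dt,
    dv = (-U'(x) - gamma*v) dt + sqrt(2*gamma) dW,  with U(x) = x^2/2.
    Returns samples of x after a burn-in period."""
    x, v = 0.0, 0.0
    xs = []
    for i in range(n_steps):
        v += (-x - gamma * v) * dt + math.sqrt(2 * gamma * dt) * rng.gauss(0.0, 1.0)
        x += v * dt
        if i > n_steps // 10:          # discard burn-in
            xs.append(x)
    return xs

rng = random.Random(6)
xs = underdamped_langevin(200_000, 0.05, 1.0, rng)
mean = sum(xs) / len(xs)
var = sum((s - mean) ** 2 for s in xs) / len(xs)
print(f"mean ~ {mean:.2f} (expect 0), variance ~ {var:.2f} (expect 1)")
```

Dropping the v variable and injecting the noise directly into x recovers the momentum-free (overdamped) Langevin scheme that the AI literature usually describes; here the noise enters through the momentum, as in the molecular dynamics usage above.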
        
         | zzazzdsa wrote:
         | The momentum analogue for Langevin is known as underdamped
         | Langevin, which if you optimize the discretization scheme hard
         | enough, converges faster than ordinary Langevin. As for your
         | question, your guess is as good as mine, but I would guess that
         | the nonconvexity of AI applications causes problems. Sampling
         | is a hard enough problem already in the log-concave setting...
        
       ___________________________________________________________________
       (page generated 2025-02-24 23:00 UTC)