https://gregorygundersen.com/blog/2024/09/28/black-scholes/

An Intuitive Explanation of Black-Scholes

I explain the Black-Scholes formula using only basic probability theory and calculus, with a focus on the big picture and intuition over technical details.

Published 28 September 2024

The Black-Scholes formula is the crown jewel of quantitative finance. The formula gives the fair price of a European-style option, and its success can ultimately be measured by its impact on option markets. Before the formula's publication in 1973 (Black & Scholes, 1973; Merton, 1973), option markets were relatively small and illiquid, and options were not traded in standardized contracts. But after the formula's publication, option markets grew rapidly. The first exchange to list standardized stock options, the Chicago Board Options Exchange, was founded the same year that Black-Scholes was published. And today, options are a highly liquid, mature, and global asset class, with many different tenors, exercise rights, and underlying assets.

The financial and mathematical theory underpinning Black-Scholes is rich, and one could easily spend months learning the foundational ideas: continuous-time martingales, Brownian motion, stochastic integration, valuation through replication, and risk-neutrality, to name just a few key concepts. But properly contextualized, the formula can feel surprisingly inevitable. It can almost feel like a law of nature rather than a financial model. My goal here is to justify this claim.

To begin, let's set up the problem and then state the formula. Recall that a call option ($C$) is a contract that gives the holder the right but not the obligation to buy the underlying asset ($S$) at an agreed-upon strike price ($K$). A put option ($P$) is the right but not the obligation to sell the underlying, but since calls and puts are fungible through put-call parity, we will only concern ourselves with call options in this post.
If we can price one, we can price the other. We say the holder exercises the option if they choose to buy or sell the underlying. A European-style option can only be exercised at a fixed time in the future, called expiry ($T$). Clearly, the payoff at expiry of a European-style option is just a piecewise linear ramp function,

$$
\text{payoff of $C_T$} = \max\left[0, S_T - K\right], \tag{1}
$$

where $S_T$ denotes the value of the stock at expiry (black line, Figure 1). The single most important characteristic of an option is this asymmetric payoff. For a call option, our downside is limited, but our upside is unlimited. In finance, this kind of asymmetric behavior is called "convexity".

Given this, we might guess that the price of the call before expiry, so $C_t$ where $t \lt T$ [...] ($S_T \gt K$). This idea is not original to me; it is from (Nielsen, 1992). We have:

$$
\begin{aligned}
C_1 &= \text{contingent value of stock} &&= \begin{cases} S_T & \text{if $S_T \gt K$,} \\ 0 & \text{else,} \end{cases} \\ \\
C_2 &= \text{contingent value of strike} &&= \begin{cases} -K & \text{if $S_T \gt K$,} \\ 0 & \text{else.} \end{cases}
\end{aligned} \tag{18}
$$

Furthermore, both of these terms will have a clear, simple, probabilistic interpretation that will directly map onto Equation 14. Let's see this.

First, what is $\mathbb{E}[C_2]$? This is an expectation, and the value of the claim is zero when the call is out-of-the-money (when $S_T \lt K$):

$$
\begin{aligned}
\mathbb{E}[C_2] &= 0 \times \mathbb{P}(S_T \lt K) + (-K) \times \mathbb{P}(S_T \gt K) \\
&= -K \times \mathbb{P}(S_T \gt K).
\end{aligned} \tag{19}
$$

And it is easy to see that $\mathbb{P}(S_T \gt K)$ is equal to $\Phi(d_2)$!

$$
\begin{aligned}
\mathbb{P}(S_T \gt K) &= \mathbb{P}\left(\log(S_T/S_t) \gt \log(K/S_t)\right) \\
&= \mathbb{P}\left(\frac{\log(S_T/S_t) - m}{s} \gt \frac{\log(K/S_t) - m}{s}\right) \\
&= \mathbb{P}(Z_T \gt -d_2) \\
&= 1 - \mathbb{P}(Z_T \lt -d_2) \\
&= \mathbb{P}(Z_T \lt d_2) \\
&= \Phi(d_2).
\end{aligned} \tag{20}
$$

[...] $S_T \gt K$. All those extra variables embedded in $d_2$ just represent normalizing the log move from $S_t$ to $S_T$, such that we can represent the equation using the CDF of the standard normal rather than the CDF of $S_T$. We could, if we wanted to, represent all of this using the CDF of the un-standardized lognormal distribution. But it's cleaner and conventional to work in a standardized space.

To summarize, we have shown:

$$
\text{expected contingent value of strike} = \mathbb{E}[C_2] = -K \Phi(d_2). \tag{22}
$$

This represents the expected amount we must pay to exercise a call option, contingent on the option being exercised. Of course, this is an expected value at expiry, but the Black-Scholes price is the price in today's terms. So we need a discount factor, giving us:

$$
\text{$\mathbb{E}[C_2]$ at time $t$} = -e^{-r(T-t)} K \Phi(d_2). \tag{23}
$$

Now for $C_1$, we again have an expectation where the contingent value is zero when the option ends out-of-the-money. This is a bit more complicated than the derivation for $C_2$, since $S_T$ is random while $K$ is fixed. By the law of total expectation, we have:

$$
\begin{aligned}
\mathbb{E}[C_1] &= 0 \times \mathbb{P}(S_T \lt K) + \mathbb{E}[S_T \mid S_T \gt K] \times \mathbb{P}(S_T \gt K) \\
&= \mathbb{E}[S_T \mid S_T \gt K] \, \mathbb{P}(S_T \gt K).
\end{aligned} \tag{24}
$$

While this is a bit trickier, it's still relatively straightforward to put into words. It is the expected stock price conditional on exercising the call, weighted by the probability that we exercise. Since $S_T$ is lognormally distributed (Equation 16), we can write this expectation in terms of a truncated lognormally distributed random variable $\hat{S}_T$:

$$
\hat{S}_T = \begin{cases} S_T & \text{if $S_T \gt K$}, \\ 0 & \text{else}. \end{cases} \tag{25}
$$

And coincidentally, I recently wrote a blog post on the expected value of the truncated lognormal distribution! Using the parameters $m$ and $s$ from Equation 15 here and plugging them into Equation 15 in that blog post, we can easily compute the expectation:

$$
\begin{aligned}
\mathbb{E}[\hat{S}_T] &= \mathbb{E}[S_T \mid S_T \gt K] \\
&= \frac{\mathbb{E}[S_T] \, \Phi\left( s - \frac{\log K - (\log S_t + m)}{s} \right)}{1 - \Phi\left(\frac{\log K - (\log S_t + m)}{s}\right)}.
\end{aligned} \tag{26}
$$

And the denominator is the probability that we exercise the call!

$$
1 - \Phi\left(\frac{\log K - (\log S_t + m)}{s}\right) = 1 - \Phi(-d_2) = \Phi(d_2) = \mathbb{P}(S_T \gt K). \tag{27}
$$

Putting this together, we can rewrite Equation 24 as:

$$
\begin{aligned}
\mathbb{E}[C_1] &= \mathbb{E}[S_T \mid S_T \gt K] \, \mathbb{P}(S_T \gt K) \\
&= \mathbb{E}[S_T] \, \Phi\left( s - \frac{\log(K/S_t) - m}{s} \right) \\
&= \mathbb{E}[S_T] \, \Phi\left( s + d_2 \right) \\
&= \mathbb{E}[S_T] \, \Phi\left( d_1 \right).
\end{aligned} \tag{28}
$$

This looks very promising, and we can solve this because we also know the expected value of a lognormally distributed random variable. It's

$$
\begin{aligned}
\mathbb{E}[S_T] &= \exp\left\{ \log S_t + m + \frac{1}{2} s^2 \right\} \\
&= S_t \exp\left\{ \left(r - \frac{1}{2} \sigma^2 \right) \tau + \frac{1}{2} \sigma^2 \tau \right\} \\
&= S_t e^{r (T-t)}.
\end{aligned} \tag{29}
$$

This both makes sense in a risk-neutral world and looks quite promising. It means we have shown

$$
\mathbb{E}[C_1] = e^{r (T-t)} S_t \Phi(d_1). \tag{30}
$$

And once again, we need to discount this to the current time, giving us:

$$
\text{$\mathbb{E}[C_1]$ at time $t$} = S_t \Phi(d_1). \tag{31}
$$

That's it. We're done. This is a simple, probabilistic interpretation of the Black-Scholes formula. To summarize, it is the difference between two contingent values:

$$
\begin{aligned}
C_t &= e^{-r(T-t)} \left[ \mathbb{E}[C_1] + \mathbb{E}[C_2] \right] \\
&= S_t \Phi(d_1) - e^{-r(T-t)} K \Phi(d_2).
\end{aligned} \tag{32}
$$

In more words, the left term is the time-discounted and weighted-average expected value of the stock, contingent on exercising. And the right term is the time-discounted expected value of the strike, again contingent on exercising. The main conceptual hurdle was realizing why we needed to replace the stock-specific drift $\mu$ with the risk-free rate $r$. But once we understood that, we could assume that the underlying stock's log returns were normally distributed as in Equation 15.
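To make the two contingent values concrete, here is a minimal Python sketch (not code from the original post). It computes the discounted values of $C_1$ and $C_2$ in closed form and compares their sum with a Monte Carlo estimate of the discounted, risk-neutral expected payoff. The function names, the parameter values ($S_t = 100$, $K = 95$, $r = 0.04$, $\sigma = 0.15$, $\tau = 1$), and the definitions $m = (r - \sigma^2/2)\tau$ and $s = \sigma\sqrt{\tau}$, which are read off the expression for $\mathbb{E}[S_T]$, are assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist


def black_scholes_call(S_t, K, r, sigma, tau):
    """Call price as the sum of two discounted contingent values.

    Assumes risk-neutral lognormal dynamics with m = (r - sigma^2 / 2) * tau
    and s = sigma * sqrt(tau). Returns (price, probability of exercise).
    """
    Phi = NormalDist().cdf             # standard normal CDF
    m = (r - 0.5 * sigma**2) * tau     # mean of the log move
    s = sigma * math.sqrt(tau)         # standard deviation of the log move
    d2 = (math.log(S_t / K) + m) / s
    d1 = d2 + s
    stock_leg = S_t * Phi(d1)                        # discounted E[C_1]
    strike_leg = -math.exp(-r * tau) * K * Phi(d2)   # discounted E[C_2]
    return stock_leg + strike_leg, Phi(d2)


def monte_carlo_call(S_t, K, r, sigma, tau, n=200_000, seed=0):
    """Estimate the discounted expected payoff by sampling S_T directly."""
    rng = random.Random(seed)
    m = (r - 0.5 * sigma**2) * tau
    s = sigma * math.sqrt(tau)
    payoff_sum = 0.0
    exercised = 0
    for _ in range(n):
        S_T = S_t * math.exp(m + s * rng.gauss(0.0, 1.0))
        if S_T > K:                    # the call ends in the money
            payoff_sum += S_T - K
            exercised += 1
    return math.exp(-r * tau) * payoff_sum / n, exercised / n


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    price, p_exercise = black_scholes_call(100.0, 95.0, 0.04, 0.15, 1.0)
    mc_price, mc_p = monte_carlo_call(100.0, 95.0, 0.04, 0.15, 1.0)
    print(f"closed form: price={price:.4f}, P(exercise)={p_exercise:.4f}")
    print(f"monte carlo: price={mc_price:.4f}, P(exercise)={mc_p:.4f}")
```

With enough samples, the simulated price and exercise frequency should agree with the closed-form values up to sampling noise. That agreement is a quick check that $\Phi(d_2)$ really is the risk-neutral probability of exercise, and that the stock and strike terms together give the discounted expected payoff.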
Everything else was just computation.

Figure 4. Thousands of random samples from geometric Brownian motion as in Figure 3 but truncated at strike $K=95$. The truncated lognormal distribution at various time points is plotted in dashed black lines. The mean of these distributions (dashed purple line) is the mean of the truncated lognormal. The term $S_t \Phi(d_1)$ is just this mean weighted by $\mathbb{P}(S_T \gt K)$.

This view of Black-Scholes finally helped me understand $\Phi(d_1)$. While $\Phi(d_2)$ has a clean interpretation (it's just the probability that the call is exercised, Equation 20), I initially found $\Phi(d_1)$ less clear. But perhaps my confusion was due to my attempt to understand the term in isolation. Really, I think it helps to think about $S_t \Phi(d_1)$ as a single unit. This represents the weighted-average present value of the stock, contingent on the option ending in the money (Equations 28 and 29).

I think this has a nice geometric interpretation. We can think of the contingent value of our stock as following a truncated lognormal distribution (Figure 4). Here, $S_t \Phi(d_1)$ is the expected value of this truncated lognormal random variable, weighted by the probability that the call is exercised and discounted to the present.

Conclusion

As a final sanity check, I've visualized the Black-Scholes price for a call option across various times to expiry (Figure 5). Again, we see the inherent tension between time-decay and convexity. As time passes, an option loses value as it loses optionality, but it gains value as it gains convexity. Ultimately, the Black-Scholes PDE models this tension directly.

In this post, however, we side-stepped the PDE entirely in favor of an intuitive understanding of risk-neutrality. In a world without arbitrage, everyone can perfectly hedge their option positions, and thus everyone is risk-neutral.
In this world, all stock prices are converted into martingales, stochastic processes without drift or memory. In this world, the fair price of an option is just its time-discounted, risk-neutral expected value. And this discounted expected value has a clean interpretation: it is the difference between two contingent values, the contingent value of the stock and the contingent value of the strike.

Figure 5. Visualization of the Black-Scholes formula for an initial stock price $S_0 = 100$, drift $\mu = 0.04$, and volatility $\sigma = 0.15$ over hundreds of samples. At each moment, the call gains or loses value, contingent on the stock price $S_t$ and the time to expiry $T-t$.

Hopefully this post justifies my original claim: Black-Scholes feels inevitable, like a natural law. It makes some strongly simplifying assumptions, such as constant volatility, but the model works extremely well in practice. Many decades later, investors, traders, and other market participants still use Black-Scholes daily. Much like other simple models such as linear regression, Black-Scholes is wrong but useful, interpretable, and general due to its simplicity. And it is the foundation and reference point for more complex models.

1. Black, F., & Scholes, M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy, 81(3), 637-654.
2. Merton, R. C. (1973). Theory of rational option pricing. The Bell Journal of Economics and Management Science, 141-183.
3. Ito, K. (1944). Stochastic integral. Proceedings of the Imperial Academy, 20(8), 519-524.
4. Ito, K. (1951). On a formula concerning stochastic differentials. Nagoya Mathematical Journal, 3, 55-65.
5. Bru, B., & Yor, M. (2002). Comments on the life and mathematical legacy of Wolfgang Doeblin. Finance and Stochastics, 6, 3-47.
6. Derman, E. (2002). The Boy's Guide to Pricing & Hedging. Available at SSRN 364760.
7. Nielsen, L. T. (1992). Understanding N(d1) and N(d2): Risk-adjusted probabilities in the Black-Scholes model. INSEAD, Fontainebleau, France.