https://laurentrdc.xyz/posts/typed-dimensions.html

Laurent P. Rene de Cotret 
Home Software Publications About me All blog posts

Scientific computing with confidence using typed dimensions

Posted on 2024-11-20.

Estimated reading time of 14 min.

e-mail GitHub LinkedIn ResearchGate OrcID Google Scholar

I have performed non-trivial scientific calculations, in university
and beyond, for almost 15 years.

Until the end of undergraduate school, this was mostly done by hand.
Optimizing the trajectory of a heavy ball or solving the heat
equation are problems with analytical solutions, and thus can be
solved with pen(cil) and paper.

However, there were instances when numerical tools had to be used,
rather than analytical ones. This occurred, for example, when
replacing models of the physical world with measurements from the
physical world in experimental physics classes. By the end of my
B.Sc., the need to take measurements, transform them, and report
results was made much simpler by using a computer.

Fast forward to graduate school, and the amount of measurements (and
their complexity) require the use of some of the most powerful
computers I have ever used, even to this day. When implementing
numerical routines, I had to pore over the equations (translated to
computer expression) countless times, to ensure that they were
correctly applied. Most importantly, all units needed to be carefully
checked by hand. Small mistakes went unnoticed and wasted valuable
resources. Take a look at one of the computations from my Ph. D.
dissertation (PDF, Git repository):

 from scipy.constants import physical_constants
 
 def population(temperature: float, frequency: float) -> float:
     """
     Determine average phonon population at `temperature`.
 
     Parameters
     ----------
     temperature : float
         Temperature in Kelvin
     frequency : float
         Phonon frequency in Hertz
     """
     h, *_ = physical_constants["Planck constant over 2 pi in eV s"]
     kb, *_ = physical_constants["Boltzmann constant in eV/K"]
     return 0.5 * coth(h * frequency / (2 * kb * temperature)) - 0.5

This isn't a complex equation, but it showcases the problem
perfectly. temperature, frequency, h, and kb are all floating-point
numbers, with their units attached as documentation. This is not
robust.

These days, instead of meticulously going over the details of
calculations ahead of time, I now defer to my computer to check the
validity of my calculations as much as possible. That's why computers
exist: to free people from repetitive and error-prone work. In this
post, I will describe a mechanism, typed dimensions, by which we can
eliminate entire classes of bugs, and how the dimensional Haskell
package makes this easy.

Dimensions, units, and quantities

In the international system of units (also called SI units), there
are 7 basic dimensions a physical value can have:

  * time duration TT;
  * length LL;
  * mass MM;
  * electric current II;
  * thermodynamic temperature Th\Theta;
  * amount of substance NN;
  * luminous intensity JJ.

Using the symbols associated with these base dimensions, we can
express any dimension using exponents. For example, velocity (length
over time duration) has dimensions L/T=L[?]T-1L/T = L \cdot T^{-1},
while force (mass-length over time squared) has dimensions M[?]L[?]T-2M \
cdot L \cdot T^{-2}. Dimensionless quantities have all exponents
being 0.

Why are dimensions important? There are mathematical rules associated
with combining numbers with dimensions:

  * Two quantities with dimensions D1D_1 and D2D_2 can only be added
    or subtracted if (and only if) D1[?]D2D_1 \equiv D_2. For example,
    it does not make sense to add a length to a mass.

  * Any two quantities with dimensions D1D_1 and D2D_2 can be
    multiplied or divided, in which case the resulting dimension
    follows the usual rules of arithmetic. For example:

(L[?]T-1)/(L[?]Th)=(L[?]T-1)[?](L-1[?]Th-1)=L1-1[?]T-1[?]Th-1=T-1[?]Th-1 \begin{align} \
left( L \cdot T^{-1} \right) / \left( L \cdot \Theta\right) &= \left(
L \cdot T^{-1} \right) \cdot \left( L^{-1} \cdot \Theta^{-1}\right) \
\ &= L^{1 - 1} \cdot T^{-1} \cdot \Theta^{-1} \\ &= T^{-1} \cdot \
Theta^{-1} \end{align}

From these two facts, we can derive some surprising conclusions. For
example, the argument to many functions (such as sine, cosine,
exponential, etc.) must be dimensionless! Consider the Taylor series
expansion of the sine function:

sinx=x-x33!+x55!-... \sin{x} = x - \frac{x^3}{3!} + \frac{x^5}{5!} -
...

The argument xx must have the same dimensions as x3x^3, which is only
possible if and only if xx is a dimensionless quantity.

Units of measure are physical manifestations of dimensions.
Dimensions are abstract, while units of measure are concrete. For
example, the length dimension LL can be measured in units of meters.
Each basic dimension has its own canonical unit of measure, the base
unit:

  * The second (s) for time duration TT;
  * The meter (m) for length LL;
  * The kilogram (kg) for mass MM;
  * The ampere (A) for electric current II;
  * The kelvin (K) for thermodynamic temperature Th\Theta;
  * The mole (mol) for amount of substance NN;
  * The candela (cd) for luminous intensity JJ.

Units for non-basic dimensions, so-called derived units, are
combinations of base units using the usual multiplication ([?]\cdot)
and division (//) operators. For example, velocity has dimensions of
L[?]T-1L \cdot T^{-1}, and can be measured in units of meters per
second (ms\frac{m}{s}). Some derived units have names; for example,
the Newton (N) is a derived unit corresponding to kg[?]ms2\frac{\text
{kg} \cdot \text{m}}{\text{s}^2}.

Based on their dimensions, units can only be combined in certain
ways. You can only add/subtract units of the same dimensions, while
the multiplication/division of units modifies the dimensions.

Units can be converted to any other units of the same dimensions by
conversion base unit. For example, converting from units of imperial
feet (ft) to kilometers (km) involves conversion from imperial feet
to the base unit of length, the meter, and then conversion from the
meter to the kilometer.

Finally, a quantity is a number attached to units. 3.15 meter/second
is a quantity with magnitude 3.15, units of meters/second, and
dimensions of length over time duration (or L[?]T-1L \cdot T^{-1}).

Scientists take measurements of quantities, and run scientific
computations to get answers in the form of other quantities. We can
ensure correctness of these computations by realizing that dimensions
are known ahead of time. This means that for statically- and
strongly-typed languages (e.g. Haskell, Rust), we should enforce
correctness using the type system.

Type-safe dimensions

People's first contact with computational unit systems may come from
pint, a Python library to manipulate quantities. While pint is better
than nothing, it does not guarantee correctness at runtime due to
Python's dynamic nature.

Instead, I prefer the dimensional package, a Haskell library that
guarantees correctness by ensuring type-safe dimensions at
compile-time. Other tools exist for other languages - for example, F#
famously includes units of measure as a first-party feature.

We will work by example to compute the Maxwell-Boltzmann distribution
. This is the distribution of velocities of identical particles of
mass mm, given some thermodynamic temperature TT:

f(v)=(m2pkBT)32exp(-mv22kBT) f(v) = \left( \frac{m}{2 \pi k_B T}\
right)^{\frac{3}{2}} \exp{ \left( -\frac{m v^2 }{2 k_B T}\right) }

where vv is the magnitude of particle velocity in a three-dimensional
space, and kBk_B is the Boltzmann constant.

Without typed dimensions, the computation of this distribution can be
done as follows:

 -- Boltzmann constant in Joules/Kelvin
 boltzmannConstant_JpK :: Double
 boltzmannConstant_JpK = 1.380649e-23
 
 
 untypedMaxwellBoltzmannDist :: Double -- ^ Particle mass in kilogram
                             -> Double -- ^ Thermodynamic temperature in Kelvin
                             -> Double -- ^ Particle velocity in meters/second
                             -> Double -- ^ Probability density (dimensionless)
 untypedMaxwellBoltzmannDist mass_kg temp_K velocity_mps 
     = ( mass_kg / (2 * pi * boltzmannConstant_JpK * temp_K) ) ** (3/2) 
     * exp ( - (mass_kg * velocity_mps **2) / (2 * pi * boltzmannConstant_JpK * temp_K) )

It's not too hard to check units by hand - but it is tedious and
error-prone.

Now let's do this again, using dimensional.

Dear reader, be forewarned: the rest of this post is not for the
faint-of-heart.

Yes, this is overkill for simple calculations. And yet, the technique
you are about to see has saved me numerous times. There's quite a bit
more ceremony; in exchange, you get a lot more certainty about
correctness. Software design is about trade-off, and this level of
safety is required for some real-world applications.

Setup: I am using the GHC2024 language edition. We will also replace
the (implicitly imported) Prelude module to use
Numeric.Units.Dimensional.Prelude, which replaces the usual
arithmetic operators to use dimension-aware ones, as well as
importing the Unit, Dimension, and Quantity types.

 {-# LANGUAGE NoImplicitPrelude #-}
 
 import Numeric.Units.Dimensional.Prelude

To get acquainted, let's convert the definition of the Boltzmann
constant. The dimensions of the Boltzmann constant are called
DHeatCapacity in dimensional. Therefore, we can write:

 boltzmannConstant :: Quantity DHeatCapacity Double

which means that boltzmannConstant is a Quantity of dimension
DHeatCapacity, whose magnitude is represented by a Double. We use the
(*~) operator to combine a number (1.380649e-23) with a compound
unit, joule / kelvin, to get:

 boltzmannConstant :: Quantity DHeatCapacity Double
 boltzmannConstant = 1.380649e-23 *~ (joule / kelvin)

We are not yet ready to implement the Maxwell-Boltzmann distribution
function. One particular wrinkle is that exponents (such as the 3/23/
2 exponent of the first term) change the dimensions (i.e. the type)
of the input. Therefore, some type arithmetic is required. To this
end, dimensional provides (^) to raise to an integer power, and sqrt
to take the square root, while appropriately changing the dimensions
at compile time. The type signatures are non-trivial to say the
least, as it involves compile-time manipulation of types. The
operation of "raising to the three-halfs power" becomes:

 raiseToThreeHalfsPower :: Floating a 
                        => Quantity d a 
                        -> Quantity (NRoot d Pos2 ^ Pos3) a
 raiseToThreeHalfsPower x = (sqrt x) ^ pos3

where pos3 is a type-level integer. The dimension (NRoot d Pos2 ^
Pos3) will resolve to the appropriate dimension at compile-time via
type families once d is known.

Finally, we can implement the (dimensionally-typed) Maxwell-Boltzmann
function:

 maxwellBoltzmannDist :: Mass Double
                      -> ThermodynamicTemperature Double
                      -> Velocity Double
                      -> Dimensionless Double
 maxwellBoltzmannDist mass temp velocity 
     = raiseToThreeHalfsPower ( mass / (_2 * pi * boltzmannConstant * temp) )
     * exp ( negate (mass * velocity ^ pos2) / (_2 * pi * boltzmannConstant * temp) )

where _2 is the dimensionless integer 2, and (*), (/), exp, negate,
and ^ are all dimensionally-aware operators from dimensional (rather
than the Prelude module).

If you compile the program above, you'll get an error message tth_tth:

Couldn't match type 'Neg3' with 'Zero'
  Expected: Dimensionless Double
    Actual: Quantity
                (NRoot
                    (DMass
                        / ((Dim Zero Zero Zero Zero Zero Zero Zero * DHeatCapacity) * DThermodynamicTemperature))
                    Pos2 ^ Pos3
                )
                Double
(...snip...)

This is because I'm a sloppy physicist - sorry! The Maxwell-Boltzmann
distribution is not a dimensionless quantity: it actually returns a
velocity density with dimensions of 1/(L3[?]T-3)1 / (L^3 \cdot T^{-3}),
or L-3[?]T3L^{-3} \cdot T^{3}.

Let's fix this. There is no type synonym for velocity density in
dimensional, but we can create one ourselves.

 type DVelocityCube = DVelocity ^ Pos3
 
 type DVelocityDensity = Recip DVelocityCube -- dimension
 type VelocityDensity = Quantity DVelocityDensity -- quantity

Our final version of maxwellBoltzmannDist becomes:

 maxwellBoltzmannDist :: Mass Double
                      -> ThermodynamicTemperature Double
                      -> Velocity Double
                      -> VelocityDensity Double
 maxwellBoltzmannDist mass temp velocity 
     = raiseToThreeHalfsPower ( mass / (_2 * pi * boltzmannConstant * temp) )
     * exp ( negate (mass * velocity ^ pos2) / (_2 * pi * boltzmannConstant * temp) )

We can compute the result of maxwellBoltzmannDist in any compatible
unit. We will do this for a gas of diatomic nitrogen. In an
interactive Haskell session:

 ghci> import Numeric.Units.Dimensional.NonSI ( unifiedAtomicMassUnit ) -- unifiedAtomicMassUnit = amu
 ghci>
 ghci> let n2_mass          = 2     *~ unifiedAtomicMassUnit
 ghci> let room_temperature = 300.0 *~ kelvin
 ghci> let velocity         = 400   *~ (meter / second)
 ghci>
 ghci> maxwellBoltzmannDist n2_mass room_temperature velocity 
 4.466578950309018e-11 m^-3 s^3

We can easily request specific output units using the (/~) operator.
In this case, we convert result to (hour / nautical mile)^3:

 ghci> import Numeric.Units.Dimensional.NonSI ( nauticalMile, degreeRankine, knot )
 ghci>
 ghci> let n2_mass          = 2.6605E-27 *~ kilo gram
 ghci> let room_temperature = 491.0      *~ degreeRankine
 ghci> let velocity         = 777        *~ knot
 ghci>
 ghci> (maxwellBoltzmannDist n2_mass room_temperature velocity) /~ ((hour / nauticalMile) ^ pos3)
 5.041390507268275e-12

Conclusion

This was a whirlwind tour of dimensional with a practical example. If
this was your first introduction to typed dimensions in Haskell, do
not be alarmed - I bounced off of it the first time.

As mentioned, software design is about trade-off. Using dimensional
is complex, but it isn't complicated^1; it makes you address the
complexity of dimensions and units up-front, rather than hoping for
the best. Sometimes this doesn't make sense, but for critical
calculations, there is nothing like typed dimensions.

I am now a maintainer of dimensional on GitHub, and I want nothing
more than to help people use typed dimensions (when it makes sense).
If you have any feedback, for example missing features or lacking
documentation, feel free to raise an issue!

Code available in Haskell module.

---------------------------------------------------------------------

 1. Complexity is inherent, complicatedness is extraneous.-[?]

e-mail | GitHub | LinkedIn | ResearchGate | OrcID | Google Scholar

RSS feed | Atom feed

Page generated on 2024-11-21. This website was created using free and
open-source technologies. You can learn more about how this website
is generated here .

Creative Commons License