https://xvw.lol/en/articles/why-ocaml.html
xvw.lol
Why I chose OCaml as my primary language
2025-08-13
This article is a translation; the original version is available here.
I started using the OCaml language regularly around 2012, and since
then, my interest and enthusiasm for this language have only grown.
It has become my preferred choice for almost all my personal
projects, and it has also influenced my professional choices. Since
2014, I have been actively participating in public conferences
dedicated to programming and software development, where I often
express my enthusiasm for OCaml in ways that may be a bit over the
top (but always passionate). This has earned me, in a friendly way,
the nickname OCaml evangelist -- a title that, I admit, I find very
flattering. Moreover, I'm not alone in thinking this. Despite the
common misconception that OCaml wouldn't be a pragmatic choice for
industry, major companies such as Meta, Microsoft, Ahrefs, Tarides,
OCamlPro, Bloomberg, Docker, Jane Street, Citrix, Tezos, and many
others actively use it.
* Foreword
+ Other resources
* OCaml as a language
+ On static type checking
+ Features of the language
o A multi-paradigm language
# Syntax a la ML
# Closely related to research
o Algebraic types
o Modular programming and module language
o Dependency injection and inversion
# Through modules
# Through user-defined effects
+ Regarding the future
+ Weaknesses
+ To conclude on language
* OCaml as an ecosystem
+ Compilation, runtimes, and additional targets
o A quick detour via MirageOS
+ The OCaml platform
o OPAM, the package manager
o Dune, the build-system
# On the choice of S-expressions
# Contribution to the state of the art: Selective
Applicative Functor
# Alternatives
o LSP and Merlin for editors
# The advent of VSCode, LSP as standard
o Odoc, the documentation generator
+ Available libraries
o Side note on the standard library
+ Ecosystem Conclusion
* On the community
* Some myths about OCaml
+ OCaml and F#
+ Doubled operators for floats
+ On the separation between ml and mli
o Encapsulation without mli
o Expressing the interface from ml
o To conclude on separation
* Conclusion
In this opinion piece, I will try to briefly share my encounter with
the language and list its advantages -- organized into several
sections covering the language itself, its ecosystem, and its
community. I will also attempt to debunk some popular myths (or
misconceptions) found on the Internet. For the sake of transparency,
it is important to note that, at the time of writing, my professional
work involves working for and on the OCaml ecosystem. However,
readers who have followed me for several years can attest that I was
promoting the language long before I was paid to work on the OCaml
ecosystem, sometimes rather immoderately.
Foreword
First, this article will explain why I personally believe that OCaml
is a relevant choice in many contexts. My goal is not specifically to
convince you--although that would be a very welcome side effect -- and
it's quite likely that many of the arguments I present will also
apply to other languages!
Also, very often, when I suggest OCaml to people who want to explore
new languages or try out solutions written in OCaml, I'm kindly told
that I'm always promoting OCaml. It's amusing to notice that when the
suggestions involve languages adopted by default, like JavaScript, or
more recent ones like Rust or Go, they tend to trigger fewer
reactions. This is probably because people implicitly assume that
proposing a lesser-known language leans toward irrationality and
personal preference. From my point of view, suggesting OCaml is, in
many cases where fine-grained memory control is not needed, just as
relevant as suggesting Rust (and probably more so).
To wrap up this preface, many people first encountered OCaml (or Caml
Light) during their undergraduate studies or in preparatory classes,
often using it in contexts far removed from industry. As for me, I
started getting interested in OCaml much earlier, thanks to the Site
du Zero, where a small community of functional programming
enthusiasts promoted less mainstream languages like OCaml, Erlang,
and Haskell. My interaction with OCaml at university was just a bonus.
Other resources
I'm not the first to document the reasons for choosing OCaml. There
are many other resources that, in my opinion, are also worth checking
out, and they show that OCaml users are generally very satisfied -- so
much so that they're motivated to share how and why they chose the
language as our main technology:
* "Why OCaml?", the prologue of the book Real World OCaml, which
presents factual advantages of using OCaml (and whose
introduction includes a timeline). While the book is excellent in
many respects, I've gotten into the habit of not recommending it
because I find its usage approach quite biased, suggesting
libraries by default that aren't necessarily widely accepted in
the community.
* "Better Programming Through OCaml", the prologue of the book
(accompanied by videos) OCaml Programming: Correct + Efficient +
Beautiful, which mainly explains how learning OCaml can improve a
developer's skills in other, more popular technologies. The book
is fairly recent, and it's the one I now recommend as the go-to
resource for getting started with OCaml.
* Talk: "Why OCaml?", a presentation by Yaron Minsky, CTO of Jane
Street--an industrial user of OCaml and one of the global leaders
in finance. Yaron is also one of the authors of Real World OCaml
and the originator of the widely quoted phrase in the statically
typed programming languages world, "Make illegal states
unrepresentable". The talk offers plenty of insights into Jane
Street's motivations for choosing OCaml.
* "OCaml for Fun & Profit: An Experience Report", presented by Tim
McGilchrist at Yow 2023. After a rich introduction to the
language, it covers some very concrete use cases of OCaml in
production -- with fun and profit.
* "Replacing Python for 0Install" by Thomas Leonard. This series of
articles is, in my view, incredibly interesting. The author of
0Install, a decentralized, cross-platform software installation
system (a slightly older alternative to Nix), was looking for a
language other than Python for a new version's implementation
(the reasons for replacing Python are also documented here) and
carried out a thorough, methodical comparison of several
candidates: ATS, C#, Haskell, Go, Rust, and OCaml, alongside
Python. Years later, I'm still impressed by the rigor and nuance
of this series, which I highly recommend.
There are probably other resources and testimonials, notably on the
official website, which features both industrial and academic case
studies. There are also articles expressing the frustration OCaml can
cause. I'm aware that OCaml is not perfect--nor do I believe any
technology is perfect. I'll likely refer to some of these articles
(implicitly or explicitly) in the section on myths and in the
conclusion, where I'll try to explain in which contexts I don't find
OCaml to be a relevant choice.
OCaml as a language
Before diving into the features offered by the language, I'd like to
start with a point that I believe is essential. OCaml is a
programming language that originated from research and is used by
industrial users. This duality is important because it provides the
language with two key advantages:
* Guidance on desirable features as interesting language concepts,
supported by advanced research. For example, to my knowledge,
OCaml is the first mainstream language to offer native support
for user-defined effects, which is the result of cutting-edge
research, illustrated by numerous publications.
* Guidance on desirable features as tools for industrialization,
also backed by research and motivated by practical use cases. For
instance, recently, Jane Street, a major industrial OCaml user,
proposed the integration of affine sessions, enabling linear
resource management (somewhat Rust-like).
This intertwining of industrial and academic motivations allows OCaml
to offer a collection of solid, useful, and well-defined features. In
other words, OCaml is a living language, and since I've been using
it, I've witnessed many developments and additions that I consider
highly desirable and that debunk a common assertion against OCaml:
the language is only useful for theory or for implementing Coq/Rocq.
Although this was historically true, the motivations provided by
industrial users justify the label "An industrial-strength functional
programming language with an emphasis on expressiveness and safety."
The opening keynote of the OCaml Workshop 2021 by Xavier Leroy,
titled "25 Years Of OCaml," presents an exhaustive timeline of
OCaml's continuous design, showing the various phases of evolution
the language has undergone.
In broad terms, OCaml is a programming language from the ML family,
high-level (here, meaning it features garbage collection), statically
typed (types are checked at compile time with no implicit
conversions), with type inference (also called type synthesis),
which lets the compiler deduce the type of an expression in most
cases. The language supports programming in both functional and
imperative styles.
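To make these two properties concrete, here is a small sketch (the function names `map` and `average` are chosen for illustration only). Neither definition carries a type annotation, and the integer-to-float conversion has to be written explicitly:

```ocaml
(* No annotation: the compiler infers
   ('a -> 'b) -> 'a list -> 'b list. *)
let rec map f = function
  | [] -> []
  | x :: xs -> f x :: map f xs

(* No implicit conversion: the length (an int) must be converted
   explicitly before being used in a float division.
   Inferred type: float list -> float. *)
let average xs =
  List.fold_left ( +. ) 0.0 xs /. float_of_int (List.length xs)

let labels = map string_of_int [ 1; 2; 3 ]
let mean = average [ 1.0; 2.0; 3.0 ]
```

Replacing `float_of_int (List.length xs)` with `List.length xs` is rejected at compile time rather than silently coerced.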
OCaml also provides an object-oriented programming model and a very
rich module system. The language has two compilation schemes: ocamlc,
which compiles to a bytecode executable by a virtual machine
(portable and efficient), and ocamlopt, which compiles to native
machine code (runnable on a wide variety of architectures).
Moreover, OCaml allows conversion of its bytecode to JavaScript using
Js_of_ocaml, enabling very fast interoperability within the OCaml
ecosystem (which I use extensively on this website). The same
approach is used to produce WebAssembly. For deeper interoperability
with the JavaScript ecosystem, Melange takes a somewhat different
approach than Js_of_ocaml to generate robust JavaScript.
OCaml is a highly versatile language, and I will now try to present
the features and strengths that make it -- for me -- an ideal tool for
building both personal and professional projects, starting with a
brief detour into static typing.
On static type checking
When I was preparing, with Bruno, the episode of If This Then Dev
dedicated to OCaml -- which, in the end, was recorded with Didier -- he
asked me a question that I found surprising:
"Is it really worth bothering with types when working on a
personal project quickly? Even though I can perfectly see the
value for production code, for a personal project it seems like a
waste of time to me."
I think there are two main angles to answer this. The first, and most
obvious, is that, in principle, I don't see why a personal project
should be any less disciplined than a professional one. When I write
software for myself, I could indeed get away with ignoring the corner
cases of my implementation. Sure, that's possible. But that's
probably not what I actually want to do. So, if a language and its
compiler let me set up safety nets that force me to account for all
the cases in my software, I take them -- just like writing unit tests
makes development easier, and I don't see them as a constraint.
But beyond considerations of hygiene in a personal project, I think
the negative reputation of static type checking usually stems from a
bad experience. Indeed, in languages like C or Java, types are mostly
a constraint that can be easily circumvented. In languages that place
a strong emphasis on typing -- like OCaml, Haskell, F#, Scala, or Rust
-- types act as safeguards. More importantly, in my view, types also
serve as a tool for expressive design. Using them provides safety
while also offering an incredibly rich, versatile, and concise way to
describe data.
From my experience, even though it's common to move from a
poorly-typed (sorry, the temptation is too strong) to a dynamically
typed language -- I, for instance, happily transitioned from Java to
Ruby -- having used a language with a rich type system, like OCaml or
Haskell, makes switching to a dynamically typed language much harder.
At present, I don't know anyone who has seriously used languages like
OCaml or Haskell and was happy to return to languages with less
sophisticated type systems (though an interesting project can
sometimes justify such a technological regression).
This is not just a personal observation; static type checking is
central to the broader debate about the evolution of programming
languages. Historical languages evolve (or attempt to evolve) to
integrate more type checking. For instance, Erlang, as early as the
1980s (before its compiler source was released), experimented with
integrating a type system. Java, version by version, enhances
features aimed at improving static type verification, such as
incorporating sealed families.
Many languages are experimenting with type systems: Ruby with RBS,
Crystal (a statically typed language heavily inspired by Ruby),
Python with Mypy, Elixir (which revisits Erlang's past experiments,
offering a viable gradual typing approach), and, of course,
TypeScript, which has become widely adopted in the JavaScript
community.
While all these initiatives are encouraging and clearly move in the
right direction, for now, they primarily add safeguards but do not
yet serve as expressive design tools.
When it comes to increasingly rich type systems, the White House
recently published a report emphasizing the importance of memory
safety in software design and endorsing the use of the Rust language
(whose compiler was originally written in OCaml before becoming
self-hosted) over C++,
clearly showing that even official bodies (often considered outdated)
highlight the value of rich type systems. Moreover, the response from
Tarides, the company I work for at the time of writing this article,
also presents compelling arguments in favor of using OCaml for
building critical systems.
In conclusion, static type checking is really valuable and highly
recommended, and it's worth exploring languages with sophisticated
type systems (like OCaml) and, why not, going even further by
increasingly delving into formal methods.
Features of the language
Even though it's very tempting to create a massive OCaml tutorial,
the goal of this section is to present what makes OCaml, for me, a
highly relevant choice for both learning and production. The
advantages will therefore be presented (and defended), but this is
not a tutorial.
A multi-paradigm language
Nowadays, talking about multi-paradigm languages might seem
unnecessary, since a large majority of programming languages favored
by industry are already multi-paradigm. However, OCaml is a
functional programming language that also supports imperative
programming, modular programming, object-oriented programming, and,
since version 5.0.0, multi-core programming.
Because Haskell is so prominent in the functional programming
world, it's often assumed that adding imperative mechanisms to a
language is a bad idea -- especially if one is convinced of the
benefits of the functional style. From my perspective, there are
several perfectly legitimate reasons to use imperative programming
when the language allows it:
* Readability of an implementation. Sometimes, avoiding mutability
requires adding extra plumbing (for example, a State Monad),
which can make reading and understanding a program more
cumbersome.
* Performance. Adding such plumbing can introduce overhead, making
the execution of implementations more costly.
* Ease of use. A few years ago, Arthur Guillon ceremoniously told
me that "OCaml is a lambda calculus that trivially allows effects,"
which makes it very effective for tasks like debugging, where
printing messages to standard output is simple. While I
acknowledge that this is probably not the best way to implement
logging, it undeniably provides a comfortable user experience and
enables rapid prototyping.
In general, OCaml's dual nature -- both imperative and functional --
allows you to leverage the advantages of both paradigms in different
situations and, of course, to combine them, for example by hiding a
module's imperative nature behind a functional API.
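As a minimal sketch of that last point (the module `Text_join` and its function `concat_with` are hypothetical names, not taken from any library), the implementation below mutates a `Buffer` in place, while the signature exposes only a pure function from inputs to a result:

```ocaml
(* Hypothetical example: a pure-looking API over an imperative
   implementation. *)
module Text_join : sig
  (* [concat_with sep parts] joins [parts], separated by [sep]. *)
  val concat_with : string -> string list -> string
end = struct
  let concat_with sep parts =
    (* The buffer is mutated in place, but it never escapes this
       function, so callers cannot observe the mutation. *)
    let buf = Buffer.create 64 in
    List.iteri
      (fun i part ->
        if i > 0 then Buffer.add_string buf sep;
        Buffer.add_string buf part)
      parts;
    Buffer.contents buf
end

let joined = Text_join.concat_with ", " [ "a"; "b"; "c" ]
```

From the outside, `concat_with` behaves like any other function: same inputs, same output, no visible state.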
Syntax a la ML
Although syntax is often considered a minor detail, languages in the
ML family have a concise, expressive, and readable syntax. Even
though this family of syntax can be confusing when coming from more
conventional, C-inspired syntax, one gets used to it fairly quickly
and can soon realize that it is very consistent and relatively
unambiguous. However, if OCaml's syntax is problematic for you, don't
hesitate to look into ReasonML, an alternative syntax that uses
braces.
Closely related to research
OCaml is a language that originates from French research, as shown by
the history of Caml, primarily designed to implement the proof
assistant Coq/Rocq. This origin -- and the initial motivations,
implementing Coq while also serving as a programming language taught
in preparatory classes--creates a certain duality:
* The core features were not initially designed with industry in
mind. However, this assertion is no longer true, primarily
because OCaml has become a language used in industrial contexts.
While in the language's genesis, there were more tools for
building a language itself (facilitating the teaching of compiler
mechanisms) than tools for building "enterprise" applications,
projects from the community motivated by industrial use have
enriched the language and its ecosystem, making it a versatile
tool suitable for industry. For example, creating a binding with
the Tk library led to the integration in the language of named
arguments, optional arguments, and polymorphic variants.
* The set of paradigms and language features are carefully thought
out and well-theorized. Generally, the integration of a feature
(or collection of features) results from meticulous research,
based on solid theoretical foundations and reviewed by numerous
experts in the field (often recognized by the scientific
community). This rigor can sometimes slow the introduction of new
features but generally ensures their proper functioning and
theoretical stability.
This theoretical rigor, stemming from OCaml's undeniable closeness to
the research world, means that its various aspects are well
documented, illustrated by a large number of publications, and
exhibit predictable behavior. From my point of view, this makes OCaml
a very wise choice for understanding these different features in
depth. For example, I believe OCaml has allowed me to much better
understand certain traits or paradigms of programming languages.
Moreover, a great example of how meticulous and rigorous research can
support the integration of a language feature is OCaml's
implementation of an object model. Indeed, the thesis of Jerome
Vouillon, Design and Implementation of an Extension of the ML
Language with Objects, proposes an innovative object model that
integrates very well with type inference by separating the notions of
inheritance and subtyping -- inheritance being a syntactic notion and
subtyping a semantic notion -- using row polymorphism to describe
structural subtyping relationships, as opposed to nominal subtyping,
used by Java, C#, and most popular OOP languages. OCaml's object
model fully adheres to the SOLID principles without any additional
ceremony.
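A small sketch can illustrate structural subtyping (the classes `person` and `robot` are hypothetical and share neither a parent class nor a declared interface). The function `greet` accepts any object that has at least a `name` method returning a string:

```ocaml
(* Two unrelated, hypothetical classes. *)
class person (name : string) =
  object
    method name = name
  end

class robot (id : int) =
  object
    method name = "robot-" ^ string_of_int id
    method id = id
  end

(* Inferred type: < name : string; .. > -> string.
   The ".." is a row variable: any object offering at least a
   [name] method of type string is accepted. *)
let greet o = "Hello " ^ o#name

let greetings = [ greet (new person "Ada"); greet (new robot 7) ]
```

No annotation or `implements` clause is needed: the compatibility of `person` and `robot` with `greet` follows from their structure alone.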
Algebraic types
I've been quite expansive about the reasons why I value a language
with static type checking. However, in my experience, for a
statically typed language to be truly usable, the presence of
algebraic types is necessary:
* Product types: These allow grouping values of heterogeneous types
(thus creating a conjunction of heterogeneous types). They are
generally present in all mainstream languages (for example,
objects, which introduce additional concepts, or tuples and
records).
* Sum types: These allow constructing a disjunction of
heterogeneous value types, with different cases indexed by
constructors. While some special cases of sums exist in
mainstream languages--like booleans (which are a disjunction of
two cases: true and false, i.e., two parameterless constructors)
-- support for full sum types is often cumbersome in popular
languages. For example, Kotlin and Java (and de facto C#) use a
construct associated with inheritance relations called sealing.
The integration of dedicated sum type syntax also took some time
in Scala, which, prior to recent versions, relied on sealed
families, making the expression of sums verbose and, in my view,
harder to reason about.
* Exponential types: These describe functions and, in particular,
give types to higher-order functions (functions that take other
functions as arguments or return them as results).
Coupled with pattern matching and parametric polymorphism (or
generics), an algebraic type system is an incredibly expressive tool
for describing data structures, the state machine of a program, or
modeling a business domain with an appropriate cardinality. Even in
the 21st century, where products and exponentials are common, when I
use very popular languages, I am often frustrated by the lack of sum
types, which forces me to use verbose encodings (increasing the
domain's cardinality). This is particularly noticeable when working
with Go and TypeScript.
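As an illustration, here is a sketch using a hypothetical payment domain (all type and function names are invented for the example). The sum type has exactly three cases, each carrying only the data it needs, whereas an encoding based on a status string plus optional fields would also admit invalid combinations:

```ocaml
(* Product type: a record groups several values together. *)
type receipt = { amount_cents : int; reference : string }

(* Sum type: a payment is in exactly one of three states. *)
type payment =
  | Pending
  | Paid of receipt
  | Failed of string (* reason of the failure *)

(* Exponential type: [describe] has type payment -> string.
   The compiler checks that the pattern matching is exhaustive. *)
let describe = function
  | Pending -> "pending"
  | Paid { amount_cents; reference } ->
      Printf.sprintf "paid %d cents (%s)" amount_cents reference
  | Failed reason -> "failed: " ^ reason

(* [List.map] is polymorphic and takes a function as argument. *)
let descriptions =
  List.map describe
    [ Pending
    ; Paid { amount_cents = 1250; reference = "A-42" }
    ; Failed "card expired"
    ]
```

Adding a fourth constructor to `payment` makes the compiler flag every pattern matching that does not handle it yet.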
The appeal of this triad is, in fact, probably one of the reasons
(combined with a very ergonomic ecosystem and toolchain) behind the
success of Rust. In short, if you intend to build a new programming
language with static type checking, please, do not hesitate to
include algebraic types!
Finally, there are aspects of OCaml's type system that I haven't
covered, but which probably deserve dedicated articles. For example,
generalized algebraic data types (GADTs), which allow expressing even
more invariants.
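To give a rough idea of what GADTs allow, here is the classic sketch of a tiny typed expression language (the type `expr` and the function `eval` are illustrative names). The type parameter records what each expression evaluates to, so ill-typed expressions such as `Add (Bool true, Int 1)` are rejected at compile time:

```ocaml
(* Each constructor states the type of value it evaluates to. *)
type _ expr =
  | Int : int -> int expr
  | Bool : bool -> bool expr
  | Add : int expr * int expr -> int expr
  | If : bool expr * 'a expr * 'a expr -> 'a expr

(* The locally abstract type [a] lets each branch return a value
   of the type dictated by the constructor being matched. *)
let rec eval : type a. a expr -> a = function
  | Int n -> n
  | Bool b -> b
  | Add (l, r) -> eval l + eval r
  | If (c, t, e) -> if eval c then eval t else eval e

let result = eval (If (Bool true, Add (Int 1, Int 2), Int 0))
```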
Modular programming and module language
OCaml, through its ancestor Caml Light, was among the first languages
to offer a module system, similar to Standard ML, providing
encapsulation and abstraction while supporting separate compilation,
in the style of Modula-2. OCaml's module system is a fundamental
aspect of the language, although its complexity can be intimidating.
Indeed, in OCaml, it is possible to clearly distinguish the interface
(the signature) from the implementation (the structure), thus
facilitating encapsulation and documentation, while also allowing
function application within the module language.
I find it particularly difficult to address the topic of modules
briefly (it's a subject I've wanted to explore on my blog for years).
However, here is a list of advantages I see in OCaml's highly modular
approach:
* Separate compilation: A key feature that allows efficient
compilation of large programs by identifying junction points to
optimize parallel and incremental compilation. This approach is
leveraged by dune, the recommended build system for OCaml.
* Systematic separation of implementation and interface: Offers
several significant advantages, including encapsulation and
placing documentation in the interface. In my programming
workflow, I find this very convenient because I can implement my
structure (the module's implementation) while being guided by
type inference and specify its API in the signature (the module's
interface), deciding on the display order and providing clear
documentation that doesn't pollute the implementation space.
Additionally, encapsulation allows me to freely define
intermediate types inside the structure, for example, to
represent a program's state machine, without letting it escape.
* A powerful tool for describing data structures: By abstracting
types (hiding their implementation) and combining this with
encapsulation, it is possible to describe data structures that
maintain invariants. This is why it is common to have a
structure/signature pair for each data structure, hiding implementation
details through abstraction and encapsulation.
* Reusability and sharing: Just as it is possible to describe types
in the value language (as seen with algebraic types), it is also
possible to describe types in the module language, called
translucent signatures, which allow defining the type of a
signature without associating it with a structure. These
signatures are structurally typed, and coupled with functors
(functions in the module language), it is possible to share
behavior between modules.
* Advanced forms of polymorphism: Including Higher Kinded
Polymorphism, available in the module language. In broad terms,
you can describe "generics parameterized by generics". The absence
of this feature in languages like F# or Java often motivates the
use of heavy encodings to work around it.
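Several of these points can be sketched in a few lines (every module name below is hypothetical). `Non_empty` hides its representation behind an abstract type so that the "at least one element" invariant cannot be broken from outside, and the functor `Mappable_utils` shares behavior across any module providing a parameterized type `'a t` with a `map`:

```ocaml
(* Abstraction: ['a t] is abstract, so values can only be built
   through [make], which guarantees at least one element. *)
module type NON_EMPTY = sig
  type 'a t

  val make : 'a -> 'a list -> 'a t
  val head : 'a t -> 'a (* total: never raises *)
  val to_list : 'a t -> 'a list
end

module Non_empty : NON_EMPTY = struct
  type 'a t = 'a * 'a list

  let make x xs = (x, xs)
  let head (x, _) = x
  let to_list (x, xs) = x :: xs
end

(* A signature abstracting over a type constructor ['a t]. *)
module type MAPPABLE = sig
  type 'a t

  val map : ('a -> 'b) -> 'a t -> 'b t
end

(* Functor: a function from modules to modules. *)
module Mappable_utils (M : MAPPABLE) = struct
  let replace_all value container = M.map (fun _ -> value) container
end

module List_mappable = struct
  type 'a t = 'a list

  let map = List.map
end

module Option_mappable = struct
  type 'a t = 'a option

  let map = Option.map
end

module List_utils = Mappable_utils (List_mappable)
module Option_utils = Mappable_utils (Option_mappable)
```

Because signatures are structurally typed, `List_mappable` and `Option_mappable` never have to declare that they satisfy `MAPPABLE`; having the right components is enough.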
The theory behind module languages in ML-family languages is a vast
subject, still evolving, and very difficult to summarize in a single
paragraph. However, the introduction of Derek Dreyer's thesis,
Understanding and Evolving the ML Module System, provides an
excellent explanation of the purpose and use of modules, illustrated
with many examples. I hope to take the time in the coming weeks or
months to write more extensively about the module language than I
have already attempted, because it could be very educational and, in
my view, the topic is extremely interesting!
Dependency injection and inversion
Briefly touching on object-oriented programming in OCaml, I mentioned
that OCaml allows, through its language features, a straightforward
way to meet the prerequisites for writing SOLID code. The final point
I'd like to emphasize is the ease of dependency inversion, achievable
through language-provided features. In broad terms, the principle of
dependency inversion involves describing dependency lattices using
abstractions rather than implementations. This way, dependencies can
be injected afterward -- making context changes, for example in unit
testing, trivially implementable.
OCaml provides (at least) two tools that facilitate this inversion,
each useful in different contexts. We will draw inspiration from the
very popular teletype example to show how to invert dependencies:
let program () =
  let () = print_endline "Hello World" in
  let () = print_endline "What is your name?" in
  let name = read_line () in
  print_endline ("Hello " ^ name)
Even if it might not seem obvious, this program depends on concrete
implementations -- namely, interactions with standard input and
output.
Through modules
The most straightforward approach is to use modules, either as
first-class values or by construction, using functors. The duality
between signatures and structures makes dependency inversion obvious.
For example, to revisit our example, here's how, using first-class
modules, it becomes very easy to depend on an abstract set of
interactions. We start by describing the abstract representation of
possible interactions:
module type IO = sig
  val print_endline : string -> unit
  val read_line : unit -> string
end
We can now expect our program function to take a module of type IO as
an argument (we'll call this a handler) and use the functions
exported by the module, which in our example is named Handler:
let program (module Handler : IO) =
  let () = Handler.print_endline "Hello World" in
  let () = Handler.print_endline "What is your name?" in
  let name = Handler.read_line () in
  Handler.print_endline ("Hello " ^ name)
For example, in the context of unit testing, it's possible to provide
an implementation that logs all the operations called (and mocks the
read_line call to fix the returned result). This makes expressing
unit tests that verify business logic very easy to implement.
Passing a concrete implementation as an argument to our function
amounts to interpreting the program.
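Here is one possible sketch of such handlers. The `IO` signature and `program` are repeated so that the snippet is self-contained, while `Std_io`, `make_test_io`, and the format of the log entries are assumptions made for the example:

```ocaml
module type IO = sig
  val print_endline : string -> unit
  val read_line : unit -> string
end

let program (module Handler : IO) =
  let () = Handler.print_endline "Hello World" in
  let () = Handler.print_endline "What is your name?" in
  let name = Handler.read_line () in
  Handler.print_endline ("Hello " ^ name)

(* Production handler: delegates to the standard library. *)
module Std_io : IO = struct
  let print_endline = Stdlib.print_endline
  let read_line = Stdlib.read_line
end

(* Hypothetical test handler: records every operation in a log and
   answers [read_line] with a fixed [input]. *)
let make_test_io ~input =
  let log = ref [] in
  let module Test_io = struct
    let print_endline s = log := ("print: " ^ s) :: !log

    let read_line () =
      log := "read" :: !log;
      input
  end in
  ((module Test_io : IO), (fun () -> List.rev !log))

(* Interpret the program with the test handler, then read the log. *)
let trace =
  let handler, recorded = make_test_io ~input:"OCaml" in
  program handler;
  recorded ()
```

Running the real program is then a matter of calling `program (module Std_io)`; the business logic itself is unchanged.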
Through user-defined effects
OCaml version 5 arrived with a host of new features. However, the
biggest advancement is the complete redesign of the OCaml runtime to
support multi-core execution. There are several ways to describe
concurrent algorithms -- for example, using actors or channels. OCaml
has chosen to rely on effects, which simplify the management of the
program's control flow. In fact, OCaml allows users to define their
own effects, logically called user-defined effects. While they are a
powerful tool for describing concurrent programs, they also make it
easier to inject dependencies when you want to maintain control, at
the handler level, over the execution flow of a program.
Note: In my example, I am using an experimental syntax, just
merged into the OCaml main branch, which will likely be available
in version 5.3.0 of the language.
As with our previous improvement, we first need to describe the set
of operations that can be performed. We use the effect construct:
effect Print_endline : string -> unit
effect Read_line : unit -> string
Next, we can write our program in a direct style, by producing
effects:
let program () =
  let () = Effect.perform (Print_endline "Hello World") in
  let () = Effect.perform (Print_endline "What is your name?") in
  let name = Effect.perform (Read_line ()) in
  Effect.perform (Print_endline ("Hello " ^ name))
It is then possible to interpret our program afterward, using a
construction similar to pattern matching, to give a specific meaning
to each effect.
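As a hedged sketch of such an interpretation, the snippet below does not use the experimental syntax shown above. It assumes OCaml 5.0 or later and relies on the `Effect` module of the standard library: effects are declared by extending `Effect.t` and handled with `Effect.Deep.try_with`. The function `run_with_input` and its logging behavior are assumptions made for the example:

```ocaml
(* Effect declarations, written as extensions of [Effect.t]. *)
type _ Effect.t +=
  | Print_endline : string -> unit Effect.t
  | Read_line : unit -> string Effect.t

let program () =
  let () = Effect.perform (Print_endline "Hello World") in
  let () = Effect.perform (Print_endline "What is your name?") in
  let name = Effect.perform (Read_line ()) in
  Effect.perform (Print_endline ("Hello " ^ name))

(* Hypothetical test interpreter: printed lines are collected in a
   log, and [Read_line] is answered with a fixed [input]. *)
let run_with_input (input : string) =
  let log : string list ref = ref [] in
  let handler =
    { Effect.Deep.effc =
        (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Print_endline s ->
              Some
                (fun (k : (a, unit) Effect.Deep.continuation) ->
                  log := s :: !log;
                  (* Resume the program where it performed the effect. *)
                  Effect.Deep.continue k ())
          | Read_line () ->
              Some
                (fun (k : (a, unit) Effect.Deep.continuation) ->
                  Effect.Deep.continue k input)
          | _ -> None (* Other effects are left to outer handlers. *))
    }
  in
  Effect.Deep.try_with program () handler;
  List.rev !log
```

Unlike the module-based version, the handler receives the continuation `k`, so it decides whether and when the program resumes.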
Currently, it should be noted that effect propagation is not tracked
by the type system. However, this is an experimental feature, which
is used extensively in the new version of YOCaml. I am aware that
resources are being devoted to developing an efficient type system to
track effect propagation!
In general, when I don't care about controlling the program's flow,
or I don't need to add effects after the fact, I use modules. But in
the case of YOCaml, the new effect system was leveraged to introduce
effects dedicated to unit testing, allowing, for example, the mocking
of time passing.
Once again, it's really difficult not to go on at length about
user-defined effects, which are a brand-new and very exciting feature
of the language. I'll conclude by simply sharing two articles written
by Arthur Wendling that explain the use of effects in a very
pedagogical way, along with a comprehensive bibliography on the
literature related to effect abstraction in functional programming:
* Scopes and effect handlers
* Roguelike with effect handlers
* Effect bibliography
It's worth noting that this inversion/injection could also be done
using records or objects. However, my experience with OCaml suggests
that approaches using modules or effects (when you want to manipulate
the program's control flow) are often more straightforward and easier
to reason about.
Regarding the future
OCaml is a constantly evolving language that changes with each
version. In the section on dependency inversion, I briefly mentioned
the recent inclusion of effects in the language to describe a
multi-core runtime, reflecting the ongoing evolution of OCaml over
the years. One can also note the integration of binding operators,
which make the use of the triad Functors, Applicative Functors, and
Monads more convenient -- similar to computation expressions in F#.
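As a brief sketch of binding operators (available since OCaml 4.08), the hypothetical module `Option_syntax` below defines `let*`, `let+`, and `and+` for the `option` type; `safe_div`, `compute`, and `sum_both` are illustrative names only:

```ocaml
(* Hypothetical helper module: binding operators for [option]. *)
module Option_syntax = struct
  (* Monadic bind: sequence computations that may fail. *)
  let ( let* ) = Option.bind

  (* Functor map: transform the result of a computation. *)
  let ( let+ ) x f = Option.map f x

  (* Applicative product: combine two independent computations. *)
  let ( and+ ) a b =
    match (a, b) with
    | Some x, Some y -> Some (x, y)
    | _ -> None
end

let safe_div a b = if b = 0 then None else Some (a / b)

(* Each [let*] stops the whole computation as soon as it sees [None]. *)
let compute a b c =
  let open Option_syntax in
  let* x = safe_div a b in
  let* y = safe_div x c in
  Some (x + y)

let sum_both a b =
  let open Option_syntax in
  let+ x = a
  and+ y = b in
  x + y
```

The same three operators can be defined for any other type with this structure (results, parsers, promises), which keeps the direct-style notation uniform across libraries.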
Currently, many very exciting projects are underway to further
improve the language:
* In-depth work on the expression of effects, with a newly added
syntax, along with research on the separation between operations
and effects and, of course, on tracking the propagation of effects
in the type system.
* Jane Street proposed a non-intrusive resource management model,
inspired by Rust, introducing modalities and a bit of linearity.
* Genuine foundational work has begun on the module language,
which should make Modular Implicits easier to implement.
We can also note the development of a hygienic macro system, the
gradual integration of a staged metaprogramming system, and the
implementation of an optimization back-end. All of this reflects how
actively OCaml keeps innovating and makes its evolution over the
coming years very motivating and exciting!
Weaknesses
Even though I'm convinced that OCaml is an excellent language,
claiming it is perfect would probably be disingenuous -- after all,
nothing is perfect. Here are, in my opinion, a few points that cast a
shadow on OCaml as a language:
* Lack of ad-hoc polymorphism. Although it is possible to work
around it, for example using local module openings, the absence
of ad-hoc polymorphism (via type classes -- as in Haskell, or
traits/implicit objects -- as in Rust and Scala, or canonical
structures -- as in Coq) can sometimes make certain situations
tricky. Even though I tend to prefer explicit relationships, over
the years I've found several cases where this absence can be
problematic:
+ The inability to describe type parameter constraints on
polymorphic functions, leading to polymorphic equality and
comparison functions in the standard library, which has
caused much debate and, for example, required specialized
versions of arithmetic operators for different numeric
representations (int, int64, float).
+ Risk of combinatorial explosion when describing many
relationships between modules. This is why the Preface
library proposes a somewhat complex modular decomposition.
However, even though the arrival of implicit modules is probably
not in the short-term roadmap, recent work on the module
language, as discussed in the "future of OCaml" section, is
promising.
* Cumbersome interaction between the module language and the value
language. The module language is a different language with its
own type system. Whether this counts as a weakness is debatable,
but this distinction can be intimidating. It comes from the fact
that OCaml's module system was a pioneer in module theory and
predates more recent innovations (e.g., 1ML). In practice,
besides being complex to grasp, certain parts of the language are
hard to specify correctly, for example recursive modules.
* A language comfortable for functional programming, but impure.
While I consider impurity a feature, importing idioms from purely
functional languages (e.g., Haskell) can cause difficulties
related to type inference, such as the value restriction. Even
though OCaml has relaxed this restriction, its implications on
polymorphic function inference can still be intimidating -- for
very good reasons.
* Syntax. Personally, I really like OCaml's syntax and believe
syntax should rarely be a major issue, but some choices can be
confusing. For instance, type parameters prefix the type name: a
list of a is written 'a list. Many of these choices aim to reduce
syntactic ambiguity, and you get used to them quickly. However,
coming from another language, some of these conventions may seem
surprising.
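A few of the points in the list above can be seen in a short sketch. The module Float_infix is hypothetical, written here only to show the local-opening workaround; the rest uses the standard library:

```ocaml
(* Arithmetic operators are specialized per numeric representation. *)
let int_sum = 1 + 2
let float_sum = 1.5 +. 2.5
let int64_sum = Int64.add 1L 2L

(* Hypothetical module that rebinds the usual operators to floats. *)
module Float_infix = struct
  let ( + ) = ( +. )
  let ( / ) = ( /. )
end

(* Local module opening: inside the parentheses, + and / are the float
   versions, so the expression reads like ordinary arithmetic. *)
let average a b = Float_infix.((a + b) / 2.0)

(* Without constraints on type parameters, comparison is polymorphic:
   compare accepts any type, and raises at runtime on functional values. *)
let sorted = List.sort compare [ 3; 1; 2 ]

(* Type parameters are written before the type name. *)
let names : string list = [ "a"; "b" ]
let table : (string, int) Hashtbl.t = Hashtbl.create 16
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

let () =
  assert (int_sum = 3 && float_sum = 4.0 && int64_sum = 3L);
  assert (average 1.0 3.0 = 2.0);
  assert (sorted = [ 1; 2; 3 ])
```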
I think these weaknesses are generally debatable (because they are
often justified), but I completely understand that they can be
unsettling. However, I believe they are not enough to make OCaml
unusable and should not be a major barrier to getting started with
OCaml! The benefit of having an improvable language is that it
constantly offers a range of potential improvements, motivating work
that can also benefit other languages. And, to be entirely honest,
while I am aware of these rough edges, I've more often found myself
frustrated, in other languages, by the absence of features that exist
in OCaml than complaining about these rough edges while writing OCaml
itself. For these rough edges, there are usually
workarounds (sometimes only partially satisfying, I admit) that allow
one to work calmly and effectively.
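The value restriction mentioned earlier is a good example of such a workaround. In the sketch below (the names copy_weak, copy_poly and empty are hypothetical), a partial application loses polymorphism, and adding the argument explicitly (eta-expansion) restores it:

```ocaml
(* A partial application is not a syntactic value, so its type variable
   is only weakly polymorphic: it will be fixed by the first use. *)
let copy_weak = List.map (fun x -> x)
let ints = copy_weak [ 1; 2; 3 ]  (* copy_weak is now int list -> int list *)
(* let strs = copy_weak [ "a" ]   would be rejected by the type checker *)

(* Eta-expansion: with the argument written out, the definition is a
   function, and it is generalized to 'a list -> 'a list. *)
let copy_poly l = List.map (fun x -> x) l
let ints' = copy_poly [ 1; 2; 3 ]
let strs' = copy_poly [ "a"; "b" ]

(* The relaxed value restriction still generalizes this application,
   because the type variable only appears in a covariant position. *)
let empty = List.rev []
let no_ints : int list = empty
let no_strs : string list = empty

let () =
  assert (ints = ints');
  assert (strs' = [ "a"; "b" ]);
  assert (no_ints = [] && no_strs = [])
```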
To conclude on language
I have, in very broad strokes, outlined reasons why, in my opinion,
learning OCaml is a very relevant choice. This language allows one to
fundamentally understand certain very popular programming idioms
(often poorly defined). Moreover, some aspects of the language
perfectly serve industrial purposes, making good practices sometimes
trivial to express! Much of this appeal can be experimented with in
other languages, but OCaml's strongly multi-paradigm nature allows
one to centralize this learning in a single language. To my
knowledge, in the jungle of partially popular languages, only Scala
seems to cover as many topics, although, from my point of view, its
object model is, essentially for interoperability with other JVM
languages, far less interesting.
Since the goal of this article is not to be a tutorial, I
deliberately skimmed over certain concepts, such as modules and
effects, and I hardly mentioned objects, polymorphic variants, or
generalized algebraic types. If these topics interest you, I
encourage you to read in detail the excellent Using, Understanding,
and Unraveling The OCaml Language by Didier Remy, which is a goldmine
for anyone wishing to deepen their knowledge of OCaml, along with the
books I presented in the introduction.
In conclusion, OCaml offers a diverse and rich set of language-level
tools: a functional core, imperative traits, a rich and expressive
inferred type system (allowing the expression of algebraic types and
facilitating clear domain modeling), a module system for abstraction,
reusability, and defining compilation units, an object model, the
ability to express effects that can be propagated and interpreted a
posteriori, and other advanced features. These tools serve equally
well for learning programming, for building industrial-grade programs
that follow standards, and for implementing complex data structures
and category-theory-based abstractions. Even just to grasp advanced
programming concepts, OCaml is an excellent candidate -- which is why
OCaml has been an obvious inspiration for many more recent languages,
with Rust being a notable example.
OCaml as an ecosystem
Having an expressive language is very beneficial for building things
(the phrasing is deliberately naive). However, in different contexts,
both professional and personal, this is not enough:
* In a professional context, it is obvious that if I want my team
and me to be productive, it is probably not very relevant to have
to build a whole tool stack before being able to start addressing
the problem we are tasked with.
* In a personal context, even though one could argue that building
your technology stack is very educational, it changes the set of
skills you actually want to develop. If, to build a small web
application to get started with OCaml as a web language, I have
to build my entire HTTP stack, it is very likely that OCaml is
not the right choice. Rest assured, however, that OCaml has a
rich tooling ecosystem for building web applications!
That's why the features offered by the language are not a sufficient
metric to describe its viability for building and maintaining
projects. The ecosystem is also a very important factor. It is for
these reasons that .NET and the JVM, through relatively less
expressive (but improving) languages like Java and C#, are also so
popular. To assess the relevance of an ecosystem, I think it is
important to consider several criteria:
* The relevance of the runtime (or compilation targets) for the
project. It's likely that I wouldn't recommend OCaml for
embedding on tiny, exotic hardware -- though, knowing nothing
about low-level programming (because it's not my field at all), I
could be wrong.
* Its platform. Is its entire toolchain complete and ergonomic?
From my point of view, this includes a package manager, a build
system, good editor support (as agnostic as possible), a solid
documentation generator, and a collection of additional tools,
such as a formatter (and many others).
* The relevance of the available libraries (and their level of
maintenance and discoverability, which generally implies having a
package manager) with particular consideration for their
ergonomics. For example, if I don't have any cryptography
primitives, I probably wouldn't choose this technology to build a
blockchain. There is a whole class of problems that are very
difficult to solve in isolation or in a professional context.
In this section, we will try to overview these different points to
see if the OCaml ecosystem lives up to the language. I want to
clarify that I am somewhat biased because I have been convinced of
OCaml's relevance since 2012, back when the ecosystem was drastically
poorer. At that time, I tried to build projects by patching the gaps,
which probably created a survivorship bias. Nowadays, thanks in part
to industrial users, the OCaml ecosystem is much richer and more
extensive, making it much easier to defend, although when some gaps
still exist, the bad faith of the old user can resurface.
Compilation, runtimes, and additional targets
Since its inception, OCaml has had two compilation targets:
* Native compilation, which produces highly efficient executables
compiled for a specific architecture (and supports a large number
of architectures). Moreover, whereas Windows was historically
largely neglected, a special effort has been made to support it
(also note the DkMl project, an independent initiative).
* Compilation to bytecode (for a virtual machine), producing
portable executables.
The presence of a virtual machine enabled the development of the
venerable Js_of_OCaml, which allows transforming OCaml bytecode into
JavaScript, making OCaml perfectly viable for developing applications
in the browser as well as in the Node runtime, and it is extensively
used for this website. Using a similar approach, WebAssembly support
was made possible very recently through the Wasm_of_OCaml project.
Supporting compilation to WASM for a language with a garbage
collector was a serious challenge, but with the recent specification
of interaction between WASM and garbage collectors, OCaml now has
perfectly decent WebAssembly compilation (and many ambitious web
projects, like Ocsigen, are beginning to support WASM natively).
Moreover, the Melange project (historically BuckleScript) offers a
way to transpile -- mapping the OCaml AST to the JavaScript AST -- as
an alternative for producing JavaScript. If I were to compare
Js_of_OCaml and Melange, beyond the different underlying methods used
to produce JavaScript (compiling to bytecode and then transforming
that bytecode into JavaScript versus syntactic transformation from
OCaml to JavaScript), I would say that Js_of_OCaml integrates better
with the OCaml ecosystem and is therefore likely intended for OCaml
developers who want to make their projects accessible from a browser
-- indeed, interaction with the existing JavaScript ecosystem can be
more cumbersome. Melange fits better with the JavaScript ecosystem
(npm and co) and is therefore likely intended for JavaScript
developers seeking to bring more safety to their JS projects (or an
existing codebase).
Nowadays, it is common to find multi-backend languages like Idris or
Nim. However, at the time, I was very impressed that OCaml could,
from the moment I started using it, also compile to JavaScript. Back
then, the only language I knew that offered multiple compilation
targets that were so different was Haxe (incidentally, Haxe is
written in OCaml).
Indeed, in 2024, producing JavaScript has become standard, but the
first traces of Js_of_OCaml date back to 2006, making OCaml a pioneer
in the field!
A quick detour via MirageOS
In the lattice formed by the different OCaml execution and
compilation contexts, having libraries that work well in the majority
of contexts is a challenging task. Fortunately, the MirageOS project
-- a set of libraries designed to build an operating system dedicated
to running only a single application via virtualization (a unikernel)
-- introduced a true discipline for producing multi-context libraries.
In the near future, I would like to spend more time writing about
Mirage, a fascinating project that we are trying to integrate into
our projects, for example in YOCaml, our static site generator.
Moreover, in addition to providing a sound approach to distributing
intelligently compartmentalized libraries, Mirage offers a solid
foundation of libraries for building OCaml projects, which I will
discuss more extensively in the section dedicated to libraries.
The OCaml platform
The OCaml platform is a set of tools, maintained within an explicit
lifecycle (active, incubating, maintained, and deprecated), designed
to support the compiler with a coherent toolchain for OCaml code
production. It includes many tools serving different purposes;
however, in this section, I will focus only on certain aspects of the
platform, leaving you free to consult its page and roadmap for more
detailed information. In this section, we will look at, in broad
strokes, 4 main specific points:
* The package manager
* The build system (build-system)
* Editor support (including code formatting)
* The documentation generator
For anyone who has been using OCaml for some time, this is probably
the most exciting part of the article, because, in my opinion, it is
the area that has benefited the most from progress. And the roadmap
is, in my view, promising!
OPAM, the package manager
Even though language-specific package managers have become very
popular (if not essential) in reducing adoption friction for a
language, at the time OCaml was designed, they were rare. Indeed,
apart from CTAN, for distributing TeX packages, CPAN, inspired by
CTAN, for distributing Perl packages, and PEAR for PHP, it would take
until Gems for development technologies to consider adopting a
package manager as axiomatic for a programming language.
OPAM, for OCaml Package Manager, is a proposal from 2012 (the
official site About page presents a small timeline). In addition to
installing packages, OPAM allows you to install different versions of
OCaml and create potentially sandboxed environments, called switches.
You can use the public resource repository, hosted on GitHub, but it
is also perfectly possible to create your own package index.
Having already published several packages on OPAM, I must admit
that the CI for package addition validation is incredibly
efficient and user-friendly (each error provides a Dockerfile to
reproduce the issue locally), and that the team of people who
moderate and manage package additions/changes are extraordinarily
responsive and kind.
Even though, in the light of modern standards, one could point out
several criticisms of OPAM, for example:
* terminology that can be cumbersome to grasp (switch, invariant,
etc.)
* duplication of all packages and compilers across multiple
switches (this is a known issue for which work has already been
done)
* and probably some ergonomic issues (notably the interaction with
dune could be smoother, for which work is also currently underway)
* some complications when managing packages in development,
referencing them from a source repository rather than from OPAM
I must admit that coming from an era when OPAM did not exist, I have
learned to live with some of these minor pitfalls, and on a daily
basis, I have little reason to complain about the tool, which has
never really let me down in my everyday use. However, if you have
encountered usage issues, I encourage you to discuss them on one of
the communication spaces so that the development team can take your
feedback into account and guide you.
There is also esy as an alternative package manager, which draws
inspiration from Nix to build a reusable store, in the same way it is
possible to use Nix with OCaml. However, being somewhat conventional,
I am not really familiar with these practices, and being satisfied
with my workflow with OPAM, I have, unfortunately, never taken the
time to seriously experiment with esy.
Dune, the build-system
As with package management, historically, OCaml had several
build-systems: the venerable ocamlbuild, oasis, ocp-build, Jenga, and
other variations around Make. However, since 2018, the community has
strongly adopted Dune, a build-system initially developed at
Janestreet.
In many aspects, Dune can be intimidating. Indeed, its documentation
is very dense -- but it has greatly improved in terms of structure
over the past few months. And, while many tools choose rule
description languages like YAML, TOML, or even JSON, Dune has opted
for S-expressions. It is also regrettable that Dune, by default,
treats all warnings as fatal.
Before explaining some of its choices (such as S-expressions), it is
very important to highlight the points that have made Dune a
standard:
* Dune is very fast and offers a highly efficient execution model
* it builds the necessary artifacts for configuration automatically
* it generates some redundant files (such as OPAM description
files)
* it trivializes the vendoring of libraries
* it allows invoking read-eval-print loops correctly provisioned by
the context
* one becomes familiar very quickly with S-expressions, which allow
rules to be described schematically and rapidly
* it is relatively agnostic and can execute arbitrary tasks
(similar to make)
* it is constantly evolving and improving from version to version
* paired with dune-release, it makes publishing packages on OPAM
incredibly simple
Perhaps I'm biased, but in my opinion, Dune is one of the most
generic and pleasant build-systems I've ever used -- even if, at first
glance, it can seem intimidating and some choices may be hard to
justify.
On the choice of S-expressions
At first glance, using a Lisp-like syntax to describe binaries,
libraries, and projects may seem surprising. However, this decision
has several advantages:
* The AST of S-expressions being drastically simple, parsing is
very straightforward and can be made highly efficient, which does
not penalize compilation speed.
* Every expression is explicitly delimited by its parentheses,
making files easier to inspect in case of errors (anyone who has
tried to handle errors in large YAML files will have faced this
kind of problem).
* The language is very easy to learn and to describe.
* It allows describing real programs, making Dune relatively
generic and enabling additional tasks.
So, from my point of view, the choice of S-expressions is relevant:
it allows describing complex, readable programs without being too
verbose, does not significantly slow down compilation, and enables
very concise descriptions of highly complex build rules. And to be
completely honest, you get used to it very quickly!
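To give an idea of how small the AST of S-expressions is, here is a minimal sketch. The type sexp and the function to_string are hypothetical and are not Dune's actual implementation; the stanza value only mimics the shape of a Dune description:

```ocaml
(* The whole AST of S-expressions fits in one small type. *)
type sexp =
  | Atom of string
  | List of sexp list

(* Printing is a short recursive function. *)
let rec to_string = function
  | Atom s -> s
  | List items -> "(" ^ String.concat " " (List.map to_string items) ^ ")"

(* A value shaped like a small Dune stanza (illustrative only). *)
let stanza =
  List
    [ Atom "executable"
    ; List [ Atom "name"; Atom "main" ]
    ; List [ Atom "libraries"; Atom "str"; Atom "unix" ]
    ]

(* Prints: (executable (name main) (libraries str unix)) *)
let () = print_endline (to_string stanza)
```

With only two constructors, a parser, a printer, or an error reporter for this format stays short, which is the property the first bullet above relies on.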
Contribution to the state of the art: Selective Applicative Functor
In addition to being a very pleasant build-system, Dune has
contributed to the state of the art in research by highlighting a new
construction inspired by category theory. Indeed, in 2018, Andrey
Mokhov, Neil Mitchell and Simon Peyton Jones proposed, in the
excellent paper "Build Systems a la Carte", a collection of
abstractions to re-implement -- modularly -- various build-systems.
However, for reasons related to static dependency analysis, these
models were not compatible with Dune. After several investigations
and experiments, a new construction, similar to an Applicative, a
Selective Applicative Functor, capturing Dune's prerequisites was
proposed. This information may seem anecdotal, but, in my view, it
reinforces the value (and importance) of being at the intersection of
research and industry.
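To make the construction a little more concrete, here is a sketch of what a selective interface can look like in OCaml, with an instance for the option type. The signature SELECTIVE and the module Option_selective are written for this illustration (assuming OCaml 4.12 or later for the Either module); this is not Dune's internal code:

```ocaml
(* A sketch of a selective applicative functor interface. *)
module type SELECTIVE = sig
  type 'a t
  val pure : 'a -> 'a t
  val map : ('a -> 'b) -> 'a t -> 'b t
  val apply : ('a -> 'b) t -> 'a t -> 'b t
  (* Run the first computation; use the handler only on a Left. *)
  val select : ('a, 'b) Either.t t -> ('a -> 'b) t -> 'b t
end

module Option_selective : SELECTIVE with type 'a t = 'a option = struct
  type 'a t = 'a option
  let pure x = Some x
  let map = Option.map
  let apply f x =
    match f, x with
    | Some f, Some x -> Some (f x)
    | _ -> None
  let select e handler =
    match e with
    | None -> None
    | Some (Either.Right b) -> Some b  (* the handler is skipped *)
    | Some (Either.Left a) -> Option.map (fun f -> f a) handler
end

let () =
  let open Option_selective in
  (* A Right value never needs the handler, even a missing one. *)
  assert (select (Some (Either.Right 1)) None = Some 1);
  (* A Left value is passed to the handler. *)
  assert (select (Some (Either.Left 2)) (pure (fun x -> x + 1)) = Some 3)
```

The point of select is that both computations are declared up front, so they can be analyzed statically, while the second one is only needed depending on the result of the first.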
Alternatives
Although Dune is widely adopted by the community, the OCaml ecosystem
offers alternative systems (sometimes using Dune under the hood), for
example, Obazl
which provides OCaml rules for Bazel, Onix which allows building
projects with Nix, Buck2 which is an ambitious and generic project
competing with Bazel, and Drom which offers an experience similar to
Cargo, unifying package management and project building.
LSP and Merlin for editors
In the previous sections, we saw how much OCaml has progressed in
areas necessary for industrialization. On the other hand, in terms of
editor support, OCaml has had excellent support for Vim and Emacs for
over 10 years through the Merlin project, which provides editor
services enabling completion, diagnostics, code navigation features,
tools related to value deconstruction, value construction, management
(and navigation) of typed holes, polarity-based search, precise
information (with verbosity control) on value types,
jump-to-definition, etc.
In my opinion, IDE support via Merlin has been excellent in OCaml for
a very long time. Coupled with ocp-indent, which calculates the
cursor position after an action in the editor, and OCamlformat, which
allows on-the-fly (configurable) formatting of OCaml files, writing
code in Emacs or Vim is an absolute joy!
The advent of VSCode, LSP as standard
In 2015, Visual Studio Code arrived, introducing the Language Server
Protocol, which abstracts how editors interact with a language
through a server, following a uniform protocol. OCaml has a very good
LSP server that itself relies on well-established libraries in the
OCaml ecosystem, notably Merlin. Since LSP has become relatively
standard in the editor world (Vim, Emacs, and, in fact, almost all
free editors I know can interact with an LSP server), the plan is to
deprecate the Merlin server, moving entirely to LSP, making Merlin a
low-level library that provides tooling used by LSP. This is one of
the projects the Editor team at Tarides (which I'm part of) is
working on: making ocaml-lsp feature-compatible with Merlin's
historic server to reduce maintenance for alternative clients (Emacs
and Vim), only worrying about OCaml-specific requests and actions
(which, logically, are not part of the protocol).
Currently, the OCaml platform for Visual Studio Code and OCaml-eglot
are the two canonical implementations (which extend the LSP protocol
for OCaml), respectively for VSCode and Emacs. We are currently
considering the implementation of a NeoVim plugin.
A bit like with Dune, in my opinion, the tooling state is excellent,
and the roadmap is motivating! However, since this is my work, I'm
probably biased.
Odoc, the documentation generator
OCaml is distributed with a documentation generator, the venerable
OCamldoc; however, it is no longer recommended by/for the community.
Indeed, the tool being promoted is Odoc, a new tool that exists
outside the compiler and offers several very interesting features:
* a rich markup language, supporting cross-references
* the ability to write "manual" pages, ephemeral, while still
benefiting from cross-references
* very good integration with Dune
* a type-based search bar (implemented via Sherlodoc)
* inclusion of source code (written in the documentation or
documented modules)
* implementation of drivers allowing the generation of large sets
of documentation (used to implement the documentation of all
packages on OPAM)
* support for doctest via mdx
Even though the look and feel of documentation generated by Odoc is,
in my view, far superior to that produced by OCamldoc, there is still
(once again, in my view) a bit of work needed on the UI for the tool
to be truly perfect!
I clearly have a certain fondness for the documentation of the Elixir
language, HexDoc (in terms of design and features), and personally, I
would like OCaml to move toward that example. However, it must be
acknowledged that the documentation generated by Odoc is superior to
that of many other languages. Moreover, due to the highly modular
nature of the language, a good documentation generator that
effectively supports cross-references is quite an achievement!
Available libraries
We have seen that the language is cool, and that it has tooling
which, although still evolving, is effective and pleasant to use.
Could its lack of popularity be due to too limited a set of
libraries? To be completely honest, I don't know. What I do know is
that whenever I have had to write OCaml projects, both professional
and personal, I have often found everything I needed in the package
list. I think the reasons why OCaml is mature enough for many typical
projects can be summarized in several points:
* Companies like Lexifi and Janestreet have strongly contributed to
the ecosystem by releasing many libraries necessary for their
daily use.
* Ambitious research projects, such as, in the case of the Web,
Ocsigen, used industrially in the BeSport project, have generated
a collection of useful libraries.
* As mentioned earlier, MirageOS, with its Clean Slate approach,
naturally produced many robust libraries.
* Like in popular languages such as JavaScript or Rust, motivated
contributors have provided excellent libraries.
* The language is old and has been used industrially for a long
time.
For my part, I have sometimes re-created libraries for the pleasure
of reinventing the wheel, but also, at times, to offer an alternative
interface. Moreover, OCaml allows interfacing with, among other
languages, C, enabling the creation of bindings for a large number of
libraries and tools. However, if there is a library that you find
objectively missing, I encourage you to join the community.
It is important to note that my use of OCaml has focused primarily on
three areas:
* Web development (heavily driven by Mirage, Ocsigen, and
independent projects like Dream, YOCaml, and many others)
* Blockchain development and, by extension, the use of cryptography
libraries, provided once again by Mirage, as well as by the HACL*
project, a formally verified library written in F* and extracted
to OCaml
* Development of Merlin and OCaml-LSP
All these areas still require good testing tooling, and OCaml offers
several complementary libraries to implement robust test suites.
Indeed, within the OCaml ecosystem, you can find tools to write
doctests, classic unit tests, property-based tests, fuzzing, as well
as output observation tests, inline tests (which allow testing, among
other things, private components), and cram tests.
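To illustrate the idea behind property-based tests without depending on any particular library, here is a hand-rolled sketch using only the standard library. The functions random_int_list and check are hypothetical; a real project would rather rely on a dedicated package from the ecosystem:

```ocaml
(* Hand-rolled generator: a list of fewer than 20 small integers. *)
let random_int_list () =
  List.init (Random.int 20) (fun _ -> Random.int 1000)

(* Run a property on [count] random inputs and return the first
   counterexample found, if any. *)
let check ~count property =
  let rec loop i =
    if i >= count then None
    else
      let input = random_int_list () in
      if property input then loop (i + 1) else Some input
  in
  loop 0

(* A property that holds for every list. *)
let rev_involutive l = List.rev (List.rev l) = l

(* A deliberately false property, to see a counterexample. *)
let rev_is_identity l = List.rev l = l

let () =
  assert (check ~count:100 rev_involutive = None);
  match check ~count:100 rev_is_identity with
  | Some l ->
    Printf.printf "counterexample of length %d found\n" (List.length l)
  | None -> print_endline "no counterexample found"
```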
I continue to find everything I need among the available packages,
and I'm still very impressed to see the number of packages and
alternatives grow year after year. Of course, there are some gaps,
but they have not invalidated my choice of OCaml.
Side note on the standard library
A recurring criticism of OCaml is the modesty of its standard
library. Historically, it was designed only to implement the language
itself, so it didn't include certain features useful for end users.
This situation has led to the emergence of alternative standard
libraries, the most popular of which are:
* Batteries, an alternative to the standard library that is
somewhat dated. Historically, it was a fork of Extlib.
* Base, an alternative developed by Janestreet, used quite
extensively in the book Real World OCaml. The library enforces
strong conventions, such as labeling higher-order functions
(typically with the name f).
* Core is an extension of Base.
* Containers is an extension of the standard library (in the sense
that open Containers at the beginning of a module does not break
code written with the standard library).
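The labeling convention mentioned for Base can be illustrated without installing anything, using the standard library's own ListLabels module as a stand-in (this is an analogy, not Base's API):

```ocaml
(* Unlabelled standard library style: the function comes first. *)
let doubled = List.map (fun x -> x * 2) [ 1; 2; 3 ]

(* Labelled style: the higher-order argument is passed as ~f. *)
let doubled' = ListLabels.map ~f:(fun x -> x * 2) [ 1; 2; 3 ]

(* With labels, the data can be given first, which reads well when the
   function body is long. *)
let total =
  ListLabels.fold_left [ 1; 2; 3 ] ~init:0 ~f:(fun acc x -> acc + x)

let () =
  assert (doubled = doubled');
  assert (total = 6)
```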
In addition to these alternative standard libraries, there are
specialized libraries that address general problems, such as Bos,
which provides tools to interact with an operating system, and
Preface -- shameless plug -- which allows you to realize abstractions
from category theory.
The stance of the maintainers on the standard library has evolved
over the years, and it is now possible to consider extending it.
However, additions to the standard library are often subject to
debate, and adding new modules can sometimes take a long time.
Personally, I would have preferred that the standard library continue
to serve only the development of the language and that a library
under the OCaml community umbrella be published. This separation
allows the releases of the language and its standard library to be
desynchronized and also likely simplifies compatibility between the
library and the language.
Ecosystem Conclusion
Unfortunately, I don't have the opportunity to cover all the tools of
the platform, nor the fundamental building blocks that make OCaml
enjoyable to use for personal projects as well as for industrial
projects (for example, the various existing debuggers). However, I
hope I've been able to give an overview of some tools that form a
solid foundation for using OCaml.
In my use of the language, I've sometimes had to build my own
library; however, it's not an exercise I regret. I think,
unfortunately, that if one decides never to use a language just
because 100% of the necessary libraries aren't available, it feels
to me -- perhaps awkwardly -- like leveling down, trapping us behind
languages backed by wealthy companies, like Java or C#, and that's a
bit sad.
On the community
Even though I've used many different programming languages, I think
OCaml is the only one with which I've had strong community
interaction. So, I'm not fully aware of how things work in other
communities, which makes my feedback somewhat irrelevant. But from my
experience, I find that the OCaml community, besides being very
productive, is:
* Very accessible: Like many other languages, OCaml has a strong
online presence. On these platforms, you can find highly
experienced contributors to the language and its ecosystem and
benefit from expert (or sometimes less technical) advice. I'd
like to give a special mention to Gabriel Scherer and Florian
Angeletti, whose answers are always thoughtful and interesting.
* Very kind: I often need to ask for help, and I've always received
clear and precise answers, whether in private or in public.
* Very brilliant: OCaml is the product of work by brilliant
researchers, and having the chance to interact with them is
incredible (and potentially a bit intimidating). Being able to
ask questions directly to people behind some of the major
discoveries in language design is a fantastic opportunity.
To conclude on the community aspect, even though I'm not fully aware
of how other communities interact, I find it a pleasure to be part of
the OCaml developer community. It's a welcoming space, conducive to
sharing and learning.
Some myths about OCaml
I'm finally reaching the most fun part of this overly long article: I
get to debunk some persistent myths about OCaml. I still can't
promise complete objectivity, but know that my intentions are good.
On the internet, you often see various criticisms or remarks about
OCaml, and I often find it tiresome to respond. However, what better
way than an article meant to share my enthusiasm for the language to
take the time to address some of these critiques and try to provide a
response?
I've selected a few, but it's likely that in the future I'll write
somewhat longer articles -- similar to the members of HeyPlzLookAtMe
(fr) -- about articles I find unfair.
OCaml and F#
F# is a programming language historically very inspired by OCaml that
runs on the .NET platform (and, de facto, integrates very well with
C#). I find the language -- which I have professionally used at
DernierCri and D-Edge -- very pleasant. Historically, since .NET was
exclusively for Windows environments, OCaml didn't suffer much by
comparison. However, since the arrival of .NET Core, a cross-platform
implementation of .NET, I increasingly see statements on the internet
like:
"Why continue using OCaml when you can have the same language,
F#, with the entire .NET ecosystem, more features, and a syntax
that's more pleasant to use?"
First, I do think that having the .NET (Core) ecosystem is a huge
advantage. Regarding the syntax, I'm more reserved. Indeed, I find
that indentation-based syntax sometimes makes moving code around more
cumbersome, and even though there are criticisms of OCaml's syntax, I
must admit it hasn't let me down. The last point seems a bit more
insidious. Indeed, F# has been equipped with features not present in
OCaml, for example:
* Computation expressions (which are syntactically a more general
form than binding operators)
* Type providers (which can, unfortunately, sometimes cause issues
with .NET Core in certain name/path resolution cases)
* Active patterns
* Statically resolved type parameters
* The ability to assign methods to sums and products, which makes
sense for interoperability reasons but significantly breaks type
inference
* And probably other features that I don't know well (or are linked
to interoperability with the .NET platform, notably reflection)
These features arrived gradually in the language. It would be naive
to think that OCaml hasn't evolved as well. Moreover, although the
two languages historically seemed very similar, they already differed
in two fundamental ways from F#'s very beginning:
* No module language. The module keyword exists in F#, but it is
only used to describe static classes (and it integrates rather
awkwardly with namespaces).
* A drastically different object model (for interoperability with
C#, of course).
These two differences alone are enough to consider OCaml and F# as
cousin languages that are nonetheless very different, and in my
opinion they strongly justify preferring one over the other. In my
case, I prefer OCaml, which makes the quote that opens this section
moot. However, like F#, OCaml has also evolved, and in addition to
these two fundamental differences, OCaml offers many features that
are absent in F#:
* Local and generalized opens: In OCaml, you can open a module
locally within a scope, whereas in F# you can only open a module
at the top-level, which can be quite frustrating in some cases.
* Row polymorphism: OCaml supports row polymorphism on products
(via objects) and sums (via polymorphic variants).
* Generalized Algebraic Data Types (GADTs): One of the most missed
features (after the module system) for expressing precise type
constraints.
* User-defined effects: OCaml allows defining custom effect
handlers, which can simplify complex control flow and concurrency
patterns.
* Open sums: Extensible variants allow for sum types that can be
extended, though similar behavior can sometimes be simulated
using objects and inheritance.
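To make some of these features more concrete, here is a minimal
sketch (not taken from any real codebase) that touches local opens,
polymorphic variants, GADTs, and extensible variants. All the names
(doubled, describe, expr, eval, shape, area) are hypothetical and only
serve the illustration. User-defined effects are left out because
they require OCaml 5.

```ocaml
(* Local open: List's functions are in scope only inside the parentheses. *)
let doubled = List.(map (fun x -> x * 2) (filter (fun x -> x > 1) [ 1; 2; 3 ]))

(* Row polymorphism on sums: no type declaration is needed, and the
   inferred type accepts any variant that has at least these tags. *)
let describe = function
  | `Ok -> "ok"
  | `Error msg -> "error: " ^ msg
  | _ -> "unknown"

(* GADT: the type index records what each expression evaluates to. *)
type _ expr =
  | Int : int -> int expr
  | Bool : bool -> bool expr
  | If : bool expr * 'a expr * 'a expr -> 'a expr

(* The locally abstract type [a] lets each branch return a different type. *)
let rec eval : type a. a expr -> a = function
  | Int n -> n
  | Bool b -> b
  | If (c, t, e) -> if eval c then eval t else eval e

(* Open sum: constructors are added after the type is declared. *)
type shape = ..
type shape += Circle of float
type shape += Square of float

(* An extensible variant can never be matched exhaustively, hence the
   catch-all case. *)
let area = function
  | Circle r -> Some (Float.pi *. r *. r)
  | Square s -> Some (s *. s)
  | _ -> None
```

Note that eval needs no runtime tag checks or casts: an ill-typed
expression such as If (Int 1, ...) is rejected at compile time.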
To conclude, even though F# is a really nice language and using it
brings many advantages (notably the .NET platform), it is not just a
better version of OCaml. The two languages are very different, and
from my point of view, OCaml has a more sophisticated type system,
which makes me prefer it over F#. In my opinion, saying that F# is
just a prettier OCaml is as reasonable as saying that Kotlin is
nothing more than Scala with a lighter syntax.
Doubled operators for floats
The standard library contains the following arithmetic operators on
integers:
val ( + ) : int -> int -> int
val ( - ) : int -> int -> int
val ( * ) : int -> int -> int
But also arithmetic operators for floating point numbers:
val ( +. ) : float -> float -> float
val ( -. ) : float -> float -> float
val ( *. ) : float -> float -> float
At first glance, this may seem confusing. However, it makes perfect
sense. If we wanted to have generic operators, we would need ad-hoc
polymorphism, like in Haskell, for example, where arithmetic
operators reside in the Num type class:
class Num a where
  -- more code
  (+), (-), (*) :: a -> a -> a
  -- more code
Without some form of ad-hoc polymorphism (via classes, traits, or
implicits) to describe a constraint on our operators, e.g., op :: Num
a => a -> a -> a, what can we do? A suggestion I've often seen online
is to use the same trick as with the = operator, whose type is val
(=) : 'a -> 'a -> bool. That doesn't work, because while we can hope
that everything is comparable (at worst, we can return false), how
can we generalize something like addition?
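A small sketch may help to see why the trick does not transfer. The
names first and second are hypothetical. Polymorphic equality can be
implemented once for all types because it only walks the runtime
representation of its arguments, whereas a function of type
'a -> 'a -> 'a knows nothing about 'a and therefore cannot compute
anything new from its arguments:

```ocaml
(* Structural equality works uniformly on first-order values. *)
let same_ints = (1 = 1)
let same_lists = ([ "a"; "b" ] = [ "a"; "b" ])

(* Setting aside exceptions and non-termination, these are the only two
   behaviours available to a function of type 'a -> 'a -> 'a: return
   the first argument or return the second one. Neither is addition. *)
let first : 'a -> 'a -> 'a = fun x _ -> x
let second : 'a -> 'a -> 'a = fun _ y -> y
```

In other words, a fully polymorphic ( + ) has no sensible
implementation without some way to select a per-type definition,
which is precisely what ad-hoc polymorphism provides.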
Support for arithmetic operators is a tricky problem, which is
actually the original motivation behind type classes (and the reason
for statically resolved type parameters in F#). From my perspective,
while waiting for modular implicits, duplicating operators to work
with integers and floats seems like a reasonable approach. And if,
for some strange reason, suffixing operators with dots when using
floats gives you hives, you can avoid it using local opens by
providing, for example, this module:
module Arithmetic (P : sig
  type t
  val add : t -> t -> t
  val sub : t -> t -> t
  val mul : t -> t -> t
  val div : t -> t -> t
end) =
struct
  let ( + ), ( - ), ( * ), ( / ) = P.(add, sub, mul, div)
end
Which allows extending the Int and Float modules (which already
provide the functions add, sub, mul, and div) by giving them
arithmetic operators:
module Int = struct
  include Int
  include Arithmetic (Int)
end
In broad terms, we create an Int module, include the previous Int
module so that our new Int module retains the entire API of the
original Int module, and then we define (and include) our arithmetic
operators. We can now repeat the same process with Float:
module Float = struct
  include Float
  include Arithmetic (Float)
end
And now we can use a local open so that we don't have to suffix our
operators with dots:
let x = Int.(1 + 2 + 3 + (4 * 6 / 7))
let y = Float.(1.3 + 2.5 + 3.1 + (4.6 * 6.8 / 7.9))
From my point of view, even if this can be confusing for those coming
from languages where this isn't an issue, it's a minor problem. The
lack of operator overloading seems like a rather weak argument for
not giving a language a chance -- but that's just my humble opinion.
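For code that must be generic over the numeric type, and not only
prettier at the call site, one option available today is to pass the
module explicitly as a first-class module. This is, roughly, the
bookkeeping that modular implicits would automate. The sketch below
is only an illustration: NUM, sum, total_int, and total_float are
hypothetical names, and it relies on the zero and add values that the
standard Int and Float modules provide.

```ocaml
(* The operations the generic code needs from a numeric module. *)
module type NUM = sig
  type t
  val zero : t
  val add : t -> t -> t
end

(* The "instance" is passed by hand as a first-class module. *)
let sum (type a) (module N : NUM with type t = a) (xs : a list) : a =
  List.fold_left N.add N.zero xs

let total_int = sum (module Int) [ 1; 2; 3 ]
let total_float = sum (module Float) [ 1.5; 2.5 ]
```

The cost is the extra (module ...) argument at every call site, which
is why duplicated operators remain the lighter choice for everyday
arithmetic.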
On the separation between ml and mli
Another point that generates a lot of discussion (even recently)
concerns the separation between ml and mli files. Personally, I find
it great. Even if it can introduce a bit of repetition, it allows me
to focus on the API via module encapsulation in the mli file while
also adding documentation. I can organize the functions I expose in
any order I like, and naturally, I can abstract the types I share as
much as possible. Moreover, when I look at an implementation, the ml
code is rarely cluttered with documentation, making it easy to
navigate the different elements of the module. On top of that, it
enables separate compilation and prevents recompiling modules that
depend on other modules whose implementation alone was changed during
development (this is Dune's default behavior in the dev profile).
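To keep the example in a single runnable snippet, the sketch below
stands in for a pair of files: the module type plays the role of a
hypothetical counter.mli and the structure plays the role of
counter.ml. The names COUNTER, Counter, helper, and two are all
assumptions made for the illustration.

```ocaml
(* What would live in counter.mli: the documented, abstract API. *)
module type COUNTER = sig
  type t
  (** An abstract counter; its representation is hidden. *)

  val zero : t
  (** The initial counter. *)

  val incr : t -> t
  (** [incr c] is [c] increased by one. *)

  val to_int : t -> int
  (** [to_int c] reads the current value. *)
end

(* What would live in counter.ml: the implementation, free of docs. *)
module Counter : COUNTER = struct
  type t = int

  let zero = 0
  let helper n = n + 1 (* absent from the interface, so not exported *)
  let incr = helper
  let to_int c = c
end

let two = Counter.(to_int (incr (incr zero)))
```

Outside of Counter, the fact that t is an int is invisible, so
Counter.zero + 1 does not type-check and the representation can
change without breaking clients.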
However, tastes vary, and when exposing complex types or module
types, this repetition can be annoying. Fortunately, there is a
trick, presented in 2020 by Craig Ferguson, that helps mitigate this
repetition: The _intf_ trick.
Additionally, there are small tricks based on the ability to pass
arbitrary module expressions to the open and include primitives,
which sometimes allow you to do without mli. I had already mentioned
this in the article OCaml, modules and import schemes.
Encapsulation without mli
You can simply use open struct (* private code *) end to avoid
exporting parts of your code without needing interfaces. For example:
open struct
  (* Private API *)
  let f x = x
  let g = _some_private_stuff
end

(* Public API *)
let a = f 10
let b = g + 11
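Here is a runnable variant of the same idea, wrapped in a module so
that the effect on the exported signature can be observed. The names
(Config, default_port, with_offset, http_port, admin_port) are
hypothetical, and open struct assumes OCaml 4.08 or later.

```ocaml
module Config = struct
  open struct
    (* Private: usable below, but absent from Config's signature. *)
    let default_port = 8080
    let with_offset n = default_port + n
  end

  (* Public API *)
  let http_port = with_offset 0
  let admin_port = with_offset 1
end

let ports = Config.(http_port, admin_port)

(* Config.default_port would be rejected with an "Unbound value" error. *)
```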
Expressing the interface from ml
Another similar technique is to use include (struct ... end : sig (*
public API *) end) to describe both the structure and the interface
in the same file. For example:
include (struct
  type t = int
  let f x = x
  let g = _some_private_stuff
end : sig
  type t
  val f : int -> t
end)
This way, the signature and the structure live in the same file,
while still allowing precise control over encapsulation. Another
approach would be to put the signature in a dedicated module type,
like this:
module type S = sig
  type t
  val f : int -> t
end

include (struct
  type t = int
  let f x = x
  let g = _some_private_stuff
end : S)
This is very similar to the first approach, except that the module
also exposes the module type S. A useful side effect of this leak is
that you can easily reference the module's signature using My_mod.S
instead of having to write module type of My_mod.
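As a sketch of that side effect, the snippet below simulates the file
my_mod.ml with a submodule and then reuses My_mod.S to constrain a
second implementation. The show function and the Other module are
hypothetical additions, there only so that the abstract values can be
observed.

```ocaml
(* Stands in for the contents of a file named my_mod.ml. *)
module My_mod = struct
  module type S = sig
    type t
    val f : int -> t
    val show : t -> string
  end

  include (struct
    type t = int
    let f x = x
    let show = string_of_int
  end : S)
end

(* The signature can be referenced by name from the outside. *)
module Other : My_mod.S = struct
  type t = string
  let f = string_of_int
  let show s = s
end

let a = My_mod.(show (f 42))
let b = Other.(show (f 42))
```

Both modules hide their representation of t behind the same
signature, so clients can swap one for the other without changes.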
To conclude on separation
I find this separation very desirable. However, since OCaml's module
system is highly expressive, it is possible -- through some clever
encoding -- to work around this separation. From my point of view,
these approaches mainly serve to demonstrate this expressiveness,
because the downside of merging everything in one file is the loss of
separate compilation, which I consider quite unfortunate.
Conclusion
I think I have briefly covered the points I wanted to discuss. From
my perspective, OCaml is an amazing language! It offers an excellent
balance between safety and expressiveness, thanks in particular to
its advanced type system, a rich module language, objects, support
for row polymorphism via objects and polymorphic variants, and
user-defined effects! Its position at the intersection of research
and industry makes it, in my view, a language evolving in the right
direction, carefully integrating new features to stay modern without
suffering the pitfalls of too-rapid or untested adoption.
Even though for several years OCaml's tooling might have seemed a
bit... dusty, recently, thanks in part to commercial support from
certain companies, the tooling has been drastically modernized and
continues to improve, as shown by the platform roadmap. Additionally,
the growing ecosystem of libraries makes it possible to use OCaml in
a wide range of contexts, notably thanks to its different compilation
targets (for example, the browser via js_of_ocaml and wasm_of_ocaml).
By combining an expressive language with a versatile ecosystem and a
supportive, responsive community, OCaml becomes a very compelling
choice for both personal and professional projects. Clearly,
migrating an entire codebase to OCaml is probably not a pragmatic
move, but if you have small personal projects in mind and are curious
about (and entertained by) programming languages, I seriously
encourage you to consider OCaml!
I hope I've managed to convey my enthusiasm for this language (and
its ecosystem). If you'd like to discuss it, find projects, or
explore contribution opportunities, I'd be happy to talk with you --
or you can reach out to the community through the forum, which is
active, responsive, and welcoming!
Diffusion
The generator's source code is released under the MIT license and the
content is released (unless otherwise mentioned) under the CC BY-SA
license.