https://surfingcomplexity.blog/2018/12/24/tla-is-hard-to-learn/

Skip to content
[c690634d34]

Surfing Complexity

Lorin Hochstein's ramblings about software and systems.

TLA+ is hard to learn

Lorin Hochstein software, systems December 24, 2018December 24, 2018 
4 Minutes

I'm a fan of the formal specification language TLA+. With TLA+, you
can build models of programs or systems, which helps to reason about
their behavior.

TLA+ is particularly useful for reasoning about the behavior of
multithread programs and distributed systems. By requiring you to
specify behavior explicitly, it forces you to think about
interleaving of events that you might not otherwise have considered.

The user base of TLA+ is quite small. I think one of the reasons that
TLA+ isn't very popular is that it's difficult to learn. I think
there are at least three concepts you need for TLA+ that give new
users trouble:

  * The universe as a state machine
  * Modeling programs and systems with math
  * Mathematical syntax

The universe as state machine

TLA+ uses a state machine model. It treats the universe as a
collection of variables whose values vary over time.

A state machine in sense that TLA+ uses it is similar, but not
exactly the same, as the finite state machines that software
engineers are used to. In particular:

  * A state machine in the TLA+ sense can have an infinite number of
    states.
  * When software engineers think about state machines, they think
    about a specific object or component being implemented as a
    finite state machine. In TLA+, everything is modeled as a state
    machine.

The state machine view of systems will feel familiar if you have a
background in physics, because physicists use the same approach for
system modeling: they define a state variable that evolves over time.
If you squint, a TLA+ specification looks identical to a system of
first-order differential equations, and associated boundary
conditions. But, for the average software engineer, the notion of an
entire system as an evolving state variable is a new way of thinking.

The state machine approach requires a set of concepts that you need
to understand. In particular, you need to understand behaviors, which
requires that you understand states, steps, and actions. Steps can
stutter, and actions may or may not be enabled. For example, here's
the definition of "enabled" (I'm writing this from memory):

    An action a is enabled for a state s if there exists a state t
    such that a is true for the step s-t.

It took me a long time to internalize these concepts to the point
where I could just write that out without consulting a source. For a
newcomer, who wants to get up and running as quickly as possible,
each new concept that requires effort to understand decreases the
likelihood of adoption.

Modeling programs and systems with math

One of the commonalities across engineering disciplines is that they
all work with mathematical models. These models are abstractions,
objects that are simplified versions of the artifacts that we intend
to build. That's one of the thing that attracts me about TLA+, it's
modeling for software engineering.

A mechanical engineer is never going to confuse the finite element
model they have constructed on a computer with the physical artifact
that they are building. Unfortunately, we software engineers aren't
so lucky. Our models superficially resemble the artifacts we build (a
TLA+ model and a computer program both look like source code!). But
models aren't programs: a model is a completely different beast, and
that trips people up.

Here's a metaphor: You can think of writing a program as akin to
painting, in that both are additive work: You start with nothing and
you do work by adding content (statements to your program, paint to a
canvas).

The simplest program, equivalent to an empty canvas, is one that
doesn't do anything at all. On Unix systems, there's a program called
true which does nothing but terminate successfully. You can implement
this in shell as an empty file. (Incidentally, AT&T has copyrighted
this implementation).

By contrast, when you implement a model, you do the work by adding
constraints on the behavior of the state variables. It's more like
sculpting, where you start with everything, and then you chip away at
it until you end up with what you want.

The simplest model, the one with no constraints at all, allows
all possible behaviors. Where the simplest computer program does
nothing, the simplest model does (really allows) everything. The work
of modeling is adding constraints to the possible behaviors such that
the model only describes the behaviors we are interested in.

When we write ordinary programs, the only kind of mistake we can
really make is a bug, writing a program that doesn't do what it's
supposed to. When we write a model of a program, we can also make
that kind of mistake. But, we can make another kind of mistake, where
our model allows some behavior that would never actually happen in
the real world, or isn't even physically possible in the real world.

Engineers and physicists understand this kind of mistake, where a
mathematical model permits a behavior that isn't possible in the real
world. For example, electrical engineers talk about causal filters,
which are filters whose outputs only depend on the past and present,
not the future. You might ask why you even need a word for this,
since it's not possible to build a non-causal physical device. But
it's possible, and even useful, to describe non-causal filters
mathematically. And, indeed, it turns out that filters that perfectly
block out a range of frequencies, are not causal.

For a new TLA+ user who doesn't understand the distinction between
models and programs, this kind of mistake is inconceivable, since it
can't happen when writing a regular program. Creating non-causal
specifications (the software folks use the term "machine-closed"
instead of "causal") is not a typical error for new users, but
underspecifying the behavior some variable of interest is very
common.

Mathematical syntax

Many elements of TLA+ are taken directly from mathematics and logic.
For software engineers used to programming language syntax, these can
be confusing at first. If you haven't studied predicate logic before,
the universal ([?]) and extensional ([?]) quantifiers will be new.

I don't think TLA+'s syntax, by itself, is a significant obstacle to
adoption: software engineers pick up new languages with unfamiliar
syntax all of the time. The real difficulty is in understanding
TLA+'s notion of a state machine, and that modeling is describing a
computer program as permitted behaviors of a state machine. The new
syntax is just one more hurdle.

Share this:

  * Twitter
  * Facebook
  * 

Like this:

Like Loading...

  * Tagged
  * tlaplus

[c690634d34]

Published by Lorin Hochstein

View all posts by Lorin Hochstein

Published December 24, 2018December 24, 2018

Post navigation

Previous Post Why we will forever suffer from missing timeouts, TTLs,
and queue size bounds
Next Post Inductive invariants

3 thoughts on "TLA+ is hard to learn"

 1. [bf0ce] Jorge Aranda says:
    December 25, 2018 at 12:18 am

    Alright, fine, what's the best way to start learning about TLA+?

    Reply
     1. [c6906] Lorin says:
        December 25, 2018 at 1:17 am

        Practical TLA+ by Hillel Wayne: https://www.apress.com/us/
        book/9781484238288

        Hillel also has this website: https://learntla.com

        Reply
         1. [bf0ce] Jorge Aranda says:
            December 26, 2018 at 6:21 pm

            Thank you!

Leave a Reply Cancel reply

Enter your comment here...
[                    ]

Fill in your details below or click an icon to log in:

  *  
  *  
  *  
  *  

Gravatar
Email (required) (Address never made public)
[                    ]
Name (required)
[                    ]
Website
[                    ]
WordPress.com Logo

You are commenting using your WordPress.com account. ( Log Out / 
Change )

Google photo

You are commenting using your Google account. ( Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. ( Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. ( Log Out /  Change )

Cancel

Connecting to %s

[ ] Notify me of new comments via email.

[ ] Notify me of new posts via email.

[Post Comment] 

[                                             ]
[                                             ]
[                                             ]
[                                             ]
[                                             ]
[                                             ]
[                                             ]
[                                             ]
Search for: [                    ] [Search]
Home page

lorinhochstein.org

Recent Posts

  * What do you work on, anyway?
  * Modernists trapped in a post-modern universe
  * Root cause of failure, root cause of success
  * Burned by 'let it burn'
  * StaffEng podcast

@norootcause on Twitter

  * @icfantv Yep! 7 hours ago
  * @RainofTerra Well done 7 hours ago
  * MTTSSBOOD: Mean time to screen shot becoming out of date
    17 hours ago
  * @TaliaRinger Student evaluations are an effective assessment of
    teaching ability. 18 hours ago
  * @vllry If you're really curious, here's a talk by my colleagues
    from a couple of years ago: youtube.com/watch?v=mEgvOf...
    18 hours ago

Follow @norootcause

Archives

  * August 2021
  * July 2021
  * June 2021
  * May 2021
  * April 2021
  * February 2021
  * January 2021
  * December 2020
  * September 2020
  * August 2020
  * July 2020
  * June 2020
  * May 2020
  * April 2020
  * March 2020
  * February 2020
  * January 2020
  * November 2019
  * October 2019
  * September 2019
  * August 2019
  * July 2019
  * June 2019
  * May 2019
  * March 2019
  * January 2019
  * December 2018
  * October 2017
  * June 2017
  * May 2017
  * February 2017
  * January 2017
  * July 2016
  * June 2016
  * October 2015
  * August 2015
  * June 2015
  * April 2015
  * February 2015
  * November 2014
  * October 2014
  * September 2014
  * August 2014
  * July 2014
  * June 2014
  * May 2014
  * April 2014
  * March 2014
  * November 2013
  * September 2013
  * August 2013
  * June 2013
  * May 2013
  * March 2013
  * February 2013
  * December 2012
  * November 2012
  * October 2012
  * August 2012
  * July 2012
  * April 2012
  * March 2012
  * February 2012
  * January 2012
  * December 2011

Categories

  * cognitive-systems-engineering
  * education
  * engineering
  * journalism
  * netflix
  * openstack
  * research
  * resilience
  * software
  * systems
  * Uncategorized

Meta

  * Register
  * Log in
  * Entries feed
  * Comments feed
  * WordPress.com

Blog at WordPress.com.
%d bloggers like this:

[b]