[HN Gopher] Formal Methods: Just Good Engineering Practice? (2024)
       ___________________________________________________________________
        
       Formal Methods: Just Good Engineering Practice? (2024)
        
       Author : aiono
       Score  : 141 points
       Date   : 2025-01-10 15:25 UTC (7 hours ago)
        
 (HTM) web link (brooker.co.za)
 (TXT) w3m dump (brooker.co.za)
        
       | ot wrote:
       | Previous discussion (Jun 2024):
       | https://news.ycombinator.com/item?id=40753989
        
         | aiono wrote:
         | Thanks, I couldn't find it via Google search on this website
         | for some reason.
        
           | onionisafruit wrote:
           | I'm glad you couldn't if that would have kept you from
           | posting it today.
        
             | aiono wrote:
             | Yeah I wasn't going to post it if it was shared before. But
             | I am happy that some are happy that it's reposted.
        
           | ot wrote:
           | Nothing wrong with reposts, it's just useful to link to
           | previous discussions for context :)
        
           | ChrisArchitect wrote:
           | Google search? Use the HN search on site here: https://hn.alg
           | olia.com/?q=https%3A%2F%2Fbrooker.co.za%2Fblog...
        
       | franktankbank wrote:
       | Too slow, planning==ossification any documentation can and will
       | be used against you in the Agile court of law.
        
         | intelVISA wrote:
         | Arguably, if 'real' Agile is ever found Formal Methods would be
         | the antithesis of it: something provable and reproducible is
         | blasphemy to True Believers.
        
           | AnimalMuppet wrote:
           | Not at all. Something _that you can 't easily change_ is
           | terrifying.
           | 
           | Reproducible? Sure, that's what unit tests are for. Make a
           | change, prove that you didn't break any behavior that anybody
           | relied on.
           | 
           | But if you have to do a three-month-long formal proof run
           | because the specification had a one-line change, then you're
           | not agile, under any meaningful definition.
           | 
           | (Where did three months come from? Thin air. I don't know how
           | long a true formal proof would take. Depends on how many
           | things you're proving, how long your spec is, and how much
           | CPU power you have. I would think, though, to formally prove
           | significant properties of a large code base would take a
           | significant time.)
        
             | franktankbank wrote:
             | Sorry, we are discussing Agile not agile.
        
       | constantcrying wrote:
       | Formal verification of software, as the article acknowledges,
       | relies heavily on the type of software and the development
       | process.
       | 
       | To use formal verification you need to have _formal requirements_
       | of the behavior of your software. Most software projects and
       | design philosophies are simply incompatible with this.
       | 
       | In software development and design can often fall together, but
       | that means that it is uniquely ill suited for formal methods. If
       | you are developing, before you are sure what you even want, then
       | formal methods do not apply.
       | 
       | But I agree that there are certain areas, mostly smaller, safety
       | critical, systems, which rely on upfront specifications, which
       | can heavily benefit from formal verification. E.g. Aerospace
       | software relies heavily on verification.
        
         | aiono wrote:
         | I think it's more formalisability than requiring upfront
         | design. For example you may have an insurance claim automation
         | system which you can't design upfront because most insurance
         | providers have unspecified behaviour. But that doesn't mean you
         | can refine your automation system as you get more information
         | from interactions. You would still get the benefit that you
         | ensure not leaving out any cases or not having any
         | contradiction in your system
        
           | constantcrying wrote:
           | But that still relies on having a prior notion of the formal
           | requirements of your system. I know little about insurance
           | systems, but deriving a formal specification seems like a
           | nightmare task. Although, as you mentioned, if you had a
           | partial one you certainly would get some benefits from it.
        
             | aiono wrote:
             | It's been a while since I worked on those systems, but you
             | usually decide on some rules that you refine over time. So
             | they are purely logical decisions that you can formalize. I
             | don't see why it would be hard to specify. I don't mean to
             | specify all up front but one can specify the exact decision
             | procedure that is implemented right now very easily in my
             | experience. Generally you have a state machine representing
             | the process.
             | 
             | This state machine is usually embedded in the code, but
             | code has a lot of noise around the state machine that makes
             | it harder to see the underlying state machine.
        
         | bassp wrote:
         | It's not all or nothing!
         | 
         | I work on a very "product-y" back end that isn't fully
         | specified, but I have formally specified _parts_ of it.
         | 
         | For instance, I property-based-tested a particularly nasty
         | state machine I wrote to ensure that, no matter what kind of
         | crazy input I called an endpoint with, the underlying state
         | machine never made any invalid transitions. None of the code
         | around the state machine has a formal spec, but because the
         | state machine does, I was able to specify it.
         | 
         | In the process, I found some very subtle bugs I'd have never
         | caught via traditional unit testing.
        
           | constantcrying wrote:
           | Completely agree, but obviously you relied on the state
           | machine being sufficiently separate and having a formal
           | specification for it.
           | 
           | State machines are quite common, in aerospace software, where
           | the requirements even specify the states and transitions. If
           | you can formally verify them, I believe, you absolutely
           | should, as often there can be a lot of subtlety going on
           | there, especially if you have a distributed system of state
           | machines sending messages to one another.
        
           | senderista wrote:
           | Property-based testing is also effective in finding bugs even
           | in the absence of any formal model. Basically you just need
           | to identify informal invariants ("properties"), encode them
           | as asserts, and then run tests with enough coverage to likely
           | find any invariant violations.
        
         | trending486 wrote:
         | You are going to have requirements whether you like it or not.
         | It doesn't matter if you discover them during requirements
         | engineering and validate and deconflict them on a simple text
         | document, or if you discover them as you go during coding
         | (possibly after having coded the wrong thing), or if the client
         | discovers them during the "sprint review" of some allegedly
         | agile cult. The only difference is how much money and time are
         | you willing to trade for being called "agile". Ironically the
         | traditional way of doing a requirements stage is the less
         | expensive of all three options. It is also the most aligned
         | with the original agile spirit, since it converges with the
         | client as soon as posible, at the point where changes are the
         | most cheap (changing a line on a text document).
        
         | agentultra wrote:
         | It hasn't been my experience that it is as niche as this. I
         | believe the "costs," people refer to in these discussions have
         | come way down over the last couple of decades. I've taught
         | developers how to use tools like TLA+ and Alloy in week.
         | 
         | It's not a skill that requires a PhD and years of research to
         | acquire these days.
         | 
         | Nor does writing a basic, high level specification.
         | 
         | If anything you will learn something about the system you are
         | modelling by using a model checker. And that can be useful even
         | if it is used for documentation or teaching.
         | 
         | The fundamental power of formal methods is that they force you
         | to think things through.
         | 
         | All too often I find software developers eager to believe that
         | they can implement concurrent algorithms armed with their own
         | wit, a type checker, and a smattering of unit tests. It can be
         | humbling to find errors in your design and assumptions after
         | using a model checker. And perhaps it's hubris that keeps
         | programmers from using such tools in more "mundane" and "real
         | world" contexts.
         | 
         | There are a lot more "small" distributed systems than you'd
         | expect and state spaces are generally larger than you'd
         | anticipate if you weren't open to formalizing your work.
        
           | constantcrying wrote:
           | I would say that the "nicheness" depends on how you treat
           | software. Your development process and software architecture
           | are engineering choices you make and these engineering
           | choices affect how well formal specification applies.
           | 
           | I didn't talk about "costs" or about "how hard" it is, but
           | that common practices in software development make using
           | formal methods infeasible. If you want to use formal
           | verification you likely have to change how you develop
           | software. In an environment where there is general
           | uncertainty about architecture and behavior, as is common in
           | agile environments, formal verification is difficult to
           | implement. By its nature, formal verification discourages
           | fast iteration.
        
             | staunton wrote:
             | > By its nature, formal verification discourages fast
             | iteration.
             | 
             | Not necessarily. You can develop the specification and
             | implementation in parallel. That way, at every point in
             | time, you could have a much more thorough and deep
             | understanding of what you're currently building, as well as
             | how exactly you decide to change it.
        
           | aleph_minus_one wrote:
           | > I've taught developers how to use tools like TLA+ and Alloy
           | in week.
           | 
           | TLA+:
           | 
           | - https://en.wikipedia.org/wiki/TLA%2B
           | 
           | - https://lamport.azurewebsites.net/tla/tla.html
           | 
           | Alloy:
           | 
           | - https://alloytools.org/
           | 
           | -
           | https://en.wikipedia.org/wiki/Alloy_(specification_language)
        
           | gte525u wrote:
           | Do you have any resources you could link to - for those that
           | are curious?
        
             | bassp wrote:
             | Not the OP, but Hillel Wayne's course/tutorial
             | (https://www.learntla.com/) is fantastic. It's focused on
             | building practical skills, and helped me build enough
             | competence to write a few (simple, but useful!) specs for
             | some of the systems I work on.
        
           | billfruit wrote:
           | Alloy is a brute force model checker (for rather small
           | models).
           | 
           | Is that the state of the art for formal methods? How do you
           | think of formally verifying systems with floating points
           | calculations, with randomness, with databases and external
           | environments?
        
             | agentultra wrote:
             | > Is that the state of the art for formal methods?
             | 
             | I think the state of the art is quite broad as there are
             | many tools.
             | 
             | Model checking is, in my estimation, the most useful for
             | industry programmers today. It doesn't require as much
             | training to use as automated theorem proving, it doesn't
             | provide the guarantees that mathematical induction and
             | proof do; but you get far more for your buck than from
             | testing alone.
             | 
             | However model checkers are very flexible because the
             | language they use, math, is also very flexible. TLA+ has
             | been used to find errors in the Xbox 360 memory controller
             | before it went to launch as well as in Amazon's S3 [0] --
             | errors so complex that it would have been highly improbable
             | that a human would detect them and the result of finding
             | those errors and solving them saved real businesses a lot
             | of money.
             | 
             | It comes down to your ability to pick good levels of
             | abstraction. You can start with a very high level
             | specification that admits no details about how the file
             | system works or the kernel syscalls involved and still get
             | value out of it. If those details _do_ matter there's a
             | strategy called _refinement_ where you can create another
             | specification that is more specialized and write an
             | expression that implies if the refined specification holds
             | then the original specification does as well.
             | 
             | Tools for randomness and numerical methods exist and
             | depending on your needs you may want to look at other tools
             | than _just_ a model checker. TLA+, for example, isn 't good
             | at modelling "amounts" as it creates a rather large state
             | space quickly; so users tend to create abstractions to
             | represent these things with a finite number of states.
             | 
             | [0] https://lamport.azurewebsites.net/tla/formal-methods-
             | amazon....
        
           | umvi wrote:
           | Seems like it would really slow you down though if you adopt
           | it too early. Sometimes you don't even know if the thing you
           | want to do is possible.
           | 
           | My development style is:
           | 
           | - prototype ideas in a quick and dirty dynamic typed language
           | just to gain confidence it's going to work (a lot of ideas
           | die here)
           | 
           | - rewrite in a more performant and safe language with a "type
           | checker and a smattering of unit tests" (usually here I'm
           | "done" and have moved onto the next idea/task. If there's an
           | issue I fix it and add another test case)
           | 
           | I'm trying to imagine where formal verification comes in. I'm
           | imagining something like:
           | 
           | - prototype ideas in a quick and dirty dynamic typed language
           | just to gain confidence it's going to work (a lot of ideas
           | die here)
           | 
           | - Formally model the requirements
           | 
           | - rewrite in a language that can be formally verified and
           | which is hopefully performant and lets me do things like simd
           | and/or cuda if needed
           | 
           | - Never have to fix a bug unless there was a bug in the
           | requirements (?)
           | 
           | To me, it just seems like it would take an order of magnitude
           | longer to develop things this way for not much benefit (I've
           | traded development time and potentially runtime performance
           | in exchange for correctness)
        
             | adsharma wrote:
             | https://adsharma.github.io/agentic-transpilers/
             | 
             | Long way to go. But there is a design.
             | 
             | Use z3/SMT instead of TLA+
        
           | pnathan wrote:
           | What's your recommendation on how to rapidly learn TLA+? I
           | spent some time staring at references and the UI a few months
           | ago and came away very defeated. But I'd like to actually
           | level up here.
        
           | ajmurmann wrote:
           | For distributed systems this makes sense. Most people aren't
           | writing distributed system components though but things where
           | the risk usually isn't technical like business software. I
           | worked on a project where I got into arguments with the PMs
           | because I pushed back on optimizing performance on the main
           | page of our pre-launch product. I argued that we don't have
           | any real feedback that the page is what is needed and the PM
           | thought I was insane for doubting the main page was needed.
           | We completely redesigned that page twice, deleted it entirely
           | and then had to bring it back because the sales team liked it
           | for initial demos.
           | 
           | Every process is a tradeoff and it anyways depends on your
           | specific circumstances which choice is best for your team and
           | project.
        
             | agentultra wrote:
             | It does depend a lot on circumstance and context.
             | 
             | Is it absolutely important that your system is correct? ...
             | begs the question, _correct with respect to what?_
             | Generally: a specification.
             | 
             | There are lots of situations where you don't know what your
             | system is supposed to do, where testing a few examples is
             | sufficient, or it's not terribly important that you know
             | that it does what you say it ought to. Generating a batch
             | report or storing some customer responses in a database?
             | Trust the framework, write a few tests if you need to,
             | nobody is going to find a formal specification valuable
             | here.
             | 
             | However, if you need to deploy configuration to a cluster
             | and need to ensure there is at least two nodes with the a
             | version of the configuration that matches the database in
             | the load balancer group at all times during the migration?
             | Need to make sure the migration always completes and never
             | leaves the cluster in a bad state?
             | 
             | Even smaller in scale: need to make sure that references in
             | your allocator don't contain addresses outside of your
             | memory pool? Need to make sure that all locks are
             | eventually released?
             | 
             | It's definitely much faster to iterate on a formal
             | specification first. A model checker executes your model
             | against the entire state space. If you're used to test-
             | driven development or working in a statically typed
             | language, this is useful feedback to get early on in the
             | design process.
             | 
             | What the scope is that is appropriate for using tools like
             | this is quite large and not as niche as some folks imply. I
             | don't do aerospace engineering but I've used TLA+ to model
             | deployment scripts and find bugs in OpenStack deployments,
             | as well as to simply learn that the design of certain async
             | libraries are sound.
             | 
             |  _Update_ : more examples.
        
         | aidenn0 wrote:
         | Everything you said about formal verification is also true
         | about tests; do you think software is also uniquely ill suited
         | for TDD?
        
           | constantcrying wrote:
           | I did not say that software is uniquely ill suited for formal
           | verification. That also would be total nonsense.
           | 
           | I said that certain philosophies of software design are
           | uniquely unsuited for formal verification.
           | 
           | Besides, tests and formal verification are different. A test
           | is essentially a formal specification for a single point or,
           | depending on how you test, multiple, potentially random,
           | points. Writing or changing a test is less time intensive
           | than writing or changing a full formal specification for an
           | entire subsystem, therefore tests are more suited to volatile
           | software development practices.
        
             | staunton wrote:
             | > therefore tests are more suited to volatile software
             | development practices.
             | 
             | And having _no_ tests is even more suited...
        
         | hinkley wrote:
         | I worked on a project where a couple people formally proved our
         | design sound while I was still fixing bugs in the spec. We
         | don't have spherical cows. Theoretical models aren't reality.
         | The map is not the territory.
        
         | billfruit wrote:
         | Dijkstra for example didn't think so, I think. He had a din
         | view of testing as compared with proofs for ensuring software
         | correctness.
        
         | pron wrote:
         | "Formal" means "written in a language that can be interpreted
         | by a computer" and that is the very thing programmers do.
         | Writing a code _is_ writing a formal specification of the
         | program 's behaviour, and by definition, every piece of
         | software must do that.
         | 
         | But you're right that to benefit from any formal methods, you
         | need to compare the program's behaviour to something else that
         | isn't the program itself, and that other thing also needs to be
         | written in a formal language. To be able to write that other
         | thing, you do need to have a precise understanding of what the
         | wanted behaviour should be so that you could express it in a
         | formal language, but it doesn't have to cover the full
         | behaviour of the software.
         | 
         | As an example of what that means, take automated (unit) tests.
         | Automated tests _are_ a formal specification, and running them
         | _is_ a formal verification method. Tests are a relatively weak
         | specification, and executing them is a relatively weak
         | verification, compared to what we normally mean by  "formal
         | methods", but there is no clear qualitative difference --
         | conceptual or even practical -- between tests and more powerful
         | formal specification and verification. You can think of these
         | practices as more powerful tests or as tests that work
         | differently, but tests _are_ formal specifications, and if they
         | 're applicable to a piece of software, it's likely that richer
         | formal specification methods are, too (learning the cost/benfit
         | of other methods is similar to learning the cost/benefit of
         | tests -- you don't get it right at first, but you learn along
         | the way).
        
           | akoboldfrying wrote:
           | >Automated tests are a formal specification, and running them
           | is a formal verification method.
           | 
           | This is a great way to describe it, and an important concept
           | to grasp.
           | 
           | Another kind of formal verification that exists between
           | "standard" unit tests and model checking/automated theorem
           | proving is property-based testing a la quickcheck: You have a
           | function you want to test (e.g., a sort), you describe the
           | properties of inputs ("a list of integers") and state the
           | properties you want to hold of the output ("every number
           | should appear the same number of times in the output as in
           | the input, and every number should be >= the number to its
           | left") and ask the system to generate and test lots of random
           | inputs for you. These properties can often be used more or
           | less directly in model checkers, which makes the checking
           | exhaustive (on some small, finite domain).
        
         | PhilipRoman wrote:
         | You can also use proof assistants as "unit tests" for entire
         | classes of behaviors rather than specific values. This lets you
         | add proofs incrementally even when a formal spec is too
         | difficult.
        
         | senderista wrote:
         | I'm not sure that "aerospace software relies heavily on
         | verification" is very commonly the case at all, even if one
         | would like it to be...
        
           | kmoser wrote:
           | What makes you say that? I'm pretty sure it's one of the more
           | heavily verified industries. If it wasn't, air travel would
           | be much more dangerous.
        
         | nickpsecurity wrote:
         | That's true. Back when they were inventing security, the
         | requirement for high-security systems (A1 under TCSEC) required
         | formal verification. Those who did it said its cost ranged from
         | not too burdensome to very high. What's the reason?
         | 
         | While many requirements existed, the design assurance required
         | specifying what problem was being solved, how it's being
         | solved, and the safety/security properties. Then, proving that
         | the security properties were embodied in the artifacts in all
         | states. That problem is difficult enough that most people who
         | succeeded were also geniuses good at programming and math.
         | Geniuses are in short supply.
         | 
         | The lessons learned papers also pointed out a recurring
         | problem. Each requirement, security policy, algorithm, etc
         | needed a mathematical specification. Then, proof techniques
         | were necessary that could handle each. Instead of developing
         | software features, you are now both developing software and
         | doing R&D on applied mathematics.
         | 
         | Steve Lipner's The Ethics of Perfection pointed out it took
         | several quarters to make a major change on a product. The
         | markets value development velocity over quality, even demand
         | side. So, that's already a no go for most businesses. Likewise,
         | the motivations of OSS developers work against high assurance
         | practices as well.
         | 
         | If you want formal verification, then you are forced to make
         | certain decisions. For one, you need to build simple systems
         | that are easy to verify, from design to language. Then, you use
         | the easiest, most-automated tools for the job. Most drop full
         | verification to do automated, partial checking: TLA+, Alloy,
         | SPARK Ada, Meta's Infer, etc.. If doing full verification, it's
         | better to make or build on reusable components like we see done
         | with CompCert and seL4. GEMSOS and VAX VMM advocated that back
         | in the day, too.
         | 
         | So, most software development isn't fit for use of formal
         | verification. I think we can default on a combo of contracts
         | (specs), automated testing, static analysis, and safe
         | languages. If distributed, stronger consistency when possible.
         | Then, use CI to run the verification on each commit. Some tools
         | can generate patches, too.
        
         | zozbot234 wrote:
         | Static type checking is a kind of formal verification of
         | software - there are formal requirements (the program doesn't
         | go "wrong" in a number of rigorously defined ways) that are
         | automatically checked. And you can certainly do "design and
         | development" together in type-safe languages.
        
       | commandlinefan wrote:
       | I see this line of reasoning about formal methods a lot: software
       | is big and complicated and hard to get right... therefore formal
       | methods.
       | 
       | On the one hand, I _want_ this to be true both for selfish and
       | practical reasons: selfishly because I'm very very good at
       | learning things that require an academic learning approach (read
       | a book, work some exercises, practice) and if something I'm good
       | at is important, that means more money for me. Practically
       | because they're right, software _is_ big and complicated and hard
       | to get right and as a practitioner, it's really frustrating when
       | it does fail and I'm scrambling to figure out why.
       | 
       | On the other hand, though, nobody ever seems to make a compelling
       | case for how formal methods proposes to solve that problem. This
       | author actually does better than most by pointing out how most
       | modern "design" is a waste of time but doesn't really go into why
       | TLA, say, is better than (demonstrably mostly useless) UML.
       | There's sort of an implied suggestion that if you dedicate a few
       | months (or years?) to learning TLA, you'll reach enlightenment
       | and understand how it's helpful in a way that's impossible to
       | articulate to the unenlightened. And that's not impossible to
       | imagine! Calculus and Bayesian statistics are kind of like that;
       | you need to really understand them before you can really see why
       | they're useful.
       | 
       | I always find myself left applying what I call "project manager"
       | logic - I need to make a snap decision right now about whether or
       | not I should invest time in this "formal method" stuff or not so
       | I have to be observational: if it was really that helpful, more
       | people would be applying it and the benefits would speak for
       | themselves. They've been around a long, long time, though, and
       | never caught on - it's hard not to conclude that there's probably
       | a reason.
        
         | hitchstory wrote:
         | >if it was really that helpful, more people would be applying
         | it and the benefits would speak for themselves. They've been
         | around a long, long time, though, and never caught on - it's
         | hard not to conclude that there's probably a reason.
         | 
         | My impression is that there are actually not that many business
         | domains where a large investment in time and money to get
         | domain logic correctness from 98% to 99.99% correct is actually
         | called for.
         | 
         | Formal methods _are_ a large investment, too. No two ways
         | around it.
         | 
         | Also, while they havent really caught on in general, some of
         | their ideas have made it into modern type systems.
        
         | t43562 wrote:
         | IMO UML is useless because whatever diagram gets produced means
         | different things to different people and it's very complex but
         | not check-able so people can make UML diagrams that are self
         | contradictory or nonsense.
         | 
         | I find myself using a "method" of some kind when faced with a
         | problem that's hard to think about. A communications protocol -
         | nice to have a state machine to describe it for example. TLA
         | obviously fits that niche even better. I've been lucky enough
         | not to have too many problems recently that felt like they
         | justified that effort but it's of incredible value when one
         | does. Domain Specific Languages - so much better to use a
         | parser framework than hand-code if you want to avoid all sorts
         | of undesirable outcomes.
         | 
         | Currently most of my rework comes from changes to the
         | requirements and our "customers" not really knowing what they
         | want till we give them something and they say "not that."
         | 
         | This is partly because the people asking for things don't fully
         | think out all the implications of what they're asking. It's
         | mostly about not having enough knowledge in one place to make
         | good decisions on.
        
         | bee_rider wrote:
         | I only am familiar with formal verification in the context of a
         | hardware class, which is like programming but the cost:benefit
         | is wildly different (can't fix a physical chip after it has
         | been fabricated very easily) and the types of designs are very
         | different.
         | 
         | But, the impression I got was that the rigors of the formal
         | verifier would sort of impose a limit on the complexity of the
         | design just based on... completing in a reasonable timeframe
         | and in a reasonable amount of memory space. Maybe the true
         | victory of demanding formal verification would be fixing:
         | 
         | > software is big and complicated and hard to get right
         | 
         | By making big complex programs a pain to work with, haha.
        
         | hinkley wrote:
         | If we're going to boil this frog, we need to steal wisdom from
         | TLA, not teach it. Type systems have borrowed a lot from
         | Hinley-Milner, and are themselves a formal, partial proof.
         | 
         | I think I'd like to see a descendent of Property Based Testing
         | that uses SAT or TLA techniques to rapidly condense the input
         | space in a repeatable fashion. We should be able to deduce
         | through parsing and code coverage that passing 12 to a function
         | can never follow different branches than 11, but that -1 or
         | 2^17 < n < 2^32 might.
        
         | sesm wrote:
         | For Calculus you can easily explain to the unenlightened that
         | many physical and engineering laws work in terms of speed and
         | acceleration of measurable values, and Calculus is a
         | mathematical framework for working with this.
        
       | thefaux wrote:
       | Most of the articles I've read about formal methods feel like
       | lead gen for consultants. That's fine but feels obnoxious when
       | they implicitly act as though they have reached formal methods
       | induced enlightenment that you can too if you buy a pack of
       | trainings for your employees/coworkers or hire me as an employee
       | to fix your bad (irresponsibly dangerous even!) programming
       | habits.
       | 
       | Get back to me when the formal methods actually generate high
       | quality code that cannot deviate from the spec.
        
         | erpellan wrote:
         | How about
         | https://en.wikipedia.org/wiki/SPARK_(programming_language)
        
           | lolinder wrote:
           | Ada is terminally underappreciated.
        
         | aiono wrote:
         | > Get back to me when the formal methods actually generate high
         | quality code that cannot deviate from the spec
         | 
         | That would be useful but there is a fundamental issue: code is
         | too specific. In formal specification you usually don't specify
         | that detailed but you specify the general behaviour of the
         | system. So usually the specification corresponds to a lot of
         | programs with subtle differences. That's why code a
         | documentation falls short: you don't know what is intentional
         | and what is just a random choice. It's simply too specific to
         | describe high level requirements.
         | 
         | The other way around (verification of program with respect to
         | the specification) can be more feasible to implement.
        
       | noelwelsh wrote:
       | I would guess a majority of developers use formal methods these
       | days. We just tend to call them type systems, and for some reason
       | consider them a distinct category. If simulations count as formal
       | methods, then tests, and particularly property-based testing,
       | also make the cut.
        
       | jcgrillo wrote:
       | The lightweight formal methods callout is a good one. Maintaining
       | a suite of proptest[1] strategies alongside the codebase is not a
       | very much larger investment than writing unit tests by hand, but
       | the insights they provide due to extensive coverage and compact
       | understandable failure cases are way better. And crucially this
       | approach _does_ align with normal software development practices.
       | 
       | [1] https://crates.io/crates/proptest
        
         | andrepd wrote:
         | Yeah but proptest / qcheck is not formal methods at all. It's
         | randomised testing.
        
           | jcgrillo wrote:
           | The assertion is that they're _lightweight_ formal methods.
           | Or is the article (and the proceedings of SOSP  '21 it links
           | to) wrong?
           | 
           | EDIT: ah I see where there might be confusion--obviously a
           | library for generating random test data and making assertions
           | about code under test itself doesn't constitute anything like
           | "formal methods" but the idea of _using_ that library in the
           | way described in the paper linked from the article does. But
           | that 's kinda always the thing about software libraries..
        
           | pfdietz wrote:
           | Property tests are informal tests of formal properties. They
           | don't guarantee the properties hold, or that the formal
           | properties are complete, but they exploit the existence of
           | these formal properties.
           | 
           | Once you have formal properties, you can write property-based
           | tests using them, and I wonder how much of the benefit of
           | formal methods could be obtained just by doing this. It's
           | another example of using increased computing power (testing)
           | to substitute for expensive hand labor (proving theorems).
           | 
           | I'll also observe that even theorem proving systems benefit
           | from a kind of property based testing. If there's a goal to
           | prove the existence of a value satisfying some property, this
           | is essentially a property based testing problem. Similarly,
           | find a counterexample to a universally quantified formula
           | (also an existential problem) can be used to prune off
           | unproductive branches of a search tree.
        
             | jcgrillo wrote:
             | There's something also in the UX dynamics. As a developer
             | writing property based tests I'm encouraged to think much
             | more carefully about system invariants, otherwise there's
             | not much value added over unit tests. For anything
             | nontrivial this entails building a model of the system and
             | checking it against the system under test, like they did at
             | AWS. So the decision to use this tool shapes how you think
             | about the system--it makes you reason more formally about
             | it rather than just winging it and writing tests to
             | exercise the code.
        
           | Tryk wrote:
           | With an infinite domain (e.g.numbers) randomised testing is
           | necessary, no?
        
         | jillesvangurp wrote:
         | > Maintaining a suite of proptest[1] strategies alongside the
         | codebase is not a very much larger investment than writing unit
         | tests by hand,
         | 
         | I actually generate a unit tests with LLMs a lot lately. They
         | do a decent job and you can just ask it to be a little more
         | exhaustive, test any edge cases it can think of, or instruct it
         | to deal with specific ones. I know a fair bit about how to
         | write good tests and the effort you can put in that. But LLMs
         | can generate better tests than me way faster.
         | 
         | If anything, they are less likely to do a half-assed job of it
         | than me because I tend to run out of patience doing repetitive
         | tedious shit. This is a healthy trait to have for a software
         | engineer: we are supposed to make stuff easier by automating
         | it. If it feels repetitive, your reflex should be to write code
         | to make it less repetitive. Documentation is the same. I
         | generate that too these days and since it is so easy, I do it
         | more often and sooner.
         | 
         | LLMs might trigger a minor revolution in the adoption of formal
         | method verification. Generating a correct specification is a
         | combination of tedious and probably relatively easy for an LLM
         | given enough context like working code, documentation, and
         | other hints as to what the thing should do.
         | 
         | I'd be a lot more likely to bother with that stuff if I can
         | just let it generate specification and then briefly glance
         | through them than if I have to spell everything out manually.
         | 
         | I think using Rust kind of signals that you care about
         | correctness. It's compiler is probably the closest to proving
         | your system is probably correct that you can get without
         | resorting to formal methods. And probably a lot easier than
         | trying to bolt on formal methods to languages that don't even
         | use a compiler or explicitly specified types.
        
       | trending486 wrote:
       | Modern formal methods like TLA+ and Alloy are as easy to pick up
       | as Python, and other than having to write a spec (an ultra-
       | simplified model of part of a system) they are completely
       | automatic (based on model checkers). There is no reason for a
       | modern software engineer not to have them on her radar. As a
       | matter of fact most of the cloud systems you are using everyday
       | have been verified with modern formal tools: Azure Cosmos DB,
       | Dynamo DB, MongoDB, CockroachDB, ... and many others.
        
         | tombert wrote:
         | I haven't done enough with Alloy to speak with any degree of
         | competence with it, so I'll only speak on TLA+.
         | 
         | While I do think TLA+ is relatively easy to pick up (especially
         | compared to Isabelle or Coq), and I think it's pretty awesome,
         | I'm hesitant to say it's as "easy to pick up as Python". You
         | need a basic understanding of set theory and state machines to
         | even get started, and more advanced concepts like algorithm
         | refinement to get into the really useful juicy stuff.
         | 
         | When I've tried to push TLA+ at work, the mathey syntax is
         | usually a non-starter for most of my engineering colleagues. I
         | don't think my coworkers are stupid by any stretch, but they
         | are decidedly uninterested in re-learning any discrete math (if
         | they ever learned it in the first place, which isn't a
         | guarantee). For an engineer LARPing as an academic like me,
         | TLA+'s notation isn't really hard at all, but I will often
         | forget that most engineers _only_ think in code.
        
           | Taikonerd wrote:
           | _> When I 've tried to push TLA+ at work, the mathey syntax
           | is usually a non-starter_
           | 
           | Would they be interested in Quint?[0] If I understand
           | correctly, Quint is very similar to TLA+, but with a more
           | "code-looking" notation. And it has some nice dev tooling,
           | like an LSP, which also lowers the barrier to entry.
           | 
           | [0]: https://github.com/informalsystems/quint
        
             | tombert wrote:
             | Possible, I haven't tried it.
             | 
             | It's still an uphill battle, because the comparison to unit
             | tests always comes up, and I have to explain how they're
             | not equivalent.
        
         | adsharma wrote:
         | Why not use python itself then? The design by contract PEP is
         | old. Not sure if anyone is working on it.
         | 
         | https://adsharma.github.io/agentic-transpilers/
         | 
         | https://adsharma.github.io/pysmt
        
           | dragonwriter wrote:
           | A 21-year-old deferred PEP is probably safely described as
           | dead with only the most distant hope that it might one day be
           | revived, or, more likely replaced with an entirely different
           | PEP addressing a similar domain with a new approach.
        
       | AnimalMuppet wrote:
       | Good engineering practice is to use the _appropriate_ level of
       | rigor. It depends on what the cost of failure is, and what the
       | cost of the rigor is.
        
       | brap wrote:
       | I'm not super experienced with formal verification, but I did dip
       | my toes in it a few times.
       | 
       | My impression is that it's far from a magic bullet. Writing
       | formal specs is basically like writing the code/tests just
       | differently. And the more it covers the more it becomes the same
       | thing. And it suffers from the same problems.
       | 
       | My conclusion every time was that the code itself _is_ the formal
       | spec and the formal spec _is_ the code.
       | 
       | By analogy with construction, the code is both the building and
       | the blueprint.
        
         | davidmurdoch wrote:
         | This. In my very limited experience (i didn't write the code or
         | specs), I've seen the runtime code find more bugs in the formal
         | specs than the formal specs finding bugs in the runtime code.
        
         | IshKebab wrote:
         | > My conclusion every time was that the code itself is the
         | formal spec and the formal spec is the code.
         | 
         | Yes you can end up with tautological specs, where it's more or
         | less a copy of the code. E.g. you aren't going to benefit much
         | from formally verifying `max()`.
         | 
         | But there are _many many_ cases where the formal specification
         | is _much much_ simpler and more obviously correct than the
         | actual code. The classic examples are:
         | 
         | * Any kind of encoding / decoding transformation has the
         | property `decode(encode(x)) == x`.
         | 
         | * Sorting: any sorting algorithm should result in a sorted
         | array (forall i, j: i < j --> array[i] <= array[j])
        
       | nimish wrote:
       | Formal methods work great when the price of failure is absolute.
       | Mostly pointless otherwise but can be a good exercise I guess.
        
         | jillesvangurp wrote:
         | With most testing and verification, there's a law of
         | diminishing returns. It helps you find stuff that you need to
         | fix and there is always stuff to find. But at some point you've
         | found enough of the stuff that needed fixing that you can use
         | the software and it starts making money for you. Most people
         | stop there. It's not going to make much more money if you
         | continue your efforts and the risk of a lot of financial damage
         | is usually not that high. A good software license will ensure
         | that. You might be better off paying a decent lawyer than
         | wasting time on formal methods. Lawyers aren't cheap. But
         | neither is having your software engineering team faff about
         | with a lot of complex tools for weeks on end.
         | 
         | And with software you can just do an update if something is
         | found later. Not a big deal usually. There are exceptions of
         | course. With hardware things get more expensive. But still,
         | judging from the state of e.g. most bluetooth and other
         | hardware I've ever owned, the barrier of good enough is pretty
         | low there too. Mostly things work and you can usually work
         | around minor issues when they don't.
         | 
         | Some, software justifies/requires going above and beyond doing
         | testing. Especially if it controls critical hardware. I've
         | never worked on such stuff. And even there the notion of
         | releasing often and breaking stuff by testing it seems to be
         | catching on. For example SpaceX is doing agile rocket
         | development. They launch starship every few months until they
         | get it reliable enough to launch things into orbit.
        
         | aiono wrote:
         | I doubt that although I agree that it's much more useful when
         | cost of a failure is higher. For example I work in a lab that
         | formalizes requirements and we have real customers that pay us
         | for formalization because they find it useful. Some products
         | are things that a failure could cause injury or even possibly
         | death. But not all systems have that high costs and they still
         | see benefits.
        
       | synchronousq wrote:
       | I just want to note, there exist two main flavors of formal
       | methods: extrinsic techniques, which are disjoint from the code
       | itself and generally reason about the specifications of code, and
       | intrinsic techniques, which are inline with the code itself and
       | reason about the code more directly. Historically, intrinsic
       | techniques (such as type systems) reason about code at a
       | functional level, while extrinsic techniques (such as decidable
       | model checkers like Spin/P) reason about a model of the code,
       | ascribed to formalism like an automata. But imo we're currently
       | in a complete golden age of formal methods research, and
       | extrinsic techniques are falling out of flavor in comparison to
       | intrinsic methods as pushed by type system advancements and
       | projects like Verus [1].
       | 
       | [1] https://github.com/verus-lang/verus
        
         | hashxyz wrote:
         | I don't see how the distinction makes any sense when the Verus
         | project you linked requires you to write correctness specs. It
         | sounded like intrinsic techniques were preferred because they
         | would not require you to write and maintain a separate spec,
         | but this is not the case.
        
           | lolinder wrote:
           | I prefer intrinsic techniques because it prevents the model
           | from being out of sync with the implementation.
           | 
           | The thing that's never made any sense to me about using a
           | marble checker for anything but concurrency issues (which are
           | hard enough to warrant it) is that once you've validated your
           | model you have to go actually implement it, and that's
           | usually the most error prone part of the process.
           | 
           | If the correctness spec has to be written manually but
           | prevents you from diverging from the spec in your
           | implementation, that's a huge step up from extrinsic model
           | checkers.
        
         | adsharma wrote:
         | TLA+ and the like work because they target a very small
         | specification language.
         | 
         | With the large footprint of Rust, I've seen questions raised
         | about how this will work. But haven't seen good answers.
         | 
         | Would love to read more.
        
       | begueradj wrote:
       | For those interested in the information exchange about this same
       | article: https://news.ycombinator.com/item?id=40753989
        
       | markusde wrote:
       | I'd recommend anyone with a passing interest in the role formal
       | techniques can play in software development watch this [1] talk.
       | Mike Dodds is a principal scientist at Galois (a company which
       | has a lot of experience with applying formal methods in industry
       | and government) and the talk does a good job at explaining where
       | they've seen value-added from formal methods, and the right kind
       | of formal methods for different applications.
       | 
       | [1]: https://www.youtube.com/watch?v=gfvvowAc130
        
       | Animats wrote:
       | The article is thin on specifics.
       | 
       | Some problems specify well. A database is an example. A spec for
       | a database can be written in terms of exhaustive search, running
       | the query against everything. Now show that an efficient database
       | yields the same output.
        
       | IshKebab wrote:
       | IMO formal software verification is still waaay too difficult to
       | be worth it in all but the most extreme cases. That's really
       | different to formal hardware verification where it is a no-
       | brainer.
       | 
       | I keep trying to learn it, but you need to be a real expert. Like
       | "I wrote the compiler" level expert for most systems.
       | 
       | For example I tried to prove a varint encoder/decoder. It worked
       | for one or two bytes, but not more. Asking for help the reason
       | was that the compiler internally only unrolls loops 5 times, or
       | some random internal detail like that that you could never really
       | hope to know.
       | 
       | I've been learning Lean recently, and ... I mean I like it, but
       | if you learn it you're going to encounter documentation like
       | this:
       | 
       | > Definitional equality includes e-equivalence of functions and
       | single-constructor inductive types. That is, fun x => f x is
       | definitionally equal to f, and S.mk x.f1 x.f2 is definitionally
       | equal to x, if S is a structure with fields f1 and f2. It also
       | features proof irrelevance, so any two proofs of the same
       | proposition are definitionally equal. It is reflexive, symmetric,
       | and a congruence.
       | 
       | And that's not really a knock on Lean - it seems to have some of
       | the better documentation out of the alternatives.
        
       ___________________________________________________________________
       (page generated 2025-01-10 23:00 UTC)