[HN Gopher] To be a better programmer, write little proofs in yo...
       ___________________________________________________________________
        
       To be a better programmer, write little proofs in your head
        
       Author : mprast
       Score  : 433 points
       Date   : 2025-07-15 17:05 UTC (1 days ago)
        
 (HTM) web link (the-nerve-blog.ghost.io)
 (TXT) w3m dump (the-nerve-blog.ghost.io)
        
       | hiAndrewQuinn wrote:
       | Oh, I have a relevant and surprisingly simple example: Binary
       | search. Binary search and its variants leftmost and rightmost
       | binary search are surprisingly hard to code correctly if you
       | don't think about the problem in terms of loop invariants. I
       | outlined the loop invariant approach in [1] with some example
        | Python code that was about as clear and close to plain English as
        | I could get.
       | 
       | Jon Bentley, the writer of Programming Pearls, gave the task of
       | writing an ordinary binary search to a group of professional
       | programmers at IBM, and found that _90%_ of their implementations
       | contained bugs. The most common one seems to have been
       | accidentally running into an infinite loop. To be fair, this was
       | back in the day when integer overflows had to be explicitly
        | guarded against - but even then, it's a surprising number.
       | 
       | [1]: https://hiandrewquinn.github.io/til-site/posts/binary-
       | search...
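A minimal sketch of the invariant-first style described above (an illustration in the same spirit, not the code from the linked post):

```python
def leftmost_bsearch(L, T):
    """Return the least index i with L[i] >= T in sorted list L (len(L) if none)."""
    l, r = 0, len(L)
    while l < r:
        # Loop invariants, restated as executable asserts:
        # everything left of l is < T; everything from r onward is >= T.
        assert all(x < T for x in L[0:l])
        assert all(x >= T for x in L[r:len(L)])
        mid = (l + r) // 2
        if L[mid] < T:
            l = mid + 1
        else:
            r = mid
    return r

print(leftmost_bsearch([1, 2, 2, 3], 2))  # 1: the first 2
```

If either assert ever fires, one of the bound updates is wrong, which is exactly the kind of bug these invariants are there to catch.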
        
         | qsort wrote:
         | To be fair you're picking an example that's extremely finicky
         | about indices. It's probably the hardest basic algorithm to
         | write down without errors. Up there with Hoare partition.
        
           | noosphr wrote:
           | You're right, but invariants are the most bang for the buck
           | you can get out of formal methods without having to use a DSL
           | built for computer assisted proofs.
           | 
           | That they work for binary search is a very strong signal to
           | people not familiar with them that they work for nearly
           | everything (they do).
        
         | fragmede wrote:
         | A research paper from Google in 2006 noted that "almost all"
         | binary search (and merge sort) implementations contain a bug
         | (and had for decades), so 90% seems impressive in the face of
         | that.
         | 
         | https://research.google/blog/extra-extra-read-all-about-it-n...
        
           | hiAndrewQuinn wrote:
            | The bug in that paper is actually the very integer overflow
            | bug I was referring to. Given that Jon himself made that
           | error in the "definitive" implementation, it seems unlikely
           | that he would have spotted it in the 10% of implementations
           | he considered correct. Under Google's stricter criteria it
           | seems likely to me that not a single person got the search
           | implementation truly correct enough.
        
           | jansan wrote:
           | In Javascript that Google "bug" only occurs if an array has
           | more than 4503599627370495 items. You may run into memory
           | issues before running into that bug.
        
         | sedatk wrote:
         | I knew about the infamous Binary Search bugs in books, and I
         | dared to write the first bug-free implementation in my book,
         | very carefully. Still, it ended up having bugs. :) Luckily,
         | Manning's early access program let me fix them before printing.
        
           | edanm wrote:
           | Oh interesting, where can we read about the bug that was
           | there?
        
             | sedatk wrote:
             | It was quite silly. The most common bug with binary search
             | is calculating the middle point: `(start + end) / 2`. It
             | looks straightforward, but "start + end" could cause an
             | integer overflow for large data structures, and that would
             | break the function. IIRC, my handling for that case also
             | turned out to be incorrect, and one of the readers caught
             | it. I don't remember if there were other bugs, there could
             | be, but that one hurt the most :)
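The failure is easy to reproduce by emulating 32-bit two's-complement arithmetic (Python's own ints never overflow, so the wraparound is simulated here; in C, signed overflow is undefined behavior, and wraparound is merely what most implementations happen to do):

```python
def to_i32(x):
    # Emulate 32-bit signed two's-complement wraparound.
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x

start, end = 2, 2**31 - 1  # both are valid 32-bit signed values

naive_mid = to_i32(start + end) // 2         # the sum wraps negative
safe_mid = start + to_i32(end - start) // 2  # the difference stays in range

print(naive_mid < 0)  # True: a negative "index"
print(safe_mid)       # 1073741824, the true midpoint
```

The `start + (end - start) // 2` form works because the difference of two valid indices always fits, even when their sum does not.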
        
               | edanm wrote:
               | Ah yes this is the "classic" bug that was in the Java
               | standard library undiscovered for twenty years, iirc.
               | 
               | I find this topic really interesting, thanks for sharing.
               | I doubt I could code a binary search without any bugs
               | from scratch :)
        
         | None4U wrote:
         | What are the odds you write a binary search that you'll use
         | more than once instead of just running it and writing down the
         | result?
        
           | JohnKemeny wrote:
           | I once needed to write binary search for a microcontroller in
            | C (no libraries). The routine ran about every hour, with
            | approximately 4M data points.
        
             | kamaal wrote:
             | I once worked lots of projects in C, microcontrollers,
             | embedded systems etc. It was a start up.
             | 
             | Every time I needed to write something algorithmically
              | demanding, I could do it in a day or two. I'm not into
              | Leetcoding, or competitive coding.
             | 
             | Most regular everyday programmers can work these things out
             | in one or two workdays.
             | 
              | It's definitely not like the competitive programmers say:
              | that when the time comes to write something you won't have
              | access to time, the internet, or even an IDE, and will have
              | to write the code in a Google Doc (which itself needs an
              | internet connection; when I pointed this out in interviews,
              | they didn't like it).
        
           | groby_b wrote:
           | About 100%?
           | 
            | When do you write code that _doesn't_ need to search?
           | (Unless you hand it all to the database, in which case...
           | sure, you're good)
        
           | powersnail wrote:
           | That's a very surprising angle of questioning. Are you
           | writing some sort of compile-time-only programs?
        
           | n4r9 wrote:
           | I have an object representing piecewise linear functions,
           | defined by a sequence of (x_i, y_i) tuples. It has a method
           | to evaluate y for a given value of x. The first step is to
           | find the least i for which x_i is greater than or equal to x:
           | this uses a binary search. Initially, I stored the x values
           | in a separate array and used dotnet's inbuilt
           | Array.BinarySearch. Later I removed this array to save on
           | memory and it now runs a handbuilt binary search on the first
           | items of the array of points instead.
        
         | monkeyelite wrote:
         | The C++ standard library phrases binary search as a
         | partitioning problem - finding the index where a predicate goes
         | from false to true, which is helpful for me.
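That partitioning view can be written down directly: find the first index at which a monotone predicate flips from false to true (a sketch of the idea, not the libstdc++ code):

```python
def partition_point(lo, hi, pred):
    """First index in [lo, hi) where pred is True, given that pred is
    False on a prefix and True on the rest; returns hi if never True."""
    while lo < hi:
        mid = (lo + hi) // 2
        if pred(mid):
            hi = mid       # mid satisfies pred; answer is at mid or left of it
        else:
            lo = mid + 1   # mid fails pred; answer is strictly to the right
    return lo

# lower_bound(nums, t) is then just the partition point of "nums[i] >= t":
nums = [1, 3, 3, 5]
print(partition_point(0, len(nums), lambda i: nums[i] >= 3))  # 1
```

Phrased this way, all the binary-search variants collapse into one loop; only the predicate changes.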
        
           | hoseja wrote:
           | Recalling the subtle difference between std::lower_bound and
           | std::upper_bound and where it matters is somewhat annoying
           | before it becomes muscle memory.
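Python's `bisect` module exposes the same pair under different names (`bisect_left` roughly matches `lower_bound`, `bisect_right` matches `upper_bound`), which makes the difference easy to poke at:

```python
from bisect import bisect_left, bisect_right

nums = [1, 3, 3, 3, 5]
# lower_bound: first index at which 3 could be inserted (before the 3s)
print(bisect_left(nums, 3))   # 1
# upper_bound: first index past every existing 3
print(bisect_right(nums, 3))  # 4
# Their difference counts the occurrences of 3:
print(bisect_right(nums, 3) - bisect_left(nums, 3))  # 3
```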
        
         | scuol wrote:
         | The way I always remember leftmost and rightmost binary search
         | (the C++ equivalents-ish of lower_bound and upper_bound) is to
         | always have a "prior best" and then move the bounds according
         | to the algo
         | 
          |     while (l < r) {
          |         // find the midpoint
          |         auto mp = l + (r - l) / 2;
          |         if (nums[mp] == target) {
          |             prior = mp;
          | #ifdef upper_bound
          |             // move the left bound up; maybe there's more
          |             // up there we can look for!
          |             l = mp + 1;
          | #else
          |             // lower bound: we found the highest known instance
          |             // of target, but let's look in the exclusive left
          |             // half a bit more
          |             r = mp - 1;
          | #endif
          |         }
          |     }
          | 
          | excuse any sloppiness, it's been a long day grinding leetcode
          | after getting laid off...
        
           | hiAndrewQuinn wrote:
           | Godspeed, fellow LeetCoder. I'm not currently grinding but I
           | still have my handful of practice problems in active Anki
           | rotation.
           | 
            | I have my rightmost code at part III of the miniseries, [1].
            | It looks quite similar, but I save the -1 for the very end
            | return.
            | 
            |     def rightmost_bsearch(L, T):
            |         l, r = 0, len(L)
            |         while l < r:
            |             # Loop invariants. Always true! Comment out in production!
            |             for somewhat_smol in L[0:l]:
            |                 assert somewhat_smol <= T  # Note: Weak inequality now.
            |             for lorg in L[r:len(L)]:
            |                 assert lorg > T
            |             mid = (l + r) // 2
            |             if L[mid] > T:
            |                 r = mid
            |             else:
            |                 l = mid + 1
            |         return r - 1  # return the first element AFTER L[r:len(L)].
           | 
           | (It should technically be BEFORE, I guess. If it helps all
           | rightmost bsearches are also leftmost bsearches on the
           | reversed array, so AFTER is secretly not wrong)
           | 
           | [1]: https://hiandrewquinn.github.io/til-site/posts/binary-
           | search...
        
         | godelski wrote:
          | This makes for an interesting test. I checked Claude Sonnet,
          | just for the hell of it.
          | 
          |     Prompt:
          |     Please write a bug free binary search in python
          | 
          |     Answer:
          |     def binary_search(arr, target):
          |         """
          |         Performs binary search on a sorted array.
          | 
          |         Args:
          |             arr: A sorted list of comparable elements
          |             target: The element to search for
          | 
          |         Returns:
          |             The index of target if found, -1 otherwise
          |         """
          |         left = 0
          |         right = len(arr) - 1
          | 
          |         while left <= right:
          |             # Avoid potential overflow by using this formula
          |             mid = left + (right - left) // 2
          | 
          |             if arr[mid] == target:
          |                 return mid
          |             elif arr[mid] < target:
          |                 left = mid + 1
          |             else:
          |                 right = mid - 1
          | 
          |         return -1
          | 
          |     # Example usage and test cases
          |     if __name__ == "__main__":
          |         test_array = [1, 3, 5, 7, 9, 11, 13, 15]
          |         print(binary_search(test_array, 7))    # Output: 3
          |         print(binary_search(test_array, 1))    # Output: 0
          |         print(binary_search(test_array, 15))   # Output: 7
          |         print(binary_search(test_array, 4))    # Output: -1
          |         print(binary_search([], 5))            # Output: -1
          |         print(binary_search([5], 5))           # Output: 0
          |         print(binary_search([5], 3))           # Output: -1
        
           | hiAndrewQuinn wrote:
           | Very good! Looks good to me. One small callout:
           | mid = left + (right - left) // 2
           | 
            | This implementation detail is to my knowledge unnecessary in
            | Python, because Python's built-in int type has arbitrary-
            | precision integers. It's intended to avoid integer overflows
            | in languages like C.
           | 
           | Imagine, say, that left is 1 and right is 2^63 - 1. In Python
           | left + right will just give you 2^63, no big deal. In C, left
           | + right will overflow and produce undefined behavior; in
           | practice it usually gives you I think -2^63, which is
           | obviously going to screw up the bsearch a bit. It isn't
           | wrong, just slightly less idiomatic for the language.
           | 
           | Python's interpreter may or may not be able to recognize and
           | refactor the slight inefficiency of the extra arithmetic
           | operation out. I will only point out that most of the time
           | when we write our own bsearch it's because we want to
           | optimize something really fundamental to the codebase. Any
           | time you have to whip up your own algorithm it's prima facie
           | evidence that that part of the code might be good to profile.
        
             | naniwaduni wrote:
             | Even in languages like C, it's mostly voodoo in practice
             | unless you're using undersized indices anyway--you surely
             | don't have anywhere near SIZE_MAX/2 elements.
             | 
             | The underlying problem is sloppy use of int for indices.
             | The roundabout calculation will still overflow at twice the
             | size.
        
               | namibj wrote:
               | For scalar code that's fair, but keep in mind that with
               | `gather`, you can vectorize it, and there you do worry
               | about integer width as more width means fewer lanes.
               | Though tbf you probably should use B+ Tree structures
               | chunked to your bulk/cache fetch "line/page" size, and
               | vectorized higher-than-radix-2 search within each node.
        
         | mdnahas wrote:
         | I read about this and started using binary search as my
         | interview question. It worked well - about 2/3rds of highly
         | credentialed applicants could not write a working
         | implementation in 20 minutes. Most failures went into an
         | infinite loop on simple cases! The ones who could write it
         | usually did so quickly.
         | 
          | I think part of the reason is that most people were taught a bad
          | interface. Even the code on Wikipedia says "Set L to 0 and R to
         | n-1". That is, R is an inclusive bound. But we've learned that
         | for most string algorithms, it is better when your upper bound
         | is an exclusive bound, that is, n.
         | 
         | I've wanted to do an experiment testing that hypothesis. That
         | is, ask a large number of people to write it with different
         | function prototypes and initial calls and see how many buggy
         | implementations I get with inclusive vs exclusive upperbound vs
         | length.
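The two conventions side by side, as a sketch of the hypothesized experiment (neither is the commenter's actual interview code):

```python
def bsearch_inclusive(arr, target):
    # R starts at n-1: note the <=, the mid-1, and the extra return path.
    l, r = 0, len(arr) - 1
    while l <= r:
        mid = (l + r) // 2
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            l = mid + 1
        else:
            r = mid - 1
    return -1

def bsearch_exclusive(arr, target):
    # R starts at n: half-open [l, r) throughout, no -1s inside the loop.
    l, r = 0, len(arr)
    while l < r:
        mid = (l + r) // 2
        if arr[mid] < target:
            l = mid + 1
        else:
            r = mid
    return l if l < len(arr) and arr[l] == target else -1

arr = [2, 4, 6, 8]
print(bsearch_inclusive(arr, 6), bsearch_exclusive(arr, 6))  # 2 2
```

The exclusive version's invariant ("the answer, if any, lies in [l, r)") never needs an off-by-one adjustment inside the loop, which is plausibly why it is harder to get wrong.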
        
           | andyjohnson0 wrote:
           | > I read about this and started using binary search as my
           | interview question. It worked well - about 2/3rds of highly
           | credentialed applicants could not write a working
           | implementation in 20 minutes.
           | 
           | I'd be interested to know why you feel that this is a useful
           | interview question.
           | 
           | Implementing a well-known and widely implemented algorithm
           | under stressful interview conditions wouldn't seem to me to
           | demonstrate much about a candidate. In that situation I'd
           | much rather be given a printout of some code and asked to
           | talk around the implementation - how the code maps to the
           | abstract algorithm, implementation trade-offs, readability,
           | etc.
        
             | mettamage wrote:
              | It showcases that they have done some leetcode. Or it
              | showcases that they can write code when someone is looking
              | over their shoulder. There are company cultures where these
              | skills are needed.
             | 
             | Companies that don't fall into a culture like this are
             | indeed deluded.
             | 
             | I would do the following:
             | 
             | 1. Write a tax calculator that reads in data from a CSV, or
             | a similar question to showcase programming ability.
             | 
             | 2. Do a small paid project. If it was good enough, hire
             | them. If it wasn't, then give feedback and do another one
             | in a week, tell them to learn as much about the topic as
             | possible.
             | 
             | 3. Do a small paid project a week later. If their learning
             | agility is high, then hire them still.
        
           | Etheryte wrote:
           | Based on your description, it sounds like a bad interview
           | question. When otherwise capable candidates can't answer your
           | trivia, maybe it isn't a useful signal? More than anything,
           | it just sounds like you're selecting for people who practiced
           | leetcode vs those who didn't.
        
             | zozbot234 wrote:
             | It's not whether you can solve it right away, it's how you
             | _approach_ the problem solving process to reach a viable
             | solution. If  "try to prove that this code is actually
             | correct" is part of that process, that's the kind of
             | "trivia" that can really be widely applicable even in a
             | working environment, in a way that random leetcode algos
             | often aren't.
             | 
             | I vote for making "how would you think about proving this
             | correct?" a question in all algo-focused interviews.
        
             | hiAndrewQuinn wrote:
              | To be precise, the parent commenter said
              | 
              | > about 2/3rds of highly _credentialed_ applicants
              | 
              | Credentialed ≠ capable.
        
           | hiAndrewQuinn wrote:
           | Indeed, this is actually one reason why I like Python's
           | closed-left, open-right syntax. It lets us sidestep the
           | inclusive bound issue entirely, because, as noted in the
           | post, for all 0 <= l <= r <= len(L) where L is a Python list,
            | 
            |     L == L[0:len(L)]
            |       == L[0:l] + L[l:len(L)]
            |       == L[0:l] + L[l:r] + L[r:len(L)]
           | 
           | I actually didn't like this syntax until I had to work this
           | out. Now everyone else's approach seems silly to me.
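The identity is cheap to check exhaustively for a small list (a quick sanity script, not from the linked post):

```python
L = [3, 1, 4, 1, 5]
# Verify L == L[0:l] + L[l:r] + L[r:len(L)] for every 0 <= l <= r <= len(L).
for l in range(len(L) + 1):
    for r in range(l, len(L) + 1):
        assert L == L[0:l] + L[l:r] + L[r:len(L)]
print("identity holds for all 0 <= l <= r <= len(L)")
```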
        
             | rockostrich wrote:
              | Don't most slice methods behave this way? JavaScript's
              | `Array.prototype.slice` and Java's `Arrays.copyOfRange` do,
              | and Golang's slicing syntax is similar to Python's, except
              | that the 3rd value is the capacity of the resulting slice
              | rather than the step value.
        
       | Animats wrote:
       | Now if we can get LLMs to do that, they might code better. Proofs
       | are a lot of work, but might keep LLMs on track.
        
         | mprast wrote:
         | I was thinking the same thing!
        
           | Nevermark wrote:
           | In any other industry the "worker class" would be sabotaging
           | the machines.
           | 
           | In the merry software world, the challenges of improving our
           | potential replacements are far too interesting and satisfying
           | to leave room for any worry about consequences! :)
        
         | bcheung wrote:
         | The problem is the training corpus tends towards mediocre code.
         | But with an agentic loop that analyzes the code against those
         | criteria and suggests changes then I think it might be
         | possible. I wouldn't try to get it to generate that right off
         | the bat.
        
         | AIPedant wrote:
         | Most proof assistants do a good job with autogenerating a
         | (deterministically correct) proof given a careful description
         | of the theorem. Working on integrating this into mainstream
         | languages seems much more worthwhile than training an LLM to
         | output maybe-correct proof-seeming bits of text.
        
         | purplesyringa wrote:
         | IME, LLMs are completely incapable of reasoning about anything
         | remotely difficult. All you'll get are proofs by assertion, not
          | anything rigorous you can trust.
        
       | gregorygoc wrote:
       | Writing correct proofs is hard. Program verification is hard. In
        | my opinion, if you are hand-waving it, there's no benefit.
       | Thinking about invariants and pre-post conditions is often
       | unnecessary or greatly reduced if you write idiomatic code for
       | the language and codebase. Check out "The Practice of
       | Programming" by R. Pike and B. W. Kernighan. The motto is:
       | simplicity, clarity, generality. I find it works really well in a
       | day job.
       | 
       | On a slightly related note... Competitive programming is
       | surprisingly good at teaching you the right set of techniques to
       | make sure your code is correct. You won't progress beyond certain
       | stage unless you pick them up.
        
         | wizzwizz4 wrote:
         | Turn your first paragraph on its head:
         | 
         | Appropriate abstractions (i.e., "idiomatic code for the
         | language and codebase") make program verification easy. If you
          | are hand-waving an appropriately-abstracted program, there's
         | little benefit to thinking about loop invariants and pre-post
         | conditions, since they don't exist at that level of generality:
         | correct proofs follow directly from correct code.
        
           | gregorygoc wrote:
           | No, appropriate abstractions are insufficient for my
           | argument. For example: there's one way to write an idiomatic
           | loop in C and it inherits necessary invariants by
           | construction.
           | 
            | I highly recommend reading the book; it explains the concept of
           | writing idiomatic code way better than I ever could.
        
             | chowells wrote:
             | That's a really funny example, given how many bugs have
             | been found in C programs because idiomatic loops are wrong
             | in the edge cases.
             | 
             | How do you idiomatically write a loop to iterate over
             | signed ints from i to j (inclusive) in increasing order,
             | given i <= j?
             | 
             | What does that loop do when j is INT_MAX?
        
               | wk_end wrote:
               | I can imagine someone who sketches out little proofs in
               | their head - or even on paper - missing that case too.
               | It's easy to forget you're not doing normal arithmetic
               | when doing arithmetic in C!
        
               | chowells wrote:
               | Yeah, it's a hard case in general. But C's idioms really
               | don't encourage you to think about it. You really need to
               | default to a loop structure that checks the termination
               | condition after the loop body but before the increment
               | operation for inclusive end coordinates.
               | 
               | It's easy to think that's what do/while is for, but it
               | turns out to be really hard to do the increment operation
               | after the conditional, in general. What you really want
               | is a loop structure with the conditional in the middle,
               | and the only general purpose tool you get for that is a
               | break. C (or any language with similar syntax) really
               | doesn't have an idiom for doing this correctly.
        
               | wizzwizz4 wrote:
                | This may be hubris, but...
                | 
                |     int i = start;
                |     do thing_with(i); while (i++ <= end);
        
               | chowells wrote:
               | I did consider that, but I wrote "in general" for a
               | reason. It works very specifically in the case of "add
               | one" or "subtract one", but it doesn't work with anything
               | more complicated, like chasing pointers or
               | adding/subtracting more than one at a time.
               | 
               | You could write functions to do the update and return the
               | old value so you could use them in the same way, but I
               | don't like this either. This is mostly because it orders
               | the termination check and the update logic the wrong way
               | around. If there's IO involved in checking for the next
               | thing, for example, side effects of that unnecessary
               | operation might interfere with other code.
               | 
               | You could resolve that by moving the termination check
               | into the update logic as well, but now you're seriously
               | complecting what should be independent operations. I
               | don't think the tradeoff is there versus just using a
               | break. But mostly, this is a self-inflicted problem in
               | C's language constructs and idioms. I just don't have
               | this problem in many other languages, because they
               | provide end-inclusive looping constructs.
        
               | trealira wrote:
               | > I did consider that, but I wrote "in general" for a
               | reason. It works very specifically in the case of "add
               | one" or "subtract one", but it doesn't work with anything
               | more complicated, like chasing pointers or
               | adding/subtracting more than one at a time.
               | 
               | You're reminding me of the book "Modern Compiler Design."
               | The author goes over how to compile a general Pascal-
               | style for-loop correctly, accounting for increasing or
               | decreasing ranges, differing step sizes, and accounting
               | for overflow. It was written using just goto statements,
               | so I adapted a version of it to C. Just replace "intN_t"
               | with an actual integer size. It works by calculating the
               | number of times the loop will run. If "from" is equal to
               | "to," it's still going to run at least once. Again, this
                | is not mine, just adapted from the author's code (Dick
                | Grune's).
                | 
                |     // enumerate: print out numbers from "from" to "to",
                |     // inclusive, with step size "by"
                |     void enumerate(intN_t from, intN_t to, intN_t by) {
                |         uintN_t loop_count;
                |         intN_t i;
                |         if (by > 0) {
                |             if (from > to) return;
                |             loop_count = (to - from) / by + 1;
                |         } else if (by < 0) {
                |             if (from < to) return;
                |             loop_count = (from - to) / -by + 1;
                |         } else /* (by == 0) */ {
                |             // Maybe some kind of error
                |             return;
                |         }
                |         for (i = from; ; i += by) {
                |             printf("%d\n", i);
                |             if (--loop_count == 0) break;
                |         }
                |     }
               | 
               | You can see it's more complicated than the idiomatic C
               | for-loop, haha. But that's just a general solution. Like
               | you guys noted, it could be simpler for step sizes of 1.
        
               | Izkata wrote:
               | Isn't i going to be one past end on the last iteration?
               | 
               | I think you want either "++i <= end" or "i++ < end", and
               | that's assuming "end" is inclusive. If it's exclusive
               | then "++i < end", right?
               | 
               | (Using "<=" in a termination condition has become a code
               | smell for me, it's been wrong more often than not, so it
               | sticks out as "pay close attention here")
        
               | chowells wrote:
               | Actually, I think what you really need is "i++ != end".
               | The whole point was inequalities don't work how you want
               | when end is INT_MAX. You want to just terminate after the
               | loop body where i was equal to end. Except now I'm not
               | sure if that evades the UB of a signed int exceeding
               | INT_MAX. I don't know C well enough to know if the UB is
               | in performing the increment or reading the value
               | afterwards.
               | 
               | Edit: Just checked in with a C expert. The UB is in the
               | increment operation, so that's not correct after all. You
               | really do just need to separate out the update from the
               | test entirely.
        
               | trealira wrote:
                | Yeah, it would have to be something like this if you
                | wanted to avoid the signed integer overflow at the end of
                | the loop:
                | 
                |     assert(i <= j);
                |     for (;;) {
                |         do_something_with(i);
                |         if (i == j) break;
                |         i++;
                |     }
        
         | xg15 wrote:
         | Have to strongly disagree here. I don't think the OP meant
         | thinking up a complete, formal, proof. But trying to understand
         | what kind of logical properties your code fulfills - e.g. what
         | kind of invariants should hold - will make it a lot easier to
         | understand what your code is doing and will remove a lot of the
         | scare factor.
        
           | mprast wrote:
           | yes, what i had in mind were more proof sketches than proofs
        
           | Nevermark wrote:
           | Yes, we could call this "maintaining plausible provability".
           | 
           | Code for which there is not even a toehold for an imagined
           | proof might be worth cordoning off from better code.
        
           | Sharlin wrote:
           | Types constitute this sort of a partial proof. Not enough to
           | encode proofs of most runtime invariants (outside powerful
           | dependent type systems) but the subset that they _can_ encode
           | is extremely useful.
        
           | gregorygoc wrote:
           | Yeah, and I'm saying if your code is idiomatic you get
           | necessary invariants for free.
        
             | auggierose wrote:
             | Is idiomatic related to idiotic?
        
               | bmn__ wrote:
               | People downvoted you because instead of polluting the
               | discussion, you could have looked this up yourself.
               | 
               | The answer is yes, both words are related to idios,
               | "own", "self", "private".
        
               | auggierose wrote:
               | Please don't comment about the voting on comments. It
               | never does any good, and it makes boring reading.
               | 
               | https://news.ycombinator.com/newsguidelines.html
        
               | bmn__ wrote:
               | lame
        
               | spauldo wrote:
               | It is if you're writing vimscript.
        
             | wahern wrote:
             | I bought The Practice of Programming years ago. It's a
             | great book that is no less relevant today, but I don't see
             | your argument. The suggestions you've summarized are
             | critical advice, but rather than obviate the need for the
             | proof-like mindset, they complement it. Idiomatic code
             | doesn't directly help you solve and implement difficult
             | algorithmic or architectural problems. However, idiomatic
             | code certainly helps reduce noise so that the irreducibly
              | complex aspects of a solution conveniently stand out, both
             | conceptually and literally.
        
               | gregorygoc wrote:
                | I do agree they complement each other... to some
                | extent. My point was rather that if you write good
                | code, the logical arguments you keep in your head
                | should be reduced to a minimum, to the point that
                | mentally testing edge cases and the happy path should
                | be sufficient.
               | 
                | To give a more concrete example: I recently saw a very
                | complicated piece of logic checking whether subsequent
                | code should even be invoked, but it was buried in the
                | debris of core logic and other verification. It could
                | easily have been separated out and rewritten as a
                | series of early returns, which is exactly what a
                | precondition is. I'm sure whoever wrote the code had it
                | in their mind but was not familiar enough with the
                | early-return technique to embed it in code. Had they
                | been, their brain power could've been utilized more
                | efficiently.
        
         | pipes wrote:
         | Could you elaborate on those techniques from competitive
         | programming please. Genuinely interested! :)
        
           | kevinventullo wrote:
           | +1 This is definitely the wall I hit with competitive
           | programming. I logically know how to solve the problem, my
           | code just ends up having one too many bugs that I can't track
           | down before time is up.
        
         | mathgradthrow wrote:
         | There is no substitute for writing correct programs, no matter
         | how hard it is. If you want correct programs you have to write
         | them correctly.
        
         | monkeyelite wrote:
         | The most basic idea of a proof is an argument for why something
         | is true. It's not about avoiding small mistakes, it's about
         | getting directionally correct.
        
         | maxbond wrote:
         | I think the causality is flipped here, when you carefully
         | consider a problem the result is often very clean and clear
          | code. The clarity in your thinking is reflected in the code and
          | its structure.
         | 
          | But writing clean and clear code in the hopes that its good
          | aesthetics will result in correctness would be cargo culting.
         | (Writing clean code is still worthwhile of course, and clean
         | code + code review is likely to result in better correctness.)
         | 
         | Form follows function, not the other way around.
        
       | bluetomcat wrote:
       | > My thesis so far is something like "you should try to write
       | little proofs in your head about your code." But there's actually
       | a secret dual version of this post, which says "you should try to
       | write your code in a form that's easy to write little proofs
       | about."
       | 
       | Easier said than done. It is certainly feasible on greenfield
       | projects where all the code is written by you (recently), and you
       | have a complete mental model of the data layout and code
       | dependencies. It's much harder to prove stuff this way when you
       | call foo(), bar() and baz() across unit boundaries, when they
       | modify global state and are written by different developers.
        
         | akavi wrote:
         | A part of "being a good developer" is being able to evolve
         | systems in this direction. Real systems are messy, but you can
         | and should be thoughtful about:
         | 
         | 1. Progressively reducing the number of holes in your
         | invariants
         | 
         | 2. Building them such that there's a pit of success (engineers
         | coming after you are aware of the invariants and "nudged" in
         | the direction of using the pathways that maintain them).
         | Documentation can help here, but how you structure your code
         | also plays a part (and is in my experience the more important
         | factor)
        
           | mprast wrote:
           | yeah, this is what i was trying to get at with that notion of
           | "proof-affinity"; imo a well-structured codebase is one in
           | which you can easily prove stuff to yourself about code you
           | didn't necessarily write
        
         | xg15 wrote:
         | Even then, you can get a map of "safe" and "unsafe" locations
          | in the codebase, i.e. you can get a mental model of which
         | functions modify global state and which don't.
        
         | msteffen wrote:
         | I think your frustration is illustrative, actually: the reason
         | mutable global state is bad is because in order to prove that a
         | piece of code reading it is correct, you have to know what _the
         | entire rest of the program_ is doing.
         | 
         | If, instead, you make the global variable immutable, make the
         | state variable into a function argument, or even wrap the
         | mutable global state in some kind of helper class, then you
         | only need to know what the callers of certain functions are
         | doing. The visibility of those functions can be limited. Caller
         | behavior can be further constrained with assertions inside the
         | function. All of these (can) make it easier to prove that the
         | reader is correct.
         | 
         | I think most programmers already do this, actually; they just
         | don't think of their decisions this way.
        
           | bluetomcat wrote:
           | It's a good direction to follow, but it can only get you so
           | far. Some pieces of code do naturally evolve into a
           | functional formalism, while others are inherently imperative.
           | Usually, the top levels of your program (event loop) are
           | imperative and deal with stateful "devices" like IO, the
           | screen and storage subsystems. The "leaf" functions from the
           | call graph can be functional, but you still can't reason
           | about the whole program when it is imperative at the top.
        
             | thfuran wrote:
             | Yes, but if you can actually drive all that out to the very
              | top level and leave everything else clean,
             | that's a huge improvement over how many programs are
             | structured.
        
         | jimbokun wrote:
         | > It's much harder to prove stuff this way when you call foo(),
         | bar() and baz() across unit boundaries, when they modify global
         | state and are written by different developers.
         | 
         | I think this reinforces the article's point.
         | 
         | Code like this is much more likely to contain bugs and be
         | harder to maintain without introducing more bugs, than programs
         | written from the outset with this goal of "provability".
        
           | hinkley wrote:
           | I just finished cave diving in a code base that's had a very
           | old ticket to clean up the mess.
           | 
           | I went in with 3 tickets in mind to fix. I found half a dozen
           | more while I was down there, and created 2 myself. I don't
           | know if I got off easy or I was distracted and missed things.
           | 
           | The other project I'm working on is not dissimilar. Hey did
           | you guys know you have a massive memory leak?
        
           | monocasa wrote:
           | And in fact, when you look at formally proven code, 80% of
           | the shtick is reducing complexity of mutation to something
           | tractable. That's valuable whether or not you then actually
           | go through the process of formally proving it.
        
         | hinkley wrote:
         | > when they modify global state and are written by different
         | developers.
         | 
         | Once cancer has metastasized, the treatment plans are more
         | aggressive and less pleasant. The patient can still be saved in
         | many cases, but that depends on a lot of external factors.
        
         | ericmcer wrote:
         | This made me think how I constantly rethink my approach to
         | coding and learning how to do it "right" over and over.
         | 
         | I wonder if someone like a John Carmack is just like... yeah I
         | got it or he also is constantly feeling like he was bad 5 years
         | ago and is doing it "better" now.
        
           | saghm wrote:
           | This is an important point I like to mention to people.
           | Thinking back to old code you wrote and how you would choose
            | to do it better than you originally did is a sign of growth,
           | and if that stops happening after a certain point, it's a
           | sign you've stopped growing. Given that the best programmers
           | in the world still don't write perfectly bug-free code, I
           | don't have any illusions that there will always be plenty of
           | room for me to continue improving as a programmer, so I
           | consider having those thoughts to be an unequivocally good
           | thing. I'd honestly be a bit wary of anyone who didn't think
           | they needed to learn or grow any more as a programmer!
        
       | nayuki wrote:
       | I learned the foundations of theoretical computer science in
       | university. I agree with the sentiment of this post, though it is
       | hard to perform in practice.
       | 
       | In addition to pre-conditions and post-conditions, I would like
       | to emphasize that loop invariants and structural induction are
       | powerful techniques in CS proofs.
       | https://en.wikipedia.org/wiki/Loop_invariant ,
       | https://en.wikipedia.org/wiki/Structural_induction
       | 
       | These notes from UofT's CSC236H are on-topic:
       | https://www.cs.toronto.edu/~david/course-notes/csc236.pdf#pa...
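As a sketch of the loop-invariant technique (this mirrors the leftmost binary search discussed at the top of the thread; the code is illustrative, not taken from the linked notes):

```python
def leftmost(xs, target):
    """Leftmost binary search, organized around a loop invariant.

    Invariant (for sorted xs): every element of xs[:lo] is < target,
    and every element of xs[hi:] is >= target.
    """
    lo, hi = 0, len(xs)
    while lo < hi:
        mid = (lo + hi) // 2
        if xs[mid] < target:
            lo = mid + 1   # everything before lo is still < target
        else:
            hi = mid       # everything from hi on is still >= target
    # Loop exits with lo == hi, so the invariant pins down the answer:
    return lo              # first index where target could be inserted
```

Because `mid` always lies strictly between `lo` and `hi`, each branch shrinks the interval, which is the termination half of the proof, and the bug class Bentley observed (infinite loops) is ruled out by construction.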
        
         | sitkack wrote:
          | Those notes are great, David Liu seems like a wholesome dude
         | https://www.cs.toronto.edu/~david/research.html
        
         | atlintots wrote:
         | UofT mentioned! Let's go!
        
       | skybrian wrote:
       | The software he's working on seems rather obscure and perhaps not
       | launched yet? get-nerve.com can't be resolved.
        
         | mprast wrote:
         | not launched yet! I will put up a post when it launches; coming
         | in a few months!
        
       | roadside_picnic wrote:
       | What's interesting about this view is that theorem provers
       | ultimately boil down to sufficiently advanced type checking (the
       | book _Type Theory and Formal Proof_ is an excellent overview of
       | this topic). Literally the heart of proof assistants like Lean is
       | "if it compiles, you've proved it".
       | 
       | So more practical type systems can be viewed as "little" theorem
       | provers in the sense the author is describing. "Thinking in
       | types" so to speak, is mathematically equivalent to "writing
       | little proofs in your head". I would also add that _merely_ using
       | types is not equivalent to  "thinking in types" as one would be
       | required to do writing a sufficiently advanced Haskell program.
        
         | trealira wrote:
         | It's true that theorem provers can be just sufficiently
         | advanced type systems (the Curry Howard correspondence), but
         | not all theorem provers operate that way. Isabelle/HOL operates
         | on higher order logic. Imperative ones like SPARK or Dafny just
         | rely on encoding preconditions and postconditions and things
         | like loop invariants and just use SMT solvers for verification,
         | IIRC.
         | 
         | Having an advanced type system does seem to encourage this sort
         | of informal proof oriented thinking more than imperative and
         | somewhat typeless languages do, though, since preconditions and
         | postconditions and loop invariants and inductive proofs on
         | loops are things you have to do yourself entirely in your head.
        
       | xg15 wrote:
       | I'd also add mutability and immutability to the list of
       | properties. Keeping as much state as possible immutable doesn't
       | just help with multithreading, it will also greatly reduce the
       | headache when trying to understand the possible states of a
       | program.
        
         | jimbokun wrote:
         | The article does include those.
        
       | dblohm7 wrote:
       | I don't do this often, but when I do, it's almost always when
       | writing non-trivial concurrent code. I'll often "fuzz" the
       | scheduling of multiple tasks around the region of code I'm
       | working on to prove to myself that it works.
        
       | cespare wrote:
       | I really identify with this way of thinking. One domain where it
       | is especially helpful is writing concurrent code. For example, if
       | you have a data structure that uses a mutex, what are the
       | invariants that you are preserving across the critical sections?
       | Or when you're writing a lock-free algorithm, where a proof seems
       | nearly required to have any hope of being correct.
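A minimal Python sketch of that question, "what invariant does each critical section preserve?" (`Ledger` is an invented example):

```python
import threading

class Ledger:
    """Invariant (whenever the lock is free): checking + savings == total.

    The lock exists precisely so the invariant may be *briefly* broken
    inside transfer() without any other thread observing it.
    """
    def __init__(self, checking, savings):
        self._lock = threading.Lock()
        self.checking = checking
        self.savings = savings
        self.total = checking + savings

    def transfer(self, amount):
        with self._lock:
            self.checking -= amount   # invariant broken here...
            self.savings += amount    # ...and restored before release

    def balanced(self):
        with self._lock:
            return self.checking + self.savings == self.total
```

The informal proof obligation is then local: check that every path through every critical section re-establishes the invariant before releasing the lock.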
        
       | taeric wrote:
       | I feel like this is reaching for the vocabulary that you get in
       | mathematics of axioms, postulates, and theorems. Not in a bad
       | way, mind. Just the general idea of building a separate
       | vocabulary for the different aspects of a system of knowledge.
       | 
       | I also think things get complicated with holding things constant
       | as a necessity. Often times, the best way to find a bug is to ask
       | not "how was this possible?" but "why would another part of the
       | system modify this data?"
       | 
       | Obviously, if you can make things reference capable data, you
       | should do so. But, all too often "this data being X" has a
       | somewhat self evident explanation that can help. And just "copy
       | so I don't have to worry about it" lands you squarely in "why is
       | my copy out of date?"
        
         | phkahler wrote:
         | Another nice pattern is to have any particular value written
         | exactly once, or from exactly one place in the code if it gets
         | updated.
         | 
         | For embedded control I'm finding the blackboard pattern with
         | single assignment to be super useful.
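One possible sketch of a write-once blackboard in Python (illustrative only, not tied to any particular embedded framework):

```python
class Blackboard:
    """Each key may be written exactly once, so any reader can treat
    a present value as a small proven fact: it will never change."""
    def __init__(self):
        self._slots = {}

    def post(self, key, value):
        if key in self._slots:
            raise RuntimeError(f"{key!r} already assigned")
        self._slots[key] = value

    def read(self, key):
        return self._slots[key]
```

The single-assignment rule is what makes the pattern proof-friendly: "who last wrote this value?" always has exactly one answer.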
        
       | bcheung wrote:
       | I wonder how many of these rules can be incorporated into a
       | linter or be evaluated by an LLM in an agentic feedback loop. It
       | would be nice to encourage code to be more like this.
       | 
       | I notice you didn't really talk much about types. When I think of
       | proofs in code my mind goes straight to types because they
       | literally are proofs. Richer typed languages move even more in
       | that direction.
       | 
       | One caveat I would add is it is not always desirable to be forced
       | to think through every little scenario and detail. Sometimes it's
       | better to ship faster than write 100% sound code.
       | 
       | And as an industry, as much as I hate it, the preference is for
       | languages and code that is extremely imperative and brittle. Very
       | few companies want to be writing code in Idris or Haskell on a
       | daily basis.
        
         | ngruhn wrote:
         | Sorry for being pedantic but isn't the story that
         | types=theorems and programs=proofs?
        
           | sitkack wrote:
           | "Propositions as Types" by Philip Wadler
           | https://www.youtube.com/watch?v=IOiZatlZtGU
        
       | GMoromisato wrote:
       | Totally tangential, but I love to read post-mortems of people
       | fixing bugs. What were the initial symptoms? What was your first
       | theory? How did you test it? What was the resolution? Raymond
       | Chen does this a lot and I've always enjoyed it.
       | 
       | I learn more from these concrete case studies than from general
       | principles (though I agree those are important too).
       | 
       | One of my most recent bugs was a crash bug in a thread-pool that
       | used garbage-collected objects (this is in C++) to manage network
       | connections. Sometimes, during marking, one of the objects I was
       | trying to mark would be already freed, and we crashed.
       | 
       | My first thought was that this was a concurrency problem. We're
       | supposed to stop all threads (stop the world) during marking, but
       | what if a thread was not stopped? Or what if we got an event on
       | an IO completion port somehow during marking?
       | 
       | I was sure that it had to be a concurrency problem because (a) it
       | was intermittent and (b) it frequently happened after a
       | connection was closed. Maybe an object was getting deleted during
       | marking?
       | 
       | The only thing that was weird was that the bug didn't happen
       | under stress (I tried stress testing the system, but couldn't
       | reproduce it). In fact, it seemed to happen most often at
       | startup, when there weren't too many connections or threads
       | running.
       | 
       | Eventually I proved to myself that all threads were paused
       | properly during marking. And with sufficient logging, I proved
       | that an object was not being deleted during marking, but the
       | marking thread crashed anyway.
       | 
       | [Quick aside: I tried to get ChatGPT to help--o3 pro--and it kept
       | on suggesting a concurrency problem. I could never really
       | convince it that all threads were stopped. It always insisted on
       | adding a lock around marking to protect it against other
       | threads.]
       | 
       | The one thing I didn't consider was that maybe an object was not
       | getting marked properly and was getting deleted even though it
       | was still in use. I didn't consider it because the crash was in
       | the marking code! Clearly we're marking objects!
       | 
       | But of course, that was the bug. Looking at the code I saw that
       | an object was marked by a connection but not by the listener it
       | belonged to. That meant that, as long as there was a connection
       | active, everything worked fine. But if ever there were no
       | connections active, and if we waited a sufficient amount of time,
       | the object would get deleted by GC because the listener had
       | forgotten to mark it.
       | 
       | Then a connection would use this stale object and on the next
       | marking pass, it would crash.
        
         | TacticalCoder wrote:
         | > Totally tangential, but I love to read post-mortems of people
         | fixing bugs.
         | 
         | I know I already posted it moons ago but... Around 1991 I made
          | a little game, similar to Cannon Ball on the MSX (which
          | Pang / Buster Bros later copied).
         | 
         | I had one weird case where sometimes the game would crash.
         | Plain crash. But sometimes after playing for 15 minutes and
         | already passing several levels. I just couldn't find it. I
         | couldn't reason about it. I was stuck.
         | 
         | So I decided to rewrite not the entire game but the part
         | dealing with the inputs / game logic to make it 100%
         | deterministic. It took me a long time to do that. Then
          | eventually I could record myself playing: I'd record only the
          | player inputs and the moment each one happened, which would
          | make for tiny savefiles btw.
         | 
          | And eventually, while I was playing and recording, the crash
          | occurred. I tried my replay: it worked... it replayed the
          | savefile flawlessly and the game crashed again.
         | 
          | At that point I knew the bug was as good as fixed: being able
          | to reproduce a bug in a deterministic way meant I was going to
          | fix it.
         | 
          | Turns out it was a dangling pointer (ah, C...): when the hero
          | caught an extra allowing him to fire two shots at once on
          | screen (usually he'd only be allowed one) and the first shot
          | killed the last ball on screen, then on the next level the
          | second shot would somehow (due to an error on my part) continue
          | to live its life, eventually corrupting memory.
         | 
         | Fun stuff.
         | 
         | FWIW having deterministic game engines wasn't a thing back
         | then. It became common later on, with games like Age of Empires
          | and Warcraft III etc. using them: the savefiles used to
          | replay/exchange games were tinier than tiny: they'd only
          | save which frame each player input happened on and replay
          | the entire game from there [1]
         | 
         | I still have the code to that game, it was surprisingly fun.
         | Someone here already offered help in the past to get it back on
         | its feet. I've also got an executable that runs. I just don't
         | remember how the tooling worked. Still have the BAT files etc.
         | to build it, but not the tools. I really should look into that
         | one of these days but I'm kinda busy with other projects /
         | life.
         | 
         | [1] which also raised another issue: when game engines were
          | upgraded, a savefile might only play on an older version of
          | the game, so players exchanging games would note the version
          | of the game it was meant for
        
       | pokemyiout wrote:
       | Is there a typo in the induction section? Should `simplifyTree`
       | and `simplifyGraph` be the same function?
       | 
        |     function simplifyTree(root: Node): Node {
        |       let newChildren = [] as Array<Node>;
        |       for (const child of root.children) {
        |         const simplifiedChild = simplifyGraph(child);
        |         if (shouldContract(simplifiedChild)) {
        |           for (const grandChild of simplifiedChild.children) {
        |             newChildren.push(grandChild);
        |           }
        |         } else {
        |           newChildren.push(simplifiedChild);
        |         }
        |       }
        |       root.children = newChildren;
        |       return root;
        |     }
        
       | dkenyser wrote:
       | This reminds me of reading Accelerated C++ back in the day and I
       | think the part that stuck with me most from that book was the
       | idea of "holding the invariant in your head" (I'm paraphrasing a
       | bit).
       | 
       | It made me think a lot more about every line of code I wrote and
       | definitely helped me become a better programmer.
        
       | mathgradthrow wrote:
       | Monotone just means order preserving. f is monotone if f(a) >=
       | f(b) whenever a >= b.
        
       | wrs wrote:
       | As an undergrad at Carnegie Mellon in the 80s, I was explicitly
       | _taught_ to do all of these things in one of the first-year
       | programming courses. And it has served me very well since then.
       | 
       | I especially remember how learning the equivalence of recursion
       | and induction immediately eliminated the "frustrated trial and
       | error" approach for making recursive algorithms that work.
        
         | TheShermanTank wrote:
         | Took the course last year, and I started to appreciate it more
         | while taking functional programming lmao
        
       | wiseowise wrote:
       | To be a better programmer, program more and read less bullshit on
       | the internet.
        
       | tristramb wrote:
       | The idea that you should design programs with proof in mind goes
       | back to T. J. Dekker's solution to the mutual exclusion problem
        | in 1959. The story is told by Edsger Dijkstra in EWD1303
        | (https://www.cs.utexas.edu/~EWD/transcriptions/EWD13xx/EWD130...).
        | Much
       | of Dijkstra's later work can be seen as him working out the
       | consequences of this insight.
        
       | thesuperbigfrog wrote:
       | Why not include little proofs in the code when you can?
       | 
       | Several programming languages and libraries support Design-by-
       | Contract (https://en.wikipedia.org/wiki/Design_by_contract) which
       | lets you specify preconditions, postconditions, and invariants
       | directly in your code.
       | 
       | Those predicates can be checked in various ways (depending on how
       | deeply Design-by-Contract is supported) to help you know that
       | your code is working correctly.
       | 
       | Ada supports Design-by-Contract as part of the language:
       | https://learn.adacore.com/courses/intro-to-ada/chapters/cont...
       | 
       | SPARK extends Ada Design-by-Contract into full proofs:
       | https://learn.adacore.com/courses/intro-to-spark/index.html
       | 
       | Rust has the Contracts crate:
       | https://docs.rs/contracts/latest/contracts/
       | 
       | Other programming languages have various levels of support or
       | libraries for Design-by-Contract:
       | https://en.wikipedia.org/wiki/Design_by_contract#Language_su...
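Even without library support, plain assertions can sketch a contract, including the "old"-value idiom from Eiffel-style Design-by-Contract (`withdraw` is an invented example, not taken from any of the libraries linked above):

```python
def withdraw(balance: int, amount: int) -> int:
    # Preconditions
    assert amount > 0, "amount must be positive"
    assert amount <= balance, "insufficient funds"
    old_balance = balance      # snapshot of the 'old' value
    balance -= amount
    # Postcondition, stated against the old value
    assert balance == old_balance - amount
    return balance
```

Dedicated contract facilities mostly automate this: the snapshot, the intent-revealing syntax, and the ability to disable checks in release builds.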
        
         | bigmadshoe wrote:
         | Even standard assertions work as a version of this
        
           | thesuperbigfrog wrote:
           | Standard assertions certainly are better than keeping the
           | little proofs only in your head.
           | 
           | Many Design-by-Contract implementations are nicer than
           | standard assertions because they help to express intent
           | better and easily refer to the 'old' value of a parameter to
           | verify desired results.
        
       | bubblebeard wrote:
       | This is something I've practiced myself quite a lot in recent
       | years, and in my experience it's so much harder while at your
       | desk. Doing something else while letting your brain work really
       | helps. For me, that's usually going for a walk, taking a run in
        | the woods, or doing some repetitive household chore.
        
       | JohnMakin wrote:
       | I think I stumbled across a similar concept in the more difficult
       | post-grad classes I ended up in a long time ago. I began at some
       | point late in my undergrad doing math tests entirely in pen. I
       | didn't understand why, but it resulted in higher scores almost
       | always, and much neater scratchwork, which I had attributed to
       | the reason, but I think what was helping was something along the
       | lines of what this post is getting at.
       | 
       | What was helping me was that before I wrote a single expression,
       | I thought about it carefully in my head and where it would lead
       | before putting pen to paper, because I didn't want to make a
       | bunch of messy scratch out marks on it. Or, sometimes, I'd use a
       | healthy amount of throwaway scratch paper if allowed. Once my
       | path was fully formed in my head I'd begin writing, and it
       | resulted in far fewer mistakes.
       | 
       | I don't always take this approach to writing code but often I do
       | formulate a pretty clear picture in my head of how it is going to
       | look and how I know it will work before I start.
        
       | ge96 wrote:
        | Having been employed as a developer for almost 7 years now, I
        | remember taking a class, maybe discrete math, that I was not a
        | fan of, where you have problems like p -> q
       | 
        | Also, the farthest I got in math/physics was intro to quantum
        | mechanics, where the multiple pages needed to solve a single
        | problem lost me
       | Being a good programmer... I don't have a degree so I've never
       | really tried to get into FAANG. I also am aware after trying
       | Leetcode, I'm not an algorithm person.
       | 
       | What's funny is at my current job which it's a multi-national
       | huge entity thing but I have to try and push people to do code
       | reviews or fix small errors that make something look bad (like a
       | button being shorter than an input next to it).
       | 
        | I am self-aware about my true skill: I can make things, but I don't
       | think I'd ever be a John Carmack. If you follow a framework's
       | pattern are you a good developer? I can see tangible metrics like
       | performance/some slow thing, someone better makes it faster.
       | 
       | Recently someone forked a repo of a hardware project I made. It's
       | fun watching them change it, to understand what I wrote and then
       | change it to fit their needs.
       | 
       | When I see my old code I do recognize how it was verbose/could be
       | much simpler.
        
       | pclowes wrote:
       | The best way I have found to integrate this approach is Test
       | Driven Development.
       | 
        | When done well, every cycle (write a test, watch it fail, then
        | write the barest amount of code you think will make it pass) is
        | a mini-proof. Your test setup and assertions are what cover
        | your pre/post conditions. Base cases are the invariant.
       | 
       | The key here is to be disciplined, write the simplest test you
       | can, see the test fail before writing code, write the smallest
       | amount of code possible to make the test pass. Repeat.
       | 
       | The next level is how cohesive or tightly coupled your tests are.
       | Being able to make changes with minimal test breakage "blast
       | radius" increases my confidence of a design.
        
         | swat535 wrote:
         | TDD is also great for playing around with ideas and coming up
         | with a clean interface for your code. It also ensures that your
         | code is testable, which leads to improved readability.
         | 
         | You'll know quickly where you're going wrong because if you
         | struggle to write the test first, it's a symptom of a design
         | issue for example.
         | 
         | That being said, I wouldn't use it as dogma, like everything
         | else in CS, it should be used at the right time and in the
         | right context.
        
           | pclowes wrote:
           | I agree on the Dogma aspect. Plenty of times I have abandoned
           | it. However, I did find it very helpful to spend my first
            | couple years in a strict Extreme Programming (XP) shop. The
            | rigor early on was very beneficial and it's a safety net for
           | when I feel out of my depth in an area.
           | 
           | I tend to go the other way, I don't use TDD when I am playing
           | around/exploring as much as when I am more confident in the
           | direction I want to go.
           | 
           | Leaving a failing test at the end of the day as a breadcrumb
           | for me to get started on in the morning has been a favored
           | practice of mine. Really helps get the engine running and
           | back into flow state first thing.
           | 
           | The dopamine loop of Red -> Green -> Refactor also helps
           | break through slumps in otherwise tedious features.
        
         | saurik wrote:
         | A test is not a proof, and you can prove something works
         | without ever even running it. There are also properties of a
          | system which are impossible to test, or are so non-
          | deterministic that a test will only fail once every million
          | times the code is executed. You really need to just learn to
         | prove stuff.
         | 
         | The most complex piece of code I have ever written, as a
         | relevant story, took me a month to prove to everyone that it
         | was correct. We then sent it off to multiple external auditors,
         | one of which who had designed tooling that would let them do
         | abstract interpretation with recursion, to verify I hadn't made
         | any incorrect assumptions. The auditors were confused, as the
         | code we sent them didn't do anything at all, as it had a stupid
         | typo in a variable name... I had never managed to figure out
         | how to run it ;P. But... they found no other bugs!
         | 
          | In truth, the people whom I have met who are, by far, the
         | worst at this, are the people who rely on testing, as they seem
         | to have entirely atrophied the ability to verify correctness by
         | reading the code and modeling it in some mathematical way. They
         | certainly have no typos in their code ;P, but because a test
         | isn't a proof, they always make assumptions in the test suite
         | which are never challenged.
        
           | pclowes wrote:
           | Interesting, could you show me a formal proof that can't be
            | expressed in logic (i.e., code) and then tested?
           | 
           | My thought here is that since proofs are logic and so is code
           | you can't have a proof that can't be represented in code. Now
           | admittedly this might look very different than typical say
           | JUnit unit tests but it would still be a test validating
           | logic. I am not saying every system is easily testable or
           | deterministic but overall, all else equal, the more tested
           | and testable a system is the better it is.
           | 
           | IME things that are very hard to test are often just poorly
           | designed.
        
             | aljgz wrote:
             | Consider a function that gets an array of integers and a
             | positive number, and returns the sum of the array elements
             | modulo the number. How can we prove using tests, that this
             | always works for all possible values?
             | 
             | Not discounting the value of tests: we throw a bunch of
             | general and corner cases at the function, and they will
             | ring the alarm if in the future any change to the function
             | breaks any of those.
             | 
             | But they don't prove it's correct for all possible inputs.
        
               | pclowes wrote:
               | I would lean towards types and property testing here
               | using tools like Coq.
        
               | aljgz wrote:
               | Can you elaborate?
        
             | layer8 wrote:
             | Tests can generally only test particular inputs and/or
             | particular external states and events. A proof abstracts
             | over all possible inputs, states, and events. It proves
             | that the program does what it is supposed to do regardless
             | of any particular input, state, or events. Tests, on the
             | other hand, usually aren't exhaustive, unless it's
             | something like testing a pure function taking a single
             | 32-bit input, in which case you can actually test its
             | correctness for all 2^32 possible inputs (after you
             | convinced yourself that it's truly a pure function that
             | only depends on its input, which is itself a form of
             | proof). But 99.99% of code that you want to be correct
             | isn't like that.
             | 
             | As an example, take a sorting procedure that sorts an
             | arbitrarily long list of arbitrarily long strings. You
             | can't establish just through testing that it will produce a
             | correctly sorted output for all possible inputs, because
             | the set of possible inputs is unbounded. An algorithmic
             | proof, on the other hand, can establish that.
             | 
             | That is the crucial difference between reasoning about code
             | versus merely testing code.
        
               | pclowes wrote:
               | I agree with that but I would say that if I required
               | formal verification of that kind I would move the proof
               | based rationale into the type system to provide those
               | checks.
               | 
               | I would add Tests can be probabilistically exhaustive (eg
               | property based testing) and answer questions beyond what
               | proof based reasoning can provide ie. is this sorting of
               | arbitrary strings efficient and fast?
        
               | joseneca wrote:
               | Proofs are arguably still better than tests at evaluating
               | efficiency, at least for smaller components/algorithms in
               | isolation. While there are cases where constant factors
               | that can't be described well in a proof matter, in most
               | cases, the crucial element of an algorithm's efficiency
               | lies in how the complexity scales, which can be proven in
               | the vast majority of cases. On the other hand, relying
               | solely on benchmarking introduces a lot of noise that can
               | be difficult to sort through.
        
           | Kambing wrote:
           | > A test is not a proof
           | 
           | Actually, a test _is_ a proof! Or more specifically, a
           | traditional test case is a narrow, specific proposition. For
           | example, the test `length([1, 2, 3]) = 3` is a proposition
           | about the behavior of the `length` function on a concrete
           | input value. The proof of this proposition is _automatically
           | generated_ by the runtime, i.e., the proof that this
           | proposition holds is the execution of the left-hand side of
           | the equality and observing it is identical to the right-hand
           | side. In this sense, the runtime serves as an automated
           | theorem prover (and is, perhaps, why test cases are the most
           | accessible form of formal reasoning available to a
           | programmer).
           | 
           | What we colloquially consider "proof" are really more
           | abstract propositions (e.g., using first-order logic) that
           | require reasoning beyond simple program execution. While the
           | difference is, in some sense, academic, it is important to
           | observe that testing and proving (in that colloquial sense)
           | are, at their core, the same endeavor.
        
             | JohnKemeny wrote:
             | That is stretching the definitions of proofs.
             | assert random() != 0.
             | 
             | QED
        
         | GuB-42 wrote:
         | I am not a fan of Test Driven Development, not at all.
         | 
         | Having your invariants and pre/post conditions correct is not
         | enough. You also need to do the right thing. For example, you
         | have a function that adds two durations in the form hh:mm:ss,
          | you have mm < 60 and ss < 60 as an invariant, testing it is a
         | good thing, but it won't tell you that your addition is
         | correct. Imagine your result is always 1s too much, you can
         | even test associativity and commutativity and it will pass.
         | Having these correct is necessary but not sufficient.
         | 
         | Problem is, when you write tests first, especially tight, easy
          | to run unit tests, you will be tempted to write code that passes
         | the tests, not code that does the right thing. Like throwing
         | stuff at your tests and see what sticks.
         | 
         | I much prefer the traditional approach of first solving the
          | problem the best you can, which may or may not involve thinking
         | about invariants, but always with the end result in mind. And
         | only when you are reasonably confident with your code, you can
         | start testing. If it fails, you will have the same temptation
         | to just pass the test, but at least, you wrote it at least once
         | without help from the tests.
         | 
         | Maybe that's just me, but when I tried the "tests first"
         | approach, the end result got pretty bad.
        
           | JohnKemeny wrote:
           | You've missed the most important point of TDD--and indeed, of
           | tests.
           | 
           | Tests do not ensure that your functions are correct; they
           | ensure that you are alerted when their behavior changes.
           | 
           | Second, TDD is a way to force dogfooding: having to use the
           | functions you're going to write, before you write them, helps
           | you design good interfaces.
        
             | GuB-42 wrote:
             | > Tests do not ensure that your functions are correct; they
             | ensure that you are alerted when their behavior changes.
             | 
             | I agree with that part and I am not against tests, just the
             | idea of writing tests first.
             | 
             | > helps you design good interfaces
             | 
             | I am sure plenty of people will disagree but I think
             | testability is overrated and leads to code that is too
             | abstract and complicated. Writing tests first will help you
             | write code that is testable, but everything is a
             | compromise, and to make code more testable, you have to
             | sacrifice something, usually in the form of adding
             | complexity and layers of indirection. Testability is good
             | of course, but it is a game of compromises, and for me,
             | there are other priorities.
             | 
             | It makes sense at a high level though, like for public
             | APIs. Ideally, I'd rather write both sides at the same
             | time, as to have a real use case rather than just a test,
             | and have both evolve together, but it is not always
             | possible. In that case, writing the test first may be the
             | next best thing.
        
               | MoreQARespect wrote:
               | >It makes sense at a high level though
               | 
               | This is the way I've always done TDD.
               | 
               | I don't think it makes sense to do it any other way. If a
               | test case doesn't map on to a real scenario you're trying
               | to implement the code for it doesn't make any sense to
               | write it.
               | 
               | I find that people who write the test after tend to miss
               | edge cases or (when they're trying to be thorough) write
               | _too many_ scenarios - covering the same code more than
               | once.
               | 
               | Writing the test first and the code that makes it pass
               | next helps to inextricably tie the test to the actual
               | code change.
               | 
               | >but it is not always possible
               | 
               | I don't think I've written any production code in years
               | where I gave up because it was intrinsically not
               | possible.
        
               | GuB-42 wrote:
               | > I don't think I've written any production code in years
               | where I gave up because it was intrinsically not
               | possible.
               | 
               | What I meant by "not possible" is writing both sides of
               | the API at the same time. For example, you write a
               | library for overlaying maps on video feeds, it is good if
               | you are also writing the application that uses it. For
               | example a drone controller. So in the early phase, you
               | write the library specifically for your drone controller,
               | changing the API as needed.
               | 
               | But sometimes, the drone controller will be made by
               | another company, or it may be a project too big not to
               | split up, that's the "not possible" part. And without a
               | clear, in control use case, you have to make guesses, and
               | writing tests can help make good guesses.
        
           | pclowes wrote:
           | The flow here for me is if the code is doing the wrong thing
           | I:
           | 
           | - Write a test that demonstrates that it is doing the wrong
           | thing
           | 
           | - Watch it fail
           | 
           | - Change the code to do the right thing
           | 
           | - Ensure the test passes
           | 
           | And in return I get regression prevention and verified
           | documentation (the hopefully well structured and descriptive
           | test class) for almost free.
           | 
           | I don't think any amount of testing absolves the programmer
           | from writing clear, intention-revealing code that is correct.
            | TDD is just a tool that helps ensure the programmer's
            | understanding of the code evolves with the code. There have
            | been so many times where I write code and expect a test to
           | fail/pass and it doesn't. This detects subtle minute drift in
           | understanding.
        
           | gitremote wrote:
           | I do TDD and write proofs as tests. TDD practitioners never
           | said TDD is a substitute for thinking.
           | 
           | > Problem is, when you write tests first, especially tight,
           | easy to run unit tests, you will be tempted to write code
           | that pass the tests, not code that does the right thing. Like
           | throwing stuff at your tests and see what sticks.
           | 
           | I never had that problem, but I knew how to code before I
           | started TDD.
        
       | init1 wrote:
       | Instead of just thinking about the proof in your mind or putting
       | it in comments you can actually embed pre/post conditions and
       | compile time proofs in actual code using ADA SPARK. You use Pre
       | and Post aspects for subprograms and pragma Invariant for loops
        | and such constructs. The SPARK toolchain will run provers against
        | your code and report whether the proofs hold.
       | 
       | https://www.youtube.com/watch?v=tYAod_61ZuQ
        
       | linguistbreaker wrote:
       | I wonder if there is a rigorous/formal method to identify
       | 'monotonicity' in code? Seems like it should be
       | straightforward...
        
       | iamwil wrote:
       | I really liked this post. Does anyone have other blog posts that
       | they liked that articulates a taste for good engineering?
        
       | mreid wrote:
       | Another thing I learnt from my math degree that I find helps a
       | lot when programming and software engineering more generally is
       | *defining your terms*.
       | 
       | So many communication issues on teams occur when people are using
       | the same words to mean different things.
        
         | segmondy wrote:
         | lookup "ubiquitous language" it's a phrase from domain driven
         | design
        
           | mreid wrote:
           | Thanks, I hadn't seen that term before.
           | 
           | Just to be clear, I wasn't claiming that "communicating
           | clearly" is a new idea in software engineering, I'm mainly
           | commenting on how effective embracing it can be.
           | 
           | When doing math, pretty much every term is "load-bearing" in
           | that arguments will make use of specific aspects of a concept
           | and how it relates to other concepts.
           | 
           | If you look at most graduate-level math textbooks or papers,
           | they typically start with a whole bunch of numbered
           | definitions that reference each other, followed by some
           | simple lemmas or propositions that establish simple
           | relationships between them before diving into more complex
           | theorems and proofs.
           | 
           | The best software projects I've seen follow a roughly similar
           | pattern: there are several "core" functions or libraries with
           | a streamlined API, good docs, and solid testing; on top of
           | that there are more complex processes that treat these as
           | black-boxes and can rely on their behavior being well-defined
           | and consistent.
           | 
           | Probably the common thread between math and programming is
           | both lean heavily on abstraction as a core principle.
        
       | wanderingmind wrote:
       | One thing that I feel that is discussed less is, what are the
       | high level constraints and levers that can be set to enable a
        | group of programmers to become better programmers. For example,
       | how much impact does architecture have? bug tracing, choice of
       | language, tools. People handwave and say that one is better than
       | another because of <reason 1>, <reason 2>, but to be able to
       | explain how each of this choice impacts the code in a semi-
       | rigorous way by explaining all the way of how it impacts
       | individual code components and their interaction (as a graph or a
       | tree) would be immensely helpful. If anyone knows of resources
       | like that, please let us know
        
         | hansvm wrote:
         | I think we're still at the stage of "this team lead makes shit
         | happen" rather than "this strategy makes shit happen," with a
         | lot of the nuances being fuzzy and/or wrong.
         | 
         | Tossing out a few observations:
         | 
         | 1. People make mistakes. Your strategy needs to account for
         | that (resilient runtime, heavy type-checking, convenient and
         | often-used REPL, etc).
         | 
         | 2. Above a certain project size, nobody remembers everything.
         | Your strategy needs to account for that (excellent multifaceted
         | documentation, disallow long-range interactions in your code,
         | etc).
         | 
         | 3. Dependencies have a vastly higher cost than you expect, even
         | in the short term. Plan for that (vendor more things, in-house
         | more things, allocate resources to dependency management, cut
         | scope, etc).
         | 
         | I could go on. The core point is that certain properties of
         | projects are "plainly" true to most people who have been around
         | any length of time. I don't think we're yet at a point where we
         | can often predict anything meaningful about some specific new
         | technology, but a mental framework of "this succeeded/failed
         | _because_ of {xyz}" helps tremendously in figuring out if/how
         | that new idea will fit into your current workplace.
        
         | skydhash wrote:
         | The objective is always to attain some kind of synergy between
         | the business logic and, as others have said, algorithm
         | simplicity. The best way to go about it is to start from the
         | chosen language/environment/library, and build out a DSL that
         | can express the business rules. That's the basic premise of
         | DDD, but where all this got complicated is splitting into
         | contexts and the translation between their boundary.
         | 
         | I believe programmers should learn a bit about programming
         | language theory, mostly the bits about what is a language. Then
         | how to recognize the things behind the labels, and how they
         | morph (either from an operation or from a translation between
         | contexts). Then it's a matter of using the knowledge to move
         | from the environment to a DSL that can express the business
         | rules.
         | 
         | Architecture is the draft for the above, defining the starting
         | point and a direction. And design is where you start to make
          | decisions following the plan. For someone who has an idea of
         | the destination, they can judge both.
        
         | joseneca wrote:
         | If only there was a clear answer to this. Software engineering,
         | as it has appeared to me, has always seemed to evolve with
          | opinions more so than tangible improvements. That being said,
         | some practices are almost universally agreed to lead to better
         | code, like defensive programming (proving your assertions
         | relating to program conditions in the actual code), uniform
         | documentation, and, in general, taking your time in
         | implementing structures that will be used throughout the
         | codebase. Formalizing your logic in proofs can be one part of
         | that.
        
       | markus_zhang wrote:
       | I'm too old for that, so I write the algorithm on paper and run
       | it with some numbers, also on paper. I don't have the patience to
       | do it often but whenever I did I usually got it done in one or
       | two tries.
        
       | lliamander wrote:
       | This is similar to an intuition I've had about what it means to
       | program "algorithmically". We often draw a distinction between
       | "algorithms" and "business logic", but fundamentally they are the
       | same thing. They are both plans of the steps necessary to
       | accomplish a goal. The only difference, in my mind, is the style
       | in which the program is written. To program "algorithmically"
       | means to take steps to make undesirable program states impossible
       | to represent.
       | 
       | - In the case of search or sort algorithms, where the primary
       | concern is the speed of computation, undesirable states would be
       | performing unnecessary or duplicate computations.
       | 
       | - In the case of encryption algorithms, undesirable states would
       | be those that leak encrypted data.
       | 
       | - In the case of an order shipping and fulfillment system, an
       | undesirable state would be marking an order as fulfilled when not
       | all of the items have been delivered.
       | 
       | The more care that is taken to prevent undesirable states, the
       | more the program takes on an algorithmic style. And the only way
       | you can be sure that those undesirable states are impossible is
       | to think in terms of proofs and invariants.
        
       | alganet wrote:
       | Real programmers use butterflies.
       | 
       | https://xkcd.com/378/
       | 
       | ---
       | 
       | Let me elaborate: there is a huge history of dubious allegations
       | of what constitutes "a real programmer" or stuff that makes you
       | "a better programmer".
       | 
        | Combobulation and discombobulation of McGuffins is often the
        | best analogy, though. Not to discredit other kinds of thinking,
        | but discrediting them, that's what this is all about. When in
        | doubt, showing the code is often what works.
        
       | rufus_foreman wrote:
       | You want to be a better programmer? Ignore these ridiculous
       | people and write code every day.
       | 
       | That's all there is.
       | 
       | You want to get better at something? Do it all day every day.
        
       | Aziell wrote:
       | I often run through the code in my head first, especially for
       | things like binary search where it's easy to mess up the logic.
       | It feels like a quick mental check, sometimes even faster than
       | writing tests.
       | 
       | I'm curious though. When you're coding, do you actually pause and
       | think through the logic first, or is it more like writing and
       | fixing along the way?
        
       | rramadass wrote:
       | Folks might find the following two books useful in this regard;
       | 
       |  _How to Think About Algorithms_ by Jeff Edmonds.
       | 
       |  _Persuasive Programming_ by Jerud Mead and Anil Shinde.
        
       | mparramon wrote:
       | They taught me in school to think about preconditions, invariant
       | and postconditions for loops. Then I switched to only using
       | functional affordances like .map() and .each(). I am much happier
       | now.
        
       | begueradj wrote:
       | > sketch a proof in your head as you go that your code will
       | actually do what you want it to do.
       | 
        | Could anyone please explain the meaning to me? I can't get it.
        
         | saghm wrote:
         | Everyone writes code that has bugs in it, but no one intends to
         | write code that has bugs. The bugs occur because there's a
         | disparity between what you think your code is doing and what
          | your code is actually doing. The quote you give is a
          | suggestion for how to reduce how often this occurs by
         | considering why you should be confident about your intent being
         | properly conveyed by the code you've written.
         | 
         | It might help to think of this almost as debugging your program
         | as you write it rather than only after you run it. Debugging a
         | program is essentially the same process of trying to figure out
          | why the program isn't doing what you think it's doing. While
         | there's information you can get by inspecting a running program
         | or its outputs, there's often some amount of consideration you
         | need to do in order to figure out why you think certain
         | behavior might be happening. There's no fundamental reason this
         | consideration can't happen before you actually run the program;
         | even as you're writing the code, you can think about all the
         | ways it could potentially go wrong, and why you're confident
         | that this isn't the case. If you can't come up with a strong
         | argument for that, it's a good sign you might need to
         | reconsider your approach!
        
           | begueradj wrote:
           | It makes sense. Thank you for the effort.
        
       | graycat wrote:
       | Maybe a related practice (what I do): Write actual proofs or good
       | documentation that contains essentially proofs and combine that
       | with Knuth's _Literate_ programming.
       | 
        | For what is to be _proved_, try to use _divide and conquer_ to
       | make the whole proof consist of separate steps, each of which
       | gets a short proof and each of those steps might become a
       | _subroutine_ (whatever the terminology and functionality of the
       | programming language and environment).
        
       | agnishom wrote:
       | > write little proofs in your head
       | 
       | This should be obvious, right? This is how you are supposed to
       | write code. You are supposed to have a simple proposition about
       | what each chunk of your code does.
        
       | w3news wrote:
       | To be a better programmer, write little proofs in your code. We
       | call that tests and types, proof that it should do what you
       | expect. Especially when writing tests first, then types, then the
       | code. Start with a test per acceptance criteria, that well
       | describes what it should do, and is clear what you send and
       | receive. Also in an API you can describe the API in OpenAPI or
       | GraphQL with all the properties and types, and you can validate
       | on runtime the data on your specification, that specification is
       | than also a proof that the application does what you described in
       | your specification.
       | 
       | So OpenAPI/GraphQL, tests and types and proof that the system
       | works like intended. Always start with that before writing code.
       | The specification is a solid base that doesn't change a lot if
       | you start with it. How the code works, you can refactor it, and
       | proof with the specification if it sill does the same like
       | before.
       | 
       | Code is less important than the specifications.
        
         | bmn__ wrote:
         | The five properties used as headings in the article are able to
         | be expressed in good type systems. This way a huge part of the
         | specification becomes code, and the compiler guarantees to
         | uphold the expressed properties in many or all cases. We should
         | strive to have a programming future where this idea is normal
         | and commonplace.
         | 
         | OpenAPI and GraphQL have exceedingly poor types, and will not
         | bring us there unless they make a fifty year leap forward.
        
       | shahzaibmushtaq wrote:
       | Almost every programmer gets better through their own trial and
       | error.
       | 
       | For me, the best "trick" is to start with pseudocode.
       | 
       | The _Isolation_ part is understandable and supports the
       | philosophy:
       | 
       | > When requirements change, you extend the behavior of [your
       | program] by adding new code, not by changing old code that
       | already works.
        
       | derelicta wrote:
        | I wish we had learned about proofs in my degree. It might be
        | part of the graduate program, though.
        
       | amelius wrote:
       | Is this how LLMs (should) write code?
        
       | irchans wrote:
       | This might be the best article that I have read this year. It
       | might change the way that I program. I really want to write up
       | examples for each of the ideas presented: Monotonicity, Pre- and
       | post-conditions, Invariants, Isolation, Induction, and Proof-
       | affinity as a quality metric. I think it's easy to find these
       | ideas in my code, but I want to try to practice writing proofs
       | "in my head" as I code to make my code better. It is not apparent
       | to me now how this will work, but if I write up some examples, I
       | imagine it will become clearer.
        
       | runeblaze wrote:
       | I literally don't even write proofs in my head. Every time I
       | write algo I just ask Cursor for a Lean/Rocq proof of
       | correctness. No tests. No test coverage. The sheer power of type
       | theory
        
       ___________________________________________________________________
       (page generated 2025-07-16 23:01 UTC)