[HN Gopher] To be a better programmer, write little proofs in yo...
___________________________________________________________________
To be a better programmer, write little proofs in your head
Author : mprast
Score : 433 points
Date : 2025-07-15 17:05 UTC (1 days ago)
(HTM) web link (the-nerve-blog.ghost.io)
(TXT) w3m dump (the-nerve-blog.ghost.io)
| hiAndrewQuinn wrote:
| Oh, I have a relevant and surprisingly simple example: Binary
| search. Binary search and its variants leftmost and rightmost
| binary search are surprisingly hard to code correctly if you
| don't think about the problem in terms of loop invariants. I
| outlined the loop invariant approach in [1] with some example
| Python code that was about as clear and close to plain English as
| I could get.
|
| Jon Bentley, the writer of Programming Pearls, gave the task of
| writing an ordinary binary search to a group of professional
| programmers at IBM, and found that _90%_ of their implementations
| contained bugs. The most common one seems to have been
| accidentally running into an infinite loop. To be fair, this was
| back in the day when integer overflows had to be explicitly
| guarded against - but even then, it's a surprising number.
|
| [1]: https://hiandrewquinn.github.io/til-site/posts/binary-
| search...
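For readers who don't follow the link, the invariant-driven leftmost search might look like the following sketch (illustrative names, not the exact code from [1]; the asserts are the loop invariants, checked at the top of every iteration):

```python
def leftmost_bsearch(L, T):
    """Return the index of the first element of sorted list L that is >= T."""
    l, r = 0, len(L)
    while l < r:
        # Loop invariants, restated as runnable asserts. They cost O(n)
        # per iteration, so comment them out in production.
        assert all(x < T for x in L[:l])    # everything left of l is < T
        assert all(x >= T for x in L[r:])   # everything from r on is >= T
        mid = (l + r) // 2
        if L[mid] < T:
            l = mid + 1
        else:
            r = mid
    return l

print(leftmost_bsearch([1, 2, 2, 3], 2))  # 1: first index holding a 2
```

Once the invariants hold and the interval shrinks on every pass, termination and correctness follow almost mechanically, which is the point of the approach.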
| qsort wrote:
| To be fair you're picking an example that's extremely finicky
| about indices. It's probably the hardest basic algorithm to
| write down without errors. Up there with Hoare partition.
| noosphr wrote:
| You're right, but invariants are the most bang for the buck
| you can get out of formal methods without having to use a DSL
| built for computer assisted proofs.
|
| That they work for binary search is a very strong signal to
| people not familiar with them that they work for nearly
| everything (they do).
| fragmede wrote:
| A research paper from Google in 2006 noted that "almost all"
| binary search (and merge sort) implementations contain a bug
| (and had for decades), so 90% seems impressive in the face of
| that.
|
| https://research.google/blog/extra-extra-read-all-about-it-n...
| hiAndrewQuinn wrote:
| The bug in that paper is actually the very integer overflow
| bug I was referring to. Given that Jon himself made that
| error in the "definitive" implementation, it seems unlikely
| that he would have spotted it in the 10% of implementations
| he considered correct. Under Google's stricter criteria it
| seems likely to me that not a single person got the search
| implementation truly correct enough.
| jansan wrote:
| In Javascript that Google "bug" only occurs if an array has
| more than 4503599627370495 items. You may run into memory
| issues before running into that bug.
| sedatk wrote:
| I knew about the infamous Binary Search bugs in books, and I
| dared to write the first bug-free implementation in my book,
| very carefully. Still, it ended up having bugs. :) Luckily,
| Manning's early access program let me fix them before printing.
| edanm wrote:
| Oh interesting, where can we read about the bug that was
| there?
| sedatk wrote:
| It was quite silly. The most common bug with binary search
| is calculating the middle point: `(start + end) / 2`. It
| looks straightforward, but "start + end" could cause an
| integer overflow for large data structures, and that would
| break the function. IIRC, my handling for that case also
| turned out to be incorrect, and one of the readers caught
| it. I don't remember if there were other bugs, there could
| be, but that one hurt the most :)
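The overflow is easy to demonstrate even from Python by wrapping the sum to 32 bits the way a fixed-width language would (a rough simulation: Java wraps like this, while in C signed overflow is undefined behavior):

```python
import ctypes

def mid_naive(lo, hi):
    # (lo + hi) // 2, but with the sum wrapped to a signed 32-bit value,
    # as a fixed-width language would compute it
    return ctypes.c_int32(lo + hi).value // 2

def mid_safe(lo, hi):
    # the standard fix: the difference can't overflow when 0 <= lo <= hi
    return lo + (hi - lo) // 2

lo, hi = 2**30, 2**31 - 1     # plausible bounds while searching a huge array
print(mid_naive(lo, hi) < 0)  # True: the sum wrapped to a negative "index"
print(mid_safe(lo, hi))       # 1610612735, a valid index between lo and hi
```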
| edanm wrote:
| Ah yes this is the "classic" bug that was in the Java
| standard library undiscovered for twenty years, iirc.
|
| I find this topic really interesting, thanks for sharing.
| I doubt I could code a binary search without any bugs
| from scratch :)
| None4U wrote:
| What are the odds you write a binary search that you'll use
| more than once instead of just running it and writing down the
| result?
| JohnKemeny wrote:
| I once needed to write binary search for a microcontroller in
| C (no libraries). The routine ran about every hour, with appx
| 4M data points.
| kamaal wrote:
| I once worked lots of projects in C, microcontrollers,
| embedded systems etc. It was a start up.
|
| Every time I needed to write something algorithmically
| demanding, I could do it in a day or two. I'm not into
| Leetcoding, or competitive coding.
|
| Most regular everyday programmers can work these things out
| in one or two workdays.
|
| It's definitely not like the competitive programmers claim:
| that when the time comes to write something, you won't have
| access to time, the internet, or even an IDE, and will have
| to write the code in a Google Doc (which itself needs an
| internet connection - when I pointed this out in interviews,
| they didn't like it).
| groby_b wrote:
| About 100%?
|
| When do you write code that _doesn't_ need to search?
| (Unless you hand it all to the database, in which case...
| sure, you're good)
| powersnail wrote:
| That's a very surprising angle of questioning. Are you
| writing some sort of compile-time-only programs?
| n4r9 wrote:
| I have an object representing piecewise linear functions,
| defined by a sequence of (x_i, y_i) tuples. It has a method
| to evaluate y for a given value of x. The first step is to
| find the least i for which x_i is greater than or equal to x:
| this uses a binary search. Initially, I stored the x values
| in a separate array and used dotnet's inbuilt
| Array.BinarySearch. Later I removed this array to save on
| memory and it now runs a handbuilt binary search on the first
| items of the array of points instead.
| monkeyelite wrote:
| The C++ standard library phrases binary search as a
| partitioning problem - finding the index where a predicate goes
| from false to true, which is helpful for me.
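That phrasing (C++'s std::partition_point) can be sketched in Python as a single predicate-based loop; leftmost and rightmost searches both fall out of it by choice of predicate (the function name and signature here are illustrative):

```python
def partition_point(lo, hi, pred):
    """Smallest i in [lo, hi] with pred(i) true, assuming pred is
    monotone over the range: false, ..., false, true, ..., true.
    Returns hi if pred is false everywhere."""
    while lo < hi:
        mid = (lo + hi) // 2
        if pred(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

L = [1, 3, 3, 5, 8]
lower = partition_point(0, len(L), lambda i: L[i] >= 3)  # lower_bound: 1
upper = partition_point(0, len(L), lambda i: L[i] > 3)   # upper_bound: 3
```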
| hoseja wrote:
| Recalling the subtle difference between std::lower_bound and
| std::upper_bound and where it matters is somewhat annoying
| before it becomes muscle memory.
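Python's bisect module exposes the same pair under arguably more memorable names, which can serve as a mnemonic: bisect_left is lower_bound, bisect_right is upper_bound.

```python
from bisect import bisect_left, bisect_right

L = [1, 3, 3, 3, 5]

# bisect_left (~ std::lower_bound): first index whose element is >= 3
print(bisect_left(L, 3))   # 1

# bisect_right (~ std::upper_bound): first index whose element is > 3
print(bisect_right(L, 3))  # 4

# together they delimit the run of equal elements (~ std::equal_range)
print(bisect_right(L, 3) - bisect_left(L, 3))  # 3 threes
```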
| scuol wrote:
| The way I always remember leftmost and rightmost binary search
| (the C++ equivalents-ish of lower_bound and upper_bound) is to
| always have a "prior best" and then move the bounds according
| to the algo
|
|       while (l < r)
|       {
|           // find the midpoint (overflow-safe form)
|           auto mp = l + (r - l) / 2;
|
|           if (nums[mp] == target) {
|               prior = mp;
|       #ifdef upper_bound
|               // move the left bound up, maybe there's more
|               // up there we can look for!
|               l = mp + 1;
|       #else
|               // lower bound: we found an instance of target,
|               // but let's look in the exclusive left half a bit more
|               r = mp - 1;
|       #endif
|           }
|       }
|
| excuse the terrible formatting, it's been a long day grinding
| leetcode after getting laid off...
| hiAndrewQuinn wrote:
| Godspeed, fellow LeetCoder. I'm not currently grinding but I
| still have my handful of practice problems in active Anki
| rotation.
|
| I have my rightmost code at part III of the miniseries, [1].
| It looks quite similar, but I save the -1 for the very end
| return.
|
|       def rightmost_bsearch(L, T):
|           l, r = 0, len(L)
|           while l < r:
|               # Loop invariants. Always true! Comment out in production!
|               for somewhat_smol in L[0:l]:
|                   assert somewhat_smol <= T  # Note: Weak inequality now.
|               for lorg in L[r:len(L)]:
|                   assert lorg > T
|               mid = (l + r) // 2
|               if L[mid] > T:
|                   r = mid
|               else:
|                   l = mid + 1
|           return r - 1  # return the first element AFTER L[r:len(L)].
|
| (It should technically be BEFORE, I guess. If it helps all
| rightmost bsearches are also leftmost bsearches on the
| reversed array, so AFTER is secretly not wrong)
|
| [1]: https://hiandrewquinn.github.io/til-site/posts/binary-
| search...
| godelski wrote:
| This makes for an interesting test. I checked Claude Sonnet,
| just for the hell of it.
|
| Prompt: Please write a bug free binary search in python
|
| Answer:
|
|       def binary_search(arr, target):
|           """
|           Performs binary search on a sorted array.
|
|           Args:
|               arr: A sorted list of comparable elements
|               target: The element to search for
|
|           Returns:
|               The index of target if found, -1 otherwise
|           """
|           left = 0
|           right = len(arr) - 1
|           while left <= right:
|               # Avoid potential overflow by using this formula
|               mid = left + (right - left) // 2
|               if arr[mid] == target:
|                   return mid
|               elif arr[mid] < target:
|                   left = mid + 1
|               else:
|                   right = mid - 1
|           return -1
|
|       # Example usage and test cases
|       if __name__ == "__main__":
|           test_array = [1, 3, 5, 7, 9, 11, 13, 15]
|           print(binary_search(test_array, 7))   # Output: 3
|           print(binary_search(test_array, 1))   # Output: 0
|           print(binary_search(test_array, 15))  # Output: 7
|           print(binary_search(test_array, 4))   # Output: -1
|           print(binary_search([], 5))           # Output: -1
|           print(binary_search([5], 5))          # Output: 0
|           print(binary_search([5], 3))          # Output: -1
| hiAndrewQuinn wrote:
| Very good! One small callout:
| mid = left + (right - left) // 2
|
| This implementation detail is to my knowledge unnecessary in
| Python because Python's built-in int type has arbitrary-
| precision integers. It's intended to avoid integer overflow
| in languages like C.
|
| Imagine, say, that left is 1 and right is 2^63 - 1. In Python
| left + right will just give you 2^63, no big deal. In C, left
| + right will overflow and produce undefined behavior; in
| practice it usually gives you I think -2^63, which is
| obviously going to screw up the bsearch. So the formula isn't
| wrong in Python, just slightly less idiomatic for the
| language.
|
| Python's interpreter may or may not be able to recognize and
| refactor the slight inefficiency of the extra arithmetic
| operation out. I will only point out that most of the time
| when we write our own bsearch it's because we want to
| optimize something really fundamental to the codebase. Any
| time you have to whip up your own algorithm it's prima facie
| evidence that that part of the code might be good to profile.
| naniwaduni wrote:
| Even in languages like C, it's mostly voodoo in practice
| unless you're using undersized indices anyway--you surely
| don't have anywhere near SIZE_MAX/2 elements.
|
| The underlying problem is sloppy use of int for indices.
| The roundabout calculation will still overflow at twice the
| size.
| namibj wrote:
| For scalar code that's fair, but keep in mind that with
| `gather`, you can vectorize it, and there you do worry
| about integer width as more width means fewer lanes.
| Though tbf you probably should use B+ Tree structures
| chunked to your bulk/cache fetch "line/page" size, and
| vectorized higher-than-radix-2 search within each node.
| mdnahas wrote:
| I read about this and started using binary search as my
| interview question. It worked well - about 2/3rds of highly
| credentialed applicants could not write a working
| implementation in 20 minutes. Most failures went into an
| infinite loop on simple cases! The ones who could write it
| usually did so quickly.
|
| I think part of the reason is that most people were taught a bad
| interface. Even the code on Wikipedia says "Set L to 0 and R to
| n-1". That is, R is an inclusive bound. But we've learned that
| for most string algorithms, it is better when your upper bound
| is an exclusive bound, that is, n.
|
| I've wanted to do an experiment testing that hypothesis. That
| is, ask a large number of people to write it with different
| function prototypes and initial calls and see how many buggy
| implementations I get with inclusive vs exclusive upperbound vs
| length.
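For comparison, a sketch of the exclusive-upper-bound prototype being argued for; with a half-open interval the empty range is simply l == r, there is no n-1 edge case, and the interval strictly shrinks on every pass:

```python
def bsearch(L, T):
    """Return an index of T in sorted L, or -1. r is an exclusive bound."""
    l, r = 0, len(L)           # candidates live in the half-open L[l:r]
    while l < r:
        mid = (l + r) // 2     # mid is always a valid index: l <= mid < r
        if L[mid] == T:
            return mid
        elif L[mid] < T:
            l = mid + 1        # interval strictly shrinks: no infinite loop
        else:
            r = mid
    return -1

print(bsearch([1, 3, 5, 7], 5))  # 2
print(bsearch([], 5))            # -1
```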
| andyjohnson0 wrote:
| > I read about this and started using binary search as my
| interview question. It worked well - about 2/3rds of highly
| credentialed applicants could not write a working
| implementation in 20 minutes.
|
| I'd be interested to know why you feel that this is a useful
| interview question.
|
| Implementing a well-known and widely implemented algorithm
| under stressful interview conditions wouldn't seem to me to
| demonstrate much about a candidate. In that situation I'd
| much rather be given a printout of some code and asked to
| talk around the implementation - how the code maps to the
| abstract algorithm, implementation trade-offs, readability,
| etc.
| mettamage wrote:
| It showcases they have done some leetcode. Or it showcases
| they can write code when someone is looking over your
| shoulder. There are company cultures where these skills are
| needed.
|
| Companies that don't have a culture like this but still ask
| these questions are indeed deluded.
|
| I would do the following:
|
| 1. Write a tax calculator that reads in data from a CSV, or
| a similar question to showcase programming ability.
|
| 2. Do a small paid project. If it was good enough, hire
| them. If it wasn't, then give feedback and do another one
| in a week, tell them to learn as much about the topic as
| possible.
|
| 3. Do a small paid project a week later. If their learning
| agility is high, then hire them still.
| Etheryte wrote:
| Based on your description, it sounds like a bad interview
| question. When otherwise capable candidates can't answer your
| trivia, maybe it isn't a useful signal? More than anything,
| it just sounds like you're selecting for people who practiced
| leetcode vs those who didn't.
| zozbot234 wrote:
| It's not whether you can solve it right away, it's how you
| _approach_ the problem solving process to reach a viable
| solution. If "try to prove that this code is actually
| correct" is part of that process, that's the kind of
| "trivia" that can really be widely applicable even in a
| working environment, in a way that random leetcode algos
| often aren't.
|
| I vote for making "how would you think about proving this
| correct?" a question in all algo-focused interviews.
| hiAndrewQuinn wrote:
| To be precise, Zoz said
|
| >about 2/3rds of highly _credentialed_ applicants
|
| Credentialed ≠ capable.
| hiAndrewQuinn wrote:
| Indeed, this is actually one reason why I like Python's
| closed-left, open-right syntax. It lets us sidestep the
| inclusive bound issue entirely, because, as noted in the
| post, for all 0 <= l <= r <= len(L) where L is a Python list,
|       L == L[0:len(L)]
|         == L[0:l] + L[l:len(L)]
|         == L[0:l] + L[l:r] + L[r:len(L)]
|
| I actually didn't like this syntax until I had to work this
| out. Now everyone else's approach seems silly to me.
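The identity in the parent comment is small enough to check exhaustively for a short list:

```python
L = [3, 1, 4, 1, 5]

# verify the slice identity for every valid pair 0 <= l <= r <= len(L)
for l in range(len(L) + 1):
    for r in range(l, len(L) + 1):
        assert L == L[0:l] + L[l:r] + L[r:len(L)]

print("identity holds for all 0 <= l <= r <= len(L)")
```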
| rockostrich wrote:
| Don't most slice methods behave this way? Javascript's
| `Array.prototype.slice` and Java's `Arrays.copyOfRange` both
| use a closed-left, open-right range, and Golang's slicing
| syntax is similar to Python's, except that the 3rd value sets
| the capacity of the resulting slice rather than the step
| value.
| Animats wrote:
| Now if we can get LLMs to do that, they might code better. Proofs
| are a lot of work, but might keep LLMs on track.
| mprast wrote:
| I was thinking the same thing!
| Nevermark wrote:
| In any other industry the "worker class" would be sabotaging
| the machines.
|
| In the merry software world, the challenges of improving our
| potential replacements are far too interesting and satisfying
| to leave room for any worry about consequences! :)
| bcheung wrote:
| The problem is the training corpus tends towards mediocre code.
| But with an agentic loop that analyzes the code against those
| criteria and suggests changes then I think it might be
| possible. I wouldn't try to get it to generate that right off
| the bat.
| AIPedant wrote:
| Most proof assistants do a good job with autogenerating a
| (deterministically correct) proof given a careful description
| of the theorem. Working on integrating this into mainstream
| languages seems much more worthwhile than training an LLM to
| output maybe-correct proof-seeming bits of text.
| purplesyringa wrote:
| IME, LLMs are completely incapable of reasoning about anything
| remotely difficult. All you'll get are proofs by assertion, not
| anything rigid you can trust.
| gregorygoc wrote:
| Writing correct proofs is hard. Program verification is hard. In
| my opinion, if you are hand-waving it, there's no benefit.
| Thinking about invariants and pre-post conditions is often
| unnecessary or greatly reduced if you write idiomatic code for
| the language and codebase. Check out "The Practice of
| Programming" by R. Pike and B. W. Kernighan. The motto is:
| simplicity, clarity, generality. I find it works really well in a
| day job.
|
| On a slightly related note... Competitive programming is
| surprisingly good at teaching you the right set of techniques to
| make sure your code is correct. You won't progress beyond certain
| stage unless you pick them up.
| wizzwizz4 wrote:
| Turn your first paragraph on its head:
|
| Appropriate abstractions (i.e., "idiomatic code for the
| language and codebase") make program verification easy. If you
| are hand-waving an appropriately-abstracted program, there's
| little benefit to thinking about loop invariants and pre-post
| conditions, since they don't exist at that level of generality:
| correct proofs follow directly from correct code.
| gregorygoc wrote:
| No, appropriate abstractions are insufficient for my
| argument. For example: there's one way to write an idiomatic
| loop in C and it inherits necessary invariants by
| construction.
|
| I highly recommend reading the book, it explains concept of
| writing idiomatic code way better than I ever could.
| chowells wrote:
| That's a really funny example, given how many bugs have
| been found in C programs because idiomatic loops are wrong
| in the edge cases.
|
| How do you idiomatically write a loop to iterate over
| signed ints from i to j (inclusive) in increasing order,
| given i <= j?
|
| What does that loop do when j is INT_MAX?
| wk_end wrote:
| I can imagine someone who sketches out little proofs in
| their head - or even on paper - missing that case too.
| It's easy to forget you're not doing normal arithmetic
| when doing arithmetic in C!
| chowells wrote:
| Yeah, it's a hard case in general. But C's idioms really
| don't encourage you to think about it. You really need to
| default to a loop structure that checks the termination
| condition after the loop body but before the increment
| operation for inclusive end coordinates.
|
| It's easy to think that's what do/while is for, but it
| turns out to be really hard to do the increment operation
| after the conditional, in general. What you really want
| is a loop structure with the conditional in the middle,
| and the only general purpose tool you get for that is a
| break. C (or any language with similar syntax) really
| doesn't have an idiom for doing this correctly.
| wizzwizz4 wrote:
| This may be hubris, but...
|
|       int i = start;
|       do thing_with(i); while (i++ <= end);
| chowells wrote:
| I did consider that, but I wrote "in general" for a
| reason. It works very specifically in the case of "add
| one" or "subtract one", but it doesn't work with anything
| more complicated, like chasing pointers or
| adding/subtracting more than one at a time.
|
| You could write functions to do the update and return the
| old value so you could use them in the same way, but I
| don't like this either. This is mostly because it orders
| the termination check and the update logic the wrong way
| around. If there's IO involved in checking for the next
| thing, for example, side effects of that unnecessary
| operation might interfere with other code.
|
| You could resolve that by moving the termination check
| into the update logic as well, but now you're seriously
| complecting what should be independent operations. I
| don't think the tradeoff is there versus just using a
| break. But mostly, this is a self-inflicted problem in
| C's language constructs and idioms. I just don't have
| this problem in many other languages, because they
| provide end-inclusive looping constructs.
| trealira wrote:
| > I did consider that, but I wrote "in general" for a
| reason. It works very specifically in the case of "add
| one" or "subtract one", but it doesn't work with anything
| more complicated, like chasing pointers or
| adding/subtracting more than one at a time.
|
| You're reminding me of the book "Modern Compiler Design."
| The author goes over how to compile a general Pascal-
| style for-loop correctly, accounting for increasing or
| decreasing ranges, differing step sizes, and accounting
| for overflow. It was written using just goto statements,
| so I adapted a version of it to C. Just replace "intN_t"
| with an actual integer size. It works by calculating the
| number of times the loop will run. If "from" is equal to
| "to," it's still going to run at least once. Again, this
| is not mine, just adapted from the author's code (Dick
| Grune's).
|
|       // enumerate: print out numbers from "from" to "to",
|       // inclusive, with step size "by"
|       void enumerate(intN_t from, intN_t to, intN_t by) {
|           uintN_t loop_count;
|           intN_t i;
|           if (by > 0) {
|               if (from > to) return;
|               loop_count = (to - from) / by + 1;
|           } else if (by < 0) {
|               if (from < to) return;
|               loop_count = (from - to) / -by + 1;
|           } else /* (by == 0) */ {
|               // Maybe some kind of error
|               return;
|           }
|           for (i = from; ; i += by) {
|               printf("%d\n", i);
|               if (--loop_count == 0)
|                   break;
|           }
|       }
|
| You can see it's more complicated than the idiomatic C
| for-loop, haha. But that's just a general solution. Like
| you guys noted, it could be simpler for step sizes of 1.
| Izkata wrote:
| Isn't i going to be one past end on the last iteration?
|
| I think you want either "++i <= end" or "i++ < end", and
| that's assuming "end" is inclusive. If it's exclusive
| then "++i < end", right?
|
| (Using "<=" in a termination condition has become a code
| smell for me, it's been wrong more often than not, so it
| sticks out as "pay close attention here")
| chowells wrote:
| Actually, I think what you really need is "i++ != end".
| The whole point was inequalities don't work how you want
| when end is INT_MAX. You want to just terminate after the
| loop body where i was equal to end. Except now I'm not
| sure if that evades the UB of a signed int exceeding
| INT_MAX. I don't know C well enough to know if the UB is
| in performing the increment or reading the value
| afterwards.
|
| Edit: Just checked in with a C expert. The UB is in the
| increment operation, so that's not correct after all. You
| really do just need to separate out the update from the
| test entirely.
| trealira wrote:
| Yeah, it would have to be something like this if you
| wanted to avoid the signed integer overflow at the end of
| the loop:
|
|       assert(i <= j);
|       for (;;) {
|           do_something_with(i);
|           if (i == j)
|               break;
|           i++;
|       }
| xg15 wrote:
| Have to strongly disagree here. I don't think the OP meant
| thinking up a complete, formal, proof. But trying to understand
| what kind of logical properties your code fulfills - e.g. what
| kind of invariants should hold - will make it a lot easier to
| understand what your code is doing and will remove a lot of the
| scare factor.
| mprast wrote:
| yes, what i had in mind were more proof sketches than proofs
| Nevermark wrote:
| Yes, we could call this "maintaining plausible provability".
|
| Code for which there is not even a toehold for an imagined
| proof might be worth cordoning off from better code.
| Sharlin wrote:
| Types constitute this sort of a partial proof. Not enough to
| encode proofs of most runtime invariants (outside powerful
| dependent type systems) but the subset that they _can_ encode
| is extremely useful.
| gregorygoc wrote:
| Yeah, and I'm saying if your code is idiomatic you get
| necessary invariants for free.
| auggierose wrote:
| Is idiomatic related to idiotic?
| bmn__ wrote:
| People downvoted you because instead of polluting the
| discussion, you could have looked this up yourself.
|
| The answer is yes, both words are related to idios,
| "own", "self", "private".
| auggierose wrote:
| Please don't comment about the voting on comments. It
| never does any good, and it makes boring reading.
|
| https://news.ycombinator.com/newsguidelines.html
| bmn__ wrote:
| lame
| spauldo wrote:
| It is if you're writing vimscript.
| wahern wrote:
| I bought The Practice of Programming years ago. It's a
| great book that is no less relevant today, but I don't see
| your argument. The suggestions you've summarized are
| critical advice, but rather than obviate the need for the
| proof-like mindset, they complement it. Idiomatic code
| doesn't directly help you solve and implement difficult
| algorithmic or architectural problems. However, idiomatic
| code certainly helps reduce noise so that the irreducibly
| complex aspects of a solution conveniently standout, both
| conceptually and literally.
| gregorygoc wrote:
| I do agree they do complement each other... to some
| extent. My point was rather that if you write good code,
| keeping logical arguments in your head should be reduced
| to a minimum - to the point that only testing edge cases
| and the happy path in your mind should be sufficient.
|
| To give a more concrete example: I recently saw a very
| complicated piece of logic checking whether subsequent code
| should even be invoked, but it was hidden in the debris of
| core logic and other verification. It easily could have been
| separated out and rewritten as a series of early returns -
| and that is exactly what a precondition is. I'm sure whoever
| wrote the code had it in mind but was not familiar enough
| with the early-return technique to embed it in the code. Had
| they been, their brain power could've been utilized more
| efficiently.
| pipes wrote:
| Could you elaborate on those techniques from competitive
| programming please. Genuinely interested! :)
| kevinventullo wrote:
| +1 This is definitely the wall I hit with competitive
| programming. I logically know how to solve the problem, my
| code just ends up having one too many bugs that I can't track
| down before time is up.
| mathgradthrow wrote:
| There is no substitute for writing correct programs, no matter
| how hard it is. If you want correct programs you have to write
| them correctly.
| monkeyelite wrote:
| The most basic idea of a proof is an argument for why something
| is true. It's not about avoiding small mistakes, it's about
| being directionally correct.
| maxbond wrote:
| I think the causality is flipped here, when you carefully
| consider a problem the result is often very clean and clear
| code. The clarity in your thinking is reflected in the code and
| its structure.
|
| But writing clean and clear code in the hope that its good
| aesthetics will result in correctness would be cargo culting.
| (Writing clean code is still worthwhile of course, and clean
| code + code review is likely to result in better correctness.)
|
| Form follows function, not the other way around.
| bluetomcat wrote:
| > My thesis so far is something like "you should try to write
| little proofs in your head about your code." But there's actually
| a secret dual version of this post, which says "you should try to
| write your code in a form that's easy to write little proofs
| about."
|
| Easier said than done. It is certainly feasible on greenfield
| projects where all the code is written by you (recently), and you
| have a complete mental model of the data layout and code
| dependencies. It's much harder to prove stuff this way when you
| call foo(), bar() and baz() across unit boundaries, when they
| modify global state and are written by different developers.
| akavi wrote:
| A part of "being a good developer" is being able to evolve
| systems in this direction. Real systems are messy, but you can
| and should be thoughtful about:
|
| 1. Progressively reducing the number of holes in your
| invariants
|
| 2. Building them such that there's a pit of success (engineers
| coming after you are aware of the invariants and "nudged" in
| the direction of using the pathways that maintain them).
| Documentation can help here, but how you structure your code
| also plays a part (and is in my experience the more important
| factor)
| mprast wrote:
| yeah, this is what i was trying to get at with that notion of
| "proof-affinity"; imo a well-structured codebase is one in
| which you can easily prove stuff to yourself about code you
| didn't necessarily write
| xg15 wrote:
| Even then, you can get a map of "safe" and "unsafe" locations
| in the codebase, i.e. you can get a mental model of which
| functions modify global state and which don't.
| msteffen wrote:
| I think your frustration is illustrative, actually: the reason
| mutable global state is bad is because in order to prove that a
| piece of code reading it is correct, you have to know what _the
| entire rest of the program_ is doing.
|
| If, instead, you make the global variable immutable, make the
| state variable into a function argument, or even wrap the
| mutable global state in some kind of helper class, then you
| only need to know what the callers of certain functions are
| doing. The visibility of those functions can be limited. Caller
| behavior can be further constrained with assertions inside the
| function. All of these (can) make it easier to prove that the
| reader is correct.
|
| I think most programmers already do this, actually; they just
| don't think of their decisions this way.
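A minimal before/after sketch of that refactor (names invented for illustration):

```python
# Before: proving fetch_v1 correct means knowing what the entire rest
# of the program might have done to `retry_config` by the time it runs.
retry_config = {"retries": 3}

def fetch_v1(url):
    return (url, retry_config["retries"])

# After: the state is an explicit argument, so the proof obligation
# shrinks to "what do my callers pass in?" - and an assertion inside
# the function constrains caller behavior locally.
def fetch_v2(url, retries):
    assert retries >= 0, "callers must pass a non-negative retry count"
    return (url, retries)
```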
| bluetomcat wrote:
| It's a good direction to follow, but it can only get you so
| far. Some pieces of code do naturally evolve into a
| functional formalism, while others are inherently imperative.
| Usually, the top levels of your program (event loop) are
| imperative and deal with stateful "devices" like IO, the
| screen and storage subsystems. The "leaf" functions from the
| call graph can be functional, but you still can't reason
| about the whole program when it is imperative at the top.
| thfuran wrote:
| Yes, but if you can actually drive all that out to the very
| top level and leave everything else clean,
| that's a huge improvement over how many programs are
| structured.
| jimbokun wrote:
| > It's much harder to prove stuff this way when you call foo(),
| bar() and baz() across unit boundaries, when they modify global
| state and are written by different developers.
|
| I think this reinforces the article's point.
|
| Code like this is much more likely to contain bugs and be
| harder to maintain without introducing more bugs, than programs
| written from the outset with this goal of "provability".
| hinkley wrote:
| I just finished cave diving in a code base that's had a very
| old ticket to clean up the mess.
|
| I went in with 3 tickets in mind to fix. I found half a dozen
| more while I was down there, and created 2 myself. I don't
| know if I got off easy or I was distracted and missed things.
|
| The other project I'm working on is not dissimilar. Hey did
| you guys know you have a massive memory leak?
| monocasa wrote:
| And in fact, when you look at formally proven code, 80% of
| the shtick is reducing complexity of mutation to something
| tractable. That's valuable whether or not you then actually
| go through the process of formally proving it.
| hinkley wrote:
| > when they modify global state and are written by different
| developers.
|
| Once cancer has metastasized, the treatment plans are more
| aggressive and less pleasant. The patient can still be saved in
| many cases, but that depends on a lot of external factors.
| ericmcer wrote:
| This made me think how I constantly rethink my approach to
| coding and learning how to do it "right" over and over.
|
| I wonder if someone like a John Carmack is just like... yeah I
| got it or he also is constantly feeling like he was bad 5 years
| ago and is doing it "better" now.
| saghm wrote:
| This is an important point I like to mention to people.
| Thinking back to old code you wrote and how you would choose
| to do it better than you originally did is a sign of growth,
| and if that stops happening after a certain point, it's a
| sign you've stopped growing. Given that the best programmers
| in the world still don't write perfectly bug-free code, I
| don't have any illusions that there will always be plenty of
| room for me to continue improving as a programmer, so I
| consider having those thoughts to be an unequivocally good
| thing. I'd honestly be a bit wary of anyone who didn't think
| they needed to learn or grow any more as a programmer!
| nayuki wrote:
| I learned the foundations of theoretical computer science in
| university. I agree with the sentiment of this post, though it is
| hard to perform in practice.
|
| In addition to pre-conditions and post-conditions, I would like
| to emphasize that loop invariants and structural induction are
| powerful techniques in CS proofs.
| https://en.wikipedia.org/wiki/Loop_invariant ,
| https://en.wikipedia.org/wiki/Structural_induction
|
| These notes from UofT's CSC236H are on-topic:
| https://www.cs.toronto.edu/~david/course-notes/csc236.pdf#pa...
| sitkack wrote:
| Those notes are great, David Liu seems like a wholesome dude
| https://www.cs.toronto.edu/~david/research.html
| atlintots wrote:
| UofT mentioned! Let's go!
| skybrian wrote:
| The software he's working on seems rather obscure and perhaps not
| launched yet? get-nerve.com can't be resolved.
| mprast wrote:
| not launched yet! I will put up a post when it launches; coming
| in a few months!
| roadside_picnic wrote:
| What's interesting about this view is that theorem provers
| ultimately boil down to sufficiently advanced type checking (the
| book _Type Theory and Formal Proof_ is an excellent overview of
| this topic). Literally the heart of proof assistants like Lean is
| "if it compiles, you've proved it".
|
| So more practical type systems can be viewed as "little" theorem
| provers in the sense the author is describing. "Thinking in
| types" so to speak, is mathematically equivalent to "writing
| little proofs in your head". I would also add that _merely_ using
| types is not equivalent to "thinking in types" as one would be
| required to do writing a sufficiently advanced Haskell program.
| trealira wrote:
| It's true that theorem provers can be just sufficiently
| advanced type systems (the Curry Howard correspondence), but
| not all theorem provers operate that way. Isabelle/HOL operates
| on higher-order logic. Imperative ones like SPARK or Dafny
| rely on encoding preconditions, postconditions, and things
| like loop invariants, and use SMT solvers for verification,
| IIRC.
|
| Having an advanced type system does seem to encourage this sort
| of informal proof oriented thinking more than imperative and
| somewhat typeless languages do, though, since preconditions and
| postconditions and loop invariants and inductive proofs on
| loops are things you have to do yourself entirely in your head.
| xg15 wrote:
| I'd also add mutability and immutability to the list of
| properties. Keeping as much state as possible immutable doesn't
| just help with multithreading, it will also greatly reduce the
| headache when trying to understand the possible states of a
| program.
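A small Python sketch of the point (the class and fields are invented for illustration): keeping state immutable turns accidental mutation into an immediate error and shrinks the set of program states you have to reason about.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Config:
    host: str
    port: int

c = Config("localhost", 8080)
try:
    c.port = 9090  # any mutation attempt is rejected at runtime
except Exception as e:
    print(type(e).__name__)  # FrozenInstanceError
# "Updates" produce new values instead, so existing references stay valid.
c2 = replace(c, port=9090)
print(c.port, c2.port)  # 8080 9090
```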
| jimbokun wrote:
| The article does include those.
| dblohm7 wrote:
| I don't do this often, but when I do, it's almost always when
| writing non-trivial concurrent code. I'll often "fuzz" the
| scheduling of multiple tasks around the region of code I'm
| working on to prove to myself that it works.
| cespare wrote:
| I really identify with this way of thinking. One domain where it
| is especially helpful is writing concurrent code. For example, if
| you have a data structure that uses a mutex, what are the
| invariants that you are preserving across the critical sections?
| Or when you're writing a lock-free algorithm, where a proof seems
| nearly required to have any hope of being correct.
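A minimal Python sketch of the mutex case (the two-account structure is invented for illustration): the invariant may be broken in the middle of a critical section, but it must be restored before the lock is released, and an assertion can check exactly that.

```python
import threading

class TwoAccounts:
    """Invariant: self.a + self.b == self.total whenever the lock is free."""

    def __init__(self, total: int):
        self.lock = threading.Lock()
        self.a = total
        self.b = 0
        self.total = total

    def transfer(self, amount: int) -> None:
        with self.lock:
            # Inside the critical section the invariant is temporarily broken...
            self.a -= amount
            self.b += amount
            # ...but it must hold again before the lock is released.
            assert self.a + self.b == self.total

accounts = TwoAccounts(100)
threads = [threading.Thread(target=accounts.transfer, args=(1,))
           for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(accounts.a, accounts.b)  # 90 10
```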
| taeric wrote:
| I feel like this is reaching for the vocabulary that you get in
| mathematics of axioms, postulates, and theorems. Not in a bad
| way, mind. Just the general idea of building a separate
| vocabulary for the different aspects of a system of knowledge.
|
| I also think things get complicated with holding things constant
| as a necessity. Often times, the best way to find a bug is to ask
| not "how was this possible?" but "why would another part of the
| system modify this data?"
|
| Obviously, if you can make things reference capable data, you
| should do so. But, all too often "this data being X" has a
| somewhat self evident explanation that can help. And just "copy
| so I don't have to worry about it" lands you squarely in "why is
| my copy out of date?"
| phkahler wrote:
| Another nice pattern is to have any particular value written
| exactly once, or from exactly one place in the code if it gets
| updated.
|
| For embedded control I'm finding the blackboard pattern with
| single assignment to be super useful.
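A rough Python sketch of a write-once blackboard (names invented for illustration): each key may be assigned exactly once, so a second writer fails loudly instead of silently clobbering state.

```python
class Blackboard:
    """Single-assignment store: each key can be written exactly once."""

    def __init__(self):
        self._values = {}

    def post(self, key, value):
        if key in self._values:
            raise RuntimeError(f"{key!r} already written; single assignment violated")
        self._values[key] = value

    def read(self, key):
        return self._values[key]

bb = Blackboard()
bb.post("sensor.temp", 21.5)   # the one and only writer of this value
print(bb.read("sensor.temp"))  # 21.5
try:
    bb.post("sensor.temp", 99.0)  # a second writer is rejected
except RuntimeError as e:
    print(e)
```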
| bcheung wrote:
| I wonder how many of these rules can be incorporated into a
| linter or be evaluated by an LLM in an agentic feedback loop. It
| would be nice to encourage code to be more like this.
|
| I notice you didn't really talk much about types. When I think of
| proofs in code my mind goes straight to types because they
| literally are proofs. Richer typed languages move even more in
| that direction.
|
| One caveat I would add is it is not always desirable to be forced
| to think through every little scenario and detail. Sometimes it's
| better to ship faster than write 100% sound code.
|
| And as an industry, as much as I hate it, the preference is for
| languages and code that is extremely imperative and brittle. Very
| few companies want to be writing code in Idris or Haskell on a
| daily basis.
| ngruhn wrote:
| Sorry for being pedantic but isn't the story that
| types=theorems and programs=proofs?
| sitkack wrote:
| "Propositions as Types" by Philip Wadler
| https://www.youtube.com/watch?v=IOiZatlZtGU
| GMoromisato wrote:
| Totally tangential, but I love to read post-mortems of people
| fixing bugs. What were the initial symptoms? What was your first
| theory? How did you test it? What was the resolution? Raymond
| Chen does this a lot and I've always enjoyed it.
|
| I learn more from these concrete case studies than from general
| principles (though I agree those are important too).
|
| One of my most recent bugs was a crash bug in a thread-pool that
| used garbage-collected objects (this is in C++) to manage network
| connections. Sometimes, during marking, one of the objects I was
| trying to mark would be already freed, and we crashed.
|
| My first thought was that this was a concurrency problem. We're
| supposed to stop all threads (stop the world) during marking, but
| what if a thread was not stopped? Or what if we got an event on
| an IO completion port somehow during marking?
|
| I was sure that it had to be a concurrency problem because (a) it
| was intermittent and (b) it frequently happened after a
| connection was closed. Maybe an object was getting deleted during
| marking?
|
| The only thing that was weird was that the bug didn't happen
| under stress (I tried stress testing the system, but couldn't
| reproduce it). In fact, it seemed to happen most often at
| startup, when there weren't too many connections or threads
| running.
|
| Eventually I proved to myself that all threads were paused
| properly during marking. And with sufficient logging, I proved
| that an object was not being deleted during marking, but the
| marking thread crashed anyway.
|
| [Quick aside: I tried to get ChatGPT to help--o3 pro--and it kept
| on suggesting a concurrency problem. I could never really
| convince it that all threads were stopped. It always insisted on
| adding a lock around marking to protect it against other
| threads.]
|
| The one thing I didn't consider was that maybe an object was not
| getting marked properly and was getting deleted even though it
| was still in use. I didn't consider it because the crash was in
| the marking code! Clearly we're marking objects!
|
| But of course, that was the bug. Looking at the code I saw that
| an object was marked by a connection but not by the listener it
| belonged to. That meant that, as long as there was a connection
| active, everything worked fine. But if ever there were no
| connections active, and if we waited a sufficient amount of time,
| the object would get deleted by GC because the listener had
| forgotten to mark it.
|
| Then a connection would use this stale object and on the next
| marking pass, it would crash.
| TacticalCoder wrote:
| > Totally tangential, but I love to read post-mortems of people
| fixing bugs.
|
| I know I already posted it moons ago but... Around 1991 I made
| a little game, similar to Canon Ball on the MSX (which Pang /
| Buster Bros later copied).
|
| I had one weird case where sometimes the game would crash.
| Plain crash. But sometimes after playing for 15 minutes and
| already passing several levels. I just couldn't find it. I
| couldn't reason about it. I was stuck.
|
| So I decided to rewrite not the entire game but the part
| dealing with the inputs / game logic to make it 100%
| deterministic. It took me a long time to do that. Then
| eventually I could record myself playing: I'd record only the
| player inputs and at which moment they'd happen, which would
| make for tiny savefiles btw.
|
| And eventually while I was playing and recording, the crash
| occurred. I tried my replay: it worked... It replayed the
| savefile flawlessly and the game crashed again.
|
| At that point I knew the bug was as good as dead: being able to
| reproduce a bug deterministically meant I was going to fix it.
|
| Turns out it was a dangling pointer (ah, C...): when the hero
| would catch an extra allowing him to fire two shots at once on
| screen (usually he'd only be allowed one), and the first shot
| killed the last ball on screen, then on the next level the
| second shot would somehow (due to an error on my part) continue
| to live its life, eventually corrupting memory.
|
| Fun stuff.
|
| FWIW having deterministic game engines wasn't a thing back
| then. It became common later on, with games like Age of Empires
| and Warcraft III using them: the savefiles used to replay and
| exchange games were tinier than tiny: they'd only save at which
| frame each player input happened and replay the entire game from
| there [1]
|
| I still have the code to that game, it was surprisingly fun.
| Someone here already offered help in the past to get it back on
| its feet. I've also got an executable that runs. I just don't
| remember how the tooling worked. Still have the BAT files etc.
| to build it, but not the tools. I really should look into that
| one of these days but I'm kinda busy with other projects /
| life.
|
| [1] which also raised another issue: when game engines were
| upgraded, you could have savefiles that only played on older
| versions of the game, so players exchanging games would add the
| version of the game they were meant for
| pokemyiout wrote:
| Is there a typo in the induction section? Should `simplifyTree`
| and `simplifyGraph` be the same function?
|
|       function simplifyTree(root: Node): Node {
|         let newChildren = [] as Array<Node>;
|         for (const child of root.children) {
|           const simplifiedChild = simplifyGraph(child);
|           if (shouldContract(simplifiedChild)) {
|             for (const grandChild of simplifiedChild.children) {
|               newChildren.push(grandChild);
|             }
|           } else {
|             newChildren.push(simplifiedChild);
|           }
|         }
|         root.children = newChildren;
|         return root;
|       }
| dkenyser wrote:
| This reminds me of reading Accelerated C++ back in the day and I
| think the part that stuck with me most from that book was the
| idea of "holding the invariant in your head" (I'm paraphrasing a
| bit).
|
| It made me think a lot more about every line of code I wrote and
| definitely helped me become a better programmer.
| mathgradthrow wrote:
| Monotone just means order preserving. f is monotone if f(a) >=
| f(b) whenever a >= b.
| wrs wrote:
| As an undergrad at Carnegie Mellon in the 80s, I was explicitly
| _taught_ to do all of these things in one of the first-year
| programming courses. And it has served me very well since then.
|
| I especially remember how learning the equivalence of recursion
| and induction immediately eliminated the "frustrated trial and
| error" approach for making recursive algorithms that work.
| TheShermanTank wrote:
| Took the course last year, and I started to appreciate it more
| while taking functional programming lmao
| wiseowise wrote:
| To be a better programmer, program more and read less bullshit on
| the internet.
| tristramb wrote:
| The idea that you should design programs with proof in mind goes
| back to T. J. Dekker's solution to the mutual exclusion problem
| in 1959. The story is told by Edsger Dijkstra in EWD1303 (https:/
| /www.cs.utexas.edu/~EWD/transcriptions/EWD13xx/EWD130...). Much
| of Dijkstra's later work can be seen as him working out the
| consequences of this insight.
| thesuperbigfrog wrote:
| Why not include little proofs in the code when you can?
|
| Several programming languages and libraries support Design-by-
| Contract (https://en.wikipedia.org/wiki/Design_by_contract) which
| lets you specify preconditions, postconditions, and invariants
| directly in your code.
|
| Those predicates can be checked in various ways (depending on how
| deeply Design-by-Contract is supported) to help you know that
| your code is working correctly.
|
| Ada supports Design-by-Contract as part of the language:
| https://learn.adacore.com/courses/intro-to-ada/chapters/cont...
|
| SPARK extends Ada Design-by-Contract into full proofs:
| https://learn.adacore.com/courses/intro-to-spark/index.html
|
| Rust has the Contracts crate:
| https://docs.rs/contracts/latest/contracts/
|
| Other programming languages have various levels of support or
| libraries for Design-by-Contract:
| https://en.wikipedia.org/wiki/Design_by_contract#Language_su...
| bigmadshoe wrote:
| Even standard assertions work as a version of this
| thesuperbigfrog wrote:
| Standard assertions certainly are better than keeping the
| little proofs only in your head.
|
| Many Design-by-Contract implementations are nicer than
| standard assertions because they help to express intent
| better and easily refer to the 'old' value of a parameter to
| verify desired results.
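As a rough sketch of what 'old'-value support buys you, here is a hypothetical Python decorator (not from any real Design-by-Contract library): the postcondition compares the result against a snapshot of the arguments taken before the call.

```python
import copy
import functools

def ensure(postcondition):
    """Check postcondition(old_args, result); 'old_args' is a pre-call snapshot."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            old = copy.deepcopy(args)  # snapshot the 'old' values
            result = fn(*args)
            assert postcondition(old, result), f"postcondition failed in {fn.__name__}"
            return result
        return wrapper
    return decorate

# Postcondition: the pushed item really was appended to the old list.
@ensure(lambda old, result: result == old[0] + [old[1]])
def push(stack, item):
    stack.append(item)
    return stack

print(push([1, 2], 3))  # [1, 2, 3]
```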
| bubblebeard wrote:
| This is something I've practiced myself quite a lot in recent
| years, and in my experience it's so much harder while at your
| desk. Doing something else while letting your brain work really
| helps. For me, that's usually going for a walk, taking a run in
| the woods, or doing some repetitive household chore.
| JohnMakin wrote:
| I think I stumbled across a similar concept in the more difficult
| post-grad classes I ended up in a long time ago. I began at some
| point late in my undergrad doing math tests entirely in pen. I
| didn't understand why, but it almost always resulted in higher
| scores and much neater scratchwork, which is what I credited at
| the time; but I think what was helping was something along the
| lines of what this post is getting at.
|
| What was helping me was that before I wrote a single expression,
| I thought about it carefully in my head and where it would lead
| before putting pen to paper, because I didn't want to make a
| bunch of messy scratch out marks on it. Or, sometimes, I'd use a
| healthy amount of throwaway scratch paper if allowed. Once my
| path was fully formed in my head I'd begin writing, and it
| resulted in far fewer mistakes.
|
| I don't always take this approach to writing code but often I do
| formulate a pretty clear picture in my head of how it is going to
| look and how I know it will work before I start.
| ge96 wrote:
| While having been employed as a developer for almost 7 years now,
| I remember taking this class maybe discrete math, I was not a fan
| of that class, where you have problems like p -> q
|
| Also, the farthest I got in math/physics was intro to quantum
| mechanics, where the multiple pages needed to solve a single
| problem lost me
|
| Being a good programmer... I don't have a degree so I've never
| really tried to get into FAANG. I also am aware after trying
| Leetcode, I'm not an algorithm person.
|
| What's funny is that at my current job, a huge multi-national
| entity, I have to push people to do code reviews or fix small
| errors that make something look bad (like a button being shorter
| than an input next to it).
|
| I am self-aware about my true skill: I can make things, but I don't
| think I'd ever be a John Carmack. If you follow a framework's
| pattern are you a good developer? I can see tangible metrics like
| performance/some slow thing, someone better makes it faster.
|
| Recently someone forked a repo of a hardware project I made. It's
| fun watching them change it, to understand what I wrote and then
| change it to fit their needs.
|
| When I see my old code I do recognize how it was verbose/could be
| much simpler.
| pclowes wrote:
| The best way I have found to integrate this approach is Test
| Driven Development.
|
| When done well, every test is a mini-proof: you see it fail
| first, then write the barest amount of code you think will make
| it pass. Your test setup and assertions are what cover your
| pre/post conditions. Base cases are the invariant.
|
| The key here is to be disciplined, write the simplest test you
| can, see the test fail before writing code, write the smallest
| amount of code possible to make the test pass. Repeat.
|
| The next level is how cohesive or tightly coupled your tests are.
| Being able to make changes with minimal test breakage "blast
| radius" increases my confidence of a design.
| swat535 wrote:
| TDD is also great for playing around with ideas and coming up
| with a clean interface for your code. It also ensures that your
| code is testable, which leads to improved readability.
|
| You'll know quickly where you're going wrong because if you
| struggle to write the test first, it's a symptom of a design
| issue for example.
|
| That being said, I wouldn't use it as dogma, like everything
| else in CS, it should be used at the right time and in the
| right context.
| pclowes wrote:
| I agree on the Dogma aspect. Plenty of times I have abandoned
| it. However, I did find it very helpful to spend my first
| couple years in a strict Xtreme Programming (XP) shop. The
| rigor early on was very beneficial and it's a safety net for
| when I feel out of my depth in an area.
|
| I tend to go the other way, I don't use TDD when I am playing
| around/exploring as much as when I am more confident in the
| direction I want to go.
|
| Leaving a failing test at the end of the day as a breadcrumb
| for me to get started on in the morning has been a favored
| practice of mine. Really helps get the engine running and
| back into flow state first thing.
|
| The dopamine loop of Red -> Green -> Refactor also helps
| break through slumps in otherwise tedious features.
| saurik wrote:
| A test is not a proof, and you can prove something works
| without ever even running it. There are also properties of a
| system which are impossible to test, or are so non-
| deterministic that a test will only fail once every million
| times the code is executed. You really need to just learn to
| prove stuff.
|
| The most complex piece of code I have ever written, as a
| relevant story, took me a month to prove to everyone that it
| was correct. We then sent it off to multiple external auditors,
| one of whom had designed tooling that would let them do
| abstract interpretation with recursion, to verify I hadn't made
| any incorrect assumptions. The auditors were confused, as the
| code we sent them didn't do anything at all, as it had a stupid
| typo in a variable name... I had never managed to figure out
| how to run it ;P. But... they found no other bugs!
|
| In truth, the people I have met who are, by far, the
| worst at this are the people who rely on testing, as they seem
| to have entirely atrophied the ability to verify correctness by
| reading the code and modeling it in some mathematical way. They
| certainly have no typos in their code ;P, but because a test
| isn't a proof, they always make assumptions in the test suite
| which are never challenged.
| pclowes wrote:
| Interesting, could you show me a formal proof that can't be
| expressed in logic (ie. code) and then tested?
|
| My thought here is that since proofs are logic and so is code
| you can't have a proof that can't be represented in code. Now
| admittedly this might look very different than typical say
| JUnit unit tests but it would still be a test validating
| logic. I am not saying every system is easily testable or
| deterministic but overall, all else equal, the more tested
| and testable a system is the better it is.
|
| IME things that are very hard to test are often just poorly
| designed.
| aljgz wrote:
| Consider a function that gets an array of integers and a
| positive number, and returns the sum of the array elements
| modulo the number. How can we prove using tests, that this
| always works for all possible values?
|
| Not discounting the value of tests: we throw a bunch of
| general and corner cases at the function, and they will
| ring the alarm if in the future any change to the function
| breaks any of those.
|
| But they don't prove it's correct for all possible inputs.
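To make the contrast concrete, here is a Python sketch of such randomized property checks (stdlib only, no real property-testing framework): ten thousand random cases build confidence, but they sample a vanishing fraction of an unbounded input space, so passing is evidence rather than proof.

```python
import random

def sum_mod(xs, m):
    # Intended behavior: sum of the array elements, modulo the positive m.
    return sum(xs) % m

random.seed(0)
for _ in range(10_000):
    xs = [random.randint(-1000, 1000) for _ in range(random.randint(0, 20))]
    m = random.randint(1, 1000)
    r = sum_mod(xs, m)
    # Property checks: result is a valid residue and consistent with the spec.
    assert 0 <= r < m
    assert (r - sum(xs)) % m == 0
print("10000 random cases passed (still not a proof)")
```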
| pclowes wrote:
| I would lean towards types and property testing here
| using tools like Coq.
| aljgz wrote:
| Can you elaborate?
| layer8 wrote:
| Tests can generally only test particular inputs and/or
| particular external states and events. A proof abstracts
| over all possible inputs, states, and events. It proves
| that the program does what it is supposed to do regardless
| of any particular input, state, or events. Tests, on the
| other hand, usually aren't exhaustive, unless it's
| something like testing a pure function taking a single
| 32-bit input, in which case you can actually test its
| correctness for all 2^32 possible inputs (after you
| convinced yourself that it's truly a pure function that
| only depends on its input, which is itself a form of
| proof). But 99.99% of code that you want to be correct
| isn't like that.
|
| As an example, take a sorting procedure that sorts an
| arbitrarily long list of arbitrarily long strings. You
| can't establish just through testing that it will produce a
| correctly sorted output for all possible inputs, because
| the set of possible inputs is unbounded. An algorithmic
| proof, on the other hand, can establish that.
|
| That is the crucial difference between reasoning about code
| versus merely testing code.
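For the bounded case, a quick Python sketch (the function is invented, and the domain is shrunk to 16 bits so it runs fast): when the input space is finite and small, testing really can be exhaustive, and then it amounts to a proof for that domain.

```python
def clamp_to_byte(x):
    # Pure function on a bounded domain: a 16-bit unsigned input.
    return min(x, 255)

# Exhaustive check over all 2^16 inputs: for this domain, this IS a proof.
for x in range(2**16):
    r = clamp_to_byte(x)
    assert 0 <= r <= 255
    assert r == x or (x > 255 and r == 255)
print("all 65536 inputs verified")
```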
| pclowes wrote:
| I agree with that but I would say that if I required
| formal verification of that kind I would move the proof
| based rationale into the type system to provide those
| checks.
|
| I would add that tests can be probabilistically exhaustive
| (e.g. property-based testing) and answer questions beyond what
| proof-based reasoning can provide, i.e., is this sorting of
| arbitrary strings efficient and fast?
| joseneca wrote:
| Proofs are arguably still better than tests at evaluating
| efficiency, at least for smaller components/algorithms in
| isolation. While there are cases where constant factors
| that can't be described well in a proof matter, in most
| cases, the crucial element of an algorithm's efficiency
| lies in how the complexity scales, which can be proven in
| the vast majority of cases. On the other hand, relying
| solely on benchmarking introduces a lot of noise that can
| be difficult to sort through.
| Kambing wrote:
| > A test is not a proof
|
| Actually, a test _is_ a proof! Or more specifically, a
| traditional test case is a narrow, specific proposition. For
| example, the test `length([1, 2, 3]) = 3` is a proposition
| about the behavior of the `length` function on a concrete
| input value. The proof of this proposition is _automatically
| generated_ by the runtime, i.e., the proof that this
| proposition holds is the execution of the left-hand side of
| the equality and observing it is identical to the right-hand
| side. In this sense, the runtime serves as an automated
| theorem prover (and is, perhaps, why test cases are the most
| accessible form of formal reasoning available to a
| programmer).
|
| What we colloquially consider "proof" are really more
| abstract propositions (e.g., using first-order logic) that
| require reasoning beyond simple program execution. While the
| difference is, in some sense, academic, it is important to
| observe that testing and proving (in that colloquial sense)
| are, at their core, the same endeavor.
| JohnKemeny wrote:
| That is stretching the definitions of proofs.
| assert random() != 0.
|
| QED
| GuB-42 wrote:
| I am not a fan of Test Driven Development, not at all.
|
| Having your invariants and pre/post conditions correct is not
| enough. You also need to do the right thing. For example, you
| have a function that adds two durations in the form hh:mm:ss,
| you have mm < 60 and ss < 60 as an invariant, testing it is a
| good thing, but it won't tell you that your addition is
| correct. Imagine your result is always 1s too much, you can
| even test associativity and commutativity and it will pass.
| Having these correct is necessary but not sufficient.
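A quick Python illustration of this point (a deliberately buggy toy, with made-up tuple representation): the addition below is always one second too large, yet the range invariants and commutativity both pass.

```python
def add_durations(a, b):
    """Add two (hh, mm, ss) durations, with a deliberate off-by-one bug."""
    total = (a[0] + b[0]) * 3600 + (a[1] + b[1]) * 60 + a[2] + b[2]
    total += 1  # the bug: result is always one second too large
    return (total // 3600, (total // 60) % 60, total % 60)

x, y = (1, 20, 30), (0, 45, 45)
r = add_durations(x, y)
# The invariants pass: minutes and seconds are in range...
assert 0 <= r[1] < 60 and 0 <= r[2] < 60
# ...and so does commutativity...
assert add_durations(x, y) == add_durations(y, x)
# ...even though the result is wrong: the correct answer is (2, 6, 15).
print(r)  # (2, 6, 16)
```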
|
| Problem is, when you write tests first, especially tight, easy
| to run unit tests, you will be tempted to write code that pass
| the tests, not code that does the right thing. Like throwing
| stuff at your tests and see what sticks.
|
| I much prefer the traditional approach of first solving the
| problem the best you can, which may or may not involve thinking
| about invariants, but always with the end result in mind. And
| only when you are reasonably confident with your code, you can
| start testing. If it fails, you will have the same temptation
| to just pass the test, but at least, you wrote it at least once
| without help from the tests.
|
| Maybe that's just me, but when I tried the "tests first"
| approach, the end result got pretty bad.
| JohnKemeny wrote:
| You've missed the most important point of TDD--and indeed, of
| tests.
|
| Tests do not ensure that your functions are correct; they
| ensure that you are alerted when their behavior changes.
|
| Second, TDD is a way to force dogfooding: having to use the
| functions you're going to write, before you write them, helps
| you design good interfaces.
| GuB-42 wrote:
| > Tests do not ensure that your functions are correct; they
| ensure that you are alerted when their behavior changes.
|
| I agree with that part and I am not against tests, just the
| idea of writing tests first.
|
| > helps you design good interfaces
|
| I am sure plenty of people will disagree but I think
| testability is overrated and leads to code that is too
| abstract and complicated. Writing tests first will help you
| write code that is testable, but everything is a
| compromise, and to make code more testable, you have to
| sacrifice something, usually in the form of adding
| complexity and layers of indirection. Testability is good
| of course, but it is a game of compromises, and for me,
| there are other priorities.
|
| It makes sense at a high level though, like for public
| APIs. Ideally, I'd rather write both sides at the same
| time, as to have a real use case rather than just a test,
| and have both evolve together, but it is not always
| possible. In that case, writing the test first may be the
| next best thing.
| MoreQARespect wrote:
| >It makes sense at a high level though
|
| This is the way I've always done TDD.
|
| I don't think it makes sense to do it any other way. If a
| test case doesn't map on to a real scenario you're trying
| to implement the code for it doesn't make any sense to
| write it.
|
| I find that people who write the test after tend to miss
| edge cases or (when they're trying to be thorough) write
| _too many_ scenarios - covering the same code more than
| once.
|
| Writing the test first and the code that makes it pass
| next helps to inextricably tie the test to the actual
| code change.
|
| >but it is not always possible
|
| I don't think I've written any production code in years
| where I gave up because it was intrinsically not
| possible.
| GuB-42 wrote:
| > I don't think I've written any production code in years
| where I gave up because it was intrinsically not
| possible.
|
| What I meant by "not possible" is writing both sides of
| the API at the same time. For example, you write a
| library for overlaying maps on video feeds, it is good if
| you are also writing the application that uses it. For
| example a drone controller. So in the early phase, you
| write the library specifically for your drone controller,
| changing the API as needed.
|
| But sometimes, the drone controller will be made by
| another company, or it may be a project too big not to
| split up, that's the "not possible" part. And without a
| clear, in control use case, you have to make guesses, and
| writing tests can help make good guesses.
| pclowes wrote:
| The flow here for me is if the code is doing the wrong thing
| I:
|
| - Write a test that demonstrates that it is doing the wrong
| thing
|
| - Watch it fail
|
| - Change the code to do the right thing
|
| - Ensure the test passes
|
| And in return I get regression prevention and verified
| documentation (the hopefully well structured and descriptive
| test class) for almost free.
|
| I don't think any amount of testing absolves the programmer
| from writing clear, intention-revealing code that is correct.
| TDD is just a tool that helps ensure the programmers
| understanding of the code evolves with code. There have been
| so many times where I write code and expect a test to
| fail/pass and it doesn't. This detects subtle minute drift in
| understanding.
| gitremote wrote:
| I do TDD and write proofs as tests. TDD practitioners never
| said TDD is a substitute for thinking.
|
| > Problem is, when you write tests first, especially tight,
| easy to run unit tests, you will be tempted to write code
| that pass the tests, not code that does the right thing. Like
| throwing stuff at your tests and see what sticks.
|
| I never had that problem, but I knew how to code before I
| started TDD.
| init1 wrote:
| Instead of just thinking about the proof in your mind or putting
| it in comments you can actually embed pre/post conditions and
| compile-time proofs in actual code using Ada SPARK. You use Pre
| and Post aspects for subprograms and pragma Loop_Invariant for
| loops and similar constructs. The SPARK toolchain will run
| provers against your code and report whether the proofs hold.
|
| https://www.youtube.com/watch?v=tYAod_61ZuQ
| linguistbreaker wrote:
| I wonder if there is a rigorous/formal method to identify
| 'monotonicity' in code? Seems like it should be
| straightforward...
| iamwil wrote:
| I really liked this post. Does anyone have other blog posts that
| they liked that articulates a taste for good engineering?
| mreid wrote:
| Another thing I learnt from my math degree that I find helps a
| lot when programming and software engineering more generally is
| *defining your terms*.
|
| So many communication issues on teams occur when people are using
| the same words to mean different things.
| segmondy wrote:
| lookup "ubiquitous language" it's a phrase from domain driven
| design
| mreid wrote:
| Thanks, I hadn't seen that term before.
|
| Just to be clear, I wasn't claiming that "communicating
| clearly" is a new idea in software engineering, I'm mainly
| commenting on how effective embracing it can be.
|
| When doing math, pretty much every term is "load-bearing" in
| that arguments will make use of specific aspects of a concept
| and how it relates to other concepts.
|
| If you look at most graduate-level math textbooks or papers,
| they typically start with a whole bunch of numbered
| definitions that reference each other, followed by some
| simple lemmas or propositions that establish simple
| relationships between them before diving into more complex
| theorems and proofs.
|
| The best software projects I've seen follow a roughly similar
| pattern: there are several "core" functions or libraries with
| a streamlined API, good docs, and solid testing; on top of
| that there are more complex processes that treat these as
| black-boxes and can rely on their behavior being well-defined
| and consistent.
|
| Probably the common thread between math and programming is
| that both lean heavily on abstraction as a core principle.
| wanderingmind wrote:
| One thing that I feel is discussed less: what are the high-level
| constraints and levers that can be set to enable a group of
| programmers to become better programmers? For example, how much
| impact does architecture have? Bug tracking? Choice of language,
| or tools? People hand-wave and say that one is better than
| another because of <reason 1> or <reason 2>, but being able to
| explain in a semi-rigorous way how each of these choices impacts
| individual code components and their interactions (as a graph or
| a tree) would be immensely helpful. If anyone knows of resources
| like that, please let us know.
| hansvm wrote:
| I think we're still at the stage of "this team lead makes shit
| happen" rather than "this strategy makes shit happen," with a
| lot of the nuances being fuzzy and/or wrong.
|
| Tossing out a few observations:
|
| 1. People make mistakes. Your strategy needs to account for
| that (resilient runtime, heavy type-checking, convenient and
| often-used REPL, etc).
|
| 2. Above a certain project size, nobody remembers everything.
| Your strategy needs to account for that (excellent multifaceted
| documentation, disallow long-range interactions in your code,
| etc).
|
| 3. Dependencies have a vastly higher cost than you expect, even
| in the short term. Plan for that (vendor more things, in-house
| more things, allocate resources to dependency management, cut
| scope, etc).
|
| I could go on. The core point is that certain properties of
| projects are "plainly" true to most people who have been around
| any length of time. I don't think we're yet at a point where we
| can often predict anything meaningful about some specific new
| technology, but a mental framework of "this succeeded/failed
| _because_ of {xyz}" helps tremendously in figuring out if/how
| that new idea will fit into your current workplace.
| skydhash wrote:
| The objective is always to attain some kind of synergy between
| the business logic and, as others have said, algorithm
| simplicity. The best way to go about it is to start from the
| chosen language/environment/library, and build out a DSL that
| can express the business rules. That's the basic premise of
| DDD, but where all this gets complicated is splitting into
| contexts and the translation across their boundaries.
|
| I believe programmers should learn a bit about programming
| language theory, mostly the bits about what is a language. Then
| how to recognize the things behind the labels, and how they
| morph (either from an operation or from a translation between
| contexts). Then it's a matter of using the knowledge to move
| from the environment to a DSL that can express the business
| rules.
|
| Architecture is the draft for the above, defining the starting
| point and a direction. And design is where you start to make
| decisions following the plan. Someone who has an idea of the
| destination can judge both.
| joseneca wrote:
| If only there were a clear answer to this. Software engineering,
| as it has appeared to me, has always seemed to evolve through
| opinions more so than tangible improvements. That being said,
| some practices are almost universally agreed to lead to better
| code, like defensive programming (proving your assertions
| relating to program conditions in the actual code), uniform
| documentation, and, in general, taking your time in
| implementing structures that will be used throughout the
| codebase. Formalizing your logic in proofs can be one part of
| that.
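| A minimal sketch of that kind of defensive programming in
| Python, asserting the conditions the code relies on (the
| transfer function and account names are hypothetical):

```python
def transfer(balances: dict, src: str, dst: str, amount: int) -> None:
    """Move funds between accounts, asserting the conditions relied on."""
    assert amount > 0, "precondition: amount is positive"
    assert src in balances and dst in balances, "precondition: accounts exist"
    assert balances[src] >= amount, "precondition: sufficient funds"
    total_before = sum(balances.values())
    balances[src] -= amount
    balances[dst] += amount
    # invariant: money is moved, never created or destroyed
    assert sum(balances.values()) == total_before
```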
| markus_zhang wrote:
| I'm too old for that, so I write the algorithm on paper and run
| it with some numbers, also on paper. I don't have the patience to
| do it often but whenever I did I usually got it done in one or
| two tries.
| lliamander wrote:
| This is similar to an intuition I've had about what it means to
| program "algorithmically". We often draw a distinction between
| "algorithms" and "business logic", but fundamentally they are the
| same thing. They are both plans of the steps necessary to
| accomplish a goal. The only difference, in my mind, is the style
| in which the program is written. To program "algorithmically"
| means to take steps to make undesirable program states impossible
| to represent.
|
| - In the case of search or sort algorithms, where the primary
| concern is the speed of computation, undesirable states would be
| performing unnecessary or duplicate computations.
|
| - In the case of encryption algorithms, undesirable states would
| be those that leak encrypted data.
|
| - In the case of an order shipping and fulfillment system, an
| undesirable state would be marking an order as fulfilled when not
| all of the items have been delivered.
|
| The more care that is taken to prevent undesirable states, the
| more the program takes on an algorithmic style. And the only way
| you can be sure that those undesirable states are impossible is
| to think in terms of proofs and invariants.
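| The shipping example above can be sketched in Python by making
| "fulfilled" a derived property rather than a stored flag, so the
| undesirable state (marked fulfilled with undelivered items)
| simply cannot be represented (class and field names here are
| hypothetical):

```python
from dataclasses import dataclass, field

@dataclass
class Order:
    items: frozenset                      # what the customer ordered
    delivered: set = field(default_factory=set)

    def mark_delivered(self, item) -> None:
        assert item in self.items, "cannot deliver an item never ordered"
        self.delivered.add(item)

    @property
    def fulfilled(self) -> bool:
        # derived, never stored: it cannot disagree with reality
        return self.delivered == self.items
```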
| alganet wrote:
| Real programmers use butterflies.
|
| https://xkcd.com/378/
|
| ---
|
| Let me elaborate: there is a huge history of dubious allegations
| of what constitutes "a real programmer" or stuff that makes you
| "a better programmer".
|
| Combobulation and discombobulation of McGuffins is often the
| best analogy, though. Not to discredit other kinds of thinking
| (though discrediting them is exactly what this is all about).
| When in doubt, showing the code is often what works.
| rufus_foreman wrote:
| You want to be a better programmer? Ignore these ridiculous
| people and write code every day.
|
| That's all there is.
|
| You want to get better at something? Do it all day every day.
| Aziell wrote:
| I often run through the code in my head first, especially for
| things like binary search where it's easy to mess up the logic.
| It feels like a quick mental check, sometimes even faster than
| writing tests.
|
| I'm curious though. When you're coding, do you actually pause and
| think through the logic first, or is it more like writing and
| fixing along the way?
| rramadass wrote:
| Folks might find the following two books useful in this regard;
|
| _How to Think About Algorithms_ by Jeff Edmonds.
|
| _Persuasive Programming_ by Jerud Mead and Anil Shinde.
| mparramon wrote:
| They taught me in school to think about preconditions,
| invariants, and postconditions for loops. Then I switched to
| only using functional affordances like .map() and .each(). I am
| much happier now.
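| The trade can be seen side by side: the loop's correctness
| rests on an invariant you track yourself, while map and sum
| carry their own guarantees (sum-of-squares is chosen purely as
| an illustration):

```python
nums = [3, 1, 4, 1, 5]

# Loop version: an invariant to keep in your head -- after each
# iteration, total equals the sum of squares of the elements seen.
total = 0
for n in nums:
    total += n * n

# Functional version: the iteration invariant is owned by map/sum.
total_fn = sum(map(lambda n: n * n, nums))
assert total == total_fn  # both are 52 here
```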
| begueradj wrote:
| > sketch a proof in your head as you go that your code will
| actually do what you want it to do.
|
| Could anyone please explain what this means? I can't get it.
| saghm wrote:
| Everyone writes code that has bugs in it, but no one intends to
| write code that has bugs. The bugs occur because there's a
| disparity between what you think your code is doing and what
| your code is actually doing. The quote you give is a suggestion
| for how to reduce how often this occurs: by considering why you
| should be confident that your intent is properly conveyed by the
| code you've written.
|
| It might help to think of this almost as debugging your program
| as you write it rather than only after you run it. Debugging a
| program is essentially the same process of trying to figure out
| why the program isn't doing what you think it's doing. While
| there's information you can get by inspecting a running program
| or its outputs, there's often some amount of consideration you
| need to do in order to figure out why you think certain
| behavior might be happening. There's no fundamental reason this
| consideration can't happen before you actually run the program;
| even as you're writing the code, you can think about all the
| ways it could potentially go wrong, and why you're confident
| that this isn't the case. If you can't come up with a strong
| argument for that, it's a good sign you might need to
| reconsider your approach!
| begueradj wrote:
| It makes sense. Thank you for the effort.
| graycat wrote:
| Maybe a related practice (what I do): write actual proofs, or
| good documentation that essentially contains proofs, and combine
| that with Knuth's _Literate Programming_.
|
| For what is to be _proved_, try to use _divide and conquer_ to
| make the whole proof consist of separate steps, each of which
| gets a short proof and each of those steps might become a
| _subroutine_ (whatever the terminology and functionality of the
| programming language and environment).
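| One way that decomposition can look in Python: each subroutine
| carries its own short proof sketch in a docstring, and the
| caller's argument just composes the guarantees (the functions
| here are illustrative):

```python
def partition_parity(xs):
    """Return (evens, odds).

    Proof sketch: every x lands in exactly one of the two lists,
    so evens + odds together form a permutation of xs.
    """
    evens = [x for x in xs if x % 2 == 0]
    odds = [x for x in xs if x % 2 != 0]
    return evens, odds

def sum_of_evens(xs):
    """Proof sketch: by partition_parity's guarantee, summing the
    first component sums exactly the even elements of xs."""
    evens, _ = partition_parity(xs)
    return sum(evens)
```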
| agnishom wrote:
| > write little proofs in your head
|
| This should be obvious, right? This is how you are supposed to
| write code. You are supposed to have a simple proposition about
| what each chunk of your code does.
| w3news wrote:
| To be a better programmer, write little proofs in your code. We
| call those tests and types: proof that it does what you expect.
| This works especially well when you write tests first, then
| types, then the code. Start with a test per acceptance criterion
| that clearly describes what the code should do and what you send
| and receive. In an API, you can also describe the interface in
| OpenAPI or GraphQL with all the properties and types, and
| validate the data at runtime against that specification; the
| specification is then also proof that the application does what
| you described.
|
| So OpenAPI/GraphQL, tests, and types are proof that the system
| works as intended. Always start with those before writing code.
| The specification is a solid base that doesn't change much if
| you start with it. You can refactor how the code works and prove
| with the specification that it still does the same as before.
|
| Code is less important than the specifications.
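| A small sketch of that test-first shape in Python (the
| acceptance criterion and the order_total function are
| hypothetical, and the test is written as if it existed before
| the implementation):

```python
def order_total(line_prices):
    """Total of an order; negative line prices are rejected."""
    if any(p < 0 for p in line_prices):
        raise ValueError("line price cannot be negative")
    return sum(line_prices)

def test_order_total():
    # written first, one case per acceptance criterion
    assert order_total([100, 250]) == 350
    assert order_total([]) == 0
    try:
        order_total([-5])
        assert False, "negative prices must be rejected"
    except ValueError:
        pass

test_order_total()
```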
| bmn__ wrote:
| The five properties used as headings in the article can be
| expressed in good type systems. That way, a huge part of the
| specification becomes code, and the compiler guarantees that the
| expressed properties hold in many or all cases. We should strive
| for a programming future where this idea is normal and
| commonplace.
|
| OpenAPI and GraphQL have exceedingly poor types, and will not
| bring us there unless they make a fifty year leap forward.
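| Even mainstream type systems can carry a little of this. A small
| Python sketch using typing.NewType gives validated values their
| own type (the names are hypothetical, and the guarantee holds
| only under a static checker such as mypy):

```python
from typing import NewType

UserId = NewType("UserId", int)

def parse_user_id(raw: str) -> UserId:
    """The only blessed way to obtain a UserId: validate once, here."""
    value = int(raw)
    if value <= 0:
        raise ValueError("user ids are positive")
    return UserId(value)

def load_user(uid: UserId) -> str:
    # a static checker flags load_user(42) with a raw, unvalidated int
    return f"user-{uid}"
```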
| shahzaibmushtaq wrote:
| Almost every programmer gets better through their own trial and
| error.
|
| For me, the best "trick" is to start with pseudocode.
|
| The _Isolation_ part is understandable and supports the
| philosophy:
|
| > When requirements change, you extend the behavior of [your
| program] by adding new code, not by changing old code that
| already works.
| derelicta wrote:
| I wish we had learned about proofs in my degree. It might be
| part of the graduate program, though.
| amelius wrote:
| Is this how LLMs (should) write code?
| irchans wrote:
| This might be the best article that I have read this year. It
| might change the way that I program. I really want to write up
| examples for each of the ideas presented: Monotonicity, Pre- and
| post-conditions, Invariants, Isolation, Induction, and Proof-
| affinity as a quality metric. I think it's easy to find these
| ideas in my code, but I want to try to practice writing proofs
| "in my head" as I code to make my code better. It is not apparent
| to me now how this will work, but if I write up some examples, I
| imagine it will become clearer.
| runeblaze wrote:
| I literally don't even write proofs in my head. Every time I
| write algo I just ask Cursor for a Lean/Rocq proof of
| correctness. No tests. No test coverage. The sheer power of type
| theory
___________________________________________________________________
(page generated 2025-07-16 23:01 UTC)