[HN Gopher] What is "literate programming"? (2024)
___________________________________________________________________
What is "literate programming"? (2024)
Author : joecobb
Score : 74 points
Date : 2025-12-02 19:56 UTC (5 days ago)
(HTM) web link (pqnelson.github.io)
(TXT) w3m dump (pqnelson.github.io)
| stingraycharles wrote:
| To me the definition of literate programming is much less
| interesting than the spirit: for complicated logic / parts of
| code, I try to take the reader through the whole top-down plan /
| approach, as if it's a story I'm writing to my colleagues about
| what's going on and why. In those parts of code I can easily have
| 10 times as many lines of comments as code, but it's important
| to use it sparingly: people tend to start to ignore comments if
| they're low value. But it's _much_ more effective to have good
| comments than external documentation, as external documentation
| has a tendency to go out of sync with the code.
|
| As with most things, don't be dogmatic.
| toolslive wrote:
| > As with most things, don't be dogmatic.
|
| It depends. If you want to learn faster, you should be
| dogmatic: "In der Beschränkung zeigt sich erst der Meister."
| ("It is in limitation that the master first shows himself.") If
| you want to become a better programmer, please _do_ set _extra_
| challenges (e.g. pure lazy functional programming only, pure
| literate programming, ...)
| stingraycharles wrote:
| That's true, I was mostly referring to it in a professional
| setting, not for educational purposes.
| loa_in_ wrote:
| The examples are definitely acknowledgement worthy.
|
| I imagine the biggest hurdle on the path towards adopting this is
| writing down clear, readable prose using highly technical
| language. And naming things. Using ambiguous human language to
| describe a complex algorithm without causing a conflict in a big
| team.
| em500 wrote:
| This essay seems to be missing the main primary references for
| literate programming:
|
| https://www.cs.tufts.edu/~nr/cs257/archive/literate-programm...
|
| https://www-cs-faculty.stanford.edu/~knuth/lp.html
|
| Knuth's intention seems clear enough in his own writing:
|
| _Literate programming is a methodology that combines a
| programming language with a documentation language, thereby
| making programs more robust, more portable, more easily
| maintained, and arguably more fun to write than programs that are
| written only in a high-level language. The main idea is to treat
| a program as a piece of literature, addressed to human beings
| rather than to a computer._
|
| and
|
| _Let us change our traditional attitude to the construction of
| programs: Instead of imagining that our main task is to instruct
| a computer what to do, let us concentrate rather on explaining to
| human beings what we want a computer to do._
| d-lisp wrote:
| I dream of a world where the Knuth idea of programming and
| mathematics are naturally embedded in our cultures, like novels
| are.
|
| I find it weird to not be able to find linux source code and
| commentaries or even math/physics/science masterpieces in
| libraries where you can find Finnegans Wake easily (at least
| where I live), and not be able to talk about GHC in
| between two discussions about romance or the weather at the
| bakery.
| Nevermark wrote:
| > I find it weird to not be able to find linux source code
| and commentaries
|
| That one statement is a great concise explanation/motivation
| for "literate programming".
|
| Explanations with code, that explain code design choices, in
| a way that enables the code to be understood better, and the
| ideas involved to be picked up and applied flexibly to
| reading and writing other code.
|
| Another way to view it is: Developers are "compilers" from
| ideas to source. Documenting the ideas along with the
| "generated" source, is being "open source" about the origin
| and specific implementation of the source.
| 1718627440 wrote:
| This is how I see version control. Adding another dimension
| to every line of code, one that explains why that code is
| the way it is.
| zahlman wrote:
| > I chose the name WEB partly because it was one of the few
| three-letter words of English that hadn't already been applied
| to computers.
|
| Heh.
| rhdunn wrote:
| In a way this is what notebooks are for Python and other
| languages. They mix documentation and code such that you can
| run that code and inspect the output. See for example the
| pytorch tutorials.
| electroglyph wrote:
| or all the unsloth notebooks
| d-lisp wrote:
| Yes, notebooks are a restrictive type of literate
| programming, interactive and browser bound.
|
| TeX was "proven" as a text/typography tool by the fact that
| its source code, written in WEB (interleaving Pascal and TeX
| (this is meta (metacircular))), allows you both to "render"
| the program as a typeset work explaining how TeX is made and
| to run the program as a means of creating typographic work.
|
| I'm lacking the words for a better explanation of how I
| feel about the distinction, but in a sense I would say that
| notebooks are literate scripts, while TeX is a literate
| program? (The difference is aesthetic.)
| d0mine wrote:
| There is Org Babel in Emacs that can be an alternative to
| jupyter notebooks for literate programming
| (research/devopsy tasks). It is more powerful in some
| aspects and weaker in others.
| smusamashah wrote:
| If a program has very detailed comments, will it fall under
| the literate programming pattern?
| dandersch wrote:
| Couple things that helped me understand literate programming:
|
| - A literate program has code and documentation interleaved in
| one file.
|
| - _Weaving_ means extracting documentation and turning it into
| e.g. a pdf.
|
| - _Tangling_ means extracting code in a form that is
| understandable to a compiler.
|
| A crucial thing to actually make this paradigm useful is the
| ability to change around the order of your code snippets, i.e.
| not letting the compiler dictate order. This enables you to code
| top-down/bottom-up however you see fit, like the article
| mentioned. My guess on why people soured on literate programming
| is that their first introduction involved using tools that didn't
| have this ability (e.g. jupyter notebooks). Also, you usually
| lose a lot of IDE features: no go-to-definition, bad auto-
| complete, etc.
|
| IMO, the best tool that qualifies for proper literate programming
| is probably org-mode with org-babel. It's programming-language
| agnostic, supports syntax highlighting, and offers noweb for
| reordering. Of course it requires getting into the Emacs
| ecosystem, so it's destined to stay obscure.
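|
| A minimal sketch of the tangling step with reordering described
| above, assuming a noweb-like notation: "<<name>>=" opens a chunk,
| a lone "@" closes it, and "<<name>>" on its own line refers to
| another chunk. The function names and the sample document are
| hypothetical, and this is not how org-babel or noweb are
| implemented.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+?)>>=\s*$")      # a line that opens a named chunk
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")  # a reference on a line of its own


def parse_chunks(source):
    """Collect named code chunks; a line containing only '@' ends a chunk."""
    chunks = {}
    current = None
    for line in source.splitlines():
        start = CHUNK_DEF.match(line)
        if start:
            current = start.group(1)
            chunks.setdefault(current, [])
        elif line.strip() == "@":
            current = None
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks, name, indent=""):
    """Expand chunk `name` recursively, keeping each reference's indentation."""
    out = []
    for line in chunks[name]:
        ref = CHUNK_REF.match(line)
        if ref:
            out.extend(tangle(chunks, ref.group(2), indent + ref.group(1)))
        else:
            out.append(indent + line)
    return out


# The prose order (overview first, detail later) differs from the
# order the interpreter needs; tangling restores the latter.
DOC = """\
The program prints a greeting. We start with the overall shape.

<<main>>=
def main():
    <<greet>>

main()
@

Only now do we say what greeting means.

<<greet>>=
print("hello")
@
"""

program = "\n".join(tangle(parse_chunks(DOC), "main"))
print(program)
```

| The document introduces "main" before "greet", yet the tangled
| output places the greeting code inside the function body where
| the language requires it.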
| ChrisMarshallNY wrote:
| I'd guess that tools like Doxygen and Apple docc are probably
| the most obvious examples of documentation extraction.
|
| I've written code for many years, with Doxygen/Jazzy/docc in
| mind (still do[0]). I feel that it's a big help.
|
| [0] https://littlegreenviper.com/leaving-a-legacy/
| cjfd wrote:
| Documentation like Doxygen's is almost the complete opposite
| of literate programming. The comment you are responding to
| emphasizes the ability to determine for yourself the order in
| which to present the documentation. Literate programming is
| writing a document in the first place, from which, as an
| afterthought, a program can be extracted. Source code with
| Doxygen is source code from which, as an afterthought,
| documentation can be extracted. In many cases Doxygen
| documentation is quite worthless. Very often it is very
| helpfully documented that the method get_height "gets the
| height". It is very fragmentary documentation where the big
| picture is completely missing. There is also a case where
| Doxygen-like documentation is needed: when writing a library
| that is going to be used by many people. But then the Doxygen
| comments should only be used on methods that you want those
| other people to use. And then there is still the danger that
| there will be too little higher-level documentation because
| the Doxygen output is treated as if it were sufficient.
|
| Literate programming is, in my opinion, used only very
| seldom because keeping an accurate big picture view of a
| program up to date is a lot of work. It fits with a waterfall
| development process where everything that the program is
| supposed to do is known beforehand. It fits well with
| education. I think it is no coincidence that it was brought
| to prominence by D.E. Knuth who is also very famous as an
| educator.
| ChrisMarshallNY wrote:
| OK. Fair enough, but remember that Doxygen also analyzes
| code structure, and can generate things like UML diagrams,
| and inheritance trees.
|
| Maybe a tool like Rational Rose is more along those lines.
|
| I've always been a proponent of writing code in a manner
| that affords analysis, later. That's usually more than just
| adding headerdoc.
| svilen_dobrev wrote:
| >> A literate program has code and documentation interleaved in
| one file.
|
| >> - Weaving means extracting documentation and turning it into
| e.g. a pdf.
|
| >> - Tangling means extracting code in a form that is
| understandable to a compiler.
|
| Interesting. A few times I have made domain-specific
| "languages" - for chip-module testing, or for HR/payroll stuff -
| expressed in some general language with an engine underneath,
| which allowed both for rendering the DS-"code" into various
| machine-readable outputs - Verilog, LabVIEW, ... - and for
| various documentation formats. Essentially a self-contained
| code-piece-with-execution/s-and-documentation/s, with the
| feature to "explain" what goes on, change-by-change.
|
| Never knew it might be called literate programming.
| npodbielski wrote:
| Maybe this will be an unpopular opinion, but if your idea has to
| be explained in a blog post after 50 years, maybe it was not that
| good after all. Or maybe the idea was good but the state of the
| tools and the culture of your field make it a poor place to
| implement it. As the blog post asks: what tool would you use for
| literate programming? Or do you need to write a tool for literate
| programming first? To me it sounds a bit like a runnable Python
| notebook, which is great for DevOps stuff but not really for
| developing a financial system. And I do not want to get started
| on the lack of tests that the author mentions.
| js8 wrote:
| Maybe I am weird, but I would like to see/program in a formal,
| yet fuzzy/modal language, which could serve as a metalanguage
| that describes (documents) the program. This metalanguage must
| have some kind of constructs to describe unknown things, or
| things that are deliberately simplified in favor of exposition.
| So basically eschew natural language completely in favor of fully
| formalized description, that could be manipulated
| programmatically.
|
| However, I don't know what this metalanguage should be. I don't
| know how to translate typical comments (or a literate program)
| into some sort of formal language. I think we have a gap in
| philosophy (epistemology).
| svilen_dobrev wrote:
| Search for "controlled natural language". There were many
| attempts in the past, ~20 years ago (one of these is even called
| "Attempto"), but next to nothing recently. There seems to be
| not enough interest among wide audiences.
| GrantMoyer wrote:
| > This metalanguage must have some kind of constructs to
| describe unknown things, or things that are deliberately
| simplified in favor of exposition.
|
| Perhaps you're thinking of mathematics.
|
| If you have to be able to represent arbitrary abstract logical
| constructs, I don't think you can formalize the whole language
| ahead of time. I think the best you can do is allow for ad-hoc
| formalization of notation while trying to keep any newly
| introduced notation reasonably consistent with previously
| introduced notation.
| svilen_dobrev wrote:
| well, maybe it is everything that is not "illiterate
| programming", i.e. "programming-without-understanding".. which
| decade by decade gets more and more abundant/dominating.
|
| i do similar thing which i call live-sketching.. a mostly-no-
| content python namespace-hierarchy of module(s) and classes (used
| as just namespace holders), and then add (would-do-something)
| "terminal" methods, and combine-those-into-flows actual
| "procedures" methods , here and there .. until the
| "communication" diagram starts to appear out of it, and week after
| week, fill the missing parts. It feels like some way of writing
| executable spec over imagined/fake stuff, and slowly replacing
| the fakes with reals. Some parts never get filled. Others are
| replaced with big-external-pieces - as-long-as matching the spec
| needed. What's left is written by hand.. and all this maybe
| multiple cycles.
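|
| A rough sketch of what such a "live sketch" might look like in
| Python, assuming classes used purely as namespaces, fake
| "terminal" methods, and one flow that combines them. Every name
| here (Kiosk, Payment, Printer, Flows) is made up for
| illustration and is not taken from the project described.

```python
class Kiosk:
    """Namespace-only skeleton: the hierarchy itself acts as the spec."""

    class Payment:
        @staticmethod
        def charge(amount_cents):
            # Fake "terminal" method: records the intent, does no real work yet.
            return {"charged": amount_cents, "fake": True}

    class Printer:
        @staticmethod
        def print_receipt(charge):
            # A part that has not been filled in yet.
            raise NotImplementedError("printer not sketched yet")

    class Flows:
        @staticmethod
        def checkout(amount_cents):
            # "Procedure" method: combines terminals into a flow and
            # tolerates parts that are still missing.
            charge = Kiosk.Payment.charge(amount_cents)
            try:
                Kiosk.Printer.print_receipt(charge)
                receipt = "printed"
            except NotImplementedError:
                receipt = "skipped"
            return charge, receipt


result = Kiosk.Flows.checkout(500)
print(result)
```

| Replacing a fake with a real implementation later changes only
| the body of one terminal method; the flow and the namespace
| structure stay as they were.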
|
| This approach allows for both keeping the knowledge of what the
| system should do - on the spec / hierarchical level - and freedom
| to leave things undone, plug some external monster, or do-it-
| yourself as one sees fit. The downside is that the plumbing
| between pieces might be bigger/messier than the pieces - if you
| have ever seen the spiderweb of wires above a breadboard with TTL
| ICs..
|
| e.g. for my last project - re-engineering a multiple-aging-
| variants of kiosk-system into coherent single codebase that can
| spawn each/most of the previous - took me 6 months to turn a zoo
| of 20x 25KLoc into single 20Kloc +- 5 for the specializations -
| and the code-structure still preserves the initial split-of-
| concerns (some call it architecture), and comms "diagram", who
| talks to who when/why.
|
| But yeah, it's not for the faint-hearted, and there is little visibility
| of the amount of work going/done, as the structure at day 1 is
| more or less the structure at day 181, and management may decide
| to see only that..
| machino wrote:
| I've inherited some CWEB code from a colleague. My interpretation
| is that you write it like stream of consciousness, interleaving
| thinking and chunks of code. Not all code you write ends up in
| the final C file.
|
| However, the final effect is spaghetti code (you can imitate
| "goto" by injecting code in different locations). And docs are
| hard to read.
|
| But, it really forces you to explain what you do and how you got
| there, which is incredibly useful for reconstructing history.
| (There is also a sort of diff file for it, I think with .ch
| extension, to amend files.)
| forgotpwd16 wrote:
| An interesting project I stumbled upon recently is AirLoom[0],
| essentially a reverse literate programming tool. Rather than
| having code and prose interwoven (either Knuth-style
| code-within-prose or doc-style/as-comments prose-within-code),
| you keep them split into segment-annotated code and prose that
| references those segments. AirLoom can then produce a combined
| document with
| references replaced by the actual code segments. This allows
| using a normal programming environment (not possible in first
| approach) and being order independent (not possible in second
| approach).
|
| [0]: https://github.com/eudoxia0/airloom
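|
| The split described above can be sketched as follows. The
| markers ("# region: NAME" / "# endregion" in code, "@include
| NAME" in prose) and the function names are assumptions made up
| for this sketch; they are not AirLoom's actual syntax or API.

```python
def lift_fragments(code):
    """Collect lines between '# region: NAME' and '# endregion' markers."""
    fragments = {}
    current = None
    for line in code.splitlines():
        stripped = line.strip()
        if stripped.startswith("# region:"):
            current = stripped[len("# region:"):].strip()
            fragments[current] = []
        elif stripped == "# endregion":
            current = None
        elif current is not None:
            fragments[current].append(line)
    return fragments


def weave(prose, fragments):
    """Replace each '@include NAME' line in the prose with the indented fragment."""
    out = []
    for line in prose.splitlines():
        if line.startswith("@include "):
            name = line[len("@include "):].strip()
            out.extend("    " + code_line for code_line in fragments[name])
        else:
            out.append(line)
    return "\n".join(out)


# The code file stays an ordinary source file that normal tooling can read.
CODE = """\
# region: area
def area(width, height):
    return width * height
# endregion

# region: usage
print(area(3, 4))
# endregion
"""

# The prose is free to present fragments in any order.
PROSE = """\
We call the function first:
@include usage
and only then show how it is defined:
@include area
"""

document = weave(PROSE, lift_fragments(CODE))
print(document)
```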
| elviejo wrote:
| There is also verso / recto that uses the same technique.
|
| https://github.com/nickpascucci/verso
|
| I actually wish for a tool that would use two things: 1)
| navigate code like a file system: Class/function/lines [3..5]
|
| 2)allow us to use git commit revisions so that we could comment
| on the evolution of the code
|
| So far the only thing capable has been leoEditor + org-babel
| PhilipRoman wrote:
| Thanks for mentioning this. I built the same thing a year ago
| for myself in a dozen lines of AWK. Looks like great minds think
| alike :)
|
| In my opinion this is the most practical approach for real
| world projects. You get benefits like avoiding outdated
| documentation without huge upfront costs.
| antiquark wrote:
| I seriously looked into it many years ago...
|
| One problem with "literate programming" is it assumes that good
| coders are also good writers, and that good writers are also good
| coders.
|
| Another problem is that the source files for the production code
| will have to be "touched" for documentation changes. Which IMHO
| is an absolute no-no for production code. Once the code has
| been validated, no more edits! If you want to edit docs, go
| ahead, just don't edit the actual source.
| WillAdams wrote:
| I would turn this around --- it acknowledges the fact that if
| one needs to write a complex program, and to maintain it over
| the long term, one will need not just the raw code, but also
| documentation for how that code was written, and how changes to
| it should be approached.
|
| As I noted elsethread, the big thing which Literate Programming
| has netted me is that it makes editing easier/manageable, even
| for long and complex projects spread across multiple files ---
| having the single point of control/interaction where I can:
|
| - make the actual change to the code to implement a new feature
|
| - change the optional library which exposes this project to a
| secondary language
|
| - update the documentation to note the new interface
|
| - update the sample template files (one for the main
| implementation, the other for the secondary) to reflect the new
| feature
|
| - update an on-going notes.txt file where the need for the new
| feature was originally noted
|
| is _huge_ and ensures that no file is missed in the update.
| nerdypepper wrote:
| relatedly, i have been using literate haskell to document my
| advent of code journey this year:
|
| - day 5's solution for example:
| https://aoc.oppi.li/2.3-day-5.html#day-5
|
| - literate haskell source:
| https://tangled.org/oppi.li/aoc/blob/main/src/2025/05.lhs
|
| the book/site is "weaved" with pandoc, the code is "tangled" with
| a custom markdown "unlit" program that is passed to GHC.
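|
| For illustration, a minimal sketch of what a markdown "unlit"
| step could do: keep only the bodies of fenced code blocks tagged
| with one language. This is an assumption about the general
| technique, not the commenter's actual program, and how such a
| tool is wired into GHC is not shown. The sample text is made up.

```python
FENCE = "`" * 3  # built indirectly so this example can itself sit inside a fence


def unlit(markdown, lang="haskell"):
    """Keep only the bodies of fenced code blocks tagged with the given language."""
    out = []
    in_fence = False
    keeping = False
    for line in markdown.splitlines():
        if line.startswith(FENCE):
            if in_fence:
                # Closing fence: stop collecting.
                in_fence = keeping = False
            else:
                # Opening fence: collect only if the tag matches.
                in_fence = True
                keeping = line[len(FENCE):].strip() == lang
        elif keeping:
            out.append(line)
    return "\n".join(out)


SAMPLE = "\n".join([
    "# Day 5",
    "",
    "We need a module header first.",
    "",
    FENCE + "haskell",
    "module Main where",
    FENCE,
    "",
    "Example input, not code:",
    "",
    FENCE + "text",
    "3-5",
    FENCE,
    "",
    "The entry point just prints a placeholder.",
    "",
    FENCE + "haskell",
    "main :: IO ()",
    'main = putStrLn "todo"',
    FENCE,
])

haskell_source = unlit(SAMPLE)
print(haskell_source)
```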
| zkmon wrote:
| You are beating around a bush of nothing.
| rgreeko42 wrote:
| Aren't Org Mode and your LISP of choice the ideal literate
| programming environment? I'm surprised REPL-based LISP isn't
| mentioned at all.
| WillAdams wrote:
| There are a lot of texts which were left out of this post --- I've
| been trying to collect literate programs published as books here:
|
| https://www.goodreads.com/review/list/21394355-william-adams...
|
| Not sure where the author got the contention that there are only
| a few tools for literate programming --- it's a straightforward
| enough task that many programmers do this --- heck, even I
| managed to (w/ a bit of help on tex.stackexchange):
| https://github.com/WillAdams/gcodepreview/blob/main/literati...
| --- if it were more complex, and wasn't so implementation-
| specific (filenames need to be specified in multiple places), I'd
| write it up as a Literate Program and put it up on CTAN as a
| package.
|
| One classic bit of advice for writing is, 'It is perfectly okay
| to write garbage as long as you edit brilliantly.' --- the great
| thing about a Literate Program is that it makes the act of
| editing far simpler, which has made feasible every program I've
| ever written which got past the 1K lines mark --- including an
| AppleScript for InDesign which Olav Martin Kvern, then the
| "Scripting Evangelist" for Adobe Systems declared to be
| impossible (my boss had promised a system for creating a four-
| level deep index from XML embedded in the text of pages in an
| InDesign document, while OMK averred that it was impossible to
| create an index entry for more than the main level of the index
| --- one has to have code which tracks the existence of an entry
| at each level of the index and where it does not exist, starting
| at the top-level, insert it, then work down and add the sub-
| index-entry to the index-entry it is beneath).
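|
| The level-by-level bookkeeping described in the parenthesis can
| be sketched with nested dictionaries. This is only a data
| structure illustration under assumed names (add_index_entry,
| "pages", "sub"); it is not AppleScript and does not model
| InDesign's scripting interface. The headings are invented.

```python
def add_index_entry(index, path, page):
    """Insert `page` under a multi-level heading path, creating missing levels."""
    if not path:
        raise ValueError("path must contain at least one heading")
    level = index
    for heading in path:
        # Starting at the top level, create the entry only if it does
        # not exist yet, then work down into its sub-entries.
        node = level.setdefault(heading, {"pages": [], "sub": {}})
        level = node["sub"]
    node["pages"].append(page)
    return index


index = {}
add_index_entry(index, ["Fonts", "OpenType", "Features", "Ligatures"], 12)
add_index_entry(index, ["Fonts", "OpenType", "Features", "Kerning"], 15)
add_index_entry(index, ["Fonts", "OpenType"], 9)

features = index["Fonts"]["sub"]["OpenType"]["sub"]["Features"]
print(list(features["sub"]))
```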
| zupatol wrote:
| Another successful example of literate programming is fastHTML,
| and probably most of the code written at fast.ai and answer.ai.
| https://fastht.ml/docs/
|
| Here's Jeremy Howard explaining why he loves doing everything in
| notebooks: https://www.youtube.com/watch?v=9Q6sLbz37gk
| tehologist wrote:
| Literate programming is alive and well in 2025.
|
| https://leo-editor.github.io/leo-editor/
|
| https://kaleguy.github.io/leovue/#/t/2
|
| https://ganelson.github.io/inweb/inweb/index.html
|
| Inform 7 is arguably one of the largest programs ever written in
| literate style.
| Jtsummers wrote:
| Perhaps the most prominent example of literate programming missed
| by the author: https://www.pbrt.org/ _Physically Based Rendering_
| by Pharr, Jakob, and Humphreys.
|
| Responding directly to a couple things the author wrote:
|
| > When programming, it's not uncommon to write a function that's
| "good enough for now", and revise it later. This is impossible to
| adequately do in literate programming.
|
| It's not impossible in literate programming. There's nothing
| about LP that impedes this, I do it all the time. I have a quick
| obvious implementation (perhaps a naive recursive solution) and
| throw it in to get things working. I revisit it later when I need
| to make that naive recursive one faster (memoization, DP, or just
| another algorithm altogether). It's no harder than what I'd do
| with an ordinary approach to programming.
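|
| As a small illustration of that workflow (the Fibonacci function
| is an assumed stand-in, not the commenter's code): the first
| chunk is the quick naive version, and the later revision swaps
| in a memoized one without touching the callers.

```python
from functools import lru_cache


def fib_naive(n):
    """First pass: the obvious recursive definition, exponential time."""
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_memo(n):
    """Later revision: same definition, with results cached."""
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)


print(fib_naive(10), fib_memo(50))
```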
|
| > Unit testing is not supported one bit in WEB, but you can
| cobble something together in CWEB.
|
| WEB was designed for use with Pascal and CWEB for C and C++. At
| the time the tools were developed, "unit testing" in today's
| sense was not really a widespread thing. Use other tools if you
| find that WEB is impeding your use of unit tests in your Pascal
| programs. With other tools (org-mode and org-babel are what I
| use), it's easy to do. Like with writing good enough functions,
| you just do it, and it's done. You write a unit test in a block
| of code and when it gets tangled you execute your unit tests.
| This can be more cumbersome in some languages than with others,
| but in Python it's as easy as:
|
|     #+BEGIN_SRC python :noweb yes :tangle test/test_foo.py
|     from hypothesis import ...
|     from pytest import ...
|     <<name_of_specific_test>>
|     <<name_of_other_test>>
|     #+END_SRC
|
|     #+NAME: name_of_specific_test
|     #+BEGIN_SRC python
|     def test_frob(...):
|         ...
|     #+END_SRC
|
| When I used LP regularly I had a little script I wrote that would
| tangle source from my org files, and because I had the names and
| paths specified everything would end up in the right place. This
| is followed by running `pytest` (or whatever test utility) as
| normal. I used this in makefiles and other scripts. This is only
| slightly harder than the normal approach, but not hard. I added a
| `tangle` step into my build and test process and it was good to
| go.
|
| If your unit test system requires more ceremony then you'll need
| to include that as well, but you'd have to include that in your
| conventionally written code as well.
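|
| A possible shape for such a tangle-then-test step, sketched in
| Python rather than a makefile. It assumes Emacs with Org is
| installed and on the PATH, that the literate source lives in a
| hypothetical "project.org", and that tests are tangled into a
| "test" directory. The commenter's own script is not shown in the
| thread, so this is a guess at the general idea only.

```python
import subprocess


def build_steps(org_file, test_dir="test"):
    """Return the two commands of the build: tangle first, then test."""
    tangle = [
        "emacs", "--batch",
        "--eval", "(require 'ob-tangle)",
        "--eval", f'(org-babel-tangle-file "{org_file}")',
    ]
    test = ["pytest", test_dir]
    return [tangle, test]


def run_steps(steps):
    """Run each command in order, stopping at the first failure."""
    for argv in steps:
        subprocess.run(argv, check=True)


steps = build_steps("project.org")  # hypothetical file name
for argv in steps:
    print(" ".join(argv))
# Call run_steps(steps) to actually execute the tangle and test commands.
```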
| kyykky wrote:
| I think assigning coding tasks to AI is a sort of literate
| programming without the programming part.
| froh wrote:
| Is there some literate programming LSP server around, which under
| the hood tangles the code chunks for language specific child LSP
| servers, and proxies those? so you have LSP support in the
| litprog source?
|
| it would probably also semi-weave the source into a standard,
| say, markdown or latex or asciidoc and proxy that LSP server on
| those woven files.
| cxr wrote:
| "Write code top-down"[1] is to literate programming what
| "concentrate on writing readable code" is to code comments.
|
| (Having said that, I firmly hold the opinion that we should all
| be writing READMEs in HTML[2][3] (instead of Markdown) and more
| fully exploring/exploiting the capabilities--and ubiquity--of web
| browsers to enable "smart documentation": self-contained (i.e.
| single-file) study aids, visualization widgets, etc[4].)
|
| 1. <https://www.teamten.com/lawrence/programming/write-code-
| top-...>
|
| 2.
| <https://hn.algolia.com/?dateRange=all&type=comment&prefix=tr...>
|
| 3. <https://crussell.ichi.city/pager.app.htm>
|
| 4. <https://holzer.online/articles/easteregg-lp-style/>
| there4 wrote:
| The BackboneJS library and the documentation generated by Docco
| have always been an interesting example. Compare the annotated
| source with the source at GitHub:
|
| https://backbonejs.org/docs/backbone.html
| https://github.com/jashkenas/backbone/blob/master/backbone.j...
| https://ashkenas.com/docco/
___________________________________________________________________
(page generated 2025-12-07 23:01 UTC)