[HN Gopher] What docs-as-code means
___________________________________________________________________
What docs-as-code means
Author : remoquete
Score : 66 points
Date : 2024-10-20 11:24 UTC (11 hours ago)
(HTM) web link (passo.uno)
(TXT) w3m dump (passo.uno)
| cheschire wrote:
| This feels like a job ripe for startup disruption.
|
| LLM documentation generators tend to benefit from context, and
| nothing provides better context than a mostly functional code
| base. The best part is the code doesn't even need to compile for
| the LLM to build the context needed.
| remoquete wrote:
| Author here. No, I don't think it does.
| https://passo.uno/ai-anxiety-tech-writer-howto/
| cheschire wrote:
| You're speaking from a position of survivorship. The jobs
| you've been hired to perform give you confidence. It's all
| the jobs you _haven't_ been hired for that I'm more focused
| on.
| remoquete wrote:
| And the resulting failed documentation, I guess.
| chaffroomba wrote:
| Problem I'm having as a developer with LLM documentation is
| their reliability, or rather lack of it. Every time there is an
| assertion I end up having to double-confirm it because they
| tend to be wrong as often as they're right. Reading imaginary
| hallucinated documentation is just about as useful as zero
| documentation.
|
| While I could keep doing this for the rest of my life, my
| employer doesn't really appreciate the extra expense. A
| technical writer is much, much cheaper than the dozens of
| developers trying to confirm the docs.
| exe34 wrote:
| > Reading imaginary hallucinated documentation is just about
| as useful as zero documentation.
|
| no, it's worse. it's closer to reading outdated documentation
| that outright lies and gives examples that don't work, and
| will cause you to waste hours/days learning things that
| aren't relevant to the api anymore.
| joshuanapoli wrote:
| I think we will go the other way: A person will write the
| documentation and the AI will build the system that delivers
| the described product.
| chaffroomba wrote:
| Isn't that what sw development today is? A description of a
| system and what we want it to do. With the advance of
| compilers, libraries, frameworks, linters, autocomplete
| systems and so on, we're already very close to describing the
| minimum amount of information the system needs in order to
| produce the correct result. To my knowledge actually
| physically writing the software has not been a bottleneck in
| a very, very long time.
| joshuanapoli wrote:
| Right now, it takes skill and labor to move descriptions
| between representations for business goals, engineering
| (where we have frameworks and linters, etc.), and
| external/customer facing documentation.
|
| The customer is faced with an output from the design
| process. I think that we can turn that around now. Let
| customers edit part of the documentation, and let the AI
| adapt the system to their need.
| philipwhiuk wrote:
| https://www.commitstrip.com/en/2016/08/25/a-very-
| comprehensi...
| euroderf wrote:
| I would think that Requirements Engineering would be the
| ideal prep for AI code generation. A natural language
| counterpart to (e.g.) TLA+.
|
| Is this too damned obvious? Or am I missing something?
| smokel wrote:
| The main problem is that documentation should be written on
| _why_ code is written the way it is, and why it exists in the
| first place. This context is typically not available in the
| code itself. In the best case, this is encoded in requirements
| and design documentation, but more often than not, the
| information remains only in the heads of customers, architects,
| and developers.
|
| The code itself merely describes _how_ something is solved.
| Summarizing that in documentation can be useful at times, but
| it is not the full story. Especially for code that lives on for
| some time, the original design philosophy is often lost, and it
| is forced in horrible directions.
| janice1999 wrote:
| Code doesn't capture architectural decision making, one of the
| most important being why other solutions were not used. LLMs
| would need to be nearly omniscient (understanding underlying
| hardware, customer needs, budgets even) to derive the reasoning
| behind those decisions afterwards from code alone.
| from-nibly wrote:
| A source code repository is actually a terrible place to get
| that context. All of the things like decisions and how this
| relates to the customer are completely missing.
| Etheryte wrote:
| This completely misses what valuable code documentation is for.
| Anyone can read the code and figure out what it does. Documenting
| the what can be convenient, but it's not really all that
| useful, especially if it rots over time. Even if it's painful
| at times, the what is all there in front of you in the code.
| Valuable documentation explains the why, why was one approach
| chosen over another, why do we need to do this to begin with,
| why are the edge cases the way they are. This information is
| not present in the code and no tool can ever extract it since
| it isn't there.
| keybored wrote:
| > LLM documentation generators
|
| I can't wait for the LLM digression-generators to quit.
| sshine wrote:
| The problem with "documentation" in commercial code is that
| nobody reads it.
|
| Any "documentation" that isn't embedded in code will never be
| opened again.
|
| Any "documentation" in the form of doc-strings and comments will,
| over time, lie.
|
| This is because competent programmers cannot agree on whether to
| even comment.
|
| When any large percentage of competent programmers do not
| comment, it is better to not rely on comments.
|
| Here's another pitch: Code-as-Docs
| llm_trw wrote:
| Here's another: literate programming.
|
| If it's good enough for Knuth it's good enough for me.
| oersted wrote:
| I tried to do it seriously a number of times. Perhaps I am
| doing it wrong, but my productivity drops like a stone.
|
| Writing down your thoughts as you go and maintaining them
| takes serious time. The romantic notion of writing (and
| reading) code like a book is appealing, but writing books is
| hard and arduous, it should not be underestimated as a craft in
| its own right, and there is coding to be done.
|
| There is also the question of structuring the literate code.
| Telling a story of how you are building it or explaining how
| it works has a very different flow and order to how code is
| usually structured for good maintainability.
|
| Please correct me if I'm wrong, because I would love to dive
| into it, but I don't think there has ever been any major
| piece of software developed following literate programming
| (at least as Knuth envisioned it). I also don't think there
| is any significant book that contains a sizeable working
| program embedded in it throughout, that can be compiled and
| executed as-is.
|
| In practice Knuth was most concerned with embedding short
| code snippets in his papers and books. Having the whole thing
| be an actual compilable program was secondary, and it was
| mostly short academic proof-of-concept prototypes and
| algorithms.
|
| Don't get me wrong, I love the concept, that's why I have
| given it multiple serious tries over the years, likely I will
| again, and why I think I have some insight of what happens
| when you use it for "real-world" work.
| llm_trw wrote:
| NoWeb can support multi-thousand page documents which can
| compile to tens of thousands of lines of code.
|
| I used it at a deep tech startup I worked at a number of years
| ago to document the theory behind why the code was doing
| what it was. Doubly useful since I could just use a regular
| bibtex citation system for papers which had done some part
| of what we were trying to do.
|
| My code became the de facto technical onboarding document,
| still in use today, despite the fact that none of the code
| in it has been updated since I left.
|
| For examples you can read: http://brokestream.com/tex-web.html
| TeX is written entirely in WEB.
| oersted wrote:
| Thanks for the insight. Just to be clear, writing
| documentation after the fact with lots of code snippets
| is obviously good and is standard practice.
|
| You can take the next step and ensure that the entirety
| of the code is in your documentation as snippets. This
| usually doesn't make much sense, there's lots of code
| that it is not worth explaining in a literate style. And
| what's the point of the documentation containing the
| whole program if it is written after the fact and you
| already have a standard more maintainable codebase as the
| ground truth? The fact that your literate code didn't get
| touched says a lot.
|
| To me the name Literate _Programming_ implies that you
| write the code in a literate style from the outset. If
| you make it literate after the fact it is just normal
| documentation with snippets isn't it?
| llm_trw wrote:
| >The fact that your literate code didn't get touched says
| a lot.
|
| The fact it's still used 5 years later without any of the
| code still being in production says more.
|
| >To me the name Literate Programming implies that you
| write the code in a literate style from the outset.
|
| This seems like a fundamental misunderstanding of what it
| means to write. You should perhaps look at how people who
| do it for a living write books or articles. The final
| document has little to do with what you spend most of your
| time editing.
|
| I absolutely hacked on the tangled source code of the
| program when trying to fix bugs or extend capabilities.
| Once I knew what I wanted to do I put it back in the
| literate program, usually finding a lot more bugs in the
| process.
| oersted wrote:
| Perhaps I came across as too critical, I have a lot of
| respect for what you did and for the craft of technical
| writing. And I definitely understand that writing is not
| done linearly and is very iterative.
|
| Correct me if I'm wrong, but it sounds like you were
| documenting existing code and that the result was very
| valuable as documentation, but not necessarily as code.
| You were acting as a technical writer not a programmer,
| it's a bit of a disconnect to call it Literate
| _Programming_, even if you were using Literate
| Programming tooling (NoWeb).
|
| This kind of documentation is common practice all over
| industry and it is valued, but I don't think Literate
| Programming is considered to be widely adopted because of
| that.
| llm_trw wrote:
| I'm having a hard time even understanding what the
| question is here.
|
| You seem to be confusing the tools with the work being
| done.
|
| You can write a prototype of a C function in Python to
| see if you understand the requirements before you commit
| to the much harder task of writing it in C. That doesn't
| mean you're not writing a C program.
|
| The same is true for literate programming. I can write
| code outside the main literate program when I'm not sure what
| it's meant to do before I put it back in.
| oersted wrote:
| What I'm saying is simply that as you describe it, you
| are first writing the code normally, and then separately
| writing some documentation about it accompanied by code
| snippets for context.
|
| But if that's Literate Programming then everyone is doing
| it and it's not a very meaningful label, it's just
| documentation.
|
| I do get it, the distinction is that you are using NoWeb
| and you can convert between the documentation and the
| code, and that the documentation contains the entirety of
| the code. I suppose that's neat.
|
| At some point, this boils down to a pointless discussion
| of semantics (my fault). "Literate Programming" as you
| describe it does not sound like a style of "Programming".
| Actually, when you reverse the Programming/Writing
| emphasis, it simply becomes "Technical Writing", which is
| what everyone does, because that's what's actually
| needed. And it is done by great writers rather than great
| programmers (which may describe Knuth, with the utmost
| respect).
|
| I always interpreted it as writing the text and the code
| together, logging your thought process, thinking of code
| like a piece of literature as it is written, rather than
| adding some documentation to it later. The notion that
| writing it like this will yield better code, regardless
| of its value as documentation. I suppose that's why it
| was unproductive for me, it is a rather romantic
| interpretation (again, my fault).
| llm_trw wrote:
| You write it however you want.
|
| I don't know how you code, but the first draft of code is
| never what ends up in the code base. Neither is the first
| draft of the documentation. You can write both together,
| but until you have an idea of what the structure of the
| code would look like, and how to split it up then you're
| better off doing multiple drafts.
|
| As always code is read much more often than it is written
| and literate programming is used for the reading part,
| not the writing part. The efficiency comes in not having
| to guess what 0x5F3759DF is there for.
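|
| For context, 0x5F3759DF is the magic constant from the well-known
| fast inverse square root routine. A minimal Python sketch of how a
| commented version might read (the function name and the wording of
| the comments are illustrative assumptions, not taken from any
| literate source):

```python
import struct


def fast_inverse_sqrt(x: float) -> float:
    """Approximate 1 / sqrt(x) for a positive x using the bit-level trick."""
    # Reinterpret the 32-bit float as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]

    # WHY 0x5F3759DF: for a positive float, the bit pattern read as an
    # integer is roughly a scaled and shifted log2(x). Shifting right
    # halves that logarithm and subtracting negates it, which approximates
    # log2(x ** -0.5). The constant re-adds the exponent bias and is tuned
    # to reduce the error of this first guess.
    bits = 0x5F3759DF - (bits >> 1)

    # Reinterpret the adjusted integer as a float again.
    guess = struct.unpack("<f", struct.pack("<I", bits))[0]

    # One Newton-Raphson step refines the guess.
    return guess * (1.5 - 0.5 * x * guess * guess)
```

| With the comments in place, a reader can check the claim directly:
| fast_inverse_sqrt(4.0) lands within about 1% of 0.5.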
| sundarurfriend wrote:
| > I absolutely hacked on the tangled source code of the
| program when trying to fix bugs or extend capabilities.
| Once I knew what I wanted to do I put it back in the
| literate program
|
| Does NoWeb automate this "untangling" process in any way?
| I sometimes use Weave.jl [1] when I'm thinking out loud
| through code, and at times it would be nice to just work
| on the tangled code, refactor and reorganize things, and
| have it all untangle back into the original in some way.
| I have no idea how that would work though, and it would
| likely be pretty limited even if it existed, but I'm
| curious what the usual approach you take to this is.
|
| [1] https://weavejl.mpastell.com/stable/
| llm_trw wrote:
| No, in noweb programs you insert chunks of code in
| multiple places and have conflicts when you try and
| automatically merge the code back too often.
|
| Org mode has a function which does this, but they didn't
| allow for arbitrary chunk nesting the last time I looked.
|
| Emacs has a number of very useful features in the modes
| for noweb/tex, one of which is jumping to the chunk which
| the code came from in the tangled source code on the
| pretty printed PDF. This follows the spirit of what you
| want. In fact SyncTeX support comes pretty much out of
| the box for noweb files and makes their editing a breeze,
| either as text or code.
|
| Of course if you're not on Emacs then god help you.
| textread wrote:
| Are you using the following workflow?
|
| orgmode file --> export to pdf (aka weave)
|
| orgmode file --> org babel tangle
|
| Would you please help me understand your workflow for this?
|
| > jumping to the chunk which the code came from in the
| tangled source code on the pretty printed PDF
|
| Do the codeblocks in your pdfs contain hyperlinks back to
| the org file where they came from?
| llm_trw wrote:
| No I'm using noweb. There is an option in noweb to add
| comments in the tangled code with the line and file from
| which they originate. Then there's an Emacs mode that
| lets you jump to that code. I wrote a little function
| that lets you instead jump to the same line in the
| PDF using SyncTeX.
| oersted wrote:
| It seems like the site got the HN hug-of-death. Mirror:
| https://web.archive.org/web/20221130150047/http://brokestrea...
|
| I have skimmed the TeX literate PDF (I did a number of
| times in the past too). Frankly isn't it just like normal
| code with verbose comments? I have seen lots of code like
| that and it is not referred to as literate. The only
| difference is that this is a PDF, which makes it less
| practical and it is still not particularly readable as a
| book.
|
| It might have great book-like typography but not the
| "narrative" structure that helps you properly understand
| how it works without getting bogged down in details
| first. There's no coherent outline, no chapters or
| sections for major systems or design decisions, no
| overarching overviews, no relating different parts and
| giving context. There's also no story of how it was built
| or a log of his thoughts throughout the problem-solving
| process, that would have been another good angle. Instead
| it's just the code from top to bottom with embedded very
| local commentary. The code itself is actually rather hard
| to parse visually by modern typographic standards.
|
| The issue is that probably I am misinterpreting what
| Knuth intended. The Literate Programming concept was a
| product of its time, and it has evolved into more
| practical modern documentation standards that are not so
| tightly linked to the code and don't exhaustively cover
| every line. The only problematic thing about it might
| just be the grandiose name Literate Programming, without
| that it's mainly good common-sense advice for quality
| documentation, but not necessarily a practical
| programming paradigm like the name implies.
| llm_trw wrote:
| Again, I'm having a hard time understanding what the
| issue is. It seems like you are deeply confused about
| what literate programming is and how it works.
|
| Have you read the original paper here in full:
| http://www.literateprogramming.com/knuthweb.pdf ?
|
| All of the navigation issues are taken care of by using
| <<chunks of code>> in a nested structure. You follow the
| numbers in those, like a choose-your-own-adventure game,
| to find out whatever you need.
|
| The index has a listing of everything used in the program
| along with where it was defined and where it was used in
| case you want to find something specific.
|
| More modern tools, like NoWeb, turn all of this into
| hyperlinks so you can jump around the pretty printed
| version without having to look up page numbers.
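|
| The nested <<chunks of code>> idea can be illustrated with a toy
| "tangle" step in Python. This is only a sketch under assumptions:
| chunks live in a plain dict rather than a real noweb file, the
| chunk names are made up, and there is no cycle detection. It is
| not how noweb itself is implemented.

```python
import re

# A line consisting only of an (optionally indented) <<chunk name>> reference.
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")


def tangle(chunks: dict[str, str], root: str) -> str:
    """Expand chunk `root`, recursively replacing <<name>> reference lines."""
    out = []
    for line in chunks[root].splitlines():
        match = CHUNK_REF.match(line)
        if match:
            indent, name = match.groups()
            # Nested chunks inherit the indentation of the referencing line.
            for sub_line in tangle(chunks, name).splitlines():
                out.append(indent + sub_line)
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical chunks: the prose would explain each one where it is defined.
chunks = {
    "*": "def main():\n    <<read input>>\n    <<print result>>",
    "read input": "values = [3, 1, 2]",
    "print result": "print(sorted(values))",
}

source = tangle(chunks, "*")
```

| Here `source` holds an ordinary three-line Python function, while
| the document stays free to present the chunks in whatever order
| suits the explanation.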
| oersted wrote:
| I have read the paper in the past, I am well versed in
| WEB, and I believe I have done literate programming at
| length for a number of non-trivial projects.
|
| I have explained my thinking in a separate comment
| (apologies for creating two branches). In short, I do
| think you are right and that I had an overly romantic
| notion of Literate Programming in mind.
| WillAdams wrote:
| I've found that Literate Programming suits how I
| think/approach projects, and it has worked for some quite
| large projects in the past for me.
|
| I've been maintaining a list of programs published as books
| and resources for Literate Programming at:
|
| https://www.goodreads.com/review/list/21394355-william-
| adams...
|
| esp. see:
|
| https://www.pbrt.org/
|
| and
|
| https://mruckert.userweb.mwn.de/understanding_mp3/index.html
|
| My current project is:
|
| https://github.com/WillAdams/gcodepreview
|
| which uses a LaTeX package for this which I put together
| with a bit of help from tex.stackexchange --- the big
| advantage to it is that it allows editing the
| documentation/code with "normal" syntax highlighting, the
| disadvantage is that the .sty file has to be edited/updated
| to match the files which are being output and I still don't
| have a good setup for the readme.md
|
| I find having the typeset PDF w/ its hyperlinked ToC and
| marginalia and indices helps a lot in having a "nice"
| version which I can look through to remind myself of what
| was intended at a given point, and most importantly, to
| find _where_ that was written down. Working on a re-write
| now --- we'll see if this holds up for that.
| oersted wrote:
| Awesome links, thank you. I did come across "Physically
| Based Rendering" at some point, I forgot about it. This
| is definitely an excellent example of Literate
| Programming.
| jerf wrote:
| Literate programming, as originally described by Knuth, is
| a good essential idea embodied as a bunch of accidental
| instantiations of the idea that have gone badly out of
| date. Knuth's ideas at the time add a layer on top of
| programming languages to allow you to rearrange the code in
| a lot of ways that the languages at the time didn't support
| well or at all. It essentially adds an independent concept
| of "function", and layers its own documentation overlay on
| top of whatever documentation ability the language already
| had.
|
| Problem is, in the meantime, languages got a lot better at
| functions, got more flexible in their organization, built
| in better capabilities for documentation and comments, and
| Knuth's tooling goes in a different direction than the languages did.
| The result in the modern era is a rather bizarre multi-
| headed hydra of conflicting ideas about how things should
| be documented and tested.
|
| If someone wanted to resurrect the ideas, they need to not
| just try to get people to do what Knuth laid out decades
| ago super harder... they really need to sit down from the
| very beginning and work out how to update it in the modern
| era to be less redundant to what we already have. It could
| be as simple as taking modern doc strings and upgrading
| them a bit to allow highly-formatted comments to be
| embedded into code. Or, instead of trying to "weave" the
| code into a static book, allow the user to specify an entry
| point and then follow through everything that happens in
| the functions that it calls and turn that into a book,
| e.g., say "I'm going to enter this web framework through
| this path, tell me everything that happens". Or some other
| idea I don't have yet. Something that harmonizes with
| modern languages instead of fighting them.
| GuB-42 wrote:
| Knuth is not the average programmer by far. And I am not
| talking about coding skills. Knuth is a writer at heart. He
| was also from a time where writing code on paper was the
| norm. Literate programming is good for Knuth, but maybe not
| for most coders today, who grew up on fast computers and
| IDEs.
| llm_trw wrote:
| >who grew up on fast computers and IDEs.
|
| >>When I was a child, I spake as a child, I understood as a
| child, I thought as a child: but when I became a man, I put
| away childish things.
| mycall wrote:
| Depends how you write code. When I use Semantic Kernel, my
| KernelFunctions include well-defined documentation for the
| inputs and outputs, then using the System Prompt you can
| provide the concepts and glue between the various plugins. It
| is the function specification as a whole. Precision is
| important, although GPT is not yet perfect -- perhaps in
| another year or two it will be.
| exe34 wrote:
| my favourite documentation is minimal running code examples.
| give me example inputs to get the job done - i.e. not just
| "inputs: x is a y", but actually create a minimal version of y
| and show it going into f(x) and coming back out as a genuine
| object (as opposed to a mock) that I can inspect/prod until I
| understand what's happening.
| ransom1538 wrote:
| I prefer when developers and project managers create massive
| google docs for specs and descriptions. Double points if you
| share the document with only a handful of favorite employees.
| Also, ignore all requests to get permission to this document.
| Eye roll in all meetings if someone hasn't read this document.
| You can get to god mode if you hide comments around the doc.
| YawningAngel wrote:
| Lots of people, myself included, read documentation.
| ks2048 wrote:
| Ditto. The problem is self-reinforcing: people don't read
| docs because the docs are bad... we don't spend time on docs
| because no one reads them...
| sshine wrote:
| Not only is it self-reinforcing:
|
| It is inevitable because some competent programmers
| deliberately don't comment their code, read comments, or
| delete other people's comments when they are stale.
| sshine wrote:
| I personally read and write a lot. I track things in git
| messages by cross-referencing issues, and I comment my code.
|
| But my point is: when you have a cultural divide among
| competent programmers on whether to comment, not commenting
| game-theoretically wins, because the outcome where you have
| comments that get updated some percent of the time is worse
| than not having those comments.
|
| Instead, embed what you want to say in a comment in the code
| itself, or in a test.
|
| Document your libraries and APIs if they are used by people
| outside your team.
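|
| One way to read "embed it in the code itself, or in a test" is
| sketched below in Python. The constant, the function, and the
| lockout rule are all hypothetical; the point is only that the
| name and the test carry what a comment would otherwise say.

```python
# Instead of a comment such as "lock the account after three bad logins",
# the rule lives in a named constant and in a test that fails if it drifts.
MAX_FAILED_LOGINS = 3  # hypothetical business rule


def is_locked_out(failed_attempts: int) -> bool:
    """Return True once the failed-login limit has been reached."""
    return failed_attempts >= MAX_FAILED_LOGINS


def test_account_locks_on_third_failed_login() -> None:
    assert not is_locked_out(2)
    assert is_locked_out(3)
```

| A stale comment stays silent when the rule changes; a test like
| this one does not.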
| intelVISA wrote:
| Good code is fairly self-documenting, but alas.
| axelfontaine wrote:
| That may be true for the "what", but certainly not for the
| "why".
| intelVISA wrote:
| Why is best inferred from context imo, otherwise it's a
| smell for a system with too much scope.
| planetafro wrote:
| I can't even count how many times that I've had the DRY vs.
| verbosity of code conversation in trying to norm a team.
|
| I'm in the camp of a little verbosity and repetition for the
| sake of clarity is worth it.
| j-krieger wrote:
| This is pretty much why Rust has doctests. You include a small
| test in your docstring.
| enraged_camel wrote:
| Elixir as well.
| sshine wrote:
| Executable tests in Rustdoc are amazing. For those not
| involved, they are run when you write `cargo test` and they
| are included as markdown code blocks in your crate's
| documentation.
|
| They're not an excellent place for extensive testing. But
| they are super useful for making sure your documentation
| examples are updated and functional.
|
| #![deny(missing_docs)] is also a great way to ensure you
| don't forget to document things.
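|
| For readers outside Rust, Python's standard-library doctest module
| works on a similar principle. A minimal sketch, with a made-up
| function, shown as an analogue rather than as Rustdoc itself:

```python
def slugify(title: str) -> str:
    """Turn a page title into a URL slug.

    The examples below are documentation and tests at the same time.

    >>> slugify("What docs-as-code means")
    'what-docs-as-code-means'
    >>> slugify("  Hello,   World! ")
    'hello-world'
    """
    # Replace every non-alphanumeric character with a space, then split
    # on whitespace so runs of punctuation collapse into one separator.
    words = "".join(c.lower() if c.isalnum() else " " for c in title).split()
    return "-".join(words)
```

| Saved as, say, slug.py (file name assumed), the examples can be
| checked with `python -m doctest slug.py`, which reports any
| docstring example whose output no longer matches.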
| xtiansimon wrote:
| Not at any level? My imagination is reeling--I'm thinking of
| Cold War era Spy Novels describing siloed groups who don't know
| what any other group is doing. Each is laser focused on their
| own tasks and everything is a tactical choice. It's the great
| reveal in the Movie adaptation when the separate threads come
| together and the grand national strategy is revealed. And now
| I'm also imagining the comedy versions of this genre, where the
| reveal has no purpose.
|
| At some level there is _structure_, and it can be communicated.
| If for no other reason than to validate the evident structure
| in code.
|
| > "...sitting at the same table as the almighty coding knights?
| [...] Remove documentation...and your products cease to exist,
| their inner workings left to the imagination of..."
|
| The author finished the sentence with "users", but who are they
| talking about? Those coding knights and their imagination which
| sort of parallels my Spy genre example.
|
| That said, the article seems a bit naive about the depth and
| breadth--reads like Ra-Ra.
| remoquete wrote:
| Author here. What do you mean by depth and breadth?
| xtiansimon wrote:
| -1, huh.
|
| Take my opinion with a grain of salt. Today I work for a
| small private firm, but years ago I worked in a corporate
| job (with tech writers). I drank the Kool-Aid. I believed
| the work I was doing was important and critical because
| someone up the vertical decided it was and they hired
| someone who hired someone who hired me to do a job--it's the
| corporate world.
|
| The tell for preaching to the choir, what triggered my
| response, are the hyperbolic strong statements ("products
| cease to exist", "your business don't crumble overnight",
| "failures are dramatic") which sound defensive. The rest of
| the blog post reads as defining what exists and setting
| standards, and further explanation of why it's all
| important. This is the breadth and depth. What I'm not
| reading are on/off-ramps and shortcuts that all of the
| developers who disagree about the mission critical position
| of documentation would regard as sensible accommodations.
|
| Today I work in a small firm and we need documentation.
| What I learned from underpaying small business is
| documentation is important and has a role (otherwise our
| small firm wouldn't waste our time on it). But if I took
| the position argued here, I would be told I was wasting
| time, that I should focus, get it done, and move on to my job
| tasks. If you're working closely with writers, then you
| need to believe what they're doing is important without
| being told. If you don't believe this, then maybe your role
| does not need documentation (yet? or your group is small
| enough and people like the job security or you're just
| overworked, or something else). When I'm working
| collaboratively and I hear people tell me what I'm doing is
| not important, I have to believe there is some truth to it.
| A limit.
|
| Good luck in your work.
| MrHamburger wrote:
| code-as-docs will never tell you why a method or a module exists
| in the first place.
| sshine wrote:
| The commit history will.
| MrHamburger wrote:
| Well from my experience, people who do not write comments
| are not keen on writing commit messages either. "Some
| updates", "Module ABC" "Changes"
| sshine wrote:
| It seems like the "why" is eternally lost on us in this
| scenario.
|
| If I could wave a magic wand and make all the colleagues
| verbalise what they're doing, I would.
|
| In the meantime, I'll take the downvotes for pointing out
| that it's fundamentally caused by a cultural gap, not
| just "those who don't get it yet, and those who do."
| simonw wrote:
| This is why I like all commit messages to link to an
| issue thread somewhere (such as GitHub issues).
|
| The great thing about issue threads is that you can add
| more context to them later on - unlike commit messages
| which are frozen in time.
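|
| A rule like this can be checked mechanically. The Python sketch
| below assumes issue references look like "#123" or a GitHub issue
| URL; the pattern and function name are illustrative, not part of
| any existing tool.

```python
import re

# Matches "#123" or a full GitHub issue URL (assumed conventions).
ISSUE_REF = re.compile(
    r"#\d+|https://github\.com/[\w.-]+/[\w.-]+/issues/\d+"
)


def references_issue(commit_message: str) -> bool:
    """Return True if the commit message points at an issue thread."""
    return ISSUE_REF.search(commit_message) is not None
```

| Wired into a git commit-msg hook, which receives the path of the
| message file as its first argument, a check like this could
| reject "Some updates" while accepting "Fix chunk indentation,
| refs #126".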
| simonw wrote:
| If people aren't reading your documentation it means they don't
| trust your documentation.
|
| The solution is to build that trust, by building a culture of
| active documentation maintenance.
|
| My favorite trick for that is to keep the docs in the same repo
| as the code and actively enforce relevant documentation updates
| as part of the code review process.
|
| Once developers learn that new code cannot land on main without
| accompanying documentation updates they learn to trust that
| documentation pretty fast.
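|
| The comment describes a human review rule. An automated
| approximation could look like the Python sketch below, which
| assumes docs live under a top-level docs/ directory and that a
| list of changed paths is available; both are assumptions, as are
| all the names.

```python
from pathlib import PurePosixPath

# Hypothetical repository layout: adjust to the project at hand.
DOC_ROOTS = {"docs"}
CODE_SUFFIXES = {".py", ".rs", ".go"}


def is_doc(path: PurePosixPath) -> bool:
    """A file counts as documentation if it lives under a docs root."""
    return bool(path.parts) and path.parts[0] in DOC_ROOTS


def docs_update_missing(changed_files: list[str]) -> bool:
    """Return True when code changed but no documentation file did."""
    paths = [PurePosixPath(name) for name in changed_files]
    touched_docs = any(is_doc(p) for p in paths)
    touched_code = any(
        p.suffix in CODE_SUFFIXES and not is_doc(p) for p in paths
    )
    return touched_code and not touched_docs
```

| The changed paths could come from something like
| `git diff --name-only main...HEAD`. A check this crude cannot
| judge whether the docs change is relevant, so it would complement
| the reviewer rather than replace them.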
| btbuildem wrote:
| I read the article as referring to documentation for end users
| -- not internal documentation / comments written by developers
| who thought they knew what the code they wrote was doing.
| th0masfrancis wrote:
| Love the theme of your blog
| remoquete wrote:
| Thanks! It's a tweaked version of
| https://github.com/mrmierzejewski/hugo-theme-console
| codelion wrote:
| With LLMs it is now quite easy to generate docs as needed. In
| fact we built a service to do just that - https://docs.codes/
|
| Here is an example of how it is very useful especially for newer
| libraries.
| remoquete wrote:
| You need good source material, including docs, to have LLMs
| generate docs that are accurate, reliable, and safe. LLMs have
| interesting applications in areas like SDK and API docs, for
| sure, but can't replace an entire function.
| codelion wrote:
| Correct, but this is a good starting point for code that was
| written after the cutoff of the language model's training data, as
| you cannot otherwise generate accurate code from them for the
| newer versions of the library.
| codelion wrote:
| Just realised that I missed adding the link to the example -
| https://github.com/unclecode/crawl4ai/issues/126
| jerrac wrote:
| So, I feel like "Docs-as-Code" has some context I'm missing, so
| I'm going to comment on docs in general.
|
| I think there are multiple kinds of docs for software.
|
| * Comments explaining a specific section of code.
|
| * API docs describing functions/classes/etc.
|
| * Docs on how to use a library/class/etc. Usually including
| simple, isolated, examples.
|
| * Tutorials on how to create simplified applications using the
| developed tools.
|
| * Docs on how to deploy, configure, and maintain an application.
|
| * Docs on how to use an application.
|
| * Docs on how to troubleshoot an application.
|
| * Docs on how to integrate applications.
|
| * And likely others I'm missing.
|
| Personally, I've been seriously frustrated by how bad most of the
| open source (haven't done much with proprietary code)
| documentation is. Case in point is Drupal and Symfony. Trying to
| use api.drupal.org is not fun, and Symfony's docs always cover
| the basics, and then there's nothing on pulling everything
| together into something complicated. So you try to dig into the
| actual code, and end up finding multiple layers of uncommented
| abstractions. Yes, I can eventually figure out what is going on
| if I put the effort in, but that's a lot of time that could be
| saved by a few lines of comments.
|
| I usually end up asking JetBrains AI about what I need, then use
| what it says after I fix the errors it makes... It's also very
| good at summarizing everything I'd find if I used a normal
| search. But that all only works if others have already asked and
| answered my questions.
|
| Some things I've been trying to do to improve my own code's
| documentation:
|
| * Unless the line is super obvious, even if I think it is
| obvious, I try to leave a comment. Yes, it seems pointless, but I
| have gone back to old code I remember being obvious without said
| comment enough times that I think it is worth it.
|
| * Avoiding "elegance" in favor of "explicitness". For example, I
| use full `if` statements instead of ternary operators even when
| ternary operators would look better. For whatever reason the
| syntax of ternary operators has never sunk in for me, and the
| explicitness of `if` is much easier to parse. I also use very
| descriptive function and variable names. Basically, if I have to
| think about what something means, I try to change it so I don't
| have to.
|
| * Split out functions into smaller functions as much as I can.
| This means I can use descriptive function names. And I'm pretty
| sure it's just good practice.
|
| I also have been trying to figure out ways to keep higher level
| docs closer to my code. I have some ideas, but haven't tried them
| yet. Has anyone ever written something that detects changes to a
| method/function, and then when you save your file it pops up
| asking if related docs need updating? Maybe add comments to the
| method pointing to where related docs live, and then your
| IDE/tool uses that to know what docs need updating?
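|
| A minimal sketch of the marker-comment idea above, assuming a
| made-up convention where a comment such as
| "# docs: guides/install.md" points at the related docs. The
| marker format, the function names, and the sample path are all
| hypothetical, and the comparison is per file rather than per
| method:

```python
import hashlib
import re

# Hypothetical marker convention: a comment like "# docs: guides/install.md".
DOCS_MARKER = re.compile(r"#\s*docs:\s*(\S+)")


def docs_pointers(source):
    """Return the doc paths named by marker comments in the source text."""
    return DOCS_MARKER.findall(source)


def fingerprint(source):
    """Return a stable hash of the source text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def docs_to_review(old_source, new_source):
    """Return doc paths to revisit if the source changed, else an empty list."""
    if fingerprint(old_source) == fingerprint(new_source):
        return []
    # Collect pointers from both versions, in case a marker was edited too.
    seen = []
    for path in docs_pointers(old_source) + docs_pointers(new_source):
        if path not in seen:
            seen.append(path)
    return seen


# Sample input, invented for this illustration.
OLD = 'def install():\n    # docs: guides/install.md\n    return "v1"\n'
NEW = OLD.replace("v1", "v2")

if __name__ == "__main__":
    for path in docs_to_review(OLD, NEW):
        print(f"Source changed; does {path} need updating?")
```

| A Git hook or editor plugin could call something like
| docs_to_review on the old and new versions of each saved file
| and prompt for the listed docs.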
| simonw wrote:
| "Has anyone ever written something that detects changes to a
| method/function, and then when you save your file it pops up
| asking if related docs need updating?"
|
| I've got a partial solution to that: I have automated tests
| that introspect my code for things that need documentation and
| then fail if those items aren't at least mentioned in the docs.
| Works really well.
|
| I wrote about that here:
| https://simonwillison.net/2018/Jul/28/documentation-unit-tes...
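|
| A rough sketch of that kind of check, not the code from the
| linked post: it introspects a class for public methods and
| reports any whose names never appear in the docs text. The Api
| class, DOCS_TEXT, and the helper names are invented for
| illustration.

```python
import inspect
import re


class Api:
    """Hypothetical API surface, used only for this illustration."""

    def fetch(self):
        """Fetch a record."""

    def store(self):
        """Store a record."""

    def _internal(self):
        """Private helper; not expected to appear in the docs."""


# Stand-in for the contents of a docs file (hypothetical).
DOCS_TEXT = """
Usage
=====

Call ``fetch`` to read a record.
"""


def public_method_names(cls):
    """Return the sorted names of public functions defined on cls."""
    return sorted(
        name
        for name, _member in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_")
    )


def undocumented(cls, docs_text):
    """Return public method names that never appear as a whole word in the docs."""
    return [
        name
        for name in public_method_names(cls)
        if not re.search(r"\b" + re.escape(name) + r"\b", docs_text)
    ]


if __name__ == "__main__":
    print("Undocumented methods:", undocumented(Api, DOCS_TEXT))
```

| In a test suite, asserting that undocumented(...) returns an
| empty list would make the build fail whenever a new public
| method ships without at least a mention in the docs.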
| jerrac wrote:
| That's a good way to do it. I was actually thinking of a Git
| hook or something in the CI pipeline as a place to start. So
| reading about how you implemented it was helpful. Thanks for
| sharing!
| NiklasBegley wrote:
| I think this is a really good point in the post:
|
| > If you don't review, check, and merge docs the same way your
| org reviews, checks, and merges code, you're not doing docs-as-
| code -- you're doing docs-as-bore.
|
| While some WYSIWYG cloud-based docs platforms make it easier to
| make changes, that's not necessarily what you want. Docs are a
| critical component of how your users perceive your product - you
| want to have checks that it meets certain quality and accuracy
| standards. Just like your code.
|
| And if you're an engineering lead company, you probably want your
| docs updates to be coordinated with your product releases. Git is
| just the logical place to put your docs in that case.
|
| I've even created a company specifically to help with this
| workflow: https://www.doctave.com
|
| Also, lots of comments here seem to be thinking of docstrings and
| other in-code documentation. I think that's really a different
| category that has a different set of goals and issues. This post
| is specifically about customer-facing documentation.
| robertlagrant wrote:
| > And if you're an engineering lead company,
|
| Nit: engineering-led
| euroderf wrote:
| Having worked both as a developer and as a technical communicator
| (for software), I'm thinking that low friction for developers is
| paramount, and that therefore the way to get developers to put
| some effort into it is to have documents both (a) written in
| Markdown (or adoc or rST or typst) and (b) co-located with code
| under version control. Change the code, change the docs, no
| screwing around, BUT a quick & simple brain dump can suffice
| because of [see next paragraph].
|
| To wit: I have yet to hear of a documentation system that
| provides fully bidirectional updates between such documents-as-
| code and edits made further downstream along the documentation
| production pipeline. That is to say, when a TC person or a
| reviewer makes edits to content, these changes should propagate
| back to the docs-as-code material.
|
| Then everyone benefits, including senior devs whose scarce time
| is optimised by having TC people expand and polish their hasty
| scribblings, and junior devs who have well-maintained
| documentation at-hand in-place.
|
| My 0.02EUR, YMMV. Maybe too niche.
| smokel wrote:
| Documentation near code is a good idea, but unfortunately it
| covers only part of the problem.
|
| There is a big portion of documentation that should be
| available to "other persons", such as architects or project
| managers, who may not want to visit the codebase.
|
| Another challenge is that for these people, diagrams are
| typically more useful than text. This still requires some
| manual effort which is difficult to achieve with Markdown.
| euroderf wrote:
| Agreed on all points.
|
| Regarding para 2, I am assuming that there is something like
| a CMS in place that can pluck docfiles from version control
| and massage them and insert them into an outline/ToC. (And
| then propagate changes back to version control.)
|
| Regarding para 3, there are now many GUI tools for working
| with Mermaid et al. But are any of them properly integrated
| into documentation systems?
| robertlagrant wrote:
| Architects should definitely be visiting the codebase. Why
| would project managers be updating docs? I'd have thought
| stuff like docs translations done by translators would be a
| better candidate for non-code editing.
| mfuzzey wrote:
| Agreed with your point of accessibility to people who don't
| live in the code.
|
| But that doesn't preclude documentation near code nor
| Markdown. It just means that you need a CI job to publish the
| doc stored in the git repo as (for example) HTML or PDF.
|
| For diagrams, stuff like PlantUML is great: edit as text,
| publish as images.
| spockz wrote:
| We now use MermaidJS to render diagrams from text source in
| our documentation and that works pretty well!
___________________________________________________________________
(page generated 2024-10-20 23:01 UTC)