[HN Gopher] The PostgreSQL documentation and the limitations of ...
       ___________________________________________________________________
        
       The PostgreSQL documentation and the limitations of community
        
       Author : zdw
       Score  : 91 points
       Date   : 2023-06-14 16:34 UTC (1 days ago)
        
 (HTM) web link (rhaas.blogspot.com)
        
       | lovasoa wrote:
       | I was recently reading the documentation for pgcrypto to
       | implement user authentication in SQLPage:
       | 
       | https://www.postgresql.org/docs/current/pgcrypto.html
       | 
       | The page contains the documentation of many functions, all of
       | which raise the following error by default when you run them: No
       | function matches the given name and argument types.
       | 
       | It turns out you first have to "install" them by running "create
       | extension pgcrypto", which is obvious to someone who already
       | knows postgres modules well, but not to anyone else, and isn't
       | mentioned anywhere on the page!
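
For readers hitting the same error, here is a minimal sketch of the missing step. It assumes a Python DB-API connection to PostgreSQL (for example one opened with psycopg, whose `%s` placeholder style is used here) and a role allowed to create extensions. The helper names `enable_pgcrypto` and `hash_password` are hypothetical; `CREATE EXTENSION`, `crypt()` and `gen_salt()` are the pieces the pgcrypto page documents.

```python
# Sketch only: `conn` is assumed to be a DB-API 2.0 connection to PostgreSQL
# (e.g. from psycopg). The helper names are illustrative, not from the docs.

ENABLE_PGCRYPTO = "CREATE EXTENSION IF NOT EXISTS pgcrypto"
# crypt() and gen_salt() only resolve once the extension exists in the database;
# before that, calling them fails with "No function matches the given name...".
HASH_PASSWORD = "SELECT crypt(%s, gen_salt('bf'))"


def enable_pgcrypto(conn):
    """Create the extension in the current database (needs suitable privileges)."""
    cur = conn.cursor()
    cur.execute(ENABLE_PGCRYPTO)
    conn.commit()


def hash_password(conn, password):
    """Return a Blowfish-based hash computed server-side by pgcrypto's crypt()."""
    cur = conn.cursor()
    cur.execute(HASH_PASSWORD, (password,))
    return cur.fetchone()[0]
```

The extension is created per database, so `enable_pgcrypto` would need to run once in each database that calls these functions.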
        
       | pgaddict wrote:
       | I think the main limitation of our docs is that it mostly
       | explains what the pieces do, not how to use them to achieve a
       | particular goal. For example, we have pretty good documentation
       | of all the pieces to do HA, we just don't tell people how to
       | assemble them together.
       | 
       | The reason is, I think, that flexibility is a pretty fundamental
       | part of the project. We're great at providing building blocks
       | (and documenting them), but we steer clear of describing a
       | particular way to assemble them together.
       | 
       | For example, we might describe a particular HA approach, but then
       | that would be perceived as "recommended / official" way, giving
       | it preference over other (and equally valid) approaches and
       | tooling. These "how to" docs are bound to be way more
       | opinionated, so we just focus on documenting the pieces.
       | 
       | In other words, our docs are written by devs for devs, and we
       | leave the higher level stuff to tutorials written by others etc.
        
         | derefr wrote:
         | These concerns seem to be specific to cases where there are
         | various competing high-level design "strategies" with political
         | weight behind them.
         | 
         | There are cases where PG is missing high-level docs where I
         | don't think this applies.
         | 
         | For example, there's no official doc on _how to write_
         | PL/pgSQL code. There's just an extremely-low-level language
         | reference, covering each syntax element separately. There's no
         | cookbook (other than the few examples per syntax element that
         | exist to document the edge-cases of use of that syntax
         | element); no tutorial; no efficiency/performance/scalability
         | guide discussing when certain language features should be
         | favored over others given the current way they're executed
         | (e.g. is IF-ELSE, CASE-WHEN, or a series of IFs with early
         | returns cheaper? when should I favor using FOR with a query,
         | vs. when should I query data into an in-memory array variable
         | and then use FOREACH, vs. when should I query data into a
         | TEMPORARY table and then query that?); no place where you can
         | get a sense for how procedure CALLs interact with MVCC (e.g.
         | when they acquire + release locks, and therefore how and when
         | they cause blocking on contended tables vs. how and when a
         | SELECTed function that uses dblink/fdw to run independent txs
         | would do so); etc. There isn't even a single mention of which
         | PL/pgSQL exceptions are potentially raised by what PG builtin
         | functions when called in a PL/pgSQL context; how to name those
         | exceptions to match on them to catch them; or how to raise them
         | yourself. I often need to dig into the PG source code to figure
         | that out! (PL/pgSQL honestly feels, in docs terms, like a
         | proprietary third-party language-engine "plugin" that someone
         | bolted on, where the docs were expected to be provided by the
         | third party, but never were. But it's not! It's a first-party
         | language, and the reference implementation of how to create a
         | language extension!)
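
On the narrower question of how exceptions are named: PL/pgSQL condition names correspond to the SQLSTATE codes listed in the "PostgreSQL Error Codes" appendix of the manual. The sketch below shows a small, deliberately incomplete slice of that mapping, plus a fragment of a PL/pgSQL function body that catches and raises by condition name. The `condition_name` helper and the `accounts` table are hypothetical, and the `sqlstate` / `pgcode` attribute lookup assumes a psycopg-style driver exception.

```python
# A few SQLSTATE codes and their PL/pgSQL condition names, as listed in the
# "PostgreSQL Error Codes" appendix. This table is intentionally incomplete.
CONDITION_NAMES = {
    "22012": "division_by_zero",
    "23502": "not_null_violation",
    "23503": "foreign_key_violation",
    "23505": "unique_violation",
    "40001": "serialization_failure",
    "40P01": "deadlock_detected",
    "P0001": "raise_exception",
}

# The same names are what a PL/pgSQL handler matches on. This is a fragment
# of a function body, kept as a string; `accounts` is a made-up table.
PLPGSQL_EXAMPLE = """
BEGIN
    INSERT INTO accounts(id) VALUES (1);
EXCEPTION
    WHEN unique_violation THEN               -- catch by condition name
        RAISE EXCEPTION 'duplicate account'
            USING ERRCODE = 'unique_violation';  -- raise under the same code
END;
"""


def condition_name(exc):
    """Map a driver exception to a condition name, or None if unknown.

    Assumes the exception exposes the SQLSTATE as `sqlstate` (psycopg 3)
    or `pgcode` (psycopg2).
    """
    code = getattr(exc, "sqlstate", None) or getattr(exc, "pgcode", None)
    return CONDITION_NAMES.get(code)
```

This does not answer which built-in functions raise which conditions; it only shows where the names come from once a SQLSTATE is known.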
        
         | briffle wrote:
         | Another good example is the differences in the documentation
         | for Indexes, vs the https://use-the-index-luke.com/ that
         | explains many of the reasons WHY you want to organize it with
         | great examples.
         | 
         | A problem I have is so many tutorials, or 'best practices' I
         | find on the internet are for older versions that don't really
         | apply as well in newer versions of postgres. Like searching for
         | logical replication, you find lots of information about
         | pglogical for older versions of postgres, but many of those
         | parts are now baked into postgres with a different syntax,
         | etc.
         | 
         | I would love to see a 'tutorials/guide' and 'best practices'
         | part of the documentation that is updated with each new
         | release, that give examples of the most common tasks, and
         | when/why to use them, and when to move to something more
         | advanced.
         | 
         | Some really basic stuff like "these are the 3 best ways to
         | handle replication in version 15, and the 2 or 3 most common
         | ways to do backups", or "these are the recommended ways to
         | migrate from the previous version, either in place or to a new
         | server", etc.
        
         | akira2501 wrote:
         | I really miss old-school printed documentation's "Theory of
         | Operation" section. To me it's the most useful way to bridge
         | this gap. The technical and operations manual describe all the
         | parts and how they function, but the theory of operation really
         | laid out how and _why_ all of these things were structured the
         | way they were.
         | 
         | It also forced the designers to think in those terms and to
         | document the product from an overall perspective rather than a
         | component perspective. It was high level enough to be useful,
         | but not so high level as to be abstracted into hand holding
         | tutorial exercises.
         | 
         | I feel like most modern software documentation entirely misses
         | this component and would benefit greatly from having it.
        
           | kaycebasques wrote:
           | Can you link me to a good old school "theory of operation"
           | section? I get the idea but I want to see firsthand what you
           | mean.
        
           | giovannibonetti wrote:
           | Related: Diataxis - A systematic framework for technical
           | documentation authoring [1]
           | 
           | "The Diataxis framework aims to solve the problem of
           | structure in technical documentation. It adopts a systematic
           | approach to understanding the needs of documentation users in
           | their cycle of interaction with a product.
           | 
           | Diataxis identifies four modes of documentation - tutorials,
           | how-to guides, technical reference and explanation. It
           | derives its structure from the relationship between
           | them.(...)"
           | 
           | [1] https://diataxis.fr/
        
         | Rapzid wrote:
         | > I think the main limitation of our docs is that it mostly
         | explains what the pieces do, not how to use them to achieve a
         | particular goal
         | 
         | I honestly prefer this type of documentation. ASP.NET Core has
         | the complete opposite problem where it's too example based.
        
         | friendzis wrote:
         | This reminds me of technical documentation for embedded
         | devices. Usually you get _multiple_ classes of documents: data
         | sheets, application notes, reference designs, user guides,
         | errata.
         | 
         | The problems described come from trying to be everything in one
         | place, but it does not have to be. As I understand it, you try
         | to be mostly a data sheet, which is probably a net good, because
         | it is _the_ document needed to be maintained, even if hard to
         | navigate.
         | 
         | However, there are more document classes that can be produced.
         | Yes, a reference design is inevitably going to be opinionated,
         | whether it is produced by project team or some internet person.
         | A reference design produced by project team at least has a
         | fighting chance at staying somewhat up to date. And one can
         | discuss tradeoffs between different approaches in an
         | application note.
        
       | emodendroket wrote:
       | I suppose I can see that but I 1) rarely use the index since,
       | like most users, I generally find myself looking at the docs
       | after a Web search 2) have generally found psql documentation to
       | be excellent.
        
       | fdr wrote:
       | I think Haas is basically right, that the structure flows from
       | the community structure, and that it's not clear alterations
       | would be a net win. pgsql-hackers is producing the kind of docs
       | only they can, but many useful kinds of docs they cannot produce
       | (per Haas's theory, e.g. more narrative in nature) are delegated
       | to the relative anarchy of the Internet, in blogs, comment
       | threads, stack exchange, and such.
       | 
       | While there should be a lot of hesitancy at the implied
       | assumption that any particular arrangement is at the efficient
       | frontier -- most situations can be improved in most or all
       | dimensions -- an exchanged loss in the kind of documents that
       | pgsql-hackers is suited to producing is hard to replace.
        
       | kaycebasques wrote:
       | Here's my perspective. I've been a technical writer (TW) for ~10
       | years. 3 at an IoT startup, 7 at Google.
       | 
       | > The strengths of this process are also its weaknesses. A
       | developer is, by definition, someone who spends the majority of
       | their time doing development, which is to say writing code.
       | Updating the documentation becomes a task that must be completed
       | so that the code one has written can get committed so that one
       | can move on to the next project and write some more code.
       | 
       | I may be misinterpreting, but I get the sense that the author
       | feels that there is some kind of more optimal way to split up
       | docs duties. IMO there is not. At least, not for reference docs.
       | As the author said, the people implementing the code are in the
       | best position to keep the reference information up-to-date.
       | 
       | If I grokked the rest of the article correctly, the author is
       | essentially saying that the engineers have trouble writing and
       | maintaining the other main types of docs [1] --- guides,
       | tutorials, and overviews (explanations). It also sounds like they
       | are having a "too many cooks in the kitchen" problem with pushing
       | through changes to the other docs. I have a simple answer to
       | that: hire some strong technical writers and make it clear to
       | everyone that the TWs are Responsible and Accountable [2] for
       | those docs. Also, make it explicit that the engineers are a
       | Consulted role when it comes to guides / tutorials / overviews.
       | Writing these types of docs is hard, specialized work. As the
       | author said, the engineers have lots of other priorities. Of
       | course, it's a bit self-serving for a TW to say "the solution is
       | to hire TWs" but I get the sense that people don't realize that
       | the easiest way to get good guides / tutorials / overviews is to
       | hire people who have thought long and hard about those
       | specialized tasks. If you want a good database, you don't expect
       | your TWs to do the job. You get a database engineer. If you want
       | good tutorials / guides / overviews, you likewise shouldn't
       | expect your database engineer to do the job. You get TWs.
       | 
       | [1] https://diataxis.fr
       | 
       | [2]
       | https://en.m.wikipedia.org/wiki/Responsibility_assignment_ma...
        
         | kaycebasques wrote:
         | Just re-read the last two paragraphs from Haas. I have a couple
         | further comments / questions.
         | 
         | Quote from Haas:
         | 
         | > But if on the other hand I propose some change to
         | documentation that has existed for a long time, or some kind of
         | structural change, there's a lot more room for disagreement.
         | Because the change isn't strictly mechanical, the right answer
         | is a lot more subjective. And because it's a change to existing
         | content rather than the addition of new content, many more
         | people will be familiar with it and have opinions on how it
         | ought to be changed, if at all. Consequently, even when some
         | developer does take time away from writing code to try to make
         | some larger change to the documentation, it's often an uphill
         | battle to get anything done, and people typically have to be
         | content with small improvements.
         | 
         | Honest question: do PostgreSQL engineering decisions have the
         | same dynamics as what's described here for docs decisions? If
         | not, what is different about engineering decisions versus docs
         | decisions? Is it just that engineering decisions can be
         | literally benchmarked whereas docs decisions do not seem
         | benchmark-able? Are there any other potential explanations for
         | different dynamics between eng and docs?
         | 
         | If it does indeed just boil down to "docs are not
         | benchmarkable" then I would suggest otherwise. You can create
         | docs benchmarks. They won't have the rigor of engineering
         | benchmarks but they at least establish some notion of docs
         | quality and facilitate more targeted discussions during docs
         | reviews. E.g. when there's a disagreement between an author and
         | reviewer the author can say, "what docs benchmarks am I not
         | following here?" The power of that kind of interaction is that
         | you often do realize that the docs benchmarks are incomplete
         | and some new dimension needs to be added to them. Or do the
         | PostgreSQL contributor docs already have some guidelines along
         | the lines of a "content quality checklist" and it's still not
         | working?
         | 
         | A rigorous effort to survey the PostgreSQL community and get a
         | deep sense of what the overall community considers "high-
         | quality docs" can itself be a super insightful experience.
         | Different developer communities often need / want a different
         | focus in the docs. That can be the foundation for a fairly
         | authoritative docs quality checklist.
         | 
         | Another thing that I'm very interested in, and will need to
         | think through deeply some other day, is this notion that once
         | you publish a doc, you can't touch it. It happens all the time
         | and it's really weird.
        
         | actuallyalys wrote:
         | As someone who's been both a software developer and a technical
         | writer, I agree that a lot of these problems seem best suited
         | for a technical writer. The expertise with creating different
         | types of documents is valuable, of course, but there's another
         | benefit: The organization committing resources in the form of
         | making it someone's entire responsibility.
         | 
         | While I think updating reference documents lends itself to
         | subject matter experts (especially in a project where this
         | approach is already successful), I think structuring them can
         | be separated and given to a technical writer with more
         | technical expertise or a developer with more documentation
         | expertise.
        
         | tetha wrote:
         | We're seeing problems at work similar to the ones the postgres
         | documentation has.
         | 
         | We in infra-ops can give you more details about how our
         | database clusters are designed for resilience, security, safety
         | than you want on more levels than most people in the company
         | know exist. We also have reasoning for all of this available.
         | This is really good to have for customer question sets during
         | sales.
         | 
         | However, this doesn't tell a developer how to connect his
         | spring boot thingy to it, and how to connect and manage his
         | service well. In fact, 80 - 90% or more of the things we know
         | about our database are not relevant to a simple small-scale
         | application running queries on it. And quite a lot of the
         | issues you can have with running your application on a rock-
         | solid database are entirely not relevant at a DBA level. Like,
         | the database doesn't care if your DDL modification is backwards
         | compatible.
         | 
         | And that's something we're currently learning together with a
         | foundation team at work. They are documenting on how to use it
         | well from their side, we're learning about easy mistakes to
         | make and document those, and help with the actionable
         | documentation. And in hard cases we kinda have to talk about
         | what the plan is, because it's usually not smart from a DBA's
         | perspective.
        
         | btilly wrote:
         | Just curious. Where do you think that an open source project
         | like PostgreSQL gets a budget to hire anyone? Let alone to
         | dictate a new line of authority to the volunteers who are
         | already maintaining it?
         | 
         | And don't forget that there are valuable volunteers who are
         | likely to go elsewhere if too many new rules are added that
         | they don't want to live with.
        
           | kaycebasques wrote:
           | Open Web Docs is a potential model to draw inspiration from
           | regarding funding: https://openwebdocs.org
           | 
           | Presumably, PostgreSQL has leaders who are responsible for
           | steering the ship. If the project is going to succeed long-
           | term, those leaders have to find ways to keep their
           | contributors happy while also creating an organizational
           | structure that leads to good docs. Easier said than done, I
           | know, but it really is as simple as that.
           | 
           | Sorry if any of my comments came off naive or obtuse when it
           | comes to open source dynamics. But the reality is that you
           | need good docs, and I'm just trying to give an honest
           | assessment from my experience of the conditions that lead to
           | good docs.
        
             | btilly wrote:
             | _Sorry if any of my comments came off naive or obtuse when
             | it comes to open source dynamics._
             | 
             | If you want that apology to be meaningful, you should learn
             | something.
             | 
             | When you're talking about a highly successful open source
             | project that has been going for more than 3 decades, it is
             | beyond ludicrous for you to say, "If the project is going
             | to succeed long-term..." It already has succeeded long-
             | term. And you would be better off figuring out why it works
             | rather than lecturing about how it must work.
             | 
             | When you talk about "a potential model to draw from" for
             | funding, please note that I've been involved with open
             | source for about a quarter of a century. I've seen a LOT of
             | funding models attempted. Mostly they run into one big
             | problem. And that problem is that adding funding creates
             | bruised egos because people say, "Why is he getting paid
             | when I'm not?"
             | 
             | The one funding model that DOESN'T have this problem is
             | when a company decides to pay its employees to work on
             | features that it wants in the project. Now there are no
             | bruised egos - the money comes from the company and it is
             | clear why one person gets paid while another does not.
             | There are still challenges with this model - employees are
             | under pressure to get their contributions accepted whether
             | or not the project likes them - but we've learned how to
             | navigate those.
             | 
             | But now we're left back where we started. Companies who
             | hire core developers don't generally need comprehensive
             | documentation - they build internal documentation straight
             | for their use case. So comprehensive external documentation
             | is hard to find. Sometimes you'll wind up with things like
             | an excellent introductory tutorial like
             | https://docs.python.org/3/tutorial/. Usually, you don't.
             | And generally it is hard to simply pay someone to take care
             | of it for you.
        
               | kaycebasques wrote:
               | > When you're talking about a highly successful open
               | source project that has been going for more than 3
               | decades, it is beyond ludicrous for you to say, "If the
               | project is going to succeed long-term..." It already has
               | succeeded long-term.
               | 
               | Yes, your reaction here totally makes sense. Feedback
               | acknowledged.
               | 
               | > If you want that apology to be meaningful, you should
               | learn something.
               | 
               | I have re-read my earlier comments and I feel that you
               | are being more hostile to me than is justified. I do not
               | think you are adhering to HN's code of conduct guidelines
               | for comments:
               | https://news.ycombinator.com/newsguidelines.html#comments
               | 
               | > you would be better off figuring out why it works
               | rather than lecturing about how it must work
               | 
               | This doesn't seem fair. The original post is about the
               | limitations of the PostgreSQL docs. Docs have been the
               | focus of my career for 10 years. I have experienced and
               | analyzed docs problems in many contexts: small orgs,
               | large orgs, open source, closed source. I made an on-
               | topic comment about ways to resolve the problems that the
               | PostgreSQL docs are facing. Is it the only solution? Of
               | course not. But I totally have relevant experience in
               | this domain and, just like you have a good idea about
               | what generally works and doesn't work regarding open
               | source funding, I have a pretty good idea about what
               | generally works for creating the conditions that lead to
               | good docs.
               | 
               | > So comprehensive external documentation is hard to
               | find.
               | 
               | Again, I think the web platform space is relevant here.
               | Web platform documentation could easily devolve into a
               | tragedy of the commons situation. Yet MDN does exist and
               | is an amazing resource.
               | 
               | Paragraphs 4 to 6 of your last comment seem to be arguing
               | that hiring TWs is not an option for PostgreSQL. That is
               | totally understandable. On another day maybe we would
               | have arrived at that understanding on friendly terms and
               | would have had a constructive conversation about how to
               | create good docs when hiring TWs is not possible. But
               | it's clear that my ideas aren't welcome here so I'll just
               | stop now.
        
               | wrs wrote:
               | If someone gets weirdly hostile and condescending towards
               | you on HN (sadly not uncommon), I recommend that you try
               | to just ignore them and keep contributing. I'd like to
               | hear what you have to say.
        
           | minorninth wrote:
           | PostgreSQL, like many other open-source projects, has
           | sponsors and accepts donations. Here's their sponsors page:
           | 
           | https://www.postgresql.org/about/sponsors/
           | 
           | Also, I think it's important to note that a lot of
           | contributors aren't volunteering their "free time", they're
           | being paid by some other employer to contribute to PostgreSQL
           | as part of their job:
           | 
           | https://www.enterprisedb.com/blog/importance-of-giving-
           | back-...
        
             | btilly wrote:
             | If you read
             | https://www.postgresql.org/about/policies/sponsorship/
             | you'll find that the list of sponsors is essentially a
             | recognition for companies paying their employees to
             | contribute to PostgreSQL.
             | 
             | It isn't for contributing into a pot of money allowing some
             | central PostgreSQL committee to hand out money for other
             | things, like hiring people to do documentation.
        
       | globular-toast wrote:
       | It sounds like someone should write a book about postgres. I'd
       | buy it. But I think it should be a supplement to the current
       | docs, not a replacement.
        
       | TX81Z wrote:
       | I just ask GPT all my Postgres questions now. I think some of
       | this will become a moot point over time.
        
       | mannyv wrote:
       | The problem with developers writing documentation is that they
       | generally have too narrow of a view of things.
       | 
       | Here's an artificial example:
       | 
       | "Added a setting which allows you to change the size of a boba."
       | 
       | It doesn't answer any really useful questions, such as: why would
       | you want to change the size of a boba? What is the effect of
       | various sizes of boba? How does that interact with other
       | settings?
       | 
       | As a database user, I actually want to know these things.
       | Internally I have a mental model of how all these settings
       | interact with the product, and I use information about the new
       | setting to adjust that model.
       | 
       | For psql in particular, the documentation (as people have pointed
       | out) shies away from anything too "opinionated."
       | 
       | But what it should do instead is have multiple opinionated
       | examples. Multiple examples allow me to learn about the different
       | tradeoffs and configuration options.
       | 
       | It reminds me of the old days, when the psql docs talked about
       | optimization as a "black art", and basically said "it would be
       | impossible to cover everything, so we won't cover anything."
        
       | stigok wrote:
       | I find the PostgreSQL documentation to be a seriously good read.
       | Something I'd print and take on the holiday with me.
        
       | smitty1e wrote:
       | Tl;dr: a designated documentation editor would be both a boon and
       | likely a challenge to retain.
        
       | tmaly wrote:
       | This is a very valuable post in my opinion. Documentation is so
       | critical to open source as well as to the private sector.
       | 
       | It can make the difference in how long it takes you to complete a
       | project at work.
        
         | jrott wrote:
         | At this point I believe that documentation is the most
         | important marketing artifact that a project has.
        
           | tmaly wrote:
           | I always point out the readme pages of projects on github.
           | How many times has that made the difference between you using
           | the project or not?
        
             | jrott wrote:
             | Not zero. If I can't figure out how to get started in a
             | reasonable amount of time, I'll go look for another
             | solution.
        
       | davidatbu wrote:
       | As a regular consumer of pg docs, I vehemently agree that they
       | are incredibly detailed, and at the same time, daunting to
       | navigate.
        
         | kaycebasques wrote:
         | If their problem is an abundance of information that is hard to
         | navigate, they really should start experimenting with
         | retrieval-augmented generation search experiences [1] like
         | Supabase AI. One of the great promises of LLMs for docs IMO is
         | the ability to synthesize info from many sources to provide
         | more targeted answers.
         | 
         | [1] https://technicalwriting.tools/posts/playing-nicely-with-
         | gen... (my blog)
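
The retrieval half of such a setup can be sketched in a few lines. This toy version ranks documentation chunks by word overlap with the question and assembles a prompt from the best matches. Real systems typically use embeddings and a vector index instead; the chunk texts, function names, and prompt wording here are all invented for illustration, and no LLM call is included.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def score(question, chunk):
    """Count question words (with multiplicity) that also appear in the chunk."""
    q, c = Counter(tokenize(question)), Counter(tokenize(chunk))
    return sum(min(n, c[w]) for w, n in q.items())


def retrieve(question, chunks, k=2):
    """Return up to k chunks with a non-zero overlap score, best first."""
    ranked = sorted(chunks, key=lambda ch: score(question, ch), reverse=True)
    return [ch for ch in ranked[:k] if score(question, ch) > 0]


def build_prompt(question, chunks, k=2):
    """Assemble the text that would be handed to a language model."""
    context = "\n".join(f"- {ch}" for ch in retrieve(question, chunks, k))
    return f"Answer using only these excerpts:\n{context}\n\nQuestion: {question}"


# Invented stand-ins for real documentation pages.
DOC_CHUNKS = [
    "CREATE EXTENSION loads a new extension such as pgcrypto into the current database.",
    "VACUUM reclaims storage occupied by dead tuples.",
    "Logical replication uses a publish and subscribe model.",
]
```

Calling `build_prompt("how do I load the pgcrypto extension", DOC_CHUNKS)` would put only the `CREATE EXTENSION` excerpt into the context, since the other chunks share no words with the question.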
        
           | davidatbu wrote:
           | I haven't read your blog yet, but I'd be lying if I said the
           | same thought hasn't crossed my mind :)
        
         | emodendroket wrote:
         | How real is this problem though? I have no idea how they're
         | organized because I'm usually led to whatever page I wanted to
         | see by Google anyway.
        
       | vincent-manis wrote:
       | I love the approach of "change the code, write the
       | documentation"; the code author is in a unique position to be
       | able to explain the new behavior of the system. However, most
       | FOSS projects could benefit from a technical writer, who can
       | improve these first drafts to make them more usable.
       | 
       | Back in the Stone Age, companies like IBM had vast writing
       | staffs. As a result, you got entire walls full of documentation.
       | For OS/360, you got Concepts And Facilities, that explained what
       | the software did for you; you got reference manuals; and Program
       | Logic Manuals, which explained, in exhaustive/ing detail, how the
       | program worked (with flowcharts!).
       | 
       | Arguably, the software I use today is far more complex than
       | OS/360; yet we don't see the same attention to organization and
       | detail in the documentation. I understand the reasons: IBM's
       | dead-tree wall relied on paying tech writers, and that's
       | incompatible with the tiny budgets most FOSS projects suffer. Far
       | too often, I will go to a FOSS site's documentation page and
       | discover a mass of links to pages that explain how to build the
       | program on Itanium, or why this program is better than another
       | one, with no walkthrough of "how to use this program for a
       | typical case", or even "if you want to do this kind of operation,
       | these are the pages you need to read". Often, the documentation
       | isn't included in the repo, which lowers my confidence that the
       | documentation and software are updated in step.
       | 
       | So I celebrate projects such as PostgreSQL, Emacs, and Arch Linux
       | (for the wiki)--there are many more--where there is a real effort
       | to create good documentation, even when I think that
       | documentation can be improved or reorganized. Let's not allow the
       | perfect to be the enemy of the good.
        
         | kaycebasques wrote:
         | The technical writing community refers to this approach as
         | "docs-as-code". Just mentioning that keyword in case anyone
         | wants to research the space further. There is a famous Write
         | The Docs talk on the topic. I think the same author turned that
         | talk into a book.
         | 
         | Fabrizio Benedetti did a cool analysis of various common
         | docs-as-code architectures:
         | https://passo.uno/docs-as-code-topologies/
        
       | justinclift wrote:
       | This is kind of what the old PostgreSQL "Tech Docs" website was
       | useful for, back in the day (~20 years ago).
       | 
       | Here's a random snapshot of it from the Wayback machine:
       | 
       | http://web.archive.org/web/20040630081140/http://techdocs.po...
       | 
       | Much more user-level oriented than the reference stuff.
        
       ___________________________________________________________________
       (page generated 2023-06-15 23:01 UTC)