https://lwn.net/SubscriberLink/934940/3abb2d4086680b78/

PostgreSQL reconsiders its process-based model

By Jonathan Corbet
June 19, 2023

In the fast-moving open-source world, programs can come and go
quickly; a tool that has many users today can easily be eclipsed by
something better next week. Even in this environment, though, some
programs endure for a long time. As an example, consider the
PostgreSQL database system, which traces its history back to 1986.
Making fundamental changes to a large code base with that much
history is never an easy task. As fundamental changes go, moving
PostgreSQL away from its process-oriented model is not a small one,
but it is one that the project is considering seriously.

A PostgreSQL instance runs as a large set of cooperating processes,
including one for each connected client. These processes communicate
through a number of shared-memory regions using an elaborate library
that enables the creation of complex data structures in a setting
where not all processes have the same memory mapped at the same
address. This model has served the project well for many years, but
the world has changed a lot over the history of this project. As a
result, PostgreSQL developers are increasingly thinking that it may
be time to make a change.
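
To see why shared structures need special handling when processes map
a region at different addresses, consider the following minimal sketch
in C. It is not PostgreSQL's actual library; it only illustrates the
general technique of storing links as offsets from the start of the
region instead of raw pointers. The names rel_ptr, rel_resolve, and
rel_make are invented for this illustration, and a second heap buffer
stands in for another process's mapping of the same bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical relative pointer: an offset from the start of the region.
 * Offset 0 is reserved to mean "null". */
typedef size_t rel_ptr;

/* A linked-list node that lives inside the shared region. */
typedef struct node {
    int     value;
    rel_ptr next;   /* offset of the next node, not a raw address */
} node;

/* Turn an offset into a usable pointer for *this* mapping of the region. */
static void *rel_resolve(void *base, rel_ptr off)
{
    return off == 0 ? NULL : (char *) base + off;
}

/* Turn a pointer inside the region back into an offset. */
static rel_ptr rel_make(void *base, void *p)
{
    return p == NULL ? 0 : (size_t) ((char *) p - (char *) base);
}

/* Sum a list starting at offset head, as seen through the mapping at base. */
static int list_sum(void *base, rel_ptr head)
{
    int sum = 0;
    for (node *n = rel_resolve(base, head); n != NULL;
         n = rel_resolve(base, n->next))
        sum += n->value;
    return sum;
}

/* Build a two-node list in one buffer, copy the bytes to a second buffer
 * (standing in for a mapping at a different address), and walk the list
 * through the copy. Returns the sum, or -1 on allocation failure. */
static int demo_relative_pointers(void)
{
    char *region_a = calloc(1, 256);
    char *region_b = malloc(256);
    if (region_a == NULL || region_b == NULL) {
        free(region_a);
        free(region_b);
        return -1;
    }

    node *first = (node *) (region_a + 16);
    node *second = (node *) (region_a + 64);
    first->value = 40;
    first->next = rel_make(region_a, second);
    second->value = 2;
    second->next = 0;

    rel_ptr head = rel_make(region_a, first);
    memcpy(region_b, region_a, 256);

    int sum = list_sum(region_b, head);
    free(region_a);
    free(region_b);
    return sum;
}
```

Had the nodes held raw addresses, the copy in the second buffer would
still point back into the first one; with offsets, the list is valid
wherever the bytes happen to be mapped. In a single address space,
ordinary pointers would suffice.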

A proposal

At the beginning of June, Heikki Linnakangas, seemingly following up
on some in-person conference discussions, posted a proposal to move
PostgreSQL to a threaded model.

    I feel that there is now pretty strong consensus that it would be
    a good thing, more so than before. Lots of work to get there, and
    lots of details to be hashed out, but no objections to the idea
    at a high level.

    The purpose of this email is to make that silent consensus
    explicit.

The message gave a quick overview of some of the challenges involved
in making such a move, and acknowledged, in an understated way, that
this transition "surely cannot be done fully in one release". One
thing that was missing was a discussion of why this big change would
be desirable, but that was filled in as the discussion went on. As
Andres Freund put it:

    I think we're starting to hit quite a few limits related to the
    process model, particularly on bigger machines. The overhead of
    cross-process context switches is inherently higher than
    switching between threads in the same process - and my suspicion
    is that that overhead will continue to increase. Once you have a
    significant number of connections we end up spending a *lot* of
    time in TLB misses, and that's inherent to the process model,
    because you can't share the TLB across processes.

He also pointed out that the process model imposes costs on
development, forcing the project to maintain a lot of duplicated
code, including several memory-management mechanisms that would be
unneeded in a single address space. In a later message he also added
that it would be possible to share state more efficiently between
threads, since they all run within the same address space.

The reaction of some developers, though, made it clear that the
"pretty strong consensus" cited by Linnakangas might not be quite
that strong after all. Tom Lane said: "I think this will be a
disaster. There is far too much code that will get broken". He added
later that the cost of this change would be "enormous", that it would
create "more than one security-grade bug", and that the benefits
would not justify the cost. Jonathan Katz suggested that there might
be other work that should have a higher priority. Others worried that
losing the isolation provided by separate processes could make the
system less robust overall.

Still, many PostgreSQL developers seem to be cautiously in favor of
at least exploring this change. Robert Haas said that PostgreSQL does
not scale well on larger systems, mostly as a result of the resources
consumed by all of those processes. "Not all databases have this
problem, and PostgreSQL isn't going to be able to stop having it
without some kind of major architectural change". Just switching to
threads might not be enough, he said, but he suggested that this
change would enable a number of other improvements.

How to get there

Moving the core of the PostgreSQL server into a single address space
will certainly present a number of challenges. The biggest one, as
pointed out by Haas and others, would appear to be the server's
"widespread and often gratuitous use of global variables". Globals
work well enough when each server process has its own set, but that
approach clearly falls apart when threads are used instead, since all
threads in a process share a single copy of each global. According
to Konstantin Knizhnik, there are about 2,000 such variables
currently used by the PostgreSQL server.
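
The difference can be shown with a small C sketch, assuming a POSIX
system with fork() and pthreads. The variable session_user_id is a
made-up stand-in for one of those globals, not a name taken from the
PostgreSQL source. A forked child changes only its own private copy,
while a thread changes the one copy that every thread sees.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-session global, in the style the article describes. */
static int session_user_id = 1;

/* Thread body: writes to the global shared by the whole process. */
static void *thread_session(void *arg)
{
    (void) arg;
    session_user_id = 2;
    return NULL;
}

/* Returns the parent's view of the global after a forked child has
 * changed it, or -1 on error. */
static int value_after_fork(void)
{
    session_user_id = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        session_user_id = 2;   /* modifies the child's private copy only */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return session_user_id;
}

/* Returns the caller's view of the global after another thread has
 * changed it, or -1 on error. */
static int value_after_thread(void)
{
    pthread_t t;
    session_user_id = 1;
    if (pthread_create(&t, NULL, thread_session, NULL) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return session_user_id;
}
```

Under these assumptions value_after_fork() reports 1 and
value_after_thread() reports 2: code that treats a global as private
session state is correct in the first case and silently wrong in the
second.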

A couple of approaches to this problem were discussed. One was
pulling all of the global variables into a big "session state"
structure that would be thread-local. That idea quickly loses its
appeal, though, when one considers trying to create and maintain a
2,000-member structure, so the project is unlikely to go this way.
The alternative is to simply throw all of the globals into
thread-local storage, an approach that is easy and would work, but
heavy use of thread-local storage would exact a performance penalty
that would reduce the benefits of the switch to threads in the first
place. Haas said that marking globals specially (to put them into
thread-local storage, among other things) would be a beneficial
project in its own right, as that would be a good first step in
reducing their use. Freund agreed, saying that this effort would pay
off even if the switch to threads never happens.
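
As a rough sketch of what marking a global might look like, the C
fragment below uses a macro to place a variable in thread-local
storage. The macro name SESSION_LOCAL and the variable statement_count
are hypothetical and are not taken from the PostgreSQL source; the
sketch assumes a C11 compiler and pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical marker for per-session globals. In a threaded build it
 * expands to C11 thread-local storage; a process-based build could
 * define it as empty. */
#define SESSION_LOCAL _Thread_local

/* Each thread gets its own copy, initialized to 0. */
static SESSION_LOCAL int statement_count = 0;

/* Thread body: updates its own copy and reports what it saw. */
static void *session_main(void *arg)
{
    int *seen = arg;
    statement_count += 5;
    *seen = statement_count;
    return NULL;
}

/* Runs one "session" thread. Stores that thread's final value in
 * *seen_in_thread and returns the calling thread's own copy, or -1
 * on error. */
static int tls_demo(int *seen_in_thread)
{
    pthread_t t;
    if (pthread_create(&t, NULL, session_main, seen_in_thread) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return statement_count;
}
```

The marker also makes each such variable easy to find later, which
fits Haas's point that the annotation is a useful first step toward
reducing the use of globals at all.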

But, Freund cautioned, moving global variables to thread-local
storage is the easiest part of the job:

    Redesigning postmaster, defining how to deal with extension
    libraries, extension compatibility, developing tools to make
    developing a threaded postgres feasible, dealing with freeing
    session lifetime memory allocations that previously were freed
    via process exit, making the change realistically reviewable,
    portability are all much harder.
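
The point about session-lifetime allocations can be illustrated with a
small C sketch. When each session is a process, memory that is never
explicitly freed is reclaimed by the kernel at exit; in a threaded
server something has to track and release it. The session_arena type
below is invented for this illustration and says nothing about how
PostgreSQL would actually solve the problem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every allocation. The union forces the
 * header size to a multiple of the strictest alignment, so the user
 * memory that follows it is suitably aligned. */
typedef union chunk {
    union chunk *next;
    max_align_t  align;
} chunk;

/* Hypothetical per-session arena: a list of everything allocated. */
typedef struct session_arena {
    chunk *chunks;
} session_arena;

/* Allocate memory that lives until the session ends. */
static void *session_alloc(session_arena *a, size_t size)
{
    if (size > SIZE_MAX - sizeof(chunk))
        return NULL;
    chunk *c = malloc(sizeof(chunk) + size);
    if (c == NULL)
        return NULL;
    c->next = a->chunks;
    a->chunks = c;
    return c + 1;   /* user memory starts right after the header */
}

/* Release everything the session allocated; this replaces the cleanup
 * that process exit used to provide. Returns the number of chunks freed. */
static size_t session_end(session_arena *a)
{
    size_t freed = 0;
    while (a->chunks != NULL) {
        chunk *next = a->chunks->next;
        free(a->chunks);
        a->chunks = next;
        freed++;
    }
    return freed;
}
```

A session thread would call session_alloc() wherever it previously
relied on exit-time cleanup, and session_end() once when the client
disconnects.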

An interesting point that received surprisingly little attention in
the discussion is that Knizhnik has already done a threads port of
PostgreSQL. The global-variable problem, he said, was not that
difficult. He had more trouble with configuration data, error
handling, signals, and the like. Support for externally maintained
extensions will be a challenge. Still, he saw some significant
benefits in working in the threaded environment. Anybody who is
thinking about taking on this project would be well advised to look
closely at this work as a first step.

Another complication that the PostgreSQL developers have in mind is
that of supporting both the process-based and thread-based modes,
perhaps indefinitely. The need to continue to support running in the
process-based mode would make it harder to take advantage of some of
the benefits offered by threads, and would significantly increase the
maintenance burden overall. Haas, though, is not convinced that it
would ever be possible to remove support for the process-based mode.
Threads might not perform better for all use cases, or some important
extensions may never gain support for running in threads. The removal
of process support is, as he noted, a question that can only really
be considered once threads are working well.

That point is, obviously, a long way into the future, assuming it
arrives at all. While the outcome of the discussion suggests that
most PostgreSQL developers think that this change is good in the
abstract, there are also clearly concerns about how it would work in
practice. And, perhaps more importantly, nobody has, yet, stepped up
to say that they would be willing to put in the time to push this
effort forward. Without that crucial ingredient, there will be no
switch to threads in any sort of foreseeable future.

-----------------------------------------

Aim for the stars

Posted Jun 19, 2023 16:11 UTC (Mon) by Wol (subscriber, #4433)

> While the outcome of the discussion suggests that most PostgreSQL
> developers think that this change is good in the abstract, there are
> also clearly concerns about how it would work in practice.

And you might hit the moon. Aim nowhere and you're going nowhere.

Look at the GIL (was that Python?) and the Big Kernel Lock in Linux.
Whether you get there or not, a lot of the work on the way sounds
like it's worth it in its own right. Like getting rid of all those
global variables!

Even being able to break up each process into a bunch of threads for
the easy stuff could lead to massive benefits - threading where it
works well, processes where they work well.

I wish you all God Speed on the voyage!

Cheers,
Wol

Aim for the stars

Posted Jun 19, 2023 18:18 UTC (Mon) by zoobab (guest, #9945)

Maybe use ZeroMQ IPC messages between threads?

Aim for the stars

Posted Jun 19, 2023 20:19 UTC (Mon) by nevyn (subscriber, #33129)

The Python GIL and the Linux Big Kernel Lock seem like very bad
comparisons. In those cases there is/was no parallelism; here there is
parallelism, but _maybe_ the scaling is better if you change
"everything" and _maybe_ the security/robustness is the same.

This is "closer" to the apache-httpd move, the main difference being
I don't know enough about PostgreSQL and the plans to move to imply
the outcome will be that bad.

Aim for the stars

Posted Jun 19, 2023 22:22 UTC (Mon) by Wol (subscriber, #4433)

It wasn't meant as a comparison. The Big Kernel Lock and the GIL
enforced "single process". PostgreSQL *is* a single process?

Linux and Python decided that removing that restriction was
worthwhile. Whether PostgreSQL succeeds or not, the effort they make
towards removing that restriction may well be worthwhile.

Cheers,
Wol

PostgreSQL reconsiders its process-based model

Posted Jun 19, 2023 19:26 UTC (Mon) by raven667 (subscriber, #5198)

I know nothing of the PostgreSQL internals or the relevant
engineering but throwing an opinion out there anyway; is there a way
to make a minimal threaded implementation that just covers the
necessary features needed for the most extreme large servers where
threading could help? If you made a ton of caveats about what
features are supportable, i.e. anything not used by the large instances
you want to test with, can you reduce the scope of what work is needed
to something more manageable that can be iterated on? Steady
improvement without taking on a big chunk of risk to rework the whole
internal architecture, even if it takes longer, is probably the way
to go for an old mature software project like this, right?

PostgreSQL reconsiders its process-based model

Posted Jun 19, 2023 19:45 UTC (Mon) by jhoblitt (subscriber, #77733)

Semi-seriously, why not port the PostgreSQL SQL dialect to use
MariaDB as the backend? MariaDB (MySQL...) has had a robust threaded
model and binary redo logs for literally decades.

PostgreSQL reconsiders its process-based model

Posted Jun 19, 2023 19:48 UTC (Mon) by pizza (subscriber, #46)

> Semi-seriously, why not port the postgresql sql dialect to use
> mariadb as the backend? Mariadb (mysql...) has had a robust threaded
> model and binary redo logs for literally decades.

Because it's not PostgreSQL's "dialect" that matters here, but rather
the features and robustness that dialect exposes.

...MariaDB might as well be on another planet in comparison.

PostgreSQL reconsiders its process-based model

Posted Jun 19, 2023 20:29 UTC (Mon) by flussence (subscriber, #85566)

Oh this is quite some news. I don't mind early adopting performance
features, but...

In Apache httpd I've been using every experimental threaded/event MPM
as it becomes available, because the forking model always felt a bit
gross to me. But that's software that has had pluggable backends for
decades, and even so it's still a bit rough around the edges. I
generally trust the Postgres developers to not screw up but I think
this kind of change would need two or three major release cycles
before I'd feel comfortable turning it on in production.

                  Copyright (c) 2023, Eklektix, Inc.
   Comments and public postings are copyrighted by their creators.
          Linux is a registered trademark of Linus Torvalds