https://lwn.net/SubscriberLink/934940/3abb2d4086680b78/

PostgreSQL reconsiders its process-based model

By Jonathan Corbet
June 19, 2023

In the fast-moving open-source world, programs can come and go quickly; a tool that has many users today can easily be eclipsed by something better next week. Even in this environment, though, some programs endure for a long time. As an example, consider the PostgreSQL database system, which traces its history back to 1986. Making fundamental changes to a large code base with that much history is never an easy task. As fundamental changes go, moving PostgreSQL away from its process-oriented model is not a small one, but it is one that the project is considering seriously.

A PostgreSQL instance runs as a large set of cooperating processes, including one for each connected client. These processes communicate through a number of shared-memory regions using an elaborate library that enables the creation of complex data structures in a setting where not all processes have the same memory mapped at the same address. This model has served the project well for many years, but the world has changed a lot over the history of this project. As a result, PostgreSQL developers are increasingly thinking that it may be time to make a change.
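
To illustrate why building data structures in shared memory is awkward when processes map the same region at different addresses, here is a minimal sketch in C of offset-based ("relative") links. It is not PostgreSQL's actual library; every name in it (Region, Node, region_push, demo_two_mappings, and so on) is hypothetical. A raw pointer stored in the region would be meaningless to a process that mapped it elsewhere, so each node refers to its successor by byte offset from the start of the region, and each reader adds its own base address. Copying the bytes into a second buffer stands in for a second process's mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is NOT PostgreSQL's API. */

#define REGION_SIZE 256u
#define NIL_OFF ((uint32_t) 0)   /* offset 0 holds the header, so it can mean "none" */

/* Header stored at the very start of the region. */
typedef struct Region {
    uint32_t used;       /* bytes handed out so far, including this header */
    uint32_t head_off;   /* offset of the first list node, or NIL_OFF */
} Region;

/* A list node that links to its successor by offset, not by address. */
typedef struct Node {
    int      value;
    uint32_t next_off;   /* byte offset of the next node from the region base */
} Node;

/* Turn an offset into a pointer that is valid for THIS mapping of the region. */
static Node *node_at(void *base, uint32_t off)
{
    return off == NIL_OFF ? NULL : (Node *) ((char *) base + off);
}

void region_init(void *base)
{
    Region *r = base;
    r->used = sizeof(Region);
    r->head_off = NIL_OFF;
}

/* Bump-allocate a node and push it on the list. Returns 0, or -1 if full. */
int region_push(void *base, int value)
{
    Region *r = base;
    if (r->used + sizeof(Node) > REGION_SIZE)
        return -1;
    uint32_t off = r->used;
    r->used += (uint32_t) sizeof(Node);
    Node *n = node_at(base, off);
    n->value = value;
    n->next_off = r->head_off;   /* store an offset, never a raw pointer */
    r->head_off = off;
    return 0;
}

/* Walk the list using only offsets relative to the caller's base address. */
int region_sum(void *base)
{
    Region *r = base;
    int sum = 0;
    for (Node *n = node_at(base, r->head_off); n != NULL;
         n = node_at(base, n->next_off))
        sum += n->value;
    return sum;
}

/* Build a list at one address, copy the bytes elsewhere, and read it there. */
int demo_two_mappings(void)
{
    void *a = calloc(1, REGION_SIZE);
    void *b = calloc(1, REGION_SIZE);
    assert(a != NULL && b != NULL);
    region_init(a);
    region_push(a, 1);
    region_push(a, 2);
    region_push(a, 3);
    memcpy(b, a, REGION_SIZE);   /* stands in for another process's mapping */
    int sum = region_sum(b);
    free(a);
    free(b);
    return sum;
}
```

Calling demo_two_mappings() returns 6, the sum of the three values, even though the list is read at a different address from the one where it was built. With threads sharing a single address space, ordinary pointers would suffice.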

A proposal

At the beginning of June, Heikki Linnakangas, seemingly following up on some in-person conference discussions, posted a proposal to move PostgreSQL to a threaded model:

> I feel that there is now pretty strong consensus that it would be a good thing, more so than before. Lots of work to get there, and lots of details to be hashed out, but no objections to the idea at a high level. The purpose of this email is to make that silent consensus explicit.

The message gave a quick overview of some of the challenges involved in making such a move, and acknowledged, in an understated way, that this transition "surely cannot be done fully in one release". One thing that was missing was a discussion of why this big change would be desirable, but that was filled in as the discussion went on. As Andres Freund put it:

> I think we're starting to hit quite a few limits related to the process model, particularly on bigger machines. The overhead of cross-process context switches is inherently higher than switching between threads in the same process - and my suspicion is that that overhead will continue to increase. Once you have a significant number of connections we end up spending a *lot* of time in TLB misses, and that's inherent to the process model, because you can't share the TLB across processes.

He also pointed out that the process model imposes costs on development, forcing the project to maintain a lot of duplicated code, including several memory-management mechanisms that would be unneeded in a single address space. In a later message he also added that it would be possible to share state more efficiently between threads, since they all run within the same address space.

The reaction of some developers, though, made it clear that the "pretty strong consensus" cited by Linnakangas might not be quite that strong after all. Tom Lane said: "I think this will be a disaster. There is far too much code that will get broken".
He added later that the cost of this change would be "enormous", that it would create "more than one security-grade bug", and that the benefits would not justify the cost. Jonathan Katz suggested that there might be other work that should have a higher priority. Others worried that losing the isolation provided by separate processes could make the system less robust overall.

Still, many PostgreSQL developers seem to be cautiously in favor of at least exploring this change. Robert Haas said that PostgreSQL does not scale well on larger systems, mostly as a result of the resources consumed by all of those processes. "Not all databases have this problem, and PostgreSQL isn't going to be able to stop having it without some kind of major architectural change". Just switching to threads might not be enough, he said, but he suggested that this change would enable a number of other improvements.

How to get there

Moving the core of the PostgreSQL server into a single address space will certainly present a number of challenges. The biggest one, as pointed out by Haas and others, would appear to be the server's "widespread and often gratuitous use of global variables". Globals work well enough when each server process has its own set, but that approach clearly falls apart when threads are used instead. According to Konstantin Knizhnik, there are about 2,000 such variables currently used by the PostgreSQL server.

A couple of approaches to this problem were discussed. One was pulling all of the global variables into a big "session state" structure that would be thread-local. That idea quickly loses its appeal, though, when one considers trying to create and maintain a 2,000-member structure, so the project is unlikely to go this way.
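
The global-variable problem and the "session state" idea can be made concrete with a small C sketch. The assumptions are significant: the SessionState structure, its fields, and all of the functions are hypothetical and not taken from the PostgreSQL source, and POSIX threads plus the C11 _Thread_local keyword stand in for whatever mechanism the project might actually choose. Each thread plays the role of one client session and updates what looks like an ordinary global; because the variable is thread-local, every session sees its own private copy.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-session state; the field names are illustrative only
 * and are not taken from the PostgreSQL source. */
typedef struct SessionState {
    int  session_id;
    long statements_run;
} SessionState;

/* In the process model a plain global is private to each backend process.
 * In a threaded server the same declaration would be shared by every
 * session, so it is marked thread-local to give each thread its own copy.
 * Thread-local objects with static storage start out zeroed in each thread. */
static _Thread_local SessionState session;

/* Code written against the "global" does not need to change. */
static void run_statement(void)
{
    session.statements_run++;
}

typedef struct WorkerArgs {
    int  id;       /* session identifier to record */
    int  n;        /* number of statements to "run" */
    long result;   /* this session's private counter, reported back */
} WorkerArgs;

static void *session_main(void *arg)
{
    WorkerArgs *a = arg;
    session.session_id = a->id;
    for (int i = 0; i < a->n; i++)
        run_statement();
    a->result = session.statements_run;
    return NULL;
}

/* Run two sessions concurrently and report each one's private counter.
 * Returns 0 on success, -1 if a thread could not be created. */
int run_two_sessions(int n_a, int n_b, long out[2])
{
    pthread_t t[2];
    WorkerArgs args[2] = { { 1, n_a, 0 }, { 2, n_b, 0 } };
    int started = 0;

    while (started < 2 &&
           pthread_create(&t[started], NULL, session_main, &args[started]) == 0)
        started++;

    /* Join whatever was started so args[] stays valid while threads run. */
    for (int i = 0; i < started; i++) {
        pthread_join(t[i], NULL);
        out[i] = args[i].result;
    }
    return started == 2 ? 0 : -1;
}
```

With the _Thread_local qualifier removed, both threads would update one shared counter and the per-session totals would no longer be reliable, which is the failure mode described above. Scaling this pattern to roughly 2,000 fields is the maintenance burden that makes the single-structure approach unattractive.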
The alternative is to simply throw all of the globals into thread-local storage, an approach that is easy and would work, but heavy use of thread-local storage would exact a performance penalty that would reduce the benefits of the switch to threads in the first place. Haas said that marking globals specially (to put them into thread-local storage, among other things) would be a beneficial project in its own right, as that would be a good first step in reducing their use. Freund agreed, saying that this effort would pay off even if the switch to threads never happens.

But, Freund cautioned, moving global variables to thread-local storage is the easiest part of the job:

> Redesigning postmaster, defining how to deal with extension libraries, extension compatibility, developing tools to make developing a threaded postgres feasible, dealing with freeing session lifetime memory allocations that previously were freed via process exit, making the change realistically reviewable, portability are all much harder.

An interesting point that received surprisingly little attention in the discussion is that Knizhnik has already done a threads port of PostgreSQL. The global-variable problem, he said, was not that difficult. He had more trouble with configuration data, error handling, signals, and the like. Support for externally maintained extensions will be a challenge. Still, he saw some significant benefits in working in the threaded environment. Anybody who is thinking about taking on this project would be well advised to look closely at this work as a first step.

Another complication that the PostgreSQL developers have in mind is that of supporting both the process-based and thread-based modes, perhaps indefinitely. The need to continue to support running in the process-based mode would make it harder to take advantage of some of the benefits offered by threads, and would significantly increase the maintenance burden overall.
Haas, though, is not convinced that it would ever be possible to remove support for the process-based mode. Threads might not perform better for all use cases, or some important extensions may never gain support for running in threads. The removal of process support is, as he noted, a question that can only really be considered once threads are working well. That point is, obviously, a long way into the future, assuming it arrives at all.

While the outcome of the discussion suggests that most PostgreSQL developers think that this change is good in the abstract, there are also clearly concerns about how it would work in practice. And, perhaps more importantly, nobody has, yet, stepped up to say that they would be willing to put in the time to push this effort forward. Without that crucial ingredient, there will be no switch to threads in any sort of foreseeable future.

-----------------------------------------

Aim for the stars
Posted Jun 19, 2023 16:11 UTC (Mon) by Wol (subscriber, #4433)

> While the outcome of the discussion suggests that most PostgreSQL developers think that this change is good in the abstract, there are also clearly concerns about how it would work in practice.

And you might hit the moon. Aim nowhere and you're going nowhere.

Look at the GIL (was that Python?) and the Big Kernel Lock in Linux. Whether you get there or not, a lot of the work on the way sounds like it's worth it in its own right. Like getting rid of all those global variables!

Even being able to break up each process into a bunch of threads for the easy stuff could lead to massive benefits - threading where it works well, processes where they work well.

I wish you all God Speed on the voyage!

Cheers,
Wol

Aim for the stars
Posted Jun 19, 2023 18:18 UTC (Mon) by zoobab (guest, #9945)

Maybe use zeromq ipc messages between threads?

Aim for the stars
Posted Jun 19, 2023 20:19 UTC (Mon) by nevyn (subscriber, #33129)

Python GIL and Linux Big Kernel Lock seem like very bad comparisons. In those cases there is/was no parallelism, here there is parallelism but _maybe_ the scaling is better if you change "everything" and _maybe_ the security/robustness is the same.

This is "closer" to the apache-httpd move, the main difference being I don't know enough about PostgreSQL and the plans to move to imply the outcome will be that bad.

Aim for the stars
Posted Jun 19, 2023 22:22 UTC (Mon) by Wol (subscriber, #4433)

It wasn't meant as a comparison.

The Big Kernel Lock and the GIL enforced "single process". PostgreSQL *is* a single process? Linux and Python decided that removing that restriction was worthwhile. Whether PostgreSQL succeeds or not, the effort they make towards removing that restriction may well be worthwhile.

Cheers,
Wol

PostgreSQL reconsiders its process-based model
Posted Jun 19, 2023 19:26 UTC (Mon) by raven667 (subscriber, #5198)

I know nothing of the PostgreSQL internals or the relevant engineering but throwing an opinion out there anyway; is there a way to make a minimal threaded implementation that just covers the necessary features needed for the most extreme large servers where threading could help? If you made a ton of caveats about what features are supportable, i.e. anything not used by the large instances you want to test with, can you reduce the scope of what work is needed to something more manageable that can be iterated on? Steady improvement without taking on a big chunk of risk to rework the whole internal architecture, even if it takes longer, is probably the way to go for an old mature software project like this, right?

PostgreSQL reconsiders its process-based model
Posted Jun 19, 2023 19:45 UTC (Mon) by jhoblitt (subscriber, #77733)

Semi-seriously, why not port the PostgreSQL SQL dialect to use MariaDB as the backend? MariaDB (MySQL...) has had a robust threaded model and binary redo logs for literally decades.

PostgreSQL reconsiders its process-based model
Posted Jun 19, 2023 19:48 UTC (Mon) by pizza (subscriber, #46)

> Semi-seriously, why not port the PostgreSQL SQL dialect to use MariaDB as the backend? MariaDB (MySQL...) has had a robust threaded model and binary redo logs for literally decades.

Because it's not PostgreSQL's "dialect" that matters here, but rather the features and robustness that dialect exposes.

...MariaDB might as well be on another planet in comparison.

PostgreSQL reconsiders its process-based model
Posted Jun 19, 2023 20:29 UTC (Mon) by flussence (subscriber, #85566)

Oh this is quite some news. I don't mind early adopting performance features, but...

In Apache httpd I've been using every experimental threaded/event MPM as it becomes available, because the forking model always felt a bit gross to me. But that's software that has had pluggable backends for decades, and even so it's still a bit rough around the edges.

I generally trust the Postgres developers to not screw up but I think this kind of change would need two or three major release cycles before I'd feel comfortable turning it on in production.

-----------------------------------------

Copyright (c) 2023, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds