https://lwn.net/Articles/1027289/

Bcachefs may be headed out of the kernel
[Posted June 27, 2025 by jake]

The history of the bcachefs filesystem in the kernel has been turbulent, most recently with Linus Torvalds refusing a pull request for the 6.16-rc3 release. Torvalds has now pulled the code in question, but also said:

    I think we'll be parting ways in the 6.17 merge window. You made it very clear that I can't even question any bug-fixes and I should just pull anything and everything. Honestly, at that point, I don't really feel comfortable being involved at all, and the only thing we both seemed to really fundamentally agree on in that discussion was "we're done".

Bcachefs developer Kent Overstreet has his own view of the situation. Both Torvalds and Overstreet refer to a seemingly private conversation where the pull request (and other topics) were discussed.

Problematic interactions
Posted Jun 27, 2025 15:46 UTC (Fri) by job (guest, #670)

Not even a year has gone by since Linus threatened to rip out bcachefs last time.

What next?
Posted Jun 27, 2025 15:49 UTC (Fri) by mips (guest, #105013)

Out of interest (as a non-kernel dev): what happens when something that is actively developed and maintained gets chucked out of the kernel?
Presumably it's a fairly normal situation to be outside: people who build kernels have build-time options to include bits and bobs, and if the developer needs internal changes in the kernel to support his work, he'll propose them or coordinate with other developers working in the same area (filesystems, in this case)?

What next?
Posted Jun 27, 2025 16:19 UTC (Fri) by alspnost (guest, #2763)

Presumably, maintenance and development carry on in a separate tree, and one day he'll have another attempt to get it merged back into the mainline kernel...

What next?
Posted Jun 27, 2025 16:26 UTC (Fri) by Sesse (subscriber, #53779)

The biggest difference is that kernel developers are much more free to make changes that would break bcachefs. E.g., if someone makes a refactoring that renames "slab" to "drab" for whatever reason, the onus would probably be on them to also change bcachefs so that it still compiles. If bcachefs is not in the kernel, it's largely not their problem. (Perhaps more relevant, this also holds for VFS changes.)

What next?
Posted Jun 27, 2025 17:55 UTC (Fri) by Tobu (subscriber, #24111)

The major issue I see is that Kent will have a harder time pushing fixes or refactors external to bcachefs. Currently, an overlayfs check needs to be relaxed or reworked to let a bcachefs filesystem with case-insensitive directories be usable as an overlayfs layer. This one is expected to go in the next merge window. But more wide-ranging things, like refactoring efforts by Kent outside of his maintainer domain, seem like they will be harder to pull off.

Overstreet brings this on himself
Posted Jun 27, 2025 15:55 UTC (Fri) by dmv (subscriber, #168800)

With comments like this: "There is a time and a place for rules, and there is a time and a place for using your head and exercising some common sense and judgement.
I'm the one who's responsible for making sure that bcachefs users have a working filesystem. That means reading and responding to every bug report and keeping track of what's working and what's not in fs/bcachefs/. Not you, and not Linus." And: "You also know just as well as I do that when to include a patch is very much a judgement call, so I'm quite curious why you and Linus think yourselves better able to make that judgement call than I." Which amounts to what Linus said above: "You made it very clear that I can't even question any bug-fixes and I should just pull anything and everything." It feels pretty inevitable that it gets yanked out, because the two just don't seem to be able to work together. And we all know who wins in that fight.

Overstreet brings this on himself
Posted Jun 27, 2025 16:12 UTC (Fri) by kellito (guest, #41430)

Obviously the egocentric d$#k wins; it's his kernel after all. He yanked all the Russian devs, people who'd been contributing for many years... and stamped them all trolls. Why should bcachefs be treated differently?

Overstreet brings this on himself
Posted Jun 27, 2025 16:19 UTC (Fri) by daroc (editor, #160859)

You can certainly disagree with Torvalds, but please don't call people d$#ks. For one thing, we don't even have a profanity filter, so the punctuation is pointless. But more importantly, it is neither polite nor respectful -- two things we ask all comments to be.

Overstreet brings this on himself
Posted Jun 27, 2025 16:33 UTC (Fri) by kellito (guest, #41430)

Ok, I agree.

Overstreet brings this on himself
Posted Jun 27, 2025 16:25 UTC (Fri) by bluca (subscriber, #118303)

Not sure why we need to continuously tolerate this pro-Putin propaganda on this website?
It's just from a handful of individuals, but it's relentless; pretty much every article talking about kernel maintenance has a couple of these comments.

> He yanked all the russian devs

No "russian devs" were "yanked". Companies sanctioned for being complicit in the illegal and genocidal war of aggression launched by a dictator against the people of Ukraine were forbidden from being involved, in accordance with the laws of Europe, North America, and much of the world. And rightly so. Got a problem with that?

Overstreet brings this on himself
Posted Jun 27, 2025 16:36 UTC (Fri) by kellito (guest, #41430)

"genocidal war of aggression launched by a dictator" - this is a technical site, so stop this pro-American regime propaganda.

Overstreet brings this on himself
Posted Jun 27, 2025 16:38 UTC (Fri) by dskoll (subscriber, #1630)

I'd say "genocidal" is editorializing, but "war of aggression launched by a dictator" is a simple fact.

Please, no more of this
Posted Jun 27, 2025 16:46 UTC (Fri) by jake (editor, #205)

This is completely off-topic for this bit of news... let's stop this sub-thread (Russian developers, sanctions, Ukraine war, to be clear) right here, please... thanks, jake

Overstreet brings this on himself
Posted Jul 4, 2025 18:12 UTC (Fri) by swiley (guest, #178196)

Linus being heavy-handed is a large part of the reason the kernel is so nice. Don't forget the GNU Hurd kernel is open source under a similar license and isn't nearly as pleasant to use or work on.
Overstreet brings this on himself
Posted Jun 28, 2025 0:34 UTC (Sat) by raven667 (subscriber, #5198)

There is a repeated thought pattern that I see, which I have sometimes been guilty of: an overwhelming sense of responsibility leading to a sense of authority, where being "right" is the most important thing and authorizes one's behaviour so that you don't have to respect the social dynamics. I received a rather embarrassing spanking when, as a young adult, I tried to strongly admonish someone more senior than me who did something really stupid (deleted our entire shared drive because it was mounted in two places and they thought one was a dup; of course we didn't have backups, and we had to recover by saving the last copies of docs people had emailed). That is the same kind of energy I see here from KO, trying to tell Linus how to do his job. This would be a troublesome way to talk to a peer or subordinate, but it is going to go over even less well taking that superior attitude with the ultimate project leader. Being responsible doesn't give one the right to treat everyone else as subordinates, especially people multiple levels up in a hierarchy.

Overstreet brings this on himself
Posted Jun 28, 2025 0:44 UTC (Sat) by koverstreet (supporter, #4296)

Perhaps that's a generational thing. The other thing to consider is that within the kernel we are all experts within our area, and no one, not even Linus, is right 100% of the time - especially when stepping into another's area of expertise. That becomes a problem when decisions are being overridden repeatedly, and they are decisions of consequence - namely, safeguarding user data. And I do take that responsibility seriously: that is why I am here. So if the process has become so inflexible that it is impossible to ship a new modern general-purpose filesystem within that process, then it's something worth talking about.
And if the decision-making of that process is putting user filesystems at risk - then that's something I want to find out sooner rather than later, so it can be rectified before bcachefs is on even more machines. If that can't be fixed, then perhaps bcachefs does need to be shipped as a DKMS module. I will be sad if it comes to that.

Recovery in userspace?
Posted Jun 28, 2025 4:08 UTC (Sat) by DemiMarie (subscriber, #164188)

In this particular case, I wonder if it would have been better to put the new recovery features in the userspace tools instead. Preventing corruption is obviously a kernel bug fix, but in userspace it's a lot easier to iterate quickly on recovering from corruption that has already happened.

Recovery in userspace?
Posted Jun 28, 2025 4:35 UTC (Sat) by koverstreet (supporter, #4296)

Oh, we can run it in userspace today. The thing is, with it in the kernel-side codebase, we can test it with an -o nochanges mount, and that way the user can verify with their own eyes that the filesystem looks the way it should and their data is there. For this particular recovery mode, that's essential.

Recovery in userspace?
Posted Jun 28, 2025 8:14 UTC (Sat) by DemiMarie (subscriber, #164188)

Would a FUSE implementation help? IIUC the main problem that FUSE has is performance, and I would be very surprised if that is a serious concern in recovery scenarios. As an aside, one idea I just had is to turn bcachefs into a sans-io library and have everything else use that library. This would allow for bcachefs support in GRUB, for example, at least if there is enough memory to replay the journal.

Recovery in userspace?
Posted Jun 29, 2025 2:17 UTC (Sun) by koverstreet (supporter, #4296)

A FUSE implementation will never be performant enough to replace the kernel implementation, but for situations where we just want to test repair - yes, absolutely, that would help massively. It won't be a panacea, because we're moving more and more towards automatic repair and full online self-healing, so repair fixes still do need to go in in a timely manner - but it would help a LOT to make them less urgent, especially in situations like this.

Recovery in userspace?
Posted Jun 30, 2025 10:31 UTC (Mon) by paulj (subscriber, #341)

So you sort of have your answer, perhaps. Worth noting that ZFS was developed on Solaris so it could run either in the kernel or in a user-space harness, so that it could be fully tested from the much more forgiving, debug-friendly user-space environment. Willy's point above is that the kernel review/upstreaming process is /intended/ very much to be /resistant/ to the typical commercial pressures many developers have hanging over their heads - it's frustrating when you're the one with the pressures trying to get stuff in, but the process is unlikely to change much just for you. Take a step back, and look at what changes there are _in your control_ that would improve things. E.g., DKMS; a user-space harness (FUSE or whatever) for easier development and/or recovery tooling. Don't bash your head against upstreaming frustrations for no good reason - some of the upstream processes may be very well-founded. It's far from abnormal for new stuff to build up (both development- and userbase-wise) outside of Linus's tree, and it can be a good thing for both sides.

Recovery in userspace?
Posted Jun 30, 2025 13:56 UTC (Mon) by npws (subscriber, #168248)

How about just providing a UML (User-mode Linux) binary for recovery with whatever you need?

Recovery in userspace?
Posted Jul 3, 2025 14:54 UTC (Thu) by nstiurca (guest, #178163)

> I would be very surprised if [performance] is a serious concern in recovery scenarios.

It's a very serious concern IMO. The need for recovery never comes at a convenient time; it almost always happens when you're working on something else which probably has time pressure. If recovery takes 5-10 minutes, then OK, you go get a cup of coffee and it's not too big a deal. If it takes 3 hours, it can at best ruin your day's productivity, or at worst make you miss a presentation or a deadline, or cause an extended outage in some critical process. Put differently, if something is moderately slow all the time, that's probably manageable because you can plan and work around it; but if something is very slow at an unexpected time, it's much harder to deal with and carries more unmanaged risk.

Overstreet brings this on himself
Posted Jun 28, 2025 8:11 UTC (Sat) by rolexhamster (guest, #158445)

> Perhaps that's a generational thing.

Yeah, no. Co-creating and contributing to a large project like the Linux kernel is a multi-faceted task, especially when there are many people involved. A large project like that requires coordination with multiple stakeholders and maintainers, including the head maintainer. One of the dimensions is obviously the code itself, but another dimension is the social aspect of collaboration. You seem to be doing well on the code part, and yet at the same time you also come across as very abrasive and argumentative, sometimes just for the sake of "being right". Yeah, Linus can be rather direct and rude, but it helps no one if you're adding fuel to the fire. I suggest that you humor the head maintainer, even if that means slightly slowing down the development of bcachefs so that the process is smoother. Don't forget that development within Linux involves co-creation, and hence compromise is sometimes necessary.
It's not like bcachefs is a life-or-death thing. It's (currently) a niche experimental file system, and people would be rightfully wary of taking it up if it's in danger of being ejected. Obviously ejection is the opposite of a positive outcome. If it's going to take a bit longer to develop and stabilize, then so be it. The code will get there in the end.

Overstreet brings this on himself
Posted Jun 28, 2025 14:34 UTC (Sat) by koverstreet (supporter, #4296)

It's hard to say, given that this fundamentally seems to be about development process, and that's not being talked about. But my real fear (besides the fact that we're not prioritizing data integrity in the filesystem - that's a big one) is that the way this is going wouldn't slow down development a little; it'd be a _lot_. The point I've been making in the maintainer thread, and need to be making more explicitly, is that a new, perfectly debugged, ready-for-everyone-to-use modern filesystem is not going to show up out of nowhere. Fundamentally, an experimental period where we _fix all the bugs_ is required. And in the last phase of stabilization, where we're deploying to a wider and wider userbase, finding the last bugs - you have to be able to keep iterating through all of that. Users report bugs, we fix bugs, the user gets the bugfix and keeps running so they can find the next bug. If bugfixes now have to wait until merge windows, we've gone from a ~2 week cycle (either the user jumps on an rc kernel for their bugfix - distros package them, so this is common - or, if it's critical, it gets backported) to a ~4.5 month cycle. That's where bcachefs is at now, finding the last of all the weird stuff, and we still have to be able to keep moving if we want to get this done.
Overstreet brings this on himself
Posted Jun 28, 2025 15:24 UTC (Sat) by rsidd (subscriber, #2582)

On the one hand you say this is a stable FS, and on the other hand you say that there is a data-integrity issue that is so absolutely essential that it must go into 6.16-rc3 and cannot wait for 6.17 - and moreover, this is not a small patch but an entire new feature related to disaster recovery. If this is so absolutely essential for an -rc3, then Linus is right that "anybody who uses bcachefs is expecting it to be experimental. They had better." But if Linus is right, then there is no reason to push this into -rc3, violating kernel policy; bcachefs users can just pull your patch directly. Classic Epimenides paradox.

Overstreet brings this on himself
Posted Jun 28, 2025 16:09 UTC (Sat) by koverstreet (supporter, #4296)

No, I do not say that it is a stable fs - it's marked experimental for a reason! What I have been saying is that it has a lot of users, who are using it as their normal filesystem. Therefore I support it the same way stable filesystems are supported, and additionally because fixing bugs and getting users those bugfixes in a timely manner, so they can keep testing and find the next bug, is an absolutely critical part of the stabilization phase. This is the logic you guys keep missing: you're saying "it's experimental, therefore it doesn't matter", but the truth is it matters more if we want to get this done.

Overstreet brings this on himself
Posted Jun 28, 2025 16:56 UTC (Sat) by joib (subscriber, #8541)

Trying to look at it from Linus's perspective (hypothetically speaking, of course, since I'm not him), the current kernel development process, with a fairly short merge window followed by a much lengthier integration testing and debugging phase, has probably evolved to what it is for fairly good reasons.
If he agreed to an exception to the rules for any one particular developer, I'm sure there'd be a queue a mile long outside his door of developers arguing that their code, too, is a special snowflake. (pbonzini's suggestion of maintaining a separate public tree with the absolutely latest and greatest bcachefs code, for those of your users who want to help you with bug hunting, sounds like an excellent idea - without having to butt heads with Linus over introducing new features after the merge window.)

Overstreet brings this on himself
Posted Jun 28, 2025 18:29 UTC (Sat) by koverstreet (supporter, #4296)

> pbonzini's suggestion of maintaining a separate public tree with the absolutely latest and greatest bcachefs code for those of your users who want to help you with bug hunting sounds like an excellent idea, without having to butt heads with Linus over introducing new features after the merge window.

No, that does not work here, and I've been repeating myself in too many places on this. This is not _development_ we're talking about, this is _stabilizing_, and bcachefs already has a wide userbase, most of which does not know how to build kernels. I cannot replicate the current release process just for bcachefs across all the distros. The one way this just might work is DKMS, and even switching to that is going to be extremely disruptive. When users hit a bug, and the bug is fixed, the bugfix has to go out in a timely manner. What this is really coming down to is whether or not it's going to be possible for anyone to develop a new general-purpose filesystem ever again. Because these things do not show up out of nowhere, and no one will ever fund a general-purpose filesystem to completion before merging it. There's far too much uncertainty and risk for the people paying for it to do that.
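[For readers unfamiliar with the DKMS route that keeps coming up in this thread: DKMS packages an out-of-tree module so it is automatically rebuilt for each installed kernel. As a rough, hypothetical illustration only - the package name, version, and make invocation below are assumptions, not taken from any actual bcachefs packaging - a minimal dkms.conf looks like this:

```
PACKAGE_NAME="bcachefs"
PACKAGE_VERSION="1.0"
# Build the module against the target kernel's build tree
MAKE[0]="make KERNELRELEASE=${kernelver}"
CLEAN="make clean"
BUILT_MODULE_NAME[0]="bcachefs"
DEST_MODULE_LOCATION[0]="/kernel/fs/bcachefs"
# Rebuild automatically whenever a new kernel is installed
AUTOINSTALL="yes"
```

With a file like this under /usr/src/bcachefs-1.0/, `dkms install bcachefs/1.0` would build and install the module for the running kernel, and AUTOINSTALL would rebuild it on every kernel upgrade - which is exactly the per-machine build burden the commenters are weighing against in-tree maintenance.]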
Overstreet brings this on himself
Posted Jun 28, 2025 20:19 UTC (Sat) by SLi (subscriber, #53131)

I suspect everyone would agree that the DKMS route would be extremely disruptive. I don't think Linus has any illusions about that either. That in itself should indicate how difficult Linus finds working with you - difficult enough to take a route he knows is disruptive. Which of those routes do you find better for bcachefs? I'm not saying you should accommodate, no matter what. Sometimes you have to choose an inferior route, or even abandon a project or quit a job, because you just cannot work with someone or with rules set by someone. So, let's rephrase it: even if you think working in-tree the way Linus wants would be more effective than DKMS, it may be valid to say "no" if you cannot stand it. But I do think it's healthier to frame this as a difficult choice you're making, rather than something that's simply being forced on you. I have quit a job because I couldn't work with someone. Most colleagues did not have that problem with that person, and I think it's healthy to agree that either the problem was me or, perhaps more likely, a second-degree effect of what sometimes happens when two reasonable people have very different approaches. (I've also seen the reverse: me being able to work fine with someone whom everyone hates.) I think it would still be healthy to acknowledge that, in that whole scheme of things, you are much more of the outlier, and that this is causing enough stress to Linus to make him want to stop working with you. Yes, that may come down to cultural factors like generation, but you seem to be insisting that it's almost purely them that need to change, that your way is superior. It's an unfortunate bind. In most jobs, if things aren't working out, you can look for a better cultural fit. But there's no second Linux kernel to take your filesystem to.
Maybe it's easier for me to say this because, frankly, I don't think I'd enjoy working with either of you two.

Overstreet brings this on himself
Posted Jun 28, 2025 20:27 UTC (Sat) by koverstreet (supporter, #4296)

> That in itself should indicate how difficult Linus finds working with you - difficult enough to take a route he knows is disruptive.

The feeling is mutual!

> But I do think it's healthier to frame this as a difficult choice you're making, rather than something that's simply being forced on you.

Agree 100%. That's how I've consistently framed it since Linus started booting bcachefs from the kernel: "life will go on if it comes to that; just remember that it may be the easier option for us but it's going to suck for everyone else" was pretty much my direct response over a year ago.

Overstreet brings this on himself
Posted Jun 29, 2025 6:13 UTC (Sun) by Phantom_Hoover (guest, #167627)

You're the one turning basic development practices like "code freeze, bugfixes only, no new features" into protracted battles of will. I wouldn't want to work with you either.

Overstreet brings this on himself
Posted Jun 30, 2025 13:15 UTC (Mon) by Alterego (guest, #55989)

I agree :-) I am not as clever as an fs developer, but I know the difference between a bugfix and a new feature.

Overstreet brings this on himself
Posted Jul 3, 2025 18:20 UTC (Thu) by Kluge (subscriber, #2881)

Two points: 1) Bugs come in all shapes and sizes. The best, even only, way to fix a bug may be to add a feature, although that's rare. 2) The distinction between bugfix and feature is a lot clearer in stable code than in unstable/experimental code.
It may make sense for Kent to be solely responsible for any breakage that new post-merge-window bcachefs code causes, but I don't see a good rationale for saying that the same rules should rigidly apply to stable and to experimental (and optional) parts of the kernel.

Overstreet brings this on himself
Posted Jul 3, 2025 19:00 UTC (Thu) by koverstreet (supporter, #4296)

It's also not just about the bugfix itself - it's also about risk mitigation. I often tell people: when you're looking at, e.g., a syzbot bug, don't do the minimum to make the report go away. Reproduce it locally, and read through the log output with an eye towards the behavior of the whole system. Make sure that the behavior in response to the error makes sense (which is a lot more than just not crashing!), and see if you can find other things to improve. That could be logging (we can't debug if we can't see what's going on), repair, or even - like in this case - looking for ways to limit the impact of similar bugs in the future. We've now had two separate bugs in two weeks where this new repair mode has proved useful. The second one, for which I just determined the root cause this morning, involved a filesystem with a 400GB postgres database (and a whole bunch of other data) where the directory structure got trashed. (Two different bugs in two weeks? I'd say getting the code out there quickly was justified; I've learned to trust my intuition.) The proximate cause was a flaky USB controller and a crazy iSCSI setup - which is exactly the sort of thing I love to see: I want people doing the craziest oddball crap they can imagine to break things _now_, before the experimental label gets lifted.
It turns out 6.16 broke btree node read retries for validate errors - not IO errors; we had tests for those, and there's a story as to why the other tests were missing: the error injection we had relied on from one subsystem was dropped in favor of a supposedly equivalent error injection mechanism from a different subsystem - except the new one wasn't tested for anything except IO error injection, and the other functionality was completely broken. Ow. Testing is important. But we had a lot of logging available to sift through to find out what went wrong, and one thing we're getting in 6.16 (which, incidentally, was also the patchset that introduced the regression) is much improved logging for data-read and btree-read errors - which made the missing retry from the rest of the replicas blindingly obvious. And now I'm about to commit and push new tests for the relevant error path, and the user who hit the second bug is getting most of his stuff back thanks to a combination of journal rewind (that didn't repair everything; the journal didn't go back far enough - we didn't catch it early enough) and writing code to find files by hash/filetype (almost nothing was completely lost; things just ended up in lost+found). And I'm writing new tests today. TL;DR: defense in depth, risk mitigation wherever possible, and always have eyes on as much of the system's behavior as possible when things go wonky.

Overstreet brings this on himself
Posted Jun 28, 2025 20:40 UTC (Sat) by koverstreet (supporter, #4296)

(You had a long, but good comment.)

> I think it would still be healthy to acknowledge that in that whole scheme of things, you are much more of the outlier, and that this is causing enough stress to Linus to make him want to stop working with you. Yes, that may come down to cultural factors like generation, but you seem to be insisting that it's almost purely them that need to change, that your way is superior.
I would if it were possible, but I view the demands Linus has been making as internal to bcachefs and fundamentally make-or-break for the project: a) We do not lose user data. b) We do not slow down on fixing user bugs, because being responsive when users find bugs is how we bring people into the project, teach them how to file bugs, learn how the system works, and find new bugs - and if we slow down on that, stabilizing stretches out for 5, maybe 10 years, and I turn into a gibbering madman.

> Maybe it's easier for me to say this because, frankly, I don't think I'd enjoy working with either of you two.

Overstreet brings this on himself
Posted Jun 29, 2025 1:06 UTC (Sun) by warrax (subscriber, #103205)

So, I'm a bit confused here... if it's still marked EXPERIMENTAL, let the users lose their filesystem. I do understand the drive for absolute correctness and rescuing users if you f'ed up... but it's marked EXPERIMENTAL for a reason. Let users learn about the value of (proper) backups. Sometimes being Right isn't the right thing to do.

Overstreet brings this on himself
Posted Jun 29, 2025 1:15 UTC (Sun) by koverstreet (supporter, #4296)

> So, I'm a bit confused here... if it's still marked EXPERIMENTAL, let the users lose their filesystem. I do understand the drive for absolute correctness and rescuing users if you f'ed up... but it's marked EXPERIMENTAL for a reason. Let users learn about the value of (proper) backups.

No :) It's not my job to teach "lessons" like that; it's my job to do my job properly. And that is: they find bugs, I fix them, and I make sure they don't lose data. And it's important to be doing that part of the job properly before the experimental label gets lifted, not to leave it until later.
Overstreet brings this on himself
Posted Jun 29, 2025 15:17 UTC (Sun) by jpfrancois (subscriber, #65948)

But what I fail to understand - and I am sorry if you explained it elsewhere - is why you can't keep bug fixes separate from new features. If it matters so much that data-losing bugs are fixed in a timely manner, why do you insist on mixing bug fixes and new features? You know it will create friction, and somehow you make the same argument again and again, about those oh-so-important data that should give you a forever exception to the "rc" vs. "merge window" rules.

Overstreet brings this on himself
Posted Jun 29, 2025 22:45 UTC (Sun) by DemiMarie (subscriber, #164188)

The new feature is for recovery. That is something that really ought to be in userspace, so that users can get the new feature without needing an -rc kernel (which distros generally don't package). DKMS would actually be better for getting features to users quickly, but the best approach by far I can think of is to have an option to run the relevant code in userspace instead: either run the recovery in userspace and then mount in the kernel, provide a FUSE implementation, or perhaps have a userspace daemon that the kernel can delegate complex, non-performance-critical work to.

Overstreet brings this on himself
Posted Jun 29, 2025 23:42 UTC (Sun) by mirabilos (subscriber, #84359)

> which distros generally don't package

IIRC bcachefs' userspace parts are so unstable and change so often, without either backwards or forwards compatibility, that it's impossible to use bcachefs in a packaged distribution *anyway*.
Overstreet brings this on himself
Posted Jun 30, 2025 0:41 UTC (Mon) by koverstreet (supporter, #4296)

> IIRC bcachefs' userspace parts are so unstable and change so often, without either backwards or forwards compatibility, that it's impossible to use bcachefs in a packaged distribution *anyway*.

No, that was the opinion of one Debian maintainer who went against advice re: Rust dependencies and broke the build - leaving me with a bunch of bug reports. There are a lot of backwards- and forwards-compatibility mechanisms in bcachefs-tools.

Userspace tool stability
Posted Jun 30, 2025 4:17 UTC (Mon) by DemiMarie (subscriber, #164188)

I think Debian is not a good distro for bcachefs. Not just because of their packaging policies, but also because Debian stable is (as the name implies) not the place for experimental software. Right now bcachefs-tools is a much better fit for rolling-release distros, or at least ones willing to take new features at any time.

Userspace tool stability
Posted Jun 30, 2025 4:35 UTC (Mon) by koverstreet (supporter, #4296)

It'd be fine if they just shipped a working package. We have the compat code: the kernel is always going to match the on-disk filesystem version, so we just transparently use the kernel fsck if bcachefs-tools doesn't match the on-disk version. They might be missing some of the latest goodies (bcachefs fs top), but that's not a big deal. If we need to debug, bcachefs dump will work fine on a version mismatch; it doesn't need to run upgrade/downgrade passes. It really was just a case of the maintainer trying to patch it to use an old, unsupported bindgen and then completely dropping the ball.
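[The transparent fallback described in that comment boils down to a version-dispatch rule. The sketch below is a hypothetical illustration of that rule, not actual bcachefs-tools code - the function, version strings, and device path are invented, and the kernel-side check is shown as a mount with the fsck option purely as an example:

```python
def fsck_command(tools_version: str, ondisk_version: str, dev: str = "/dev/sdX"):
    """Pick which fsck to run, per the compat rule described above:
    use the userspace fsck only when bcachefs-tools matches the
    on-disk format version; otherwise fall back to the kernel's
    fsck (e.g. a mount with -o fsck), since the running kernel
    always matches whatever on-disk version it is able to mount."""
    if tools_version == ondisk_version:
        # Versions match: the userspace tool understands this format
        return ["bcachefs", "fsck", dev]
    # Mismatch: let the kernel do the checking at mount time
    return ["mount", "-t", "bcachefs", "-o", "fsck", dev, "/mnt"]

# The dispatch, not the exact commands, is the point:
print(fsck_command("1.25.2", "1.25.2"))
print(fsck_command("1.20.0", "1.25.2"))
```

The design choice being described is that the kernel is the one component guaranteed to be in sync with anything it can mount, so it is the safe fallback when the packaged userspace tools lag behind.]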
[Reply to this comment] Userspace tool stability Posted Jun 30, 2025 11:22 UTC (Mon) by mirabilos (subscriber, #84359) [Link] (31 responses) Fits my definition of unstable (not suitable for a stable distro release): requiring too-tight and/or too-new dependencies to even function. [Reply to this comment] Just use bundled dependencies Posted Jun 30, 2025 14:32 UTC (Mon) by DemiMarie (subscriber, #164188) [Link] (30 responses) That's not a concern if you use the dependencies specified by Cargo.lock, which is the only tested and supported configuration. [Reply to this comment] Just use bundled dependencies Posted Jun 30, 2025 15:50 UTC (Mon) by mirabilos (subscriber, #84359) [Link] (29 responses) That's unacceptable for a packaged distro. [Reply to this comment] Just use bundled dependencies Posted Jun 30, 2025 16:09 UTC (Mon) by rahulsundaram (subscriber, #21946) [Link] (28 responses) > That's unacceptable for a packaged distro. It used to be, yeah, but these days distributions have mostly given up and started bundling dependencies. They might recommend against it, but they allow for exceptions, and applications written in Rust and Go commonly use bundling. [Reply to this comment] Just use bundled dependencies Posted Jun 30, 2025 16:25 UTC (Mon) by koverstreet (supporter, #4296) [Link] (26 responses) The Debian snafu was caused by them unbundling dependencies, against very specific advice. The Debian maintainer switched to an old unsupported version of bindgen, which didn't build, then sat on it for months - updates stopped. There was a brief regression in the mount subcommand where it wasn't passing mount options correctly - this was the version Debian was stuck on, and then I found out about it because I got a bunch of bug reports from people who'd had a drive die and weren't able to mount in degraded mode.
[Reply to this comment] Just use bundled dependencies Posted Jun 30, 2025 16:36 UTC (Mon) by mirabilos (subscriber, #84359) [Link] (25 responses) Bundling dependencies is a Policy violation in Debian, and as such packages are not allowed to do that, period. If you want your software to be used in stable, supportable contexts, then you keep that in mind. [Reply to this comment] Just use bundled dependencies Posted Jun 30, 2025 17:21 UTC (Mon) by koverstreet (supporter, #4296) [Link] (3 responses) The Cargo.toml correctly specified the version of bindgen. And the reason for the advice was that even if he had used a supported version of bindgen, that is a particularly sensitive package, since it deals with ABI issues (and packed vs. unaligned has been poorly specified, and a source of real problems). Breakage in that area can result in a package that builds, but with the wrong ABI, leading to particularly subtle and unpredictable breakage. Changes like that need to be tested; the Debian package maintainer was not doing that testing. The issue here was entirely the fault of the packaging side; they made changes in an irresponsible fashion - and for a package that's critical to the system to function, that needs to be highlighted. Remember the Debian OpenSSL bug that made everyone's ssh keys predictable? And as I've stated elsewhere, I never asked for bcachefs-tools to be packaged for Debian; this was something the maintainer in question volunteered to do. It would have been much better if they'd simply not packaged bcachefs-tools at all; then users would have stuck with compiling it from source and had something that worked. The Debian people are the ones that acted here, and they did so from a position of authority (users assumed their package would be tested and working, and so relied on it). If they did so irresponsibly, claiming "this is our process!" is not an out.
[Reply to this comment] Just use bundled dependencies Posted Jun 30, 2025 17:46 UTC (Mon) by koverstreet (supporter, #4296) [Link] Actually, I should add, Debian most definitely has processes for obtaining carve-outs for system-critical packages - Ted talks about them frequently for e2fsprogs. So I think we really just needed a package maintainer who was more plugged into that process and had the time to do things right. [Reply to this comment] Just use bundled dependencies Posted Jun 30, 2025 21:55 UTC (Mon) by dvrabel (subscriber, #9500) [Link] (1 response) "Changes like that need to be tested; the Debian package maintainer was not doing that testing." Complaining about someone not running non-existent tests seems somewhat unfair. (If they are there and I missed them, you may want to update the README.md or INSTALL.md to say how to run them.) [Reply to this comment] Just use bundled dependencies Posted Jun 30, 2025 22:30 UTC (Mon) by koverstreet (supporter, #4296) [Link] Unfortunately, we don't have tests that would be easy for distributions to run :/ I would accept contributions if they're good, but I must warn that in the past, when people have contributed simple userspace-only bcachefs-tools tests, they've been bad (run this command and check that the text output matches; nothing more), and that's fragile and doesn't actually test much. So they rotted and eventually I ripped them out. The best bcachefs-tools tests are the ktest tests, in particular: - The migrate tests from single_device.ktest (these exercise a large portion of the codebase: they create a bcachefs filesystem in userspace, doing an in-place migration from ext4/xfs to bcachefs, and then verify the contents using the kernel implementation against the old filesystem we're migrating from) - There are some 'bcachefs fs usage' tests, I think in single_device.ktest or replication.ktest (or maybe both?)
- The mount helper needs to be exercised in multi-device mode, so definitely something from replication.ktest (there might be a test specifically for this, or any test here would do). I'm not sure if we ever added a test specifically for passing through mount options; we've got enough tests that something would definitely break if that breaks again, but distro people would want a test specifically for this instead of running the full test suite. - The Rust btree bindings need to be tested, so we need something that checks that 'bcachefs list' works (if we're missing this it would be trivial to write). - The userspace fsck implementation ought to be tested; this one is less critical since it's all C and the same code that's run on the kernel side - no crazy bindgen stuff going on, and I don't think we've ever seen breakage specific to userspace fsck - but it ought to be covered if it's missing. - 'bcachefs fsck' when using the kernel fsck implementation via the ioctl interface (i.e. compatibility mode) needs to be tested; we have seen breakage there in the past. The ktest tests are not super lightweight (they spin up VMs), but that's what needs to run if we want good test coverage. They're also not currently set up for testing different tools releases, so that'd need to be fixed for Debian/Fedora et al. The NixOS people have also contributed some bcachefs-tools tests; I haven't looked closely at them, but what I have seen looks better than what we had in the past. They cover a different set of things (e.g. multi-arch builds) than the ktest tests, though; the ktest tests really are where our primary test coverage comes from.
[Reply to this comment] Just use bundled dependencies Posted Jun 30, 2025 18:29 UTC (Mon) by farnz (subscriber, #17727) [Link] (20 responses) The issue is not that the Debian maintainer unbundled dependencies; the issue is that they attempted to use an older version of a dependency when bcachefs-tools explicitly stated in its Cargo.toml that it needed a newer version, and had a Cargo.lock showing that it had been tested against a newer version, at that. Unbundling dependencies is one thing; explicitly rewinding to an older version, when upstream has chosen a newer version for a reason, and then neither testing to confirm that this works nor fixing any bugs that result from using the older version of a dependency, is a problem. [Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 0:14 UTC (Tue) by bluca (subscriber, #118303) [Link] (19 responses) That's expected, and it's because upstream "app" developers using these language-specific systems usually have no idea what they are doing, and just pin to whatever version happens to be installed at that given random time on their (more often than not, Windows) system. This is especially evident for any random Python program, but there's no reason to believe it would be different for other languages. This version-pinning info is just not very useful information normally; it's in fact just random noise to ignore. [Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 1:17 UTC (Tue) by koverstreet (supporter, #4296) [Link] That version pinning gets you reproducibility. That's huge. When distributions start swapping out dependencies with different versions, that invalidates all the testing that's been done by upstream. And that's a real problem for a critical system package. Debian/Fedora doing this for normal packages is one thing; I know they have their reasons (I would quibble about their merits, but that's a different topic).
But in this instance, we saw exactly what sort of problems it can cause. [Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 2:39 UTC (Tue) by NYKevin (subscriber, #129325) [ Link] (13 responses) Look, I have some sympathy for distros here, I really do, but if you change the configuration from whatever upstream ships, and it breaks, then you get to keep both pieces. That's the only sane way this can possibly work. Distros are entitled to their opinions about software packaging and versioning. They are entitled to enforce those opinions on their own packages that they distribute. They are even entitled to refuse to package software if they don't like how the upstream has provided it. What they are not entitled to do is demand that their packaging methodology is the only approach that anyone is allowed to use for any purpose. If an upstream wants to ship a Flatpak, or a Cargo crate, or whatnot, that's their choice, and the distro may package it or not, as is their choice. I'm sure it is inconvenient for distros when lots of upstreams do this, but if large numbers of upstreams are choosing to do this, it's probably for a reason, and that reason probably rhymes with the following: Distro-level packaging is great for end users and sysadmins. It's not so great for local development. You often have to deal with older versions of things, it's not always obvious whether your distro is carrying a patch or the like, and there is no isolation between different development environments (short of doing something "heavy" like a proper OS-level container for each environment). Some upstreams are simply not willing to put up with these constraints, so they use non-distro tooling. Distros had every opportunity to recognize this impedance mismatch and do something about it. 
They could have introduced a lightweight system for "local" installations of select library packages (not all general-purpose packages, just packages that are known to be inert shared objects and the like, things you can shove in a directory and say "OK, I installed it," with no further work). They could have provided low-oversight development repositories (some distros actually do this, but most of them have some amount of gatekeeping even in front of their "unstable" repositories, or at least they are not designed under the assumption that random upstreams will upload random bits and pieces into the repository with very little red tape). They could even have provided binaries that allow limited use of those systems on Windows and Mac (which many upstreams either want to use or are forced to use for compatibility reasons). They could have, generally speaking, led the way on local development packaging solutions. They mostly did not do those things, or at least they didn't try hard enough. We know this is the case, because the language-specific package managers would not exist (or at least wouldn't be popular) if they had succeeded. But that was a gradual process that played out over years. It's not as if everyone just woke up one day and abruptly switched to using these technologies. If a sudden change had happened, then the distros' frustration would be more understandable, but as it stands, I think they have only themselves to blame. They failed to recognize what a large cohort of users wanted out of a packaging system, those users went off and did their own thing, and now the shoe is on the other foot and the distros are at the wrong end of it. [Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 8:15 UTC (Tue) by taladar (subscriber, #68407) [ Link] (2 responses) > Distro-level packaging is great for end users and sysadmins. 
As a sysadmin I would like to point out that the "use ancient versions" aspect of distro packaging often sucks for me as well, particularly when I have to support both systems using ancient versions, like an almost-10-year-old RHEL, and the latest version - especially for packages where the latest upstream version is very compatible in use but not in terms of supported config (e.g. openssh, apache, nginx, DB servers, ...). It is a huge amount of effort to special-case config-management templates in all kinds of places just to support some ancient openssh version, and the knowledge of exactly what to support there is pretty much something I will never need again once I figure it out. [Reply to this comment] Just use bundled dependencies Posted Jul 2, 2025 8:18 UTC (Wed) by niner (subscriber, #26151) [Link] (1 response) Indeed. I'm glad we have chosen to go in the other direction and deploy openSUSE Tumbleweed with very frequent updates. It sounds insane at first glance, but the systems this applies to are redundant, very well covered by automated tests, and of course intensely monitored. Combine that with a staggered rollout and you get a very stable system that avoids the hoops you have to jump through to support stone-age versions, and also makes developers happy, as they get to use the shiny toys. And of course you get the (usually) faster, more secure, and more powerful software. [Reply to this comment] Just use bundled dependencies Posted Jul 2, 2025 19:11 UTC (Wed) by raven667 (subscriber, #5198) [Link] This is maybe a bit of a tangent, but this depends a lot on how much time you have to work on upgrade issues and testing, and how much your development needs to track HEAD for its dependencies.
We have some software (Netbox, a Python/Django app) which generally depends on a recent supported Python and the latest libraries from pip and isn't tested with anything else; that can become a bit of a pain to maintain on an RHEL which is getting long in the tooth (a separate SCL Python when we used EL7, and tweaks to make sure the right versions/paths were used for everything). But most of our in-house software is Perl, and I was happy with the features and performance of RHEL5 with Perl 5.8 (aside from missing the process supervision and log collection from systemd/journald in EL7, which was a fantastic upgrade). We upgrade the OS when the hardware is no longer supportable, not because we are champing at the bit for new features or libraries or are resource-constrained; the server we bought in 2010 would probably still run our workload just fine today, and aside from a security update in freeradius which was a mandatory behavior change, I can't recall ever having an RHEL patch update actually break anything - it was very minor, rare, and unmemorable if that happened. There is a time savings in not having to perform a major upgrade of the OS every 6mo-1y and re-test everything to find the unexpected behavior changes, and only doing that work once every 5-10y. Not every team can absorb the churn while still making forward progress on their other priorities. I'm sure that the config automation we've built can get 95% of our app configured and running on the latest Fedora, but finding and fixing that other 5% to make it _perfect_ is substantially more work for diminishing returns, and competes for time with bug fixes and changes that other people are waiting for. Like a lot of these things, which approach is "best" depends more on the makeup of your team than on the technical details being "objectively better" one way or another.
[Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 9:43 UTC (Tue) by bluca (subscriber, #118303) [ Link] (9 responses) > Look, I have some sympathy for distros here, I really do, but if you change the configuration from whatever upstream ships, and it breaks, then you get to keep both pieces. That's the only sane way this can possibly work. > Distros are entitled to their opinions about software packaging and versioning. They are entitled to enforce those opinions on their own packages that they distribute. They are even entitled to refuse to package software if they don't like how the upstream has provided it. I fully agree - this software doesn't belong in any distribution, and dropping it was the correct thing to do > We know this is the case, because the language-specific package managers would not exist (or at least wouldn't be popular) if they had succeeded. This however, doesn't ring true at all. I am pretty sure the main reason language-specific package managers exist is Windows, which is the platform that the vast, vast majority of people use. Anything else is just a side effect. [Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 10:14 UTC (Tue) by roc (subscriber, #30627) [Link] (8 responses) > I am pretty sure the main reason language-specific package managers exist is Windows, which is the platform that the vast, vast majority of people use. Anything else is just a side effect. I think the creation of npm, Cargo etc had very little to do with Windows. No language team ever thought "we want to make it easy for developers to distribute and consume libraries written in our language! I know, let's make developers package their libraries for lots of different Linux distributions, and tell users of other OSes they're out of luck!" 
[Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 10:36 UTC (Tue) by bluca (subscriber, #118303) [Link] (3 responses) As far as I know, all of those are mainly corporate-driven projects. Heck, npm is owned by Microsoft now. Corporations are, by and large, Windows shops. I know this might be controversial for a website called "Linux Weekly News", but Linux on users' machines is a really, really, really niche case and an afterthought at best (if not outright forbidden) in most places. I don't know it for a fact, and I don't think there's ever going to be accurate data about it, but I wouldn't even be surprised if WSL had more users than all other Linux distros, combined, on native bare-metal user machines (laptops/desktops). So when you say that these package managers were created to make life easier for developers, I believe that. Because developers by and large use Windows, and to a lesser extent macOS. [Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 12:49 UTC (Tue) by paulj (subscriber, #341) [Link] I'm currently forced to use WSL. You're still relying on a package manager in the Linux environment though, be it apt or dnf. And I (really, *really*) prefer to install stuff through those. [Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 13:46 UTC (Tue) by pizza (subscriber, #46) [Link] (1 response) > I don't know it for a fact, and I don't think there's ever going to be accurate data about it, but I wouldn't even be surprised if WSL had more users than all other Linux distros, combined, on native bare-metal user machines (laptops/desktops). This matches my anecdotal experience, and not just in the corporate world. WSL made "Linux" just another Windows feature, so you now get the "best" of both worlds -- your hardware JustWorks(tm), as does every piece of software you claim to care about.
[Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 16:16 UTC (Tue) by paulj (subscriber, #341) [Link] Hmm... my own experience is that my hardware "Just Works" much better with Linux than with Windows. Least, my work Windows+WSL XPS sometimes doesn't suspend right, sometimes just hangs, sometimes just gets the desktop into a weird state that can only be solved with a reboot, and /always/ fails to remember exactly where my windows were after I replug into external monitors. Linux on my own XPS (not quite the same model) doesn't have the latter problem at all, and is far far far more stable on the other things. My experience, with 2 Dell XPS on my desk here, 1 Linux and 1 Linux on Windows, is that Linux is much more reliable as the base OS. (If it was my choice, I'd run Windows in a VM on Linux - as I've done in other places where some stuff was only available via windows). [Reply to this comment] Just use bundled dependencies Posted Jul 2, 2025 5:22 UTC (Wed) by raven667 (subscriber, #5198) [ Link] (1 responses) > I am pretty sure the main reason language-specific package managers exist is Windows, which is the platform that the vast, vast majority of people use. Anything else is just a side effect. The OG internet-connected language package manager is perl CPAN, and I'm not sure perl ran on Windows when CPAN was created and I don't think Windows is a significant installed base for perl. RPM has excellent integration with CPAN though (don't know about dpkg) which is something sorely missing from all the subsequent language-specific packages. It all works well when either the OS package manager has full visibility into the entire dependency graph for system-wide software, OR the language-specific manager builds an entirely self-contained environment which only depends on the core/base OS as it comes out of the box, because mixing-and-matching the two incomplete systems eventually leads to insanity. 
[Reply to this comment] Just use bundled dependencies Posted Jul 2, 2025 8:53 UTC (Wed) by Wol (subscriber, #4433) [Link] The Gentoo package-management system has tools which invoke CPAN (perl-cleaner). Which is fine once you realise that you need to invoke it separately - I've known several systems where "emerge" gets wedged until you run it. That's one of my little niggles with Gentoo: updating is a dance of several commands, which is fine once you've discovered them all, but there's always the odd corner case to bite you ... Cheers, Wol [Reply to this comment] Just use bundled dependencies Posted Jul 3, 2025 9:25 UTC (Thu) by Sesse (subscriber, #53779) [Link] (1 response) > I think the creation of npm, Cargo etc had very little to do with Windows. My personal experience is that these things mainly come out of the needs of macOS, not Windows. But Cargo specifically struck me as short-sighted from the get-go; there was a long design document/manifesto that talked about the evolution of any given piece of code and how it would be installed and how Cargo would think about this, and _not once_ did it mention that a distribution may want to package the resulting program. Or, in general, how Rust code would live in a world where not everything is Rust and follows Rust ways of thinking. [Reply to this comment] Just use bundled dependencies Posted Jul 4, 2025 9:59 UTC (Fri) by taladar (subscriber, #68407) [Link] > Or in general how Rust code would live in a world where not everything is Rust and follows Rust ways of thinking. The non-"Rust way of thinking" is really just the C way of thinking, based on workarounds for the flaws in the C development model. The only way to become completely compatible with that would be to copy all those flaws, in particular the hand-wavy attitude towards ABI compatibility that allows the C world to pretend that ABIs are compatible when there is hope that they are reasonably similar if you squint a little.
[Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 9:41 UTC (Tue) by farnz (subscriber, #17727) [Link ] (3 responses) It's genuinely expected that distro maintainers will not debug the changes they make as part of packaging, but will simply assume that any bugs they introduce during packaging are upstream's problem now? That's news to me - while distro maintainers are, just like any other FOSS consumer, welcome to change things, they can't then push the fault for bugs caused by their changes onto upstream, just because they don't like the way upstream develops. And note that this isn't version pinning - there are two sources of version information, one in Cargo.toml, which tells you what versions upstream expects should work (and allows for many versions), and one in Cargo.lock, which tells you the precise versions upstream has tested. It's an upstream bug if you use a version that's compatible with the dependency specification in Cargo.toml, even if you chose a different version to Cargo.lock; it's your bug if you (as happened in the case of bcachefs-tools) choose a version of the dependency that the build system would not permit you to select. [Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 10:38 UTC (Tue) by bluca (subscriber, #118303) [ Link] (2 responses) > It's genuinely expected that distro maintainers will not debug the changes they make as part of packaging, but will simply assume that any bugs they introduce during packaging are upstream's problem now? Personally, my expectations for such buggy and unstable software is to not get shipped at all in distributions. Which is what is happening here, so everyone should be happy? 
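The Cargo.toml/Cargo.lock distinction farnz draws can be illustrated with a hypothetical dependency stanza (the version numbers here are invented for illustration, not bcachefs-tools' actual manifest): Cargo.toml records the requirement upstream says must hold, while the generated Cargo.lock records the exact version upstream actually built and tested with.

```toml
# Cargo.toml - the requirement. "0.69.4" is Cargo shorthand for "^0.69.4",
# i.e. any semver-compatible version >= 0.69.4 and < 0.70.0 is acceptable.
[dependencies]
bindgen = "0.69.4"

# Cargo.lock (generated, checked in) - the exact version upstream tested:
#
# [[package]]
# name = "bindgen"
# version = "0.69.5"
```

Substituting a different version that still satisfies the Cargo.toml range merely deviates from the lock file; substituting a version below the Cargo.toml floor (as happened with the old bindgen in Debian) selects a version that `cargo` itself would refuse to resolve.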
[Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 13:13 UTC (Tue) by intelfx (subscriber, #130118) [Link] (1 response) > Personally, my expectations for such buggy and unstable software If a program claims that it depends on "foobar >= 1.2.3" and you forcibly link it against "foobar 1.0.0" and it breaks, you cannot possibly claim in good faith that it's somehow the program to blame and not you. [Reply to this comment] Just use bundled dependencies Posted Jul 1, 2025 13:43 UTC (Tue) by bluca (subscriber, #118303) [Link] I don't much care for "blame". I'm simply saying that if something requires the bleeding edge of everything, and the only way to make it work is to vendor and embed everything at that bleeding edge, then it's not suitable for inclusion in a stable distribution. It can live elsewhere. There is no law of physics demanding that every single piece of software under the sun gets packaged in distros. It's fine to exclude things that don't fit, for whatever reason that may be. [Reply to this comment] Just use bundled dependencies Posted Jun 30, 2025 16:35 UTC (Mon) by mirabilos (subscriber, #84359) [Link] That speaks even more strongly in favour of the bcachefs tooling being immature. Bundling dependencies is a strong sign of the tooling not being written against a stable enough interface, and it *will* make it lose security fixes that are applied to the rest of the distribution. So that's generally also unacceptable. [Reply to this comment] Overstreet brings this on himself Posted Jul 4, 2025 10:53 UTC (Fri) by highvoltage (subscriber, #57465) [Link] (2 responses) Nope, I'm sorry Kent, you fundamentally and clearly simply don't understand how stable releases of software work. Even versions of bcachefs-tools that were a few weeks old in Debian were "ancient". By that bar, there's no way you could ever include it in a stable release.
Also, your demands go far beyond Debian's policies; bcachefs-tools has zero chance of a future if it can't release any code that's suitable for long-term use. [Reply to this comment] Overstreet brings this on himself Posted Jul 4, 2025 11:41 UTC (Fri) by taladar (subscriber, #68407) [Link] (1 response) In experimental software with a decent development pace that has yet to reach its first stable version, versions that are a few weeks old are ancient. Are you sure it is not you who doesn't understand how stable releases of software work? They do not happen until you reach a first version you can consider stable, because they are meant to preserve stability. If stability has not yet been reached in the first place, there is no point in supporting older versions for any length of time at all. [Reply to this comment] Overstreet brings this on himself Posted Jul 4, 2025 12:48 UTC (Fri) by SLi (subscriber, #53131) [Link] I think there's a level of misunderstanding about what "stable" in e.g. Debian's context means. It does not mean "works". It means "does not change". That is, Debian will often package software that is very experimental and buggy, and that may get into stable. That is not a problem in itself. The stable guarantee is that it will not change in incompatible ways, usually including Hyrum's law. A crash may not be important enough to fix in stable. The promise is that it keeps working as it has worked so far. The biggest reason why Debian typically would have a problem with packaging rapidly changing software is the thought that *if* they have to change it - usually to fix a security issue, sometimes another important enough bug - then backporting the fix to the version in Debian stable may be too difficult, making it unsupportable within Debian's stability promise.
[Reply to this comment] Overstreet brings this on himself Posted Jul 3, 2025 22:27 UTC (Thu) by MichaelRose (guest, #178169) [Link] (2 responses) It seems obvious that any filesystem that needs constant changes at a moment's notice to avert disaster isn't, nor has it ever been, stable. Being used by the kind of folks who you suggest would find it burdensome to build a kernel or install a third-party repo indicates that you have successfully sold your project to at least some users who have no business using an experimental filesystem. It suggests that inclusion in the kernel was a mistake that shouldn't be compounded by subordinating a process designed around comparatively stable software to support your unstable software. The best remediation would be to communicate clearly to those folks who shouldn't be using your software in the first place, by warning them and then removing it. Those so capable can migrate to building it or using a third-party repo; those who aren't can migrate their data. In the future it could always be added back when it's actually stable. [Reply to this comment] Overstreet brings this on himself Posted Jul 4, 2025 0:19 UTC (Fri) by koverstreet (supporter, #4296) [Link] Every filesystem has stories of lost filesystems. (Except for maybe ext4; I've seen a ton of user feedback and I think I've only seen one person - ever - mention lost data with ext4, and that in passing with no details.) Consider that. You don't get to the type of reliability ext4 has by writing perfect code; modern filesystems are just too big - no one can write that much perfect code. And ext4 achieves its reliability through simple on-disk data structures, with the important stuff in fixed locations - it definitely does not have perfect code.
You get it by iterating on every single bug report you can find, by making the most of every bug report, looking for everything that went wrong or could have gone better - especially all the debug tooling and information, so you can see what went wrong. You do it by being proactive. Rinse and repeat, over and over, until the bug reports stop coming in and you're confident you can handle anything Murphy's law throws at you. [Reply to this comment] Overstreet brings this on himself Posted Jul 4, 2025 10:42 UTC (Fri) by taladar (subscriber, #68407) [Link] You obviously have strong opinions on this topic, but coming from someone on the side of this debate that claims professional, structured processes are needed and that Kent is behaving in an unprofessional manner, your post reads as incredibly toxic, and honestly it would put me off even trying to work with the Linux developer community if this is the tone to be expected in debates there. [Reply to this comment] Overstreet brings this on himself Posted Jul 2, 2025 9:33 UTC (Wed) by dwon (guest, #54223) [Link] (2 responses) > a) We do not lose user data You *do*, though. If bcachefs didn't lose user data, there would be no urgency for getting these patches merged after the merge window has closed. You seem to think "we don't lose user data" is a unique strength of the way you do development, but it's really just a combination of you overstating your filesystem's robustness and you not playing well with others. I read through this thread as a bystander, and honestly I'm coming to the same conclusion as Linus: you seem to have a really rigid way of thinking about how to do development, and you don't seem to think it's important to adhere to Linus's process, or Debian's process, both of which exist in part to make it possible for generalists like him and Ted to make enough sense of what a bunch of specialists like you are doing, and turn it into something people actually want to use.
They're also designed to take care of a whole bunch of other concerns that users care about. Your dismissive attitude to the work they're doing, just because they're prioritizing concerns that you don't personally understand the value of, is really insufferable. [Reply to this comment] Overstreet brings this on himself Posted Jul 2, 2025 14:49 UTC (Wed) by koverstreet ( supporter , # 4296) [Link] (1 responses) It's a succinct way of saying "we will make every attempt possible to harden the filesystem against everything imaginable, and when something bad happens we will make every attempt to recover, and learn from it so it doesn't happen again". It's about priorities. [Reply to this comment] Overstreet brings this on himself Posted Jul 2, 2025 14:59 UTC (Wed) by rsidd (subscriber, #2582) [Link ] Yes, it's about priorities. It's about prioritising your needs and your perceived needs of your users over the project management needs of a project that has maybe 3-4 billion users (counting Android). It is clear you don't see that despite any argument people throw at you. Good luck. [Reply to this comment] Overstreet brings this on himself Posted Jun 29, 2025 16:14 UTC (Sun) by rsidd (subscriber, #2582) [ Link] Kent, you're a brilliant programmer, and you care about your users (hundreds of them? Thousands?) What you don't seem to appreciate is that Linus is not just a brilliant programmer (Linux AND git!) who cares about his users (hundreds of millions), but ALSO a mindblowing manager. Who else has managed a project of this size for this long, this well? No MBA course can teach you that. He is unique. The "only bugfixes for -rc" is not just part of his process, it's common sense. Every other project that has release candidates does this, it's what "release candidate" means. I suspect you are on the spectrum, as are many brilliant people (likely Newton, likely Linus himself), and maybe for that reason, just can't see the other perspective. 
The end result is that, in pursuit of YOUR ideal goal (Linus merges everything you say, every time you ask, because you say your users deserve it), you are losing the real prize: having bcachefs be in-tree. If you want that you have to play by the rules. If you don't want that, of course, do what you want, but your users will suffer. It seems clear that the only reason Linus finally agreed to merge your patch is that he had decided to kick bcachefs out for the next release, and didn't want to hurt your users. But by insisting on this route, it is you who are hurting your users. If you could see the big picture, you would realise all this, apologise to Linus, agree to play by the rules in future, and beg to have bcachefs included. Not write nonsense like saying that showing basic respect for Linus's not merely superior, but proven and unmatched, managerial abilities is a "generational thing". Respect for your elders may be a generational thing, but this is not that. If you can't see that, maybe your users shouldn't be depending on a filesystem developed by someone with your social skills. Whether in-tree or out, you and they will have problems. After reading the dozens of comments here, can't you see that you are in a tiny minority on this? If you still think you are right, good luck, to you and your users. Many of us want good in-tree FS software (I have been using ZFS but would love to switch to something equally good and in-tree) but not if it is developed by crackpots (*) (* no reference intended to a certain other Linux filesystem). [Reply to this comment] Overstreet brings this on himself Posted Jun 30, 2025 11:30 UTC (Mon) by job (guest, #670) [Link] > This is not _development_ we're talking about, this is _stabilizing_, This is the sticking point, isn't it? From a cursory glance at the patches, it is not hard to see why others disagree on that.
It is clearly a bit of both, and as a casual observer of kernel development for many years, it is hard to understand why bcachefs should be treated differently than any other filesystem. [Reply to this comment] -rc binary kernel builds Posted Jun 30, 2025 14:27 UTC (Mon) by DemiMarie (subscriber, # 164188) [Link] How are your users getting -rc kernels? [Reply to this comment] Overstreet brings this on himself Posted Jun 28, 2025 21:53 UTC (Sat) by raven667 (subscriber, #5198) [ Link] Honestly, I disagree with your logic here, or at least I wouldn't prioritize the conflicting requirements the same way. Even though a bunch of people have taken a dependency on a clearly marked Experimental FS, even if that eats their data or even leads to their bodily harm, that doesn't create an emergency on the part of the Linux project that justifies fast-tracking fixes beyond the normal release cycle. Linux has tons of bugs and fixes which can have equally severe consequences, and they are handled in an orderly and sustainable fashion through the existing release management process. People who really want to directly track your development can run your kernel branch if they need to. I don't think "getting this done" in the mainline kernel in the time frame you've outlined is a goal for the rest of the Linux project, so I'm not surprised you aren't getting traction with that line of reasoning. [Reply to this comment] Overstreet brings this on himself Posted Jun 29, 2025 2:06 UTC (Sun) by rsidd (subscriber, #2582) [Link ] (7 responses) It's experimental, so there's no reason to push a new feature into -rc3. That's the logic. It's not just the kernel; any rc in any project should be about bugfixes and not new features. Otherwise it will be chaos. The only exception should be for serious late-discovery bugs in core (not experimental) components that absolutely require a new feature to resolve.
[Reply to this comment] Overstreet brings this on himself Posted Jun 30, 2025 7:53 UTC (Mon) by taladar (subscriber, #68407) [ Link] (6 responses) You could also argue the other way around: that stability guarantees like "only bugfixes go into rc versions" should not apply to experimental code, since there is no expectation that the code won't break with new releases for those. [Reply to this comment] Overstreet brings this on himself Posted Jun 30, 2025 10:23 UTC (Mon) by edomaur (subscriber, #14520) [ Link] (5 responses) I cannot agree with that reversal: if you merge experimental code that way, how do you ensure that it doesn't break stuff that is supposed to be stable and bugfixes-only? [Reply to this comment] Overstreet brings this on himself Posted Jun 30, 2025 11:13 UTC (Mon) by job (guest, #670) [Link] (4 responses) This was exactly what happened the last time: some quick fixes for bcachefs broke the build on non-mainstream architectures. [Reply to this comment] Overstreet brings this on himself Posted Jun 30, 2025 12:34 UTC (Mon) by epa (subscriber, #39769) [Link ] Perhaps experimental kernel code should have a fixed list of architectures it gets built for. Adding support for bcachefs on m68k (to pick a random example) would be an explicit step, done only once the bcachefs developer confirms that he has access to an m68k box and is happy to test fixes on that (at least as far as compiling them). The architectures built would gradually increase to the full set as the new filesystem moves towards stable. [Reply to this comment] Overstreet brings this on himself Posted Jul 1, 2025 7:48 UTC (Tue) by taladar (subscriber, #68407) [ Link] (2 responses) Maybe code should not be built on architectures the developer/upstream CI didn't test? [Reply to this comment] Overstreet brings this on himself Posted Jul 2, 2025 6:00 UTC (Wed) by dvdeug (guest, #10998) [Link] (1 responses) Kernel code should work on all architectures that make sense with the hardware.
A filesystem should compile on all architectures. If it can't, it's probably not suitable for the upstream kernel. [Reply to this comment] Overstreet brings this on himself Posted Jul 2, 2025 8:10 UTC (Wed) by taladar (subscriber, #68407) [ Link] Turn that around. Architectures that don't have the means to provide test/CI systems for all developers writing code that should compile there are probably not suitable for the upstream kernel. [Reply to this comment] Overstreet brings this on himself Posted Jul 2, 2025 8:21 UTC (Wed) by jengelh (subscriber, #33263) [ Link] (1 responses) >No, I do not say that it is a stable fs - it's marked experimental for a reason! bcachefs.org has this to say, though: >Already working and stable So which one is it? [Reply to this comment] Overstreet brings this on himself Posted Jul 2, 2025 8:59 UTC (Wed) by Wol (subscriber, #4433) [Link] Why can't it be both? Let's take knives for example - your eating knives are usually pretty blunt and safe, they're safe in *any* environment. Kitchen knives, on the other hand, tend to be sharp and dangerous - fine when used as designed. The "experimental" tag marks bcachefs as a kitchen knife - not safe when given to script kiddies ... Cheers, Wol [Reply to this comment] Overstreet brings this on himself Posted Jul 1, 2025 17:38 UTC (Tue) by kiko (guest, #69905) [Link] (1 responses) The kernel release cycle is a pretty critical process for the community, so if it's the policies around it that are the root cause of the disconnect I don't think it's realistic to expect there will be much flexibility here. Even for a filesystem. If the main issue on your end is optimizing for fast turnaround time to end users, you would probably get the best bang for buck by providing bcachefs end-users with custom packages for Ubuntu in a self-managed repository or PPA. 
You can include both kernel and tools and quickly iterate on those, delivering end-users something more immediately useful, and then streaming the patches that have gone through testing to Linus through lkml, following the standard kernel release cycle. [Reply to this comment] Overstreet brings this on himself Posted Jul 1, 2025 20:58 UTC (Tue) by Wol (subscriber, #4433) [Link] > If the main issue on your end is optimizing for fast turnaround time to end users, you would probably get the best bang for buck by providing bcachefs end-users with custom packages for Ubuntu in a self-managed repository or PPA. And if said end users don't run Ubuntu ... Personally, I do not like Debian/apt-based distros. Or RH-based ... which is why I run gentoo or YaST-based distros. Cheers, Wol [Reply to this comment] Overstreet brings this on himself Posted Jun 28, 2025 18:08 UTC (Sat) by kov (subscriber, #7423) [Link] Having to wait a little bit to ship a new feature that can work in userspace anyway and it being impossible to ship a new filesystem are so far apart, though. It seems like you just don't want to follow the rules of the project. That is fine, you can just build outside of the project in that case. [Reply to this comment] Overstreet brings this on himself Posted Jun 28, 2025 19:41 UTC (Sat) by SLi (subscriber, #53131) [Link ] I don't think you're 100% wrong in saying it's a generational thing. Probably previous generations were more submissive and accommodating to authorities. But at the same time what you're proposing sounds like something where Linus could be replaced by a shell script (or rather just giving commit rights to people). Do you think Linux development would work, perhaps even work better, if there was no integrator like Linus who gets to say no, and who is perhaps tolerated because he's willing to do the job and he still gets it right often enough that it's a net positive? 
In other words, do you actually see any value in this gatekeeping part of Linux development? [Reply to this comment] A range of expertise Posted Jun 28, 2025 20:53 UTC (Sat) by neilbrown (subscriber, #359) [ Link] > The other thing to consider is that: within the kernel we are all experts within our area, As a developer you need expertise in the problem space and the mapping between it and the code. As a maintainer you need a different expertise. You need to be able to work with people. Often people who (you think) are wrong or who (you think) are inconsistent or who (you think) are rude. With the code, technical excellence is a winning strategy. With people it isn't. There you need diplomacy. [Reply to this comment] Overstreet brings this on himself Posted Jun 29, 2025 11:28 UTC (Sun) by snajpa (subscriber, #73467) [ Link] (6 responses) > So if process has become so inflexible that it is impossible to ship a new modern general purpose filesystem within that process Not only that, the kernel process isn't working for a lot more people, but there are always so many defenders of the status quo that one might get the impression that's how things are supposed to be, and even that they are better this way... Cgroup design: oh but we're doing it so much better than all the kernels with OS-level virtualization that came before... so much so that it needed a v2 - and it still needs barely functional band-aids such as LXCFS, when this all should have been in the kernel Scheduler: EEVDF is a flop, let's have it in eBPF - in fact let's have everything in eBPF \o/ CRIU - another huge flop... if only those guys were enabled to do it in the kernel, properly scalable filesystems - total flop (just look at BTRFS' status quo...) - the comment above with FUSE is just magical :D The kernel development process has long been broken. Let's do everything in userspace...
oh, but never ask why it got to this, no, doing things in userspace is just cool and that's how it's done if you want to be cool too, definitely it is not a workaround to an absolutely dysfunctional process, nope nope :-D Kent, people _don't_ want to talk about this, categorically. You're on your own, even though there might be a lot of people who might agree - they need their patches to not be refused and/or spit on, like what you're getting. Loyalty is the #1 priority in the development process, after all. You trying to be more loyal to your users than to the gatekeeper devs - that is the problem that needed to be dealt with, and to be made an example of. [Reply to this comment] Overstreet brings this on himself Posted Jun 29, 2025 14:23 UTC (Sun) by marcH (subscriber, #57642) [ Link] > they need their patches to not be refused and/or spit on, like what you're getting. Loyalty is the #1 priority in the development process, after all. You trying to be more loyal to your users than to the gatekeeper devs - that is the problem that needed to be dealt with, and to be made an example of. So much exaggeration and drama for merely delaying a repair _feature_ until the next cycle. Too bad; you seemed to have some interesting points? But now no one will pay attention to them. That's the "non-technical" part of team work. [Reply to this comment] Overstreet brings this on himself Posted Jun 29, 2025 18:34 UTC (Sun) by quotemstr (subscriber, #45331) [Link] (1 responses) Wait. EEVDF was a flop? Also, CRIU is fine in principle, isn't it? It's less useful right now because it's namespace-centric, but it doesn't seem like a terminally flawed idea like cgroups v1 was. [Reply to this comment] Overstreet brings this on himself Posted Jun 30, 2025 18:19 UTC (Mon) by adobriyan (subscriber, #30858) [Link] CRIU is OK if you want to suspend jobs, upgrade the kernel, and resume jobs without using VMs. [Reply to this comment] Which development process are we talking about?
Posted Jun 29, 2025 21:24 UTC (Sun) by neilbrown (subscriber, #359) [ Link] (1 responses) I think you are talking about a completely different problem to the one Kent is struggling with. As I understand it, the "development process" that is mostly being discussed here is the one that leads to one release every 2 months (approx) and requires features to be ready before version X is released if they are to land in version X+1. There is a lot more detail of course, but that is the gist. You seem to be referencing a completely different process - the process by which new ideas are designed, evaluated, and implemented. That isn't really a process. There is no formality, no common policy. It is mostly about power and politics. The only underlying core I can see is the principle that the person who does the work gets to call the shots. I agree this isn't ideal, but I cannot think of anything better. But anyone who really wants a different cgroups is free to fork the kernel and continue to benefit from all the parts that they don't hate. Maybe the new design will catch on. (Life for filesystem developers is MUCH easier here. Having multiple competing filesystems is much easier than multiple schedulers, cgroups, mem-managers, etc.) [Reply to this comment] Which development process are we talking about? Posted Jun 29, 2025 22:25 UTC (Sun) by koverstreet ( supporter , # 4296) [Link] I think in both cases it does quickly boil down to power structures, decision-making, etc. Foucault would have a field day with this stuff :) [Reply to this comment] Overstreet brings this on himself Posted Jun 30, 2025 11:22 UTC (Mon) by job (guest, #670) [Link] You seem to be arguing for the kernel to ship *less* experimental code and wait longer for the design to be really polished before accepting it? Which is in itself a perfectly valid opinion, but not really relevant to the release cadence being questioned here.
[Reply to this comment] The trust issue Posted Jun 30, 2025 10:09 UTC (Mon) by ajb (guest, #9694) [Link] The problem here is one of trust. Linus knows that you are technically competent. He doesn't know that he can trust your judgement. You are so caught up in trying to prove that your position is best that you are missing the bigger picture. In order to convince your colleagues to change, you have to not only advocate for your position, but show that you understand *their* position. Otherwise there's no reason to believe that you understand the difference enough to really know that your position is better, or better enough to be worth changing. Your approach is simply to do what you think is best, and then only tell people the justification for why you need to diverge from the process afterwards, when you get pushback. And then you argue semantics like 'this is really a fix'. The alternative would be to show that you know what Linus's judgement *would* be: "Here are the uncontroversial fixes; here is another thing which I know wouldn't normally be accepted in this period, but these are the reasons why I think it should be accepted". Without the second approach, Linus basically has to treat each of your pull requests like an unexploded bomb, because he can't trust you to flag things up. And he's not willing to do that. [Reply to this comment] Overstreet brings this on himself Posted Jun 28, 2025 17:29 UTC (Sat) by proski (subscriber, #104) [ Link] "I'm the one who's responsible for making sure that bcachefs users have a working filesystem" I don't mean to criticize Kent, as I don't know the context, but this quote is factually incorrect. As long as bcachefs is in the kernel, everybody shares responsibility to keep bcachefs working. If someone else breaks the bcachefs code or causes speed degradation in bcachefs, they will have to fix it.
In software companies, it's usually a manager's job to make sure that people feel like they are one team, that they can rely on others, and that they share responsibility for delivering a high-quality product. But there are no managers in Linux kernel development, only leads and principals. I would hate to see bcachefs removed from the kernel and join ZFS as another filesystem that is good but cannot be used easily. [Reply to this comment] Overstreet brings this on himself Posted Jun 29, 2025 18:50 UTC (Sun) by jmalcolm (subscriber, #8876) [ Link] Ya, those quotes speak for themselves. I am a big fan of bcachefs, but honestly the "don't you know who I am" talk would not impress me if I was Linus. And is Kent Overstreet really lecturing Linus Torvalds on the weighty responsibility of having to worry about users? Who do you think has more riding on them, the author of a narrowly deployed, experimental filesystem, or the chief maintainer of the most widely deployed operating system kernel in the world? My work is important, Linus. Don't you get it? It is precisely because Linus feels responsibility for the robustness of the kernel that he cannot let every cowboy that comes along inject risk into the kernel. Does Kent Overstreet not understand that? I really, really like bcachefs and hope to see it widely deployed. For that reason, I really wish Kent Overstreet would park a bit of the ego and think of his users instead of begging to be ejected from the kernel. [Reply to this comment] Linus had had lots of patience with him Posted Jun 27, 2025 16:23 UTC (Fri) by highvoltage (subscriber, # 57465) [Link] (5 responses) Linus has been so patient for so long, and I don't think I (or most maintainers/developers) would've had the patience to take it this far with Kent. I think even his message in that commit is far more civil than what I could've pulled off.
[Reply to this comment] Linus had had lots of patience with him Posted Jun 28, 2025 2:08 UTC (Sat) by Kluge (subscriber, #2881) [Link ] (4 responses) Kent does seem difficult to work with, but as long as Bcachefs is explicitly experimental, and it isn't breaking anything else in the kernel, it makes sense for the rules to be looser in its development. [Reply to this comment] Linus had had lots of patience with him Posted Jun 29, 2025 19:21 UTC (Sun) by jmalcolm (subscriber, #8876) [ Link] (3 responses) I disagree. Being marked experimental tells me, the user, that I may have problems. If we are talking features, as a user of the mainline kernel, I can wait. If we are talking bugs, "experimental" tells me that I may run into them. Kent Overstreet has his own kernel tree. If I run into a bug serious enough for me to require a kernel update, it is acceptable that the "fix", or even just a "workaround", would be that I need to get and maybe build Kent Overstreet's copy of the kernel if I cannot wait. It is precisely because it is experimental that I think pushing to the kernel in a panic is a mistake and not required. Kent Overstreet already has a kernel he can point me to in an emergency. It does not have to be the one I get from Linus. However, it would sure suck if the ONLY place I can get bcachefs is the "emergency" kernel from Kent Overstreet. So, please, let's not get kicked out of the kernel. [Reply to this comment] Linus had had lots of patience with him Posted Jun 30, 2025 8:11 UTC (Mon) by taladar (subscriber, #68407) [ Link] (2 responses) On the other hand, as a user with data integrity issues who does not compile kernels, I would likely never look at a filesystem again if those issues stayed unfixed for several months. [Reply to this comment] Linus had had lots of patience with him Posted Jun 30, 2025 13:53 UTC (Mon) by Baughn (subscriber, #124425) [ Link] (1 responses) As such a user, you should not have been using an experimental filesystem.
[Reply to this comment] Linus had had lots of patience with him Posted Jul 4, 2025 21:43 UTC (Fri) by Kluge (subscriber, #2881) [Link ] I don't see the point in lecturing users about how to meet their needs. [Reply to this comment] Recovery should be in userspace Posted Jun 27, 2025 18:44 UTC (Fri) by DemiMarie (subscriber, # 164188) [Link] (8 responses) I think one of the main reasons for the problem is that recovery is implemented in the kernel rather than in the userspace tools. If recovery were in userspace, then new features to fix broken filesystems would not need kernel changes, allowing the kernel to stay stable. [Reply to this comment] Recovery should be in userspace Posted Jun 27, 2025 19:06 UTC (Fri) by willy (subscriber, #9762) [ Link] (7 responses) Counterpoint: https://blogs.oracle.com/linux/post/xfs-online-filesystem... [Reply to this comment] Recovery should be in userspace Posted Jun 28, 2025 14:44 UTC (Sat) by koverstreet ( supporter , # 4296) [Link] (6 responses) Darrick had over half as much time to do just online repair as I've had for all of bcachefs. That meant he had the resources to go "full waterfall", and do all the testing for every conceivable situation (I've seen the tests he's got). I have to be pragmatic. No one is going to fund an entire new modern filesystem to perfection - and then get it merged. That will never happen; we know how many engineer-years filesystems take. That means I make sure the design makes sense, make sure the core and the things that matter are correct, make sure the code is debuggable, and then I am extremely reliant on the community (and bcachefs has a big one) for QA. [Reply to this comment] Recovery should be in userspace Posted Jun 28, 2025 22:15 UTC (Sat) by raven667 (subscriber, #5198) [ Link] (5 responses) > I have to be pragmatic. No one is going to fund an entire new modern filesystem to perfection - and then get it merged. That will never happen; we know how many engineer-years filesystems take.
Right, which means as a consequence that some number of janky versions that eat your data are going to be Experimental and part of released kernels, for years and years and years, until they stabilize and earn trust over time. The kernel development community is not asking for perfection out of the gate; that would be stupid. I believe they accept that imperfect versions will exist and have made peace with that. [Reply to this comment] Recovery should be in userspace Posted Jun 28, 2025 22:39 UTC (Sat) by koverstreet ( supporter , # 4296) [Link] (4 responses) The kernel community is ok with slow, but being just one person (and do I keep having to remind people why that happened? what happened to my chances of building a team since going upstream), I have to work efficiently. [Reply to this comment] Recovery should be in userspace Posted Jun 29, 2025 9:36 UTC (Sun) by pbonzini (subscriber, #60935) [ Link] (2 responses) It *is* efficient to delay journal rewind to the next release, and direct whoever was a victim of the bug and doesn't have a backup to the "next" tree. [Reply to this comment] Recovery should be in userspace Posted Jun 30, 2025 8:14 UTC (Mon) by taladar (subscriber, #68407) [ Link] (1 responses) Is it though? Who is going to be protected by that delay? Users who don't compile bcachefs won't be impacted at all if bcachefs introduces new bugs or features in a late rc version. Users who compile but don't use bcachefs will only be impacted if bcachefs introduces new compile-time errors, which should be easy enough to verify even in a late rc. Users who use bcachefs will get a maybe-bad version instead of a known-bad version, which is an improvement. [Reply to this comment] Recovery should be in userspace Posted Jun 30, 2025 10:25 UTC (Mon) by farnz (subscriber, #17727) [ Link] Users who run the latest Linus release, compile bcachefs (as a module or as a built-in), but don't use it, are potentially broken by changes Kent brings in.
There have been some really obscure compiler, firmware, and linker bugs triggered (not caused) by simply increasing the amount of code compiled in the system, and that assumes that Kent's changes are limited to fs/bcachefs - if he has to touch anything elsewhere in the kernel, the risk goes up. The only people who are safe are those who don't compile bcachefs at all. But if Linus only cares about people who don't compile bcachefs, why have it in tree at all? [Reply to this comment] Recovery should be in userspace Posted Jun 29, 2025 16:35 UTC (Sun) by josh (subscriber, #17465) [ Link] > and do I keep having to remind people why that happened? what happened to my chances of building a team since going upstream Yes, considering many of us have no idea what this is referring to. [Reply to this comment] Sad if true Posted Jun 27, 2025 21:19 UTC (Fri) by jmalcolm (subscriber, #8876) [ Link] (47 responses) As an enthusiastic bcachefs user, this is a very sad and serious situation. Bcachefs is running on the system that I am typing on now. It is especially tragic as it appears that, despite all the drama, bcachefs is getting quite close to a stable on-disk format which will not change from release to release. There has been talk of removing the "experimental" flag. Which means it would soon be possible to rely not only on the current kernel but also to fall back to an LTS kernel if required. From my view on the sidelines, I was thinking that perhaps 6.17 was going to be the kernel that would be safe for more people to try bcachefs on and for me to rely on more completely, as I would be able to fall back if necessary instead of always having to push beyond the "latest" kernel to fix things. That said, even as a fan, it is hard not to be disappointed that Kent Overstreet has put us in this situation. Regardless of the merits, bcachefs was kept out of an entire kernel release not that long ago as a warning to Kent Overstreet.
I was already breathing a huge sigh of relief that we had gotten back on track after that. Did Kent Overstreet see this situation as less serious than I did? Because when I read what Kent Overstreet was writing to Linus Torvalds after missing the merge window for 6.16, my immediate reaction was panic that this was going to put bcachefs in jeopardy again. And now my fears appear to have been realized. If I was so easily able to guess the consequences of Kent Overstreet's latest actions, why could he not? It makes me so sad. The reason I have been able to use bcachefs is because it moved into the kernel; I have moved half a dozen machines to it. It has been amazing. However, if it gets pulled from the kernel, I will have to stop using it. For lots of other reasons, I need to rely on the kernels and kernel packages that come from my distro. I am not going to be compiling custom kernels for these machines. Many of them use rolling releases, like Arch or Chimera, and this would be a hassle. Some of these machines rely on other kernel patches, for example recent Mac computers with T2 chips in them. The fact that bcachefs was in the kernel was what made it possible for me to use bcachefs. I cannot be the only one. [Reply to this comment] Sad if true Posted Jun 27, 2025 22:43 UTC (Fri) by linuxrocks123 (subscriber, # 34648) [Link] (2 responses) I've been compiling my own kernels for years. It's really not that bad. You can start with the distro kernel config if you want. If you've never done it before, you'll learn a lot by compiling the kernel. I wouldn't give bcachefs up if you like it. [Reply to this comment] Sad if true Posted Jun 28, 2025 5:51 UTC (Sat) by jmalcolm (subscriber, #8876) [ Link] Reasonable advice. I have been using Linux since the early 90s and used to compile my own kernels constantly back in the day (or at least it felt like constantly, as my old 486 took quite a while to spit out vmlinuz).
These days, I do not think I can commit to maintaining the kernel on a dozen machines. It is not just the kernel config, as some of this hardware requires other kernel patches. Even Chimera Linux has patches that I have not delved into. Chimera uses musl, clang, and the BSD userland. Chimera supports ZFS out of the box, so I guess that is where I go if I cannot use bcachefs. Arch is harder. I do not like Btrfs. dkms is not an option for me on Chimera either, so hearing Kent talk about that did not reassure me. And bcachefs itself has to be paired with the correct bcachefs-tools, of course. With rolling-release distros like Arch and Chimera, this all feels even more fraught with peril. In the past, I would have done all of the above gladly. These days, it is hard to guarantee that I will always have the time when I need to. For me, it really has to ship with my distro. If Kent Overstreet really wants to act as a champion for users, he should work harder to keep his filesystem in the kernel. [Reply to this comment] Sad if true Posted Jun 28, 2025 6:41 UTC (Sat) by eean (subscriber, #50420) [Link ] Well, for sure a distro like Arch Linux will have a kernel available with bcachefs if Linus pulls it for non-quality reasons. [Reply to this comment] Sad if true Posted Jun 28, 2025 5:56 UTC (Sat) by koverstreet ( supporter , # 4296) [Link] (43 responses) 6.15 has the stabilized on-disk format. Yeah, it's sad. [Reply to this comment] Sad if true Posted Jun 28, 2025 13:56 UTC (Sat) by pbonzini (subscriber, #60935) [Link] (42 responses) From maintainer to maintainer, what to include in an rc seems to be a very questionable hill to die on. There are plenty of times where people have bent the rules (myself included, as recently as last week; see my latest pull request, which basically adds support for TDX attestation), but you get to do that by showing a track record of moderation and conservative judgement.
Recently I have switched to a workflow using topic branches + merge commits as much as possible, and it gives extra flexibility that you might enjoy. You can place your own tree of completed features somewhere for the more fearless users, and structure all your pull requests as a series of merge commits. You can rebase as much as you want, for example letting your users run on top of the latest .0 while you send merge window PRs on top of -rc5 or so, and it's super easy to decide what goes in during stabilization and what waits for the merge window. But independent of the workflow, if in doubt I would keep rc changes to the bare minimum. Again: if in doubt, delay by a release. Your users might be missing a given feature, but if they have lived without it so far they don't need it *today* in Linus's tree. If you want to talk about this or anything else, write me an email. [Reply to this comment] Sad if true Posted Jun 28, 2025 20:46 UTC (Sat) by koverstreet (supporter, #4296) [Link] (41 responses) That's all well and good for you, but please try to understand the constraints and time pressures I'm under. People are waiting on this thing to be done. There's a real demonstrated need, and my to-do list stretches out for years. And on top of that, Linus's previous "I'm thinking about removing bcachefs from the kernel" - repeatedly - cost me an engineer. So no, "just do whatever makes Linus happy" is not an option, for those reasons and others I've gone into elsewhere. [Reply to this comment] Sad if true Posted Jun 28, 2025 22:11 UTC (Sat) by raven667 (subscriber, #5198) [Link] > please try to understand the constraints and time pressures I'm under I've seen you say this a lot in different ways, so at the risk of repeating myself, ISTM this time pressure is a self-inflicted constraint that doesn't apply to the rest of the Linux kernel development team, so they are under no obligation to respect it and subjugate themselves to it.
The screaming voice in your head saying "this needs to get done, THIS NEEDS TO GET DONE" isn't shared by others, and it seems like something you will need to bring to heel to be able to work productively with others. The other developers are aware that people use the kernel and depend on it as a matter of life and death; they shoulder that burden every day, maybe so well that it looks as if they don't care, but they are able to not be overwhelmed by that awareness and can work through an orderly release process without every eat-your-data issue being a panicked emergency. > So no, "just do whatever makes Linus happy" is not an option, for those reasons and others I've gone into elsewhere. You clearly feel that way, but this seems to be excessively black-and-white thinking, and others are not convinced to feel the same way. [Reply to this comment] Business decisions Posted Jun 28, 2025 22:50 UTC (Sat) by neilbrown (subscriber, #359) [Link] (22 responses) If you are choosing to use Linus' tree as part of the path for your product to your market, but you are not willing to pay the toll, then that is a business decision. There are other routes to market. DKMS is one. Negotiating with a distro is another. Building your own distro is the path Oracle took.... If you are under a lot of "constraints and time pressures" then maybe you have sold something you cannot deliver? It would be foolish for a farmer to complain that the trains aren't fast enough to get their perishable goods to market on the other side of the continent. It is equally foolish to complain that a well established release process for Linux that was in place before you started on bcachefs is somehow the cause of you not being able to meet your commitments. If you didn't do the necessary logistics analysis, that isn't our problem. [Reply to this comment] Business decisions Posted Jun 28, 2025 23:05 UTC (Sat) by pbonzini (subscriber, #60935) [Link] I agree.
You have to work with the constraints of the community, and claiming that you're different from everyone else is unlikely to fly. [Reply to this comment] Business decisions Posted Jun 29, 2025 0:01 UTC (Sun) by koverstreet (supporter, #4296) [Link] (20 responses) > If you are under a lot of "constraints and time pressures" then maybe you have sold something you cannot deliver? It would be foolish for a farmer to complain that the trains aren't fast enough to get their perishable goods to market on the other side of the continent. Have I, though? Let's look at the data; there have been a lot of people prognosticating like this without doing so. - Two genuine "eat your filesystem" level bugs in the last two years, both quickly fixed, with code changes to ensure those classes of bugs can't occur again and new repair code so that users who were able to act quickly didn't lose data. For a filesystem still marked as experimental, I'd say we're well on our way to delivering what we advertise. - "Filesystem is offline and repair code needs fixing" bugs have been falling off a cliff: aside from the subvol deletion bug, which generated a lot of patches, I'm spending hardly any time on those since ~6.15. - Crashes, emergency read-only bugs: also declining fast; reports were heavily down with 6.15 and again in 6.16. - Regressions in mainline? We're doing exceptionally well there. - The rate of bug reports has been dropping drastically, in frequency and severity, especially since 6.14. IOW: shipping a new filesystem is a massive undertaking, but by all the metrics things are going well. Now, if things are going well on the bcachefs side of things, don't you think it might be time to ask "do these process disputes make sense?" And I don't see that bcachefs has been violating the "well established release process". Bug fixes in rc kernels have always been ok, and that's what I'm doing (and 90% of controversies have always been over patches that were clear bugfixes).
With the latest one, we're arguing over the semantics of "what is a bugfix for a filesystem", and this seems like an absurd level of drama over that. IOW, I see this problem as enforcement of the process being unable to know what to make of a project that's simply moving this fast, and stabilizing rapidly (thus generating a lot of bugfixes). There seems to be a real cognitive disconnect here. In the last email I received from Linus, the thrust of which was "but I push back on pull requests all the time", I saw more evidence of this: everything he linked me was him being perfectly polite and reasonable, whereas in the case of bcachefs it's been a steady stream of "oh hell no" and "I'm thinking about removing bcachefs from the kernel". !?!?!?!?!?!? [Reply to this comment] Business decisions Posted Jun 29, 2025 2:22 UTC (Sun) by hvd (guest, #128680) [Link] (9 responses) > And I don't see that bcachefs has been violating the "well established release process". Bug fixes in rc kernels have always been ok, and that's what I'm doing (and 90% of controversies have always been over patches that were clear bugfixes). > With the latest one, we're arguing over the semantics of "what is a bugfix for a filesystem", and this seems like an absurd level of drama over that. I have absolutely no stake in this, I'm not a bcachefs user, and I don't know the previous controversies, but the last one is the only one linked here and to me it seems like an extremely obvious case of something that is not a bugfix. Making it easier for users to recover from other bugs is admirable, but it's not fixing the bug, and if it's important to other kernel developers to only merge bugfixes, I completely understand the hard pushback you got. The user in this case would have been able to recover their data with an external kernel module and/or a patched kernel, and once their file system was recovered, revert to the unpatched in-kernel module with zero issues, right?
If so, this really is not a hill you should want to die on. I'm glad you're responding positively to the suggestion here to be stricter with what goes into the kernel and giving users a way to build a separate module for all the latest features. That looks like the obvious way to avoid conflicts like this and still give users what you want to give them, and I hope it is not too late for that to allow bcachefs to remain in the kernel. [Reply to this comment] Business decisions Posted Jun 29, 2025 2:27 UTC (Sun) by koverstreet (supporter, #4296) [Link] (8 responses) I've said it elsewhere, and I'll say it again: the only reason to use a filesystem is if it keeps your data, and if it doesn't, we're failing at the core purpose of being a filesystem. So yes, the definition of "what is a bugfix" does need to be revised for filesystems. [Reply to this comment] Business decisions Posted Jun 29, 2025 6:18 UTC (Sun) by zdzichu (subscriber, #17118) [Link] (4 responses) This is a classic red flag. "We are special" is a very harmful approach, indicating a tendency to ignore established processes. I thought we, the software industry, eliminated all such thinking during the DevOps evolution a decade ago. [Reply to this comment] Business decisions Posted Jun 29, 2025 7:56 UTC (Sun) by marcH (subscriber, #57642) [Link] > > > but the last one is the only one linked here and to me it seems like an extremely obvious case of something that is not a bugfix. Making it easier for users to recover from other bugs is admirable, but it's not fixing the bug, and if it's important to other kernel developers to only merge bugfixes, Thank you! I was too lazy to read the mailing list discussion and had been searching for this critical piece of information across all comments here. > > So yes, the definition of "what is a bugfix" does need to be revised for filesystems.
The definition of a "bugfix" is well established and understood and is never going to change; that ship sailed a long time ago. What you meant is "the rules should have an exception for filesystems". > This is a classic red flag. "We are special" is a very harmful approach, indicating a tendency to ignore established processes. I thought we, the software industry, eliminated all such thinking during the DevOps evolution a decade ago. 100% [Reply to this comment] Business decisions Posted Jul 3, 2025 20:58 UTC (Thu) by skx (subscriber, #14652) [Link] (2 responses) The last time I remember "This filesystem is special" was reiserfs. The reworking there was also highly controversial, and there were strong personalities and much talking past each other on both sides of the debate. [Reply to this comment] Business decisions Posted Jul 4, 2025 5:23 UTC (Fri) by zdzichu (subscriber, #17118) [Link] (1 response) To be clear, anything claiming to be special is counter-productive, not only filesystems. [Reply to this comment] Business decisions Posted Jul 4, 2025 10:37 UTC (Fri) by taladar (subscriber, #68407) [Link] I wouldn't say the problem is always on the side of the one claiming to be special; it can often also indicate a deeply broken culture in the rest of the system. E.g., most social change in history started out as a rebellion against the established norms, and most revolutionary technical discoveries and scientific theories had to fight uphill battles against the majority in their field too. All that can really be said is that someone feeling the need to claim that they are special means that a problem exists on one side or the other. [Reply to this comment] Business decisions Posted Jun 29, 2025 19:50 UTC (Sun) by jmalcolm (subscriber, #8876) [Link] Saying the rules have to be different for filesystems is disrespectful to other systems. The rest of Linux did not stop mattering when bcachefs was merged. No wonder Linus has questioned that decision.
Yes, as a user, I NEED things to work. Yes, a filesystem that loses my data is a problem. But an operating system that crashes mid-job is a problem. A process that causes my data to be wrong is perhaps the biggest problem of all. Then again, maybe that is a security issue that causes my data to be stolen. Data loss is something there are strategies I can employ to deal with. Especially, if I may be so bold, if I am going to use an "experimental" filesystem. Some of these other issues are harder to protect against and so, again, maybe even bigger problems. So, putting these other systems at risk because filesystems are more important is not massively convincing for me, as a user and a bcachefs user at that. Don't get me wrong. I want the guy writing my filesystem to have the mindset that data can never, ever, ever be lost. Please tell yourself every day that the filesystem "has only one job" when prioritizing robustness over features and even performance. I am cheering you on for that attitude. I am NOT taking the position that it is ok for a filesystem to lose data. However, the "my shit matters more than your shit" attitude is crazy. The Linux process may not be perfect, but it has evolved into its current form for valid reasons. The idea that nobody has tried to do anything as important as you are doing is crazy. Rules need to apply to everyone. And the time for talking about whether rules should change is NOT at the moment you are trying to break them. Follow the rules. Evolve the process. But, in the meantime, follow the rules. "There has to be some law". [Reply to this comment] Business decisions Posted Jun 30, 2025 4:31 UTC (Mon) by jmalcolm (subscriber, #8876) [Link] > the only reason to use a filesystem is if it keeps your data, and if it doesn't, we're failing at the core purpose of being a filesystem. I agree with this. However.... I have removed bcachefs from exactly one computer that I use.
It is a laptop that I use to teach a few college courses, and I was using bcachefs as the root filesystem (quite happily). A recent bug in bcachefs caused Distrobox to stop working. Fingers may be pointed, but I have 4 systems with bcachefs where Distrobox stopped working, and other systems running otherwise identical operating systems but with xfs or btrfs where Distrobox continues to work. For me, this is a bcachefs issue. This bug made this computer completely unusable for its intended purpose. There are many programs that I need to run where Distrobox is the only solution I know. Without going into all the reasons, I will point out that the host is musl based, for example. I did not lose any data, but it did turn my system into a paperweight. Enough so that I completely wiped it and started over with xfs. I had classes. Things needed to work that day. Moving to a different filesystem was easier for me than trying to compile a patched bcachefs kernel that also had the other patches this machine would need. It is also worth noting that DKMS does not work on this system, so that would not be a solution for me. I have backups of my important data. Data loss on disk would have been a much less impactful problem "for me" than the Distrobox issue (on this particular system). I have no problem with this at all. Bcachefs is experimental. Stuff is going to happen and I need to own that. It is my responsibility that I used it on a system that has to work. This was not really the kind of issue I expected, and so I was not prepared for it. I need to own that too. I have an identically impacted system and I simply will not use Distrobox on it until the issue is solved. It does not matter there. [Reply to this comment] Business decisions Posted Jun 30, 2025 13:44 UTC (Mon) by nim-nim (subscriber, #34454) [Link] Hi Kent, First of all, let me say you deserve all the credit for pushing a new filesystem.
This is an incredibly high bar you set yourself, given the amount of work that went into existing filesystems for decades. Second, you have to accept that the reach given by inclusion in Linus' tree is not due solely to Linus' technical prowess. It is due to the fact that he has the authority to force companies and devs to follow a common set of rules and cadence, no matter the wealth of the company or the size of each contributor's ego. And companies and individuals trust Linus to enforce those rules and that cadence. It is probably his main job today. If you cannot organise your commit flow around this cadence, bcachefs' upstreaming will have failed no matter its technical merits. The sole alternative then will be to find a big and wealthy corporation willing to include your work in its products regardless of its upstream status. No one else will touch your filesystem with a ten-foot pole. We've all been bitten hard by "interesting" out-of-tree projects. There are too many highly technical components in a 2025 kernel to bother with splinters that cannot play by the common rules. The situation was very different 20 years ago. [Reply to this comment] Business decisions Posted Jun 29, 2025 5:17 UTC (Sun) by rolexhamster (guest, #158445) [Link] !?!?!?!?!?!? Yes, that's all well and good, except that you're focusing only on specific trees at the (detrimental) cost of not seeing the proverbial forest. [Reply to this comment] Business decisions Posted Jun 29, 2025 21:44 UTC (Sun) by neilbrown (subscriber, #359) [Link] (7 responses) I said: > maybe you have sold something you cannot deliver? and you said: > Have I though? Let's look at the data, and then went on to list lots of development statistics that suggest bcachefs is developing nicely - and I don't contest any of that. Then you said: > IOW, I see this problem as enforcement of the process being unable to know what to make of a project that's simply moving this fast, and now we are getting closer to the real issue.
Yes - it is about speed. And the problem really is that you are relying on the slow train to get your perishable goods to market. The thing you appear to be "selling" that I don't think you can "deliver" is "a filesystem with quick turn-around time for new features delivered through Linus' kernel.org tree". Linus' tree is deliberately focused on a turn-around of 2 months (approx). If a feature is "ready" before version X is released, then it is free to land in version X+1. This is SO much better than what we had in the 2.3 and 2.5 days, but it apparently isn't sufficient for your vision for bcachefs. That doesn't mean your vision is wrong or bad. It does mean you need a different path for realising your vision. If you want to get features to customers faster, you can't use Linus' tree. Sorry. That isn't what it is for. That is not a service we provide. Please don't try to do that. If missing a release, and having to wait 2 months for the next release, is more than a minor inconvenience, then you are doing something wrong. [Reply to this comment] Business decisions Posted Jun 30, 2025 18:36 UTC (Mon) by koverstreet (supporter, #4296) [Link] (6 responses) > Yes - it is about speed. And the problem really is that you are relying on the slow train to get your perishable goods to market. > The thing you appear to be "selling" that I don't think you can "deliver" is "a filesystem with quick turn-around time for new features delivered through Linus' kernel.org tree". Neil, we're not talking about features: it was a bugfix, because a filesystem that doesn't store and maintain your data is not working as advertised. Everyone has jumped on that because I labelled it as such in the original pull request - but that was only so that users would know about it. Those pull request messages are just as much for users (lots of people read them) as they are for Linus. (Perhaps I should have called it something else? New data recovery method? Knob?
But we're just arguing semantics now). There are situations in life where a thing can fall into more than one category, and filesystems are big, complicated beasts, so we should not be surprised when that happens from time to time :) Secondly, this isn't just about whether I would be able to deliver; I'm fairly confident that this would apply to anyone trying to ship a new modern filesystem. The disconnect and lack of focus on making sure it works properly for users is a big part of the reason why btrfs stabilization stretched out as long as it did, and I don't consider doing it that way an option for bcachefs - and I don't think anyone else would either. [Reply to this comment] Business decisions Posted Jul 1, 2025 1:30 UTC (Tue) by marcH (subscriber, #57642) [Link] (1 response) > (Perhaps I should have called it something else? New data recovery method? Knob? But we're just arguing semantics now). A "bug" is surprisingly easy to define: it is something that does not work as expected. That's it. Sometimes there are disagreements on the expectations (as in "bug or feature?"), but that does not seem to be relevant here. A "bugfix" is a code change that makes the bug go away and the software behave as expected. The vast majority of the time, bug fixes are _small_ changes, in order to reduce risk and the review bandwidth required, and to avoid the infamous "bug fixing loop" where each fix introduces a new, different regression. We have all been there, especially when coverage is lacking. When no small fix is available, it's not unusual to (temporarily) disable the feature in the imminent release. Depending on the feature and the severity of the bug, it may be preferable to miss the corresponding feature entirely. I finally took the time to find and look at the commit. It is not small, and neither the code change nor the commit message look like a bug fix. According to the commit message itself, it is a new disaster recovery _feature_.
That looks great and very useful, but it still does not qualify as a bug fix. The code for this new rewind feature looks plenty long enough to introduce bugs itself (I am not saying it does, just that this is a clear possibility), which could in theory enter "the loop" and delay a release candidate, and that's why it's against the rules. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/... > Hopefully, this feature will be a one-off thing that's never used again: this was implemented for recovering from the "vfs i_nlink 0 -> subvol deletion" bug, and that bug was unusually disastrous and additional safeguards have since been implemented. But if it does turn out that we need this more in the future, I'll have to ... > Neil, we're not talking about features: it was a bugfix, because a filesystem that doesn't store and maintain your data is not working as advertised. Looking at commit messages, this looks like the actual bug fix: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/... PS: I really wonder about your priorities. [Reply to this comment] Business decisions Posted Jul 1, 2025 4:50 UTC (Tue) by raven667 (subscriber, #5198) [Link] > PS: I really wonder about your priorities. That is just paying attention to the difference, sometimes a wide difference, between people's stated priorities and goals and the actual outcomes of their behavior, because if they are capable of understanding the relationship of cause and effect, the outcomes are a stronger indicator of their preferences than whatever they say. No one can say what motivates another person for certain because we aren't in their head (and maybe they don't even know themselves), but we can observe external behavior and make predictions of consequences pretty well; humans can naturally see the future in that way.
[Reply to this comment] Business decisions Posted Jul 1, 2025 19:21 UTC (Tue) by Jordan_U (subscriber, #93907) [ Link] (3 responses) Maybe no new general purpose filesystems will make it into Linux in the next 10 years. Maybe one will, but it will be backed by a bunch of giant companies that can hire teams of developers. Maybe bcachefs will never stabilize because it will never have enough developers and other resources to do so. That would be tragic, for you and for many many others. Maybe someone will champion the cause of changing upstream development practices so that a new, non-big-tech-backed filesystem for linux can be developed to stability. It sounds like you don't have the time to be that champion, which is perfectly reasonable. But there are a lot of important subsystems / major projects that died away without ever getting to mainline. Almost certainly an order of magnitude or two more than actually did. There have even been (a lot of) big-tech pushes that never made it into mainline, because they couldn't fit within the mainline development expectations. I guess my point is that, even if bcachefs as a project dies "because of" shortcomings you see in the mainline development practices, I hope you're able to mourn that loss and move on with sadness, but not with anger. Many people predicted Linus' responses to many of your pull requests. I kind of hope that you were able to predict most of them too. Linus (probably) doesn't hate you personally. He's definitely motivated to do what's best for the project, its maintainers, and its users. If you keep making choices that you know, or should know, will garner a negative response then you're going to keep getting negative responses. And that's going to lead to unproductive anger and resentment from all sides. Maybe stop trying to do things that break the "rules" as Linus will predictably interpret them. If you can't do that and make good on your promises to users, then you have a difficult decision to make. 
I guess I'm recommending that, to avoid some anger on your part, and maybe improve your own health and wellbeing, you can and should choose to not delay hard choices and instead face them head on. And by "face them head on" I mean: If you've truly determined that bcachefs can't move forward without you breaking rules, as Linus would predictably interpret them, then end the project. I am pretty sure there's at least one person who would be willing to preview pull requests and tell you "Linus isn't going to accept that". Then you could choose to change your pull request to be acceptable to Linus, or, again, make the bigger (and very difficult) decision about what that necessarily means for the bcachefs project. I truly feel for you and the incredibly tough decisions you need to make. I will probably never put 1% of the work you've done just for bcachefs into any project I work on. I don't pretend to know what putting so much of your life into something is like. I wish you the best. [Reply to this comment] Business decisions Posted Jul 1, 2025 20:37 UTC (Tue) by koverstreet (supporter, #4296) [Link] (2 responses) > If you've truly determined that bcachefs can't move forward without you breaking rules, as Linus would predictably interpret them, then end the project Don't you think that's a bit... dramatic? [Reply to this comment] Business decisions Posted Jul 2, 2025 17:18 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (1 response) It is, but it makes a point. Ultimately the future of bcachefs is a lot more hazy if it's not upstream. So if you agree that it's dramatic, maybe you also agree that this isn't "a hill to die for", to quote my earlier comment, and it's worth following the rules that the upstream community has set for itself. Social problems don't always have technical solutions, but technical solutions can help.
The dual-tier tree could be one, as it helps you probe how many adopters are okay with the content from Linus's tree and how many want the bleeding edge. [Reply to this comment] Business decisions Posted Jul 4, 2025 15:09 UTC (Fri) by koverstreet (supporter, #4296) [Link] No, I think bcachefs will do fine if we get kicked out of the kernel. a) there's every reason to think it would only be temporary: this happened before with Lustre, and they're getting ready to go back in. Hopefully the next time around there would be a real back-and-forth conversation about process. This isn't ZFS - that one is outside the kernel permanently because of licensing. b) Being outside the kernel opens up interesting possibilities. Maybe we finish the fuse interface, and fuse performance gets fixed - there is no inherent reason running in userspace via fuse has to be a noticeable performance hit for buffered IO. There'd be a lot of work to get to that point, but multiple people are working on fuse performance right now; it's been a very long-standing problem with a lot of interest. That would be very cool. With more work, a filesystem outside the kernel might even be faster. Applications that need O_DIRECT are a bit of a problem, but solvable with L4-style IPC (and maybe an LD_PRELOAD so applications don't have to be converted). But the block layer is heavier than it needs to be, even with blk-mq - it retains a lot of legacy baggage, and we could easily just start talking to NVME devices directly. So who knows. But this is all pie-in-the-sky stuff, and for the next two years I'm going to be staying focused on bcachefs itself - there's still a lot of work to be done.
More online fsck, more self-healing, finishing erasure coding, performance work, better management and monitoring (especially with respect to degraded data), more rebalance improvements (need to fix a bug where we still spin if you try to stuff more data into your background_target than it can hold), better allocator policy for large numbers of devices, failure domains, > 64 device support (and maybe try for > 256) - there's a lot of cool stuff lined up for after stabilization finishes, and judging from the current rate of bug reports we're mostly there. I try not to worry too much about the big-picture process stuff. As long as I can stay focused on the code, the rest will sort itself out eventually. [Reply to this comment] Business decisions Posted Jun 30, 2025 10:15 UTC (Mon) by paulj (subscriber, #341) [Link] Someone else here with no immediate stake in your work. However, I did think bcache was very interesting, and bcachefs looks to have a lot of promise - I would love to see a modern, stable, well-designed, generally distributable COW fs. I'm sure many others would too. I can also feel your frustrations. That said, Neil's advice is good. If you can do your development on something that is easy to install via DKMS, then that will make your work very accessible to many, many distro users. If bcachefs works well, it will pick up more and more users that way (DKMS is how many ZFS users install it). And that will build the momentum you need. With a user base, in time, at least some of the upstreaming friction will go away - it will be clear to upstream that the fs has users, the changes have already had testing via DKMS, etc. Btw, do you have a Monero address? Some people reading may well send you a little every now and then. [Reply to this comment] Sad if true Posted Jun 29, 2025 2:55 UTC (Sun) by HenrikH (subscriber, #31152) [Link] (5 responses) > but please try to understand the constraints and time pressures I'm under.
And how much more constraint and time pressure will you be under if bcachefs is thrown out of the kernel? You are arguing like you are choosing between a rock and a hard place, but you are in practice right now choosing between a rock and a nuke. Your choices on how to try and bend the rules have already cost you an engineer (going by your own words here); what do you imagine the cost would be if you continue down this path? [Reply to this comment] Sad if true Posted Jun 29, 2025 3:04 UTC (Sun) by koverstreet (supporter, #4296) [Link] (4 responses) > Your choices on how to try and bend the rules have already cost you an engineer No, I was quite explicit about that. Making threats like that would be considered abusive behavior coming from your boss, of the kind that would typically get HR involved. All of the previous process issues could have been handled with drastically less drama. > And how much more constraint and time pressure will you be under if bcachefs is thrown out of the kernel? You are arguing like you are choosing between a rock and a hard place but you are in practice right now choosing between a rock and a nuke. And it's because of things like that that I have to weigh "ok, this is going to slow adoption, and it's going to make the funding situation more precarious; but it will end the drama that was driving away contributors, and I'll get to write code without the emotional rollercoaster that came with every other pull request". Being upstream has meant a massive amount of distractions, from things that really could have been handled better. That's meant a lot of time away from writing code and supporting users. That's my job, not guessing what's going to make the Head Penguin get off my back about the next bugfix pull request (when I have users waiting on me to work on the next thing). Does that clarify how the calculation looks to me?
:) [Reply to this comment] Sad if true Posted Jun 29, 2025 5:31 UTC (Sun) by rolexhamster (guest, #158445) [Link] (1 response) Being upstream has meant a massive amount of distractions, from things that really could have been handled better. That's meant a lot of time away from writing code and supporting users. That's my job, not guessing what's going to make the Head Penguin get off my back about the next bugfix pull request (when I have users waiting on me to work on the next thing). This kind of dismissive attitude is both arrogant and counterproductive. By putting bcachefs into the kernel you have implicitly agreed that other people will be part of the development and integration process. This includes the "Head Penguin", in your parlance. The users can wait, or if they're super impatient, directly use your development tree. [Reply to this comment] Sad if true Posted Jun 30, 2025 3:56 UTC (Mon) by jmalcolm (subscriber, #8876) [Link] Well said. Agreed. [Reply to this comment] Sad if true Posted Jun 29, 2025 6:14 UTC (Sun) by HenrikH (subscriber, #31152) [Link] > No, I was quite explicit about that. Making threats like that would be considered abusive behavior coming from your boss, of the kind that would typically get HR involved. All of the previous process issues could have been handled with drastically less drama. That is not a threat; that is simply a summary of what happened. Your constant choices to try and bend the rules even though you had been warned before were exactly what led to the situation where bcachefs was left out for one version, which (according to you) also was the direct reason for losing an engineer. Try to sell a product in a country and refuse to follow the rules for that type of product in that country, and your product will be pulled from the shelves; it does not matter how much you then complain about it being "abusive behaviour", your product is still pulled from the shelves.
The very fact that you even decided to argue back demonstrates that you don't seem to understand the choices that you have here. [Reply to this comment] Sad if true Posted Jun 29, 2025 7:41 UTC (Sun) by marcH (subscriber, #57642) [Link] > Being upstream has meant a massive amount of distractions, from things that really could have been handled better. That meant a lot of time away from writing code and supporting users. Then maybe something like DKMS is indeed better. > That's my job, not guessing what's going to make the Head Penguin get off my back about the next bugfix pull request (when I have users waiting on me to work on the next thing). You keep mentioning _what_ is and what isn't your job, but you never seem to touch on _who_ decides what your job is and isn't. Ignoring all technical issues for a short moment, the way you keep acting and talking feels like you don't really realize that targeting upstream implies that some of the "penguins" are actually your superiors (whether they are right or wrong). You are of course always free to "resign", stop that upstreaming "job" and "part ways". That already happens a lot in the kernel (for very diverse reasons). Thanks to open source, you could still develop without respecting any authority. That sort of branching comes, of course, at a price: the price of complete freedom. [Reply to this comment] Time pressure Posted Jun 30, 2025 4:23 UTC (Mon) by wtarreau (subscriber, #51152) [Link] (9 responses) > but please try to understand the constraints and time pressures I'm under That's precisely what the current development model is trying to prevent: last-minute rushed merges that "surely cannot go wrong". We all have other things to do and cannot work faster, but there's a period for developing and a period for fixing our mistakes. You may think that your PRs are bug-free, but ultimately you will find bugs there, very likely in code that was merged late in an emergency.
New features get merged slowly in all subsystems and rc>1 is for fixes only; there's no reason bcachefs would need to be an exception to this. Anything not merged in time is just too recent to be merged. Very likely in the next 2.5 months you'll write fixes for what you wanted to merge now. You definitely need to have topic branches, and you'll see that your users who "absolutely need online recovery" will suddenly learn how to build a topic branch. For other ones, you could possibly implement a userland tool like DemiMarie suggested. It's true that Linus can sometimes be wrong on specific points; sometimes he complains about things that stem from a misunderstanding that definitely requires some fixing to avoid others falling into the same trap. But when it comes to the process of getting code merged in a good state, he's generally right and it's *his* area of expertise. Yes, he can be rude, but he also knows that this costs him a bit (reputation etc.) and he only does this when he knows it's valuable to do it, i.e. he considers the other person smart enough to understand his point and is trying a last chance to change their mind. You may disagree with him, but then you need to explain to him clearly why you're in a different situation from all other maintainers to try to change his mind (hint: you're very likely the one having to adhere to his approach, like all the other ones). I really think you're putting too much pressure on yourself due to your deployed user base. You cannot save all the data in the world, and people *will* lose their data by taking the risk of using experimental code. Your goal is not to rescue at any price all those who agreed to take that risk, but to make sure that ultimately those who adopt your code in good faith, once it's no longer experimental, almost never face such problems. And this will be in part thanks to those who took risks early. The time you invest trying to save victims now is not well invested in helping the other camp later.
Granted, bug reports are super useful. You only need to develop a relationship of trust with your current users, and that does not necessarily come from rushing features; it could instead come from offering them other solutions (topic branches and userland tools). IMHO you need to create such topic branches, and let Linus know that from now on you'll only send fixes for rc>1 and that in exchange for this, the next merge window will contain many more changes that will have accumulated in the topic branches, and that will likely need to receive fixes in the subsequent rcs. [Reply to this comment] Time pressure Posted Jun 30, 2025 10:34 UTC (Mon) by farnz (subscriber, #17727) [Link] (8 responses) And just to expand on that: other subsystems I've used with experimental components have run at least three parallel branches: 1. for-next, which contains the stuff they want Linus to pull next merge window. This is where feature work that's ready to deploy accumulates, and is the tree Kent's asking Linus to pull during RCs now. 2. for-rc, which is the minimum essential stuff for the current RC, and normally only contains regression fixes, or cases where a bug has come to light that's critically destructive if you hit it. This is what Linus wants. 3. One or more topic branches for code under development. You encourage your users who hit bugs to run for-next, not Linus's tree. You merge Linus's tree into all your branches when Linus makes a release (merge the release tag). And you accept that users who choose to run Linus's kernel instead of your for-next will miss out on features that you've already written and expect them to need - for example, they may have to run offline fsck to recover their filesystem after corruption, rather than having online recovery work, or they may have to leave the FS unmounted until they have built for-next. [Reply to this comment] Time pressure Posted Jul 1, 2025 15:55 UTC (Tue) by garloff (subscriber, #319) [Link] (2 responses) Good advice! Others (e.g.
Neil) have given very good advice as well. I hope you can take it, Kent. I've had the privilege of working with brilliant engineers. Some of them somewhat similar to Kent. It was my job (as manager) to ensure they worked well with the upstream processes and community. That was hard and painful. For the involved engineers and also for me. It paid off - we have, e.g., done the right things in the memory subsystem 20 years ago, so yes, it paid off. (Feel free to research if you are not as old as I am. If you are, it's probably obvious.) Learning to listen and truly understanding the perspectives of others, especially of other experienced, smart people, is a key skill for collaborative work (such as open source engineering), unfortunately often in short supply. I've tried to learn this and I have occasionally had some success in helping others to learn. Maybe you can as well, Kent. [Reply to this comment] Time pressure Posted Jul 1, 2025 15:56 UTC (Tue) by garloff (subscriber, #319) [Link] (1 response) Sidenote to self: Probably I should have highlighted Paolo as well for giving good advice... [Reply to this comment] Time pressure Posted Jul 2, 2025 17:19 UTC (Wed) by pbonzini (subscriber, #60935) [Link] No problem :)) [Reply to this comment] The bcachefs development process Posted Jul 1, 2025 19:11 UTC (Tue) by Tobu (subscriber, #24111) [Link] (4 responses) What you suggest is close to the current state of bcachefs branches (Gitweb link). Kent's workflow is likely built around git rebase --update-refs. * bcachefs-for-upstream is the branch Linus pulls from. It is focused on bug fixes, except once the merge window opens, when it takes the commits in for-next. Linus's interpretation of bug fixes can differ from Kent's, as here, but it really has much less work than what the other branches contain, and has been through multiple rounds of stabilisation. * for-next is what you expect; the one the external CI bots are pointed at.
This is bcachefs-for-upstream, plus work that's ready for the next merge window. * master is for-next, plus a few tweaks external to fs/bcachefs/ that haven't percolated to other subsystems yet. This is the branch users would run if they want something newer than Linus's tree. One thing those users should consider, however, is that, along with the other branches, it is often rebased on an early -rc once per cycle. This makes sense inasmuch as when the merge window opens there are often dependencies on other subsystem work (the overlayfs thing will most likely trigger this in the next merge window). But it does mean that (in most recent cycles) this is for users who are prepared to build and run -rc kernels (and re-merge with newer -rcs as necessary and as Linus publishes them). Because this is the branch eager users run, it is actually pretty well tested, and can be recommended for users who are prepared to report the occasional bug. * bcachefs-testing is master, plus newer commits that are actively being worked on. This is the one where the bcachefs CI (built on ktest and xfstests) does most of the work, and the one where patches land (most from Kent, but it is also where stuff from the mailing list is integrated). Occasionally someone will be pointed there, or cherry-pick a commit from it, often because they've hit a case that requires extra diagnostics or a tweak to the self-check logic. This process has proven itself, in that Kent works productively, the branches are tested both automatically and in the wild, and patches are well-tested by the time they are sent to Linus. It would require adjustments if bcachefs/master were made more widely available; for the average user it would be better to propose a branch based on the latest stable dot-release. For the current process, having all branches linear and in a state close to what Linus will merge has reduced the risk of sending him regressions, despite the development activity.
Topic branches would introduce the risk of less well-tested interactions if someone was the first to pick a combination of topics; they would also clash with the way the data format evolves linearly (through upgrade passes, though with format stabilisation new passes are less likely to be introduced). [Reply to this comment] The bcachefs development process Posted Jul 2, 2025 6:21 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (3 responses) Then there are already topic branches (testing/master/for-next etc. might match the development model) and that's fine, but in this case there's no reason for pushing certain things to -rc instead of continuing to develop them in the development branch. With that said, it's not very clear to me, in the current situation, where future development is supposed to happen while fixes related to the latest submissions are being made. That might be the distinction that's missing. [Reply to this comment] The bcachefs development process Posted Jul 2, 2025 14:53 UTC (Wed) by koverstreet (supporter, #4296) [Link] (2 responses) The patch in question - journal rewind - is already being used by myself and another user for a completely different bug - the user had a (replicated) filesystem with a 400GB postgres database on a flaky USB controller. (What will they think of next?) Unfortunately the issue wasn't detected early enough, so it's making it hard to tell what the root cause was (but flaky hardware makes it sound like a cascade failure - cool). Even so, about a terabyte of data is being recovered that wouldn't have been otherwise. So if it's being used successfully for two different bugs in two weeks, getting it in was the right call. I've learned to trust my intuition.
[Reply to this comment] The bcachefs development process Posted Jul 2, 2025 16:21 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (1 response) I understand your feeling, I'm also always torn between merging/not merging my stuff in my projects, but admittedly here it's a new feature, and the fact that it saved two people's data in two weeks doesn't exclude the risk of causing trouble (even if less important) to others (or even just developers/testers), and that's precisely the point of the development process. In haproxy, for example, we're much less strict than the kernel regarding the merge window (much smaller team, so the -next branch is super small), so we'll code roughly for 4 months out of 6 and try hard to mostly do bug fixes in the last two months. Guess what? The vast majority of post-release bugs are caused by apparently harmless changes from the last month, that used to work perfectly when and where tested. The cost of your cool new recovery feature remaining unavailable to numerous users until the next merge window is small (it was missing before it was developed, after all), and that alone would be sufficient to justify merging that cool stuff only in the next release, possibly with even more testing, positive feedback and bug fixes that would have had time to accumulate by then. And it would also avoid the risk of creating precedents where everyone starts to include more important changes after the merge window. [Reply to this comment] The bcachefs development process Posted Jul 2, 2025 22:10 UTC (Wed) by koverstreet (supporter, #4296) [Link] No, we're not talking about a "cool new feature". I understand that you're speaking from personal experience, but this is filesystem development we're talking about, where we've got a massive investment in QA processes - and we rely on those heavily, especially because we need to make calls like this on a regular basis.
It's not uncommon to have patches that justify going out quickly, and this patch was not outside the norm in complexity or likelihood of breakage; but it was outside the norm in terms of potential for risk mitigation. We do a lot of risk mitigation in the filesystem world :) I'm afraid your experience isn't likely to be comparable - I don't see you weighing the costs and benefits or talking about those QA processes, and those are critical to decisions like this one. [Reply to this comment] Sad if true Posted Jul 1, 2025 8:00 UTC (Tue) by Nikratio (subscriber, #71966) [Link] > Please try to understand the constraints and time pressures I'm under. Did you explain somewhere what these constraints and pressures are, and where they come from? Just curious. [Reply to this comment] Dual tier? Posted Jun 28, 2025 8:06 UTC (Sat) by callegar (guest, #16148) [Link] (8 responses) Wouldn't it make sense, for the time bcachefs is experimental, to have bcachefs both part of the kernel /and/ shipped as a DKMS module? This would enable being very conservative regarding what goes in the kernel and the process by which it goes in the kernel, at the cost of leaving the main kernel a bit lagging, but would retain the advantages of having bcachefs in tree, including other bits of kernel development having to consider not breaking it. At the same time it would enable those who want the latest and shiniest to rely on the DKMS module. As long as there are no more on-disk format changes, it should be possible to swap one for the other. Would it take too much energy to maintain? [Reply to this comment] Dual tier? Posted Jun 28, 2025 8:09 UTC (Sat) by DemiMarie (subscriber, #164188) [Link] (2 responses) I think that would make a lot of sense, actually. It would let users test the latest development branch of bcachefs without having to rebuild a whole kernel. [Reply to this comment] Dual tier?
Posted Jun 28, 2025 11:30 UTC (Sat) by rincebrain (subscriber, #69638) [Link] (1 response) That can be quite messy if it's anything but all in one module, though - various network and storage HBA drivers have offered out of tree versions for some time, but if you're not careful about it, you can wind up with weird cases like DKMS only rebuilding some of the modules and winding up in a frankenmodule state, or the automated tooling deciding it can just use the older kernel's DKMS build on the newer one, incorrectly. And if it's required for the happy path to / mounting, well, your debugging experience can be quite bad, especially if you didn't know to set it up a priori... [Reply to this comment] Dual tier? Posted Jun 28, 2025 13:57 UTC (Sat) by pbonzini (subscriber, #60935) [Link] If you want to avoid that, just build it yourself instead of using DKMS. [Reply to this comment] Dual tier? Posted Jun 28, 2025 14:34 UTC (Sat) by koverstreet (supporter, #4296) [Link] (3 responses) Actually, that might not be a bad idea. [Reply to this comment] Dual tier? Posted Jun 28, 2025 15:48 UTC (Sat) by pbonzini (subscriber, #60935) [Link] Yep, see my other message on how that might happen, with an external-module tree that constantly rebases and topic branch merges into both the external-module tree and the for-linus tree. Now that the format is stable, there shouldn't be big hurdles to doing this. [Reply to this comment] Dual tier? Posted Jun 29, 2025 22:24 UTC (Sun) by linuxrocks123 (subscriber, #34648) [Link] (1 response) If the upstream kernel community is willing to work with you that way, then, yes, that may be your best approach. BUT IF THEY ARE NOT: don't feel at all bad about just dropping upstream and doing DKMS, despite what other people here are telling you. You don't owe _ANYONE_ _ANYTHING_ when you're giving away code for free. If it would be more fun for you not to have to deal with Other People Being Wrong, just do it your way and don't look back.
Life is too short to spend it dealing with crap from other people. [Reply to this comment] Dual tier? Posted Jul 1, 2025 12:02 UTC (Tue) by pbonzini (subscriber, #60935) [Link] The upstream community couldn't care less about what happens outside Linus's tree; it only cares about everybody following the same rules when it comes to merge window vs. stabilization, because those are the rules for the upstream tree. [Reply to this comment] Dual tier? Posted Jun 29, 2025 7:45 UTC (Sun) by marcH (subscriber, #57642) [Link] Releasing different versions concurrently: interesting! We could call that... branching? :-) [Reply to this comment] Poor Overstreet Posted Jun 29, 2025 4:39 UTC (Sun) by Subsentient (subscriber, #142918) [Link] (7 responses) I've had the exact same issue with devs before. Holding PRs hostage by demanding a total architecture change that they prefer, demanding irrelevant features, never responding to PRs, etc. I feel more on Overstreet's side on this. Whatever the reasons, I don't think Linus is behaving well by holding PRs hostage. [Reply to this comment] Poor Overstreet Posted Jun 29, 2025 7:43 UTC (Sun) by marcH (subscriber, #57642) [Link] > I've had the exact same issue with devs before. Holding PRs hostage by demanding a total architecture change that they prefer, demanding irrelevant features, never responding to PRs, etc. We've all seen things like that happen in various projects. None of them seems to accurately describe what happened here. [Reply to this comment] Poor Overstreet Posted Jun 29, 2025 18:53 UTC (Sun) by quotemstr (subscriber, #45331) [Link] (4 responses) The genius of free software is creative destruction. Every argument over a patch is a fork-in-being. Every patch-blocking, obstinate maintainer has to keep in mind the hierarchy of escalation available to any aggrieved contributor and temper his natural inclination towards stasis and hostage-taking with the healthy fear of another EGCS.
In commercial software the profit motive has a similar moderating effect on maintainer obstinacy. In the free world, the latent fear of shame and of losing control does the same job. Is Overstreet going to successfully fork Linux? No, almost certainly not. Is his DKMS threat real? Yes. Linus can see ZFS on Linux gain traction. Does Linus want to create an environment in which the most advanced features of Linux live out of tree, in which it's normal and expected to assemble systems from DKMS, and, worst of all, users scream bloody murder when Linus breaks these now-load-bearing out-of-tree components? And will he be happy maintaining compatibility with these ecosystem components over which he now has no leverage? As for the object level: the whole conversation framing bugs me. Is Linus right to delay nonessential bug fixes until the next merge window? Sure. Is Overstreet right to push for inclusion? Fine. What *isn't* right is the "respect your elders" framing from some other commentators on this article. Elevating primate social hierarchies above technical correctness is an anti-pattern, and elevating it to some kind of norm is shameful. When people say correctness is optional but comity is essential, they're elevating the imaginary over reality. They're putting word games above bits. It's the same mentality that censored early science. It's not only wrong, but despicable. Nobody should feel pressed to shut his mouth when he's right just because someone more senior than him is speaking. "Fuck that" is the only way I can state this idea emphatically enough. I suspect Linus is right here about how the patch can wait. But he needs to express this decision with an attitude of respect, not haughty authority. And moreover, people need to stop telling Overstreet to shut up about it. Disagree and commit doesn't mean pretending not to disagree.
[Reply to this comment] Poor Overstreet Posted Jun 30, 2025 8:16 UTC (Mon) by interalia (subscriber, #26615) [Link] > What *isn't* right is the "respect your elders" framing from some other commentators on this article. Elevating primate social hierarchies above technical correctness I don't think it's really a "respect your elders" thing but simply that it's the social reality when you and the 'elder' disagree strongly on what is technically correct. Once that happens, your choices are a) to accept it (note that accepting doesn't mean pretending not to disagree, simply recognising who has pull power on the relevant tree), b) to fork the codebase, or c) to remove your code. Maybe I missed it in the mass, but I haven't really seen any comments that elevate camaraderie over technical correctness. You seemed to write as if everyone here has been saying, "yes Kent, you're technically correct but play nice because Linus is the boss". From my reading of the comments, the majority of people disagree that Kent is technically correct or (at best) are neutral/don't find his argument compelling, so they aren't acknowledging any correctness that needs to be suppressed for comity. And again, even if everyone agreed he was technically correct, it doesn't change the social fact that Linus has control over his tree, and that's the person Kent wants to accept his changes and we're still at the a/b/c options above. [Reply to this comment] Poor Overstreet Posted Jun 30, 2025 12:09 UTC (Mon) by tuna (guest, #44480) [Link] "As for the object level: the whole conversation framing bugs me. Is Linus right to delay nonessential bug fixes until the next merge window? Sure. Is Overstreet right to push for inclusion? Fine. What *isn't* right is the "respect your elders" framing from some other commentators on this article. " Respect your elders is not the point. "Respect the rules" is. If Overstreet wants to change the rules he should push for that in a separate channel. 
If he thinks some issue requires an exception from the rules, he should clearly state that BEFORE sending in the code. Doing it after is just really bad communication. [Reply to this comment] Poor Overstreet Posted Jun 30, 2025 17:47 UTC (Mon) by marcH (subscriber, #57642) [Link] > What *isn't* right is the "respect your elders" framing from some other commentators on this article. I don't remember seeing this framing much; I'm afraid it's your personal perception. What I remember reading: "Respect the owner of the branch you want your code to be merged to" and that's it. Feel free to branch and release whatever you want, whenever you want, anywhere else. > Nobody should feel pressed to shut his mouth when he's right just because someone more senior than him is speaking. The process for that targeted branch is being discussed and no one ever remotely suggested any censorship. The debate is only about the tone and who decides in the end. Requesting an exception or any process change is fine as long as you are ready for that request to be denied by the owner of the branch. > It's the same mentality that censored early science. Besides the lack of censorship, there is nothing in science like the releasing, branching, forking and versioning that are discussed here. This is not a good analogy. [Reply to this comment] Poor Overstreet Posted Jun 30, 2025 20:36 UTC (Mon) by edeloget (subscriber, #88392) [Link] > As for the object level: the whole conversation framing bugs me. Is Linus right to delay nonessential bug fixes until the next merge window? Sure. Is Overstreet right to push for inclusion? Fine. What *isn't* right is the "respect your elders" framing from some other commentators on this article. Elevating primate social hierarchies above technical correctness is an anti-pattern and elevating it to some kind of norm is shameful. When people say correctness is optional but comity is essential, they're elevating the imaginary over reality.
They're putting word games above bits. It's the same mentality that censored early science. It's not only wrong, but despicable. Nobody should feel pressed to shut his mouth when he's right just because someone more senior than him is speaking. "Fuck that" is the only way I can state this idea emphatically enough. The argument is weird, as this is a fundamental property of every society that has ever existed. We, as a species, elevated those who were able to work with others even if they were less good, because as a group we happen to be stronger. So yes, comity is essential; it has proven to be so for the last 50000 years. Correctness, on the other hand, is really optional, because the group will be able to address issues later. Again, our societies did so for the past 50000 years. This is reality. Now, there is a world where correctness is essential and comity is optional: this is a world where you fight to get your view accepted by others because you have a very strong belief that you are more correct than them. Ultimately, the one who will be able to succeed is the one who is able to silence his/her opponents. That's called the natural order, or, as framed by early philosophers, the law of the strongest. We lived through this situation again and again and again through our history: these are the periods in time where science was blocked for ideological reasons (progress of science was never blocked because the society as a whole decided it did not want it; it was blocked because a small group of powerful individuals decided they did not want it. Every time you get a social construct where comity is made more important than the individual, science thrives; so I think you get your ideas in the wrong order). You got one part right, yes: nobody should stay silent out of respect for a specific group if they think they are right. That also comes with some constraint: if you're proven wrong, then you must admit that you are.
Power and responsibility, as always. Spinning some misguided philosophy to avoid this reality is not going to work. [Reply to this comment] Poor Overstreet Posted Jun 29, 2025 20:41 UTC (Sun) by tuna (guest, #44480) [Link] If you don't think that the developers and maintainers do a good job, your only choice is to fork the project. Do you really think you would do a better job than Linus Torvalds in developing Linux? Also, most users of Linux do not care about Bcachefs. If Overstreet built Bcachefs on FreeBSD, way fewer people would use it and care about it. [Reply to this comment] Maybe a new approach is needed Posted Jun 30, 2025 0:00 UTC (Mon) by nicbr (subscriber, #174696) [Link] I understand the benefits of incremental changes - it's what made Linux successful after all - but maybe it's time to figure out a new policy on how to handle the implementation of more complex kernel features. [Reply to this comment] Is this why vendors prefer out of tree drivers? Posted Jun 30, 2025 4:31 UTC (Mon) by DemiMarie (subscriber, #164188) [Link] (1 response) One pattern that sometimes happens is that hardware vendors will only offer commercial support for the out-of-tree driver they maintain. I wonder if this is why: they cannot support something they don't have commit rights to, because they have no way to ensure they can fix bugs or respond to feature requests in a reasonable amount of time. One can provide an SLA for fixing bugs in a driver one packages, but nobody can provide an SLA for upstreaming. There is also at least one case where the in-tree driver is horrible, but because there can be only one driver for a piece of hardware, replacing it with the out-of-tree driver is near impossible. I wonder if this is a case where Linux is too tightly coupled, and a microkernel with userspace components that can be independently upgraded would be better. [Reply to this comment] Is this why vendors prefer out of tree drivers?
Posted Jun 30, 2025 12:12 UTC (Mon) by tuna (guest, #44480) [Link] "There is also at least one case where the in-tree driver is horrible, but because there can be only one driver for a piece of hardware, replacing it with the out-of-tree driver is near impossible." If there is a will, there usually is a way. But yeah, having a commercial support service for the mainline Linux releases seems pretty impossible. But that is why Red Hat etc. exist. [Reply to this comment] Possible way forward? Posted Jun 30, 2025 6:52 UTC (Mon) by skissane (subscriber, #38675) [Link] (5 responses) I really don't know who is right and wrong in this situation. But the fact is, Linux is Torvalds' baby... so he has a lot more power here than Overstreet. Hence, even if Overstreet is all in the right and Torvalds is all in the wrong (and I'm not saying that's the case), Torvalds is still going to come out on top. The only way that Overstreet could possibly win would be (a) to convince Torvalds he's in the wrong; or (b) to convince enough other leading kernel devs of it that they can socially pressure Torvalds into accepting being overruled. While neither is impossible, I don't think either is likely. And I think it would be a shame if bcachefs got pulled from the kernel due to a personality clash. That would be a negative outcome for its user community. A possible solution: what if some other respected kernel dev agreed to act as an "interface" between Overstreet and Torvalds? So Overstreet convinces the "interface" the patches are necessary, and then it is the job of the "interface" to convince Torvalds of the same? That could obviously cut down on a lot of the personality conflict. Of course, the big problem with that is who is going to sign up to do it? It would take up a lot of time and effort, for little reward, and maybe nobody with the necessary skills/experience is sufficiently interested in bcachefs to play that role.
Maybe, if a commercial vendor was sufficiently interested in bcachefs, they might pay for someone to do that, but I'm not sure if any vendor is. [Reply to this comment] Possible way forward? Posted Jun 30, 2025 12:17 UTC (Mon) by koverstreet (supporter, #4296) [Link] (4 responses) I'd prefer a model where process issues can be discussed and addressed instead of having it just be about personalities. [Reply to this comment] Possible way forward? Posted Jun 30, 2025 14:06 UTC (Mon) by Alterego (guest, #55989) [Link] (3 responses) Maybe you should consider you are wrong, and you need to follow the rules. Or develop on top of *BSD ... [Reply to this comment] Let's slow down a little Posted Jun 30, 2025 14:42 UTC (Mon) by corbet (editor, #1) [Link] (2 responses) Speaking to the thread as a whole, without focusing on any single participant: we've heard a lot from people telling others what they need to do. I'm guessing that the practical benefit from this advice, if any, has already been realized, so we can hold off on doing more of it for now...? Thank you. [Reply to this comment] Let's slow down a little Posted Jun 30, 2025 18:50 UTC (Mon) by raven667 (subscriber, #5198) [Link] I know this is a moderation action, but I wanted to point out that there are in fact a *ton* of people who have spent a significant amount of time writing useful, thoughtful advice on how to approach kernel development throughout the comments on this article, so much so that there might be enough material for a reference document like HOWTO Be a Kernel Developer, or even HOWTO Be a Team Player, with the best bits of advice. [Reply to this comment] Let's slow down a little Posted Jun 30, 2025 22:13 UTC (Mon) by jalla (guest, #101175) [Link] Can we request 'the tragedy of bcachefs' as an article, once this all settles?
[Reply to this comment] Bcachefs could use a publicist Posted Jun 30, 2025 12:57 UTC (Mon) by Theodor (guest, #178100) [Link] Like some artists, the project could use a publicist. They would act as a buffer between the technical team and the outside world, taking care of public relations and most user support communication. This would free up time for the programmers, and if there was any conflict, a good publicist would minimise it and keep it internal to the project, greatly limiting the damage it could cause. [Reply to this comment] Well... Posted Jun 30, 2025 22:29 UTC (Mon) by jd (guest, #26381) [Link] (6 responses) First, I don't think bcachefs was ready for integration with the kernel. We've had a lot of "well, it's getting close" stuff (ReiserFS, DevFS, etc.) that has had to be pulled because it just didn't work and wasn't ever going to work. We've also had a lot of intriguing ideas that were, perhaps, simply too novel to get into the kernel in the first place. (HP's pluggable scheduler, the IBCS work, KGI, the VAX architecture patches, etc. Not sure if the BadRAM patch or pluggable CPU patch ever made it in, either.) This was one reason, many many moons ago, I developed a kernel tree that had as many experimental patches as I could find and munge in. I wanted people to be able to see what was around, because it was a LOT more than was being publicly discussed. Since then, Linux has moved to Git, and we've got a whole bunch of high-up trees that aren't the top-of-the-chain Linux kernel, which allow a lot more experimental ideas to actually be shared. It's doubtful the stuff I did contributed anything to people getting the idea that it would be a good idea to have such trees, since it's a very obvious idea; I mentioned my work only because it IS an obvious idea, which is why people these days do this sort of thing. That means that if bcachefs has good ideas in it, then it WILL be in a whole bunch of trees, just not the mainline one.
Second, a cursory inspection of how filesystems develop (because Linux has seen so many of them being developed) shows that there's remarkably little actual theory on how they should work, and remarkably little documentation on how what there is actually works, why, and whether there are architectural reasons for remaining issues. ReiserFS could have seriously benefitted from that - it was decent enough for a bunch of use cases, but frequently corrupted itself, and because nobody understood it (including, I suspect, its author), nobody could fix it. We are doomed to see a LOT of Reiserisms in filesystems if the technical writers of the world continue their fixation on doing anything but open source. Bcachefs is clearly an example, the parallels to Reiser4 are stark, and the only reason Btrfs is usable now is that those backing it threw in vastly more time and money than would have been needed if the programmers had known why it was breaking in the first place. And it broke. A lot. There's nothing wrong with the Linux model, it develops complex high-grade software faster than any other method out there, but it would be really really good if we actually had the writing to go with it. We could avoid the Bcachefs-type arguments entirely because most of the problems would never have happened to begin with. [Reply to this comment] Well... Posted Jun 30, 2025 23:27 UTC (Mon) by koverstreet (supporter, #4296) [Link] (5 responses) A reiserfs comparison... Not sure how I feel about that one :) First: any project can look like chaos from afar. You have to dig deeper, understand the project's priorities and whether it's executing on them. With reiserfs, reiserfs3 wasn't exactly known for robustness (speed yes, but there were known issues with repair) and I never got the impression reiserfs4 was focused on fixing those - I did hear about a lot of cool ambitious features they wanted to do, though!
In contrast, bcachefs was started after the core - btree, IO paths - was done, deployed, and stable. And while it does have an ambitious featureset, the scope of that featureset was frozen years ago. It's also had users since before it went upstream; I did not submit it until it was looking stable for the userbase it had at the time. But of course that was not the end of stabilization and hardening; filesystems (especially today) are massive, so we need to do a gradual rollout, at every step learning more about what can go wrong, fixing the issues the current userbase is finding, and making sure that stabilization is keeping up with growing deployment. If we look back at the past two years of development, we also see that there hasn't exactly been a ton in the way of feature work: what feature work has happened has been limited in scope based on user feedback, things that were already planned and in the works, or scalability work. (There are now many bcachefs users with 100+ TB filesystems, and I think I can confidently say that we're good on scalability for now.) The majority of the development time has gone to debugging, hardening, new debugging tools and better logging (you can't debug what you can't see and understand), and a lot of work on repair as we discover and learn how to cope with new failure modes.

- 6.7 (immediately prior to merge): upgrade/downgrade mechanisms, modern versioning (we don't really do feature bits like other filesystems, we version everything; our forwards/backwards compatibility mechanisms are really nice, and that made everything that came after much smoother (or even possible))
- 6.8: per-device vector clocks, for split brain detection
- 6.9: repair by btree node scan, per-device superblock bitmaps of "ranges with btree nodes" to make btree node scan practical on large filesystems. (This one was motivated by reiserfs, but with a lot of lessons learned so that it works reliably - even if you've got a bcachefs image file on your filesystem.)
- 6.11: disk accounting rewrite - this was a multi-year project started before bcachefs was merged, which made our accounting extensible and scalable (it was fast before, but not extensible); this enabled e.g. per-snapshot accounting (not yet exposed).
- 6.12 or .13? - reflink improvements (the ability to put indirect extents in an error state), to ensure that transient errors don't cause data loss
- 6.14: major scalability work for backpointers check/repair, in response to larger and larger filesystems becoming commonplace - this required an expensive and disruptive on-disk format upgrade, but now we're good to 10+ PB, tested.
- 6.15: scalability work for device removal and snapshot removal: again done in response to actual usage
- 6.15: more hardening against actual IO/checksum errors, in the data move path (extent poisoning; this generated a kerfuffle among the block layer people - "you want to do WHAT with FUA?" "it says it right here in the spec" - and we've since determined that yes, read FUA does work as advertised on SCSI hard drives; anyone's guess what NVMe devices are doing).
- 6.16: major logging improvements for data read errors, btree node read errors, and errors that trigger repair: grouping all errors and repair actions into a single error message, so we can follow the sequence of events

Through it all, lots and lots of end user support and bug fixing. I am perpetually telling users: "I don't care what broke or why, if you think it was the hardware's fault or PEBKAC - get me a metadata dump, get me the info I need, we'll get it working again and make it more robust for everyone." [Reply to this comment] Well... Posted Jul 1, 2025 4:49 UTC (Tue) by Alterego (guest, #55989) [Link] You do a lot of right things.
Maybe you need to introspect about what you may be doing wrong, not in absolute terms, but with respect to the Linux technical process, like Mr Torvalds did some years ago, successfully changing himself into a better person and manager. [Reply to this comment] Biggest change you could make to interact better with Torvalds Posted Jul 1, 2025 9:55 UTC (Tue) by farnz (subscriber, #17727) [Link] I honestly think that the best thing you could do in this situation is to maintain more git branches:

* One with the bare minimum fixes for the next rc. This is one where you only have bugfixes that stop things getting worse - they don't make things better. So, bugs that actively corrupt data when hit go in here, but new recovery paths do not. The idea is that Linus can see this branch as "always safe to pull under any circumstances", because it's the bare minimum changes to "stop the bleeding" and prevent people whose filesystems are not yet corrupt from seeing corruption.
* One with broader bugfixes, that still don't introduce new recovery paths, but protect both people whose filesystem is not yet corrupt, and prevent people with corrupt filesystems from losing data (in the extreme, by detecting the corruption and going read-only, because there's nothing else you can do without a new recovery path).
* One or more with new recovery paths, each new recovery path going into its own branch. That way, Linus can pull in the ones that he thinks are -rc material, while leaving the ones he doesn't think are -rc material for later.
* One or more with new code that you think isn't going to be ready to merge until the next merge window at the earliest (if not the merge window afterwards), so that Linus can see where you're expecting to take this in future, and use that to guide his selection of new recovery paths.
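The tiered layout suggested above could be sketched as a handful of git branches; the branch names here are purely illustrative (they are not taken from any real bcachefs tree), and the point is only that each tier is independently pullable:

```shell
# Minimal sketch of the tiered-branch workflow; all branch names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
git commit -q --allow-empty -m "mainline baseline"

# Tier 1: bare-minimum "stop the bleeding" fixes - always safe to pull
git branch bcachefs-urgent

# Tier 2: broader bugfixes, still no new recovery paths
git branch bcachefs-fixes

# Tier 3: one branch per new recovery path, so each can be pulled or deferred
git branch bcachefs-recovery-scan
git branch bcachefs-recovery-poison

# Tier 4: next-merge-window work, for visibility into future direction
git branch bcachefs-next

# The upstream maintainer can then pull tiers selectively, e.g. only
# tier 1 late in an -rc cycle.
git branch --list 'bcachefs-*'
```

Because each tier is its own branch off the same baseline, a pull request can be issued per branch (e.g. via git request-pull), and declining one tier never blocks the others.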
This is quite a lot more work for you, but allows Linus to rebuild trust in your non-technical side; he can see that you're asking for the minimum changes to stop the bleeding, that you'd like him to take more changes if he's willing to, but that you trust him to choose the right changes given enough information about where you're planning to take bcachefs in the future. [Reply to this comment] Well... Posted Jul 2, 2025 14:16 UTC (Wed) by kragil (guest, #34373) [Link] (2 responses) Mr Overstreet, you need to play by the rules. It is as simple as that. Mr Torvalds is calling the shots, you need to be on his good side, that is just the way things are and your users can wait PERIOD Nobody will sponsor you or invest in your FS if it gets this kind of bad press all the time. Just imagine an alternative reality where the above would have always been true ... [Reply to this comment] Second request Posted Jul 2, 2025 14:29 UTC (Wed) by corbet (editor, #1) [Link] (1 responses) I'll repeat -- speaking to the thread as a whole -- I think we have had plenty of posts telling people what they need to do. I really doubt anything useful will come from more of them; please let's bring this to a close. [Reply to this comment] Second request Posted Jul 3, 2025 3:44 UTC (Thu) by ejr (subscriber, #51652) [Link] Thank you. [Reply to this comment] Copyright (c) 2025, Eklektix, Inc. Comments and public postings are copyrighted by their creators. Linux is a registered trademark of Linus Torvalds