[HN Gopher] XFS Metadata Corruption on Linux 6.3 Tracked Down to...
___________________________________________________________________
XFS Metadata Corruption on Linux 6.3 Tracked Down to One Missing
One-Line Patch
Author : LinuxBender
Score : 89 points
Date : 2023-05-29 13:38 UTC (9 hours ago)
(HTM) web link (www.phoronix.com)
(TXT) w3m dump (www.phoronix.com)
| sp332 wrote:
| Is it a little worrying that even with all the attention, no one
| seems to know what this line of code actually does?
| juujian wrote:
| Glad I am not the only one who was thinking that.
| _a_a_a_ wrote:
| Agreed, the tone of the quotes is scarily relaxed. This should
| not be how good software dev is done. Maybe they are being more
| rigorous than I give them credit for, but it doesn't sound good.
| pengaru wrote:
| The transparency of FOSS conferring exceptionally high
| visibility into how the sausage is made often creates this
| kind of impression.
|
| But in reality what's happening here is that folks who choose
| to run bleeding-edge kernel development snapshots are lucky to
| get such quick access to patches, even before the scope of new
| bugs is entirely understood by the developers. Note there's
| nothing preventing these affected users from simply running a
| prior known-stable kernel version until the bug is better
| understood; they're opting in to the chaos.
|
| It's unfair to assume Dave Chinner et al won't be running the
| issue seemingly fixed by this one-line change fully to
| ground.
|
| If you're not interested in playing the role of kernel QA and
| interacting with the upstream devs when things break in not
| yet understood ways, don't run bleeding edge kernel versions.
| LTS and -stable releases are offered for a reason.
| jeffbee wrote:
| You're not the first person to propose this, but like all
| those other people, you are wrong. 6.3 is the latest
| "stable" release. It is the version front and center on
| kernel.org. There is nothing "bleeding-edge" about it.
| pengaru wrote:
| Ah I didn't notice 6.3 had already been promoted to
| stable, that's unfortunate.
|
| Relative to a kernel version you'd encounter in something
| like rhel or debian stable however, tracking mainline's
| "stable" branch is still pretty damn aggressive.
| jeffbee wrote:
| Giant refactor + no unit tests = data loss. The history of Linux
| in a nutshell.
| patrakov wrote:
| I wouldn't say "no unit tests". There are xfstests, the problem
| is that nobody runs them on stable backports to verify their
| correctness and completeness.
| jeffbee wrote:
| xfstests are not unit tests, they are integration stress
| tests, and their coverage is quite poor. Nothing in that
| suite exercises `xfs_bmap_btalloc_at_eof` particularly.
| That's the kind of unit test you want before undertaking a
| large refactor. There are several testable postconditions
| that would be trivial to test, if this code had an easy way
| to add and run unit tests. It has two mutable (in-out)
| parameters and a comment that says allocation returns as if
| the function was never called. And that is where the bug
| lies, according to the patch (which also adds or modifies no
| tests).
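
A minimal sketch of the kind of postcondition test described above: a function takes two in-out parameters, and if the attempt fails, both must look as if the function was never called. Everything here is a hypothetical stand-in written in Python for brevity (`AllocArgs`, `BmapState`, `try_alloc_at_eof`, and the field names are invented for illustration); it is not the kernel's `xfs_bmap_btalloc_at_eof` logic, only the shape of the check.

```python
import copy
from dataclasses import dataclass


@dataclass
class AllocArgs:
    """Hypothetical in-out allocation arguments (not the kernel struct)."""
    minlen: int
    maxlen: int
    alignment: int = 0


@dataclass
class BmapState:
    """Hypothetical in-out mapping state (not the kernel struct)."""
    blkno: int
    length: int
    aligned: bool = False


def try_alloc_at_eof(args, state, free_extent_len):
    """Toy allocation attempt that mutates both parameters.

    Returns True on success. On failure, both parameters are restored
    so the caller sees them exactly as they were before the call.
    """
    saved_args = copy.copy(args)
    saved_state = copy.copy(state)

    # Speculative changes made while attempting the allocation.
    args.alignment = 1
    args.minlen = args.maxlen
    state.aligned = True

    if free_extent_len >= args.minlen:
        state.length = args.minlen
        return True

    # Failure path: undo every speculative change.
    args.__dict__.update(saved_args.__dict__)
    state.__dict__.update(saved_state.__dict__)
    return False


def check_failure_leaves_inputs_untouched():
    """Postcondition: a failed attempt must not leak any mutation."""
    args = AllocArgs(minlen=4, maxlen=16)
    state = BmapState(blkno=100, length=0)
    before = (copy.copy(args), copy.copy(state))
    ok = try_alloc_at_eof(args, state, free_extent_len=8)
    return ok, (args, state) == before
```

If a line in the failure path is forgotten (say, `minlen` is not restored), the comparison against the snapshot fails immediately, which is the sort of regression a large refactor can introduce without anyone noticing.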
| garganzol wrote:
| This is why I always see the code as a math sheet - if every
| little expression is perfect then the combined result is
| guaranteed to be perfect too. This rule never fails.
| malkia wrote:
| I wonder if unit testing was ever considered (or is even
| possible?) for the Linux source code?
| speed_spread wrote:
| Code that does I/O has a lot of interplay that's hard to
| replicate and impossible to cover entirely. The physical world
| is nothing but shared mutable state.
| hnarn wrote:
| FLOSS developers are real heroes, but so are the people willing
| to spend time testing newer non-LTS versions of the code and
| report their issues.
|
| I have enough on my plate just dealing with the issues arising
| from using stable code. I think it's admirable that people find
| the time to look ahead to future releases and help us all enjoy
| a less panic-inducing experience.
| talhah wrote:
| Bleeding-edge Arch Linux user here, I've barely come across any
| major bugs in the last couple of years. Whenever I find
| something I do report it and it usually gets fixed really
| quickly.
|
| In fact, many of these bugs were on stable releases too.
| awill wrote:
| Exactly. A RHEL kernel is likely a lot more stable than the
| kernel.org LTS kernel. Often bugfixes and security patches
| are backported to the LTS kernel, meaning both can be
| affected by similar bugs.
| georgyo wrote:
| In my experience, bleeding edge and stable are about the same
| amount of pain. Breakage isn't actually that common, and fixes
| come a lot faster.
|
| And even if you prefer stable, the latest will become stable
| eventually. Not trying your workload out on the next releases
| has pretty much the same risk profile as just running latest.
|
| Many problems can only be found by running your particular
| workload.
| ilyt wrote:
| For us, the amount of work seems to follow a bathtub curve for
| most software.
|
| Running on "latest commit from master" for many projects
| (not Linux) will just get you code nobody has even tested, and
| so a lot of bugs, though they get fixed quickly.
|
| Running on "latest stable" (whatever that means for the
| project) means fixes from time to time when it updates, but in
| the vast majority of cases not that much work.
|
| Anything behind that, like LTS releases? Extra work.
|
| Now any doc you find might be about a newer release or a
| feature that has changed. "Bugs" might not get fixed if they
| are not big enough to backport.
|
| Upgrading to a new LTS version will also get you years of
| changes in the app that you then have to apply to the system
| at once, vs. having to do it "change by change" when keeping
| up to date.
|
| If you use configuration management, that also often means
| multiple different configs to manage, at the very least until
| the previous LTS version finally gets upgraded.
| drewg123 wrote:
| We run bleeding edge FreeBSD at Netflix and are never more than
| a few weeks behind the FreeBSD main branch. This has worked out
| quite well for us.
|
| We used to run -stable, and update every few years, like from
| FreeBSD 9.x to FreeBSD 10.x. We found that when we did that, we
| would often encounter some small subtle bug that was tickled in
| our environment, and which was incredibly hard to track down.
| That sort of bug was hard to track down because the diff
| between branches was enormous, and because there were thousands
| of commits to sift through, and because the person responsible
| for the bug may have committed it months or years ago, and has
| forgotten about it.
|
| We eventually decided to track the main branch, updating
| frequently. This means that we find more bugs, but they
| are far easier to fix because they were introduced more
| recently, and there are a lot fewer commits to look through to
| find where they came from.
| hpb42 wrote:
| Is there a position open on your team? This sounds like the
| stuff I'm into!
___________________________________________________________________
(page generated 2023-05-29 23:00 UTC)