[HN Gopher] Uncovering a 24-year-old bug in the Linux Kernel (2021)
___________________________________________________________________
Uncovering a 24-year-old bug in the Linux Kernel (2021)
Author : endorphine
Score : 399 points
Date : 2022-10-15 13:08 UTC (9 hours ago)
(HTM) web link (engineering.skroutz.gr)
(TXT) w3m dump (engineering.skroutz.gr)
| sponaugle wrote:
| This was a cool example of a class of bugs that are both hard to
| find with no active example, and hard to prevent in complex
| systems. The optimization that was added many years ago for
| performance didn't update something that had a use case that was
| incompatible with not being updated in a very small number of
| circumstances.
|
| It is an interesting thought experiment to consider what kind of
| tool or automated detection could have found this. Some type of
| dependency linking between variables might have shed some light,
| but I'm not sure that would have really highlighted this kind of
| issue.
|
| Great description of both the bug and the path to the solution!
| gizmo686 wrote:
| Probably the only way to prevent this type of issue in an
| automated fashion is to change your perspective from proving
| that a bug exists, to proving that it doesn't exist. That is,
| you define some properties that your program must satisfy to be
| considered correct. Then, when you make optimizations such as
| bulk receiver fast-path, you must prove (to the static analysis
| tool) that your optimizations do not break any of the required
| properties. You also need to properly specify the required
| properties in a way that they are actually useful for what
| people want the code to do.
|
| All of this is incredibly difficult, and an open area of
| research. Probably the biggest example of this approach is the
| seL4 microkernel. To put the difficulty in perspective, I
| checked out some of the seL4 repositories and did a quick line
| count.
|
| The repository for the microkernel itself [0] has 276,541 lines.
|
| The testsuite [1] has 26,397 lines.
|
| The formal verification repo [2] has 1,583,410 lines, over 5 times
| as much as the source code.
|
| That is not to say that formal verification takes 5x the work.
| You also have to write your source-code in such a way that it
| is amenable to being formally verified, which makes it more
| difficult to write, and limits what you can reasonably do.
|
| Having said that, this approach can be done in a less severe
| way. For instance, type systems are essentially a simple form
| of formal verification. There are entire classes of bugs that
| are simply impossible in properly typed programs; and more
| advanced type systems can eliminate a larger class of bugs.
| Although, to get the full benefit, you still need to go out of
| your way to encode some invariant into the type system. You
| also find that mainstream languages that try to go in this
| direction always contain some sort of escape hatch to let the
| programmer assert a portion of code is correct without needing
| to convince the verifier.
|
| [0] https://github.com/seL4/seL4
|
| [1] https://github.com/seL4/sel4test
|
| [2] https://github.com/seL4/l4v
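A minimal sketch of the "encode an invariant into the type system" point, in Python with type hints. The names `RawInput`, `Validated`, `validate`, and `store` are hypothetical and chosen only for illustration. The assumption is that a static checker such as mypy is run over the code; Python itself does not enforce these types at runtime.

```python
from typing import NewType, cast

# Two distinct static types that are both plain strings at runtime.
RawInput = NewType("RawInput", str)
Validated = NewType("Validated", str)


def validate(raw: RawInput) -> Validated:
    """The only sanctioned way to obtain a Validated value."""
    if not raw.isdigit():
        raise ValueError(f"not a decimal number: {raw!r}")
    return Validated(raw)


def store(value: Validated) -> int:
    """Accepts only Validated, so a type checker rejects unvalidated input."""
    return int(value)


# Accepted by a type checker: the value went through validate().
ok = store(validate(RawInput("42")))

# A type checker would reject store(RawInput("42")) because RawInput is not
# Validated. The escape hatch mentioned above looks like this: cast() asserts
# the type without convincing the checker (and does nothing at runtime).
forced = store(cast(Validated, RawInput("7")))
```

The `cast` call is the kind of escape hatch the comment describes: the checker stops complaining, but nothing has actually been verified.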
| xani_ wrote:
| > That is not to say that formal verification takes 5x the
| work. You also have to write your source-code in such a way
| that it is amenable to being formally verified, which makes
| it more difficult to write, and limits what you can
| reasonably do.
|
| You also have to hire significantly more skilled people. Put
| formal verification in the job requirements and the pool of
| candidates will shrink massively.
|
| Explains why it is so rare really. "Spend 5-10x on developers
| to have some bugs not happen" is not a great sell.
| simtel20 wrote:
| It's a great question! Thinking back...
|
| At the time this bug was introduced it would probably have been
| cost prohibitive to create a test case. We were proud of
| 100 Mbit networks, had flaky NICs the vendors didn't help
| maintain much of the time (and which were often broken in
| hardware) and the filesystem max file size was something like
| 2 TB, and most drives were in the handful of GBs. Conceiving
| of testing for something like this would have been expensive.
| And none of the big system vendors took Linux seriously then.
|
| Though perhaps flooding zeros across a TCP socket could work, I
| really think that a kernel hacker would have found a lot of
| other hardware and driver issues before ever being able to
| trigger this.
| aposm wrote:
| Awesome breakdown - as someone who is fairly familiar with TCP
| theoretically but not with the details of the TCP implementation
| in the Linux kernel, this was just the right balance of detail.
| Great technical writing IMO!
| myself248 wrote:
| Okay so I got to the wrap-up at the end, about "why did nobody
| else find this", the author sets up some logical dominoes but
| doesn't knock them down. Allow me to try:
|
| Earlier in the article, the author mentions that they recently
| upgraded some network hardware, and the problem seemed to become
| more frequent after that.
|
| Packet loss or other network issues would force the stack to fall
| out of fast-path and update the counter, avoiding the bug.
|
| Running over ssh would avoid the bug. The only time you'd run
| rsync not over ssh would be within your own network.
|
| So it sounds like (this is my conjecture here) this would only
| appear to someone running rsync internally, over a high-
| performance network with no packet loss, and upgrading the
| switches might've finally gotten the network good enough to
| expose the bug?
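The conjecture above depends on a counter that only gets refreshed when the stack leaves the fast path. The thread does not spell out the mechanism, so the following is only a sketch under an assumption: that the stale value is a 32-bit sequence number compared with wraparound ("serial number") arithmetic, as is conventional for TCP sequence numbers. The function names are hypothetical and this is not the kernel's code.

```python
def to_s32(x: int) -> int:
    """Interpret the low 32 bits of x as a signed 32-bit integer."""
    x &= 0xFFFFFFFF
    return x - 2**32 if x >= 2**31 else x


def seq_after(a: int, b: int) -> bool:
    """True if sequence number a is 'after' b under 32-bit wraparound."""
    return to_s32(b - a) < 0


stale = 1000  # hypothetical counter that was last refreshed long ago

# Shortly after the last refresh, a newer sequence number compares as newer.
near = (stale + 1_000_000) % 2**32
print(seq_after(near, stale))

# After more than 2**31 bytes without a refresh, the same comparison flips:
# the genuinely newer value now looks older than the stale counter.
far = (stale + 2**31 + 1) % 2**32
print(seq_after(far, stale))
```

Under that assumption, any event that refreshes the counter (such as occasional packet loss pushing the connection off the fast path) keeps the two values within 2**31 of each other, which would fit the observation that the problem only showed up on a very clean, fast network.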
| cryptonector wrote:
| One might expect this to have been hit by HPN (high performance
| networking) users, but perhaps if they are storage I/O bound
| rather than CPU or network I/O bound, then probably not.
| ardel95 wrote:
| That sounds plausible. But also, most software (browsers, web
| service SDKs, RPC frameworks) treat TCP connections as fallible
| by setting read/write timeouts and aggressively reopening
| broken connections. So, I'm totally not surprised this issue
| went unnoticed for this many years.
| verisimilitudes wrote:
| _How is it possible for a TCP bug that leads to stuck connections
| to go unnoticed for 24 years?_
|
| It's because the fools responsible never rewrite their code, use
| a broken language, and don't even try to prove half of the broken
| garbage they write. Then, when it turns out to have been broken
| for decades, they chuckle and shove another finger into another
| crack, never understanding how they misuse computers.
| layer8 wrote:
| This is a good case for formal verification.
| mdaniel wrote:
| I struggle because I want to upvote these comments, because
| that's the world I want to live in. But the opposite side of
| that coin is who is going to author the incredibly arcane
| _specification_ of TCP against which any such implementation is
| formally verified?
|
| Maybe TCP stacks are one of the few cases where that makes
| sense, but I'd suspect if it was "worth the cost" it would have
| already been done
| layer8 wrote:
| There are certain guarantees you want such a formal
| specification to give, like for example not getting
| permanently stuck in some state as with the present bug. You
| can formalize the proofs for those guarantees and have their
| correctness machine-checked. Something like TLA+/PlusCal is
| likely suitable for that.
|
| A formal specification is less ambiguous than a prose
| specification. Formalizing the TCP specification will, if
| anything, expose aspects where the specification is unclear,
| or corner cases where the specification actually leads to
| unwanted behavior and doesn't provide the desired guarantees.
|
| So, while you can't prove that the formal specification
| matches the prose specification 100%, you _can_ prove that
| it provides all the guarantees the original prose
| specification was aiming for (once you've formalized those
| desired guarantees), which is something you can't do for the
| prose specification.
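As a rough illustration of checking a "never gets permanently stuck" guarantee, here is a toy explicit-state check in Python. It is not TLA+ or PlusCal, and the state names and both transition tables are invented for this sketch; a real model of TCP would be far larger.

```python
from collections import deque


def reachable(transitions, start):
    """Return every state reachable from start (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def stuck_states(transitions, start, goal):
    """Reachable states from which the goal can never be reached."""
    return {
        state
        for state in reachable(transitions, start)
        if goal not in reachable(transitions, state)
    }


# Hypothetical model where a closed window can always reopen.
correct = {
    "transferring": ["window_closed", "done"],
    "window_closed": ["transferring"],
}

# Hypothetical buggy model: once a counter goes stale, a closed window
# is never reopened because the update is ignored.
buggy = {
    "transferring": ["window_closed", "stale_transferring", "done"],
    "window_closed": ["transferring"],
    "stale_transferring": ["stale_window_closed", "done"],
    "stale_window_closed": [],
}

print(stuck_states(correct, "transferring", "done"))
print(stuck_states(buggy, "transferring", "done"))
```

The check only finds the trap because the stale state is part of the model, which is the point made above: the desired guarantees and the relevant state have to be formalized before a tool can check them.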
| sneak wrote:
| > _These snapshots are updated daily through a pipeline that
| involves taking an LVM snapshot of production data, anonymizing
| the dataset by stripping all personal data, and transferring it
| via rsync to the development database servers._
|
| I don't know what sort of data these people process, but most
| datasets about people are not anonymized by simply removing the
| PII.
| abraae wrote:
| Yes they are. Any information that can be used to identify a
| person by definition is PII.
|
| Once all the PII is removed, by definition the dataset is
| anonymized.
| capitol_ wrote:
| This is obviously true, as you are stating an axiom. But what
| I think the grandparent is trying to say is that databases
| with the PII removed can often be deanonymized by looking at
| the other data that isn't obviously PII.
|
| Take for example a database over all mobile phone positions
| over time, this can be 'anonymized' by removing all
| connections from the phones to information on who owns the
| phones.
|
| But it can still be trivially deanonymized by analyzing where
| the phones are at night and during office hours, since not
| very many people both work in the same building and sleep in
| the same house.
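The phone-location example can be sketched in a few lines of Python. All of the data here is synthetic, and the device IDs, place names, hour thresholds, and the `directory` of known (home, work) pairs are assumptions made up for the illustration.

```python
from collections import Counter, defaultdict


def home_work_pairs(pings):
    """Map each device to its most common (night place, office-hours place)."""
    night = defaultdict(Counter)
    day = defaultdict(Counter)
    for device, hour, place in pings:
        if hour < 6 or hour >= 22:
            night[device][place] += 1
        elif 9 <= hour < 17:
            day[device][place] += 1
    return {
        device: (
            night[device].most_common(1)[0][0],
            day[device].most_common(1)[0][0],
        )
        for device in night
        if device in day
    }


def reidentify(pairs, directory):
    """Join inferred (home, work) pairs against outside knowledge."""
    return {
        device: directory[pair]
        for device, pair in pairs.items()
        if pair in directory
    }


# "Anonymized" pings: (device id, hour of day, coarse location).
pings = [
    ("dev-a", 2, "Elm St"), ("dev-a", 23, "Elm St"), ("dev-a", 10, "Office Park"),
    ("dev-b", 3, "Oak Ave"), ("dev-b", 11, "Office Park"), ("dev-b", 14, "Office Park"),
]

# Hypothetical side information, e.g. from public records.
directory = {
    ("Elm St", "Office Park"): "Alice",
    ("Oak Ave", "Office Park"): "Bob",
}

print(reidentify(home_work_pairs(pings), directory))
```

No field in `pings` is obviously personal data, yet the combination of where a device sleeps and where it works is enough to attach a name once any outside source lists those pairs.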
| omginternets wrote:
| Which kernel version has this patch?
| MatthiasPortzel wrote:
| I remember when this was originally posted, but I voted it up
| again because I think it's such an excellent story, and excellent
| programming. We need more people and companies like this, who are
| willing to go beyond "oh it fails randomly sometimes" and track
| down the underlying issues.
|
| => https://news.ycombinator.com/item?id=26102241 Previous
| Discussion (497 points - 41 comments)
| [deleted]
| c0mptonFP wrote:
| > We need more people and companies like this, who are willing
| to go beyond "oh it fails randomly sometimes" and track down
| the underlying issues.
|
| I absolutely disagree. Most capable engineers I know have this
| urge to go down rabbit holes and fix any issue, this is nothing
| special.
|
| Everyone wants to be the hero that found a bug deep in the
| stack, make a glorious pull request, and be celebrated in the
| community.
|
| I much more value people who have enough self-control to pick
| meaningful battles, and follow the right priorities.
| jackmott wrote:
| black_puppydog wrote:
| Eh, right, many bugs we have don't really matter.
|
| Oh what is that you say, security vulnerabilities are also
| just bugs that get exploited? Oh well...
| [deleted]
| [deleted]
| rrss wrote:
| In my experience, the "oh it fails randomly sometimes" bugs
| are often in some random dull legacy infrastructure component
| where there is zero attention or celebration for fixing them,
| and so engineers tend to tolerate losing a bit of time once a
| week due to them for years rather than someone spending half
| a day to fix it for everyone.
| robertlagrant wrote:
| Exactly. I could fix any complex bug. I just choose not to.
| KolmogorovComp wrote:
| At the company level, it is indeed more expensive to fix
| upstream rather than work around it, but on a macro scale it
| is much more beneficial.
|
| In my opinion fixing upstream whenever possible even if not
| the best short-term solution should be considered the price
| to pay for using OSS.
| CSSer wrote:
| GP's comment is also odd because the article notes they took
| your approach. They documented the problem when they first
| noticed it happening infrequently and moved on to higher
| priorities. When it started happening every single day it
| became mission critical to investigate.
| HenrikB wrote:
| I think this was well prioritized; they struggled with the
| issue at times, found a temporary workaround, but when that
| workaround stopped being effective and the bug hit them
| every day, they decided to track down the source. Then they
| reported upstream, it was reproduced, and someone patched it,
| and rolled out new, fixed kernels.
|
| That is a perfect example of how things work and should
| work. They contributed to the community. I think it was a
| great prioritization.
|
| I'm certain there were lots of other people hitting this bug
| and killing processes or rebooting to get around it. The
| troubleshooting and reporting done here silently saved a lot
| of other people a lot of effort - now and in the future.
| I don't think they were after it to be heroes; they just
| shared their story, which I'm sure will encourage others to
| maybe do the same one day.
| freedomben wrote:
| This opinion is a popular one these days (particularly since
| it complements the demands of business nicely by maximizing
| personal/company profit), but it is a big part of the reason
| why the majority of software these days is so unreliable and
| buggy. It results in hacks on top of hacks to paper over
| problems in the lower levels of the abstraction tower that is
| modern software, and it results in tons of "WTF" bugs that
| are just accepted and never fixed.
| trasz wrote:
| This _is_ the meaningful stuff. Engineers might have the
| urge, but most don't have the opportunity, because they need
| to focus on the currently fashionable framework.
|
| A good rule of thumb regarding meaningful battles is to
| ignore everything promoted by companies like Google or
| Facebook - everything they do is either going to be abandoned
| in five years, or makes sense only in the context of solving
| problems nobody else has.
| stjohnswarts wrote:
| seems like something an engineer might fix on their own
| time if they were feeling feisty about the matter.
| Something tells me if it went on for 20 years it was an
| edge case that only very rarely came up and was mostly a
| non-issue.
| trasz wrote:
| I suspect it was definitely an issue, it's just that most
| companies like Google don't care about reliability, only
| availability, and it might just not show up in their
| stats.
| digiou wrote:
| For the record, this is one of the top Greek employers. This is
| Greece's Amazon essentially. The C-team has been intact since
| day one and AFAIK is still writing (some) code.
|
| It is not unheard of to have 4-day weeks and developer-first
| mindset at that place.
| charcoalhobo wrote:
| Love deep dive troubleshooting like this. I haven't heard of
| systemtap before; looks nice. When I had to troubleshoot a kernel
| bug [1] I used perf [2] probes which are also really nice for
| this kind of debugging.
|
| [1] https://www.spinics.net/lists/xdp-newbies/msg01231.html
|
| [2] https://www.brendangregg.com/perf.html
| thow232329 wrote:
| "This setup has worked rather well for the better part of a
| decade and has managed to scale from 15 developers to 150"
|
| LOL
| dang wrote:
| Could you please stop creating accounts for every few comments
| you post? We ban accounts that do that. This is in the site
| guidelines: https://news.ycombinator.com/newsguidelines.html.
|
| You needn't use your real name, of course, but for HN to be a
| community, users need some identity for other users to relate
| to. Otherwise we may as well have no usernames and no
| community, and that would be a different kind of forum.
| https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...
|
| Also, could you please stop posting unsubstantive and/or snarky
| and/or flamebait comments? It's not what this site is for, and
| it destroys what it is for. If you wouldn't mind reviewing
| https://news.ycombinator.com/newsguidelines.html and taking the
| intended spirit of the site more to heart, we'd be grateful.
| halukakin wrote:
| Could someone provide link(s) on how regular snapshots of
| databases can be taken like this? (Googling didn't help much,
| maybe I'm googling for the wrong keywords.) For me, backing up
| the database is a few-hour-long process. Restoring it for a
| developer is again a few-hour process. I read about snapshots
| before but haven't realized they could be this effective.
| rrdharan wrote:
| It's the lack of clarity on how they manage access control for
| what should be regulated data that surprises me, more than the
| technology achievement.
| nick__m wrote:
| For MariaDB:
|
| 0) Make sure the database data volume is on LVM or ZFS.
|
| In a SQL prompt:
|
| 1) BACKUP STAGE START; BACKUP STAGE BLOCK_COMMIT;
| 2) \! the shell command to take the snapshot
| 3) BACKUP STAGE END;
|
| you can now mount your snapshot, copy it offsite and delete it.
| The restore procedure is left as an exercise!
| halukakin wrote:
| Very helpful. Thank you!
| ClumsyPilot wrote:
| Can't most COW filesystems like Btrfs or ZFS take a snapshot at
| a point in time instantly?
| abdulocracy wrote:
| LVM does the same but at the block level.
|
| https://wiki.archlinux.org/title/Create_root_filesystem_snap...
| mauvehaus wrote:
| Because it isn't a backup. They put the database into a
| quiescent state on disk, take a file system snapshot, let the
| dbms resume working, and send the snapshot data via rsync.
|
| This requires the cooperation of the dbms software to get the
| on-disk data quiesced. Then your snapshot has to go fast enough
| that the dbms doesn't end up with too many spinning plates
| before you let it start writing normally.
| halukakin wrote:
| Got it. Thank you!
| justin_oaks wrote:
| I love when you're using open source software and can find the
| bug yourself, even if it's deep down the stack.
|
| Imagine if this bug were somewhere in closed source software.
| You'd have to reach out to the software's customer support team.
| Every time I reach out to customer support I expect to have an
| unpleasant experience. It is rarely otherwise.
| [deleted]
| xani_ wrote:
| Kinda why I'm not a fan of cloud, same black box problem.
| perth wrote:
| And even if you did reach out to customer support, it would
| rarely ever get dev attention unless most people have the
| issue. Even in that case, it sometimes still gets a fat
| wontfix, like the famous OneDrive file corruption bug.
| themoonisachees wrote:
| Raising this bug in Windows (how? Microsoft sells support,
| barely, but you can't talk to the IPv4 stack dev anyway) would
| get you laughed out of the chat room because it can't possibly
| be the IP stack's fault.
| didgetmaster wrote:
| As someone who thrives on tracking down rare but annoying bugs in
| a debugger, I love stories like this. It is not just bugs that
| cause real failures which can be headaches; but also bugs that
| just slow things down unexpectedly. They can sometimes go
| undetected for decades like this one.
|
| I wrote an article this past year that talks about silent bugs
| that slowly eat resources and collectively can be very expensive
| in terms of wasted time and energy:
| https://didgets.substack.com/p/finding-and-fixing-a-billion-...
| xani_ wrote:
| > As someone who thrives on tracking down rare but annoying
| bugs in a debugger,
|
| As someone that is cursed to inevitably find some obscure bug
| the second I start using some piece of software I'm happy I'm
| not the only one
|
| > I wrote an article this past year that talks about silent
| bugs that slowly eat resources and collectively can be very
| expensive in terms of wasted time and energy
|
| "Using JS for backend is ecoterrorism" lmao
| myself248 wrote:
| Okay but where's the bug story? Did I miss the story?
| didgetmaster wrote:
| I wrote the article right after I fixed a huge inefficiency
| problem in a function within my own project. I neglected to
| give the specifics in the article, but here they are since
| you asked.
|
| My Didgets tool lets you create pivot tables against
| relational database tables, even very large ones. For the
| pivot values, you can choose to just count the occurrence of
| each value or if it is a number type you can add them up. You
| can also add up the values in a separate number column. Here
| is a quick demo video:
| https://www.youtube.com/watch?v=2ScBd-71OLQ
|
| When adding up numbers in a separate column, I had just a few
| lines of unnecessary code that ended up being called
| exponentially. For smaller tables it was barely noticeable,
| but for tables with 30 million+ rows it really bogged down.
|
| A simple fix to the affected lines caused a certain test
| against a large table to go from over 10 minutes down to
| under 20 seconds. The effects of just a few lines of code
| when applied to a big enough data set can really impact
| performance. It is the old Einstein equation E=mc2 in effect
| which is discussed here:
| https://didgets.substack.com/p/musings-from-an-old-programme...
| shurane wrote:
| I guess there is a lost art of writing for optimal
| code/memory/execution time, especially as our resources
| increase.
|
| I think the idea here is to write code quickly that's
| inefficient, and re-write it to be efficient if the
| performance is required down the line. For companies where
| there's bigger fish to fry, i.e. customer acquisition, it's
| more useful to pump out more features (even at the expense
| of bugs) because that draws customers.
|
| But in places where performance is important, you do see
| developers squeeze out more cycles/memory. I.e. kernel/OS
| development, database servers, video games. It's just that
| most developers aren't in those areas of specialty anymore.
|
| Btw, have you heard of https://handmade.network/ and
| https://en.wikipedia.org/wiki/Demoscene ? Wondering what
| your thoughts are in those areas. There are probably more
| communities like the ones I mentioned, where developers are
| interested in writing the kind of code that you are talking
| about.
| pvillano wrote:
| > but also bugs that just slow things down unexpectedly. They
| can sometimes go undetected for decades like this one.
|
| Reminds me of the GTA Online quadratic time JSON parsing bug
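For readers unfamiliar with how a few innocent lines turn into quadratic time, here is a generic sketch in Python. It is not the code from the GTA Online bug or from the pivot-table fix described earlier; the function names are hypothetical and the example only shows the general pattern of a linear scan hidden inside a loop.

```python
def dedupe_quadratic(items):
    """Keep first occurrences; each membership test scans a list (O(n) per item)."""
    seen = []
    for item in items:
        if item not in seen:  # linear scan, repeated for every item
            seen.append(item)
    return seen


def dedupe_linear(items):
    """Same result, but membership is checked against a set (O(1) on average)."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


sample = ["a", "b", "a", "c", "b", "d"]
print(dedupe_quadratic(sample))
print(dedupe_linear(sample))
```

Both versions return the same list, which is why this kind of slowdown is easy to miss: tests on small inputs pass, and the cost only becomes visible once the input grows to tens of thousands of entries or more.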
| itismetheidiot wrote:
| How odd to see a write-up from the skroutz.gr blog on the front
| page of HN...
| dang wrote:
| Also these!
|
| _Speeding Up Our Build Pipelines_ -
| https://news.ycombinator.com/item?id=20775297 - Aug 2019 (24
| comments)
|
| _The infrastructure behind one of the most popular sites in
| Greece_ - https://news.ycombinator.com/item?id=9982361 - July
| 2015 (5 comments)
|
| _Working with the ELK stack_ -
| https://news.ycombinator.com/item?id=9008119 - Feb 2015 (35
| comments)
| NKosmatos wrote:
| Yeap, it's a bit strange, but the post was very well written,
| with a nice breakdown and easily understandable steps that can
| be followed by most software engineers.
|
| There have been some sporadic posts from Skroutz in the past,
| but nothing that gained so much attention.
|
| For those that don't know it, Skroutz is the biggest Greek
| online price aggregator/e-commerce market/price comparison
| site.
| [deleted]
___________________________________________________________________
(page generated 2022-10-15 23:00 UTC)