[HN Gopher] Who killed the network switch? A Hubris Bug Story
___________________________________________________________________
Who killed the network switch? A Hubris Bug Story
Author : ingve
Score : 85 points
Date : 2024-03-25 06:54 UTC (1 days ago)
(HTM) web link (cliffle.com)
(TXT) w3m dump (cliffle.com)
| moosingin3space wrote:
| This is a fantastic in-depth look at debugging a complex problem,
| and the fact that the rest of the system remained stable is a
| testament to the quality of the engineering work that the Oxide
| team put into this. I'm personally quite inspired by this and
| plan on applying similar techniques in my day job!
| aus10d wrote:
| The work Oxide is doing is truly amazing
| monocasa wrote:
| FWIW, you can support more than 8 regions by treating that
| hardware more like a soft fill TLB.
| packetlost wrote:
| I assume TLB is translation lookaside buffer, but what do you
| mean by "soft fill" here?
| Veserv wrote:
| They mean software-filled. On a "page fault" (really an illegal
| memory access), you walk a software data structure to
| determine whether there should actually be accessible memory
| there, then you "page in" the memory (update the MPU to make it
| accessible, possibly evicting another entry).
|
| This is how some older chips used to work. Hardware page
| table walkers can be viewed as just a hardware implementation
| of such code.
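
A minimal sketch of the software-filled idea described above, written as a plain simulation rather than real MPU register code. Everything here is an assumption for illustration: the `Region` and `SoftFilledMpu` types, the 8-slot limit taken from the "more than 8 regions" remark, and the round-robin eviction policy. It is not how Hubris works.

```rust
/// Hypothetical region descriptor; not a Hubris type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Region {
    base: u32,
    size: u32,
}

impl Region {
    fn contains(&self, addr: u32) -> bool {
        // Subtract before comparing so a region near the top of the
        // address space cannot overflow.
        addr >= self.base && addr - self.base < self.size
    }
}

/// Number of hardware slots we pretend the MPU has (assumed to be 8).
const MPU_SLOTS: usize = 8;

/// Simulated MPU whose slots are filled on demand by a fault handler.
struct SoftFilledMpu {
    /// Every region the task may touch; may be longer than MPU_SLOTS.
    allowed: Vec<Region>,
    /// Regions currently "loaded into hardware".
    slots: [Option<Region>; MPU_SLOTS],
    /// Next slot to overwrite (simple round-robin eviction).
    next_victim: usize,
}

impl SoftFilledMpu {
    fn new(allowed: Vec<Region>) -> Self {
        Self { allowed, slots: [None; MPU_SLOTS], next_victim: 0 }
    }

    /// Would the simulated hardware permit this access right now?
    fn hardware_allows(&self, addr: u32) -> bool {
        self.slots.iter().flatten().any(|r| r.contains(addr))
    }

    /// Fault handler: walk the software table. Returns true if a slot was
    /// filled and the access should be retried, false for a real violation.
    fn handle_fault(&mut self, addr: u32) -> bool {
        match self.allowed.iter().find(|r| r.contains(addr)) {
            Some(&region) => {
                self.slots[self.next_victim] = Some(region);
                self.next_victim = (self.next_victim + 1) % MPU_SLOTS;
                true
            }
            None => false,
        }
    }
}
```

The cost is visible in `handle_fault`: each miss takes a fault and a table walk, so access latency depends on which slots happen to be loaded. That is the kind of timing variability the reply below is concerned about.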
| robocat wrote:
| I would guess they (a) want soft realtime performance, and (b)
| would not want to introduce something critical that could
| interfere with debuggability or potentially reliability.
| scottlamb wrote:
| Nice read!
|
| Nit:
|
|     // Order the task's regions in ascending address order.
|     //
|     // THIS IS IMPORTANT. The kernel exploits this property to do cheaper
|     // access tests.
|     regions.sort_by_key(|i| region_table.get_index(*i).unwrap().1.base);
|
| I wouldn't put this comment here. It's not just some detail of
| this function; it's an invariant of the field that all writers
| have to respect (maybe this is the only one now but still) and
| all readers can take advantage of. So I'd add it to the
| `TaskDesc::regions` docstring. [1]
|
| [1]
| https://github.com/oxidecomputer/hubris/commit/b44e677fb39cd...
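
A hedged sketch of the suggestion above: the ordering requirement is documented on the field, established in one constructor, and then relied on by a reader. The `Region` and `TaskDesc` definitions are simplified stand-ins, not the real Hubris types. The single-pass `can_access` is only one assumed way a reader could exploit the ordering; it is not taken from the kernel source.

```rust
/// Hypothetical region descriptor; field names are illustrative.
#[derive(Clone, Copy, Debug)]
struct Region {
    base: u32,
    size: u32,
}

/// Simplified stand-in for a task descriptor.
struct TaskDesc {
    /// Memory regions this task may access.
    ///
    /// INVARIANT: sorted by ascending `base`. Readers rely on this to do
    /// cheaper access tests, so every writer must preserve it.
    regions: Vec<Region>,
}

impl TaskDesc {
    /// Builds a descriptor and establishes the ordering invariant.
    fn new(mut regions: Vec<Region>) -> Self {
        regions.sort_by_key(|r| r.base);
        Self { regions }
    }

    /// Returns true if every byte of `addr..addr + len` is covered by the
    /// task's regions, including ranges that span adjacent regions.
    fn can_access(&self, addr: u32, len: u32) -> bool {
        if len == 0 {
            return true; // An empty range touches no memory.
        }
        // For simplicity, reject ranges whose end does not fit in u32.
        let end = match addr.checked_add(len) {
            Some(e) => e,
            None => return false,
        };
        // `cursor` is the lowest address not yet known to be covered.
        let mut cursor = addr;
        for r in &self.regions {
            if r.base > cursor {
                // Sorted order means no later region can cover `cursor`.
                break;
            }
            let region_end = r.base.saturating_add(r.size);
            if region_end > cursor {
                cursor = region_end;
            }
            if cursor >= end {
                return true;
            }
        }
        false
    }
}
```

Because the loop stops at the first gap, it is only correct when the invariant holds. That is the argument for stating it on the field, where every reader and writer will see it, and not only at one sort call.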
| sbt567 wrote:
| I adore whatever the folks at Oxide do, and this is one example.
___________________________________________________________________
(page generated 2024-03-26 23:00 UTC)