[HN Gopher] Who killed the network switch? A Hubris Bug Story
       ___________________________________________________________________
        
       Who killed the network switch? A Hubris Bug Story
        
       Author : ingve
       Score  : 85 points
       Date   : 2024-03-25 06:54 UTC (1 day ago)
        
 (HTM) web link (cliffle.com)
 (TXT) w3m dump (cliffle.com)
        
       | moosingin3space wrote:
       | This is a fantastic in-depth look at debugging a complex problem,
       | and the fact that the rest of the system remained stable is a
       | testament to the quality of the engineering work that the Oxide
       | team put into this. I'm personally quite inspired by this and
       | plan on applying similar techniques in my day job!
        
       | aus10d wrote:
       | The work Oxide is doing is truly amazing
        
       | monocasa wrote:
       | FWIW, you can support more than 8 regions by treating that
       | hardware more like a soft fill TLB.
        
         | packetlost wrote:
         | I assume TLB is translation lookaside buffer, but what do you
         | mean by "soft fill" here?
        
           | Veserv wrote:
           | They mean software-filled. On a "page fault" (really an illegal
           | memory access), you walk a software data structure to determine
           | whether there should actually be accessible memory there, then
           | you "page in" the memory (update the MPU to make it accessible,
           | possibly evicting another entry).
           | 
           | This is how some older chips used to work. Hardware page
           | table walkers can be viewed as just a hardware implementation
           | of such code.
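
A rough sketch of the software-fill idea described above, in Rust. This is not Hubris code: `Region`, `SoftFillMpu`, `handle_fault`, and the round-robin eviction policy are all hypothetical names and choices made for illustration, and the "hardware" is simulated by an array of 8 slots. It only shows the shape of the technique: the task may own more regions than the MPU has slots, and the fault handler loads the right one on demand.

```rust
/// A memory range a task is allowed to access (hypothetical layout).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Region {
    base: u32,
    size: u32,
}

impl Region {
    fn contains(&self, addr: u32) -> bool {
        // The subtraction cannot underflow because of the first check.
        addr >= self.base && addr - self.base < self.size
    }
}

/// Number of simulated hardware slots (assumed to be 8, as in the thread).
const MPU_SLOTS: usize = 8;

struct SoftFillMpu {
    /// Every region the task may touch; may hold more than MPU_SLOTS entries.
    allowed: Vec<Region>,
    /// What is currently loaded into the simulated hardware slots.
    slots: [Option<Region>; MPU_SLOTS],
    /// Next slot to overwrite (simple round-robin eviction).
    next_victim: usize,
}

impl SoftFillMpu {
    fn new(allowed: Vec<Region>) -> Self {
        Self { allowed, slots: [None; MPU_SLOTS], next_victim: 0 }
    }

    /// Would the simulated hardware permit this access right now?
    fn hardware_allows(&self, addr: u32) -> bool {
        self.slots.iter().flatten().any(|r| r.contains(addr))
    }

    /// Fault handler: walk the software table. Returns true if the access
    /// was legitimate and its region is now loaded ("paged in"), false if
    /// it is a genuine illegal access.
    fn handle_fault(&mut self, addr: u32) -> bool {
        let found = self.allowed.iter().copied().find(|r| r.contains(addr));
        match found {
            Some(region) => {
                // Evict whatever occupies the victim slot and load the region.
                self.slots[self.next_victim] = Some(region);
                self.next_victim = (self.next_victim + 1) % MPU_SLOTS;
                true
            }
            None => false,
        }
    }
}
```

The cost is that an access to a legitimate but currently unloaded region takes a fault first, so access latency becomes less predictable.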
        
         | robocat wrote:
         | I would guess they (a) want soft realtime performance, and (b)
         | would not want to introduce something critical that could
         | interfere with debuggability or potentially reliability.
        
       | scottlamb wrote:
       | Nice read!
       | 
       | Nit:
       | 
       |     // Order the task's regions in ascending address order.
       |     //
       |     // THIS IS IMPORTANT. The kernel exploits this property to do
       |     // cheaper access tests.
       |     regions.sort_by_key(|i|
       |         region_table.get_index(*i).unwrap().1.base);
       | 
       | I wouldn't put this comment here. It's not just some detail of
       | this function; it's an invariant of the field that all writers
       | have to respect (maybe this is the only one now but still) and
       | all readers can take advantage of. So I'd add it to the
       | `TaskDesc::regions` docstring. [1]
       | 
       | [1]
       | https://github.com/oxidecomputer/hubris/commit/b44e677fb39cd...
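
A minimal sketch of the suggestion above: state the ordering invariant on the field itself, so every writer and reader sees it. The `TaskDesc` and `RegionDesc` shown here are simplified, hypothetical stand-ins and not the actual Hubris definitions; `can_access` is likewise an assumed helper that only handles an access falling inside a single region.

```rust
/// A contiguous address range a task may access (hypothetical layout).
#[derive(Clone, Copy, Debug)]
struct RegionDesc {
    base: u32,
    size: u32,
}

struct TaskDesc {
    /// Regions this task may access.
    ///
    /// INVARIANT: sorted by ascending `base`. Every writer must preserve
    /// this ordering; readers such as `can_access` rely on it to do
    /// cheaper access tests. (This sketch also assumes the regions do
    /// not overlap, which it does not check.)
    regions: Vec<RegionDesc>,
}

impl TaskDesc {
    /// Sole constructor, so the invariant is established in one place.
    fn new(mut regions: Vec<RegionDesc>) -> Self {
        regions.sort_by_key(|r| r.base);
        Self { regions }
    }

    /// Is the range `[addr, addr + len)` fully inside one region?
    fn can_access(&self, addr: u32, len: u32) -> bool {
        // Because `regions` is sorted, a binary search finds the number of
        // regions whose base is <= addr; the last of those is the only
        // candidate that could contain `addr`.
        let idx = self.regions.partition_point(|r| r.base <= addr);
        if idx == 0 {
            return false;
        }
        let r = self.regions[idx - 1];
        // Widen to u64 so the end-of-range sums cannot overflow.
        let end = u64::from(addr) + u64::from(len);
        end <= u64::from(r.base) + u64::from(r.size)
    }
}
```

If the sort were ever dropped, `can_access` would silently return wrong answers, which is why the invariant belongs next to the field rather than in one function body.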
        
       | sbt567 wrote:
       | I adore whatever the folks at Oxide do, and this is one example.
        
       ___________________________________________________________________
       (page generated 2024-03-26 23:00 UTC)