[HN Gopher] Boeing 787s must be reset every 51 days or 'misleadi...
       ___________________________________________________________________
        
       Boeing 787s must be reset every 51 days or 'misleading data' is
       shown to pilots
        
       Author : jakey_bakey
       Score  : 94 points
       Date   : 2024-10-24 20:19 UTC (2 hours ago)
        
 (HTM) web link (www.theregister.com)
 (TXT) w3m dump (www.theregister.com)
        
       | tomudding wrote:
       | (2020)
        
       | shadowgovt wrote:
       | This is remarkably business-as-usual for airplane electronics.
       | 
       | As a more mundane example: the wifi on planes does temporary
       | [edit: DHCP, not NAT] leases. But the system on many has
       | expiration windows on the order of hours, possibly more than a
       | day... Couple that with the number of passengers planes serve and
       | busy routes can easily exhaust the lease pool.
       | 
       | The solution: there's a button the flight attendants can push to
       | reboot the router, dumping the lease table.
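
The lease-pool exhaustion described above can be sketched with a toy model. Every number here (pool size, lease length, devices per flight, turnaround cycle) is a made-up assumption for illustration, not a figure from any real in-flight WiFi system, and the model assumes leases are only freed by expiry.

```python
from collections import deque
from typing import Optional

def first_exhausted_flight(pool_size: int, lease_hours: float,
                           cycle_hours: float, devices_per_flight: int,
                           max_flights: int = 1000) -> Optional[int]:
    """Return the 1-based flight on which the pool runs out, or None.

    Simplifications (assumptions): every device takes a fresh lease at
    departure, nobody releases a lease early, and flights depart every
    cycle_hours.
    """
    active = deque()  # (expiry_time, lease_count), oldest first
    in_use = 0
    for flight in range(1, max_flights + 1):
        now = (flight - 1) * cycle_hours
        # Reclaim leases that have expired by this departure.
        while active and active[0][0] <= now:
            in_use -= active.popleft()[1]
        if in_use + devices_per_flight > pool_size:
            return flight
        active.append((now + lease_hours, devices_per_flight))
        in_use += devices_per_flight
    return None

# Hypothetical: 254 usable addresses, 100 devices per flight, a departure every 4 h.
print(first_exhausted_flight(254, lease_hours=24, cycle_hours=4, devices_per_flight=100))
print(first_exhausted_flight(254, lease_hours=2, cycle_hours=4, devices_per_flight=100))
```

With a 24-hour lease the third departure finds the pool full; with a lease shorter than the turnaround cycle the pool never fills, which is why rebooting (or shortening the lease) clears the problem.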
        
         | JosephRedfern wrote:
         | Nitpicking here, but you mean DHCP rather than NAT, right?
        
           | shadowgovt wrote:
           | Yes; thank you.
        
         | Matheus28 wrote:
         | Even with super long leases, couldn't they just have a larger
         | subnet? A /8 oughta do it.
         | 
         | But I guess we're talking about the same people who made the
         | mistake in the first place...
        
           | jmholla wrote:
           | To steelman the choice, the reserved /8 subnet is 10.x.x.x,
           | which is often used for corporate networks, and the other
           | larger reserved subnets see similar usage. People on the plane
           | using WiFi are likely to access their corporate networks via
           | VPN, potentially causing routing issues.
           | 
           | Users VPNing into the reused address space for their own home
           | VPN are probably knowledgeable enough to figure out what is
           | going on and a small enough user base to not care about.
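
The routing concern above comes down to overlapping prefixes: if the local network and a VPN route cover the same addresses, the client cannot tell which side a packet belongs to. A minimal sketch using Python's standard `ipaddress` module follows; the route list is hypothetical and not taken from any real VPN configuration.

```python
import ipaddress

def conflicts(local_net: str, vpn_routes: list[str]) -> list[str]:
    """Return the VPN routes whose address range overlaps the local subnet."""
    local = ipaddress.ip_network(local_net)
    return [r for r in vpn_routes if local.overlaps(ipaddress.ip_network(r))]

# Hypothetical routes pushed by a corporate VPN.
corporate_routes = ["10.20.0.0/16", "10.128.0.0/9", "172.16.5.0/24"]

print(conflicts("10.0.0.0/8", corporate_routes))      # onboard WiFi handing out a /8
print(conflicts("192.168.1.0/24", corporate_routes))  # typical home router subnet
```

A whole /8 on the local side overlaps every 10.x route the VPN might push, while a small 192.168.x.x/24 only clashes if the remote network happens to use that exact range.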
        
             | Filligree wrote:
             | Couldn't we spare a single extra /8 for airplanes to use?
             | 
             | Though I suppose it's not worth it when you can hit
             | 'reboot'.
        
             | ordersofmag wrote:
             | I'm no network guy so someone please explain why using
             | 10.x.x.x on a plane might "potentially cause routing
             | issues"? It doesn't jibe with what I understand about
             | unrouteable address spaces. Is the 10.x.x.x space somehow
             | different than the 192.168.x.x space that millions of
             | people use VPNs out of every day (basically every WFH
             | person on their cheap NAT'd home Wifi)?
        
       | jcelerier wrote:
       | 51 days * 86400 seconds * 1000
       | 
       | => 4406400000
       | 
       | 2^32
       | 
       | => 4294967296
       | 
       | the coincidence seems unlikely, it's basically ~~5 hours and a
       | half~~ 30 hours of difference if one has a 1-ms counter increment
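
The comparison above can be restated in days. This is a small sketch assuming a hypothetical unsigned counter that increments once per millisecond; nothing here says that is what the 787 actually uses.

```python
MS_PER_DAY = 86_400 * 1000

def wrap_days(bits: int, tick_ms: float = 1.0) -> float:
    """Days until an unsigned counter of the given width wraps around."""
    return (2 ** bits) * tick_ms / MS_PER_DAY

days = wrap_days(32)
gap_hours = (51 - days) * 24
print(f"32-bit ms counter wraps after {days:.2f} days")
print(f"that is {gap_hours:.1f} hours short of 51 days")
```

A 32-bit millisecond counter wraps after roughly 49.7 days, about 31 hours before the 51-day mark.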
        
         | Dylan16807 wrote:
         | It's a day and a half difference, and since 2^32 is the
         | _smaller_ number that would be pretty catastrophic. Pretty
         | likely it's coincidence.
        
         | sitkack wrote:
         | Watch Windows 95 crash live as it exceeds 49.7 days uptime
         | https://news.ycombinator.com/item?id=28340101
         | 
         | Must be a northwest washington thing.
        
         | throwbadubadu wrote:
         | Not getting it.. yeah the famous 32 bit ms overflow after 49
         | something days. But why then 51 here? Shouldn't they be
         | required to reboot after 49 days please please? :D
        
           | tines wrote:
           | Possibly cumulative error in the timing source?
        
             | icelancer wrote:
             | This is even scarier than the base concern.
        
             | jcelerier wrote:
             | Or just ticking every 1.025 ms (e.g. at 975 Hz instead of
             | 1 kHz)... that brings us to:
             | 
             | (4406400000 - 1.025 * 2^32) / 1000
             | 
             | => about 4058 seconds, so a difference of about 1.13 hours
             | from the "51 days" mention.
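
Turning the question around: what tick period would make a 32-bit counter wrap at exactly 51 days? This is a sketch of that arithmetic only; the counter width and the idea of a slightly slow tick are the thread's speculation, not known facts about the aircraft.

```python
TARGET_MS = 51 * 86_400 * 1000  # 51 days in milliseconds

def tick_for_wrap(bits: int, target_ms: int = TARGET_MS) -> float:
    """Tick period (ms) at which a counter of this width wraps at target_ms."""
    return target_ms / 2 ** bits

tick_ms = tick_for_wrap(32)
freq_hz = 1000 / tick_ms
print(f"tick period: {tick_ms:.6f} ms, frequency: {freq_hz:.2f} Hz")
```

The result is a tick of roughly 1.026 ms (about 974.7 Hz), close to the 1.025 ms guessed above.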
        
             | hinkley wrote:
             | It's possible to run tasks so that, instead of starting
             | every second, each one starts one second after the previous
             | iteration finishes.
             | 
             | So if you have something that checks the system health
             | every millisecond, and keeps a count instead of a duration,
             | then if it takes a couple microseconds to complete you
             | might get something less than 86 million ticks per day
             | instead of 86.4 million.
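
The drift described above can be counted directly. This sketch compares a fixed-rate schedule (next start tied to the clock) with a fixed-delay schedule (next start tied to the end of the previous run). The 5-microsecond work time is an assumed figure for illustration.

```python
US_PER_DAY = 86_400 * 1_000_000

def ticks_per_day(period_us: int, work_us: int, fixed_rate: bool) -> int:
    """Number of task starts in one day.

    fixed_rate=True:  a start every period_us, regardless of work time.
    fixed_delay:      a start period_us after the previous run finishes,
                      so each cycle costs period_us + work_us.
    """
    interval_us = period_us if fixed_rate else period_us + work_us
    return US_PER_DAY // interval_us

# 1 ms period, assumed 5 us of work per iteration.
print(ticks_per_day(1000, 5, fixed_rate=True))   # 86400000
print(ticks_per_day(1000, 5, fixed_rate=False))  # 85970149
```

With integer microseconds the arithmetic is exact: about 430,000 ticks per day go missing under fixed delay, so a tick count stops being a reliable measure of elapsed time.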
        
               | Jtsummers wrote:
               | The OS used on the 787 has a hard real-time scheduler.
               | Tasks are started up at a specific frequency (set per
               | task), run to completion or to the end of their time slot
               | (set per task) and terminated. We had, IIRC, a strict
               | 100ms slot for our bit of LRU software to do everything
               | and it would be launched every 1s (from memory, that was
               | 15 years ago). Information could be stored between
               | executions so partial completion is something you could
               | handle if needed by storing state information and using
               | it at the start of the next iteration (we didn't need
               | that, our tasks finished in the slot).
               | 
               | You don't base the start of a future task on the end of
               | the prior one, you base it on a fixed clock for these
               | kinds of systems.
        
           | amelius wrote:
           | Maybe it takes 2 days to boot the entire thing?
        
         | thamer wrote:
         | Where did you get 5 hours and a half? It seems to be closer to
         | 31 hours:
         | 
         | >>> round((4406400000 - 2**32)/(1000 * 3600), 3)
         | 30.954
        
           | jcelerier wrote:
           | from me typing too quickly in bc, apparently :')
        
       | avelis wrote:
       | In the software world I call this an end user discovered issue.
       | But when the issue involves a plane that is carrying actual
       | souls, that can feel very scary.
       | 
       | I am sure this has been resolved by now since it's from 2020.
        
         | recursive wrote:
         | I don't think airplane software ships updates the way npm
         | packages do. I would be more surprised if this _is_ fixed.
        
           | thecosmicfrog wrote:
           | > I don't think airplane software ships updates the way npm
           | packages do.
           | 
           | I'd ideally like to sleep tonight, thanks.
        
           | advisedwang wrote:
           | I think from the point of view of Boeing, the FAA and the
           | airlines, "put it in our maintenance checklist to reboot
           | every 51 days" _is_ a fix.
        
             | woah wrote:
             | With that framing, this sounds like one of the easiest
             | maintenance tasks imaginable. No wrenches or grease
             | involved.
        
         | Dylan16807 wrote:
         | That depends on how much code was having trouble, and what you
         | mean by "resolved".
         | 
         | The safe option might be to avoid the situation, and I could
         | imagine that even if there is a code update it might just make
         | the plane balk at getting ready to take off after a certain
         | amount of uptime.
        
         | AmVess wrote:
         | Scary would be right.
         | 
         | Reminds me of the F-22 Raptor crossing the International
         | Dateline error in 2007. They were flying a squadron of them
         | from Hawaii to Japan. They crossed the IDL and all nav/fuel
         | systems went down, as well as some communications gear.
         | 
         | They only made it back because they were flying with tankers at
         | the time, who led them back to base.
        
       | Dylan16807 wrote:
       | Previous: https://news.ycombinator.com/item?id=22761395
       | https://news.ycombinator.com/item?id=33233827
       | 
       | More interesting, a root cause analysis:
       | https://news.ycombinator.com/item?id=33239443
       | https://ioactive.com/reverse-engineers-perspective-on-the-bo...
       | 
       | The 47 bit timestamp at 32MHz would explain the duration (Though
       | not why it isn't 33MHz?).
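
The duration claim above is easy to check. This sketch takes the 47-bit width and the 32 MHz and 33 MHz rates from the comment as given and does not verify them against the linked analysis.

```python
def rollover_days(bits: int, freq_hz: float) -> float:
    """Days until a free-running counter of the given width wraps."""
    return 2 ** bits / freq_hz / 86_400

for mhz in (32, 33):
    print(f"47-bit counter at {mhz} MHz: {rollover_days(47, mhz * 1e6):.2f} days")
```

At 32 MHz the counter wraps after roughly 50.9 days, which lines up with the 51-day directive; at 33 MHz it would be roughly 49.4 days.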
        
       | rich_sasha wrote:
       | Scary as it is, is there any reason for a passenger jet to have
       | uptime of more than, say, 24hrs? Wouldn't you just switch it off
       | and on again between every flight, regardless?
       | 
       | If this issue was in a car, we would never know as no one keeps
       | their car running for 50 days straight.
        
         | fnordpiglet wrote:
         | I'll bet you the typical EV stays powered on 24/7 with reboots
         | around OTA updates.
        
           | garyfirestorm wrote:
           | unsure what you mean here. most of the systems go to a sleep
           | state in modern vehicles, ev or not. the 12v battery keeps
           | only certain ECUs up - think ECUs that control alarm, lock
           | and unlock state and any communication with the mobile app
           | via LTE... but the rest of the systems are OFF. you don't
           | want an EV battery to hit 0% and 12V to also hit 0% - that
           | would basically make it a brick from what I understand,
           | because EVs have contactors which need to shut for the
           | battery to be 'engaged', and the 12V battery controls these
           | contactors.
        
             | fnordpiglet wrote:
             | A car with an enormous rack of high capacity batteries able
             | to accelerate an 8000 pound object to 60mph and sustain
             | that for hundreds of miles generally doesn't depend on the
             | backup battery for literally anything. It has so much
             | excess energy storage in the form of electricity in the
             | primary batteries it generally doesn't power down the
             | onboard computers at all.
             | 
             | Indeed when you get close to exhausting the main battery
             | rack it starts selectively shutting down everything. I've
             | never personally let mine get to 0% ever - but for instance
             | a Tesla is continuously on, and if you use sentry mode it's
             | not just on but the GPU is constantly doing classification
             | of the environment to determine if someone is prowling your
             | vehicle.
        
         | themoonisachees wrote:
         | Some of these planes are constantly flying as long as they're
         | not in maintenance. A plane not in the air is a plane the
         | company bought that's not currently generating profit.
        
         | ceejayoz wrote:
         | Overnight, planes tend to be plugged in to ground power, to
         | ventilate, keep the batteries charged, for the cleaning crews,
         | etc. Most get rebooted once in a while, but it's always
         | possible one won't be, hence the directive to be certain.
         | 
         | This particular problem has been known for years (the article
         | is from 2020).
        
           | n_ary wrote:
           | Unfortunately, an aircraft has no "reboot". It is just a
           | violent power cut. A lot of headache is introduced in
           | non-critical aircraft software because there is no "graceful
           | shutdown" or long power duration. In fact, certain hardware
           | has an upper limit (much lower than a week) before which it
           | needs one power cut (sometimes called a power cycle), or it
           | suffers from various buffer overflows and counter overflows
           | and starts acting mysteriously.
        
             | kulahan wrote:
             | >an aircraft has no "reboot". It is just a violent power
             | cut
             | 
             | Guess how I typically reboot things :)
        
               | thfuran wrote:
               | By traveling to Mexico and laying out bait along the
               | migratory path of the butterflies?
        
             | ceejayoz wrote:
             | > Unfortunately, an aircraft has no "reboot". It is just a
             | violent power cut.
             | 
             | That's a reboot.
        
             | jcgrillo wrote:
             | It's amazing that's legal. Like, why do we accept software
             | that does this? It can be done in such a way that these
             | things don't happen. Put another way, why aren't the
             | companies involved being fined and sued out of business?
             | Why aren't their managers facing criminal negligence
             | charges? It's outrageous.
        
               | ceejayoz wrote:
               | Because it works fine. A maintenance tech gets one extra
               | line item on the weekly or monthly inspection checklist.
        
               | jcgrillo wrote:
               | It works fine until it doesn't and people die. At which
               | point the blame falls on the maintenance crew? That's
               | _wrong_. And where there's smoke there's fire. If the
               | software has this horrible bug, likely the broken culture
               | that created it has written worse, more subtle bugs.
        
               | ceejayoz wrote:
               | Commercial air travel in the US is incredibly safe. The
               | last fatal crash was in 2009.
        
               | Veserv wrote:
               | Because there has never been a single commercial jetliner
               | fatality caused by software in its intended operational
               | domain failing to operate according to specification.
               | That makes the commercial jetliner software development
               | and deployment process by far the safest and highest
               | reliability ever conceived by multiple orders of
               | magnitude. We are talking in the 10-12 9s range.
               | 
               | And just to get ahead of: "Well what about the 737 MAX",
               | that was a system specification error, not due to "buggy"
               | software failing to conform to its specification. The
               | software did what it was supposed to do, but it should
               | not have been designed to do that given the
               | characteristics of the plane and the safety process
               | around its usage.
        
         | sitkack wrote:
         | Many cars' control units continue to run while the car is off.
         | If you want to reboot your vehicle, you need to unplug the 12v
         | battery for at least a minute.
        
           | jcgrillo wrote:
           | On some cars (recent VWs in particular) when you plug the
           | battery back in you need to twiddle some settings in the
           | computer otherwise the charging circuit will fry the battery
           | prematurely. We've gotten ahead of our skis with this
           | nonsense, time to rein it in.
        
             | symisc_devel wrote:
             | This issue is notorious for BMW cars. You have to notify
             | the ECU each time you install a new battery.
        
               | dzhiurgis wrote:
               | Ahhh, "program a new battery" $400 please.
        
               | jcgrillo wrote:
               | It's hard to imagine an interpretation of this behavior
               | that doesn't involve manufacturers trying to punish
               | independent mechanics and end users who service their own
               | cars. Like, there's no way it's an "honest mistake",
               | right?
               | 
               | BTW I have an AGM ("advanced glass mat") battery in my
               | 1995 Toyota which has a completely analog charging
               | system, and it doesn't get cooked, so it's not because
               | there's something special about the battery.
        
               | HeyLaughingBoy wrote:
               | Don't attribute to malice what can easily be explained by
               | overstressed Systems Engineers trying to resolve multiple
               | conflicting Requirements.
        
             | RichardHesketh wrote:
             | Rein. It's about controlling a horse, not an entire nation.
        
               | jcgrillo wrote:
               | Thanks, I blame phone autocorrect
        
         | n_ary wrote:
         | Very strange, because for me, an aircraft (medium) is never
         | alive for more than 24h. A big one like 787 may be alive for up
         | to 72h (assuming longer routes). 50 days for me would be a dream
         | and a lot less headache but it is very expensive to keep an
         | aircraft powered that long with ground power.
        
         | rodgerd wrote:
         | It's another thing on a checklist that can go wrong.
        
       | akira2501 wrote:
       | > This alarming-sounding situation
       | 
       | That's not what's alarming to me. What's alarming is that the
       | plane could possibly be in a position to be continuously powered
       | on for 51 days in the first place.
        
         | stavros wrote:
         | When a minute of downtime costs thousands, why wouldn't you
         | expect planes to be in constant utilization?
        
           | fallingknife wrote:
           | The number of flights varies a lot by time of day, so there
           | is nothing close to constant utilization.
        
             | Filligree wrote:
             | There's not much reason to turn them off outside of
             | maintenance. When they're parked, they're connected to grid
             | power.
        
               | n_ary wrote:
               | A parked aircraft is not kept powered when there is no
               | maintenance or other routine (cleaning, checks,
               | certification, preparation, restocking, etc.).
               | 
               | It is very surprising how many comments here claim the
               | contrary.
               | 
               | Even when parked for the next flight, until resupply and
               | cargo routines are declared, it is also not powered.
        
               | thecosmicfrog wrote:
               | Airliners are regularly and routinely shut down. "Cold
               | and dark" is a common startup procedure for the first
               | flight of the day.
        
             | CactusOnFire wrote:
             | I've flown with airlines before where there was a cascading
             | delay due to a "plane deficit" at the terminal (not the
             | technical term, that's my own). Not to say it's always
             | uptime, but I imagine there are instances of constant
             | uptime.
        
               | fallingknife wrote:
               | They can't just change things up on a dime like that.
               | Even if it's 3 AM and most planes are sitting on the
               | ground they can't just be used for your flight like that
               | because they are all scheduled to take off in the morning
               | rush a few hours later.
        
           | akira2501 wrote:
           | > why wouldn't you expect planes to be in constant
           | utilization?
           | 
           | They require weekly maintenance which takes them out of
           | service for at least 12 hours.
           | 
           | What we may think of as 'constant utilization' is quite
           | different in a regulated fleet environment like airlines.
        
             | hinkley wrote:
             | maintenance would happen with the aircraft in 'wheels on
             | ground' mode but that may not mean all systems are turned
             | off. I expect it's like a bug in the SMC on a computer. To
             | really turn it off you have to do some magic.
        
             | stavros wrote:
             | "Constant utilization" means "they aren't sitting idle",
             | not "they aren't undergoing necessary maintenance ever".
        
       | smcleod wrote:
       | I was speaking with a 787 pilot last Sunday, I told him that the
       | week before when I was at an airport there were two pilots
       | sitting next to me talking about how "This is the third bloody
       | 787 rescue we've had this month... I can't believe we had full
       | engine and <I think he said auxiliary?> failure at the same time"
       | - I asked him if this is common and he said "I hear of it, but I
       | haven't had that many major failures, but lots of little things -
       | last time I flew in from <city> a few moments after we touched
       | down we lost auxiliary power from the rear engine, all the cabin
       | lighting went black along with a number of other things,
       | thankfully we'd already significantly reduced speed and were
       | straight and already lost most of the speed we were carrying, so
       | we were fine and taxied to the disembark location, they had it up
       | and flying again within the day - but it certainly was
       | disconcerting to say the least".
       | 
       | I will be slightly paraphrasing from memory there, but certainly
       | was quite surprised how calm he was about the whole thing,
       | there's no way I'd board one of those things.
        
         | Filligree wrote:
         | APU failure maybe? That would be troublesome indeed; with no
         | engines and no APU you'd lose most instrumentation and a lot of
         | the hydraulics.
        
           | smcleod wrote:
           | I thought the guy I was speaking to mentioned something about
           | instrumentation but I wasn't 100% sure and that sounded more
           | serious so didn't mention it - but if the aux engine failing
           | would do that - I guess that lines up!
        
           | n_ary wrote:
           | There is also a RAT at the back that can be deployed to
           | generate some power (~5-10 minutes max) in case of severe
           | emergency in the air. It is what you hear sometimes, when the
           | aircraft is making a very shrill noise flying over your head.
           | 
           | However, if it is not a test flight, a RAT deployment should
           | make you very uncomfortable and worried...
        
             | ortusdux wrote:
             | https://en.wikipedia.org/wiki/Ram_air_turbine
        
             | iwontberude wrote:
             | I find it hard to believe that anyone reading this was
             | within earshot of a plane in a severe emergency and heard
             | this particular sound and since turbine engines are already
             | quite shrill I am basically just sorta confused who your
             | audience is for this suggestion.
        
         | jiggawatts wrote:
         | Modern two-engine planes like the 787 have an _auxiliary power
         | unit_ (APU) in the tail. This is a small turbine that runs a
         | generator and a pump for the hydraulics. It's typically only
         | turned on when the plane is on the ground, or if there's an
         | emergency in mid-air. It is also needed to start the main
         | engines so if the APU is faulty the plane will probably be
         | stuck where it is. In theory a 787 can take off with just one
         | engine but this is not very safe and wouldn't be done in any
         | but the most exceptional circumstances.
         | 
         | There are variations on this depending on the plane model, of
         | course. Some older planes can use an external starter for their
         | engines, but I think that's very rare now.
        
           | thecosmicfrog wrote:
           | Aircraft with INOP APUs can generally be "air started" with a
           | ground-based high-pressure air system. It's relatively common
           | and I've been on a plane that had to do the procedure. It was
           | entirely undramatic other than engines being started _before_
           | the pushback, but I doubt most passengers even noticed.
           | 
           | Now, interestingly, the 787 is a "bleedless" aircraft, so it
           | doesn't use high-pressure air from the APU to spool up the
           | engines. I believe it can use its hefty bank of lithium-ion
           | batteries to start its engines if the APU (and associated
           | electrical generator) is INOP.
           | 
           | Not a pilot/engineer - just an enthusiast. Someone more au
           | fait with the 787 might be able to correct me on the above.
        
             | hinkley wrote:
             | My understanding is that there was a push to modify the U
             | shaped tow trucks they use to position planes to have a
             | battery powered system to start the engines.
             | 
             | The idea being that the APU isn't particularly clean
             | burning, not compared to power plant emissions. It's been a
             | long while since I've heard anything about that plan, for
             | or against.
        
       | fnordpiglet wrote:
       | I'd note that commercial airplanes generally operate with 6-7 9's
       | of availability. For anyone that's ever built a system with 5
       | 9's, this is impressive. In fact it's impressive enough you
       | probably don't think twice about sleeping on a flight.
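
For readers unfamiliar with the "nines" shorthand above, this sketch converts a number of nines into allowed downtime per year. It is only the generic availability conversion; whether "availability" is the right measure for aircraft, and the 6-7 figure itself, are the comment's claims and are not checked here.

```python
SECONDS_PER_YEAR = 365 * 86_400

def downtime_seconds_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of `nines` nines.

    Example: 5 nines means 99.999% available, i.e. an unavailable
    fraction of 10**-5.
    """
    return SECONDS_PER_YEAR * 10.0 ** -nines

for n in (5, 6, 7):
    print(f"{n} nines: {downtime_seconds_per_year(n):.2f} s/year")
```

Five nines allows a little over five minutes of downtime a year; each extra nine divides that by ten.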
        
         | woah wrote:
         | If something goes wrong, does it matter whether you are asleep
         | or awake?
        
           | greenchair wrote:
           | woosh!
        
           | vkou wrote:
           | Only when a flight attendant is asking on the intercom: "We
           | don't mean to alarm anyone, but is anyone on board a pilot?"
           | and you happen to be one.
        
             | incognito124 wrote:
             | It's entirely a different kind of flying
             | 
             | All together
        
             | hooverd wrote:
             | Hopefully you didn't have the fish.
        
             | hggigg wrote:
             | I know a commercial pilot who used that as a joke once and
             | got in trouble. The plane in question had several pilots on
             | it but the rest of the passengers didn't find it funny for
             | obvious reasons.
        
             | LeonB wrote:
             | "We don't wish to cause any alarm, but is there any one on
             | board who is familiar with regular expressions, cron
             | expressions and parameter expansion rules in bash?"
        
       | joejohnson wrote:
       | This was news in 2020. Has it been fixed?
        
       | tedunangst wrote:
       | And 4.5 years later, what's new?
        
       | xyst wrote:
       | This company just can't stay out of the news. Their planes are
       | trash. Software is straight garbage. Many people have died
       | because of this company and suffered undue stress/anxiety because
       | of the massive dip in quality.
       | 
       | Boeing engineers/builders caught on audio stating they wouldn't
       | be caught dead in their own planes unless feeling suicidal.
        
       | boohoo123 wrote:
       | this is what happens when you hire based on checked checkboxes
       | and not qualifications.
        
       | hggigg wrote:
       | Had a similar problem to this many years ago. Happened every 24
       | days approximately and lost one user setting. Had a logic
       | analyser connected to it for days trying to reproduce the issue
       | in some way. Went to go for a piss and get a coffee one afternoon
       | and came back and there it was triggered!
       | 
       | What happened? Well it turns out there was a timer that no one
       | used that overflowed and caused an interrupt which wasn't handled
       | any more, the interrupt handler fell through, caused a halt and
       | the WDT fired, rebooting it, and some idiot hadn't stored that
       | one setting in the NVRAM.
       | 
       | So then we had more problems. 5000 things with EPROMs in that
       | were rebooting every 24 days which were spread all over the
       | planet. Many questions to ask over how the hell it ended up like
       | that.
       | 
       | I hope people are asking these sorts of questions at Boeing.
       | 
       | Edit: also the source code we had did not match what was on the
       | devices. Turned out the engineer who provided the hex file hadn't
       | copied that code to the file server and had left a year
       | beforehand. We didn't find that until the WDT fired and piqued our
       | interest and could reproduce it on the dev board because the
       | software was different (should have checked that past the label
       | on the ROM which was wrong!)
        
       ___________________________________________________________________
       (page generated 2024-10-24 23:01 UTC)