[HN Gopher] Boeing 787s must be reset every 51 days or 'misleadi...
___________________________________________________________________
Boeing 787s must be reset every 51 days or 'misleading data' is
shown to pilots
Author : jakey_bakey
Score : 94 points
Date : 2024-10-24 20:19 UTC (2 hours ago)
(HTM) web link (www.theregister.com)
(TXT) w3m dump (www.theregister.com)
| tomudding wrote:
| (2020)
| shadowgovt wrote:
| This is remarkably business-as-usual for airplane electronics.
|
| As a more mundane example: the wifi on planes does temporary
| [edit: DHCP, not NAT] leases. But the system on many has
| expiration windows on the order of hours, possibly more than a
| day... Couple that with the number of passengers planes serve and
| busy routes can easily exhaust the lease pool.
|
| The solution: there's a button the flight attendants can push to
| reboot the router, dumping the lease table.
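To make the lease arithmetic concrete, here is a minimal sketch of a pool running dry. Every number in it (a 254-address pool, 24-hour leases, 150 devices per flight, 3-hour legs) is an assumption chosen for illustration, not a figure from any real in-flight system, and it simplifies by handing out all of a flight's leases at departure.

```python
def flights_until_exhaustion(pool_size, lease_hours, devices_per_flight,
                             hours_per_flight, max_flights=1000):
    """Return the 1-based flight on which the pool first runs out, else None."""
    active = []  # expiry times (in hours) of leases still outstanding
    for flight in range(1, max_flights + 1):
        now = (flight - 1) * hours_per_flight
        # Leases whose expiry time has passed go back into the pool.
        active = [expiry for expiry in active if expiry > now]
        if len(active) + devices_per_flight > pool_size:
            return flight
        active.extend([now + lease_hours] * devices_per_flight)
    return None


# Hypothetical numbers: long leases exhaust a small pool on the second leg,
# while one-hour leases on the same pool never do.
print(flights_until_exhaustion(254, 24, 150, 3))
print(flights_until_exhaustion(254, 1, 150, 3))
```

Under these assumptions, shortening the lease fixes the problem just as well as rebooting the router does; the reboot is simply the fix that needs no configuration change.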
| JosephRedfern wrote:
| Nitpicking here, but you mean DHCP rather than NAT, right?
| shadowgovt wrote:
| Yes; thank you.
| Matheus28 wrote:
| Even with super long leases, couldn't they just have a larger
| subnet? A /8 oughta do it.
|
| But I guess we're talking about the same people who made the
| mistake in the first place...
| jmholla wrote:
| To steelman the choice, the reserved /8 subnet is 10.x.x.x,
| which is often used for corporate networks, and the other larger
| reserved subnets see similar usage. People on the plane using
| WiFi are likely to access their corporate networks via VPN,
| potentially causing routing issues.
|
| Users VPNing into the reused address space for their own home
| VPN are probably knowledgeable enough to figure out what is
| going on and a small enough user base to not care about.
| Filligree wrote:
| Couldn't we spare a single extra /8 for airplanes to use?
|
| Though I suppose it's not worth it when you can hit
| 'reboot'.
| ordersofmag wrote:
| I'm no network guy so someone please explain why using
| 10.x.x.x on a plane might "potentially cause routing
| issues"? It doesn't jibe with what I understand about
| unrouteable address spaces. Is the 10.x.x.x space somehow
| different than the 192.168.x.x space that millions of
| people use VPNs out of every day (basically every WFH
| person on their cheap NAT'd home Wifi)?
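One way to see the concern raised in this subthread is to check whether the local network and the network behind the VPN claim the same addresses. This is a rough sketch using Python's standard `ipaddress` module; all three ranges are hypothetical examples, and how a given VPN client behaves when they clash is not modelled here.

```python
import ipaddress

# Hypothetical ranges, for illustration only.
plane_lan = ipaddress.ip_network("10.0.0.0/8")     # in-flight Wi-Fi using a whole /8
corp_net = ipaddress.ip_network("10.20.0.0/16")    # corporate range reached over the VPN
home_lan = ipaddress.ip_network("192.168.1.0/24")  # common home router default


def conflicts(local, remote):
    """True if some address belongs to both the local LAN and the remote network."""
    return local.overlaps(remote)


print(plane_lan.num_addresses)         # size of a /8
print(conflicts(plane_lan, corp_net))  # the plane LAN covers the corporate range
print(conflicts(home_lan, corp_net))   # the home LAN does not
```

When the ranges overlap, the client has two candidate routes for the same destination, one via the local Wi-Fi and one via the tunnel. 10.x.x.x is not special in this respect: the same clash occurs with 192.168.x.x if both ends happen to pick the same range. It is simply the private range that large organisations tend to use.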
| jcelerier wrote:
| 51 days * 86400 seconds * 1000
|
| => 4406400000
|
| 2^32
|
| => 4294967296
|
| the coincidence seems unlikely, it's basically ~~5 hours and a
| half~~ 30 hours of difference if one has a 1-ms counter increment
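For reference, the comparison above can be reproduced with a few lines. This sketch assumes, as the comment does, an unsigned 32-bit counter incremented once per millisecond; nothing here is taken from the actual 787 software.

```python
MS_PER_DAY = 86_400 * 1000


def wrap_days(bits, tick_ms=1.0):
    """Days until an unsigned counter of the given width wraps around."""
    return (2 ** bits) * tick_ms / MS_PER_DAY


days_32bit = wrap_days(32)
gap_hours = (51 - days_32bit) * 24  # how far the wrap falls short of 51 days

print(round(days_32bit, 2), "days until a 32-bit millisecond counter wraps")
print(round(gap_hours, 2), "hours short of 51 days")
```

The wrap lands at roughly 49.7 days, which is the figure usually quoted for the Windows 95 uptime bug.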
| Dylan16807 wrote:
| It's a day and a half difference, and since 2^32 is the
| _smaller_ number that would be pretty catastrophic. Pretty
| likely it's coincidence.
| sitkack wrote:
| Watch Windows 95 crash live as it exceeds 49.7 days uptime
| https://news.ycombinator.com/item?id=28340101
|
| Must be a northwest washington thing.
| throwbadubadu wrote:
| Not getting it.. yeah the famous 32 bit ms overflow after 49
| something days. But why then 51 here? Shouldn't they be
| required to reboot after 49 days please please? :D
| tines wrote:
| Possibly cumulative error in the timing source?
| icelancer wrote:
| This is even scarier than the base concern.
| jcelerier wrote:
| Or just ticking every 1.025 ms (e.g. at 975 Hz instead of
| 1 kHz)... that brings us to:
|
| (4406400000 - 1.025 * 2^32) / 1000
|
| so a difference of 1.12 hours with the "51 days" mention.
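The same idea can be run in reverse: instead of guessing a tick period, solve for the period at which a 32-bit counter would wrap at exactly 51 days. This is only a sketch of the arithmetic in the comment above; the 32-bit width and the function names are assumptions for illustration.

```python
MS_IN_51_DAYS = 51 * 86_400 * 1000  # 4_406_400_000, as in the thread


def tick_ms_for_wrap(target_ms, bits=32):
    """Tick period (ms) at which a counter of this width wraps after target_ms."""
    return target_ms / 2 ** bits


def shortfall_hours(tick_ms, target_ms=MS_IN_51_DAYS, bits=32):
    """Hours between the counter wrapping and the target duration."""
    return (target_ms - tick_ms * 2 ** bits) / (1000 * 3600)


exact_tick = tick_ms_for_wrap(MS_IN_51_DAYS)
print(round(exact_tick, 4), "ms per tick, i.e.", round(1000 / exact_tick, 1), "Hz")
print(round(shortfall_hours(1.025), 2), "hours short with a 1.025 ms tick")
```

A tick of about 1.026 ms would hit 51 days exactly, so the 1.025 ms guess only needs to be off in the fourth significant digit to close the remaining gap.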
| hinkley wrote:
| It's possible to run tasks so that, instead of starting every
| second, each one starts one second after the previous iteration
| finishes.
|
| So if you have something that checks the system health
| every millisecond, and keeps a count instead of a duration,
| then if it takes a couple microseconds to complete you
| might get something less than 86 million ticks per day
| instead of 86.4 million.
| Jtsummers wrote:
| The OS used on the 787 has a hard real-time scheduler.
| Tasks are started up at a specific frequency (set per
| task), run to completion or to the end of their time slot
| (set per task) and terminated. We had, IIRC, a strict
| 100ms slot for our bit of LRU software to do everything
| and it would be launched every 1s (from memory, that was
| 15 years ago). Information could be stored between
| executions so partial completion is something you could
| handle if needed by storing state information and using
| it at the start of the next iteration (we didn't need
| that, our tasks finished in the slot).
|
| You don't base the start of a future task on the end of
| the prior one, you base it on a fixed clock for these
| kinds of systems.
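The two scheduling styles discussed in this subthread give different tick counts over a day. A small sketch of the arithmetic, assuming a 1 ms period and a hypothetical 5 microseconds of work per iteration (both numbers are illustrative, not measured):

```python
US_PER_DAY = 86_400 * 1_000_000


def ticks_per_day_fixed_rate(period_us):
    """Each task starts on a fixed clock, regardless of how long the last run took."""
    return US_PER_DAY // period_us


def ticks_per_day_fixed_delay(period_us, work_us):
    """Each task starts period_us after the previous one finished."""
    return US_PER_DAY // (period_us + work_us)


print(ticks_per_day_fixed_rate(1000))      # 86.4 million ticks
print(ticks_per_day_fixed_delay(1000, 5))  # a little under 86 million ticks
```

With fixed-delay scheduling the counter runs slow by the work time on every iteration, so a counter that "should" wrap at 49.7 days would wrap later. With the fixed-clock scheduling described for the 787's OS, that drift does not accumulate.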
| amelius wrote:
| Maybe it takes 2 days to boot the entire thing?
| thamer wrote:
| Where did you get 5 hours and a half? It seems to be closer to
| 31 hours:
|
|     >>> round((4406400000 - 2**32)/(1000 * 3600), 3)
|     30.954
| jcelerier wrote:
| from me typing too quickly in bc, apparently :')
| avelis wrote:
| In the software world I call this an end user discovered issue.
| But when the issue involves a plane that is carrying actual
| souls, that can feel very scary.
|
| I am sure this has been resolved by now since it's from 2020.
| recursive wrote:
| I don't think airplane software ships updates the way npm
| packages do. I would be more surprised if this _is_ fixed.
| thecosmicfrog wrote:
| > I don't think airplane software ships updates the way npm
| packages do.
|
| I'd ideally like to sleep tonight, thanks.
| advisedwang wrote:
| I think from the point of view of Boeing, the FAA and the
| airlines, "put it in our maintenance checklist to reboot
| every 51 days" _is_ a fix.
| woah wrote:
| With that framing, this sounds like one of the easiest
| maintenance tasks imaginable. No wrenches or grease
| involved.
| Dylan16807 wrote:
| That depends on how much code was having trouble, and what you
| mean by "resolved".
|
| The safe option might be to avoid the situation, and I could
| imagine that even if there is a code update it might just make
| the plane balk at getting ready to take off after a certain
| amount of uptime.
| AmVess wrote:
| Scary would be right.
|
| Reminds me of the F-22 Raptor crossing the International
| Dateline error in 2007. They were flying a squadron of them
| from Hawaii to Japan. They crossed the IDL and all nav/fuel
| systems went down, as well as some communications gear.
|
| They only made it back because they were flying with tankers at
| the time, who led them back to base.
| Dylan16807 wrote:
| Previous: https://news.ycombinator.com/item?id=22761395
| https://news.ycombinator.com/item?id=33233827
|
| More interesting, a root cause analysis:
| https://news.ycombinator.com/item?id=33239443
| https://ioactive.com/reverse-engineers-perspective-on-the-bo...
|
| The 47 bit timestamp at 32MHz would explain the duration (Though
| not why it isn't 33MHz?).
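The 47-bit, 32 MHz figures in the comment above are easy to check. This sketch only reproduces that arithmetic; the counter width and both clock rates are taken from the comment, not verified against the linked analysis.

```python
SECONDS_PER_DAY = 86_400


def rollover_days(bits, hz):
    """Days until a free-running counter of the given width wraps at the given rate."""
    return 2 ** bits / hz / SECONDS_PER_DAY


print(round(rollover_days(47, 32_000_000), 2), "days at 32 MHz")
print(round(rollover_days(47, 33_000_000), 2), "days at 33 MHz")
```

At 32 MHz the rollover comes out just under 51 days, which fits the directive; at 33 MHz it would be closer to 49 days.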
| rich_sasha wrote:
| Scary as it is, is there any reason for a passenger jet to have
| uptime of more than, say, 24hrs? Wouldn't you just switch it off
| and on again between every flight, regardless?
|
| If this issue was in a car, we would never know as no one keeps
| their car running for 50 days straight.
| fnordpiglet wrote:
| I'll bet you the typical EV stays powered on 24/7 with reboots
| around OTA updates.
| garyfirestorm wrote:
| unsure what you mean here. most of the systems go to a sleep
| state in modern vehicles, EV or not. the 12v battery keeps
| only certain ECUs up - think ECUs that control alarm, lock
| and unlock state and any communication with the mobile app
| via LTE... but the rest of the systems are OFF. you don't
| want an EV battery to hit 0% and 12V to also hit 0% - that
| would basically make it a brick from what I understand,
| because EVs have contactors which need to shut for the
| battery to be 'engaged', and the 12V battery controls these
| contactors.
| fnordpiglet wrote:
| A car with an enormous rack of high capacity batteries able
| to accelerate an 8000 pound object to 60mph and sustain
| that for hundreds of miles generally doesn't depend on the
| backup battery for literally anything. It has so much
| excess energy storage in the form of electricity in the
| primary batteries it generally doesn't power down the
| onboard computers at all.
|
| Indeed when you get close to exhausting the main battery
| rack it starts selectively shutting down everything. I've
| never personally let mine get to 0% ever - but for instance
| a Tesla is continuously on, and if you use sentry mode it's
| not just on but the GPU is constantly doing classification
| of the environment to determine if someone is prowling your
| vehicle.
| themoonisachees wrote:
| Some of these planes are constantly flying as long as they're
| not in maintenance. A plane not in the air is a plane the
| company bought that's not currently generating profit.
| ceejayoz wrote:
| Overnight, planes tend to be plugged in to ground power, to
| ventilate, keep the batteries charged, for the cleaning crews,
| etc. Most get rebooted once in a while, but it's always
| possible one won't be, hence the directive to be certain.
|
| This particular problem has been known for years (the article
| is from 2020).
| n_ary wrote:
| Unfortunately, an aircraft has no "reboot". It is just a
| violent power cut. A lot of headache is introduced in
| non-critical aircraft software because there is no "graceful
| shutdown" or long power duration. In fact, certain hardware
| has an upper limit (much lower than a week) before which it
| needs one power cut (sometimes called a power cycle) or it
| suffers from various buffer overflows and counter overflows
| and starts acting mysteriously.
| kulahan wrote:
| >an aircraft has no "reboot". It is just a violent power
| cut
|
| Guess how I typically reboot things :)
| thfuran wrote:
| By traveling to Mexico and laying out bait along the
| migratory path of the butterflies?
| ceejayoz wrote:
| > Unfortunately, an aircraft has no "reboot". It is just a
| violent power cut.
|
| That's a reboot.
| jcgrillo wrote:
| It's amazing that's legal. Like, why do we accept software
| that does this? It can be done in such a way that these
| things don't happen. Put another way, why aren't the
| companies involved being fined and sued out of business?
| Why aren't their managers facing criminal negligence
| charges? It's outrageous.
| ceejayoz wrote:
| Because it works fine. A maintenance tech gets one extra
| line item on the weekly or monthly inspection checklist.
| jcgrillo wrote:
| It works fine until it doesn't and people die. At which
| point the blame falls on the maintenance crew? That's
| _wrong_. And where there's smoke there's fire. If the
| software has this horrible bug, likely the broken culture
| that created it has written worse, more subtle bugs.
| ceejayoz wrote:
| Commercial air travel in the US is incredibly safe. The
| last fatal crash was in 2009.
| Veserv wrote:
| Because there has never been a single commercial jetliner
| fatality caused by software in its intended operational
| domain failing to operate according to specification.
| That makes the commercial jetliner software development
| and deployment process by far the safest and highest
| reliability ever conceived by multiple orders of
| magnitude. We are talking in the 10-12 9s range.
|
| And just to get ahead of: "Well what about the 737 MAX",
| that was a system specification error, not due to "buggy"
| software failing to conform to its specification. The
| software did what it was supposed to do, but it should
| not have been designed to do that given the
| characteristics of the plane and the safety process
| around its usage.
| sitkack wrote:
| Many cars' control units continue to run while the car is off.
| If you want to reboot your vehicle, you need to unplug the 12v
| battery for at least a minute.
| jcgrillo wrote:
| On some cars (recent VWs in particular) when you plug the
| battery back in you need to twiddle some settings in the
| computer otherwise the charging circuit will fry the battery
| prematurely. We've gotten ahead of our skis with this
| nonsense, time to rein it in.
| symisc_devel wrote:
| This issue is notorious for BMW cars. You have to notify
| the ECU each time you install a new battery.
| dzhiurgis wrote:
| Ahhh, "program a new battery" $400 please.
| jcgrillo wrote:
| It's hard to imagine an interpretation of this behavior
| that doesn't involve manufacturers trying to punish
| independent mechanics and end users who service their own
| cars. Like, there's no way it's an "honest mistake",
| right?
|
| BTW I have an AGM ("advanced glass mat") battery in my
| 1995 Toyota which has a completely analog charging
| system, and it doesn't get cooked, so it's not because
| there's something special about the battery.
| HeyLaughingBoy wrote:
| Don't attribute to malice what can easily be explained by
| overstressed Systems Engineers trying to resolve multiple
| conflicting Requirements.
| RichardHesketh wrote:
| Rein. It's about controlling a horse, not an entire nation.
| jcgrillo wrote:
| Thanks, I blame phone autocorrect
| n_ary wrote:
| Very strange, because for me, a medium aircraft is never
| alive for more than 24h. A big one like the 787 may be alive for
| up to 72h (assuming longer routes). 50 days for me would be a
| dream and a lot less headache, but it is very expensive to keep
| an aircraft powered that long with ground power.
| rodgerd wrote:
| It's another thing on a checklist that can go wrong.
| akira2501 wrote:
| > This alarming-sounding situation
|
| That's not what's alarming to me. What's alarming is that the
| plane could possibly be in a position to be continuously powered
| on for 51 days in the first place.
| stavros wrote:
| When a minute of downtime costs thousands, why wouldn't you
| expect planes to be in constant utilization?
| fallingknife wrote:
| The number of flights varies a lot by time of day, so there
| is nothing close to constant utilization.
| Filligree wrote:
| There's not much reason to turn them off outside of
| maintenance. When they're parked, they're connected to grid
| power.
| n_ary wrote:
| A parked aircraft is not kept powered when there is no
| maintenance or other routine work (cleaning, checks,
| certification, preparation, restocking, etc.).
|
| It is very surprising how many comments here claim the
| contrary.
|
| Even when parked for the next flight, it is not powered until
| the resupply and cargo routines are declared.
| thecosmicfrog wrote:
| Airliners are regularly and routinely shut down. "Cold
| and dark" is a common startup procedure for the first
| flight of the day.
| CactusOnFire wrote:
| I've flown with airlines before where there was a cascading
| delay due to a "plane deficit" at the terminal (not the
| technical term, that's my own). Not to say it's always
| uptime, but I imagine there are instances of constant
| uptime.
| fallingknife wrote:
| They can't just change things up on a dime like that.
| Even if it's 3 AM and most planes are sitting on the
| ground they can't just be used for your flight like that
| because they are all scheduled to take off in the morning
| rush a few hours later.
| akira2501 wrote:
| > why wouldn't you expect planes to be in constant
| utilization?
|
| They require weekly maintenance which takes them out of
| service for at least 12 hours.
|
| What we may think of as 'constant utilization' is quite different
| in a regulated fleet environment like airlines.
| hinkley wrote:
| maintenance would happen with the aircraft in 'wheels on
| ground' mode but that may not mean all systems are turned
| off. I expect it's like a bug in the SMC on a computer. To
| really turn it off you have to do some magic.
| stavros wrote:
| "Constant utilization" means "they aren't sitting idle",
| not "they aren't undergoing necessary maintenance ever".
| smcleod wrote:
| I was speaking with a 787 pilot last Sunday, I told him that the
| week before when I was at an airport there were two pilots
| sitting next to me talking about how "This is the third bloody
| 787 rescue we've had this month... I can't believe we had full
| engine and <I think he said auxiliary?> failure at the same time"
| - I asked him if this is common and he said "I hear of it, but I
| haven't had that many major failures, but lots of little things -
| last time I flew in from <city> a few moments after we touched
| down we lost auxiliary power from the rear engine, all the cabin
| lighting went black along with a number of other things,
| thankfully we'd already significantly reduced speed and were
| straight and already lost most of the speed we were carrying, so
| we were fine and taxied to the disembark location, they had it up
| and flying again within the day - but it certainly was
| disconcerting to say the least".
|
| I will be slightly paraphrasing from memory there, but certainly
| was quite surprised how calm he was about the whole thing,
| there's no way I'd board one of those things.
| Filligree wrote:
| APU failure maybe? That would be troublesome indeed; with no
| engines and no APU you'd lose most instrumentation and a lot of
| the hydraulics.
| smcleod wrote:
| I thought the guy I was speaking to mentioned something about
| instrumentation but I wasn't 100% sure and that sounded more
| serious so didn't mention it - but if the aux engine failing
| would do that - I guess that lines up!
| n_ary wrote:
| There is also a RAT at the back that can be deployed to
| generate some power (~5-10 minutes max) in case of a severe
| emergency in the air. It is what you sometimes hear when an
| aircraft makes a very shrill noise flying over your head.
|
| However, if it is not a test flight, a RAT deployment should
| make you very uncomfortable and worried...
| ortusdux wrote:
| https://en.wikipedia.org/wiki/Ram_air_turbine
| iwontberude wrote:
| I find it hard to believe that anyone reading this was
| within earshot of a plane in a severe emergency and heard
| this particular sound and since turbine engines are already
| quite shrill I am basically just sorta confused who your
| audience is for this suggestion.
| jiggawatts wrote:
| Modern two-engine planes like the 787 have an _auxiliary power
| unit_ (APU) in the tail. This is a small turbine that runs a
| generator and a pump for the hydraulics. It's typically only
| turned on when the plane is on the ground, or if there's an
| emergency in mid-air. It is also needed to start the main
| engines so if the APU is faulty the plane will probably be
| stuck where it is. In theory a 787 can take off with just one
| engine but this is not very safe and wouldn't be done in any
| but the most exceptional circumstances.
|
| There are variations on this depending on the plane model, of
| course. Some older planes can use an external starter for their
| engines, but I think that's very rare now.
| thecosmicfrog wrote:
| Aircraft with INOP APUs can generally be "air started" with a
| ground-based high-pressure air system. It's relatively common
| and I've been on a plane that had to do the procedure. It was
| entirely undramatic other than engines being started _before_
| the pushback, but I doubt most passengers even noticed.
|
| Now, interestingly, the 787 is a "bleedless" aircraft, so it
| doesn't use high-pressure air from the APU to spool up the
| engines. I believe it can use its hefty bank of lithium-ion
| batteries to start its engines if the APU (and associated
| electrical generator) is INOP.
|
| Not a pilot/engineer - just an enthusiast. Someone more au
| fait with the 787 might be able to correct me on the above.
| hinkley wrote:
| My understanding is that there was a push to modify the U
| shaped tow trucks they use to position planes to have a
| battery powered system to start the engines.
|
| The idea being that the APU isn't particularly clean
| burning, not compared to power plant emissions. It's been a
| long while since I've heard anything about that plan, for
| or against.
| fnordpiglet wrote:
| I'd note that commercial airplanes generally operate with 6-7 9's
| of availability. For anyone that's ever built a system with 5
| 9's, this is impressive. In fact it's impressive enough you
| probably don't think twice about sleeping on a flight.
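For a sense of scale, "nines" translate into allowed downtime per year as follows. This is a generic conversion, not a statement about how aircraft availability is actually measured; the 365-day year is a simplifying assumption.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def downtime_seconds_per_year(nines):
    """Allowed downtime per year at an availability of 1 - 10**-nines."""
    return SECONDS_PER_YEAR * 10 ** (-nines)


for n in (5, 6, 7):
    print(n, "nines:", round(downtime_seconds_per_year(n), 2), "seconds per year")
```

Five nines allows a bit over five minutes a year; each additional nine divides that by ten.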
| woah wrote:
| If something goes wrong, does it matter whether you are asleep
| or awake?
| greenchair wrote:
| woosh!
| vkou wrote:
| Only when a flight attendant is asking on the intercom: "We
| don't mean to alarm anyone, but is anyone on board a pilot?"
| and you happen to be one.
| incognito124 wrote:
| It's entirely a different kind of flying
|
| All together
| hooverd wrote:
| Hopefully you didn't have the fish.
| hggigg wrote:
| I know a commercial pilot who used that as a joke once and
| got in trouble. The plane in question had several pilots on
| it but the rest of the passengers didn't find it funny for
| obvious reasons.
| LeonB wrote:
| "We don't wish to cause any alarm, but is there any one on
| board who is familiar with regular expressions, cron
| expressions and parameter expansion rules in bash?"
| joejohnson wrote:
| This was news in 2020. Has it been fixed?
| tedunangst wrote:
| And 4.5 years later, what's new?
| xyst wrote:
| This company just can't stay out of the news. Their planes are
| trash. Software is straight garbage. Many people have died
| because of this company and suffered undue stress/anxiety because
| of the massive dip in quality.
|
| Boeing engineers/builders caught on audio stating they wouldn't
| be caught dead in their own planes unless feeling suicidal.
| boohoo123 wrote:
| this is what happens when you hire based on checked checkboxes
| and not qualifications.
| hggigg wrote:
| Had a similar problem to this many years ago. Happened every 24
| days approximately and lost one user setting. Had a logic
| analyser connected to it for days trying to reproduce the issue
| in some way. Went to go for a piss and get a coffee one afternoon
| and came back and there it was triggered!
|
| What happened? Well it turns out there was a timer that no one
| used that overflowed and caused an interrupt which wasn't handled
| any more, the interrupt handler fell through, caused a halt and
| the WDT fired, rebooting it, and some idiot hadn't stored that
| one setting in the NVRAM.
|
| So then we had more problems. 5000 things with EPROMs in that
| were rebooting every 24 days which were spread all over the
| planet. Many questions to ask over how the hell it ended up like
| that.
|
| I hope people are asking these sorts of questions at Boeing.
|
| Edit: also the source code we had did not match what was on the
| devices. Turned out the engineer who provided the hex file hadn't
| copied that code to the file server and had left a year
| beforehand. We didn't find that until the WDT fired and piqued our
| interest and could reproduce it on the dev board because the
| software was different (should have checked that past the label
| on the ROM which was wrong!)
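A period of roughly 24 days is what a signed 32-bit millisecond counter would produce. The comment does not say what width or tick rate the timer had, so the sketch below is purely an assumption that happens to match the reported interval.

```python
import ctypes

MS_PER_DAY = 86_400 * 1000

# Assumption: a signed 32-bit counter incremented once per millisecond.
days_to_overflow = 2 ** 31 / MS_PER_DAY

# ctypes integer types do no overflow checking, so storing an out-of-range
# value mimics the wraparound a C int32_t would show on typical hardware.
last_good = ctypes.c_int32(2 ** 31 - 1).value
wrapped = ctypes.c_int32(2 ** 31).value

print(round(days_to_overflow, 2), "days until overflow")
print(last_good, "->", wrapped)
```

One tick past the maximum, the value flips to the most negative number, which is the kind of event that can raise an overflow interrupt nobody remembered to handle.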
___________________________________________________________________
(page generated 2024-10-24 23:01 UTC)