[HN Gopher] How to avoid a BSOD on your 2B dollar spacecraft
       ___________________________________________________________________
        
       How to avoid a BSOD on your 2B dollar spacecraft
        
       Author : linebeck
       Score  : 55 points
       Date   : 2024-09-25 18:40 UTC (4 hours ago)
        
 (HTM) web link (clarkwakeland.com)
 (TXT) w3m dump (clarkwakeland.com)
        
       | sharpshadow wrote:
       | One must have balls of steel to run windows on a spaceship.
        
         | perching_aix wrote:
         | Personally, I wouldn't be stoked to run Linux on them either to
         | be honest. But both are being done. Practicality rules I
         | suppose.
        
           | anonzzzies wrote:
           | Having sourcecode to everything would make me trust things
           | more as at least we could fix things without calling MS.
           | 
           | But what would you run? QNX? BSD?
        
             | withinboredom wrote:
             | Just because you have the source code, doesn't mean you
             | have the knowledge to fix it before you die.
        
               | anonzzzies wrote:
               | Sure, but without it, you stand no chance at all.
        
               | exe34 wrote:
               | windows is source-available if you have deep enough
               | pockets: https://www.microsoft.com/en-
               | us/sharedsource/enterprise-sour...
        
               | anonzzzies wrote:
               | Anyway, I was responding to someone who wouldn't run
               | either on a spaceship so I still would like to know what
               | they would want to run. I am from a formal verification
               | school of thought, so I would want something sel4.
        
               | exe34 wrote:
               | I'd want manual backup pushbuttons.
        
               | foobar1962 wrote:
               | And another box close enough so the CD tray can press
               | them when it opens.
        
               | 0cf8612b2e1e wrote:
               | Does Microsoft also provide you the tools to build it? I
               | assume there are many Microsoft internal tools,
               | libraries, etc required to compile anything of note.
               | Presumably it has been dogfooded for so long it would be
               | impossible to bootstrap without some number of binary
               | artifacts in hand.
        
               | withinboredom wrote:
               | I feel like this misses the point so much that it might
               | as well be nonsense.
               | 
               | A better way to put my argument is: could an average mom
               | build Linux on specialized hardware in space? If the
               | answer is "yes", then you may have a point.
               | 
               | I don't think the answer is yes.
        
               | johnisgood wrote:
               | No, but you can review the source code before (or at any
               | point), whereas with Windows, you cannot even do that.
        
               | withinboredom wrote:
               | I've occasionally worked on drivers for windows and
               | linux. In either case, I didn't really need to read the
               | source code; neither was it a valuable proposition. If
               | the advertised API didn't do what it was supposed to, I
               | likely wouldn't have understood enough to fix it: and
               | this is my point.
               | 
               | Just because you can read it, doesn't mean you can or
               | will be able to actually fix it; not because of
               | technicality, but because of personal knowledge.
               | 
               | In this case, they are both black boxes.
        
               | johnisgood wrote:
               | I mean, I agree, I am just saying that it is better (in
               | general) to have the source than not having it.
        
               | withinboredom wrote:
               | How so?
               | 
               | I once spent three days trying to figure out an issue,
               | stepping line by line through hadoop (after figuring out
               | the issue was in hadoop and not my own code). Yay, I
               | proved the issue was actually in Java itself. Guess what
               | happened next? We avoided the bug. Why?
               | 
               | - We couldn't update Java.
               | 
               | - We couldn't change hadoop because we were using a
               | packaged solution. So, we just filed a bug with them.
               | 
               | Had the source not been available, we would have just
               | skipped all of that, and it would have been our vendor's
               | problem 3 days earlier.
        
               | icehawk wrote:
               | For something like the spacecraft in the article, you
               | absolutely would have the ability to get access to the
               | Windows source code.
        
               | johnisgood wrote:
               | I have not heard of anyone building their own, custom
               | Windows though, how common is it? I do not see Windows
               | forks around either (I get it, it would not be legal).
        
             | GlenTheMachine wrote:
             | VxWorks, usually
        
             | redleader55 wrote:
             | If you ever count how many patches are in between stable
             | point releases - eg. 6.10.5 to 6.10.6, you'll see having
             | the source code is not enough. All of these patches are
             | fixes for something or another, not features, but fixes.
             | 
             | If you look at an LTS branch, you'll see there are hundreds
             | of point releases. Usually a point release is created once
             | every 5-10 days. I interpret that to mean bugs were not
             | found until many weeks after the LTS branch was cut.
             | Obviously, not all of them affect you, but many patches
             | apply to important subsystems which affect you.
        
             | ssrc wrote:
             | VxWorks, LynxOS or RTEMS. RTEMS is open source.
        
           | wubrr wrote:
           | What's the practical benefit of using Windows here?
        
             | wil421 wrote:
             | The new galactic empire has an enterprise licensing
             | agreement with Microsoft.
        
             | jandrese wrote:
             | Easier to find developers who are comfortable with Windows.
        
           | jmclnx wrote:
           | Really, with Linux (or a BSD), you can make tweaks for free
           | to save memory and to only allow specific tasks and hardware.
           | Plus you could publish these changes and maybe they will be
           | accepted by upstream.
           | 
           | With Windows, you need to beg and pay Microsoft for
           | customizations and hope these changes will not cause other
           | issues.
           | 
           | Plus, most space projects are on tight and limited budgets
           | where management would rather spend on hardware than
           | software.
        
         | dylan604 wrote:
         | Well, if your account is only a Home Edition, you will not get
         | the same support as if you upgrade to Universal Galactic
         | Edition which has a LTS measured in generations.
        
         | hypeatei wrote:
         | Wonder if it's LTSC or the standard image with candy crush,
         | windows store, and Xbox game app bloat?
        
           | lpribis wrote:
           | Don't know if it's publicly available, but MS makes Windows
           | IoT which is a stripped down distro for embedded systems.
        
         | geepytee wrote:
         | Genuine question, what would you use instead and why?
        
           | MPSimmons wrote:
           | Depending on the complexity of the satellite, a linux node,
           | several linux nodes, or a LOT of linux nodes. Or a very small
           | embedded SoC if your satellite is very simple or has a
           | segregated payload.
        
           | jasonwatkinspdx wrote:
           | Real Time Operating System(s) like VxWorks.
           | 
           | Commodity equipment running specialized distros of Linux is a
           | growing thing however.
        
             | MPSimmons wrote:
             | Linux just mainlined the real-time patch that a lot of people
             | have used. Including SpaceX - https://ntrs.nasa.gov/api/cit
             | ations/20200002390/downloads/20...
             | 
             | Linux RT merge: https://git.kernel.org/pub/scm/linux/kernel
             | /git/torvalds/lin...
        
           | scottyah wrote:
           | I'm forgetting the name of it, but there's a special OS
           | designed for space. The main issue is that bits get flipped
           | all the time once you leave the protection of the
           | magnetosphere. On the surface of the earth, we generally
           | trust memory a lot more than you can in space, even with
           | special chips and shielding. It causes all sorts of weird
           | problems and slow-downs.
        
           | cogman10 wrote:
           | If I had barrels full of money to waste, probably a
           | microkernel architecture like fuchsia [1]. The barrels full
           | of money would be turning it into a real time OS. The benefit
           | of such an OS is if there's a bug in the drivers, the kernel
           | itself keeps on plugging along, it can dump and reload the
           | misbehaving driver without crashing.
           | 
           | [1] https://fuchsia.dev/
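
A rough sketch of the "dump and reload the misbehaving driver" idea described above. It is not how Fuchsia or any real microkernel is implemented: a real system isolates drivers in separate processes, while this toy uses a Python exception as a stand-in for a driver fault. The names `Supervisor`, `FlakyDriver`, and `DriverCrash` are hypothetical and exist only for illustration.

```python
class DriverCrash(Exception):
    """Stand-in for a fault inside an isolated driver (hypothetical)."""


class FlakyDriver:
    """Toy driver that crashes on chosen request numbers (hypothetical)."""

    def __init__(self, fail_on):
        self.fail_on = set(fail_on)

    def handle(self, request):
        if request in self.fail_on:
            raise DriverCrash(f"driver fault on request {request}")
        return f"ok:{request}"


class Supervisor:
    """Plays the role of the kernel: it survives driver faults and reloads."""

    def __init__(self, make_driver):
        self.make_driver = make_driver  # factory used to (re)load the driver
        self.driver = make_driver()
        self.restarts = 0

    def dispatch(self, request):
        try:
            return self.driver.handle(request)
        except DriverCrash:
            # Drop the faulty instance and load a fresh one. The supervisor
            # itself keeps running; only this one request is lost.
            self.driver = self.make_driver()
            self.restarts += 1
            return None


if __name__ == "__main__":
    sup = Supervisor(lambda: FlakyDriver(fail_on={2}))
    for r in range(4):
        print(r, sup.dispatch(r))
    print("restarts:", sup.restarts)
```

The design point is that the fault boundary sits around the driver, so a crash costs one request and a reload instead of taking the whole system down.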
        
         | wongarsu wrote:
         | The article doesn't actually involve Windows. They wanted to
         | avoid a satellite going into safemode, which they describe as
         | "the satellite equivalent of a blue screen of death". That's
         | honestly not even a good analogy. The headline is just bad.
        
         | hunter2_ wrote:
         | For anyone reading only headlines and comments, from TFA:
         | 
         | > If the watchdog timer has not been restarted and instead
         | times out after ~30 seconds, the satellite enters something
         | called safemode. Safemode is when all non critical functions
         | are automatically shut down and the satellite becomes entirely
         | focused on generating power by pointing its solar panels
         | towards the Sun and trying to reestablish any communication
         | that was lost. It's a state the vehicle goes into when
         | something bad happens [...] the satellite equivalent of a blue
         | screen of death.
         | 
         | If only Windows would be so kind!
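
A minimal sketch of the watchdog behavior quoted above, assuming a 30 second timeout as in the article. Everything else is an assumption for illustration: the `Watchdog` class, the `kick`/`tick` names, the one-second simulation step, and the choice to latch safemode once it trips. The actual flight software is written in C and is not shown in the article.

```python
class Watchdog:
    """Toy watchdog: software must kick it before `timeout_s` elapses."""

    def __init__(self, timeout_s=30.0):
        self.timeout_s = timeout_s
        self.last_kick = 0.0
        self.safemode = False

    def kick(self, now):
        # Healthy software restarts the timer. Assumption: once safemode
        # has been entered, kicks no longer clear it.
        if not self.safemode:
            self.last_kick = now

    def tick(self, now):
        # Called periodically; trips safemode if the timer has run out.
        if not self.safemode and now - self.last_kick >= self.timeout_s:
            self.safemode = True
        return self.safemode


def simulate(kick_times, end_time, timeout_s=30.0):
    """Step one simulated second at a time.

    Returns the time at which safemode was entered, or None if it never was.
    """
    wd = Watchdog(timeout_s)
    kicks = set(kick_times)
    for t in range(end_time + 1):
        if t in kicks:
            wd.kick(float(t))
        if wd.tick(float(t)):
            return t
    return None


if __name__ == "__main__":
    print(simulate([10, 20, 30, 40], end_time=60))  # kicks keep arriving in time
    print(simulate([10, 20], end_time=60))          # kicks stop after t=20
```

In the second call the last kick is at t=20, so the timer expires 30 seconds later and the simulated satellite drops into safemode.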
        
         | neuralRiot wrote:
         | Installing update 1 of 456. Please don't turn off your
         | spaceship.
        
         | lisper wrote:
         | They're not running Windows.
         | 
         | https://news.ycombinator.com/item?id=41651715
        
       | GlenTheMachine wrote:
       | There are a bunch of comments here asking why one would run
       | Windows on a spacecraft.
       | 
       | I am a spacecraft engineer. I don't see anything in the linked
       | article indicating that they are actually running Windows - the
       | BSOD claim is tongue-in-cheek, or at least that's how I read it.
       | I also don't know of anyone anywhere that runs Windows on a
       | spacecraft, with the exception of laptops used by astronauts.
       | Typically one runs vxWorks, or maybe QNX. Some experimental (high
       | risk, low cost) systems run Linux. Older spacecraft don't run any
       | OS at all, everything is running on bare metal, and that may be
       | true for a handful of current spacecraft as well.
       | 
       | Windows is used in some places by ground controllers, but these
       | days they tend to be running Linux a lot more often.
        
         | TrueDuality wrote:
         | Seconding the vxWorks and bare metal. Never seen Windows or
         | Linux on a satellite bus. Haven't really touched payloads but
         | I've seen some wonky things shipped to orbit by universities
         | and not all of them have been cubesat student projects.
        
           | nicce wrote:
           | Every Starlink runs with Linux.
           | 
           | The license list is a bit long:
           | 
           | https://www.starlink.com/assets/pdfs/Starlink-Open-Source-
           | Co...
        
         | zanthras wrote:
         | Linux (with realtime patch) is used very heavily in spacecraft
         | by SpaceX. So both in terms of high visibility/important/danger
         | (dragon 2) and high count (starlink) it is very widely used.
         | 
         | citation
         | https://old.reddit.com/r/spacex/comments/ncj4vz/we_are_the_s...
        
           | XorNot wrote:
           | I wonder how the integration of PREEMPT_RT is going to affect
           | that technology stack going forwards (I imagine slowly, but
           | it's there now).
        
             | yndoendo wrote:
             | Either save costs by integrating the new feature, or increase
             | costs by maintaining a custom kernel branch in the long run.
        
         | lisper wrote:
         | The author of TFA clarifies here:
         | 
         | https://news.ycombinator.com/item?id=41651715
         | 
         | TL;DR: the spacecraft is indeed not running Windows. It's
         | running a custom OS written in C.
        
       | aghilmort wrote:
       | Wendy's tablet menus in NYC are windows and it's like whyyyyy
       | just make them android web browser$$$$$$$
        
       | farceSpherule wrote:
       | Or you can avoid contracting with Boeing.
        
       | rdist wrote:
       | And here I thought we were going to rehash Crowdstrike ;-)
        
         | TrueDuality wrote:
         | Just a tactful reference hahah
         | 
         | > the US government isn't burning taxpayer dollars on a ten
         | figure spaceship just to have us push a Crowdstrike update on
         | it.
        
       | linebeck wrote:
       | Author here: I should clarify the satellite is not running
       | Windows. Instead, it's running its own custom OS written in C
       | called Flight Software (FSW) specifically designed for the
       | satellite onboard computer.
       | 
       | Re-reading the post, I see how the title, my analogies, and poor
       | attempts at humor would give the incorrect description of what's
       | happening with the satellite when it enters safemode. I'll amend
       | the post soon.
       | 
       | Thanks for the feedback, I'll be better next time.
        
         | barbegal wrote:
         | Could I ask you to clarify why avoiding safemode is so
         | important? In a non satellite system safemode means everything
         | is driven to a safe state which is fine during testing in the
         | lab.
         | 
         | Also do you not run these tests in an even more simulated
         | environment where there is only the flight computer and no real
         | hardware at all?
        
           | linebeck wrote:
           | Having discussed this same question with the more experienced
           | members of my team, the only conclusion I can draw is that
           | the customer (US Government) is incredibly risk averse. Any
           | unexpected entry into safemode would require a report,
           | multiple meetings with the customer, and them being pretty
           | angry. Their line of reasoning seems to be
           | "Safemode->Something is wrong->Why is something wrong? We're
           | not paying you to be wrong". I'm personally of the opinion
           | that safemode isn't that bad. It's fully recoverable and
           | shows the system is working properly.
           | 
           | We normally have a Functional Test Assembly (real computer
           | and some other hardware for testing) to run our tests
           | against, but we only have one setup and it is consistently
           | unreliable. This particular CLT was unable to get a clean run
           | in the lab but it was decided that the issues were related to
           | the lab setup rather than the actual test, so we moved
           | forward to run on the satellite (against our team's
           | protests).
           | 
           | This to me is the real crux of the issue: if we can't even
           | trust our own testing environment, what's the point of having
           | it at all? If the customer is so risk averse, why would we
           | take this chance? Needless to say, I don't think we'll be
           | running anything on the satellite without full FTA vetting
           | anytime in the near future.
        
             | Jtsummers wrote:
             | > Any unexpected entry into safemode would require a
             | report, multiple meetings with the customer, and them being
             | pretty angry. Their line of reasoning seems to be
             | "Safemode->Something is wrong->Why is something wrong?
             | We're not paying you to be wrong". I'm personally of the
             | opinion that safemode isn't that bad. It's fully
             | recoverable and shows the system is working properly.
             | 
             | To the last part first: Good that safe mode kicked in and
             | did the right thing, but now what? What _caused_ it to
             | enter safe mode in the first place?
             | 
             | That's why they care when it happens. If they don't know
             | why it's entering safe mode, they can't correct the actual
             | problems in the system.
        
               | axus wrote:
               | "Safemode is when all non critical functions are
               | automatically shut down and the satellite becomes
               | entirely focused on generating power by pointing its
               | solar panels towards the Sun and trying to reestablish
               | any communication that was lost."
               | 
               | The non-critical functions are all the things the
               | customer actually bought the satellite for. Cool that
               | it's still alive, but now the Space Internet / death
               | lasers / etc. are offline.
        
         | topspin wrote:
         | I understood you were using an analogy. Didn't even occur to me
         | that Windows was actually being used.
         | 
         | However, I did come away thinking there are other dysfunctions
         | at play in all of this. Perhaps an excessive amount of wheel
         | re-inventing.
        
       | dangoodmanUT wrote:
       | Step 1: Use linux
        
         | ksajh wrote:
         | Step 1: Read and understand the article
        
         | imoverclocked wrote:
         | Step 2: install vxworks
        
       ___________________________________________________________________
       (page generated 2024-09-25 23:00 UTC)