https://stackoverflow.blog/2021/05/10/dont-push-that-button-exploring-the-software-that-flies-spacex-starships/

Don't push that button: Exploring the software that flies SpaceX rockets and starships

May 10, 2021

Charles R. Martin and Ben Popper

Editor's note: All this week, we're running articles about the software and engineering behind SpaceX's rockets, starships, and satellite internet. Each article covers a different part of the process. We hope you find it as exciting as we do! Check out the full series here.

Spaceflight, from the beginning, has depended on computers - both on the ground and in the spacecraft. SpaceX has carried that dependence to a new level. We recently spoke with Steven Gerding, Dragon's software development lead, about the special challenges of developing software for SpaceX's many missions.

On April 23, 2021, SpaceX and NASA launched Dragon's second operational mission (Crew-2) to the International Space Station, the first human spaceflight mission to fly astronauts on a flight-proven Falcon 9 and Dragon. Approximately 24 hours later, Dragon autonomously docked with the Station, marking the first time two Crew Dragons were attached simultaneously to the orbiting laboratory. This marks the beginning of a new era for SpaceX, one where it will aim to routinely fly astronauts to the ISS.
The actual work of software development by vehicle engineers such as Gerding is largely done using C++, which has been the mainstay of the company's code since its early days. The software reads text-based configuration files. "We invented simple domain-specific languages to express those things, such that other engineers in the company who are not software engineers can maybe configure it."

Flight software for rockets at SpaceX is structured around the concept of a control cycle. "You read all of your inputs: sensors that we read in through an ADC, packets from the network, data from an IMU, updates from a star tracker or guidance sensor, commands from the ground," explains Gerding. "You do some processing of those to determine your state, like where you are in the world or the status of the life support system. That determines your outputs - you write those, wait until the next tick of the clock, and then do the whole thing over again."

The control cycle highlights some of the performance requirements of the software. "On Dragon, some computers run [the control cycle] at 50 Hertz and some run at 10 Hertz. The main flight computer runs at 10 Hertz. That's managing the overall mission and sending commands to the other computers. Some of those need to react faster to certain events, so those run at 50 Hertz."

A wide variety of machines talk to the central flight system. "We have inputs from sensors all over the vehicle, all kinds of different sensors." Many measure internal values critical to the health of the ship and crew. "Temperatures are important. For crewed vehicles, we have oxygen and carbon dioxide sensors, cabin pressure sensors and things like that." Another set of sensors looks outward to aid in navigation and telemetry. "That would be like the IMU, GPS, and star trackers." Once the vehicle is close enough to the space station, it also uses laser range finders.

The other side of the control cycle is the outputs.
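The read-process-write-wait loop Gerding describes can be pictured with a short C++ sketch. This is only an illustration under stated assumptions, not SpaceX code: the types `Inputs`, `State`, and `Outputs`, the functions `determine_state`, `compute_outputs`, and `run_control_cycle`, and the pressure limits are all hypothetical names and values chosen for the example.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <map>
#include <string>
#include <thread>

// Every name and number below is a hypothetical illustration.

struct Inputs {
    double cabin_pressure_kpa = 0.0;     // e.g. a sensor read through an ADC
    bool ground_abort_command = false;   // e.g. a command from the ground
};

struct State {
    bool pressure_ok = true;
    bool abort_requested = false;
};

struct Outputs {
    bool vent_valve_open = false;             // actuator command
    std::map<std::string, double> telemetry;  // key-value telemetry
};

// Process the inputs to determine the current state.
inline State determine_state(const Inputs& in) {
    State s;
    // Made-up limits, only for the example.
    s.pressure_ok = in.cabin_pressure_kpa >= 95.0 && in.cabin_pressure_kpa <= 105.0;
    s.abort_requested = in.ground_abort_command;
    return s;
}

// The state determines the outputs: one actuator command plus telemetry.
inline Outputs compute_outputs(const Inputs& in, const State& s) {
    Outputs out;
    out.vent_valve_open = in.cabin_pressure_kpa > 105.0;
    out.telemetry["cabin_pressure_kpa"] = in.cabin_pressure_kpa;
    out.telemetry["pressure_ok"] = s.pressure_ok ? 1.0 : 0.0;
    return out;
}

// Runs read -> process -> write -> wait for `ticks` cycles at `rate_hz`.
// Returns the number of completed cycles.
inline int run_control_cycle(int rate_hz, int ticks,
                             const std::function<Inputs()>& read_inputs,
                             const std::function<void(const Outputs&)>& write_outputs) {
    assert(rate_hz > 0);
    using Clock = std::chrono::steady_clock;
    const Clock::duration period = std::chrono::duration_cast<Clock::duration>(
        std::chrono::microseconds(1000000 / rate_hz));
    Clock::time_point next_tick = Clock::now();
    int completed = 0;
    for (int i = 0; i < ticks; ++i) {
        const Inputs in = read_inputs();           // read all inputs
        const State s = determine_state(in);       // determine state
        write_outputs(compute_outputs(in, s));     // write outputs
        ++completed;
        next_tick += period;                       // wait for the next tick
        std::this_thread::sleep_until(next_tick);
    }
    return completed;
}
```

Scheduling against an absolute `next_tick` rather than sleeping for a fixed interval keeps the loop from drifting when one cycle takes longer than another. In this sketch, a 50 Hertz computer would call `run_control_cycle(50, ...)` and a 10 Hertz one `run_control_cycle(10, ...)`.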
"There are two different types of outputs. One is to actually 'open or close a valve' or 'turn a switch on or off'.' The other one is telemetry, which is basically a stream of key-value pairs that, every 20 to 100 milliseconds, tell you the value of a certain thing." Sometimes the results come directly from the sensors as raw data. But other times processing is involved. "It can be some kind of computed value from the software, like the current value for our state machine or the result of an algorithm that's going to drive an output." When the vehicle is on the ground, the data goes over a hardwired connection that provides a high data rate. "Once it lifts off, there are different communication systems where we can pipe varying subsets of that telemetry down to the ground." Once it gets to the ground, systems exist that let operators look at the instantaneous values and make decisions in terms of commanding the vehicle. There's also a system that stores critical data for posterity, something that is quite important when you plan to reuse booster rockets and shuttles on future missions. Dragon currently autonomously docks to the International Space Station and ultimately, the goal is for the vehicle to be fully autonomous. "We do have the ability for the astronauts to take control and steer the vehicle if needed - that was a capability we demonstrated on the Dragon Demo-2 mission," said Gerding. [freemium-D2D-728x90-1-1] We asked what happens if there's a malfunction. "It's more obvious, I guess, what to do when there are hardware failures. We have copies of hardware, whether it's the computer hardware or the sensors or actuators, and so we detect those failures and kind of route around them." Gerding points out that there's no way to protect against any arbitrary software bug. "We try to design the software in a way that if it were to fail, the impact of that failure is minimal." 
For example, if a software error were to crop up in the propulsion system, that wouldn't affect the life support system or the guidance system's ability to steer the spacecraft, and vice versa. "Isolating the different subsystems is key." The software is designed defensively, such that even within a component, SpaceX tries to isolate the effects of errors. "We're always checking error codes and return values. We also have the ability for operators or the crew to override different aspects of the algorithm."

A big part of the total software development process is verification and validation. "Writing the software is some small percentage of what actually goes into getting it ready to fly on the space vehicle."

With the first demonstration mission (Demo-1) that went to the space station, the software was required by NASA to be tolerant to any two faults in the system. "We implemented this triple string computer architecture and we needed the system to drive it." Gerding had some distributed systems experience from working at Google previously, making him a good fit for the new task. "There were only 10 people on the software team at that time. I picked it up and went with it. I find that kind of stuff, distributed systems, really interesting."

Uptime requirements were treated differently at Google. "You would really want your process to fail, if something anomalous happened. It was one of thousands of similar processes which would then be restarted. If you got enough of those failures, you would be paged and could spend some time figuring out what the problem was and building a solution to address it." At Google, these mishaps were a useful signal among the noise.

But that approach doesn't work for crewed rockets. "At SpaceX we really don't want our processes to fail as a result of a software failure. We'd rather just continue with the rest of the software that actually isn't impacted by that failure.
We still need to know about that failure and that's where the telemetry factors in, but we want things to keep going, controlling it the best that we can."

There is a lot more work that goes into crafting the code which put Baby Yoda into space last November. We'll have another article on their space-based internet satellites, Starlink, tomorrow. If you want to learn more about what it's like to work as a vehicle engineer at SpaceX, check out their careers page.

Part two of our Software in Space series is now live: Building a Space Based ISP

Tags: software in space, spacex
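The defensive style described in the article (check every return value, report a failure through telemetry, and keep the unaffected subsystems running) might look roughly like the following C++ sketch. It is an assumption-laden illustration rather than SpaceX's design: `Status`, `Subsystem`, `Telemetry`, and `run_all` are hypothetical names, and a real system would isolate subsystems far more thoroughly than a loop over callbacks.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical names throughout; an illustration only.

enum class Status { Ok = 0, SensorFault = 1, InternalError = 2 };

struct Subsystem {
    std::string name;
    std::function<Status()> step;  // one control-cycle step, returns an error code
};

using Telemetry = std::map<std::string, double>;

// Runs every subsystem once. A failure in one subsystem is recorded in
// telemetry and counted, but it never stops the remaining subsystems.
inline int run_all(const std::vector<Subsystem>& subsystems, Telemetry& telemetry) {
    int failures = 0;
    for (const Subsystem& sub : subsystems) {
        // A missing step is treated as an error instead of crashing the loop.
        const Status status = sub.step ? sub.step() : Status::InternalError;
        // Report the error code as a key-value telemetry entry.
        telemetry[sub.name + ".status"] = static_cast<double>(static_cast<int>(status));
        if (status != Status::Ok) {
            ++failures;  // note the failure, then carry on with the rest
        }
    }
    return failures;
}
```

In this sketch, if a hypothetical `propulsion` step returns `Status::InternalError`, a `life_support` step later in the list still runs, and both status codes end up in the telemetry map for operators to inspect.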
11 Comments

Gurkchen says: 10 May 21 at 12:49
"We're always checking error codes and return values" Wow! That's definitely rocket science!

* Vorac says: 10 May 21 at 7:51
I've always loved exceptions because they force the caller to "check the error code". ... It seems that either I am smarter than a team of Best of the Best OR there's indeed something fishy to stack unwinding in hard realtime contexts.

* Kudzi says: 10 May 21 at 10:52
I like the technology and the security around the software. We CARE exploring our space because of such technology. Well done and keep it up.

* Rob says: 11 May 21 at 12:54
Very good interesting. But what's an ADC and an IMU? Breaking the SpaceX rule against acronyms? LOL

Forrest Hopkins says: 10 May 21 at 2:49
Cool series! Looking forward to the other posts!

Jeremy says: 10 May 21 at 9:05
I wonder if they have a button somewhere in the interface that upon being pressed produces a notification that reads, "please do not press this button again"?

Brett McSweeney says: 11 May 21 at 1:58
I remember when Ada (named after Ada Lovelace, purportedly the first programmer) was developed and mandated by DoD for flight control systems.
Don't hear much about it these days, so I presume the SpaceX software is mostly C++.

C++ here says: 11 May 21 at 8:13
"Using C++, which has been the mainstay of the company's code" In the end, C++ is great for low-latency systems, not Java. Not even C.

* Paul says: 11 May 21 at 11:32
The antagonist of C is the complexity. This may sound counterintuitive, since everyone calls it 'simple'. Anyone who wrote at least half-complex software in C knows well how complex it is. You need a machine that never slips to write software in C. Any simple mistake, a copy-paste here and there, an off-by-one when handling strings, thousands of cases. Slip just once, and you have a bug that will silently stay unnoticed. In their field, it may mean a life-vs-death outcome. C++ is an order of magnitude safer. And in skilled hands, not a bit slower. Even faster, sometimes. After all, the compiler (+ smart language design) is the machine that helps you.

Nathan Myers says: 11 May 21 at 10:21
Given the Google connection (Google is allergic to exception handling, for notoriously bad reasons) and the remark about checking result codes, it seems likely the code does not make effective use of exceptions. Avoiding use of exceptions has unfortunate consequences for the architecture and maintainability of systems, something visible in Google operations. People sometimes insist that they are incompatible with real-time systems, but the relaxed 100ms and 20ms cycles here leave huge margins that exceptions would have no trouble fitting in.

Hitul Mistry says: 11 May 21 at 12:27
Microservices has lots of things similar to spacetech.