[HN Gopher] Aerugo - RTOS for aerospace uses written in Rust
       ___________________________________________________________________
        
       Aerugo - RTOS for aerospace uses written in Rust
        
       Author : todsacerdoti
       Score  : 67 points
       Date   : 2024-02-01 07:34 UTC (1 day ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | jiehong wrote:
       | Can someone clarify this part like I'm 5:
       | 
       | > Its design is inspired by purely functional programming
       | paradigm and transputers architecture.
       | 
       | > RTOS is implemented in a form of an executor instead of classic
       | scheduler and doesn't support preemption. Executor runs tasklets,
       | which are fine-grained units of computation, that execute a
       | processing step in a finite amount of time.
       | 
       | Transputers seem to be a leftover technology from the 80s, and
       | lack of preemption seems hard to reconcile with real time
       | guarantees?
        
         | flyinglizard wrote:
         | No preemption makes this an organizer more than an RTOS.
        
           | csdreamer7 wrote:
           | Would you please explain more?
        
             | thfuran wrote:
             | What happens when a tasklet takes too long?
        
               | qayxc wrote:
               | I haven't poked into the source code, but usually a QoS
               | event is fired, e.g. "deadline missed". What happens
               | exactly in that case is very application specific.
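
A minimal sketch of the "deadline missed" behaviour described above, assuming a non-preemptive executor that can only notice an overrun after a tasklet returns. The names `Tasklet`, `DeadlineMissed` and `run_once` are hypothetical and are not Aerugo's API; time is simulated by having each step report how many ticks it consumed.

```rust
/// One bounded processing step. It returns how many simulated ticks it consumed.
type Step = Box<dyn FnMut() -> u32>;

/// Hypothetical tasklet record: a name, a per-step deadline in ticks, and the step itself.
struct Tasklet {
    name: &'static str,
    deadline: u32,
    step: Step,
}

/// Hypothetical QoS event raised when a step overruns its deadline.
#[derive(Debug, PartialEq)]
struct DeadlineMissed {
    tasklet: &'static str,
    took: u32,
    deadline: u32,
}

/// Runs every tasklet once, to completion, in registration order.
/// There is no preemption, so an overrun is only detected after the step returns.
fn run_once(tasklets: &mut [Tasklet]) -> Vec<DeadlineMissed> {
    let mut events = Vec::new();
    for t in tasklets.iter_mut() {
        let took = (t.step)();
        if took > t.deadline {
            events.push(DeadlineMissed {
                tasklet: t.name,
                took,
                deadline: t.deadline,
            });
        }
    }
    events
}
```

What the application does with the returned events (log, degrade, reset) is left open here, since that part is application specific.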
        
               | addaon wrote:
               | > What happens when a tasklet takes too long?
               | 
               | The same thing that happens when 1+1 == 3, or when a task
               | tries to write to memory that it doesn't have permissions
               | for. The static analysis that your system relies on for
               | correct behavior is no longer valid, so a hardware belt-
               | and-suspender mechanism (a schedule overrun timer
               | interrupt, a lockstep core check failure, or an MPU
               | fault, respectively) resets or otherwise safe-states the
               | failed ECU and safety is assured higher up in the system
               | analysis.
        
             | kimixa wrote:
             | No preemption means that any realtime guarantees are
             | handled by user space, as the kernel relies on that
             | yielding to hit time guarantees. It means it doesn't have
             | to do much of the "difficult" work for a realtime system.
        
               | aidenn0 wrote:
               | On a preemptive OS, all multitasking with (non-CPU)
               | contended resources is, to a certain extent, cooperative.
               | If one task acquires a mutex and then never releases it,
               | no amount of priority-inheritance, highest-locker
               | semantics &c. will fix that. Making everything
               | cooperative increases the analysis needed because the CPU
               | becomes a manually managed contended resource.
               | 
               | In addition, while not being preemptive, it sounds like
               | executors are expected to run in bounded time, and the
               | system triggers an event when the time is exceeded.
               | 
               | Complete systems, not operating-systems, are realtime.
               | Operating systems can be non-realtime (most general
               | purpose schedulers are completely unsuited for hard
               | realtime systems), but at their best only make the design
               | of realtime systems tractable.
        
               | PaulDavisThe1st wrote:
               | You are confusing deadlock with scheduling. Sure,
               | deadlock can happen in the absence of "cooperation". And
               | lack of on-time scheduling can happen in a cooperatively
               | scheduled system in the absence of the appropriate
               | cooperation.
               | 
               | But scheduling has been considered by most kernel
               | designers to be the responsibility of the kernel, not the
               | participants (i.e. preemptively scheduled threads not
               | cooperatively).
               | 
               | Even if a thread is going to deadlock as soon as it
               | starts running (again), there is a huge difference
               | between that thread being scheduled at the right time and
               | it not being scheduled. You can fix the former (deadlock)
               | with better coding. You cannot fix the latter without
               | fixing the kernel.
        
               | aidenn0 wrote:
               | > You are confusing deadlock with scheduling. Sure,
               | deadlock can happen in the absence of "cooperation". And
               | lack of on-time scheduling can happen in a cooperatively
               | scheduled system in the absence of the appropriate
               | cooperation.
               | 
               | I was describing situations much broader than deadlock.
               | The following pseudocode is not deadlock, but
               | nevertheless a failure of cooperation:
               | WaitForMutex(m)
               | DoSomeReallyLongComputation()
               | 
               | My point was that the above code in a preemptively
               | scheduled system is as damaging to all tasks that will
               | contend for "m" as this code is in a cooperatively
               | scheduled system:
               | DoSomeReallyLongComputation()
               | 
               | > But scheduling has been considered by most kernel
               | designers to be the responsibility of the kernel, not the
               | participants (i.e. preemptively scheduled threads not
               | cooperatively).
               | 
               | Yes and no. Schedulers tend to have parameters. Realtime
               | systems will rely heavily on those parameters. Those
               | parameters will sometimes even include promises for
               | thread T to not run for more than X amount of time in a
               | period of Y time.
               | 
               | > Even if a thread is going to deadlock as soon as it
               | starts running (again), there is a huge difference
               | between that thread being scheduled at the right time and
               | it not being scheduled. You can fix the former (deadlock)
               | with better coding. You cannot fix the latter without
               | fixing the kernel.
               | 
               | This is true in a _preemptively_ scheduled kernel. It's
               | kind of tautological that fixing issues with resource X
               | needs to fix the kernel IFF X is managed by the kernel.
               | See also my above paragraph about kernel scheduler
               | parameters.
               | 
               | [edit]
               | 
               | Just saw who I was replying to. I suspect that you and I
               | have different visions of what a "Real Time System" is,
               | given that I'm thinking industrial control and you're
               | probably thinking audio. There's definitely overlap in
               | theory and discipline, but the hardware and software
               | stacks are rather different.
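
The point about `WaitForMutex(m)` followed by a long computation can be sketched with standard-library threads, which the OS schedules preemptively. This is an illustration only: `contend_for_mutex` is a hypothetical name, and the "long computation" is modelled as waiting on a channel so the result does not depend on timing.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Returns (blocked_while_held, acquired_after_release).
fn contend_for_mutex() -> (bool, bool) {
    let m = Arc::new(Mutex::new(0u32));
    let (locked_tx, locked_rx) = mpsc::channel::<()>();
    let (finish_tx, finish_rx) = mpsc::channel::<()>();

    let holder = {
        let m = Arc::clone(&m);
        thread::spawn(move || {
            // WaitForMutex(m)
            let mut guard = m.lock().unwrap();
            locked_tx.send(()).unwrap();
            // Stand-in for DoSomeReallyLongComputation(): runs until told to finish.
            finish_rx.recv().unwrap();
            *guard += 1;
            // The guard is dropped here, which releases the mutex.
        })
    };

    // Wait until the holder definitely owns the mutex.
    locked_rx.recv().unwrap();
    // Preemption lets this thread run, but it still cannot take the mutex.
    let blocked = m.try_lock().is_err();

    // Let the "long computation" end, then contend again.
    finish_tx.send(()).unwrap();
    holder.join().unwrap();
    let acquired = m.lock().map(|g| *g == 1).unwrap_or(false);

    (blocked, acquired)
}
```

Calling `contend_for_mutex()` yields `(true, true)`: the contender gets CPU time while the mutex is held, yet makes no progress on the shared resource until the holder cooperates by releasing it.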
        
               | PaulDavisThe1st wrote:
               | A kernel (because that's where interrupt handlers are
               | located) can ensure that a thread is scheduled within N
               | usecs of when it "ought to be" (which could be based on
               | some sort of time allocation algorithm, or simple
               | priorities or whatever other scheme may be in use). The
               | kernel can say "oh look, it's been N usecs, let's check
               | who is running and who is ready to run ... hey, time for
               | Thread 2 to run". This is preemptive scheduling.
               | 
               | No cooperative scheduling system can ensure this.
               | 
               | Your example involves poorly designed code, which is not
               | the responsibility of the scheduler. Its job is to make
               | sure that threads run when they "ought to" - it cannot
               | protect against priority inversions in user space _and_
               | ensure that RT guarantees are met (pick 1, and even then,
               | you lose).
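
The difference claimed here can be sketched as a small tick-based simulation. It is a toy model under stated assumptions, not any real kernel: `Task`, `first_run_ticks` and the `quantum` parameter are hypothetical, the "timer interrupt" fires every `quantum` ticks, and the highest-priority ready task always wins at a scheduling point.

```rust
/// Simulated task: ready from tick `release`, needs `work` ticks of CPU,
/// and a larger `priority` wins.
struct Task {
    release: u32,
    work: u32,
    priority: u8,
}

/// Returns, per task, the tick at which it first got the CPU.
/// With `preemptive == false` the scheduler only decides when the CPU is idle.
fn first_run_ticks(tasks: &[Task], preemptive: bool, quantum: u32) -> Vec<Option<u32>> {
    let quantum = quantum.max(1); // avoid a modulo by zero
    let mut remaining: Vec<u32> = tasks.iter().map(|t| t.work).collect();
    let mut first_run: Vec<Option<u32>> = vec![None; tasks.len()];
    let mut current: Option<usize> = None;
    let mut tick = 0u32;

    while remaining.iter().any(|&r| r > 0) {
        // Scheduling point: the CPU is idle, or the periodic timer fired.
        let timer_fired = preemptive && tick % quantum == 0;
        if current.is_none() || timer_fired {
            current = (0..tasks.len())
                .filter(|&i| remaining[i] > 0 && tasks[i].release <= tick)
                .max_by_key(|&i| tasks[i].priority);
        }
        // Run the chosen task for one tick.
        if let Some(i) = current {
            if first_run[i].is_none() {
                first_run[i] = Some(tick);
            }
            remaining[i] -= 1;
            if remaining[i] == 0 {
                current = None;
            }
        }
        tick += 1;
    }
    first_run
}
```

With a low-priority task needing 10 ticks from tick 0 and a high-priority task released at tick 3, a quantum of 2 starts the high-priority task at tick 4 in the preemptive case. In the cooperative case it starts at tick 10, after the first task has run to completion.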
        
         | qayxc wrote:
         | As far as my very limited understanding goes, the transputer
         | reference is simply referring to a highly parallel message-
         | passing system based on message queues, not actual hardware
         | implementations.
         | 
         | As for lack of preemption, that's explained by the system not
         | implementing a scheduler. The system "just" implements an
         | executor.
         | 
         | An Executor is a component that handles subscriptions (e.g. to
         | interrupts or events/messages), services (both clients and
         | servers), timers, and QoS events (deadline missed, invalid QoS
         | request, task died/unresponsive, etc.), which are implemented
         | as tasks (Aerugo calls them "tasklets") and registered with the
         | executor.
         | 
         | * Subscriptions are tasks that subscribe to a topic (of a
         | message).
         | 
         | * Service servers are tasks that are invoked when a request
         | from a client is received.
         | 
         | * Service clients are tasks that are invoked whenever a
         | response from a server is received.
         | 
         | * Timers trigger when a timer expires.
         | 
         | * QoS events are fired when a deadline has been missed, a task
         | died/became unresponsive, and other stuff like a QoS request
         | couldn't be met, etc.
         | 
         | Aerugo supports a subset of these capabilities, namely message
         | queues (for subscribers/clients and producers/servers), events
         | (IRQs, h/w signals, etc.), and cyclic execution (execute the
         | task n-times or forever).
         | 
         | Scheduling - i.e. the priority of things - is handled by the
         | user or built into the Executor itself. For Aerugo the former
         | is the case, so if you need control over priorities and order
         | of task execution, you'd have to implement it on top of it.
         | 
         | Now the realtime-part isn't really affected by any of this. It
         | still allows for weaker RT systems. RT just means there's a
         | time constraint on executed tasks. Weaker systems have average
         | runtime guarantees (e.g. tasks have to finish within a limit
         | _on average_, i.e. outliers are allowed). Stronger systems are
         | stricter in the sense that tasks _always_ have to finish within
         | the limit. The strongest systems even enforce an exact limit
         | (that is, tasks aren't even allowed to finish sooner - they
         | have to take _exactly_ a given amount of time).
         | 
         | Preemption is required if tasks have to be scheduled w.r.t.
         | different priorities, especially in multi-threaded scenarios.
         | Lack of it means that the executor will never interrupt a
         | running task to execute tasks that take priority - at least
         | from what I understand, which might be wrong.
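
The three strengths of time constraint described above (on average, always, exactly) can be written as predicates over a list of measured task runtimes. This is only a sketch of the definitions: the function names are hypothetical, runtimes and the limit share an arbitrary unit, and a real system would argue from worst-case analysis rather than from samples.

```rust
/// Weaker guarantee: the mean runtime stays within the limit, outliers allowed.
/// Compares the sum against limit * count to avoid floating point.
fn meets_on_average(runtimes: &[u64], limit: u64) -> bool {
    let total: u64 = runtimes.iter().sum();
    total <= limit * runtimes.len() as u64
}

/// Stronger guarantee: every single run finishes within the limit.
fn meets_always(runtimes: &[u64], limit: u64) -> bool {
    runtimes.iter().all(|&t| t <= limit)
}

/// Strongest guarantee: every run takes exactly the given amount of time.
fn meets_exactly(runtimes: &[u64], limit: u64) -> bool {
    runtimes.iter().all(|&t| t == limit)
}
```

For runtimes `[8, 9, 14, 9]` and a limit of 10, only `meets_on_average` holds, because the single run of 14 breaks the stricter two.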
        
       | sigmonsays wrote:
       | can someone please make an ode to "write in C" (1) entitled
       | "Write in Rust"
       | 
       | (1) https://www.youtube.com/watch?v=wJ81MZUlrDo
        
         | bee_rider wrote:
         | Was this song written in like 1990 and recorded/posted in 2013,
         | or something?
        
       | hlandau wrote:
       | A bit surprised to see a common MCU used here rather than one
       | designed for safety applications (e.g. TMS570). Is there any
       | rationale to this?
        
         | steveklabnik wrote:
         | I don't see a super explicit one, but:
         | https://activities.esa.int/4000140241
         | 
         | > The proposed activity is to evaluate the usage of Rust
         | programming language in space applications,
         | 
         | > The design of the system will be guided to support potential
         | future qualification activities.
         | 
         | > This application will showcase the viability of the developed
         | RTOS and provide input to a Lessons Learned report, describing
         | the encountered issues, potential problem and improvement
         | areas, usage recommendations and proposed way forward.
         | 
         | Looks like it's not intended for real applications, but instead
         | to gain some experience. What better way to ensure that you
         | don't ship the prototype than by doing it on hardware that is
         | similar but different enough to ensure that it won't be used in
         | production.
         | 
         | Just a guess though!
        
           | sesm wrote:
           | I hope they also publish the Lessons Learned report mentioned
           | above, would be an interesting read.
        
         | topspin wrote:
         | > Is there any rationale to this?
         | 
         | It's all ARM Cortex? The RTOS doesn't care.
        
       | matt3210 wrote:
       | The program cannot be verified against the standard to be
       | correct. There is no standard, how can it be verified.
        
         | steveklabnik wrote:
         | Because verification does not require a standard. rustc has
         | already been qualified (though not for any aerospace-specific
         | things yet that I'm aware of, but in my understanding the shape
         | is the same) (via Ferrocene[1]), even though there is no Rust
         | Standard. No issues here.
         | 
         | 1: https://ferrous-systems.com/blog/qualifying-rust-without-
         | for...
         | 
         | Also, I don't see where this project claims to be verified.
        
         | Xylakant wrote:
         | If by "no standard" you mean that there is no language
         | specification for rust, then there is no standard. However, a
         | language specification is not sufficient to verify program
         | correctness, nor is it required.
         | 
         | A standard may (and the C standard for example does) leave
         | parts of the behavior as "implementation specific" and there's
         | quite a few edge cases - and that's not even talking about
         | "undefined behavior", of which there is plenty. And even in the
         | behavior that is neither implementation specific nor undefined
         | you'll find enough rope to hang yourself (all the beautiful
         | pointers). There's a reason things such as MISRA C exist -
         | effectively a standard on top of a standard.
         | 
         | On the other hand, the rust language - while having no formal
         | spec - is fairly well described, in the form of its RFCs and
         | testsuite. We (the ferrocene team) were able to derive a
         | descriptive specification from the existing description fairly
         | easily. So while there is no ISO standard, and no spec that
         | would be sufficient to write a competing implementation, there
         | is a description of what the language behaves like. You can
         | read up on it at https://spec.ferrocene.dev/
         | 
         | As for verification of correct behavior of such a program, you
         | can employ a host of different techniques depending on what
         | your requirements are - down to verification of the produced
         | bytecode by means of blackbox testing or other.
        
         | addaon wrote:
         | > There is no standard, how can it be verified.
         | 
         | The behavior of the generated binary can be verified against
         | the requirements. Yeah, the most common way to do this is to
         | verify certain properties at the source code level, and then
         | rely on various ways to show equivalence between the source
         | code and the generated assembly, the generated assembly and the
         | generated binary, and the formal semantics of the generated
         | binary and the as-executed semantics on the chosen hardware;
         | but it's perfectly reasonable, and not even particularly
         | unusual, to skip the first equivalence and verify the assembly
         | against the requirements directly.
        
       ___________________________________________________________________
       (page generated 2024-02-02 23:00 UTC)