[HN Gopher] The real realtime preemption end game
___________________________________________________________________
The real realtime preemption end game
Author : chmaynard
Score : 446 points
Date : 2023-11-16 14:47 UTC (1 day ago)
(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)
| andy_ppp wrote:
| What do other realtime OS kernels do when printing from various
| places? It almost seems like this should be done in hardware
| because it's such a difficult problem to avoid losing messages
| while also handling them on a different OS thread in most cases.
| EdSchouten wrote:
| Another option is simply to print less, but expose more events
| in the form of counters.
|
| Unfortunately, within a kernel that's as big as Linux, that
| would leave you with many, many, many counters. All of which
| need to be exported and monitored somehow.
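|
| A minimal sketch of the counter approach in C11 (the event names
| and the dump hook are made up for illustration, not any real
| kernel's API):
|
|     #include <stdatomic.h>
|     #include <stdio.h>
|
|     /* One atomic counter per event of interest. Incrementing is
|        cheap and never blocks, unlike formatting a log line. */
|     static _Atomic unsigned long rx_overrun_events;
|     static _Atomic unsigned long tx_retry_events;
|
|     static inline void count_rx_overrun(void)
|     {
|         atomic_fetch_add_explicit(&rx_overrun_events, 1,
|                                   memory_order_relaxed);
|     }
|
|     /* Something (a monitoring thread, a sysfs-style file, ...)
|        has to export the counters later, which is the "need to
|        be exported and monitored somehow" part. */
|     static void dump_counters(FILE *out)
|     {
|         fprintf(out, "rx_overrun %lu\ntx_retry %lu\n",
|                 atomic_load(&rx_overrun_events),
|                 atomic_load(&tx_retry_events));
|     }
|
|     int main(void)
|     {
|         count_rx_overrun();
|         dump_counters(stdout);
|         return 0;
|     }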
| taeric wrote:
| This seems to imply you would have more counters than
| messages? Why would that be?
|
| That is, I would expect moving to counters to be less
| information, period. Is that not the case?
| nraynaud wrote:
| My guess is that each counter would need to have a
| discovery point, a regular update mechanism and
| documentation, while you can send obscure messages willy-
| nilly in the log? And also they become an Application
| Interface with a life cycle while (hopefully) not too many
| people will go parse the log as an API.
| taeric wrote:
| I think that makes sense, though I would still expect
| counters to be more dense than logs. I'm definitely
| interested in any case studies on this.
| ajross wrote:
| It's just hard, and there's no single answer.
|
| In Zephyr, we have a synchronous printk() too, as for low-level
| debugging and platform bringup that's usually _desirable_ (i.e.
| I'd like to see the dump from just before the panic, please!).
|
| For production logging use, though, there is a fancier log
| system[1] designed around latency boundaries that essentially
| logs a minimally processed stream to a buffer that then gets
| flushed from a low priority thread. And this works, and avoids
| the kinds of problems detailed in the linked article. But it's
| fiddly to configure, expensive in an RTOS environment (you need
| RAM for that thread stack and the buffer), depends on having an
| I/O backend that is itself async/low-latency, and has the
| mentioned misfeature where when things blow up, it's usually
| failed to flush the information you need out of its buffer.
|
| [1] Somewhat but not completely orthogonal with printk. Both
| can be implemented in terms of each other, mostly. Sometimes.
| vlovich123 wrote:
| What if the lower priority thread is starved and the buffer
| is full? Do you start dropping messages? Or overwrite the
| oldest ones and skip messages?
| ajross wrote:
| It drops messages. That's almost always the desired
| behavior: you never want your logging system to be doing
| work when the system is productively tasked with other
| things.
|
| I know there was some level of argument about whether it's
| best to overwrite older content (ring-buffer-style,
| probably keeps the most important stuff) or drop messages
| at input time (faster, probably fewer messages dropped
| overall). But logging isn't my area of expertise and I
| forget the details.
|
| But again, the general point being that this is a
| complicated problem with tradeoffs, where most developers
| up the stack tend to think of it as a fixed facility that
| shouldn't ever fail or require developer bandwidth. And
| it's not, it's hard.
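|
| For concreteness, a minimal userspace sketch of that shape --
| bounded ring, producer never blocks, messages dropped at input
| time when full, a (nominally low-priority) thread does the slow
| I/O. Single producer assumed; sizes and names are made up, and
| this is not Zephyr's actual log core:
|
|     #include <pthread.h>
|     #include <stdatomic.h>
|     #include <stdio.h>
|     #include <string.h>
|     #include <unistd.h>
|
|     #define SLOTS   256
|     #define MSG_LEN 128
|
|     static char ring[SLOTS][MSG_LEN];
|     static atomic_uint head, tail;      /* next write / next read */
|     static atomic_ulong dropped;
|
|     /* Producer side: never blocks, never does I/O. */
|     static void log_deferred(const char *msg)
|     {
|         unsigned h = atomic_load(&head);
|         if (h - atomic_load(&tail) >= SLOTS) {   /* ring full */
|             atomic_fetch_add(&dropped, 1);       /* drop at input */
|             return;
|         }
|         strncpy(ring[h % SLOTS], msg, MSG_LEN - 1);
|         ring[h % SLOTS][MSG_LEN - 1] = '\0';
|         atomic_store(&head, h + 1);
|     }
|
|     /* Flusher: in a real system this thread runs at low priority
|        so it only gets the CPU when nothing important wants it. */
|     static void *flusher(void *arg)
|     {
|         (void)arg;
|         for (;;) {
|             unsigned t = atomic_load(&tail);
|             if (t == atomic_load(&head)) { usleep(1000); continue; }
|             fputs(ring[t % SLOTS], stderr);      /* the slow part */
|             atomic_store(&tail, t + 1);
|         }
|         return NULL;
|     }
|
|     int main(void)
|     {
|         pthread_t th;
|         pthread_create(&th, NULL, flusher, NULL);
|         for (int i = 0; i < 1000; i++)
|             log_deferred("something happened\n");
|         sleep(1);
|         fprintf(stderr, "dropped: %lu\n", atomic_load(&dropped));
|         return 0;
|     }
|
| The overwrite-oldest variant mentioned above changes only the
| full-ring branch: bump tail instead of returning, and count the
| clobbered entry.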
| xenadu02 wrote:
| In many problem spaces you can optimize for the common success
| and failure paths if you accept certain losses on long-tail
| failure scenarios.
|
| A common logging strategy is to use a ring buffer with a
| separate isolated process reading from the ring. The vast
| majority of the time the ring buffer handles temporary
| disruptions (eg slow disk I/O to write messages to disk) but in
| the rare failure scenarios you simply overwrite events in the
| buffer and increment an atomic overwritten event counter.
| Events do not get silently dropped but you prioritize forward
| progress at the cost of data loss in rare scenarios.
|
| Microkernels and pushing everything to userspace just moves the
| tradeoffs around. If your driver is in userspace and blocks
| writing a log message because the log daemon is blocked or the
| I/O device it is writing the log to is overloaded it does the
| same thing. Your realtime thread won't get what it needs from
| the driver within your time limit.
|
| It all comes down to CAP theorem stuff. If you always want the
| kernel (or any other software) to be able to make forward
| progress within specific time limits then you must be willing
| to tolerate some data loss in failure scenarios. How much and
| how often it happens depends on specific design factors, memory
| usage, etc.
| TeeMassive wrote:
| It's kind of crazy that a feature required 20 years of active
| development before it could be called somewhat complete.
|
| I hope it will be ready soon. I'm working on a project that has
| strict serial communication requirements and it has caused us a
| lot of headaches.
| eisbaw wrote:
| Zephyr RTOS.
| worthless-trash wrote:
| Can you expand on this? I'm a little naive in this area. Say
| you isolated the CPUs (isolcpus parameter) and then taskset
| your task onto the isolated CPU: would the scheduler no longer
| be involved, and would your task be the only thing serviced by
| that CPU?
|
| Is it other interrupts on the CPU that break your process out
| of the "real time" requirement? I find this all so interesting.
| TeeMassive wrote:
| It's an embedded system with two logical cores with at least
| 4 other critical processes running. Doing that will only
| displace the problem.
| worthless-trash wrote:
| I (incorrectly) assumed that serial port control was the
| highly time-sensitive problem that was being dealt with
| here.
| loeg wrote:
| Synchronous logging strikes again! We ran into this some at work
| with GLOG (Google's logging library), which can, e.g., block on
| disk IO if stdout is a file or whatever. GLOG was like, 90-99% of
| culprits when our service stalled for over 100ms.
| cduzz wrote:
| I have discussions with cow-orkers around logging;
|
| "We have Best-Effort and Guaranteed-Delivery APIs"
|
| "I want Guaranteed Delivery!!!"
|
| "If the GD logging interface is offline or slow, you'll take
| downtime; is that okay?"
|
| "NO NO Must not take downtime!"
|
| "If you need it logged, and can't log it, what do you do?"
|
| These days I just point to the CAP theorem and suggest that
| logging is the same as any other distributed system. Because
| there's a wikipedia article with a triangle and the word
| "theorem" people seem to accept that.
|
| [edit: added "GD" to clarify that I was referring to the
| guaranteed delivery logging api, not the best effort logging
| API]
| loeg wrote:
| I read GD as "god damn," which also seems to fit.
| rezonant wrote:
| aw you beat me to it
| msm_ wrote:
| Interesting, I'd think logging is one of the clearest
| situations when you want best effort. Logging is, almost by
| definition, not the "core" of your application, so failure to
| log properly should not prevent the core of the program from
| working. Killing the whole program because the logging server
| is down is clearly throwing the baby out with the bathwater.
|
| What people probably mean is "logging is important, let's
| avoid losing log messages if possible", which is what "best"
| in "best effort" stands for. For example it's often a good
| idea to have a local log queue, to avoid data loss in case of
| a temporary log server downtime.
| cduzz wrote:
| People use logging (appropriately or inappropriately; not
| my bucket of monkeys) for a variety of things including
| audit and billing records, which are likely a good case for
| a guaranteed delivery API.
|
| People often don't think precisely about what they say or
| want, and also often don't think through corner cases such
| as "what if XYZ breaks or gets slow?"
|
| And don't get me started on "log" messages that are 300mb
| events. Per log. Sigh.
| insanitybit wrote:
| If you lose logs when your service crashes you're losing
| logs at the time they are most important.
| tux1968 wrote:
| That's unavoidable if the logging service is down when
| your server crashes.
|
| Having a local queue doesn't mean logging to the service
| is delayed, it can be sent immediately. All the local
| queue does is give you some resiliency, by being able to
| retry if the first logging attempt fails.
| insanitybit wrote:
| If your logging service is down all bets are off. But by
| buffering logs you're now accepting that problems _not_
| related to the logging service will also cause you to
| drop logs - as I mentioned, your service crashing, or
| being OOM'd, would be one example.
| tux1968 wrote:
| What's more likely? An intermittent network issue, the
| logging service being momentarily down, or a local crash
| that only affects your buffering queue?
|
| If an OOM happens, all bets are off anyway, since it has
| as much likelihood of taking out your application as it
| does your buffering code. The local buffering code might
| very well be part of the application in the first place,
| so the fate of the buffering code is the same as the
| application anyway.
|
| It seems you're trying very hard to contrive a situation
| where doing nothing is better than taking reasonable
| steps to counter occasional network hiccups.
| insanitybit wrote:
| > It seems you're trying very hard to contrive a
| situation where doing nothing is better than taking
| reasonable steps to counter occasional network hiccups.
|
| I think you've completely misunderstood me then. I
| haven't taken a stance at all on what should be done. I'm
| only trying to agree with the grandparent poster about
| logging ultimately reflecting CAP Theorem.
| andreasmetsala wrote:
| No, you're losing client logs when your logging service
| crashes. Your logging service should probably not be
| logging through calls to itself.
| tremon wrote:
| But if your service has downtime because the logs could
| not be written, that seems strictly inferior. As someone
| else wrote upthread, you only want guaranteed delivery
| for logs if they're required under a strict audit regime
| and the cost of noncompliance is higher than the cost of
| a service outage.
| insanitybit wrote:
| FWIW I agree, I'm just trying to be clear that you are
| choosing one or the other, as the grandparent was
| stating.
| linuxdude314 wrote:
| It's not the core of the application, but it can be the
| core of the business.
|
| For companies that sell API access, logs in one form or
| another are how bills are reconciled and usage is metered.
| wolverine876 wrote:
| Logging can be essential to security (to auditing). It's
| your record of what happened. If an attacker can cause
| logging to fail, they can cover their tracks more easily.
| deathanatos wrote:
| To me audit logs aren't "logs" (in the normal sense),
| despite the name. They tend to have different
| requirements; e.g., in my industry, they must be
| retained, by law, and for far longer than our normal
| logs.
|
| To me, those different requirements imply that they
| _should_ be treated differently by the code, probably
| even under distinct flows: synchronously, and ideally to
| somewhere that I can later compress like hell and store
| in some very cheap long term storage.
|
| Whereas the debug logs that I use for debugging? Rotate
| out after 30 to 90d, ... and yeah, best effort is fine.
|
| (The audit logs might also end up in one's normal logs
| too, for convenience.)
| wolverine876 wrote:
| While I generally agree, I'll add that the debug logs can
| be useful in security incidents.
| fnordpiglet wrote:
| It depends.
|
| In some systems the logs are journaled records for the
| business or are discoverable artifacts for compliance. In
| highly secure environments logs are not only durable but
| measures are taken to fingerprint them and their ordering
| (like ratchet hashing) to ensure integrity is invariant.
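|
| A minimal sketch of the ratchet/hash-chain idea (using OpenSSL's
| one-shot SHA256(); the record format and the all-zero initial
| state are arbitrary choices for illustration):
|
|     /* cc chain.c -lcrypto */
|     #include <openssl/sha.h>
|     #include <stdio.h>
|     #include <string.h>
|
|     /* Each entry's fingerprint covers the previous fingerprint,
|        so editing, deleting, or reordering any earlier record
|        breaks every hash that follows it. */
|     static unsigned char chain[SHA256_DIGEST_LENGTH];  /* zeroed */
|
|     static void append_record(FILE *log, const char *msg)
|     {
|         unsigned char buf[SHA256_DIGEST_LENGTH + 4096];
|         size_t len = strnlen(msg, 4096);
|
|         memcpy(buf, chain, SHA256_DIGEST_LENGTH);
|         memcpy(buf + SHA256_DIGEST_LENGTH, msg, len);
|         SHA256(buf, SHA256_DIGEST_LENGTH + len, chain);
|
|         fprintf(log, "%s | ", msg);
|         for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
|             fprintf(log, "%02x", chain[i]);
|         fputc('\n', log);
|     }
|
|     int main(void)
|     {
|         append_record(stdout, "user=alice action=login");
|         append_record(stdout, "user=alice action=wire amount=1e6");
|         return 0;
|     }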
|
| I would note that using disk based logging is generally
| harmful in these situations IMO. Network based logging is
| less likely to cause blocking at some OS level or other
| sorts of jitter that's harder to mask. Typically I develop
| logging as an in memory thing that offloads to a remote
| service over the network. The durability of the memory
| store can be an issue in highly sensitive workloads, and
| you'll want to do synchronous disk IO for that case to
| ensure durability and consistent time budgets, but for
| almost all applications diskless logging is preferable.
| shawnz wrote:
| If you're not waiting for the remote log server to write
| the messages to its disk before proceeding, then it seems
| like that's not guaranteed to me? And if you are, then
| you suffer all the problems of local disk logging but
| also all the extra failure modes introduced by the
| network, too
| fnordpiglet wrote:
| The difference is that network IO can be more easily
| masked by the operating system than block device IO. When
| you offload your logging to another thread the story
| isn't over because your disk logging can interfere at a
| system level. Network IO isn't as noisy. If durability is
| important you might still need to wait for an ACK before
| freeing the buffer for the message, which might lead to
| more overall memory use, but all the operations play nicely
| in a preemptible scheduling system.
|
| Also, the failure modes of _systems_ are very tied to
| durable storage devices attached to the system and very
| rarely to network devices. By reducing the number of
| things that need a disk (ideally to zero) you can remove
| disks from the system and its availability story. Once
| you get to fully diskless systems the system failure
| modes are actually almost nothing. But even with disks
| attached reducing the times you interact with the disk
| (especially for chatty things like logs!) reduces the
| likelihood the entire system fails due to a disk issue.
| lmm wrote:
| > If you're not waiting for the remote log server to
| write the messages to its disk before proceeding, then it
| seems like that's not guaranteed to me?
|
| Depends on your failure model. I'd consider e.g.
| "received in memory by at least 3/5 remote servers in
| separate datacenters" to be safer than "committed to
| local disk".
| cduzz wrote:
| You're still on one side or another of the CAP triangle.
|
| In a network partition, you are either offline or your
| data is not consistent.
|
| If you're writing local to your system, you're losing
| data if there's a single device failure.
|
| https://en.wikipedia.org/wiki/CAP_theorem
| fnordpiglet wrote:
| For logs, which are immutable time series journals, any
| copy is entirely sufficient. The first write is a quorum.
| Also from a systems POV reads are not a feature of logs.
| lmm wrote:
| CAP is irrelevant, consistency does not matter for logs.
| cduzz wrote:
| Consistency is a synonym for "guaranteed", and means
| "written to 2 remote, reliable, append-only storage
| endpoints" (for any reasonable definition of reliability)
|
| So -- a single system collecting a log event -- it is not
| reliable (guaranteed) if written just to some device on
| that system. Instances can be de-provisioned (and logs
| lost), filesystems or databases can be scrambled, badguys
| can encrypt your data, etc.
|
| In this context, a "network partition" prevents
| consistency (data not written to reliable media) or
| prevents availability (won't accept new requests until
| their activity can be logged reliably).
|
| If you define "reliably" differently, you may have a
| different interpretation of log consistency.
| fnordpiglet wrote:
| I'm not sure I understand the way you're using the
| vocabulary. Consistency is a read operation concept, not a
| write one. There are no online reads for logs.
|
| Availability is achieved if at least one writer
| acknowledges a write. In a partition, inconsistency would
| mean multiple parts of the system disagreeing about the
| write contents. But because logs are immutable and
| write-only, this doesn't happen in any situation. The only
| situation it might occur is if you're maintaining a
| distributed ratchet with in-delivery-order semantics rather
| than eventually consistent temporal semantics - in which
| case you will never have CAP. But that's an insanely rare
| edge case.
|
| Note CAP doesn't ensure perfect durability. I feel like
| you're confusing consistency with durability. Consistency
| means that after I've durably written something, all nodes
| agree on read that it's been written. Since logs don't
| support reads on the online data plane, this is trivially
| not an issue. Any write acknowledgment is sufficient.
| lmm wrote:
| > Consistency is a synonym for "guaranteed", and means
| "written to 2 remote, reliable, append-only storage
| endpoints" (for any reasonable definition of reliability)
|
| No it doesn't. Read your own wiki link.
|
| > In this context, a "network partition" prevents
| consistency (data not written to reliable media) or
| prevents availability (won't accept new requests until
| their activity can be logged reliably).
|
| A network partition doesn't matter for a log system
| because there is no way to have consistency issues with
| logs. Even a single partitioned-off instance can accept
| writes without causing any problem.
|
| Of course if you cannot connect to any instance of your
| log service then you cannot write logs. But that's got
| nothing to do with the CAP theorem.
| ReactiveJelly wrote:
| If it's a journaled record for the business then I think
| I'd write it to SQLite or something with good
| transactions and not mix it in the debug logs
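|
| A minimal sketch of that, using the SQLite C API (schema and
| file name invented for illustration; link with -lsqlite3):
|
|     #include <sqlite3.h>
|     #include <stdio.h>
|
|     int main(void)
|     {
|         sqlite3 *db;
|         char *err = NULL;
|
|         if (sqlite3_open("journal.db", &db) != SQLITE_OK)
|             return 1;
|
|         /* One transaction per business event: it is either
|            durably recorded or it isn't, unlike a line that may
|            or may not have made it into a log stream. */
|         const char *sql =
|             "CREATE TABLE IF NOT EXISTS journal("
|             "  ts   TEXT DEFAULT CURRENT_TIMESTAMP,"
|             "  kind TEXT NOT NULL,"
|             "  body TEXT NOT NULL);"
|             "BEGIN;"
|             "INSERT INTO journal(kind, body)"
|             "  VALUES('billing', 'api_call user=42 units=3');"
|             "COMMIT;";
|
|         if (sqlite3_exec(db, sql, NULL, NULL, &err) != SQLITE_OK) {
|             fprintf(stderr, "journal write failed: %s\n", err);
|             sqlite3_free(err);
|         }
|         sqlite3_close(db);
|         return 0;
|     }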
| fnordpiglet wrote:
| There are more logs than debug logs, and using SQLite as
| the encoding store for your logs doesn't make it not
| logging.
| supriyo-biswas wrote:
| The better way to do this is to write the logs to a file or
| an in-memory ring buffer and have a separate thread/process
| push logs from the file/ring-buffer to the logging service,
| allowing for retries if the logging service is down or slow
| (for moderately short values of down/slow).
|
| Promtail[1] can do this if you're using Loki for logging.
|
| [1] https://grafana.com/docs/loki/latest/send-data/promtail/
| insanitybit wrote:
| But that's still not guaranteed delivery. You're doing what
| the OP presented - choosing to drop logs under some
| circumstances when the system is down.
|
| a) If your service crashes and it's in-memory, you lose
| logs
|
| b) If your service can't push logs off (upstream service is
| down or slow) you either drop logs, run out of memory, or
| block
| hgfghui7 wrote:
| You are thinking too much in terms of the stated
| requirements instead of what people actually want: good
| uptime and good debuggability. Falling back to local
| logging means a blip in logging availability doesn't turn
| into all hands on deck everything is on fire. And it
| means that logs will very likely be available for any
| failures.
|
| In other words it's good enough.
| mort96 wrote:
| "Good uptime and good reliability but no guarantees" is
| just a good best effort system.
| insanitybit wrote:
| Good enough is literally "best effort delivery", you're
| just agreeing with them that this is ultimately a
| distributed systems problem and you either choose CP or
| AP.
| kbenson wrote:
| Yeah, what the "best effort" actually means in practice
| is usually a result of how much resources you want to
| throw at the problem. Those give you runway on how much
| of a problem you can withstand and perhaps recover from
| without any loss of data (logs), but in the end you're
| usually still just buying time. That's usually enough
| though.
| o11c wrote:
| Logging to `mmap`ed files is resilient to service
| crashes, just not hardware crashes.
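|
| A minimal sketch of why that works (file name and fixed size are
| arbitrary; a real logger would manage the write offset and grow
| the file):
|
|     #include <fcntl.h>
|     #include <string.h>
|     #include <sys/mman.h>
|     #include <unistd.h>
|
|     #define LOG_SIZE (1 << 20)
|
|     int main(void)
|     {
|         int fd = open("app.log", O_RDWR | O_CREAT, 0644);
|         if (fd < 0 || ftruncate(fd, LOG_SIZE) != 0)
|             return 1;
|
|         /* MAP_SHARED: the bytes land in the page cache, so even
|            if the process crashes right after the memcpy, the
|            kernel still writes them back to the file.  Only a
|            kernel crash or power loss can lose them. */
|         char *log = mmap(NULL, LOG_SIZE, PROT_READ | PROT_WRITE,
|                          MAP_SHARED, fd, 0);
|         if (log == MAP_FAILED)
|             return 1;
|
|         const char *msg = "request handled in 3ms\n";
|         memcpy(log, msg, strlen(msg));   /* "writing a log line" */
|
|         /* No write()/fsync() needed for crash-of-the-process
|            resilience; msync() is the knob if you also want it
|            on disk promptly. */
|         return 0;
|     }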
| sroussey wrote:
| We did something like this at Weebly for stats. The app
| sent the stats to a local service via UDP, so shoot and
| forget. That service aggregated for 1s and then sent them
| off-server.
| laurencerowe wrote:
| Why UDP for a local service rather than a unix socket?
| sroussey wrote:
| Send and forget. Did not want to wait on an ack from a
| broken process.
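|
| The fire-and-forget part is basically this (a sketch; the port
| and the statsd-style metric format are assumptions, not Weebly's
| actual setup):
|
|     #include <arpa/inet.h>
|     #include <netinet/in.h>
|     #include <string.h>
|     #include <sys/socket.h>
|
|     int main(void)
|     {
|         int s = socket(AF_INET, SOCK_DGRAM, 0);
|         if (s < 0)
|             return 1;
|
|         struct sockaddr_in dst = { 0 };
|         dst.sin_family = AF_INET;
|         dst.sin_port   = htons(8125);          /* local stats daemon */
|         inet_pton(AF_INET, "127.0.0.1", &dst.sin_addr);
|
|         /* No connection, no ACK, no blocking on a broken or slow
|            collector: if the daemon is down the packet is simply
|            lost, which is the accepted trade-off. */
|         const char *metric = "page_views:1|c";
|         sendto(s, metric, strlen(metric), 0,
|                (struct sockaddr *)&dst, sizeof dst);
|         return 0;
|     }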
| rezonant wrote:
| > "If the GD logging interface is offline or slow, you'll
| take downtime; is that okay?"
|
| > [edit: added "GD" to clarify that I was referring to the
| guaranteed delivery logging api, not the best effort logging
| API]
|
| i read GD as god-damned :-)
| salamanderman wrote:
| me too [EDIT: and I totally empathized]
| Zondartul wrote:
| I have some wishful thinking ideas on this, but it should be
| possible to have both at least in an imaginary, theoretical
| scenario.
|
| You can have both guaranteed delivery and no downtime if your
| whole system is so deterministic that anything that normally
| would result in blocking just will not, cannot happen. In
| other words it should be a hard real-time system that is
| formally verified top to bottom, down to the last transistor.
| Does anyone actually do that? Verify the program and the
| hardware to prove that it will never run out of memory for
| logs and such?
|
| Continuing this thought, logs are probably generated
| endlessly, so either whoever wants them has to also guarantee
| that they are processed and disposed of right after being
| logged... or there is a finite amount of log messages that
| can be stored (arbitrary number like 10 000) but the user (of
| logs) has to guarantee that they will take the "mail" out of
| the box sooner than it overfills (at some predictable,
| deterministic rate). So really that means even if OUR system
| is mathematically perfect, we're just making the downtime
| someone else's problem - namely, the consumer of the infinite
| logs.
|
| That, or we guarantee that the final resources of our self-
| contained, verified system will last longer than the finite
| shelf life of the system as a whole (like maybe 5 years for
| another arbitrary number)
| morelisp wrote:
| PACELC says you get blocking or unavailability or
| inconsistency.
| ElectricalUnion wrote:
| From a hardware point of view, this system is unlikely to
| exist, because you need a system with components that never
| have any reliability issues ever to have a totally
| deterministic system.
|
| From a software point of view, this system is unlikely to
| exist as it doesn't matter that the cause of your downtime
| is "something else that isn't our system". As a result,
| you're gonna end up requiring infinite reliable storage to
| uphold your promises.
| tuetuopay wrote:
| We had prod halt once when the syslog server hung. Logs were
| pushed through TCP, which propagated the blocking to the whole
| of prod. We've switched to UDP transport since: better to lose
| some logs than the whole of prod.
| tetha wrote:
| Especially if some system is unhappy enough to log enough
| volume to blow up the local log disk... you'll usually have
| enough messages and clues in the bazillion other messages
| that have been logged.
| deathanatos wrote:
| TCP vs. UDP and async best-effort vs. synchronous are
| _completely_ orthogonal...
|
| E.g., a service I wrote wrote logs to an ELK setup; we logged
| over TCP. But the logging was async: we didn't wait for logs
| to make it to ELK, and if the logging services went down, we
| just queued up logs locally. (To a point; at some point, the
| buffer fills up, and logs were discarded. The process would
| make a note of this if it happened, locally.)
| tuetuopay wrote:
| > TCP vs. UDP and async best-effort vs. synchronous are
| completely orthogonal...
|
| I agree, when stuff is properly written. I don't remember
| the exact details, but at least with UDP the asyncness is
| built-in: there is no backpressure whatsoever. So poorly
| written software can just send UDP to its heart's content.
| lopkeny12ko wrote:
| I would posit that if your product's availability hinges on +/-
| 100ms, you are doing something deeply wrong, and it's not your
| logging library's fault. Users are not going to care if a
| button press takes 100 more ms to complete.
| fnordpiglet wrote:
| 100ms for something like say API authorization on a high
| volume data plane service would be unacceptable. Exceeding
| latencies like that can degrade bandwidth and cause workers
| to exhaust connection counts. Likewise, even in human
| response space, 100ms is an enormous part of a budget for
| responsiveness. Taking again authorization, if you spend
| 100ms, you're exhausting the perceptible threshold for a
| human's sense of responsiveness to do something that's of no
| practical value but is entirely necessary. Your UI developers
| will be literally camped outside your zoom room with virtual
| pitch forks night and day.
| loeg wrote:
| Yes, and in fact the service I am talking about is a high
| volume data plane service.
| hamandcheese wrote:
| Not every API is a simple CRUD app with a user at the other
| end.
| kccqzy wrote:
| Add some fan out and 100ms could suddenly become 1s, 10s...
| saagarjha wrote:
| Core libraries at, say, Google, are supposed to be reliable
| to several nines. If they go down for long enough for a human
| to notice, they're failing SLA.
| loeg wrote:
| Our service is expected to respond to small reads at under
| 1ms at the 99th percentile. >100ms stalls (which can go into
| many seconds) are absolutely unacceptable.
| oneepic wrote:
| Oh, we had this type of issue ("logging lib breaks everything")
| with a $MSFT logging library. Imagine having 100 threads each
| with their own logging buffer of 300MB. Needless to say it
| _annihilated_ our memory and our server crashed, even on the
| most expensive sku of Azure App Service.
| pests wrote:
| Brilliant strategy.
|
| Reminds me a little of the old-timers' trick of adding a
| sleep(1000) somewhere so they could come back later and
| have some resources in reserve, or if they needed a quick
| win with the client.
|
| Now cloud companies are using malloc(300000000) to fake
| resource usage. /s
| RobertVDB wrote:
| Ah, the classic GLOG-induced service stall - brings back
| memories! I've seen similar scenarios where logging, meant to
| be a safety net, turns into a trap. Your 90-99% figure
| resonates with my experience. It's like opening a small window
| for fresh air and having a storm barrel in. We eventually had
| to balance between logging verbosity and system performance,
| kind of like a tightrope walk over a sea of unpredictable IO
| delays. Makes one appreciate the delicate art of designing
| logging systems that don't end up hogging the spotlight (and
| resources) themselves, doesn't it?
| tyingq wrote:
| I wonder if this being fixed will result in it displacing some
| notable amount of made-for-realtime hardware/software combos.
| Especially since there's now lots of cheap, relatively low power,
| and high clock rate ARM and x86 chips to choose from. With the
| clock rates so high, perfect real-time becomes less important as
| you would often have many cycles to spare for misses.
|
| I understand it's less elegant, efficient, etc. But sometimes
| commodity wins over correctness.
| foobarian wrote:
| Ethernet nods in vigorous agreement
| tuetuopay wrote:
| The thing is, stuff that requires hard realtime cannot make do
| with "many cycles to spare for misses". And CPU cycles are not
| the whole story. A badly made task could lock down the kernel
| without doing anything useful. The point of hard realtime is
| "nothing can prevent this critical task from running".
|
| For automotive and aerospace, you really want the control
| systems to be able to run no matter what.
| tyingq wrote:
| Yes, there are parts of the space that can't be displaced
| with this.
|
| I'm unclear on why you put "many cycles to spare for misses"
| in quotes, as if it's unimportant. If a linux/arm (or x86)
| solution is displacing a much lower speed "real real time"
| solution, that's the situation...the extra cycles mean you
| can tolerate some misses while still being as granular as
| what you're replacing. Not for every use case, but for many.
| bee_rider wrote:
| It is sort of funny that language has changed to the point
| where quotes are assumed to be dismissive or sarcastic.
|
| Maybe they used the quotes because they were quoting you,
| haha.
| tuetuopay wrote:
| it's precisely why I quoted the text, to quote :)
| archgoon wrote:
| I'm pretty sure they were just putting it in quotes because
| it was the expression you used, and they thus were
| referencing it.
| tuetuopay wrote:
| You won't be saved from two tasks deadlocking with
| cycles/second. _this_ is what hard realtime systems are
| about. However, I do agree that not all systems have real
| hard realtime requirements. But those usually can handle a
| non-rt kernel.
|
| As for the quotes, it was a direct citation, not a way to
| dismiss what you said.
| tremon wrote:
| I don't think realtime anything has much to do with mutex
| deadlocks, those are pretty much orthogonal concepts. In
| fact, I would make a stronger claim: if your "realtime"
| system can deadlock, it's either not really realtime or
| it has a design flaw and should be sent back to the
| drawing board. It's not like you can say "oh, we have a
| realtime kernel now, so deadlocks are the kernel's
| problem".
|
| Actual realtime systems are about workload scheduling
| that takes into account processing deadlines. Hard
| realtime systems can make guarantees about processing
| latencies, and can preemptively kill or skip tasks if the
| result would arrive too late. But this is not something
| that the Linux kernel can provide, because it is a system
| property rather than about just the kernel: you can't
| provide any hard guarantees if you have no time bounds
| for your data processing workload. So any discussion
| about -rt in the context of the Linux kernel will always
| be about soft realtime only.
| tuetuopay wrote:
| much agreed. I used deadlocks as an extreme example
| that's easy to reason about and straight to the point of
| "something independent of cpu cycles". something more
| realistic would be IO operations taking more time than
| expected. you would not want this to be blocking
| execution for hard rt tasks.
|
| In the case of the kernel, it is indeed too large to be
| considered hard realtime. Best case we can make it into a
| firmer realtime than it currently is. But I would place
| it nowhere near avionics flight computers (like fly-by-
| wire systems).
| hamilyon2 wrote:
| I had an introductory course on OS and learned about hard
| real-time systems. I had the impression hard real-time is
| about memory, deadlocks, livelocks, starvation, and so
| on. And in general about how to design system that moves
| forward even in presence of serious bugs and unplanned-
| for circumstances.
| syntheweave wrote:
| Bugs related to concurrency - which is where you get race
| conditions and deadlocks - tend to pop up wherever
| there's an implied sequence of dependencies to complete
| the computation, and the sequence is determined
| dynamically by an algorithm.
|
| For example, if I have a video game where there's
| collision against the walls, I can understand this as
| potentially colliding against "multiple things
| simultaneously", since I'm likely to describe the scene
| as a composite of bounding boxes, polygons, etc.
|
| But to get an answer for what to do in response when I
| contact a wall, I have to come up with an algorithm that
| tests all the relevant shapes or volumes.
|
| The concurrency bug that appears when doing this in a
| naive way is that I test one, give an answer to that,
| then modify the answer when testing the others. That can
| lead to losing information and "popping through" a wall.
| And the direction in which I pop through depends on which
| one is tested first.
|
| The conventional gamedev solution to that is to define
| down the solution set so that it no longer matters which
| order I test the walls in: with axis aligned boxes, I can
| say "move only the X axis first, then move only the Y
| axis". Now there is a fixed order, and a built-in bias to
| favor one or the other axis. But this is enough for the
| gameplay of your average platforming game.
|
| The generalization on that is to describe it as a
| constraint optimization problem: there are some number of
| potential solutions, and they can be ranked relative to
| the "unimpeded movement" heuristic, which is usually
| desirable when clipping around walls. That solution set
| is then filtered down through the collision tests, and
| the top ranked one becomes the answer for that timestep.
|
| Problems of this nature come up with resource allocation,
| scheduling, etc. Some kind of coordinating mechanism is
| needed, and OS kernels tend to shoulder a lot of the
| burden for this.
|
| It's different from real-time in that real-time is a
| specification of what kind of performance constraint you
| are solving for, vs allowing any kind of performance
| outcome that returns acceptable concurrent answers.
| nine_k wrote:
| How much more expensive and power-hungry an ARM core would
| be, if it displaces a lower-specced core?
|
| I bet there are hard-realtime (commercial) OSes running on
| ARM, and the ability to use a lower-specced (cheaper,
| simpler, consuming less power) core may be seen as an
| advantage enough to pay for the OS license.
| lmm wrote:
| > How much more expensive and power-hungry an ARM core
| would be, if it displaces a lower-specced core?
|
| The power issue is real, but it might well be the same
| price or cheaper - a standard ARM that gets stamped out
| by the million can cost less than a "simpler"
| microcontroller with a smaller production run.
| zmgsabst wrote:
| What's an example of a system that requires hard real time
| and couldn't cope with soft real time on a 3GHz system having
| 1000 cycle misses costing 0.3us?
| lelanthran wrote:
| > What's an example of a system that requires hard real
| time and couldn't cope with soft real time on a 3GHz system
| having 1000 cycle misses costing 0.3us?
|
| Any system that deadlocks.
| LeifCarrotson wrote:
| We've successfully used a Delta Tau real-time Linux motion
| controller to run a 24 kHz laser galvo system. It's
| ostensibly good for 25 microsecond loop rates, and pretty
| intolerant of jitter (you could delay a measurement by a
| full loop period if you're early). And the processor is a
| fixed frequency Arm industrial deal that only runs at 1.2
| GHz.
|
| Perhaps even that's not an example of such a system, 0.3
| microseconds is close to the allowable real-time budget,
| and QC would probably not scrap a $20k part if you were off
| by that much once.
|
| But in practice, every time I've heard "soft real time"
| suggested, the failure mode is not a sub-microsecond miss
| but a 100 millisecond plus deadlock, where a hardware
| watchdog would be needed to drop the whole system offline
| and probably crash the tool (hopefully fusing at the tool
| instead of destroying spindle bearings, axis ball screws,
| or motors and gearboxes) and scrap the part.
| zmgsabst wrote:
| Thanks for the detailed reply!
|
| I'm trying to understand what roadblock makes a rPi +
| small FPGA hybrid board for $50 fail at the task... and
| it sounds like the OS/firmware doesn't suffice. (Or a
| SoC, like a Zynq.)
|
| Eg, if we could guarantee that the 1.5GHz core won't "be
| off" by more than 1us on responding and the FPGA can
| manage IO directly to buffer out (some of) the jitter,
| then the cost of many hobby systems with "(still not
| quite) hard" real time systems would come down to
| reasonable.
| rmu09 wrote:
| You can get pretty far nowadays with preempt rt and an
| FPGA. Maybe you even can get near 1us max jitter. One
| problem with the older RPis was unpredictable (to me)
| behaviour of the hardware, i.e. "randomly" changing SPI
| clocks, and limited bandwidth.
|
| Hobby systems like a small CNC mill or lathe usually
| don't need anything near 1us (or better) max jitter.
| LinuxCNC (derived from NIST's Enhanced Machine
| Controller, name changed due to legal threats) runs fine
| on preempt-rt with control loops around 1kHz, with some
| systems you can also run a "fast" thread with say 20kHz
| and more to generate stepper motor signals, but that job
| is best left for the FPGA or an additional uC IMHO.
| krylon wrote:
| I suspect a fair amount of hard real time applications are
| not running on 3GHz CPUs. A 100MHz CPU (or lower) without
| an MMU or FPU is probably more representative.
|
| But it's not really so much about being fast, it's about
| being able to _guarantee_ that your system can respond to
| an event within a given amount of time _every time_. (At
| least that is how a friend who works in embedded /real time
| explained it to me.)
| imtringued wrote:
| Sure, but this won't magically remove the need for dedicated
| cores. What will probably happen is that people will tell the
| scheduler to exclusively put non-preemptible real time tasks on
| one of the LITTLE cores.
| binary132 wrote:
| I get the sense that applications with true realtime
| requirements generally have hard enough requirements that they
| cannot allow even the remote possibility of failure. Think
| avionics, medical devices, automotive, military applications.
|
| If you really need realtime, then you _really need_ it and
| "close enough" doesn't really exist.
|
| This is just my perception as an outsider though.
| calvinmorrison wrote:
| If you really need realtime, and you really actually need it,
| should you be using a system like Linux at all?
| refulgentis wrote:
| ...yes, after realtime support lands
| lumb63 wrote:
| A lot of realtime systems don't have sufficient resources
| to run Linux. Their hardware is much less powerful than
| Linux requires.
|
| Even if a system can run (RT-)Linux, it doesn't mean it's
| suitable for real-time. Hardware for real-time projects
| needs much lower interrupt latency than a lot of hardware
| provides. Preemption isn't the only thing necessary to
| support real-time requirements.
| skyde wrote:
| what kind of Hardware is considered to have "lower
| interrupt latency"? Is there some kind of Arduino board I
| could get that fits the lower interrupt latency required
| for real-time but still supports things like Bluetooth?
| lumb63 wrote:
| Take a look at the Cortex R series. The Cortex M series
| still has lower interrupt latency than the A series, but
| lower processing power. I imagine for something like
| Bluetooth that an M is more than sufficient.
| refulgentis wrote:
| Sure but that was already mentioned before the comment I
| was replying to. Standard hardware not being great for
| realtime has nothing to do with hypothetical realtime
| Linux.
| rcxdude wrote:
| realtime just means execution time is bounded. It doesn't
| necessarily mean the latency is low. Though, in this
| sense RT-linux should probably be mostly thought of as
| low-latency linux, and the improvement in realtime
| guarantees is mostly in reducing the amount of things
| that can cause you to miss a deadline as opposed to
| allowing you to guarantee any particular deadline, even a
| long one.
| tyingq wrote:
| I'm guessing it's not that technical experts will be
| choosing this path, but rather companies. Once it's "good
| enough", and much easier to hire for, etc...you hire non-
| experts because it works _most_ of the time. I 'm not
| saying it's good, just that it's a somewhat likely outcome.
| And not for everything, just the places where they can get
| away with it.
| froh wrote:
| nah. when functional safety enters the room (as it does
| for hard real time) then engineers go to jail if they
| sign off on something unsafe and people die because of
| that. since the Challenger disaster there is an awareness
| that not listening to engineers can be expensive and cost
| lives.
| synergy20 wrote:
| no you don't, you use a true RTOS instead.
|
| linux RTOS is at microsecond granularity but it still cannot
| 100% guarantee it; anything cache-related (L2 cache, TLB
| misses) is hard for hard real time.
|
| a dual kernel with xenomai could improve it, but it is not
| widely used somehow, only used in industrial controls I
| think.
|
| linux RT is great for audio, multimedia etc as well, where
| real-time is crucial, but not a MUST.
| froh wrote:
| > anything in cache nature (L2 cache, TLB miss) are hard
| for hard real time
|
| yup that's why you'd pin the memory and the core for the
| critical task. which, alas, will affect performance of
| the other cores and all other tasks. and whoosh there
| goes the BOM...
|
| which again as we both probably are familiar with leads
| to the SoC designs with a real time core microcontroller
| and an HPC microprocessor on the same package. which leads
| to the question how to architect the combined system of
| real-time microcontroller and compute power but soft real
| time microprocessor such that the overall system remains
| sufficiently reliable...
|
| oh joy and fun!
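|
| For reference, the userspace half of "pin the memory and the
| core" looks roughly like this (core number and priority are
| arbitrary; the isolcpus/IRQ-affinity side has to be set up
| separately):
|
|     #define _GNU_SOURCE
|     #include <sched.h>
|     #include <stdio.h>
|     #include <sys/mman.h>
|
|     int main(void)
|     {
|         /* Keep all current and future pages resident so no page
|            fault can stall the critical loop. */
|         if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
|             perror("mlockall");
|
|         /* Pin to one core, ideally one isolated from the general
|            scheduler and from IRQs. */
|         cpu_set_t set;
|         CPU_ZERO(&set);
|         CPU_SET(3, &set);
|         if (sched_setaffinity(0, sizeof set, &set) != 0)
|             perror("sched_setaffinity");
|
|         /* Run under the real-time FIFO policy (needs privileges). */
|         struct sched_param sp = { .sched_priority = 80 };
|         if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
|             perror("sched_setscheduler");
|
|         /* ... critical loop ... */
|         return 0;
|     }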
| synergy20 wrote:
| that's indeed the trend, i.e. put a small RTOS core along
| with a normal CPU for non-real-time tasks, in the past
| it was done on two boards: one is an MCU, another is a
| typical CPU; now it's on one. Very important for where an
| RTOS is a must, e.g. robotics.
|
| How the CPU and MCU communicate is a good question to
| tackle, typically chip vendors provide some solutions, I
| think OpenAMP is for this.
| snickerbockers wrote:
| Pretty sure most people who think they need a real-time
| thread actually don't tbh.
| rcxdude wrote:
| really depends on your paranoia level and the consequences
| for failure. soft to hard realtime is a bit of a spectrum
| in terms of how hard of a failure missing a deadline
| actually is, and therefore how much you try to verify that
| you will actually meet that deadline.
| eschneider wrote:
| The beauty of multicore/multi-cpu systems is that you can
| dedicate cores to running realtime OSs and leave the non-
| hard realtime stuff to an embedded linux on its own core.
| snvzz wrote:
| This is why the distinction between soft and hard realtime
| exists.
|
| Linux-rt makes linux actually decent at soft realtime.
| PREEMPT_RT usually results in measured peak latency for
| realtime tasks (SCHED_RR/SCHED_FIFO) on the order of a few
| hundred usec.
|
| Standard Linux lets latency go to tens of milliseconds,
| easily verifiable by running cyclictest from rt-tests for a
| few hours while using the computer. Needless to say, this
| is unacceptable for many use cases, including pro audio,
| videoconference and even gaming.
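|
| What cyclictest measures is roughly this (a simplified sketch,
| not the rt-tests source; run it under chrt -f to approximate the
| SCHED_FIFO case):
|
|     #include <stdio.h>
|     #include <time.h>
|
|     #define NSEC_PER_SEC 1000000000L
|     #define PERIOD_NS       1000000L   /* 1 ms period */
|
|     int main(void)
|     {
|         struct timespec next, now;
|         long worst_ns = 0;
|
|         clock_gettime(CLOCK_MONOTONIC, &next);
|         for (int i = 0; i < 10000; i++) {
|             next.tv_nsec += PERIOD_NS;
|             if (next.tv_nsec >= NSEC_PER_SEC) {
|                 next.tv_nsec -= NSEC_PER_SEC;
|                 next.tv_sec  += 1;
|             }
|             /* Sleep until an absolute deadline, then see how
|                late the wakeup actually was. */
|             clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
|                             &next, NULL);
|             clock_gettime(CLOCK_MONOTONIC, &now);
|
|             long late = (now.tv_sec - next.tv_sec) * NSEC_PER_SEC
|                       + (now.tv_nsec - next.tv_nsec);
|             if (late > worst_ns)
|                 worst_ns = late;
|         }
|         printf("worst wakeup latency: %ld us\n", worst_ns / 1000);
|         return 0;
|     }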
|
| In contrast, AmigaOS's exec.library had no trouble yielding
| solid sub-millisecond behaviour in 1985, on a relatively
| slow 7MHz 68000.
|
| No amount of patching Linux can give you hard realtime, as
| it is about hard guarantees, backed up by proofs built from
| formal verification, which Linux is excluded from due to
| its sheer size.
|
| There's a few RTOSs that are formally verified, but I only
| know one that provides process isolation via the usual
| supervisor vs user CPU modes virtualization model: seL4.
| cptaj wrote:
| Unless it's just music
| itishappy wrote:
| It may not be safety critical, but remember that people can
| and will purchase $14k power chords to (ostensibly) improve
| the experience of listening to "just music".
|
| https://www.audioadvice.com/audioquest-nrg-dragon-high-
| curre...
| cwillu wrote:
| FWIW, a power chord is a _very_ different thing than a
| power cord.
| itishappy wrote:
| LOL, what a typo! Good catch!
| binary132 wrote:
| what if your analog sampler ruins the only good take you
| can get? What if it's recording a historically important
| speech? Starting to get philosophical here...
| duped wrote:
| Unless that music is being played through a multi kW
| amplifier into a stadium and an xrun causes damage to the
| drivers and/or audience (although, they should have hearing
| protection anyway).
| beiller wrote:
| Your talk of xrun is giving me anxiety. When I was
| younger I dreamed of having a linux audio effects stack
| with cheap hardware on stage and xruns brought my dreams
| crashing down.
| robocat wrote:
| xrun definition:
| https://unix.stackexchange.com/questions/199498/what-are-
| xru...
|
| (I didn't know the term, trying to be helpful if others
| don't)
| tinix wrote:
| just a buffer under/overrun
| snvzz wrote:
| it's cross-run aka xrun because these buffers are
| circular.
|
| Depending on implementation, it will either pause or play
| the old sample where the new one isn't yet but should be.
| spacechild1 wrote:
| > and an xrun causes damage to the drivers and/or
| audience
|
| An xrun typically manifests itself as a (very short)
| discontinuity or gap in the audio signal. It might sound
| unpleasant, but there's nothing dangerous about it.
| dripton wrote:
| You can divide realtime applications into safety-critical and
| non-safety-critical ones. For safety-critical apps, you're
| totally right. For non-critical apps, if it's late and
| therefore buggy once in a while, that sucks but nobody dies.
|
| Examples of the latter include audio and video playback and
| video games. Nobody wants pauses or glitches, but if you get
| one once in a while, nobody dies. So people deliver these on
| non-RT operating systems for cost reasons.
| binary132 wrote:
| This kind of makes the same point I made though -- apps
| without hard realtime requirements aren't "really realtime"
| applications
| duped wrote:
| The traditional language is "hard" vs "soft" realtime
| binary132 wrote:
| RTOS means hard realtime.
| pluto_modadic wrote:
| I sense that people will insist on their requirements
| being hard unnecessarily... and will blame a bug on the
| fact that it ran on a near-realtime system when it would
| be faulty even on a realtime one.
| tremon wrote:
| No -- soft realtime applications are things like video
| conferencing, where you care mostly about low latency in
| the audio/video stream but it's ok to drop the occasional
| frame. These are still realtime requirements, different
| from what your typical browser does (for example): who
| cares if a webpage is rendered in 100ms or 2s? Hard
| realtime is more like professional audio/video recording
| where you want hard guarantees that each captured frame
| is stored and processed within the allotted time.
| atq2119 wrote:
| > who cares if a webpage is rendered in 100ms or 2s?
|
| Do you really stand by the statement of this rhetorical
| question? Because if yes: this attitude is a big reason
| for why web apps are so unpleasant to work with compared
| to locally running applications. Depending on the
| application, even 16ms vs 32ms can make a big difference.
| tremon wrote:
| Yes I do, because I don't think the attitude is the
| reason, the choice of technology is the reason. If you
| want to control for UI latency, you don't use a generic
| kitchen-sink layout engine, you write a custom interface.
| You can't eat your cake and have it too, even though most
| web developers want to disagree.
| lll-o-lll wrote:
| > You can divide realtime applications into safety-critical
| and non-safety-critical ones.
|
| No. This is a common misconception. The distinction between
| a hard realtime system and a soft realtime system is simply
| whether missing a timing deadline leads to a) failure of
| the system or b) degradation of the system (but the system
| continues to operate). Safety is not part of it.
|
| Interacting with the real physical world often imposes
| "hard realtime" constraints (think signal processing).
| Whether this has safety implications simply depends on the
| application.
| jancsika wrote:
| Your division puts audio _performance_ applications in a
| grey area.
|
| On the one hand they aren't safety critical.
|
| On the other, I can imagine someone getting chewed out or
| even fired for a pause or a glitch in a professional
| performance.
|
| Probably the same with live commercial video compositing.
| eschneider wrote:
| Audio is definitely hard realtime. The slightest delays
| are VERY noticeable.
| jancsika wrote:
| I mean, it should be.
|
| But there are plenty of performers who apparently rely on
| Linux boxes and gumption.
| wongarsu wrote:
| There is some amount of realtime in factory control where
| infrequent misses will just increase your reject rate in QA.
| abe_m wrote:
| Having worked on a number of "real time" machine control
| applications:
|
| 1) There is always a possibility that something fails to run
| by its due date. Planes crash sometimes. Cars won't start
| some times. Factory machinery makes scrap parts sometimes. In
| a great many applications, missing a real time deadline
| results in degraded quality, not loss of life or regional
| catastrophe. The care that must be taken to lower the
| probability of failure needs to be in proportion to the
| consequence of the failure. Airplanes have redundant systems
| to reduce (but not eliminate) possibility of failure, while
| cars and trucks generally don't.
|
| 2) Even in properly working real time systems, there is a
| tolerance window on execution time. As machines change modes
| of operation, the amount of calculation effort to complete a
| cycle changes. If the machine is in a warm up phase, it may
| be doing minimal calculations, and the scan cycle is fast.
| Later it may be doing a quality control function that needs
| to do calculations on inputs from numerous sensors, and the
| scan cycle slows down. So long as the scan cycle doesn't
| exceed the limit for the process, the variation doesn't cause
| problems.
| mlsu wrote:
| That is true, but generally not acceptable to a regulating
| body for these critical applications. You would need to
| design and implement a validation test to prove timing in
| your system.
|
| Much easier to just use an RTOS and save the expensive
| testing.
| vlovich123 wrote:
| But you still need to implement the validation test to
| prove that the RTOS has these requirements...
| mlsu wrote:
| You do not, if you use an RTOS that is already certified
| by the vendor. This saves not only a lot of time and
| effort for verification and validation, but also a lot of
| risk, since validation is unpredictable and extremely
| expensive.
|
| Therefore it'd be remarkable not to see a certified RTOS
| in such industries and applications where that validation
| is required, like aerospace or medical.
| blt wrote:
| How is your point 2) a response to any of the earlier
| points? Hard realtime systems don't care about variation,
| only the worst case. If your code does a single multiply-
| add most of the time but calls `log` every now and then,
| hard realtime requirement is perfectly satisfied if the
| bound on the worst-case runtime of `log` is small enough.
| abe_m wrote:
| I suppose it isn't, but I bristle when I see someone
| tossing around statements like "close enough doesn't
| really exist". In my experience when statements like that
| start up, there are people involved that don't understand
| variation is a part of every real process. My point is
| that if you're going to get into safety critical systems,
| there is always going to be some amount of variation, and
| there is always a "close enough", as there is never an
| "exact" in real systems.
| jancsika wrote:
| The point is to care about the _worst case_ within that
| variation.
|
| Most software cares about the average case, or, in the
| case of the Windows 10/11 start menu animation, the
| average across all supported machines apparently going 20
| years into the future.
| moffkalast wrote:
| I feel like at this point we have enough cores (or will soon,
| anyway) that you could dedicate one entirely to one process
| and have it run realtime.
| KWxIUElW8Xt0tD9 wrote:
| That's one way to run DPDK processes under LINUX -- you get
| the whole processor for doing whatever network processing
| you want to do -- no interruptions from anything.
| ajross wrote:
| > Think avionics, medical devices, automotive, military
| applications.
|
| FWIW by-device/by-transistor-count, the bulk of "hard
| realtime systems" with millisecond-scale latency requirements
| are just audio.
|
| The sexy stuff are all real applications too. But mostly we
| need this just so we don't hear pops and echos in our video
| calls.
| binary132 wrote:
| Nobody thinks Teams is a realtime application
| ajross wrote:
| No[1], but the people writing the audio drivers and DSP
| firmware absolutely do. Kernel preemption isn't a feature
| for top-level apps.
|
| [1] Actually even that's wrong: for sure there are teams
| of people within MS (and Apple, and anyone else in this
| space) measuring latency behavior at the top-level app
| layer and doing tuning all the way through the stack. App
| latency excursions can impact streams too, though ideally
| you have some layer of insulation there.
| lmm wrote:
| Like many binary distinctions, when you zoom in on the
| details hard-versus-soft realtime is really more of a
| spectrum. There's "people will die if it's late". "The line
| will have to stop for a day if it's late". "If it's late,
| it'll wreck the part currently being made". Etc.
|
| Even hard-realtime systems have a failure rate, in practice
| if not in theory - even a formally verified system might
| encounter a hardware bug. So it's always a case of tradeoffs
| between failure rate and other factors (like cost). If
| commodity operating systems can push their failure rate down
| a few orders of magnitude, that moves the needle, at least
| for some applications.
| JohnFen wrote:
| When I'm doing realtime applications using cheap, low-power,
| high-clockrate ARM chips (I don't consider x86 chips for those
| sorts of applications), I'm not using an operating system at
| all. An OS interferes too much, even an RTOS. I don't see how
| this changes anything.
|
| But it all depends on what your application is. There are a lot
| of applications that are "almost real-time" in need. For those,
| this might be useful.
| PaulDavisThe1st wrote:
| CPU speed and clock rate has absolutely nothing to do with
| realtime anything.
| eisbaw wrote:
| Great to hear. However even if Linux the kernel is real-time,
| likely the hardware won't be due to caches and internal magic CPU
| trickery.
|
| Big complex hardware is a no-no for true real-time.
|
| That's why AbsInt and other WCET tools mainly target simple CPU
| architectures. 8051 will truly live forever.
|
| btw, Zephyr RTOS.
| wholesomepotato wrote:
| Features of modern CPUs don't really prevent them from real
| time usage, afaik. As long as something is bounded and can be
| reasoned about it can be used to build a real time system. You
| can always assume no cache hits and the like, maximum load etc
| and as long as you can put a bound on the time it will take,
| you're good to go.
| synergy20 wrote:
| mlock your memory, test with cache miss and cache
| invalidation scenarios, and use no heap for memory
| allocation - it helps, but it's a bit hard
| jeffreygoesto wrote:
| Right. But still possible.
|
| https://www.etas.com/en/applications/etas-middleware-
| solutio...
| eschneider wrote:
| Does anyone use paged memory in hard realtime systems?
| SAI_Peregrinus wrote:
| Exactly. "Real-time" is a misnomer, it should be called
| "bounded-time". As long as the bound is deterministic, known
| in advance, and guaranteed, it's "real-time". For it to be
| useful it also must be under some application-specific
| duration.
|
| The bounds are usually in CPU cycles, so a faster CPU can
| sometimes be used even if it takes more cycles. CPUs capable
| of running Linux usually have higher latency (in cycles) than
| microcontrollers, but as long as that can be kept under the
| (wall clock) duration limits with bounded-time it's fine.
| There will still be cases where the worst-case latency to
| fetch from DRAM in an RT-Linux system will be higher than a
| slower MCU fetching from internal SRAM, so RT-Linux won't
| take over all these systems.
| bloak wrote:
| So the things that might prevent you are:
|
| 1. Suppliers have not given you sufficient information for
| you to be able to prove an upper bound on the time taken.
| (That must happen a lot.)
|
| 2. The system is so complicated that you are not totally
| confident of the correctness of your proof of the upper
| bound.
|
| 3. The only upper bound that you can prove with reasonable
| confidence is so amazingly bad that you'd be better off with
| cheaper, simpler hardware.
|
| 4. There really isn't a worst case. There might, for example,
| be a situation equivalent to "roll the dice until you don't
| get snake eyes". In networking, for example, sometimes after
| a collision both parties try again after a random delay so
| the situation is resolved eventually with probability one but
| there's no actual upper bound. A complex CPU and memory
| system might have something like that? Perhaps you'd be happy
| with "the probability of this operation taking more than 2000
| clock cycles is less than 10^-13" but perhaps not.
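|
| To make item 4 concrete: if each retry independently collides
| with probability p (purely illustrative), then
|
|     \[
|     \Pr[\text{delay} > k \text{ slots}] = p^{k} > 0
|     \quad \text{for every } k,
|     \qquad
|     p^{k} \le 10^{-13} \iff k \ge \frac{13}{\log_{10}(1/p)},
|     \]
|
| so there is no finite worst case, but the tail can be pushed
| below any chosen failure probability by allowing enough slots.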
| formerly_proven wrote:
| You're probably thinking about bus arbiters in 4.), which
| are generally fast but have no bounded settling time.
| dooglius wrote:
| System management mode is one example of a feature on modern
| CPUs that prevents real-time usage https://wiki.linuxfoundati
| on.org/realtime/documentation/howt...
| nraynaud wrote:
| I think it's really useful on 'big' MCU, like the raspberry pi.
| There exists an entire real time spirit there, where you don't
| really use the CPU to do any bit banging but everything is on
| time as seen from the outside. You have timers that receive the
| quadrature encoder inputs, and they just send an interrupt when
| they wrap, the GPIO system can be plugged into the DMA, so you
| can stream memory to the output pins without involving the
| CPU (again, interrupts at mid-buffer and empty buffer). You can
| stream to a DAC, or stream from an ADC to memory with the DMA. A
| lot of that stuff bypasses the caches to get a predictable
| latency.
| stefan_ wrote:
| Nice idea but big chip design strikes again: on the latest
| Raspberry Pi, GPIO pins are handled by the separate IO chip
| connected over PCI Express. So now all your GPIO stuff needs
| to traverse a shared serial bus (that is also doing bulk
| stuff like say raw camera images).
|
| And already on many bigger MCUs, GPIOs are just separate
| blocks on a shared internal bus like AHB/APB that connects
| together all the chip IP, causing unpredictable latencies.
| 0xDEF wrote:
| >Big complex hardware is a no-no for true real-time.
|
| SpaceX uses x86 processors for their rockets. That small drone
| copter NASA put on Mars uses "big-ish" ARM cores that can
| probably run older versions of Android.
| ska wrote:
| Does everything run on those CPUs though? Hard realtime
| control is often done on a much simpler MCU at the lowest
| level, with oversight/planning from a higher-level system....
| zokier wrote:
| In short, no. For Ingenuity (the Mars 2020 helicopter) the
| flight computer runs on a pair of hard-realtime Cortex-R5
| MCUs paired with an FPGA. The non-realtime Snapdragon SoC
| handles navigation/image processing duties.
|
| https://news.ycombinator.com/item?id=26907669
| ska wrote:
| That's basically what I expected, thanks.
| SubjectToChange wrote:
| _Big complex hardware is a no-no for true real-time._
|
| There are advanced real time cores like the Arm Cortex-R82. In
| fact many real time systems are becoming quite powerful due to
| the need to process and aggregate ever increasing amounts of
| sensor data.
| snvzz wrote:
| >8051 will truly live forever.
|
| 68000 is the true king of realtime.
| Aaargh20318 wrote:
| What does this mean for the common user? Is this something you
| would only enable in very specific circumstances or can it also
| bring a more responsive system to the general public?
| stavros wrote:
| As far as I can understand, this is for Linux becoming an
| option when you need an RTOS, so for critical things like
| aviation, medical devices, and other such systems. It doesn't
| do anything for the common user.
| SubjectToChange wrote:
| The Linux kernel, real-time or not, is simply too large and
| too complex to realistically certify for anything safety
| critical.
| ska wrote:
| For the parts of such systems that you would need an RTOS for
| this isn't really a likely replacement because the OS is way
| too complex.
|
| The sort of thing it could help with is servicing hardware
| that _does_ run hard realtime. For example, you have an RTOS
| doing direct control of a robot or medical device or
| whatever, and you have a UI pendant or the like that a user
| is interacting with. If linux on that pendant can make some
| realtime latency guarantees, you may be able to simplify
| communication between the two without risking dropping bits
| on the floor.
|
| Conversely, for the common user it could improve things like
| audio/video streaming, in theory, but I haven't looked into the
| details or how much trouble there is currently.
| elcritch wrote:
| It depends on the field. I know of one robot-control
| software company planning to switch to an RT Linux stack.
| Their current one is a *BSD-derived RTOS that runs, kid you
| not, alongside Windows.
|
| RT Linux might not pass on some certifications, but there's
| likely many systems where it would be sufficient.
| ska wrote:
| With robots a lot depends on the scope of movement and
| speed, and whether or not it interacts with the
| environment/people. For some applications the controller
| is already dedicated hardware on the joint module anyway,
| with some sophistication, connected to a CAN (or
| EtherCAT) bus or something like that - so there's no OS
| in the tightest loop - and I could see the high-level
| control working on RT Linux or whatever if you wanted to,
| lots of tradeoffs. Mainly though it's the same argument:
| you probably don't want a complex OS involved in the
| lowest-level/finest time tick updates. Hell, some of the
| encoders are spewing enough data that you probably end up
| with the first thing it hits being an ASIC anyway, then
| an MCU dealing with control updates/fusion etc., then a
| higher-level system for planning.
| ravingraven wrote:
| If by "common" user you mean the desktop user, not much. But
| this is a huge deal for embedded devices like industrial
| control and communication equipment, as their devs will be able
| to use the latest mainline kernel if they need real-time
| scheduling.
| fbdab103 wrote:
| My understanding is that real-time makes a system _slower_. To
| be real-time, you have to put a time allocation on everything.
| Each operation is allowed X budget, and will not deviate. This
| means if the best-case operation is fast, but the worst case is
| slow, the system has to always assume worst case.
| sesm wrote:
| It's a classic latency-throughput trade-off: smaller latency,
| lower throughput. Doing certain operations in bulk (like
| GC) increases latency, but is also more efficient and
| increases throughput.
| snvzz wrote:
| >real-time makes a system slower.
|
| linux-rt's PREEMPT_RT has a negligible impact. It is there,
| but it is negligible. It does, however, enable a lot of use
| cases where Linux fails otherwise, such as pro audio.
|
| In modern usage, it even helps reduce input jitter in
| videogames and enables lower-latency videoconferencing.
|
| I am hopeful most distributions will turn it on by default,
| as it benefits most users, and causes negligible impact on
| throughput-centric workloads.
| dist-epoch wrote:
| It could allow very low latency audio (1-2 ms). Not a huge
| thing, but nice for some audio people.
| snvzz wrote:
| s/nice/needed/g
| andrewaylett wrote:
| RT doesn't necessarily improve latency, it gives it a fixed
| upper bound for _some_ operations. But the work needed to allow
| RT can definitely improve latency in the general case -- the
| example of avoiding synchronous printk() calls is a case in
| point. It should improve latency under load even when RT isn't
| enabled.
|
| I think I'm right in asserting that a fully-upstreamed RT
| kernel won't actually do anything different from a normal one
| unless you're actually running RT processes on it. The reason
| it's taken so long to upstream has been the trade-offs that
| have been needed to enable RT, and (per the article) there
| aren't many of those left.
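|
| As a rough illustration of that opt-in (the priority value is
| arbitrary here), a process asks for RT scheduling roughly like
| this:
|
|     #include <sched.h>
|
|     int main(void)
|     {
|         /* Illustrative priority; only threads that request an
|          * RT policy are scheduled differently by the kernel. */
|         struct sched_param sp = { .sched_priority = 80 };
|
|         if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
|             return 1;
|
|         /* ... time-critical work, usually with locked memory ... */
|         return 0;
|     }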
| rcxdude wrote:
| the most common desktop end-user that might benefit from this
| is those doing audio work: latency and especially jitter can be
| quite a pain there.
| knorker wrote:
| I just want SCHED_IDLEPRIO to actually do what it says.
| deepsquirrelnet wrote:
| What a blast from the past. I compiled a kernel for Debian with
| RT_PREEMPT about 17-18 years ago to use with scientific equipment
| that needed tighter timings. I was very impressed at the
| latencies and jitter.
|
| I haven't really thought about it since then, but I can imagine
| lots of use cases for something like an embedded application
| with a Raspberry Pi where you don't quite want to make the leap
| to a microcontroller running an RTOS.
| HankB99 wrote:
| Interesting to mention the Raspberry Pi. I saw an article just
| a day or two ago that claimed that the RpiOS was started by and
| ran on top of an RTOS. That's particularly interesting because at
| one time years ago, I saw suggestions that Linux could run as a
| task on an RTOS. Things that required hard real time deadlines
| could run on the RTOS and not be subject to the delays that a
| virtual memory system could entail.
|
| I don't recall if this was just an idea or was actually
| implemented. I also have seen only the one mention of RpiOS on
| an RTOS so I'm curious about that.
| rsaxvc wrote:
| >That's particularly interesting because at one time years
| ago, I saw suggestions that Linux could run as a task on an
| RTOS.
|
| I've worked with systems that ran Linux as a task of uITRON
| as well as threadX, both on somewhat obscure ARM hardware.
| Linux managed the MMU but had a large carveout for the RTOS
| code. They had some strange interrupt management so that
| Linux could 'disable interrupts' but while Linux IRQs were
| disabled, an RTOS IRQ could still fire and context switch
| back to an RTOS task. I haven't seen anything like this on
| RPi though, but it's totally doable.
| HankB99 wrote:
| Interesting to know that it was more than just an idea -
| thanks!
| 0xDEF wrote:
| What do embedded real-time Linux people use for bootloader, init
| system, utilities, and C standard library implementation? Even
| Android, which does not have real-time constraints, ended up using
| Toybox for utilities and rolling their own C standard library
| (Bionic).
| rcxdude wrote:
| You aren't likely to need to change a lot of these: the whole
| point is basically making it so that all that can run as normal
| but won't really get in the way of your high-priority process.
| It's just that your high-priority process needs to be careful
| not to block on anything that might take too long due to some
| other stuff running. In which case you may need to avoid
| certain C standard library calls, but not replace it entirely.
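|
| A sketch of that split (names invented for illustration): the
| high-priority thread only writes into a preallocated ring and
| never calls malloc(), printf() or anything else that can block;
| a normal-priority thread drains it later.
|
|     #include <stdatomic.h>
|
|     #define SLOTS 1024
|     static double samples[SLOTS];      /* allocated up front */
|     static atomic_uint head, tail;     /* single prod/cons   */
|
|     static int rt_push(double value)   /* called from RT thread */
|     {
|         unsigned h = atomic_load(&head);
|         unsigned t = atomic_load(&tail);
|         if (h - t == SLOTS)
|             return -1;                 /* full: drop, never block */
|         samples[h % SLOTS] = value;
|         atomic_store(&head, h + 1);
|         return 0;
|     }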
| jovial_cavalier wrote:
| I use u-boot for a boot loader. As for init and libc, I just
| use systemd and glibc.
|
| Boot time is not a bottleneck for my application (however long
| it takes, the client will take longer...), and I'm sure there's
| some more optimal libc to use, but I'm not sure the juice is
| worth the squeeze.
|
| I'm also interested in what others are doing.
| shandor wrote:
| I guess U-Boot, uClibc, and BusyBox are a quite common starting
| point.
|
| Of course, this varies immensely between different use cases,
| as "embedded Linux" spans such a huge swath of different kinds
| of systems from very cheap and simple to complex and powerful.
| salamanderman wrote:
| I had a frustrating number of job interviews in my early career
| where the interviewers didn't know what realtime actually was.
| That "and predictable delay" concept from the article frequently
| seemed to be lost on many folks, who seemed to think realtime
| just meant fast, whatever that means.
| mort96 wrote:
| I would even remove the "minimum" part altogether; the point of
| realtime is that operations have predictable upper bounds. That
| might even mean slower average cases than in non-realtime
| systems. If you're controlling a car's braking system, "the
| average delay is 50ms but might take up to 80ms" might be
| acceptable, whereas "the average delay is 1ms but it might take
| arbitrarily long, possibly multiple seconds" isn't.
| ska wrote:
| The old saying "real time" /= "real fast". Hard vs "soft"
| realtime muddies things a bit, but I think the majority of
| software developers probably don't really understand what
| realtime actually is either.
| NalNezumi wrote:
| Slightly tangential, but does anyone know good learning material
| to understand real-time (Linux) kernel more? For someone with
| rudimentary Linux knowledge.
|
| I've had to compile and install a real-time kernel as a
| requirement for a robot arm (Franka) control computer. It would
| be nice to know a bit more than just how to install the kernel.
| ActorNightly wrote:
| https://www.freertos.org/implementation/a00002.html
|
| Generally, having experience with Green Hills in a previous job,
| for personal projects like robotics or control systems I would
| recommend programming a microcontroller directly rather than
| dealing with an SoC running an RTOS. Modern STM32s with Cortex
| cores have enough processing power to run pretty much anything.
| alangibson wrote:
| Very exciting news for those of us building CNC machines with
| LinuxCNC. The end of kernel patches is nigh!
| Tomte wrote:
| OSADL runs a cool QA farm: https://www.osadl.org/OSADL-QA-Farm-
| Real-time.linux-real-tim...
| Animats wrote:
| QNX had this right decades ago. The microkernel has upper bounds
| on everything it does. There are only a few tens of thousands of
| lines of microkernel code. All the microkernel does is allocate
| memory, dispatch the CPU, and pass messages between processes.
| Everything else, including drivers and loggers, is in user space
| and can be preempted by higher priority threads.
|
| The QNX kernel doesn't do anything with strings. No parsing, no
| formatting, no messages.
|
| Linux suffers from being too bloated for real time. Millions of
| lines of kernel, all of which have to be made preemptable. It's
| the wrong architecture for real time. So it took two decades to
| try to fix this.
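|
| The message-passing core is a small synchronous
| send/receive/reply triad. Roughly (a sketch only; channel and
| connection setup omitted, see <sys/neutrino.h> for the real
| details):
|
|     #include <sys/neutrino.h>
|
|     /* Client: MsgSend() blocks until the server replies. */
|     int ask_server(int coid)
|     {
|         char reply[64];
|         return MsgSend(coid, "ping", 5, reply, sizeof(reply));
|     }
|
|     /* Server: receive, handle in preemptible user space,
|      * then unblock the client with MsgReply(). */
|     void serve_once(int chid)
|     {
|         char msg[64];
|         int rcvid = MsgReceive(chid, msg, sizeof(msg), NULL);
|         if (rcvid > 0)
|             MsgReply(rcvid, 0, "pong", 5);
|     }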
| vacuity wrote:
| For a modern example, there's seL4. I believe it does no
| dynamic memory allocation. It's also formally verified for
| various properties. (Arguably?) its biggest contribution to
| kernel design is the pervasive usage of capabilities to
| securely but flexibly export control to userspace.
| adastra22 wrote:
| Capabilities are important, but I don't think that was
| introduced by seL4. Mach (which underlies macOS) has the same
| capability-based system.
| vacuity wrote:
| I didn't say seL4 introduced capabilities. However, to my
| knowledge, seL4 was the first kernel to show that
| _pervasive_ usage of capabilities is both feasible and
| beneficial.
| monocasa wrote:
| The other L4s before it showed that caps are useful and
| can be implemented efficiently.
| vacuity wrote:
| https://dl.acm.org/doi/pdf/10.1145/2517349.2522720
|
| " We took a substantially different approach with seL4;
| its model for managing kernel memory is seL4's main
| contribution to OS design. Motivated by the desire to
| reason about resource usage and isolation, we subject all
| kernel memory to authority conveyed by capabilities
| (except for the fixed amount used by the kernel to boot
| up, including its strictly bounded stack). "
|
| I guess I should've said seL4 took capabilities to the
| extreme.
| naasking wrote:
| seL4 was heavily inspired by prior capability based
| operating systems like EROS (now CapROS) and Coyotos.
| Tying all storage to capabilities was core to those
| designs.
| adastra22 wrote:
| There's quite a history of capability-based research
| OSes that culminated in, but did not start with, L4 (of
| which seL4 is a later variant).
| vacuity wrote:
| Yes, but I believe seL4 took it to the max. I may be
| wrong on that count, but I think seL4 is unique in that
| it leverages capabilities for pretty much everything
| except the scheduler. (There was work in that area, but
| it's incomplete.)
| adastra22 wrote:
| L4 was developed in the 90's. Operating Systems like
| Amoeba, which were fundamentally capability-based to a
| degree that even exceeds L4, were a hot research topic in
| the 80's.
|
| L4's contribution was speed. It was assumed that
| microkernels, and especially capability-based
| microkernels were fundamentally slower than monolithic
| kernels. This is why Linux (1991) is monolithic. Yet L4
| (1994) was the fastest operating system in existence at
| the time, despite being a microkernel and capability
| based. It's too bad those dates aren't reversed, or we
| might have had a fast, capability-based, microkernel
| Linux :(
| josephg wrote:
| How did it achieve its speed? My understanding was that
| microkernel architectures were fundamentally slower than
| monolithic kernels because context switching is slower
| than function calls. How did L4 manage to be the fastest?
| vacuity wrote:
| Two factors are the small working set (fit the code and
| data in cache) and heavily optimized IPC. The IPC in L4
| kernels is "a context switch with benefits": in the best
| case, the arguments are placed in registers and the
| context is switched. Under real workloads, microkernels
| probably will be slower, but not by much.
| mananaysiempre wrote:
| IIRC the KeyKOS/EROS/CapROS tradition used capabilities
| for everything _including_ the scheduler. Of course,
| pervasive persistence makes those systems somewhat
| esoteric (barring fresh builds, they never shut down or
| boot up, only go to sleep and wake up in new bodies;
| compare Smalltalk, etc.).
| vacuity wrote:
| Guess I'm too ignorant. I need to read up on these. I did
| know about the persistence feature. I think it's not
| terrible but also not great, and systems should be
| _designed_ for being shut down and apps being closed.
| naasking wrote:
| > I think it's not terrible but also not great, and
| systems should be designed for being shut down and apps
| being closed.
|
| The problem with shutdowns and restarts is the secure
| bootstrapping problem. The boot process must be within
| the trusted computing base, so how do you minimize the
| chance of introducing vulnerabilities? With
| checkpointing, if you start in a secure state, you're
| guaranteed to have a secure state after a reboot. This is
| not the case with any other form of reboot,
| particularly ones that are highly configurable and so
| make it easy for the user to introduce an insecure
| configuration.
|
| In any case, many apps are now designed to restore their
| state on restart, so they are effectively checkpointing
| themselves, so there's clearly value to checkpointing. In
| systems with OS-provided checkpointing it's a central
| shared service and doesn't have to be replicated in every
| program. That's a significant reduction in overall system
| code that can go wrong.
| vacuity wrote:
| It's fallacious to assume that the persistence model of
| the system can't enter an invalid state and thus cause
| issues similar to bootstrapping. The threat model also
| doesn't make sense to me: if an attacker can manipulate
| the boot process, I feel like they would be able to
| attack the overall system just fine. Also, there's the
| bandwidth usage, latency, and whatnot. I think
| persistence is a strictly less powerful, although
| certainly convenient, design for an OS.
| naasking wrote:
| > The threat model also doesn't make sense to me: if an
| attacker can manipulate the boot process, I feel like
| they would be able to attack the overall system just
| fine.
|
| That's not true actually. These capability systems have
| the principle of least privilege right down to their
| core. The checkpointing code is in the kernel which only
| calls out to the disk driver in user space. The
| checkpointing code itself is basically just "flush these
| cached pages to their corresponding locations on disk,
| then update a boot sector pointer to the new checkpoint",
| and booting a system is "read these pages pointed to by
| this disk pointer sequentially into memory and resume".
|
| The attack surface in this system is incomparably small
| compared to the boot process of a typical OS, which runs
| user-defined scripts and scripts written by completely
| unknown people from software you downloaded from the
| internet, often with root or other broad sets of
| privileges.
|
| I really don't think you can appreciate how this system
| works without digging into it a little. EROS was built
| from the design of KeyKOS that ran transactional bank
| systems back in the 80s. KeyKOS pioneered this kind of
| checkpointing system, so it saw real industry use in
| secure systems for years. I recommend at least reading an
| overview:
|
| https://flint.cs.yale.edu/cs428/doc/eros-ieee.pdf
|
| EROS is kind of like what you'd get it if you took
| Smalltalk and tried to push it into the hardware as an
| operating system, while removing all sources of ambient
| authority. It lives on as CapROS:
|
| https://www.capros.org/
| vacuity wrote:
| I don't deny that bootstrapping in current systems is
| ridiculous, but I don't see why it can't be improved.
| It's not like EROS is a typical OS either. In any case,
| I'll read up on those OSes.
| adastra22 wrote:
| Amoeba was my favorite, as it was a homogeneous,
| decentralized operating system. Different CPU
| architectures spread across different data centers, and
| it was all homogenized together into a single system
| image. You had a shell prompt where you typed commands
| and the OS could decide to spawn your process on your
| local device, in the server room rack, or in some
| connected datacenter in Amsterdam, it didn't make a
| difference. From the perspective of you, your program, or
| the shell, it's just a giant many-core machine with weird
| memory and peripheral access latencies that the OS
| manages.
|
| Oh, and anytime as needed the OS could serialize out your
| process, pipe it across the network to another machine,
| and resume. Useful for load balancing, or relocating a
| program to be near the data it is accessing. Unless your
| program pays special attention to the clock, it wouldn't
| notice.
|
| I still think about Amoeba from time to time, and imagine
| what could have been if we had gone down that route
| instead.
| vacuity wrote:
| Wouldn't there be issues following from distributed
| systems and CAP? Admittedly, I know nothing about Amoeba.
|
| E.g. You spawn a process on another computer and then the
| connection drops.
| adastra22 wrote:
| There's no free lunch of course, so you would have
| circumstances where a network partition at a bad time
| would result in a clone instead of a move. I don't know
| what, if anything, Amoeba did about this.
|
| In practice it might not be an issue. The reason you'd
| typically do something like move processes across a WAN
| is because you want it to operate next to data it is
| making heavy use of. The copy that booted up local to the
| data would continue operating, while the copy at the
| point of origin would suddenly see the data source go
| offline.
|
| Now of course more complex schemes can be devised, like
| if the data source is replicated and so both copies
| continue operating. Maybe a metric could be devised for
| detecting these instances when the partition is healed,
| and one or both processes are suspended for manual
| resolution? Or maybe programs just have to be written
| with the expectation that their capabilities might
| suddenly become invalid at any time, because the
| capability sides with the partition that includes the
| resource? Or maybe go down the route of making the entire
| system transactional, so that partition healing can
| occur, and only throw away transaction deltas once
| receipts are received for all nodes ratcheting state
| forward?
|
| It'd be an interesting research area for sure.
| Animats wrote:
| No, that was KeyKOS, which was way ahead of its time.[1]
| Norm Hardy was brilliant but had a terrible time getting
| his ideas across.
|
| [1] https://en.wikipedia.org/wiki/KeyKOS
| _kb wrote:
| And unfortunately had its funding dumped because it wasn't
| shiny AI.
| snvzz wrote:
| That was its old source of funding. And it was much more
| complex[0] than that.
|
| seL4 is now a healthy non-profit, seL4 foundation[1].
|
| 0. https://microkerneldude.org/2022/02/17/a-story-of-
| betrayal-c...
|
| 1. https://microkerneldude.org/2022/03/22/ts-in-2022-were-
| back/
| Animats wrote:
| The trouble with L4 is that it's so low-level you have to
| put another OS on top of it to do anything. Which usually
| means a bloated Linux. QNX offers a basic POSIX
| interface, implemented mostly as libraries.
| snvzz wrote:
| Note that L4 and seL4 are very different kernels. They
| represent the 2nd generation and 3rd generation of
| microkernels respectively.
|
| With that out of the way, you're right in that the
| microkernel doesn't present a posix interface.
|
| But, like QNX, there are libraries for that, seL4
| foundation itself maintains some.
|
| They have a major ongoing effort on system servers,
| driver APIs and ways to deploy system scenarios. Some of
| them were talked about in a recent seL4 conference.
|
| And then there's third party efforts like the amazing
| Genode[0], which supports dynamic scenarios with the same
| drivers and userspace binaries across multiple
| microkernels.
|
| They even have a modern web browser, 3D acceleration, as
| well as a VirtualBox port that runs inside Genode, so the
| dogfooding developers are able to run e.g. Linux inside
| VirtualBox to bridge the gap.
|
| 0. https://www.genode.org/
| bregma wrote:
| The current (SDP 8) kernel has 15331 lines of code, including
| comments and Makefiles.
| gigatexal wrote:
| QNX is used in vehicle infotainment systems no? Where else?
|
| I'm not bothered by the kernel bloat. There's a lot of dev time
| being invested in Linux, and while the desktop is not as much of
| a priority as, say, the server space, a performant kernel on
| handhelds and other such devices - and the dev work to get it
| there - will benefit desktop users like myself.
| bkallus wrote:
| I went to a conference at GE Research where I spoke to some
| QNX reps from Blackberry for a while. Seemed like they were
| hinting that some embedded computers in some of GE's
| aerospace and energy stuff rely on QNX.
| lmm wrote:
| > QNX is used in vehicle infotainment systems no? Where else?
|
| A bunch of similar embedded systems. And blackberry, if
| anyone's still using them.
| tyfon wrote:
| It was used in my old toyota avensis from 2012. The
| infotainment was so slow you could measure performance in
| seconds per frame instead of frames per second :)
|
| In the end, all I could practically use it for was as a
| bluetooth audio connector.
| notrom wrote:
| I've worked with it in industrial automation systems in large
| scale manufacturing plants where it was pretty rock solid.
| And I'm aware of its use in TV production and transmission
| systems.
| Cyph0n wrote:
| Cisco routers running IOS-XR, until relatively recently.
| SubjectToChange wrote:
| Railroads/Positive Train Control, emergency call centers,
| etc. QNX is used all over the place. If you want an even more
| impressive Microkernel RTOS, then Green Hills INTEGRITY is a
| great example. It's the RTOS behind the B-2, F-{16,22,35},
| Boeing 787, Airbus A380, Sikorsky S-92, etc.
| yosefk wrote:
| "Even more impressive" in what way? I haven't used
| INTEGRITY but used the Green Hills compiler and debugger
| extensively for years and they're easily the most buggy
| development tools I've ever had the misfortune to use. To
| me the "impressive" thing is their ability to lock safety
| critical software developers into using this garbage.
| dilyevsky wrote:
| Routers, airplanes, satellites, nuclear power stations, lots
| of good stuff
| gigatexal wrote:
| > QNX had this right decades ago. The microkernel has upper
| bounds on everything it does. There are only a few tens of
| thousands of lines of microkernel code. All the microkernel
| does is allocate memory, dispatch the CPU, and pass messages
| between processes. Everything else, including drivers and
| loggers, is in user space and can be preempted by higher
| priority threads.
|
| So much like a well-structured main method in a C program or
| other C-like language, where main just orchestrates the calling
| of other functions and such. In this case main might initialize
| different things where the QNX kernel doesn't, but the idea or
| general concept remains.
|
| I'm no kernel dev but this sounds good to me. Keeps things
| simple.
| vacuity wrote:
| Recently, I've been thinking that we need a microkernel
| design in applications. You have the core, and then services
| that can integrate with each other and with the core to
| provide flexibility. Like the "browser as an OS" kind of
| thing, but applied more generally.
| galdosdi wrote:
| Yes! This reminds me strongly of the core/modules
| architecture of the apache httpd, as described by the
| excellent O'Reilly book on it.
|
| The process of serving an HTTP request is broken into a
| large number of fine grained stages and plugin modules may
| hook into any or all of these to modify the input and
| output to each stage.
|
| The same basic idea makes it easy to turn any application
| concept into a modules-and-core architecture. From the day
| I read (skimmed) that book a decade or two ago this pattern
| has been burned into my brain
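|
| A tiny sketch of that shape (invented names, not Apache's
| actual module API): the core walks a list of hooks per stage,
| and modules only ever register callbacks.
|
|     #include <stddef.h>
|
|     typedef int (*hook_fn)(void *request);
|
|     enum stage { STAGE_AUTH, STAGE_HANDLER, STAGE_LOG,
|                  STAGE_COUNT };
|
|     #define MAX_HOOKS 8
|     static hook_fn hooks[STAGE_COUNT][MAX_HOOKS];
|     static size_t  nhooks[STAGE_COUNT];
|
|     void register_hook(enum stage s, hook_fn fn)
|     {
|         if (nhooks[s] < MAX_HOOKS)
|             hooks[s][nhooks[s]++] = fn;
|     }
|
|     int run_request(void *request)
|     {
|         for (int s = 0; s < STAGE_COUNT; s++)
|             for (size_t i = 0; i < nhooks[s]; i++)
|                 if (hooks[s][i](request) != 0)
|                     return -1;   /* a module vetoed this stage */
|         return 0;
|     }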
| blackth0rn wrote:
| ECS systems for the gaming world are somewhat like this.
| There is the core ECS framework and then the systems and
| entities integrate with each other.
| spookie wrote:
| ECS is incredible. Other areas should take notice
| whstl wrote:
| Agreed. I find that we're going in this direction in many
| areas, games just got there much faster.
|
| Pretty much everywhere there is some undercurrent of "use
| this ultra-small generic interface for everything and
| life will be easier". With games and ECS, microkernels
| and IPC-for-everything, with frontend frameworks and
| components that only communicate between themselves via
| props and events, with event sourcing and CQRS backends,
| Actors in Erlang, with microservices only communicating
| via the network to enforce encapsulation... Perhaps even
| Haskell's functional-core-imperative-shell could count as
| that?
|
| I feel like OOP _tried_ to get to this point, with
| dependency injection and interface segregation, but
| didn't quite get there due to bad ergonomics, verbosity
| and because it was still too easy to break the rules. But
| it was definitely an attempt at improving things.
| vbezhenar wrote:
| COM, OSGI, Service architecture, microservice architecture
| and countless other approaches. This is the correct way to
| build applications, because it gets reinvented over and
| over again.
| elcritch wrote:
| That's pretty much what Erlang/OTP is, and it's like a
| whole OS. Though it lacks capabilities.
| js2 wrote:
| VxWorks is what's used on Mars and it's a monolithic kernel, so
| there's more than one way to do it. :-)
| dilyevsky wrote:
| I think the RT build also had to disable the MMU
| signa11 wrote:
| this feels like the Tanenbaum-Torvalds debate once again.
| creshal wrote:
| > Millions of lines of kernel, all of which have to be made
| preemptable.
|
| ~90% of those are device drivers, which you'd still need with a
| microkernel if you want it to run on arbitrary hardware.
| dontlaugh wrote:
| But crucially, drivers in a microkernel run in user space and
| are thus pre-emptible by default. Then the driver itself only
| has to worry about dealing with hardware timing when pre-
| empted.
| creshal wrote:
| Sure, but who's going to write the driver in the first
| place? Linux's "millions of lines of code" are a really
| underappreciated asset, there's tons of obscure hardware
| that is no longer supported by any other actively
| maintained OS.
| dontlaugh wrote:
| I also don't see how we could transition to a
| microkernel, indeed.
| naasking wrote:
| The very first "hypervisors" were actually microkernels
| that ran Linux as a guest. This was done with Mach on
| PowerPC/Mac systems, and also the L4 microkernel. That's
| one way.
|
| The only downside of course, is that you don't get the
| isolation benefits of the microkernel for anything
| depending on the Linux kernel process.
| matheusmoreira wrote:
| And yet it's getting done! It's very impressive work.
| AndyMcConachie wrote:
| Linus Torvalds and Andrew Tanenbaum called. They want their
| argument back!
| the8472 wrote:
| For an example of how far the kernel goes to get log messages out
| even on a dying system and how that's used in real deployments:
|
| https://netflixtechblog.com/kubernetes-and-kernel-panics-ed6...
| rwmj wrote:
| About printk, the backported RT implementation of printk added to
| the RHEL 9.3 kernel has deadlocks ...
| https://issues.redhat.com/browse/RHEL-15897 &
| https://issues.redhat.com/browse/RHEL-9380
| w10-1 wrote:
| There is no end game until there are end users beating on the
| system. That would put the 'real' in 'real-time'.
|
| But who using an RTOS now would take the systems-integration
| cost/risk of switching? Would this put Android closer to
| bare-metal performance?
| sesm wrote:
| IMO if you really care about certain process being responsive,
| you should allocate dedicated CPU cores and a contiguous region
| of memory to it, that shouldn't be touched by the rest of OS. Oh,
| and also give it direct access to a separate network card. I'm
| not sure if Linux supports this.
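|
| Linux does support most of this: cores can be reserved with the
| isolcpus=/nohz_full= kernel parameters or cpusets, a process can
| pin itself and lock its memory, and user-space NIC frameworks
| (e.g. DPDK) hand a network card to one process. A rough sketch
| of the pinning part (the core number is illustrative):
|
|     #define _GNU_SOURCE
|     #include <sched.h>
|     #include <sys/mman.h>
|
|     int claim_isolated_core(void)
|     {
|         cpu_set_t set;
|         CPU_ZERO(&set);
|         CPU_SET(3, &set);   /* assume core 3 was isolated at boot */
|         if (sched_setaffinity(0, sizeof(set), &set) != 0)
|             return -1;
|         return mlockall(MCL_CURRENT | MCL_FUTURE);
|     }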
| pardoned_turkey wrote:
| The conversation here focuses on a distinction between "hard"
| real-time applications, where you probably don't want a general-
| purpose OS like Linux no matter what; and "soft" real-time
| applications like videoconferencing or audio playback, where
| nothing terrible happens if you get a bit of stuttering or drop a
| couple of frames every now and then. The argument is that RT
| Linux would be a killer solution for that.
|
| But you can do all these proposed "soft" use cases with embedded
| Linux today. It's not like low-latency software video or audio
| playback is not possible, or wasn't possible twenty years ago.
| You only run into problems on busy systems where non-preemptible
| I/O could regularly get in the way. That's seldom a concern in
| embedded environments.
|
| I think there are compelling reasons for making the kernel fully-
| preemptible, giving people more control over scheduling, and so
| forth. But these reasons have relatively little to do with
| wanting Linux to supersede minimalistic realtime OSes or bare-
| metal code. It's just good hygiene that will result in an OS
| that, even in non-RT applications, behaves better under load.
| jovial_cavalier wrote:
| does HN have any thoughts on Xenomai[1]? I've been using it for
| years without issue.
|
| On a BeagleBone Black, it typically gives jitter on the order of
| hundreds of nanoseconds. I would consider it "hard" real-time (as
| do they). I'm able to schedule tasks periodically on the scale of
| tens of microseconds, and they never get missed.
|
| It differs from this in that Real-Time Linux attempts to make
| Linux itself preemptible, whereas Xenomai is essentially its own
| kernel, running Linux as a task on top. It provides an ABI which
| allows you to run your own tasks alongside or at higher prio than
| Linux. This sidesteps the `printk()` issue, for instance, since
| Xenomai doesn't care. It will gladly context switch out of printk
| in order to run your tasks.
|
| The downside is that you can't make normal syscalls while inside
| of the Xenomai context. Well... you can, but obviously this
| invalidates the realtime model. For example, calling `printf()`
| or `malloc()` inside of a Xenomai task is not preemptible. The
| Xenomai ABI does its best to replicate everything you may need as
| far as syscalls, which works great as long as you're happy doing
| your own heap allocations.
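|
| Xenomai's native task API is the usual route, but the shape of
| such a periodic task looks roughly like this portable sketch
| (period chosen arbitrarily; the POSIX skin maps onto calls like
| these):
|
|     #include <time.h>
|
|     #define PERIOD_NS 50000L   /* 50 us, illustrative */
|
|     static void add_ns(struct timespec *t, long ns)
|     {
|         t->tv_nsec += ns;
|         while (t->tv_nsec >= 1000000000L) {
|             t->tv_nsec -= 1000000000L;
|             t->tv_sec++;
|         }
|     }
|
|     void periodic_loop(void)
|     {
|         struct timespec next;
|         clock_gettime(CLOCK_MONOTONIC, &next);
|         for (;;) {
|             /* time-critical work: no printf(), no malloc() */
|             add_ns(&next, PERIOD_NS);
|             clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
|                             &next, NULL);
|         }
|     }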
|
| [1]: https://xenomai.org/
| dataflow wrote:
| I feel like focusing on the kernel side misses CPU level issues.
|
| Is there any known upper bound on, say, how long a memory access
| instruction takes on x86?
| rsaxvc wrote:
| I don't know for x86.
|
| But for things that really matter, I've tested by configuring
| the MMU to disable caching for the memory that the realtime
| code lives in and uses, to emulate a 0% hit rate. And there's
| usually still a fair amount of variance on top of that
| depending on if the memory controller has a small cache, and
| where the memory controller is in its refresh cycle.
| dataflow wrote:
| Yeah. And I'm not sure that even _that_ would give you the
| worst case as far as the cache is concerned. Of course I
| don't know how these implementations work, but it seems
| plausible that code that directly uses memory could run
| faster than code that encounters a cache miss beforehand (or
| contention, if you're using multiple cores). Moreover there's
| also the instruction cache, and I'm not sure if you can
| disable caching for that in a meaningful way?
|
| For soft real time, I don't see a problem. But for hard real
| time, it seems a bit scary.
| rsaxvc wrote:
| You're right! I can think of two cases I've run into where
| bypassing the cache can be faster compared to a miss.
|
| On some caches the line must be filled before allowing a
| write (ignoring any write buffer at the interface above the
| cache) - those basically halve the memory bandwidth when
| writing to a lot of cache lines. Some systems now have
| instructions for filling a cache line directly to avoid
| this. And some CPUs have bit-per-byte validity tracking to
| avoid this too.
|
| Even on caches with hit-during-fill, a direct read from an
| address near the last-to-be-filled end of a cacheline can
| sometimes be a little faster than a cache miss, since the
| miss will fill the rest of the line first.
| rsaxvc wrote:
| > Moreover there's also the instruction cache, and I'm not
| sure if you can disable caching for that in a meaningful
| way?
|
| Intels used to boot with their caches disabled, but I
| haven't worked with them in forever, and never multicore.
|
| I worked with a lot of microcontrollers, and it's not
| uncommon to be able to disable the instruction cache there.
|
| There are a few things that require the data caches too,
| like atomic accesses on ARM. Usually we were doing
| something fairly short though in our realtime code, so it
| was easy enough to map just the memory it needed as
| uncacheable.
| saagarjha wrote:
| You can continually take page faults in a Turing complete way
| without executing any code, so I would guess this is unbounded?
| dataflow wrote:
| I almost mentioned page faults, but that's something the
| kernel has control over. It could just make sure everything
| is in memory so there aren't any faults. So it's not really
| an issue I think.
___________________________________________________________________
(page generated 2023-11-17 23:01 UTC)