[HN Gopher] An EEVDF CPU Scheduler for Linux
       ___________________________________________________________________
        
       An EEVDF CPU Scheduler for Linux
        
       Author : marcodiego
       Score  : 47 points
       Date   : 2023-03-25 19:51 UTC (3 hours ago)
        
 (HTM) web link (lwn.net)
 (TXT) w3m dump (lwn.net)
        
       | loeg wrote:
       | EEVDF is an acronym, and capitalized in the original article
       | (title on HN is currently "An Eevdf CPU Scheduler for Linux").
        
         | junon wrote:
         | HN does this automatically.
        
       | steponlego wrote:
       | It's a shame that the userspace governor was dropped by Intel; it
       | worked amazingly well on laptops.
        
         | 867-5309 wrote:
         | is this why `powersave` or `performance` are the only options
         | available now?
        
       | mochomocha wrote:
       | While this looks like a step in the right direction to provide
       | user-space per-process control of the latency vs. throughput
       | tradeoff (lower runqueue latency at the expense of shorter
       | timeslices), I'd rather see the SCHED_EXT patch [1] merged, so
       | there's a lower barrier to entry when it comes to implementing
       | alternative scheduling policies (eBPF vs. in-kernel). It's also
       | much faster to iterate in user space if you don't have to upgrade
       | kernels every time, especially in large fleet deployments where
       | A/B testing is available.
       | 
       | On a more controversial note, I place more trust in Meta and
       | Google engineers proposing alternative schedulers that have been
       | properly A/B tested on millions of servers running a wide range
       | of heterogeneous software than in small-scale synthetic
       | benchmarks, even when those are run by experts/maintainers.
       | 
       | [1]: https://lwn.net/ml/linux-kernel/20221130082313.3241517-1-tj@...
        
       | binkHN wrote:
       | For those, like me, who don't know what this is:
       | 
       | > The "Earliest Eligible Virtual Deadline First" (EEVDF)
       | scheduling algorithm is not new; it was described in [a] 1995
       | paper by Ion Stoica and Hussein Abdel-Wahab. Its name suggests
       | something similar to the Earliest Deadline First algorithm used
       | by the kernel's deadline scheduler but, unlike that scheduler,
       | EEVDF is not a realtime scheduler, so it works in different ways.
        
         | 867-5309 wrote:
         | still none the wiser..
        
       | Osiris wrote:
       | What I'm looking for is a governor that makes sure processes like
       | X/Wayland that handle user input are always available to handle
       | user input at the cost of other processes.
       | 
       | If I'm running a video encoding using all available CPUs, I don't
       | want that to cause lag with all my other processes.
       | 
       | This happens to me a lot when running node processes like unit
       | tests that run on all cores. It slows everything else down. I
       | even set the nice value on node to always be the lowest priority
       | and it only helps a little.
       | 
       | I've actually had Linux become completely unresponsive when
       | running many large node instances. That shouldn't happen.
        
         | invalidator wrote:
         | It will always have some impact because you're thrashing the
         | caches.
         | 
         | A couple more things to try:
         | 
         | Reduce swappiness. Linux tries to swap out idle pages to free
         | memory for block cache, even when there's only light memory
         | pressure. If you have a dozen processes working hard and your
         | UI is untouched for several minutes, it can get swapped out and
         | lag.
         | 
         |     sudo sysctl vm.swappiness=1
         | 
         | Set the CPU affinity for your build processes to exclude CPU 0,
         | ensuring one is always free for UI:
         | 
         |     taskset 0xFFFFFFFE nice your_build_command_here
         | 
         | Neither is perfect but they can help.
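
The affinity suggestion above can also be applied from inside a launcher script instead of via `taskset`. The following is a minimal Python sketch, assuming Linux (the `os.sched_*affinity` calls are not available elsewhere). The helper names `mask_without_cpu0` and `cpus_without_cpu0` are made up for illustration and are not part of any tool mentioned in the thread.

```python
import os

def mask_without_cpu0(n_cpus: int) -> int:
    """Bitmask with one bit per CPU and bit 0 (CPU 0) cleared."""
    return ((1 << n_cpus) - 1) & ~1

def cpus_without_cpu0(available: set) -> set:
    """Drop CPU 0 from a CPU set, unless that would leave nothing to run on."""
    rest = set(available) - {0}
    return rest or set(available)

if __name__ == "__main__" and hasattr(os, "sched_setaffinity"):
    # pid 0 means "the calling process"; child processes inherit the affinity.
    allowed = cpus_without_cpu0(os.sched_getaffinity(0))
    os.sched_setaffinity(0, allowed)
    print(f"restricted to CPUs {sorted(allowed)}")
```

For a 32-CPU mask, `mask_without_cpu0(32)` gives the same `0xFFFFFFFE` value passed to `taskset` above; on machines with more CPUs the mask has to be wider to cover them all.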
        
         | post-factum wrote:
         | Setting the nice level is not enough. Instead, the `SCHED_IDLE`
         | policy should be applied to the workload that is being run in
         | the background.
         | 
         | [1] may help if you run such a workload from the CLI.
         | 
         | [1] https://codeberg.org/post-factum/litter
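
For readers who want to try the `SCHED_IDLE` suggestion without an extra tool, here is a rough Python sketch, assuming Linux and a Python build that exposes `os.SCHED_IDLE`. The function names `demote_to_idle` and `run_in_background` are hypothetical. This is not how the linked project works, only one way to request the policy through the standard library.

```python
import os
import subprocess

def demote_to_idle(pid: int = 0) -> bool:
    """Try to move `pid` (0 = this process) to SCHED_IDLE; report success."""
    if not hasattr(os, "SCHED_IDLE"):
        return False  # not Linux, or the constant is unavailable
    try:
        # SCHED_IDLE is a non-real-time policy, so the static priority is 0.
        os.sched_setscheduler(pid, os.SCHED_IDLE, os.sched_param(0))
    except OSError:
        return False
    return True

def run_in_background(argv):
    """Run argv with SCHED_IDLE requested in the child just before exec."""
    return subprocess.run(argv, preexec_fn=demote_to_idle)
```

A call such as `run_in_background(["your_build_command_here"])` would then start the workload under the idle policy. The scheduling policy is inherited across fork, so processes spawned by the workload normally stay in `SCHED_IDLE` too.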
        
         | wongarsu wrote:
         | Windows handles this by giving the foreground window time slots
         | that are three times longer than normal (unless you are on a
         | server version or change the setting).
         | 
         | It also boosts the priority of threads that get woken up after
         | waiting for IO, events, etc, as opposed to being CPU bound.
         | Which I guess Linux's fair scheduler also kind of ends up
         | doing, even if via an entirely different mechanism (simply by
         | observing that they used less CPU).
         | 
         | It's interesting how different operating systems take
         | completely different approaches to their schedulers. Linux
         | seems to try to make quite sophisticated schedulers, trying out
         | very different concepts, but keeps the scheduler very separated
         | and "blind" to the rest of the system. Meanwhile Windows has an
         | incredibly simple scheduler (run the highest-priority runnable
         | thread, interrupt it if its timeslot is over or a more
         | important thread becomes ready, repeat) and puts all the effort
         | into letting the rest of the kernel nudge priorities and
         | timeslot lengths based on what the process or thread is doing.
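
The loop described in the parenthesis above (run the highest-priority runnable thread, interrupt it when its timeslot ends or a more important thread becomes ready) can be illustrated with a toy simulation. This is only a sketch of that one-sentence description, not of the real Windows scheduler. The `Thread` class, the `simulate` function, the tick-based time model, and the choice to give a preempted thread a fresh timeslot are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Thread:
    name: str
    priority: int      # higher value runs first
    work: int          # ticks of CPU still needed (must be >= 1)
    ready_at: int = 0  # tick at which the thread becomes runnable

def simulate(threads, quantum=2):
    """Return the name of the thread that ran on each tick (None = idle)."""
    pending = sorted(threads, key=lambda t: t.ready_at)
    queues = {}            # priority -> deque of runnable threads
    timeline = []
    current, used, tick = None, 0, 0

    def best_priority():
        return max((p for p, q in queues.items() if q), default=None)

    while pending or current is not None or best_priority() is not None:
        # Wake every thread whose wait is over.
        while pending and pending[0].ready_at <= tick:
            t = pending.pop(0)
            queues.setdefault(t.priority, deque()).append(t)
        best = best_priority()
        if current is not None and best is not None and best > current.priority:
            # A more important thread is ready: preempt, keep our place in line.
            queues.setdefault(current.priority, deque()).appendleft(current)
            current = None
        elif current is not None and used >= quantum:
            # Timeslot is over: go to the back of the same-priority queue.
            queues.setdefault(current.priority, deque()).append(current)
            current = None
        if current is None:
            best = best_priority()
            if best is None:
                timeline.append(None)
                tick += 1
                continue
            current, used = queues[best].popleft(), 0
        current.work -= 1      # note: mutates the Thread objects passed in
        used += 1
        timeline.append(current.name)
        if current.work == 0:
            current = None
        tick += 1
    return timeline
```

With a low-priority `batch` thread and a high-priority `ui` thread that wakes at tick 2, the `ui` thread runs immediately on waking and `batch` resumes afterwards. The priority and timeslot "nudging" that the comment attributes to the rest of the kernel would correspond to changing `priority` and `quantum` from outside this loop.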
        
         | saint_yossarian wrote:
         | Have you tried a low-latency kernel, like Liquorix?
        
         | pengaru wrote:
         | Your compositor could already be run at realtime priority...
         | the risk is that if the process has bugs that result in
         | infinite loops, you may lose the ability to interact with the
         | system.
        
           | PlutoIsAPlanet wrote:
           | Wouldn't you also need some kind of priority system in the
           | graphics system? If you have a game open using all the
           | available GPU resources, it doesn't matter that input can't be
           | blocked if you can't get any updates to the screen.
        
             | pengaru wrote:
             | Well, for anything actually putting pixels on the screen
             | under the compositor's purview, the compositor controls the
             | framerate of the game and there needs to be context
             | switching involving the scheduler to get things displayed.
             | 
             | I'm not clear on what mechanisms exist today for preventing
             | things like a DoS of GPU resources by what amounts to a
             | single entry into the GPU driver that never returns to
             | userspace.
             | We've all seen things like GPU stalls reported in dmesg, so
             | clearly the drivers are tasked with preventing that sort of
             | hang where a process enters the driver and stays there too
             | long.
        
           | jlokier wrote:
           | The Linux kernel since 2.6.25 provides CPU time limits on
           | real-time priority so that runaway real-time tasks don't
           | prevent other interactive processes from having some CPU
           | time. So you can still interact with the system.
           | 
           | "man 7 sched" provides a lot of detail about the real-time
           | scheduling policies and options.
           | 
           | (Of course if the compositor is essential to the kinds of
           | interaction you need, e.g. via the GUI, you'll still be stuck
           | if the compositor stops working. But that's not a real-time
           | priority problem.)
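
The CPU time limit mentioned above is exposed through two files documented in sched(7), `/proc/sys/kernel/sched_rt_runtime_us` and `/proc/sys/kernel/sched_rt_period_us`. Their documented defaults are 950000 and 1000000, and a runtime of -1 disables the limit. Below is a small Python sketch for inspecting them. The helper names `rt_share` and `read_rt_limits` are invented here, and the files may be absent on some systems.

```python
from pathlib import Path

PROC = Path("/proc/sys/kernel")

def rt_share(runtime_us: int, period_us: int) -> float:
    """Fraction of each period that real-time tasks may consume."""
    if runtime_us < 0:  # -1 means the limit is disabled
        return 1.0
    return min(runtime_us / period_us, 1.0)

def read_rt_limits(proc: Path = PROC):
    """Return (runtime_us, period_us), or None if the files are missing."""
    try:
        runtime = int((proc / "sched_rt_runtime_us").read_text())
        period = int((proc / "sched_rt_period_us").read_text())
    except (OSError, ValueError):
        return None
    return runtime, period

if __name__ == "__main__":
    limits = read_rt_limits()
    if limits is None:
        print("real-time throttling settings not found")
    else:
        print(f"real-time tasks may use {rt_share(*limits):.0%} of each period")
```

With the documented defaults, real-time tasks get at most 95% of each one-second period, which leaves the remaining 5% for everything else.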
        
             | pengaru wrote:
             | Nice! The last time I programmed something with SCHED_FIFO
             | was long ago, and required a lot of sysrq.
        
       | xeeeeeeeeeeenu wrote:
       | I'm surprised the article doesn't mention BFS/MuQSS because it's
       | also based on the EEVDF algorithm.
        
       ___________________________________________________________________
       (page generated 2023-03-25 23:00 UTC)