[HN Gopher] The Linux kernel can spawn processes on its own
       ___________________________________________________________________
        
       The Linux kernel can spawn processes on its own
        
       Author : zdw
       Score  : 86 points
       Date   : 2022-06-11 14:48 UTC (8 hours ago)
        
 (HTM) web link (www.uninformativ.de)
 (TXT) w3m dump (www.uninformativ.de)
        
       | phendrenad2 wrote:
       | Interesting find. The kernel has all kinds of weird and wacky
       | things in drivers.
        
       | josephcsible wrote:
       | I wonder what the parent PID of these processes is.
        
         | cmeacham98 wrote:
         | init has a ppid of 0, which effectively is a special value that
         | means the kernel. These processes probably also do the same
         | thing (and thus, ppid=0 is a good way to detect kernel-started
         | processes).
        
           | jstimpfle wrote:
           | I think they could be spawned with ppid 1 as well.
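
One way to check this on a running system, as a minimal sketch
assuming a Linux /proc layout: the field after the state in
/proc/<pid>/stat is the parent PID. The helper names parse_ppid and
list_ppids are made up for this illustration.

```python
import os


def parse_ppid(stat_line: str) -> int:
    """Return the parent PID from the contents of /proc/<pid>/stat."""
    # The comm field is wrapped in parentheses and may itself contain
    # spaces or ')' characters, so split after the LAST ')'.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is the process state, rest[1] is the ppid.
    return int(rest[1])


def list_ppids(proc_root: str = "/proc") -> dict:
    """Map pid -> ppid for every process visible under proc_root."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                result[int(entry)] = parse_ppid(f.read())
        except OSError:
            continue  # the process exited while we were scanning
    return result
```

Filtering the result of list_ppids() for a given parent (0, 1, or
another PID) shows which processes hang off it at that moment.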
        
       | Ericson2314 wrote:
       | I want to stop doing -j with make, and instead allow the OS
       | scheduler (which alone is in the position to make the right
       | call) to "pull" more batch jobs to run as system resources
       | allow.
       | 
       | Perhaps one can do this or almost do this with io_uring?
        
         | gtirloni wrote:
         | _> which alone is in the position to make the right call_
         | 
         | Why is the OS scheduler alone in a better position than make
         | itself to make the right call?
        
           | Ericson2314 wrote:
           | Everyone else is guessing how many "fork" commands to throw
           | at the OS. The OS itself has much more information at its
           | disposal about what resources are being used, where the
           | bottlenecks are, etc.
           | 
           | An elegant design for doing this in userspace probably boils
           | down to going well on the way to making a microkernel.
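
For comparison, the userspace "guessing" described above can be
sketched as a launcher that starts a new job only while the 1-minute
load average is under a limit, roughly the idea behind make's
-l/--load-average option. This is an illustration under those
assumptions, not how make is implemented; run_throttled and its
parameters are hypothetical names.

```python
import os
import subprocess
import time


def run_throttled(commands, max_load, get_load=None,
                  wait=time.sleep, poll_seconds=0.5):
    """Start each command, holding back while the load is too high.

    At least one job is always allowed to run, so progress is made
    even when the load never drops below max_load.
    Returns the exit codes in completion order.
    """
    if get_load is None:
        # 1-minute load average; os.getloadavg() is Unix-only.
        get_load = lambda: os.getloadavg()[0]
    running = []
    exit_codes = []
    for cmd in commands:
        while True:
            # Reap jobs that have finished since the last check.
            still_running = []
            for proc in running:
                if proc.poll() is None:
                    still_running.append(proc)
                else:
                    exit_codes.append(proc.returncode)
            running = still_running
            if not running or get_load() < max_load:
                break
            wait(poll_seconds)
        running.append(subprocess.Popen(cmd))
    # Wait for whatever is still in flight.
    for proc in running:
        exit_codes.append(proc.wait())
    return exit_codes
```

The load average is a coarse, delayed signal, which is the point
being made: the launcher only sees a summary of what the scheduler
already knows in detail.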
        
         | alchemist1e9 wrote:
         | That would be a really neat feature. I do run slurm on single
         | nodes and sbatch with dependencies to emulate some of the
         | effect you are asking for.
        
           | Ericson2314 wrote:
           | Glad to hear you think so too!
        
       | 762236 wrote:
       | If you use SELinux, you'd see that it has the 'kernel' domain.
        
       | pengaru wrote:
       | When the kernel is the gatekeeper for the system, obviously it
       | can do anything userspace can, and then some.
        
         | adrianmonk wrote:
         | Also, I can speak 10 languages fluently... in the sense that
         | nobody has the power to stop me from learning 9 more languages
         | if I should decide to do that.
        
         | temac wrote:
         | It can do anything theoretically. It doesn't do anything
         | architecturally. For example, contrary to NT, syscalls can't
         | callback to userspace.
        
           | the8472 wrote:
           | > For example, contrary to NT, syscalls can't callback to
           | userspace.
           | 
           | There are some mechanisms that can call back into userspace
           | during syscalls such as seccomp filters, FUSE, ptrace,
           | userfaultfd, fanotify, the syscall_user_dispatch feature used
           | by wine... There's the core_pattern handler too.
           | 
           | Someone summarized it like this: a _mov_ instruction could
           | be serviced by starting a Python process.
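
The core_pattern handler mentioned above is configured through
/proc/sys/kernel/core_pattern: when the value starts with '|', the
kernel runs the rest of the line as a userspace program and pipes
the core dump to it. A small sketch for inspecting that setting
follows; the function names are hypothetical.

```python
def parse_core_pattern(pattern: str):
    """Classify a core_pattern value as ("pipe", cmd) or ("file", tmpl)."""
    pattern = pattern.rstrip("\n")
    if pattern.startswith("|"):
        # Everything after the pipe symbol is the helper command line.
        return ("pipe", pattern[1:])
    # Otherwise the value is a file name template for the core file.
    return ("file", pattern)


def read_core_pattern(path: str = "/proc/sys/kernel/core_pattern"):
    """Read and classify the current setting (Linux only)."""
    with open(path) as f:
        return parse_core_pattern(f.read())
```

A "pipe" result means a crashing process causes the kernel to start
the named helper in userspace.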
        
             | saagarjha wrote:
             | This is convenient but having the kernel block on userspace
             | doing something is typically bad design :(
        
           | p_l wrote:
           | Arguably signals were the classic "upcall", and netlink is
           | probably the main mechanism for upcalls in current Linux.
           | 
           | But NT's hidden VMS-esque API for upcalls would be a somewhat
           | nicer option at times (or Solaris Doors).
        
           | dmatech wrote:
           | Intel processors have a feature called Supervisor Mode
           | Execution Prevention (SMEP) designed specifically to prevent
           | that in most cases (because there's a decent chance the code
           | might be attacker-controlled). It's totally optional, of
           | course.
           | 
           | https://lwn.net/Articles/517475/
        
       | tremon wrote:
       | This isn't new, right? Firmware loaders and hardware hotplug
       | events (when not using sysfs' netlink firehose) are also spawned
       | directly from the kernel, IIRC?
        
         | dezgeg wrote:
         | In the cases you mentioned, the kernel has spawned a process
         | loaded from a normal ELF executable on the filesystem, just
         | like you could do with exec(). But in this bpfilter_umh case
         | the executable does not come from the filesystem; instead it
         | is embedded in the kernel itself. That seems to be quite a
         | new invention.
        
       | chasil wrote:
       | This seems similar to nfsd, which is only present in many UNIX
       | implementations to prompt the kernel to service NFS requests.
       | 
       | A lot of (local) filesystems seem to do this as well (XFS in
       | particular).
        
         | [deleted]
        
         | temac wrote:
         | A lot of kernel code spawns threads. I was not aware that it can
         | also spawn usermode processes running code embedded into the
         | kernel. That should be advertised and developed more IMO. Tons
         | of drivers should have nothing to do in kernelmode.
        
       | convolvatron wrote:
       | how else would init get started?
        
         | anyfoo wrote:
         | After bootstrapping, the kernel sits in its idle task and does
         | absolutely nothing except serving interrupts, the scheduler
         | only switching between kernel threads at most. Then, after a
         | set amount of time (5 minutes), another device with full DRAM
         | access (or sufficiently configured IOMMU) and in the same
         | coherency domain (or some contraption to make things coherent)
         | starts to DMA the init task and the appropriate changes to the
         | kernel's task and thread set data structures into memory. When
         | made coherent, the kernel's scheduler will see the new init
         | task's thread in its set of runnable threads and eventually
         | switch to it.
         | 
         | What? You did not ask for a method that is _not_ completely
         | batshit insane.
        
       ___________________________________________________________________
       (page generated 2022-06-11 23:00 UTC)