[HN Gopher] Antikernel: A Decentralized Secure Operating System ...
       ___________________________________________________________________
        
       Antikernel: A Decentralized Secure Operating System Architecture
       (2016) [pdf]
        
       Author : ingve
       Score  : 66 points
       Date   : 2021-06-18 14:53 UTC (8 hours ago)
        
 (HTM) web link (eprint.iacr.org)
 (TXT) w3m dump (eprint.iacr.org)
        
       | Veserv wrote:
       | I can see how this design might isolate allocated resources, but
       | how does this design prevent a malicious agent from just
       | requesting all free resources and DoS'ing the system other than
       | by having a fully pre-allocated static configuration?
        
       | [deleted]
        
       | glitchc wrote:
       | Here's the GitHub repo: https://github.com/azonenberg/antikernel
       | 
       | I was curious about the state of the project and whether any
       | source was available. Thought others might feel the same way.
        
       | hasmanean wrote:
       | This is a neat idea, one that I've been wanting to explore
        | recently. The devil is in the details of implementation
        | practicality, though.
       | 
       | One question is: how does this compare with micro-kernel OSes?
        | People thought that was the future even before Linux was
       | invented.
        
         | azonenberg wrote:
         | My guiding design principle was to take a trendline from
         | monolithic kernels to microkernels, then keep on going until
         | you had no kernel left.
        
       | mikewarot wrote:
       | It is an interesting idea, but I think it would be easy to simply
       | spam/scan the handle space to elevate privilege. I couldn't find
       | any mention of how handles/capabilities are protected from
       | impersonation.
       | 
        | [Update] Clearly it wasn't as simple as I thought, thanks!
        
         | azonenberg wrote:
         | NoC addresses are implicitly part of the handle. So for
         | example, a pointer isn't just physical address 0x41414141. The
         | full capability is "physical address 0x41414141 being accessed
         | by hardware thread 3 of CPU 0x8000" because when you
         | dereference that pointer, it creates a DMA packet from the
         | CPU's L1 cache with source address 0x8003 addressed to the NoC
         | address of the RAM controller.
         | 
         | If malicious code running in thread context 4 tries to reach
         | the same physical address, the packet will come from 0x8004.
         | Since the RAM controller knows that the page starting at
         | 0x41414000 is owned by thread 0x8003, the request is denied.
         | 
         | The source address is added by the CPU silicon and isn't under
         | software control, so it can't be overwritten or spoofed. If the
         | attacker were somehow able to change the "current thread ID"
         | registers, all register file and cache accesses would be
         | directed to that of the new thread, and all memory of their
         | evil self would be gone. Thus, all they did was trigger a
         | context switch.
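          | 
          | Roughly, the check on the RAM controller side can be modeled
          | like this in C (a toy model for illustration; the page size,
          | table layout, and names are simplified assumptions, not the
          | actual RTL):
          | 
          |     #include <stdbool.h>
          |     #include <stdint.h>
          | 
          |     #define PAGE_SHIFT 11      /* assumed 2 kB pages  */
          |     #define NUM_PAGES  1024    /* assumed pool size   */
          | 
          |     /* NoC address that owns each page (0 = free) */
          |     static uint16_t page_owner[NUM_PAGES];
          | 
          |     /* 'src' was stamped onto the DMA packet by the
          |        interconnect, not by software, so it can be
          |        trusted as the requester's identity.        */
          |     bool ram_access_allowed(uint16_t src,
          |                             uint32_t phys_addr)
          |     {
          |         uint32_t page = phys_addr >> PAGE_SHIFT;
          |         if (page >= NUM_PAGES)
          |             return false;
          |         /* e.g. page owned by 0x8003: a packet from
          |            0x8004 is denied, same physical address. */
          |         return page_owner[page] == src;
          |     }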
        
           | akkartik wrote:
           | This is awesome.
        
         | gamacodre wrote:
         | In the doc, they claim:
         | 
         | > Antikernel is designed to ensure that following are not
         | possible given that an attacker has gained unprivileged code
         | execution within the context of a user-space application or
         | service: ... or (iv) gain access to handles belonging to
         | another process by any means.
        
           | mikewarot wrote:
           | There are only 16 address bits, that's a very small space to
           | search.
        
             | azonenberg wrote:
             | Which is why the interconnect fabric adds the source
             | address based on who made the request.
             | 
             | Even hardware IP blocks don't have the ability to send
             | packets _from_ arbitrary addresses. You can send _to_
              | anywhere you want, but you can't lie about who made the
             | request. Which allows access control to be enforced at the
             | receiving end.
             | 
             | That "something you are" as well as "something you know"
             | factor prevents any capability from being spoofed, since
             | the identity of the device initiating the request is part
             | of the capability.
        
               | bruiseralmighty wrote:
                | I guess I'm still not understanding something about this
                | spoof protection.
               | 
               | What prevents me, an attacker, from making (or
               | simulating) a bunch of hardware with different
               | identities? Perhaps by setting up my hardware to run
               | through multiple different 'self' identities and then
               | trying to talk to other hardware with each disguise.
               | 
               | If the address space is small and cheap, it seems that I
               | could quickly map any decentralized hardware at low cost.
        
               | azonenberg wrote:
                | This is an SoC; the addresses of on-chip devices are fixed
                | when the silicon is made.
               | 
               | The source address is added to a packet by the on-chip
               | router as it's received. You can send anything you want
               | into a port, but you can't make it seem like it came from
               | a different port without modifying the router.
               | 
               | This model allows you to integrate IP cores from semi-
               | untrusted third parties, or run untrusted code on a CPU,
               | without allowing impersonation.
               | 
               | Let me put it another way: I ran a SAT solver on the
               | compiled gate-level netlist of the router and proved that
               | it can't ever emit a packet with a source address other
               | than the hard-wired address of the source port. Any
               | packet coming from a peripheral into port X will have
               | port X's address on it when it leaves the router, end of
               | story.
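                | 
                | In effect the ingress logic is as simple as this (a
                | toy C model of the behavior; the real design is RTL,
                | and the struct and field names are my own
                | illustration):
                | 
                |     #include <stdint.h>
                | 
                |     typedef struct {
                |         uint16_t dst;  /* chosen by sender  */
                |         uint16_t src;  /* set by the router */
                |         /* ... payload ... */
                |     } noc_packet;
                | 
                |     /* On ingress, any claimed source is
                |        discarded and replaced with the
                |        hard-wired address of the port the
                |        packet physically arrived on.       */
                |     void stamp_source(noc_packet *pkt,
                |                       uint16_t port_addr)
                |     {
                |         pkt->src = port_addr;
                |     }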
        
       | vlovich123 wrote:
       | > To create a process, the user sends a message to the CPU core
       | he wishes to run it on
       | 
       | What happens if the user doesn't care which CPU the process runs
       | on?
       | 
        | This is certainly an interesting execution model. I think it
        | should be adopted for IoT; given the simplicity of devices
        | there, it's a perfect fit. I do worry that this may increase
        | the BOM & make this model only cost-effective for small runs.
        | Code is generally a "fixed" upfront cost amortized over the
        | number of units you sell (the more units you reuse the
        | software across, the cheaper that software cost is per unit),
        | whereas more complex hardware is a constant % cost increase
        | for each unit (ignoring the costs of building that HW).
       | 
        | I think traditional CPUs/operating systems may pose additional
        | obstacles to adoption & I'm interested to hear the author's
        | thoughts on this. Ignoring the ecosystem aspect (which is big,
        | but let's assume this idea is revolutionary enough that
        | everyone pivots to making this happen), how would you apply
        | security & resource usage policies? Let's say I want to access
        | the file "/bin/ps". Traditionally we have SW that controls
        | your access to that file. Additionally, we can choose to
        | sandbox your process' access to that file, restrict how much
        | of the available HW utilization can be dedicated to servicing
        | it, etc. If we implement it in HW, is the ability of these
        | policies to evolve over time fixed at manufacturing time? I
        | wonder if that proves to be a serious roadblock. In theory you
        | could have something sitting in the middle to apply the
        | higher-level policy, but I think that effectively reintroduces
        | a microkernel for that purpose. You could say it could be
        | implemented, in this example, at the filesystem layer (i.e.
        | the thing that reads, say, EXT4 & is the only process talking
        | to that block device), but resource usage would prove tricky
        | (e.g. if a block device had an EXT4 & an EXT3 partition, how
        | would you restrict a process to only occupy 5% of the I/O time
        | on the block device?).
        
         | azonenberg wrote:
         | I was targeting industrial control systems, medical implants,
         | and other critical applications. So a lot of the constraints of
         | desktop/mobile/server computing just don't apply.
         | 
         | The assumption was that you'd end up with something close to a
         | microkernel architecture, but without any all-powerful software
         | up top. A purely peer to peer architecture running on bare
         | metal.
        
       | cogman10 wrote:
       | Interesting and wildly impractical :D
       | 
        | For this to fly, it requires a ground-up rewrite of fundamental
        | principles of hardware.
       | 
       | For example, how on earth would paging work in such a world? Who
       | or what would be in charge of swapping memory?
       | 
       | How would a system admin go about killing an errant process?
       | 
        | How would a developer code on such a machine, which allows no
        | inspection of remote application memory?
       | 
        | Beyond that, it seems like instead of decreasing the amount of
        | trust needed to function, it increases it with every hardware
        | manufacturer. All while making security holes harder to fix
        | (since they are now part of the hardware itself).
       | 
        | Imagine, for example, your RAM having a bug that allows you to
        | access memory regions you are not privileged to access.
       | 
        | It's a nice idea, but I think Google's Zircon microkernel
        | approach is far more practical while being nearly as secure and
        | not requiring special-purpose hardware.
        
         | azonenberg wrote:
         | That was the point of the research: throw away how everything
         | has been done and explore what a clean slate redesign would
         | look like if we had the benefit of 30-40 years of hindsight.
         | 
         | The easiest way to implement transparent paging would be to
         | have a paging block that exposed the memory manager API
         | (allocate, free, read, write) and would proxy requests to
         | either RAM or disk as appropriate. But since I was targeting
         | embedded devices, it was assumed that you would explicitly
         | allocate buffers in on-die RAM, off-die RAM, or flash as
         | required. The architecture was very much NUMA in nature.
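          | 
          | For a sense of what that message-level API looks like, here
          | is a sketch in C (opcodes and field layout are illustrative
          | guesses, not the exact wire format from the thesis):
          | 
          |     #include <stdint.h>
          | 
          |     enum mem_op {
          |         MEM_ALLOCATE,  /* get a page, become owner */
          |         MEM_FREE,      /* release an owned page    */
          |         MEM_READ,      /* DMA data out of a page   */
          |         MEM_WRITE      /* DMA data into a page     */
          |     };
          | 
          |     typedef struct {
          |         uint16_t dst;       /* NoC addr of RAM/paging */
          |         uint16_t src;       /* stamped by the router  */
          |         uint8_t  op;        /* one of enum mem_op     */
          |         uint32_t phys_addr; /* page or buffer address */
          |         uint32_t len;       /* payload length for R/W */
          |     } mem_msg;
          | 
          | A transparent paging block would accept the same opcodes and
          | simply forward each request to RAM or disk depending on
          | where the page currently lives.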
         | 
         | The prototype made for my thesis only allowed the "terminate"
         | request to come from the process requesting its own
         | termination. It would be entirely plausible to add access
         | controls that allowed a designated supervisor process to
         | terminate it as well.
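          | 
          | As a sketch, that access check is just a comparison of NoC
          | source addresses (the 'supervisor_addr' parameter below is
          | the hypothetical extension, not something the prototype
          | had):
          | 
          |     #include <stdbool.h>
          |     #include <stdint.h>
          | 
          |     /* May 'src' terminate the thread that lives at
          |        NoC address 'thread_addr'?                   */
          |     bool may_terminate(uint16_t src,
          |                        uint16_t thread_addr,
          |                        uint16_t supervisor_addr)
          |     {
          |         return src == thread_addr       /* self       */
          |             || src == supervisor_addr;  /* supervisor */
          |     }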
         | 
         | Remote memory can be inspected, but only with explicit consent.
         | I had a full gdbserver that would access memory by proxying
         | through the L1 cache of the thread context being debugged.
         | (Which, of course, had the ability to say no if it didn't want
         | to be debugged).
         | 
          | The goal was to have a system that had a very small number of
         | trusted hardware blocks which did only one thing, then build an
         | OS on top of it microkernel style - except that all you have is
         | unprivileged servers, there's no longer any software on top.
        
           | cogman10 wrote:
           | What was your plan to address things like hardware
           | bugs/errata/updates? How could someone initiate a firmware
           | flash in this sort of system?
        
             | azonenberg wrote:
             | Keep the hardware design as simple as possible, formally
             | verify everything critical, then run the rest as userspace
             | software and hope you got all the critical bugs.
             | 
             | The point wasn't to move everything into silicon, it was to
             | move _just enough_ into silicon that you no longer needed
             | any code in ring-0.
             | 
              | As an example, the memory controller's access control list
              | and allocator consisted of a FIFO of free pages and an array
              | storing an owner for each page. Super simple, very few gates,
              | hard to get wrong, and easy to verify.
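              | 
              | A toy C model of that allocator (the real thing is a
              | small block of RTL; sizes and names here are my own
              | assumptions):
              | 
              |     #include <stdbool.h>
              |     #include <stdint.h>
              | 
              |     #define NPAGES 1024        /* assumed pool size */
              | 
              |     static uint32_t free_fifo[NPAGES]; /* free page #s */
              |     static uint32_t head, tail, count;
              |     static uint16_t owner[NPAGES];     /* 0 = unowned  */
              | 
              |     /* at reset, the FIFO is filled with every page */
              |     void init_allocator(void)
              |     {
              |         for (uint32_t i = 0; i < NPAGES; i++)
              |             free_fifo[i] = i;
              |         head = 0; tail = 0; count = NPAGES;
              |     }
              | 
              |     bool alloc_page(uint16_t requester, uint32_t *page)
              |     {
              |         if (count == 0)
              |             return false;          /* out of pages  */
              |         *page = free_fifo[head];
              |         head  = (head + 1) % NPAGES;
              |         count--;
              |         owner[*page] = requester;  /* record owner  */
              |         return true;
              |     }
              | 
              |     bool free_page(uint16_t requester, uint32_t page)
              |     {
              |         if (page >= NPAGES || owner[page] != requester)
              |             return false;          /* not the owner */
              |         owner[page] = 0;
              |         free_fifo[tail] = page;
              |         tail  = (tail + 1) % NPAGES;
              |         count++;
              |         return true;
              |     }
              | 
              | Every access check is then a single lookup in the owner
              | array, which is what keeps the gate count so low.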
        
               | cogman10 wrote:
               | What about the more complex hardware such as the CPU?
               | There are plenty of opportunities for mistakes there,
               | some not so obvious (such as Spectre attacks). And I
               | can't imagine you'd get away with completely isolating it
               | like the memory.
        
               | azonenberg wrote:
               | My long term plan was actually to do a full formal
               | verification of the CPU against the ISA spec and prove no
               | state leakage between thread contexts, but I didn't have
                | time to do that before I graduated.
               | 
                | I deliberately went with a very simple CPU (2-way
                | in-order, barrel scheduler with no pipeline
                | forwarding, no speculation or branch prediction) to
                | minimize opportunities for things to go wrong and
                | keep the design simple enough that full end-to-end
                | formal verification would be tractable in the
                | future. Spectre/Meltdown are a perfect
               | example of attack classes that are entirely eliminated by
               | such a simple design.
               | 
               | I was targeting safety critical systems where you're
               | willing to give up some CPU performance for extreme
               | levels of assurance that the system won't fail on you.
        
               | aidenn0 wrote:
               | At what point does it become easier to formally verify
               | the software than the hardware? Certainly it's easier to
               | change the software than the hardware, so there is
               | already incentive to not move things into hardware that
               | need not be there.
        
               | akkartik wrote:
               | Since software is so easy to change, nobody bothers
               | verifying it. The more we put into software, the more
               | room there is for cutting corners in testing because, "we
               | can always send out a firmware upgrade later." Me, I'd
               | rather go back to the old days when updates to hardware
               | were expensive so you better get it right.
        
               | peheje wrote:
               | It's like writing with a ball pen vs a pencil. But you
                | _could_ be just as careful with a pencil. I'm just
               | wondering if what you are proposing is the equivalent of
               | runes.
        
               | [deleted]
        
       | avmich wrote:
       | One of the problems with microkernels is performance - messages
       | are being sent between parts and that takes time. Here we have
        | sort of an ultimate microkernel - can we have enough hardware
        | support to make performance competitive, while keeping the main
        | advantages?
        
         | azonenberg wrote:
         | This architecture also borrowed a lot from the exokernel
         | philosophy, which would actually have provided significant
         | speedups by cutting unnecessary bloat and abstraction.
         | 
         | My conjecture was that this would make up for most of the
         | overhead, but I didn't have time during the thesis to optimize
         | it sufficiently to do a fair comparison with existing
         | architectures.
        
           | avmich wrote:
            | Andrew, I think it's a wonderful project, but meaningfully
            | evaluating it definitely takes way more than an hour to
            | figure out what is done here, why, and how :) Reading
            | articles is hard. Thanks for your work!
        
       | fsflover wrote:
       | This reminds me of Qubes Air, which I think will be implemented
       | earlier: https://www.qubes-os.org/news/2018/01/22/qubes-air/
        
       ___________________________________________________________________
       (page generated 2021-06-18 23:01 UTC)