[HN Gopher] Antikernel: A Decentralized Secure Operating System ...
___________________________________________________________________
Antikernel: A Decentralized Secure Operating System Architecture
(2016) [pdf]
Author : ingve
Score : 66 points
Date : 2021-06-18 14:53 UTC (8 hours ago)
(HTM) web link (eprint.iacr.org)
(TXT) w3m dump (eprint.iacr.org)
| Veserv wrote:
| I can see how this design might isolate allocated resources, but
| how does it prevent a malicious agent from simply requesting all
| free resources and DoS'ing the system, short of a fully
| pre-allocated static configuration?
| [deleted]
| glitchc wrote:
| Here's the GitHub repo: https://github.com/azonenberg/antikernel
|
| I was curious about the state of the project and whether any
| source was available. Thought others might feel the same way.
| hasmanean wrote:
| This is a neat idea, one that I've been wanting to explore
| recently. The devil is in the details of implementation
| practicality, though.
|
| One question is: how does this compare with microkernel OSes?
| People thought those were the future even before Linux was
| invented.
| azonenberg wrote:
| My guiding design principle was to take a trendline from
| monolithic kernels to microkernels, then keep on going until
| you had no kernel left.
| mikewarot wrote:
| It is an interesting idea, but I think it would be easy to simply
| spam/scan the handle space to elevate privilege. I couldn't find
| any mention of how handles/capabilities are protected from
| impersonation.
|
| [Update] Clearly it wasn't as simple as I thought, thanks!
| azonenberg wrote:
| NoC addresses are implicitly part of the handle. So for
| example, a pointer isn't just physical address 0x41414141. The
| full capability is "physical address 0x41414141 being accessed
| by hardware thread 3 of CPU 0x8000" because when you
| dereference that pointer, it creates a DMA packet from the
| CPU's L1 cache with source address 0x8003 addressed to the NoC
| address of the RAM controller.
|
| If malicious code running in thread context 4 tries to reach
| the same physical address, the packet will come from 0x8004.
| Since the RAM controller knows that the page starting at
| 0x41414000 is owned by thread 0x8003, the request is denied.
|
| The source address is added by the CPU silicon and isn't under
| software control, so it can't be overwritten or spoofed. If the
| attacker were somehow able to change the "current thread ID"
| registers, all register file and cache accesses would be
| redirected to those of the new thread, and all memory of their
| evil self would be gone. Thus, all they would have done is
| trigger a context switch.
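|
| In C terms (a software model with illustrative names; the real
| check is a handful of gates in the RAM controller, not
| software), the decision looks something like this:
|
|     #include <stdbool.h>
|     #include <stdint.h>
|
|     #define PAGE_SHIFT 11        /* illustrative 2 kB pages */
|     #define NUM_PAGES  4096
|
|     /* The source address is stamped by the NoC router in
|        silicon; software never gets to set this field. */
|     typedef struct {
|         uint16_t src_addr;  /* e.g. 0x8003 = CPU 0x8000, thread 3 */
|         uint32_t phys_addr; /* address being dereferenced */
|     } noc_packet;
|
|     static uint16_t page_owner[NUM_PAGES]; /* owner per page */
|
|     /* Permit the access only if the requester owns the page. */
|     static bool ram_access_allowed(const noc_packet *pkt)
|     {
|         uint32_t page = (pkt->phys_addr >> PAGE_SHIFT) % NUM_PAGES;
|         return page_owner[page] == pkt->src_addr;
|     }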
| akkartik wrote:
| This is awesome.
| gamacodre wrote:
| In the doc, they claim:
|
| > Antikernel is designed to ensure that the following are not
| possible given that an attacker has gained unprivileged code
| execution within the context of a user-space application or
| service: ... or (iv) gain access to handles belonging to
| another process by any means.
| mikewarot wrote:
| There are only 16 address bits; that's a very small space to
| search.
| azonenberg wrote:
| Which is why the interconnect fabric adds the source
| address based on who made the request.
|
| Even hardware IP blocks don't have the ability to send
| packets _from_ arbitrary addresses. You can send _to_ anywhere
| you want, but you can't lie about who made the request. That
| allows access control to be enforced at the receiving end.
|
| That "something you are" as well as "something you know"
| factor prevents any capability from being spoofed, since
| the identity of the device initiating the request is part
| of the capability.
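|
| In the same C-model terms (hypothetical names; the real logic
| is combinational in the router), the stamping amounts to:
|
|     #include <stdint.h>
|
|     typedef struct {
|         uint16_t src_addr;   /* overwritten at the ingress port */
|         uint16_t dst_addr;   /* sender may pick any destination */
|     } noc_header;
|
|     /* The router ignores whatever source address the sender put
|        in the header and stamps the hard-wired address of the
|        port the packet physically arrived on. */
|     static noc_header router_ingress(noc_header hdr,
|                                      uint16_t port_hardwired_addr)
|     {
|         hdr.src_addr = port_hardwired_addr;
|         return hdr;
|     }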
| bruiseralmighty wrote:
| I guess I'm still not understanding something about this
| spoof protection.
|
| What prevents me, an attacker, from making (or
| simulating) a bunch of hardware with different
| identities? Perhaps by setting up my hardware to run
| through multiple different 'self' identities and then
| trying to talk to other hardware with each disguise.
|
| If the address space is small and cheap, it seems that I
| could quickly map any decentralized hardware at low cost.
| azonenberg wrote:
| This is a SoC; addresses of on-chip devices are fixed when
| the silicon is made.
|
| The source address is added to a packet by the on-chip
| router as it's received. You can send anything you want
| into a port, but you can't make it seem like it came from
| a different port without modifying the router.
|
| This model allows you to integrate IP cores from semi-
| untrusted third parties, or run untrusted code on a CPU,
| without allowing impersonation.
|
| Let me put it another way: I ran a SAT solver on the
| compiled gate-level netlist of the router and proved that
| it can't ever emit a packet with a source address other
| than the hard-wired address of the source port. Any
| packet coming from a peripheral into port X will have
| port X's address on it when it leaves the router, end of
| story.
| vlovich123 wrote:
| > To create a process, the user sends a message to the CPU core
| he wishes to run it on
|
| What happens if the user doesn't care which CPU the process runs
| on?
|
| This is certainly an interesting execution model. I think it
| should be adopted for IoT & given the simplicity of devices
| there, I think it's a perfect fit. I do worry that this may
| increase the BOM & make this model only cost-effective for
| small runs. Code is generally a "fixed" upfront cost amortized
| over the number of units you sell (the more units you reuse the
| software across, the cheaper that software is to develop per
| unit), whereas more complex hardware is a constant % increase
| for each unit (ignoring the costs of building that HW).
|
| I think traditional CPUs/operating systems may pose additional
| obstacles to adoption & I'm interested to hear the author's
| thoughts on this. Ignoring the ecosystem aspect (which is big,
| but let's assume this idea is revolutionary enough that everyone
| pivots to making this happen), how would you apply security &
| resource-usage policies? Let's say I want to access the file at
| "/bin/ps". Traditionally we have SW that controls your access to
| that file. Additionally, we can choose to sandbox your process's
| access to that file, restrict how much of the available HW
| utilization can be dedicated to servicing it, etc. If we
| implement it in HW, is the flexibility of these policies to
| evolve over time fixed at manufacturing time? I wonder if that
| proves to be a serious roadblock. In theory you could have
| something sitting in the middle to apply the higher-level
| policy, but I think that's effectively going to reintroduce a
| microkernel for that purpose. You could say that this could be
| implemented, in this example, at the filesystem layer (i.e. the
| thing that reads, for example, EXT4 & is the only process
| talking to that block device), but resource usage would prove
| tricky (e.g. if a block device had an EXT4 & an EXT3 partition,
| how would you restrict a process to only occupy 5% of the I/O
| time on the block device?).
| azonenberg wrote:
| I was targeting industrial control systems, medical implants,
| and other critical applications. So a lot of the constraints of
| desktop/mobile/server computing just don't apply.
|
| The assumption was that you'd end up with something close to a
| microkernel architecture, but without any all-powerful software
| up top. A purely peer to peer architecture running on bare
| metal.
| cogman10 wrote:
| Interesting and wildly impractical :D
|
| For this to fly, it would take a ground-up rewrite of the
| fundamental principles of hardware.
|
| For example, how on earth would paging work in such a world? Who
| or what would be in charge of swapping memory?
|
| How would a system admin go about killing an errant process?
|
| How would a developer code on such a machine, which allows no
| inspection of remote application memory?
|
| Beyond that, it seems like instead of decreasing the amount of
| trust needed to function, it increases it with every hardware
| manufacturer. All while making security holes harder to fix
| (since they are now part of the hardware itself).
|
| Imagine, for example, your RAM having a bug that allows you to
| access memory regions beyond what you are privileged to access.
|
| It's a nice idea, but I think Google's Zircon microkernel
| approach is far more practical while being nearly as secure and
| not requiring special-purpose hardware.
| azonenberg wrote:
| That was the point of the research: throw away how everything
| has been done and explore what a clean slate redesign would
| look like if we had the benefit of 30-40 years of hindsight.
|
| The easiest way to implement transparent paging would be to
| have a paging block that exposed the memory manager API
| (allocate, free, read, write) and would proxy requests to
| either RAM or disk as appropriate. But since I was targeting
| embedded devices, it was assumed that you would explicitly
| allocate buffers in on-die RAM, off-die RAM, or flash as
| required. The architecture was very much NUMA in nature.
|
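| A sketch of that proxy in C (illustrative names; an actual
| paging block would be RTL speaking the same NoC protocol as
| the RAM controller):
|
|     #include <stdbool.h>
|     #include <stdint.h>
|
|     #define NUM_PAGES 4096
|
|     typedef enum { MEM_ALLOCATE, MEM_FREE,
|                    MEM_READ, MEM_WRITE } mem_op;
|
|     typedef struct {
|         mem_op   op;
|         uint16_t src_addr;   /* stamped by the router */
|         uint32_t page;
|     } mem_request;
|
|     static bool resident[NUM_PAGES]; /* page currently in RAM? */
|
|     static void forward_to_ram(const mem_request *req)
|     {
|         (void)req; /* model stub: send packet to RAM controller */
|     }
|
|     static void swap_in(uint32_t page)
|     {
|         resident[page] = true; /* model stub: fetch from disk */
|     }
|
|     /* Exposes the same allocate/free/read/write API as the RAM
|        controller, but transparently backs cold pages with
|        disk. */
|     static void paging_proxy(const mem_request *req)
|     {
|         if ((req->op == MEM_READ || req->op == MEM_WRITE) &&
|             !resident[req->page])
|             swap_in(req->page);
|         forward_to_ram(req);
|     }
|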
| The prototype made for my thesis only allowed the "terminate"
| request to come from the process requesting its own
| termination. It would be entirely plausible to add access
| controls that allowed a designated supervisor process to
| terminate it as well.
|
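| As a C sketch (hypothetical; only the self-terminate case was
| actually implemented in the prototype):
|
|     #include <stdbool.h>
|     #include <stdint.h>
|
|     /* Access check for a "terminate" message. A designated
|        supervisor is a plausible extension, not something the
|        thesis prototype implemented. */
|     static bool terminate_allowed(uint16_t requester,
|                                   uint16_t target,
|                                   uint16_t supervisor)
|     {
|         return requester == target || requester == supervisor;
|     }
|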
| Remote memory can be inspected, but only with explicit consent.
| I had a full gdbserver that would access memory by proxying
| through the L1 cache of the thread context being debugged.
| (Which, of course, had the ability to say no if it didn't want
| to be debugged).
|
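| Roughly, in the same C-model terms (names are made up; the
| real gdbserver spoke the NoC debug protocol):
|
|     #include <stdbool.h>
|     #include <stdint.h>
|
|     typedef struct {
|         bool     allow_debug; /* target's consent flag */
|         uint32_t mem[1024];   /* stand-in for L1-reachable RAM */
|     } thread_ctx;
|
|     /* Debug reads are proxied through the target's L1 cache,
|        and the target can simply say no. */
|     static bool debug_read(const thread_ctx *target,
|                            uint32_t addr, uint32_t *out)
|     {
|         if (!target->allow_debug)
|             return false;     /* explicit consent required */
|         *out = target->mem[addr % 1024];
|         return true;
|     }
|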
| The goal was to have a system with a very small number of
| trusted hardware blocks, each doing only one thing, then build
| an OS on top of it microkernel style - except that all you have
| is unprivileged servers; there's no longer any software on top.
| cogman10 wrote:
| What was your plan to address things like hardware
| bugs/errata/updates? How could someone initiate a firmware
| flash in this sort of system?
| azonenberg wrote:
| Keep the hardware design as simple as possible, formally
| verify everything critical, then run the rest as userspace
| software and hope you got all the critical bugs.
|
| The point wasn't to move everything into silicon; it was to
| move _just enough_ into silicon that you no longer needed
| any code in ring-0.
|
| As an example, the memory controller's access control list
| and allocator consisted of a FIFO of free pages and an array
| storing an owner for each page. Super simple, very few gates,
| hard to get wrong, and easy to verify.
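|
| Modeled in C (the silicon version is just a FIFO and a small
| RAM; the code below is an illustration, not the RTL):
|
|     #include <stdint.h>
|
|     #define NUM_PAGES 4096
|
|     static uint32_t free_fifo[NUM_PAGES]; /* free page numbers */
|     static uint32_t head, tail, fifo_count;
|     static uint16_t page_owner[NUM_PAGES]; /* 0 = unowned */
|
|     static void allocator_init(void)
|     {
|         for (uint32_t i = 0; i < NUM_PAGES; i++)
|             free_fifo[i] = i;
|         head = tail = 0;
|         fifo_count = NUM_PAGES;
|     }
|
|     /* Pop a free page off the FIFO, record who asked for it. */
|     static int32_t page_alloc(uint16_t requester)
|     {
|         if (fifo_count == 0)
|             return -1;                 /* out of pages */
|         uint32_t page = free_fifo[head];
|         head = (head + 1) % NUM_PAGES;
|         fifo_count--;
|         page_owner[page] = requester;
|         return (int32_t)page;
|     }
|
|     /* Only the owner may free a page. */
|     static int page_free(uint16_t requester, uint32_t page)
|     {
|         if (page >= NUM_PAGES || page_owner[page] != requester)
|             return -1;                 /* denied */
|         page_owner[page] = 0;
|         free_fifo[tail] = page;
|         tail = (tail + 1) % NUM_PAGES;
|         fifo_count++;
|         return 0;
|     }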
| cogman10 wrote:
| What about the more complex hardware such as the CPU?
| There are plenty of opportunities for mistakes there,
| some not so obvious (such as Spectre attacks). And I
| can't imagine you'd get away with completely isolating it
| like the memory.
| azonenberg wrote:
| My long term plan was actually to do a full formal
| verification of the CPU against the ISA spec and prove no
| state leakage between thread contexts, but I didn't have
| time to do that before I graduated
|
| I deliberately went with a very simple CPU (2-way in-order,
| barrel scheduler with no pipeline forwarding, no speculation
| or branch prediction) to minimize opportunities for things
| to go wrong and to keep the design simple enough that full
| end-to-end formal verification would be tractable in the
| future. Spectre/Meltdown are a perfect example of attack
| classes that are entirely eliminated by such a simple
| design.
|
| I was targeting safety critical systems where you're
| willing to give up some CPU performance for extreme
| levels of assurance that the system won't fail on you.
| aidenn0 wrote:
| At what point does it become easier to formally verify
| the software than the hardware? Certainly it's easier to
| change the software than the hardware, so there is
| already incentive to not move things into hardware that
| need not be there.
| akkartik wrote:
| Since software is so easy to change, nobody bothers
| verifying it. The more we put into software, the more
| room there is for cutting corners in testing because, "we
| can always send out a firmware upgrade later." Me, I'd
| rather go back to the old days when updates to hardware
| were expensive so you better get it right.
| peheje wrote:
| It's like writing with a ballpoint pen vs a pencil. But you
| _could_ be just as careful with a pencil. I'm just wondering
| if what you are proposing is the equivalent of runes.
| [deleted]
| avmich wrote:
| One of the problems with microkernels is performance: messages
| are sent between parts, and that takes time. Here we have a
| sort of ultimate microkernel. Can we have enough hardware
| support to make performance competitive, while keeping the
| main advantages?
| azonenberg wrote:
| This architecture also borrowed a lot from the exokernel
| philosophy, which would actually have provided significant
| speedups by cutting unnecessary bloat and abstraction.
|
| My conjecture was that this would make up for most of the
| overhead, but I didn't have time during the thesis to optimize
| it sufficiently to do a fair comparison with existing
| architectures.
| avmich wrote:
| Andrew, I think it's a wonderful project, but to meaningfully
| evaluate it it definitely takes way more than an hour to
| figure out what, why and how is done here :) reading articles
| is hard. Thanks for your work!
| fsflover wrote:
| This reminds me of Qubes Air, which I think will be implemented
| earlier: https://www.qubes-os.org/news/2018/01/22/qubes-air/
___________________________________________________________________
(page generated 2021-06-18 23:01 UTC)