[HN Gopher] Unikernels
       ___________________________________________________________________
        
       Unikernels
        
       Author : seeker89
       Score  : 116 points
       Date   : 2022-02-16 09:48 UTC (13 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | seeker89 wrote:
       | It looks like around 2015 there was a lot of hype around the
       | topic, but as I tried to document the state of the art, I noticed
        | there hasn't been that much uptake yet. What's your take?
        
         | gwd wrote:
         | The two main advantages of unikernels are performance (reduced
         | CPU and/or memory requirements) and security (hypervisor rather
         | than container boundary).
         | 
         | It turns out that basically nobody cares about either of those.
         | I know someone who for a while worked at a company that was
         | trying to make money off unikernels. They ended up re-
         | implementing a database in a unikernel, in a way that gave them
         | a 2.5x performance improvement over the original project -- or
          | to put it a different way, switching should allow companies to
          | cut their hosting costs by more than half. Even with such a
          | clear "win", it was still a difficult sell.
         | 
         | Think about how much backend code is written in interpreted
         | languages like Python, or PHP, or Javascript, rather than in
         | compiled languages like Go or Rust. It's just simpler to start
         | with the simple solution and then throw money at the problem as
         | you scale. And while performance may be _one of the reasons_
         | that people are choosing Go or Rust for backends, if it were
          | the _only_ advantage, it's unlikely that would be compelling
         | enough.
        
         | gjvc wrote:
         | I suspect it was because containers sufficed and using and
         | creating the tooling around them consumed the attention of
         | those who might have otherwise looked at unikernels.
        
           | seeker89 wrote:
           | >I suspect it was because containers sufficed and using and
           | creating the tooling around them consumed the attention of
           | those who might have otherwise looked at unikernels.
           | 
           | I think you might be right. Right place at the right time for
           | containers.
        
           | walterbell wrote:
           | Some marketing material on docker vs unikernels,
           | https://nanovms.com/learn/docker-vs-unikernels
        
       | kitd wrote:
       | I think micro VMs solved a lot of the issues that people had with
       | regular containers and that unikernels were going to fix. There
       | is still probably a performance improvement to be had with
       | unikernels, but not enough to throw away all the investment
        | companies have made in containers.
       | 
       | That said, there are options for running unikernels as K8s
       | workloads if you want, eg NanoVMs: https://docs.ops.city/ops/k8s
        
       | fwsgonzo wrote:
        | There seems to be a lot of guessing by the author. Code in a VM
        | runs on the CPU at the same speed as everything else, except
        | when you trap into the hypervisor, and especially on buggy CPUs.
       | Now, there are a lot of buggy CPUs out there right now, so I
       | won't say any more. But just imagine a world where we have a
       | fully working io_uring (all system calls that make sense), and
       | less buggy CPUs.
        
       | goodpoint wrote:
       | Counterpoint: the absence of logging, monitoring, host-based IDS,
       | and all the system engineering tools you have on Linux is a big
       | negative for security.
        
         | wyuenho wrote:
          | You can compile these metric-collection facilities into the
          | unikernel if you need them. The whole point of a unikernel is
          | to allow you to mix and match only the things you need.
        
         | dimitar wrote:
          | You can log and send metrics from your app over the network
          | (which you should probably be doing anyway, instead of writing
          | to disk), and you can monitor the VM itself. An IDS makes no
          | sense when you don't have logins and such. And those tools are
          | rarely installed on "cattle" anyway.
        
           | goodpoint wrote:
           | > You can log and send metrics from your app over a network
           | 
           | That's far from enough. It's conceptually and practically
           | wrong to rely on the application to monitor itself.
           | 
           | > IDS makes no sense when you don't have logins and such
           | 
           | On the contrary, there is plenty that IDS can do for webapps.
        
             | eyberg wrote:
              | I'm not sure anyone would advocate having the application
              | monitor itself. Many companies have entire teams of people
             | that have to deal with keeping machines up and they get
             | paid big bucks to do so.
             | 
             | As for the IDS question/statement - can you explain in more
             | detail? Are you talking about file integrity checks or?
             | Unikernels don't have the concept of users or shells or
             | remote login or many of the things that an IDS would
             | actually be looking at.
             | 
              | If it's something such as an attacker overwriting a shared
              | library, and you want to monitor for that or ensure it
              | can't happen, both of those are feasible in unikernels.
        
       | seeker89 wrote:
       | Also, some interesting perspectives on Linux unikernels:
       | https://www.bu.edu/rhcollab/files/2019/04/unikernel.pdf
        
       | api wrote:
       | I want a unikernel that runs as a process with no special
        | privileges. Huge bonus points if it's portable to many common
       | operating systems.
       | 
        | Since recompilation is necessary anyway for unikernels, syscalls
        | could be replaced by function calls or some other user-mode
        | mechanism instead of being trapped. It would allow entire
        | containers to run as processes. Not that interesting for the
        | cloud, but very interesting for distribution to endpoints or
        | self-hostable apps.
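        | 
        | A rough sketch of the idea in C (the libos_* name is made up,
        | not any real project's API) - the "syscall" is just a function
        | linked into the same address space, so there is no trap and no
        | privilege transition:
        | 
        |     /* Hypothetical sketch of a library OS running as a
        |        plain process: what would normally be the write(2)
        |        syscall is an ordinary function call into a linked-in
        |        "kernel" library. The libos_* name is illustrative. */
        |     #include <stddef.h>
        |     #include <stdio.h>
        | 
        |     static size_t libos_write(int fd, const void *buf,
        |                               size_t len)
        |     {
        |         /* A real library OS would hand this to its own
        |            in-process network or block stack; here we just
        |            forward to stdio so the sketch runs anywhere. */
        |         return fwrite(buf, 1, len,
        |                       fd == 2 ? stderr : stdout);
        |     }
        | 
        |     int main(void)
        |     {
        |         const char msg[] = "hello from the library OS\n";
        |         libos_write(1, msg, sizeof msg - 1);
        |         return 0;
        |     }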
        
       | bluepizza wrote:
       | Because they are a silly idea. Why would you spin up a full
       | kernel to run a single application on top of a hypervisor that is
       | balancing resources?
       | 
       | Operating systems already solve this problem relatively well,
       | without the overhead, via processes and containers.
        
         | jasode wrote:
         | _> Why would you spin up a full kernel to run a single
         | application_
         | 
          | You're misunderstanding the levels of abstraction, probably
          | because the word _"kernel"_ within _"unikernel"_ is throwing
          | you off. The idea is to use a _partial_ kernel (only the
          | minimum services one needs). The so-called "kernel" is a
          | library of code from which you compile the minimum bits into
          | the single-process image.
         | 
         |  _> Operating systems already solve this problem relatively
          | well, without the overhead, via processes_
         | 
          | A full operating system like Linux is _expending extra overhead
          | to schedule/prioritize/monitor_ processes (plural) -- because
          | Linux is designed to be more general-purpose and open-ended
          | than a specialized unikernel. In contrast, a unikernel with
          | only a single process (say, a specialized db engine) doesn't
          | need to expend extra CPU on process scheduling.
         | 
         | All that said, it doesn't seem like unikernels have enough
         | advantages to attract widespread adoption like containers.
        
           | hypertele-Xii wrote:
           | So it's less of an operating system and more of a single app
           | that runs on metal?
        
             | chakkepolja wrote:
             | Another way to describe it is OS-functionality-as-library.
        
             | packetlost wrote:
             | Yes. Think of it like depending on a small kernel directly
             | in your build step. So your application gets compiled with
             | everything (including OS interface) that it needs and
             | nothing more. The result is a bootable image that is only
             | capable of running your app.
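              | 
              | As a rough illustration of that build-step idea (the
              | NEED_* switches are made up, not any real unikernel's
              | build configuration), the "kernel" is just library code
              | you opt into at compile time:
              | 
              |     /* Hypothetical sketch: only the subsystems this
              |        one app needs are compiled into the image. */
              |     #include <stdio.h>
              | 
              |     #define NEED_NET 1   /* this app serves traffic */
              |     #define NEED_FS  0   /* no filesystem needed */
              | 
              |     #if NEED_NET
              |     static void net_init(void) { puts("net up"); }
              |     #endif
              |     #if NEED_FS
              |     static void fs_init(void) { puts("fs up"); }
              |     #endif
              | 
              |     int main(void)   /* the image's single process */
              |     {
              |     #if NEED_NET
              |         net_init();
              |     #endif
              |     #if NEED_FS
              |         fs_init();
              |     #endif
              |         puts("app running; nothing else to schedule");
              |         return 0;
              |     }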
             | 
              | I think the value isn't in the containerization-vs-
              | unikernel comparison. If you're using containerization,
              | you've accepted certain security risks. Where unikernels
              | have a lot of potential, IMO, is in high-security
              | environments where those risks are not acceptable.
        
         | eyberg wrote:
          | This is a very, very common misconception. Just yesterday I
          | was helping someone out with a networking issue they were
          | having on AWS that stemmed from exactly this idea.
         | 
          | Coming from k8s/Firecracker, it is common to think that you
          | need to orchestrate your unikernels with a framework of some
          | kind. In our case (Nanos/OPS), a lot of people think that
          | means spinning up an EC2 Linux instance, SSHing in, and using
          | 'ops run' on top of that, but that is never suggested for prod
          | deploys. Instead we suggest doing an 'image create' followed
          | by an 'instance create'.
         | 
          | What does this mean? Essentially, every time you hit the
          | deploy button a new AMI is made and a brand-new EC2 instance
          | spins up without any Linux inside. So instead of adding layers
          | through containers, we actually subtract them. That means you
          | can still configure the instance to your heart's content, but
          | you don't have to manage it - the cloud does that for you, and
          | this is a huge win for many teams that don't want to deal with
          | all the ops/SRE work that something like k8s brings (or even
          | normal vanilla Linux does).
         | 
          | It is important to realize that containers exact heavy
          | performance penalties when running on top of existing
          | infrastructure (like the cloud), since they duplicate storage
          | and networking layers. They also have severe security issues -
          | the shared kernel being the main one.
        
         | einpoklum wrote:
         | > Why would you spin up a full kernel to run a single
         | application on top of a hypervisor that is balancing resources?
         | 
         | What if your application involves multiple processes and
         | threads?
        
           | eyberg wrote:
            | Nanos supports multiple threads but not multiple processes,
            | so you can have as much performance as the underlying
            | hardware gives you. But if you are using something like an
            | interpreted language, where it is normal to spin up X
            | app-workers behind a reverse proxy, those workers become
            | VMs. (I should point out that those languages are single-
            | thread/single-process to begin with.)
        
             | Brian_K_White wrote:
             | dude we get it
             | 
              | In how many more comments will we read yet another copy of
              | the Nanos sales pitch?
             | 
             | The post is about unikernels, so, obviously every single
             | comment will say something about them.
        
         | igorkraw wrote:
          | What is the functional difference between
          | os=>hypervisor=>unikernel VM and os=>capabilities and pledges
          | or containers? I would _get_ it if we used a unikernel
          | approach running on bare metal for high-security, specialised
          | applications, but this doesn't seem to exist?
        
           | eyberg wrote:
           | The difference is that the vast majority of people are
           | deploying to the cloud so they are already deploying to a
           | hypervisor. Every single cloud is built on top of
           | virtualization. AWS used to use Xen, now they use KVM. Google
           | Cloud is entirely built on KVM. Azure uses Hyper-V. The cloud
           | is just an API for virtualization.
           | 
            | Instead of AWS (hypervisor) => Linux => k8s => containers,
            | unikernels advocate for AWS (hypervisor) => unikernel, and
            | that makes them run much faster in general (we've clocked
            | upwards of 300% req/sec for Go/Rust webservers on AWS, for
            | instance) and a lot safer.
        
       | UltraViolence wrote:
       | Microkernels are a much more viable way of solving security
       | problems in an operating system. Windows and Linux could both be
       | rewritten as microkernels within a couple of months or years.
        
         | muricula wrote:
         | Windows and Darwin (MacOS) were originally designed to be
         | hybrid kernels, but compromised by allowing more and more stuff
         | into kernel space until they were the monolithic kernels we
         | know today. Changing code built up over 20-30 years while
         | maintaining compatibility, security, and performance guarantees
         | is not something which could be accomplished in a couple of
         | months.
        
           | lazyier wrote:
            | Early versions of Windows NT WERE a 100% honest microkernel
           | OS. Microsoft abandoned that approach when they realized they
           | had zero chance of being competitive with Unix with a
           | microkernel architecture.
           | 
           | Darwin was never intended to be anything except what it is,
            | which is a monolithic kernel. The XNU kernel was based on
            | the FreeBSD and Mach kernels. Some versions of the Mach
            | kernel were microkernels, but many were not.
           | 
           | Both NT and XNU incorporate message passing features from
           | microkernels, but they are monolithic in that they are
           | essentially a single large process.
           | 
           | "Hybrid kernel" is more of a marketing thing than an
           | engineering term.
           | 
           | Microkernels are a dead-end and never stopped being a dead
           | end. It's a lovely idea that didn't work out. They had
           | limited commercial success in embedded systems, but only
           | because those embedded systems didn't actually do very much
           | and what they did was largely not performance critical.
        
             | pjmlp wrote:
              | As far as Windows is concerned, it is surely hybrid,
              | especially since the secure kernel was introduced.
              | 
              | And Apple's long-term roadmap to move all kexts to
              | userspace is a means to improve the current state.
        
           | eyberg wrote:
            | The Hurd is actually older than Linux now, at the ripe age
            | of 32, although I think it was Mendel Rosenblum who said
            | "hypervisors/machine monitors were microkernels done right".
           | 
           | https://www.usenix.org/legacy/events/hotos05/final_papers/fu.
           | ..
        
         | fsflover wrote:
         | Another approach is to use virtualization and there will be no
         | need to rewrite anything. See: https://qubes-os.org.
        
       | seeker89 wrote:
        | The lightweight-ness argument is enticing, but I'm wondering
        | whether having to give each of these VMs enough RAM to run the
        | app means you end up with worse packing density, in terms of
        | RAM, than if you used containers?
        
         | eyberg wrote:
          | It depends. A t2.nano has 512 MB of memory. If you are using
          | Go or something like that you could go much lower, but any
          | runtime environment such as Ruby/Python/Node is going to want
          | a minimum of a few hundred megs. If you are using the JVM you
          | most definitely are going to want an instance with much more
          | memory.
         | 
         | At the end of the day your application decides how much memory
         | it wants and the sysadmin/SRE/devops person just ensures it has
         | enough so it doesn't crash.
         | 
          | If you are hosting your own workloads and those workloads only
          | need tens of megs of RAM, then you can pack as many as your
          | hypervisor can handle.
         | 
          | Alfred talks about booting 110,000 VMs on one host before
         | memory exhaustion:
         | 
         | https://ieeexplore.ieee.org/document/6753801
        
       | kristianpaul wrote:
        | There's an Alpine Linux container image that is around 2 MB in
        | size.
        | 
        | And it's a shame AWS doesn't allow volumes smaller than 1 GiB,
        | since I understand you can get really small images with
        | unikernels.
        
         | eyberg wrote:
          | There are benefits to this if you are deploying something like
          | a 5G microservice or some other single-purpose, small binary
          | that takes few resources, but at the end of the day, if you
          | are deploying a JVM application that is gigabytes in size, it
          | doesn't really matter how small the base image is. The same
          | thing applies in unikernel-land.
         | 
          | The 1 GiB limitation on clouds is not such a huge deal,
          | though: we can upload your image, however small it is, at
          | that size and then tell the cloud to provision a disk of the
          | size you need, which acts kind of like a sparse file (but
          | technically isn't one).
        
       | marmarama wrote:
       | https://github.com/google/gvisor gives you essentially the same
       | benefits as a unikernel without having to compromise on
       | compatibility or recompile your apps, and integrates nicely with
       | Kubernetes already. It also doesn't require a hypervisor at all.
        
         | bitcharmer wrote:
         | Since this looks like an intermediary layer between userspace
         | and the host kernel (at least if I'm reading it correctly),
         | does anyone know what its performance impact is?
        
           | marmarama wrote:
           | The gVisor documentation has performance comparisons vs.
           | cgroup-style 'traditional' containers at
           | https://gvisor.dev/docs/architecture_guide/performance/
           | 
           | There is definitely some performance overhead, but in most
           | cases it is less than hypervisor-based approaches.
        
             | eyberg wrote:
              | In gVisor, if you aren't using hardware acceleration (e.g.
              | virtualization) then you are using ptrace, which is
              | incredibly slow.
        
               | marmarama wrote:
               | The benchmarks in the gVisor docs above are using ptrace,
               | and they don't look too shabby.
        
               | eyberg wrote:
                | Using Redis as an example, it's basically half the
                | performance in every benchmark:
               | 
               | https://gvisor.dev/docs/architecture_guide/performance/#s
               | yst...
               | 
                | IO-bound tasks can be up to 10x slower using ptrace. I
               | think using hardware acceleration gives you acceptable
               | performance but ptrace is just a non-starter for prod.
        
         | zamadatix wrote:
         | I always thought of gVisor as being in the opposite direction
         | of the main point of unikernels in that it's another layer
         | between the app and the kernel rather than removing the
         | separation completely.
        
           | marmarama wrote:
           | Unikernels never really removed the layer between the app and
            | the kernel; they just made the hypervisor the kernel and
           | invented a layer to handle IO to/from the virtual devices
           | presented by the hypervisor, inside the same memory space as
           | the app.
           | 
            | If the hypervisor is KVM, which it is if running on modern
           | AWS EC2 instances or GCP, unikernel apps are literally just
           | Linux processes; the underlying Linux host is doing all the
           | heavy lifting. Conceptually, they're essentially the same as
           | a sandboxed ordinary Linux process with an in-process IO
           | stack, but without the ability to monitor or debug them as if
           | they were an ordinary Linux process.
        
             | zamadatix wrote:
              | App --syscall--> guest kernel --hypercall--> host kernel
              | 
              | App --"syscall"--> gVisor --syscall--> host kernel
              | 
              | Unikernel --hypercall--> host kernel
             | 
              | Though ideally something like SR-IOV would come into play
              | and the hypervisor would just be scheduling shared
              | compute. Of course there is theory and there is reality,
              | and the reality is that unikernels never really caught on,
              | for many reasons, while the normal stack just got
              | optimized enough.
        
             | eyberg wrote:
             | I agree with most of this except for the
             | monitoring/debugging part.
             | 
              | GDB works great, as do printf and others:
             | https://docs.ops.city/ops/debugging
             | 
              | Prometheus and other open-source monitoring solutions work
              | out of the box, and we even have a custom APM service that
              | is unikernel-tuned: https://nanovms.com/radar
        
             | pjmlp wrote:
             | If they are running on Hyper-V on Azure, there is no
             | underlying kernel doing anything.
             | 
             | It is a matter of who is offering what, not what unikernels
             | are capable of.
        
               | vlovich123 wrote:
                | Can you explain this? AFAICT Hyper-V is the same as
                | VMware or VirtualBox, where you have a host OS and
                | multiple guest OSes (which makes sense because you still
                | need something to run the OS drivers). It sounds like
                | what you're implying is that it behaves differently, but
                | I'm not sure how. Can you elaborate?
        
               | pjmlp wrote:
                | Windows runs as a guest OS on top of Hyper-V as well; it
                | is a type 1 hypervisor.
                | 
                | Basically, when you activate Hyper-V, you end up with
                | one VM running where the host is only a guest with
                | special privileges, known as the root partition.
        
               | zamadatix wrote:
                | Even though Hyper-V is also a type-1 hypervisor in terms
                | of CPU execution, something still needs to mediate
                | between the virtual devices and the physical hardware,
                | and that's done by the hypervisor's kernel. In Hyper-V's
                | case that is NT, which mediates the vNIC with the
                | virtual switch and the physical NIC "uplink".
               | 
                | Some devices can also support hardware-assisted
                | virtualization, e.g. PCIe devices (NICs/NVMe
                | storage/GPUs) via SR-IOV, but it's been pretty rare to
                | see that in practice with unikernels: they typically
                | have limited physical device driver support, on top of
                | SR-IOV not really being an option everywhere all the
                | time, since it places limitations on the cloud provider
                | that paravirtual devices don't.
        
               | pjmlp wrote:
                | Windows is also a guest OS on Hyper-V, running in a
                | privileged guest known as the root partition.
               | 
               | https://docs.microsoft.com/en-us/virtualization/hyper-v-
               | on-w...
        
             | [deleted]
        
       | amouat wrote:
        | It's worth looking at https://kontain.app/ - it seems to address
        | a lot of the pain points mentioned in the article.
        
         | kalmi10 wrote:
         | Took some clicking around to find a good description of what
         | this actually is:
         | https://kontainapp.github.io/guide/overview/#a-new-approach-...
        
       ___________________________________________________________________
       (page generated 2022-02-16 23:01 UTC)