[HN Gopher] Unikernels
___________________________________________________________________
Unikernels
Author : seeker89
Score : 116 points
Date : 2022-02-16 09:48 UTC (13 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| seeker89 wrote:
| It looks like around 2015 there was a lot of hype around the
| topic, but as I tried to document the state of the art, I noticed
| there hasn't been that much uptake yet. What's your take?
| gwd wrote:
| The two main advantages of unikernels are performance (reduced
| CPU and/or memory requirements) and security (hypervisor rather
| than container boundary).
|
| It turns out that basically nobody cares about either of those.
| I know someone who for a while worked at a company that was
| trying to make money off unikernels. They ended up re-
| implementing a database in a unikernel, in a way that gave them
| a 2.5x performance improvement over the original project -- or
| to put it a different way, switching should allow companies to
| cut their hosting costs in half. Even with such a clear "win",
| it was still a difficult sell.
|
| Think about how much backend code is written in interpreted
| languages like Python, or PHP, or Javascript, rather than in
| compiled languages like Go or Rust. It's just simpler to start
| with the simple solution and then throw money at the problem as
| you scale. And while performance may be _one of the reasons_
| that people are choosing Go or Rust for backends, if it were
| the _only_ advantage, it's unlikely that would be compelling
| enough.
| gjvc wrote:
| I suspect it was because containers sufficed and using and
| creating the tooling around them consumed the attention of
| those who might have otherwise looked at unikernels.
| seeker89 wrote:
| >I suspect it was because containers sufficed and using and
| creating the tooling around them consumed the attention of
| those who might have otherwise looked at unikernels.
|
| I think you might be right. Right place at the right time for
| containers.
| walterbell wrote:
| Some marketing material on docker vs unikernels,
| https://nanovms.com/learn/docker-vs-unikernels
| kitd wrote:
| I think micro VMs solved a lot of the issues that people had with
| regular containers and that unikernels were going to fix. There
| is still probably a performance improvement to be had with
| unikernels, but not enough to throw away all the investment
| companies made into containers.
|
| That said, there are options for running unikernels as K8s
| workloads if you want, eg NanoVMs: https://docs.ops.city/ops/k8s
| fwsgonzo wrote:
| There seems to be a lot of guessing by the author. Code running
| in a VM executes on the CPU at the same speed as anything else,
| except when it traps into the hypervisor, and especially on
| buggy CPUs.
| Now, there are a lot of buggy CPUs out there right now, so I
| won't say any more. But just imagine a world where we have a
| fully working io_uring (all system calls that make sense), and
| less buggy CPUs.
| goodpoint wrote:
| Counterpoint: the absence of logging, monitoring, host-based IDS,
| and all the system engineering tools you have on Linux is a big
| negative for security.
| wyuenho wrote:
| You can compile these metric-collection facilities into the
| unikernel if you need them. The whole point of a unikernel is
| to allow you to mix and match only the things you need.
| dimitar wrote:
| You can log and send metrics from your app over the network
| (which you should probably be doing anyway instead of writing
| to disk), and you can monitor the VM. An IDS makes no sense
| when you don't have logins and such. And those tools are rarely
| installed on "cattle" anyway.
| goodpoint wrote:
| > You can log and send metrics from your app over a network
|
| That's far from enough. It's conceptually and practically
| wrong to rely on the application to monitor itself.
|
| > IDS makes no sense when you don't have logins and such
|
| On the contrary, there is plenty that IDS can do for webapps.
| eyberg wrote:
| I'm not sure anyone would advocate having the application
| monitor itself. Many companies have entire teams of people
| whose job is to keep machines up, and they get paid big
| bucks to do so.
|
| As for the IDS question/statement - can you explain in more
| detail? Are you talking about file integrity checks or?
| Unikernels don't have the concept of users or shells or
| remote login or many of the things that an IDS would
| actually be looking at.
|
| If it's something such as an attacker overwriting a shared
| library, and you want to monitor for that or ensure it
| can't happen, both of those are feasible in unikernels.
| seeker89 wrote:
| Also, some interesting perspectives on Linux unikernels:
| https://www.bu.edu/rhcollab/files/2019/04/unikernel.pdf
| api wrote:
| I want a unikernel that runs as a process with no special
| privileges. Huge bonus points if it's portable to many common
| operating systems.
|
| Since recompilation is necessary anyway for unikernels, syscalls
| could be replaced by function calls or some other user-mode
| mechanism instead of being trapped. It would allow entire
| containers to run as
| processes. Not that interesting for cloud, but very interesting
| for distribution to endpoints or self-hostable apps.
| bluepizza wrote:
| Because they are a silly idea. Why would you spin up a full
| kernel to run a single application on top of a hypervisor that is
| balancing resources?
|
| Operating systems already solve this problem relatively well,
| without the overhead, via processes and containers.
| jasode wrote:
| _> Why would you spin up a full kernel to run a single
| application_
|
| You're misunderstanding the levels of abstraction probably
| because the word _"kernel"_ within _"unikernel"_ is throwing
| you off. The idea is to use a _partial_ kernel (only the
| minimum services one needs). The so-called "kernel" is a
| library of code where you compile the minimum bits into the
| single-process image.
|
| _> Operating systems already solve this problem relatively
| well, without the overhead, via processes_
|
| A full operating system like Linux is _expending extra overhead
| to schedule/prioritize/monitor_ processes (plural) -- because
| Linux is designed to be more general-purpose and open-ended
| than a specialized unikernel. In contrast, a unikernel with
| only 1 singular process (say a specialized db engine) doesn't
| need to expend extra CPU on process scheduling.
|
| All that said, it doesn't seem like unikernels have enough
| advantages to attract widespread adoption like containers.
| hypertele-Xii wrote:
| So it's less of an operating system and more of a single app
| that runs on metal?
| chakkepolja wrote:
| Another way to describe it is OS-functionality-as-library.
| packetlost wrote:
| Yes. Think of it like depending on a small kernel directly
| in your build step. So your application gets compiled with
| everything (including OS interface) that it needs and
| nothing more. The result is a bootable image that is only
| capable of running your app.
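|
| As a very rough sketch of that build step, using the OPS tool
| mentioned elsewhere in this thread ('hello' is just a made-up
| binary name; by default this boots the image locally under
| QEMU):
|
|     go build -o hello .   # an ordinary compiled binary
|     ops run hello         # wraps it with a tiny kernel into a
|                           # bootable image and boots that image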
|
| I think the value isn't in the containerization vs
| unikernel comparison. If you're using containerization
| you've accepted certain security risks. Where unikernels
| have a lot of potential IMO is in high security
| environments where the security risks of containerization
| are not acceptable.
| eyberg wrote:
| This is a very, very common misconception. Just yesterday I was
| helping someone out with a networking issue they were having on
| AWS stemming from this concept.
|
| Coming from k8s/firecracker it is common to think that you need
| to orchestrate your unikernels with a framework of some kind.
| In our case (Nanos/OPS) a lot of people think that means
| spinning up an EC2 Linux instance, SSHing in, and using 'ops
| run' on top of that, but that is never suggested for prod
| deploys. Instead
| we suggest doing an 'image create' followed by an 'instance
| create'.
|
| What does this mean? Essentially every time you hit the deploy
| button a new AMI is made and a brand-new EC2 instance spins up
| without any Linux inside. So instead of adding layers through
| containers we actually subtract them. That means you can still
| configure the instance to your heart's content but you don't
| have to manage it - the cloud does that for you, and this is a
| huge win for many teams that don't want to deal with all the
| ops/SRE work that something like k8s brings (or even normal
| vanilla Linux does).
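|
| As a rough sketch of that flow (only the two subcommands
| described above; 'myapp' is a placeholder, and any flags for
| cloud target, config, image name, etc. are left out and depend
| on your setup):
|
|     ops image create myapp     # build a disk image / AMI that
|                                # contains only the app + Nanos
|     ops instance create myapp  # boot a brand-new instance from
|                                # that image, no Linux inside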
|
| It is important to realize that containers exact heavy
| performance penalties when running on top of existing
| infrastructure (like the cloud) since they duplicate storage
| and networking layers. They also have severe security issues -
| the shared kernel being the main one.
| einpoklum wrote:
| > Why would you spin up a full kernel to run a single
| application on top of a hypervisor that is balancing resources?
|
| What if your application involves multiple processes and
| threads?
| eyberg wrote:
| Nanos supports multiple threads but not multiple processes, so
| you can have as much performance as the underlying hardware
| gives you. But if you are using something like an interpreted
| language, where it is normal to spin up X app-workers behind a
| reverse proxy, those workers become VMs. (I should point out
| that those languages are single-threaded/single-process to
| begin with.)
| Brian_K_White wrote:
| dude we get it
|
| How many more comments are we going to read that are yet
| another copy of a Nanos sales pitch?
|
| The post is about unikernels, so, obviously every single
| comment will say something about them.
| igorkraw wrote:
| What is the functional difference of os=>hypervisor=>unikernel
| vm vs. os=>capabilities and pledges or containers? I would
| _get it_ if we used a unikernel approach running on bare metal
| for high-security, specialised applications, but this doesn't
| seem to exist?
| eyberg wrote:
| The difference is that the vast majority of people are
| deploying to the cloud so they are already deploying to a
| hypervisor. Every single cloud is built on top of
| virtualization. AWS used to use Xen, now they use KVM. Google
| Cloud is entirely built on KVM. Azure uses Hyper-V. The cloud
| is just an API for virtualization.
|
| Instead of AWS (hypervisor) => linux => k8s => containers,
| unikernels advocate for AWS (hypervisor) => unikernel, and
| that makes them run much faster in general (we've clocked
| upwards of 300% req/sec for go/rust webservers on AWS for
| instance) and a lot safer.
| UltraViolence wrote:
| Microkernels are a much more viable way of solving security
| problems in an operating system. Windows and Linux could both be
| rewritten as microkernels within a couple of months or years.
| muricula wrote:
| Windows and Darwin (MacOS) were originally designed to be
| hybrid kernels, but compromised by allowing more and more stuff
| into kernel space until they were the monolithic kernels we
| know today. Changing code built up over 20-30 years while
| maintaining compatibility, security, and performance guarantees
| is not something which could be accomplished in a couple of
| months.
| lazyier wrote:
| Early versions of Windows NT were a 100% honest microkernel
| OS. Microsoft abandoned that approach when they realized they
| had zero chance of being competitive with Unix with a
| microkernel architecture.
|
| Darwin was never intended to be anything except what it is,
| which is a monolithic kernel. The XNU kernel was based on the
| FreeBSD and Mach kernels. Some versions of the Mach kernel
| were microkernels, but many were not.
|
| Both NT and XNU incorporate message passing features from
| microkernels, but they are monolithic in that they are
| essentially a single large process.
|
| "Hybrid kernel" is more of a marketing thing than an
| engineering term.
|
| Microkernels are a dead-end and never stopped being a dead
| end. It's a lovely idea that didn't work out. They had
| limited commercial success in embedded systems, but only
| because those embedded systems didn't actually do very much
| and what they did was largely not performance critical.
| pjmlp wrote:
| As far as Windows is concerned, it is surely hybrid, especially
| since the secure kernel was introduced.
|
| And Apple's long term roadmap to move all kexts to
| userspace is a means to improve the current state.
| eyberg wrote:
| The Hurd is actually older than Linux now at a ripe age of
| 32, although I think it was Mendel Rosenblum that said
| "hypervisors/machine monitors were microkernels done right".
|
| https://www.usenix.org/legacy/events/hotos05/final_papers/fu.
| ..
| fsflover wrote:
| Another approach is to use virtualization and there will be no
| need to rewrite anything. See: https://qubes-os.org.
| seeker89 wrote:
| The lightweight-ness argument is enticing, but I'm wondering if
| the fact that now you have to give these VMs enough RAM to run
| the app won't mean that you end up with worse flatpacking in
| terms of RAM than if you used containers?
| eyberg wrote:
| It depends. A t2.nano has 512 MB of memory. If you are using Go
| or something like that you could go much lower, but any runtime
| environment such as Ruby/Python/Node is going to want a minimum
| of a few hundred megs. If you are using the JVM you most
| definitely are going to want an instance with much more memory.
|
| At the end of the day your application decides how much memory
| it wants and the sysadmin/SRE/devops person just ensures it has
| enough so it doesn't crash.
|
| If you are hosting your own workloads and those workloads only
| need tens of megs of RAM, then you can pack in as many as your
| hypervisor can handle.
|
| Alfred talks about booting 110,000 VMs on one host before
| memory exhaustion:
|
| https://ieeexplore.ieee.org/document/6753801
| kristianpaul wrote:
| There's an Alpine Linux container image that is around 2 MB in
| size.
|
| And it's a shame AWS doesn't allow volumes smaller than 1 GiB,
| since I understand you can get really small images with
| unikernels.
| eyberg wrote:
| There are benefits to this if you are deploying something like
| a 5G microservice or some other single-purpose, small binary
| that takes few resources, but at the end of the day, if you are
| deploying a JVM application that is gigabytes in size, it
| doesn't really matter how small the base image is. Same thing
| applies in unikernel-land.
|
| The 1 gig limitation on clouds is not such a huge deal though,
| as we can upload your image, however small it is, at that size
| and then tell the cloud to provision a disk of the size you
| need, which acts kind of like a sparse file (but technically
| isn't one).
| marmarama wrote:
| https://github.com/google/gvisor gives you essentially the same
| benefits as a unikernel without having to compromise on
| compatibility or recompile your apps, and integrates nicely with
| Kubernetes already. It also doesn't require a hypervisor at all.
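|
| For the Kubernetes part, the usual route is a RuntimeClass
| pointing at runsc (gVisor's runtime). A rough sketch, assuming
| runsc is already installed and registered with containerd on
| the nodes; the names 'gvisor' and 'sandboxed-app' are just
| placeholders:
|
|     # gvisor.yaml
|     apiVersion: node.k8s.io/v1
|     kind: RuntimeClass
|     metadata:
|       name: gvisor
|     handler: runsc              # hand these pods to gVisor
|     ---
|     apiVersion: v1
|     kind: Pod
|     metadata:
|       name: sandboxed-app
|     spec:
|       runtimeClassName: gvisor  # opt this pod into the sandbox
|       containers:
|       - name: app
|         image: nginx
|
|     kubectl apply -f gvisor.yaml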
| bitcharmer wrote:
| Since this looks like an intermediary layer between userspace
| and the host kernel (at least if I'm reading it correctly),
| does anyone know what its performance impact is?
| marmarama wrote:
| The gVisor documentation has performance comparisons vs.
| cgroup-style 'traditional' containers at
| https://gvisor.dev/docs/architecture_guide/performance/
|
| There is definitely some performance overhead, but in most
| cases it is less than hypervisor-based approaches.
| eyberg wrote:
| In gVisor if you aren't using hardware acceleration (eg:
| virtualization) then you are using ptrace which is
| incredibly slow.
| marmarama wrote:
| The benchmarks in the gVisor docs above are using ptrace,
| and they don't look too shabby.
| eyberg wrote:
| Using Redis as an example, it's basically half the
| throughput in every benchmark:
|
| https://gvisor.dev/docs/architecture_guide/performance/#s
| yst...
|
| IO bound tasks can be up to 10x slower using ptrace. I
| think using hardware acceleration gives you acceptable
| performance but ptrace is just a non-starter for prod.
| zamadatix wrote:
| I always thought of gVisor as being in the opposite direction
| of the main point of unikernels in that it's another layer
| between the app and the kernel rather than removing the
| separation completely.
| marmarama wrote:
| Unikernels never really removed the layer between the app and
| the kernel; they just made the hypervisor the kernel and
| invented a layer to handle IO to/from the virtual devices
| presented by the hypervisor, inside the same memory space as
| the app.
|
| If the hypervisor is KVM, which it is when running on modern
| AWS EC2 instances or GCP, unikernel apps are literally just
| Linux processes; the underlying Linux host is doing all the
| heavy lifting. Conceptually, they're essentially the same as
| a sandboxed ordinary Linux process with an in-process IO
| stack, but without the ability to monitor or debug them as if
| they were an ordinary Linux process.
| zamadatix wrote:
| App --syscall--> kernel --hypercall--> kernel
|
| App --"syscall"--> gVisor --syscall--> kernel
|
| Unikernel --hypercall--> kernel
|
| Though ideally something like SR-IOV would come into play
| and the hypervisor is just scheduling shared compute. Of
| course there is theory and reality, and the reality is
| unikernels never really caught on, for many reasons, while
| the normal stack just got optimized enough.
| eyberg wrote:
| I agree with most of this except for the
| monitoring/debugging part.
|
| GDB works great, as do printf and others:
| https://docs.ops.city/ops/debugging
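|
| A rough sketch of what that looks like when the image is booted
| locally under QEMU (these are the generic QEMU gdbstub flags,
| nothing unikernel-specific; 'hello' is a made-up binary name):
|
|     qemu-system-x86_64 ... -s -S   # -s: gdbstub on :1234,
|                                    # -S: pause at first instruction
|     gdb hello                      # load the ELF's debug symbols
|     (gdb) target remote :1234      # attach and debug as usual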
|
| Prometheus and other open source monitoring solutions work
| out of the box and we even have a custom APM service that
| is unikernel-tuned https://nanovms.com/radar .
| pjmlp wrote:
| If they are running on Hyper-V on Azure, there is no
| underlying kernel doing anything.
|
| It is a matter of who is offering what, not what unikernels
| are capable of.
| vlovich123 wrote:
| Can you explain this? AFAICT Hyper-V is the same as
| VMware or VirtualBox, where you have a host OS and
| multiple guest OSes (which makes sense because you still
| need something to run the OS drivers). It sounds like
| what you're implying is it behaves differently but I'm
| not sure how. Can you elaborate?
| pjmlp wrote:
| Windows runs as a guest OS on top of Hyper-V as well; it
| is a type 1 hypervisor.
|
| Basically when you activate Hyper-V, you will be getting
| one VM running where the host is only a guest with
| special privileges, known as the root partition.
| zamadatix wrote:
| Even though Hyper-V is also a type-1 hypervisor in terms
| of CPU execution something still needs to mediate the
| virtual devices to the physical hardware and that's done
| by the hypervisor's kernel. In Hyper-V's case that is NT
| which mediates the vNIC with the virtual switch and
| physical NIC "uplink".
|
| Some devices can also support hardware-assisted
| virtualization, e.g. PCIe devices (NICs/NVMe
| storage/GPUs) via SR-IOV, but it's been pretty rare to see
| that in practice with unikernels, as they typically have
| limited physical device driver support, on top of that not
| really being an option everywhere all the time, since it
| places limitations on the cloud provider that paravirtual
| devices don't.
| pjmlp wrote:
| Windows is also a guest OS on Hyper-V, running in a
| privileged guest known as the root partition.
|
| https://docs.microsoft.com/en-us/virtualization/hyper-v-
| on-w...
| [deleted]
| amouat wrote:
| It's worth looking at https://kontain.app/ - it seems to address
| a lot of the pain points mentioned in the article.
| kalmi10 wrote:
| Took some clicking around to find a good description of what
| this actually is:
| https://kontainapp.github.io/guide/overview/#a-new-approach-...
___________________________________________________________________
(page generated 2022-02-16 23:01 UTC)