[HN Gopher] Why Oxide Chose Illumos
       ___________________________________________________________________
        
       Why Oxide Chose Illumos
        
       Author : kblissett
       Score  : 209 points
       Date   : 2024-09-11 21:22 UTC (1 day ago)
        
 (HTM) web link (rfd.shared.oxide.computer)
 (TXT) w3m dump (rfd.shared.oxide.computer)
        
       | Rendello wrote:
       | I like the RFDs. Oxide just did a podcast episode on the process:
       | 
       | https://oxide.computer/podcasts/oxide-and-friends/2065190
        
       | rtpg wrote:
       | > There is not a significant difference in functionality between
       | the illumos and FreeBSD implementations, since pulling patches
       | downstream has not been a significant burden. Conversely, the
       | more advanced OS primitives in illumos have resulted in certain
       | bugs being fixed only there, having been difficult to upstream to
       | FreeBSD.
       | 
       | curious about what bugs are being thought of there. Sounds like a
       | very interesting situation to be in
        
       | taspeotis wrote:
       | I kagi'd Illumos and apparently Bryan Cantrill was a maintainer.
       | 
       | Bryan Cantrill is CTO of Oxide [1].
       | 
       | I assume that has no bearing on the choice, otherwise it would be
       | mentioned in the discussion.
       | 
       | [1] https://bcantrill.dtrace.org/2019/12/02/the-soul-of-a-new-
       | co...
        
         | gyre007 wrote:
          | Yeah I came here to say that Bryan worked at Sun, so why do
          | they even need to write this post (yes, I appreciate the
          | technical reasons, just wanted to highlight the fact via a
          | subtle dig :-))
        
           | sausagefeet wrote:
            | This isn't a blog post from Oxide, it's a link to their
            | internal RFD which they use to make decisions.
        
             | gyre007 wrote:
             | I never said it was a post by Oxide.
        
         | sausagefeet wrote:
         | Early Oxide founders came from Joyent which was an illumos shop
         | and Cantrill is quite vocal about the history of Solaris,
         | OpenSolaris, and illumos.
        
           | codetrotter wrote:
           | > Joyent which was an illumos shop
           | 
           | And before that, they used to run FreeBSD.
           | 
           | Mentioned for example in this comment by Bryan Cantrill a
           | decade ago:
           | 
           | https://news.ycombinator.com/item?id=6254092
           | 
           | > [...] Speaking only for us (I work for Joyent), we have
           | deployed hundreds of thousands of zones into production over
           | the years -- and Joyent was running with FreeBSD jails before
           | that [...]
           | 
           | And I've seen some other primary sources (people who worked
           | at Joyent) write that online too.
           | 
           | And Bryan Cantrill, and several other people, came from Sun
           | Microsystems to Joyent. Though I've never seen it mentioned
            | which order that happened in; was it people from Sun that
            | joined Joyent, and then Joyent switched from FreeBSD to
            | Illumos and created SmartOS? Or had Joyent already switched
            | to Illumos before the people that came from Sun joined?
           | 
           | I would actually really enjoy a long documentary or talk from
           | some people that worked at Joyent about the history of the
           | company, how they were using FreeBSD and when they switched
           | to Illumos and so on.
        
             | panick21_ wrote:
              | Joyent was using Solaris before Bryan worked there. Listen
              | to this podcast with Bryan and his co-founder about their
              | origin story:
             | 
             | https://www.youtube.com/watch?v=eVkIKm9pkPY
             | 
              | This is about as good as you are going to get on the topic
              | of Joyent history.
        
               | codetrotter wrote:
               | Thank you, I will watch that right away :)
        
             | selykg wrote:
             | Joyent also merged with TextDrive, which is where the
             | FreeBSD part came from. TextDrive was an early Rails host,
             | and could even do it in a shared hosting environment, which
             | is where I think a lot of the original user base came from
             | (also TextPattern)
             | 
             | As I recall they were also the original host of Twitter,
             | which if I recall was Rails back in the day.
        
               | throw0101b wrote:
               | > _As I recall [Joyent] were also the original host of
               | Twitter, which if I recall was Rails back in the day._
               | 
               | Up until 2008:
               | 
               | * https://web.archive.org/web/20080201142828/http://www.j
               | oyeur...
        
         | mzi wrote:
         | @bcantrill is the CTO of Oxide.
        
           | taspeotis wrote:
           | Yup, thanks
        
         | panick21_ wrote:
         | Bryan Cantrill also ported KVM to Illumos. At Joyent they had
         | plenty of experience with KVM. See:
         | 
         | https://www.youtube.com/watch?v=cwAfJywzk8o
         | 
         | As far as I know, Bryan didn't personally work on the porting
         | of bhyve (this might be wrong).
         | 
          | So if anything, that would point to KVM as the 'familiar'
          | thing given how many former Joyent people were there.
        
           | bonzini wrote:
           | KVM got more and more integrated with the rest of Linux as
           | more virtualization features became general system features
           | (e.g. posted interrupts). Also Google and Amazon are working
           | more upstream and the pace of development increased a lot.
           | 
           | Keeping a KVM port up to date is a huge effort compared to
           | bhyve, and they probably had learnt that in the years between
           | the porting of KVM and the founding of Oxide.
        
           | elijahwright wrote:
           | Where is Max Bruning these days?
        
       | bonzini wrote:
       | > QEMU is often the subject of bugs affecting its reliability and
       | security.
       | 
       | {{citation needed}}?
       | 
       | When I ran the numbers in 2019, there hadn't been guest
       | exploitable vulnerabilities that affected devices normally used
       | for IaaS for 3 years. Pretty much every cloud outside the big
       | three (AWS, GCE, Azure) runs on QEMU.
       | 
       | Here's a talk I gave about it that includes that analysis:
       | 
       | slides - https://kvm-forum.qemu.org/2019/kvmforum19-bloat.pdf
       | 
       | video - https://youtu.be/5TY7m1AneRY?si=Sj0DFpRav7PAzQ0Y
        
         | _rs wrote:
         | I thought AWS uses KVM, which is the same VM that QEMU would
         | use? Or am I mistaken?
        
           | bonzini wrote:
           | AWS uses KVM in the kernel but they have a different, non-
           | open source userspace stack for EC2; plus Firecracker which
           | is open source but is only used for Lambda, and runs on EC2
           | bare metal instances.
           | 
           | Google also uses KVM with a variety of userspace stacks: a
           | proprietary one (tied to a lot of internal Google
           | infrastructure but overall a lot more similar to QEMU than
           | Amazon's) for GCE, gVisor for AppEngine or whatever it is
           | called these days, crosvm for ChromeOS, and QEMU for Android
           | Emulator.
        
             | tptacek wrote:
             | Lambda and Fargate.
        
               | dastbe wrote:
               | unless something has changed in the past year, fargate
               | still runs each task in a single use ec2 vm with no
               | further isolation around containers in a task.
        
               | my123 wrote:
                | It was true for Fargate some time ago, but it hasn't been
                | for quite a while now. All Fargate tasks run on EC2
                | instances today.
        
               | easton wrote:
               | ...which is probably the reason why task launches take
               | 3-5 business weeks
        
               | tptacek wrote:
               | Ah, interesting. Thanks for the correction!
        
             | 9front wrote:
              | EC2 instances are using the Xen hypervisor. At least that's
              | what's reported by hostnamectl.
        
               | wmf wrote:
               | EC2 migrated off Xen around ten years ago. Only really
               | old instances should be using Xen or Xen emulation.
        
               | 9front wrote:
                | I'm puzzled by your comment. On an EC2 instance of AL2023
                | deployed in the us-east-1 region this is the output of
                | hostnamectl:
                | 
                |     [ec2-user][~]$ hostnamectl
                |       Static hostname: ip-x-x-x-x.ec2.internal
                |             Icon name: computer-vm
                |               Chassis: vm
                |            Machine ID: ec2d54f27fc534ea74980638ccc33d96
                |               Boot ID: 6caf18b7ed3647819c1985c11f128142
                |        Virtualization: xen
                |      Operating System: Amazon Linux 2023.5.20240903
                |           CPE OS Name: cpe:2.3:o:amazon:amazon_linux:2023
                |                Kernel: Linux 6.1.106-116.188.amzn2023.x86_64
                |          Architecture: x86-64
                |       Hardware Vendor: Xen
                |        Hardware Model: HVM domU
                |      Firmware Version: 4.11.amazon
        
               | wmf wrote:
               | What instance type is it?
        
               | bonzini wrote:
               | KVM can emulate the Xen hypercall interface. Amazon is
               | not using Xen anymore.
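                | 
                | For illustration, a minimal Rust sketch (assuming an
                | x86_64 guest) of reading the CPUID hypervisor leaf that
                | tools like systemd-detect-virt consult, among other
                | signals; a guest offered the Xen interface can see a Xen
                | signature here even with KVM underneath:
                | 
                |     use std::arch::x86_64::__cpuid;
                | 
                |     fn main() {
                |         // Leaf 0x4000_0000: EBX/ECX/EDX hold the hypervisor
                |         // signature, e.g. "KVMKVMKVM" or "XenVMMXenVMM".
                |         // Only meaningful when running under a hypervisor.
                |         let leaf = unsafe { __cpuid(0x4000_0000) };
                |         let mut sig = Vec::with_capacity(12);
                |         sig.extend_from_slice(&leaf.ebx.to_le_bytes());
                |         sig.extend_from_slice(&leaf.ecx.to_le_bytes());
                |         sig.extend_from_slice(&leaf.edx.to_le_bytes());
                |         println!("{}", String::from_utf8_lossy(&sig));
                |     }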
        
               | simcop2387 wrote:
                | I'm not quite sure of the current status, but it was
                | reported back in 2017 that they were moving off Xen
               | 
               | https://www.theregister.com/2017/11/07/aws_writes_new_kvm
               | _ba...
               | 
                | It could be that the migration isn't complete and is still
                | tied to specific machine types, or there's something
                | they've done to make it still report to the guest that
                | it's Xen-based for some compatibility reasons.
        
               | blaerk wrote:
                | I think some older instance types are still on Xen; later
                | types run KVM (code-named Nitro, perhaps?). I can't
                | remember the exact type, but last year we ran into some
                | weird issues related to a kernel regression that only
                | affected some instances in our fleet. It turns out they
                | were all the same type and apparently ran on Xen,
                | according to AWS support.
        
           | daneel_w wrote:
            | QEMU can use a number of different hypervisors, KVM and Xen
            | being the two most common ones. It can also _emulate_ any
            | architecture if one would want/need that.
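            | 
            | For illustration, roughly (guest.img and Image are
            | placeholder file names):
            | 
            |     # hardware-assisted, using KVM on a Linux host:
            |     qemu-system-x86_64 -accel kvm -m 2048 -drive file=guest.img,format=raw
            | 
            |     # pure emulation (TCG), e.g. an aarch64 guest on an x86 host:
            |     qemu-system-aarch64 -M virt -cpu cortex-a57 -accel tcg -m 2048 -kernel Image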
        
         | TimTheTinker wrote:
         | > When I ran the numbers in 2019, there hadn't been guest
         | exploitable vulnerabilities that affected devices normally used
         | for IaaS for 3 years.
         | 
         | So there existed known guest-exploitable vulnerabilities as
         | recently as 8 years ago. Maybe that, combined with the fact
         | that QEMU is _not_ written in Rust, is what is causing Oxide to
         | decide against QEMU.
         | 
         | I think it's fair to say that any sufficiently large codebase
         | originally written in C or C++ has memory safety bugs. Yes, the
         | Oxide RFD author may be phrasing this using weasel words; and
         | memory safety bugs may not be exploitable at a given point in a
         | codebase's history. But I don't think that makes Oxide's
         | decision invalid.
        
           | bonzini wrote:
            | That would be a damn good record though, wouldn't it? (I am
           | fairly sure that more were found since, but the point is that
           | these are pretty rare). Firecracker, which is written in
           | Rust, had one in 2019:
           | https://www.cve.org/CVERecord?id=CVE-2019-18960
           | 
            | Also, QEMU's fuzzing is very sophisticated. Most recent
            | vulnerabilities were found that way rather than by security
            | researchers, which I don't think is the case for
            | "competitors".
        
             | TimTheTinker wrote:
             | You're not wrong, and that is very impressive. There's
             | nothing like well-applied fuzzing to improve security.
             | 
             | But I still don't think that makes Oxide's decision or my
             | comment necessarily invalid, if only because of an _a
             | priori_ decision to stick with Rust system-wide -- it
             | raises the floor on software quality.
        
               | akira2501 wrote:
               | > it raises the floor on software quality.
               | 
               | Languages cannot possibly do this.
        
               | TimTheTinker wrote:
               | I believe TypeScript and Rust are both strong examples of
               | languages that do this (for different reasons and in
               | different ways).
               | 
               | It's also possible for a language to raise the ceiling of
               | software quality, and Zig is an excellent example.
               | 
               | I'm thinking of "floors" and "ceilings" as the outer
               | bounds of what happens in real, everyday life within
               | particular software ecosystems in terms of software
               | quality. By "quality" I mean all of capabilities,
               | performance, and absence of problems.
               | 
               | It takes a team of _great_ engineers (and management
               | willing to take a risk) to benefit from a raised ceiling.
               | TigerBeetle[0] is an example of what happens when you
               | pair a great team, great research, and a high-ceiling
               | language.
               | 
               | [0] https://tigerbeetle.com/
        
               | akira2501 wrote:
               | > possible for a language to raise the ceiling of
               | software quality
               | 
                | Cargo is widely recognized as low quality. The thesis
                | fails within its own standard packaging. It's possible
                | for a language to be used by _more people_ and thus raise
                | the quality _in aggregate_ of produced software, but the
                | language itself has no bearing on quality by any
                | objective measure.
               | 
               | > to benefit from a raised ceiling
               | 
               | You're explicitly putting the cart before the horse here.
               | The more reasonable assertion is that it takes good
               | people to get good results regardless of the quality of
               | the tool. Acolytes are uncomfortable saying this because
               | it also destroys the negative case, which is, it would be
               | impossible to write quality software in a previous
               | generation language.
               | 
               | > TigerBeetle[0] is an example
               | 
               | Of a protocol and a particular implementation of that
               | protocol. It has client libraries in multiple languages.
               | This has no bearing on this point.
        
               | sophacles wrote:
               | > Cargo is widely recognized as low quality.
               | 
               | Can you point me to both of:
               | 
               | * why it's considered low quality
               | 
               | * evidence of this "wide regard"
               | 
               | Other than random weirdos who think allowing dependencies
               | is a bad practice because you could hurt yourself, while
               | extolling the virtues of undefined behavior - I've never
               | heard much serious criticism of it.
        
               | akira2501 wrote:
               | > why it's considered low quality
               | 
                | Other software providing the same features produces
                | better results for those users. Its dependency management
                | is fundamentally broken and causes builds to be much
                | slower than they could otherwise be. It lacks namespaces,
                | a lesson well learned before the first line of Cargo was
                | ever written.
               | 
               | I could go on.
               | 
               | > evidence of this "wide regard"
               | 
               | We are on the internet. If you doubt me you can easily
               | falsify this yourself. Or you could discover something
               | you've been ignorant of up until now. Try "rust cargo
               | sucks" as a search motif.
               | 
               | > random weirdos
               | 
               | Which may or may not be true, but you believe it, and yet
               | you use your time to comment to us. This is more of a
               | criticism of yourself than of me; however, I do
               | appreciate your attempt to be insulting and dismissive.
        
               | sophacles wrote:
                | I'm not attempting to insult you; I didn't know you held
                | such a hypocritical position - sorry that pointing out it
                | is weird for someone working in a field so dependent on
                | logic to hold such a self-contradictory position insults
                | you. Maybe instead of "weird" I should use the words
                | "unusual" and "unexpected". My bad.
               | 
               | You're right, I'm being dismissive of weasely unbacked
               | claims of "wide regard". It's very clear now that you
               | can't back your claim and I can safely ignore your entire
               | argument as unfounded. Thanks for confirming!
        
             | otabdeveloper4 wrote:
             | Heresy! Software written in Rust _never_ has security
              | vulnerabilities or bugs. The borrow checker means you
              | don't have to worry about security, Rust handles it for you
              | automatically so you can go shopping.
        
               | ameliaquining wrote:
               | I do think that only having one CVE in six years is a
               | pretty decent record, especially since that vulnerability
               | probably didn't grant arbitrary code execution in
               | practice.
               | 
               | Rust is an important part of how Firecracker pulls this
               | off, but it's not the only part. Another important part
               | is that it's a much smaller codebase than QEMU, so there
               | are fewer places for bugs to hide. (This, in turn, is
               | possible in part because Firecracker deliberately doesn't
               | implement any features that aren't necessary for its core
               | use case of server-side workload isolation, whereas QEMU
               | aims to be usable for anything that you might want to use
               | a VM for.)
        
               | sophacles wrote:
                | Why is it that the only people who say this are people
               | saying it sarcastically or quoting fictional strawmen
               | (and can never seem to provide evidence of it being said
               | in earnest)?
        
         | dvdbloc wrote:
         | What do the big three use?
        
           | paxys wrote:
           | AWS - Nitro (based on KVM)
           | 
           | Google - "KVM-based hypervisor"
           | 
           | Azure - Hyper-V
           | 
            | You can of course assume that all of them heavily customize
            | the underlying implementation for their own needs and for
            | their own hardware. And then they have stuff like
            | Firecracker, gVisor, etc. layered on top depending on the
            | product line.
        
           | daneel_w wrote:
           | Some more data:
           | 
           | Oracle Cloud - QEMU/KVM
           | 
           | Scaleway - QEMU/KVM
        
             | bonzini wrote:
             | IBM cloud, DigitalOcean, Linode, OVH, Hetzner,...
        
         | anonfordays wrote:
         | >Pretty much every cloud outside the big three (AWS, GCE,
         | Azure) runs on QEMU.
         | 
          | QEMU typically uses KVM for the hypervisor, so the
          | vulnerabilities will be in KVM anyway. The big three all use
          | KVM now. Oxide decided to go with bhyve instead of KVM.
        
           | bonzini wrote:
           | No, QEMU is a huge C program which can have its own
           | vulnerabilities.
           | 
           | Usually QEMU runs heavily confined, but remote code execution
           | in QEMU (remote = "from the guest") can be a first step
           | towards exploiting a more serious local escalation via a
           | kernel vulnerability. This second vulnerability can be in KVM
           | or in any other part of the kernel.
        
           | 6c696e7578 wrote:
            | Azure uses Hyper-V; unless things have changed massively, the
            | Linux they run for infra and customers is on Hyper-V.
        
           | cmeacham98 wrote:
           | > The big three all use KVM now.
           | 
           | This isn't true - Azure uses Hyper-V
           | (https://learn.microsoft.com/en-
           | us/azure/security/fundamental...), and AWS uses an in-house
           | hypervisor called Nitro (https://aws.amazon.com/ec2/nitro/).
        
             | anonfordays wrote:
             | >This isn't true - Azure uses Hyper-V
             | 
             | I thought Azure was moving/moved to KVM for Linux, but I
             | was wrong.
             | 
             | >AWS uses an in-house hypervisor called Nitro
             | 
             | Nitro uses KVM under the hood.
        
         | hinkley wrote:
         | If they are being precise, then "reliability and security"
         | means something different than "security and reliability".
         | 
         | How many reliability bugs has QEMU experienced in this time?
         | 
          | The manpower to go on site and deal with in-the-field problems
          | could be crippling. You often pick the boring problems for this
          | reason. High touch is super expensive. Just look at Ferrari.
        
       | sausagefeet wrote:
       | While it's fair to say this does describe why Illumos was chosen,
       | the actual RFD title is not presented and it is about Host OS +
       | Virtualization software choice.
       | 
        | Even if you think it's a foregone conclusion given the history of
        | bcantrill and other founders of Oxide, there absolutely is value
        | in putting the decision to paper and trying to provide a
        | rationale, because then it can be challenged.
       | 
        | The company I co-founded does an RFD process as well, and even if
        | there is a 99% chance that we're going to use the thing we've
        | always used, if you're a serious person, the act of expressing it
       | is useful and sometimes you even change your own mind thanks to
       | the process.
        
       | transpute wrote:
        | _> Xen: Large and complicated (by dom0) codebase, discarded for
        | KVM by AMZN_
        | 
        |     1. Xen Type-1 hypervisor is smaller than KVM/QEMU.
        |     2. Xen "dom0" = Linux/FreeBSD/OpenSolaris. KVM/bhyve also
        |        need host OS.
        |     3. AMZN KVM-subset: x86 cpu/mem virt, blk/net via Arm Nitro
        |        hardware.
        |     4. bhyve is Type-2.
        |     5. Xen has Type-2 (uXen).
        |     6. Xen dom0/host can be disaggregated (Hyperlaunch), unlike
        |        KVM.
        |     7. pKVM (Arm/Android) is smaller than KVM/Xen.
       | 
       | _> The Service Management Facility (SMF) is responsible for the
       | supervision of services under illumos.. a [Linux] robust
       | infrastructure product would likely end up using few if any of
       | the components provided by the systemd project, despite there now
       | being something like a hundred of them. Instead, more traditional
       | components would need to be revived, or thoroughly bespoke
       | software would need to be developed, in order to avoid the
       | technological and political issues with this increasingly
       | dominant force in the Linux ecosystem._
       | 
       | Is this an argument for Illumos over Linux, or for translating
       | SMF to Linux?
        
         | bonzini wrote:
         | Talking about "technological and political issues" without
         | mentioning any, or without mentioning _which_ components would
         | need to be revived, sounds a lot like FUD unfortunately. Mixing
         | and matching traditional and systemd components is super
         | common, for example Fedora and RHEL use chrony instead of
         | timesyncd, and NetworkManager instead of networkd.
        
           | actionfromafar wrote:
           | I read it as "we can sit in this more quiet room where people
           | don't rave about systemd all day long".
        
             | bonzini wrote:
             | But do they? Oxide targets the enterprise, and people there
             | don't care that much about how the underlying OS works.
             | It's been ten years since a RHEL release started using
             | systemd and there has been no exodus to either Windows or
             | Illumos.
             | 
             | I don't mean FUD in a disparaging sense, more like literal
             | fear of the unknown causing people to be excessively
             | cautious. I wouldn't have any problem with Oxide saying "we
             | went for what we know best", there's no need to fake that
             | so much more research went into a decision.
        
               | panick21_ wrote:
                | The underlying hypervisor on Oxide isn't exposed to the
                | consumers of the API. Just like on Amazon.
                | 
                | I think arguably bhyve over KVM was the more fundamental
                | reason, and bhyve doesn't run on Linux anyway.
        
               | bonzini wrote:
               | Exactly, then why would they be dragged into systemd-or-
               | not-systemd discussion? If you want to use Linux, use
               | either Debian or the CentOS hyperscaler spin (the one
               | that Meta uses) and call it a day.
               | 
               | I am obviously biased as I am a KVM (and QEMU) developer
               | myself, but I don't see any other plausible reason other
               | than "we know the Illumos userspace best". Founder mode
               | and all that.
               | 
               | As to their choice of hypervisor, to be honest KVM on
               | Illumos was probably not a great idea to begin with,
               | therefore they used bhyve.
        
               | jclulow wrote:
               | FWIW, founder mode didn't exist five years ago when we
               | were getting started! More seriously, though, this
               | document (which I helped write) is an attempt
               | specifically to avoid classic FUD tropes. It's not
               | perfect, but it reflects certainly aspects of my lived
               | experience in trying to get pieces of the Linux ecosystem
               | to work in production settings.
               | 
               | While it's true that I'm a dyed in the wool illumos
               | person, being in the core team and so on, I have Linux
               | desktops, and the occasional Linux system in lab
               | environments. I have been supporting customers with all
               | sorts of environments that I don't get to choose for most
               | of my career, including Linux and Windows systems. At
               | Joyent most of our customers were running hardware
               | virtualised Linux and Windows guests, so it's not like I
               | haven't had a fair amount of exposure. I've even spent
               | several days getting SCO OpenServer to run under our KVM,
               | for a customer, because I apparently make bad life
               | choices!
               | 
               | As for not discussing the social and political stuff in
               | any depth, I felt at the time (and still do today) that
                | so much ink had been spilled by all manner of folks talking
               | about LKML or systemd project behaviour over the last
               | decade that it was probably a distraction to do anything
               | other than mention it in passing. As I believe I said in
               | the podcast we did about this RFD recently: I'm not sure
               | if this decision would be right for anybody else or not,
               | but I believe it was and is right for us. I'm not trying
               | to sell you, or anybody else, on making the same calls.
               | This is just how we made our decision.
        
               | bonzini wrote:
               | Founder mode existed, it just didn't have a catchy name.
               | And I absolutely believe that it was the right choice for
               | your team, exactly for "founder mode" reasons.
               | 
               | In other words, I don't think that the social or
               | technological reasons in the document were that strong,
                | _and that's fine_. Rather, my external armchair
               | impression is simply that OS and hypervisor were not
               | something where you were willing to spend precious "risk
               | points", and that's the right thing to do given that you
               | had a lot more places that were an absolute jump in the
               | dark.
        
               | InvaderFizz wrote:
               | I would agree with that. Given the history of the Oxide
                | team, they chose what they viewed as the best technology
               | for THEM, as maintainers. The rest is mostly
               | justification of that.
               | 
               | That's just fine, as long as they're not choosing a
               | clearly inferior long term option. The technically
               | superior solution is not always the right solution for
               | your organization given the priorities and capabilities
               | of your team, and that's just fine! (I have no opinion on
               | KVM vs bhyve, I don't know either deep enough to form
               | one. I'm talking in general.)
        
             | wmf wrote:
             | Instead people rave about Solaris.
        
           | packetlost wrote:
           | The Oxide folks are rather vocal about their distaste for the
           | Linux Foundation. FWIW I think they went with the right
           | choice for them considering they'd rather sign up for
            | maintaining the entire thing themselves than saddle
            | themselves with the baggage of a Linux fork or upstreaming.
        
           | netbsdusers wrote:
           | > Talking about "technological and political issues" without
           | mentioning any
           | 
           | I don't know why you think none were mentioned - to name one,
           | they link a GitHub issue created against the systemd
           | repository by a Googler complaining that systemd is
           | inappropriately using Google's NTP servers, which at the time
           | were not a public service, and kindly asking for systemd to
           | stop using them.
           | 
           | This request was refused and the issue was closed and locked.
           | 
           | Behaviour like this from the systemd maintainers can only
           | appear bizarre, childish, and unreasonable to any
           | unprejudiced observer, putting their character and integrity
           | into question and casting doubt on whether they should be
           | trusted with the maintenance of software so integral to at
           | least a reasonably large minority of modern Linux systems.
        
             | inftech wrote:
             | And people forget that this behavior of systemd devs is
             | present in lots of other core projects of the Linux
             | ecosystem.
             | 
              | Unfortunately, this makes modern Linux unreliable.
        
             | suprjami wrote:
              | systemd made the time servers a compile-time option and
              | added a warning if distros are using the default time
              | servers:
             | 
             | https://github.com/systemd/systemd/pull/554
             | 
             | What's your suggested alternative?
             | 
             | Using pool.ntp.org requires a vendor zone. systemd does not
             | consider itself a vendor, it's the distros shipping systemd
             | which are the vendor and should register and use their own
             | vendor zone.
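              | 
              | For example, a distro or admin can point timesyncd at its
              | own servers via /etc/systemd/timesyncd.conf (the vendor
              | zone name below is just a placeholder):
              | 
              |     [Time]
              |     NTP=0.examplevendor.pool.ntp.org 1.examplevendor.pool.ntp.org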
             | 
              | I don't care about systemd either way, but your own
              | misrepresentation of the facts makes your last paragraph
              | apply to your "argument".
        
               | bigfatkitten wrote:
               | > What's your suggested alternative?
               | 
               | That if they do not wish to ship a safe default, they do
               | not ship a default at all.
        
               | suprjami wrote:
               | That would have been my preference too.
        
         | evandrofisico wrote:
          | I've been using Xen in production for at least 18 years, and
          | although there has been some development, it is extremely hard
          | to get actual documentation on how to do things with it.
          | 
          | There is no place documenting how to integrate
          | Dom0less/Hyperlaunch into a distribution or how to build
          | infrastructure with it. At best you will find a GitHub repo,
          | with the last commit dated 4 years ago, with little to no
          | information on what to do with the code.
        
         | dijit wrote:
          | Honestly, SMF is superior to systemd, and it's ironic that it
          | came earlier (and its age shows, given that it uses XML as its
          | configuration language... ick).
         | 
         | However, two things are an issue:
         | 
         | 1) The CDDL license of SMF makes it difficult to use, or at
         | least that's what I was told when I asked someone why SMF
         | wasn't ported to Linux in 2009.
         | 
          | 2) systemd is _it_ now. It's too complicated to replace and
          | software has become hopelessly dependent on its existence,
          | which is what I said was my largest worry about a monoculture,
          | and I was routinely dismissed.
         | 
          | So, to answer your question: the argument must be Illumos over
          | Linux.
        
           | transpute wrote:
           | _> software has become hopelessly dependent on its existence_
           | 
           | With some effort, Devuan has managed to support multiple init
           | systems, at least for the software packaged by Devuan/Debian.
           | 
           |  _> SMF is superior to SystemD ... [CDDL]_
           | 
           | OSS workalike opportunity, as new Devuan init system?
           | 
           |  _> The argument must be: IllumOS over Linux._
           | 
           | Thanks :)
        
             | jclulow wrote:
             | SMF is OSS. The CDDL is an OSI approved licence. I'm not
             | aware of any reason one couldn't readily ship user mode
             | CDDL software in a Linux distribution; you don't even have
             | the usual (often specious) arguments about linking and
             | derivative works and so on in that case.
        
       | saagarjha wrote:
       | Unrelated, but is this a homegrown publishing platform?
        
         | crb wrote:
         | Yes; it's referred to in Oxide's RFD about RFDs [1]
         | https://rfd.shared.oxide.computer/rfd/0001 but the referenced
         | URL is 404 unless you're an Oxide employee.
         | 
         | [1]
         | https://rfd.shared.oxide.computer/rfd/0001#_shared_rfd_rende...
         | [2] https://github.com/oxidecomputer/rfd/blob/master/src
        
           | dcre wrote:
           | That link is out of date. The site and backend are now open
           | source. Only the repo containing the RFD contents is private.
           | 
           | https://github.com/oxidecomputer/rfd-site
           | 
           | https://github.com/oxidecomputer/rfd-api
        
         | benjaminleonard wrote:
         | Yep, you can see a little more on the
         | [blog](https://oxide.computer/blog/a-tool-for-discussion) or
         | the most recent
         | [podcast](https://oxide.computer/podcasts/oxide-and-
         | friends/2065190). The API and site repos are also public.
        
       | ReleaseCandidat wrote:
        | Instead of stating more or less irrelevant reasons, I'd prefer to
        | read something like "I am (or have been) one of the core
        | maintainers and know Illumos and Bhyve, so even if there were
        | 'objectively' better choices, our familiarity with the OS and
        | hypervisor trumps that". An "I like $A, always use $A and have
        | experience using $A" is almost always a better argument than "$A
        | is better than $B because $BLA", because the latter doesn't tell
        | me anything about the depth of knowledge of using $A and $B, or
        | about the subject of the decision - there is a reason half of
        | Google's results are some kind of "comparison" spam.
        
         | actionfromafar wrote:
          | But everyone at Oxide already knows that backstory. At least
          | if you list some other reasons, you can have a discussion
          | about the technical merits if you want to.
        
           | ReleaseCandidat wrote:
            | But that doesn't make sense if you have specialists for $A
            | who also like to work with $A. Why should I as a customer
            | trust Illumos/Bhyve developers that are using Linux/KVM
            | instead of "real" Linux/KVM developers? The only thing that
            | such a decision would tell me is to not even think about
            | using Illumos or Bhyve.
            | 
            | The difference between
            | 
            |     "Buy our Illumos/Bhyve solution! Why? I have been an
            |     Illumos/Bhyve Maintainer!"
            | 
            | and
            | 
            |     "Buy our Linux/KVM solution! Why? I have been an
            |     Illumos/Bhyve Maintainer!"
            | 
            | should make my point a bit clearer.
        
             | panick21_ wrote:
              | Those are not the only options. You can have KVM on
              | Illumos, or bhyve on FreeBSD.
              | 
              | And finding people to hire that know Linux/KVM wouldn't be
              | a problem for them.
              | 
              | This evaluation was done years ago and they have added like
              | 50 people since then.
              | 
              | Saying 'We have a great KVM Team but our CTO was once an
              | Illumos developer' is perfectly reasonable.
              | 
              | And as I point out in my other comment, the former Joyent
              | people likely know more about KVM than anything else
              | anyway. So it would be:
              | 
              | "Buy our KVM Solution, we have KVM experts"
              | 
              | But they evaluated that bhyve was better than KVM despite
              | that.
        
               | ReleaseCandidat wrote:
               | > "Buy our KVM Solution, we have KVM experts"
               | 
                | Of course, but that is less of a unique selling point.
               | 
                | > But they evaluated that bhyve was better than KVM
                | despite that.
               | 
               | If you are selling Bhyve you better say that whether it's
               | true or not. So why should I, as a reader or employee or
               | customer, trust them?
        
         | panick21_ wrote:
          | But Bryan also ported KVM to Illumos. And Joyent used KVM and
          | they supported KVM there for years; I assume Bryan knows more
          | about KVM than bhyve, as he seemed very hands-on in the
          | implementation (there is a nice talk on YouTube). So the idea
          | that he isn't familiar with KVM doesn't hold. On that basis,
          | between KVM and bhyve on Illumos, KVM would suggest itself.
          | 
          | In the long term, if $A is actually better than $B, then it
          | makes sense to start with $A even if you don't know $A. Because
          | if you are trying to build a company that is hopefully making
          | billions in revenue in the future, then the long term matters a
          | great deal.
          | 
          | Now the question is whether you can objectively figure out if
          | $A or $B is better, and how much time it takes to figure that
          | out. Familiarity of the team is one consideration, but not the
          | most important one.
          | 
          | Trying to be objective about this, instead of just saying 'I
          | know $A', seems like quite a smart thing to do. And writing it
          | down also seems smart.
          | 
          | In a few years you can look back and actually ask: was our
          | analysis correct, and if not, what did we misjudge? Then you
          | can learn from that.
          | 
          | If you just go with familiarity, you are basically saying 'our
          | failure was predetermined so we did nothing wrong', even when
          | you clearly did go wrong.
        
           | jclulow wrote:
           | For what it's worth, we at _Joyent_ were seriously investing
           | in bhyve as our next generation of hypervisor for quite a
           | while. We had been diverging from upstream KVM, and most
           | especially upstream QEMU, for a long time, and bhyve was a
           | better fit for us for a variety of reasons. We adopted a port
           | that had begun at Pluribus, another company that was doing
           | things with OpenSolaris and eventually illumos, and Bryan
            | led us through that period as well.
        
             | ComputerGuru wrote:
             | Are you/will you be upstreaming fixes and/or improvements
             | to Bhyve?
        
               | jclulow wrote:
               | Yes, my personal goal is to ensure that basically
               | everything we do in the Oxide "stlouis" branch of illumos
               | eventually goes upstream to illumos-gate where it filters
               | down to everyone else!
        
               | pmooney wrote:
               | Improvements and fixes to illumos bhyve are almost
               | entirely done in upstream illumos-gate, rather than the
               | Oxide downstream.
               | 
               | Upstreaming those changes into FreeBSD bhyve is a more
               | complicated situation, given that illumos has diverged
               | from upstream over the years due to differing opinions
               | about certain interfaces.
        
           | specialist wrote:
           | > _Trying to be objective about this... And writing it down
           | also seems smart._
           | 
           | Mosdef.
           | 
           | IIRC, these RFDs are part of Oxide's commitment to FOSS and
           | radical openness.
           | 
           | Whatever decision is ultimately made, for better or worse,
           | having that written record allows the future team(s) to pick
           | up the discussion where it previously left off.
           | 
           | Working on a team that didn't have sacred cows, an
            | inscrutable backstory ("hmmm, I dunno why, that's just how it
           | is. if it ain't broke, don't fix it."), and gatekeepers would
           | be _so great_.
        
       | blinkingled wrote:
       | I guess that's one way of keeping Solaris alive :)
        
       | daneel_w wrote:
       | _" * Emerging VMMs (OpenBSD's vmm, etc): Haven't been proven in
       | production"_
       | 
       | It's a small operation, but https://openbsd.amsterdam/ have
       | absolutely proven that OpenBSD's hypervisor is production-capable
        | _in terms of stability_ - but there are indeed other problems
        | that rule against it at scale.
       | 
       | For those who are unfamiliar with OpenBSD: the primary caveat is
       | that its hypervisor can so far only provide guests with a single
       | CPU core.
        
         | jclulow wrote:
         | Yes, to be clear this is not meant to be a criticism of
          | software quality at OpenBSD! Though I don't necessarily always
          | agree with the leadership style, I have big respect for their
          | engineering efforts, and obviously, as another relatively niche
          | UNIX, I feel a certain kinship!
         | also written some years ago, much closer to 2018 when that
         | service got started than now, so it's conceivable that we
         | wouldn't have said the same thing today.
         | 
         | I will say, though, that single VCPU guests would not have met
         | our immediate needs in the Oxide product!
        
           | notaplumber1 wrote:
           | > I will say, though, that single VCPU guests would not have
           | met our immediate needs in the Oxide product!
           | 
           | Could Oxide not have helped push multi-vcpu guests out the
           | door by sponsoring one of the main developers working on it,
           | or contributing to development? From a secure design
           | perspective, OpenBSD's vmd is a lot more appealing than bhyve
           | is today.
           | 
           | I saw recently that AMD SEV (Secure Encrypted Virtualization)
           | was added, which seems compelling for Oxide's AMD based
           | platform. Has Oxide added support for that to their bhyve
           | fork yet?
        
             | pmooney wrote:
             | > Could Oxide not have helped push multi-vcpu guests out
             | the door by sponsoring one of the main developers working
             | on it, or contributing to development?
             | 
             | Being that vmd's values are aligned with OpenBSD's
             | (security above all else), it is probably not a good fit
             | for what Oxide is trying to achieve. Last I looked at vmd
             | (circa 2019), it was doing essentially all device emulation
             | in userspace. While it makes total sense to keep as much
             | logic as possible out of ring-0 (again, emphasis on
             | security), doing so comes with some substantial performance
             | costs. Heavily used devices, such as the APIC, will incur
             | pretty significant overhead if the emulation requires round
             | trips out to userspace on top of the cost of VM exits.
             | 
             | > I saw recently that AMD SEV (Secure Encrypted
             | Virtualization) was added, which seems compelling for
             | Oxide's AMD based platform. Has Oxide added support for
             | that to their bhyve fork yet?
             | 
             | SEV complicates things like the ability to live-migrate
             | guests between systems.
        
       | tonyg wrote:
       | > Nested virtualisation [...] challenging to emulate the
       | underlying interfaces with flawless fidelity [...] dreadful
       | performance
       | 
       | It is so sad that we've ended up with designs where this is the
       | case. There is no _intrinsic_ reason why nested virtualization
       | should be hard to implement or should perform poorly. Path
       | dependence strikes again.
        
         | bonzini wrote:
          | It doesn't perform poorly, in fact. It can be tuned to reach
          | 90% of non-nested virtualization performance, and for workloads
          | where it falls short, that's more than anything else a
          | testimony to how close virtualized performance is to bare
          | metal.
         | 
         | That said, it does add a lot of complexity.
        
       | fefe23 wrote:
        | These sound like reasons you retconned so it sounds like you
       | didn't choose Illumos because your founder used to work at Sun
       | and Joyent before. :-)
       | 
       | Frankly I don't understand why they blogged that at all. It reeks
       | of desperation, like they feel they need to defend their choice.
       | They don't.
       | 
       | It also should not matter to their customers. They get exposed
       | APIs and don't have to care about the implementation details.
        
         | jclulow wrote:
         | It's not a blog post, it's an RFD. We have a strong focus on
         | writing as part of thinking and making decisions, and when we
         | can, we like to publish our decision making documents in the
         | spirit of open source. This is not a defence of our position so
         | much as a record of the process through which we arrived at it.
         | This is true of our other RFDs as well, which you can see on
         | the site there.
         | 
         | > It also should not matter to their customers. They get
         | exposed APIs and don't have to care about the implementation
         | details.
         | 
         | Yes, the whole product is definitely designed that way
         | intentionally. Customers get abstracted control of compute and
         | storage resources through cloud style APIs. From their
         | perspective it's a cloud appliance. It's only from our
         | perspective as the people building it that it's a UNIX system.
        
           | stonogo wrote:
           | So at no point did anyone even suspect that Illumos was under
           | consideration because it's been corporate leadership's pet
           | project for decades? That seems like a wild thing to omit
           | from the "RFD" process. Or were some topics not open to the
           | "RFD" process?
        
             | jclulow wrote:
             | We are trying to build a business here. The goal is to sell
             | racks and racks of computers to people, not build a
             | menagerie of curiosities and fund personal projects.
             | Everything we've written here is real, at least from our
             | point of view. If we didn't think it would work, why would
             | we throw our own business, and equity, and so on, away?
             | 
             | The reason I continue to invest myself, if nothing else, in
             | illumos, is because I genuinely believe it represents a
             | better aggregate trade off for production work than the
             | available alternatives. This document is an attempt to
              | distill _why that is_, not an attempt to cover up a
             | personal preference. I do have a personal preference, and
             | I'm not shy about it -- but that preference is based on
             | tangible experiences over twenty years!
        
       | leoh wrote:
       | I'd love to use Illumos, but a lack of arm64 support is a non-
       | starter
        
         | jclulow wrote:
         | Folks are working on it! I believe it boots on some small
         | systems and under QEMU, but it's still relatively early days.
         | I'm excited for the port to eventually make it into the gate,
         | though!
        
           | geerlingguy wrote:
           | In before someone asks about riscv64 ;)
        
         | ComputerGuru wrote:
         | I don't mean to downplay the importance for you personally but
         | I do want to clarify that while it might be a non-starter for
         | you, all of arm64 is so new that it's hardly a non-starter for
         | anyone considering putting it into (traditional) production.
        
           | timenova wrote:
            | You're right, however I was looking for the same information
            | to maybe try it on an RPi to learn more about Illumos.
        
       | BirAdam wrote:
       | I think a bigger reason for Oxide using Illumos is that many of
       | the people over there are former Sun folks.
        
       | throw0101b wrote:
       | Somewhat related, they discussed why they chose to use ZFS for
       | their storage backend as opposed to (say) Ceph in a podcast
       | episode:
       | 
       | * https://www.youtube.com/watch?v=UvEKSqBBcZw
       | 
       | Certainly they already had experience with ZFS (as it is built
       | into Illumos/Solaris), but as it was told to them by someone they
       | trusted who ran a lot of Ceph: " _Ceph is operated, not shipped
       | [like ZFS]_ ".
       | 
        | There's more care-and-feeding required for it, and they probably
        | don't want that, as they want to treat the product in a more
        | appliance/toaster-like fashion.
        
         | pclmulqdq wrote:
         | Ceph is sadly not very good at what it does. The big clouds
         | have internal versions of object store that are _far_ better
         | (no single point of failure, much better error recovery story,
         | etc.). ZFS solves a different problem, though. ZFS is a full-
         | featured filesystem. Like Ceph it is also vulnerable to single
         | points of failure.
        
           | throw0101b wrote:
           | > _The big clouds have internal versions of object store that
           | are far better (no single point of failure, much better error
           | recovery story, etc.)._
           | 
           | There are different levels of scalability needs. CERN has
           | over a dozen (Ceph) clusters with over 100PB of total data as
           | of 2023:
           | 
           | * https://www.youtube.com/watch?v=bl6H888k51w
           | 
            | Certainly there are some number of folks that need more than
            | that, but I don't think there are many.
           | 
           | > _Like Ceph it is also vulnerable to single points of
           | failure._
           | 
           | The SPOF for ZFS is the host (unless you replicate, e.g.,
            | _zfs send_).
           | 
           | What is SPOF of Ceph? You can have multiple monitors,
           | managers, and MDSes.
        
             | pclmulqdq wrote:
             | Single-monitor is a common way to run Ceph. On top of that,
             | many cluster configurations cause the whole thing to slow
             | to a crawl when a very small minority of nodes go down.
             | Never mind packet loss, bad switches, and other sorts of
             | weird failure mechanisms. Ceph in general is pretty bad at
             | operating in degraded modes. ZFS and systems like Tectonic
             | (FB) and Colossus (Google) do much better when things
             | aren't going perfectly.
             | 
             | Do you know how many administrators CERN has for its Ceph
             | clusters? Google operates Colossus at ~1000x that size with
             | a team of 20-30 SREs (almost all of whom aren't spending
             | their time doing operations).
        
         | anonfordays wrote:
          | Comparing ZFS and Ceph is apples to oranges. ZFS is scoped to a
          | single host, Ceph can span data centers.
        
           | ComputerGuru wrote:
           | It's very possible to run a light/small layer on top of ZFS
           | (either a userspace daemon or via FUSE) to get you most of
           | the way to scaling ZFS-backed object storage within or
           | across data centers, depending on what specific
           | availability metrics you need.
        
             | anonfordays wrote:
             | That's true for any filesystem, not specific to ZFS. ZFS is
             | not a clustered or multi-host filesystem.
        
             | seabrookmx wrote:
             | What does this light/small layer look like?
             | 
             | In my experience you need something like GlusterFS which I
             | wouldn't call "light".
        
           | throw0101b wrote:
           | > _ZFS and Ceph is apples to oranges._
           | 
           | Oxide is shipping an on-prem 'cloud appliance'. From the
           | customer's/user's perspective of calling an API asking for
           | storage, it does not matter what the backend is--apple or
           | orange--as long as "fruit" (i.e., a logical bag of a certain
           | size to hold bits) is the result that they get back.
        
             | anonfordays wrote:
             | Yes, it could be NTFS behind the scenes, but this is still
             | an apples to oranges comparison because the storage service
             | Oxide created is Crucible[0], not ZFS. Crucible is more of
             | an apples to apples comparison with Ceph.
             | 
             | [0] https://github.com/oxidecomputer/crucible
        
         | wmf wrote:
         | You mean they use Crucible instead of Ceph?
        
       | jonstewart wrote:
       | Illumos makes sense as a host OS--it's capable, they know it,
       | they can make sure it works well on their hardware, and
       | virtualization means users don't need that much familiarity with
       | it.
       | 
       | If I were Oxide, though, I'd be _sprinting_ to seamless VMware
       | support. Broadcom has turned into a modern-day Oracle (but
       | dumber??) and many customers will migrate in the next two
       | years. Even if those legacy VMs aren't "hyperscale", there's
       | going to be lots of budget devoted to moving off VMware.
        
         | parasubvert wrote:
         | Oracle is a $53 billion company and never had a mass exodus,
         | just fewer greenfield deployments.
         | 
         | Broadcom also isn't all that dumb; VMware was fat and lazy,
         | and customers were coddled for a very long time. They've made
         | a bet that it's sticky. The competition isn't as weak as they
         | thought, that's true, but it will take 5+ years to catch up,
         | not 2 years, in general. Broadcom was betting on it taking 10
         | years: plenty of time to squeeze out margins. Customers have
         | been trying and failing to eliminate the vTax since OpenStack.
         | Red Hat and Microsoft are the main viable alternatives.
        
       | kayo_20211030 wrote:
       | Is the date on this piece correct?
       | 
       | The section about Rust as a first-class citizen seems to
       | contain references to its potential use in Linux that are a few
       | years out of date, with nothing more current than 2021.
       | 
       | > As of March 2021, work on a prototype for writing Linux drivers
       | in Rust is happening in the linux-next tree.
        
         | kayo_20211030 wrote:
         | nm, I read the postscript. The RFD was from 2021. I wonder
         | how correct it was, and whether the decisions made based on
         | it were good ones or bad ones.
        
       | alberth wrote:
       | Isn't it simply that the Oxide founders are old Sun engineers,
       | and Illumos is the open-source spinoff of their old work?
        
         | sophacles wrote:
         | According to the founders and early engineers on their
         | podcast - no, they tried to fairly evaluate all the OSes and
         | were willing to go with other options.
         | 
         | Practically speaking, it's hard to do it completely
         | objectively, and the in-house expertise probably colored the
         | decision.
        
           | pclmulqdq wrote:
           | Tried to, sure, but when you evaluate other products strictly
           | against the criteria under which you built your own version,
           | you know what the conclusion will be. Never mind that you are
           | carrying your blind spots with you. I would say that there
           | was an attempt to evaluate other products, but not so much an
           | attempt to be objective in that evaluation.
           | 
           | In general, being on your own private tech island is a tough
           | thing to do, but many engineers would rather do that than
           | swallow their pride.
        
       | Aissen wrote:
       | Point 1.1 about QEMU seems even less relevant today, with QEMU
       | having added support for the microvm machine type, which
       | greatly reduces the amount of exposed code. And as bonzini said
       | in the thread, the recent vulnerability track record is not so
       | bad.
        
       | magicalhippo wrote:
       | Been running bhyve on FreeBSD (technically FreeNAS). Found PCIe
       | pass-through of NVMe drives was fairly straightforward once the
       | correct incantations were found, but network speed to the host
       | has been fairly abysmal. On my admittedly aging Threadripper
       | 1920X, I can only get ~2-3 Gbps peak from a Linux guest.
       | 
       | That's with virtio; the virtual Intel "card" is even slower.
       | 
       | They went with Illumos though, so curious if the poor performance
       | is a FreeBSD-specific thing.
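       | 
       | For anyone hunting for the same incantations: roughly, you
       | reserve the device for passthrough with a pptdevs tunable in
       | /boot/loader.conf and then hand it to the guest as a passthru
       | slot on the bhyve command line (the PCI bus/slot/function
       | numbers below are made up):
       | 
       |     # /boot/loader.conf
       |     pptdevs="2/0/0"
       | 
       |     # fragment of the bhyve invocation
       |     -s 5:0,passthru,2/0/0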
        
         | bitfilped wrote:
         | It's been a minute since I messed with bhyve on FreeBSD, but
         | I'm pretty sure you have to switch out the networking stack to
         | something like Netgraph if you intend to use fast networking.
        
           | craftkiller wrote:
           | Hmmm I'm not the OP, but I run my personal site on a
           | kubernetes cluster hosted in bhyve VMs running Debian on a
           | FreeBSD machine using netgraph for the networking. I just
           | tested by launching iperf3 on the FreeBSD host and launching
           | an alpine linux pod in the cluster, and I only got ~4Gbit/s.
           | This is surprising to me since netgraph is supposed to be
           | capable of much faster networking but I guess this is going
           | through multiple additional layers that may have slowed it
           | down (off the top of my head: kubernetes with flannel,
           | iptables in the VM, bhyve, and pf on the FreeBSD host).
        
             | bitfilped wrote:
             | Do you know if you're still using if_bridge? I remembered
             | this article from klara that goes a bit more into the
             | details. https://klarasystems.com/articles/using-netgraph-
             | for-freebsd...
        
         | ComputerGuru wrote:
         | I just spun up a VNET jail (so it should be using essentially
         | the same network stack and isolation level as a bhyve guest
         | would) and tested with iperf3. Without any tweaking or
         | optimization, and without even using jumbo frames, I'm able
         | to get 24+ Gbps (32k window size, TCP, single stream) between
         | host and guest over the bridged and virtualized network
         | interface. My test hardware is older than yours: a Xeon
         | E5-1650 v3, and this is even with nested virtualization since
         | the "host" is actually an ESXi guest running pf!
         | 
         | But I think you might be right about something because,
         | playing with it some more, I'm seeing an asymmetry in network
         | I/O speeds; when I use `iperf3 -R` from the VNET jail, so
         | that the host sends the data to the jail instead of the other
         | way around, I get very inconsistent results, with bursts of
         | 2 Gbps traffic and then entire seconds without any data
         | transferred (regardless of buffer size). I'd need to do a
         | packet capture to figure out what is happening, but it
         | doesn't look like the default configuration performs very
         | well at all!
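         | 
         | (For reference, the test amounts to roughly `iperf3 -s` on
         | one end and `iperf3 -c <host> -w 32K` from the jail, plus
         | `iperf3 -c <host> -w 32K -R` to reverse the direction of
         | the data flow; `<host>` here is a placeholder.)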
        
       | tcdent wrote:
       | Linux has a rich ecosystem, but the toolkit is haphazard and a
       | little shaky. Sure, everyone uses it, because when we last
       | evaluated our options (in like 2009) it was still the most robust
       | solution. That may no longer be the case.
       | 
       | Given all of that, and taking into account building a product
       | on top of it, and thus needing to support it and stand behind
       | it, Linux wasn't the best choice. Looking ahead (in terms of
       | decades) and not just shipping a product now, they found that
       | an alternative ecosystem existed to support that.
       | 
       | Culture of the community, design principles, and
       | maintainability are all things to consider beyond just "is it
       | popular".
       | 
       | Exciting times in computing once again!
        
       | craftkiller wrote:
       | I wonder if CockroachDB abandoning the open source license[0]
       | will have an impact on their choice to use it. It looks like the
       | RFD was posted 1 day before the license switch[1], and the RFD
       | has a section on licenses stating they intended to stick to the
       | OSS build:
       | 
       | > To mitigate all this, we're intending to stick with the OSS
       | build, which includes no CCL code.
       | 
       | [0] https://news.ycombinator.com/item?id=41256222
       | 
       | [1] https://rfd.shared.oxide.computer/rfd/0110
        
         | implr wrote:
         | They already have another RFD for this:
         | https://rfd.shared.oxide.computer/rfd/0508
         | 
         | and on HN: https://news.ycombinator.com/item?id=41268043
        
           | Rendello wrote:
           | And a podcast episode: https://oxide.computer/podcasts/oxide-
           | and-friends/2052742
        
       | yellowapple wrote:
       | I'm surprised that KVM on Illumos wasn't in the running,
       | especially with SmartOS setting that as precedent (even if bhyve
       | is preferred nowadays).
        
       | mechanicker wrote:
       | Wonder if this is more due to bhyve being developed on FreeBSD
       | and Illumos deriving from a common BSD ancestor?
       | 
       | I know NetApp (whose stack is based on FreeBSD) contributed
       | significantly to bhyve when they were exploring options to
       | virtualize Data ONTAP (C mode).
       | 
       | https://forums.freebsd.org/threads/bhyve-the-freebsd-hypervi...
        
         | jclulow wrote:
         | While we have a common ancestor in the original UNIX, so much
         | of illumos is really more from our SVR4 heritage -- but then
         | also so much of _that_ has been substantially reworked since
         | then anyway.
        
       | computersuck wrote:
       | Because of CTO Bryan Cantrill, who was a core contributor to
       | illumos.
        
       | anonnon wrote:
       | Ctrl+f Cantrill >Phrase not found
       | 
       | Bryan Cantrill, ex-Sun dev, ex-Joyent CTO, now CTO of Oxide, is
       | the reason they chose Illumos. Oxide is primarily an attempt to
       | give Solaris (albeit Rustified) a second life, similar to Joyent
       | before. The company even cites Sun co-founder Scott McNealy for
       | its principles:
       | 
       | https://oxide.computer/principles
       | 
       | >"Kick butt, have fun, don't cheat, love our customers and change
       | computing forever."
       | 
       | >If this sounds familiar, it's because it's essentially Scott
       | McNealy's coda for Sun Microsystems.
        
       | JonChesterfield wrote:
       | Illumos and ZFS sound completely sensible for a company that
       | runs on specific hardware. They mention the specific EPYC CPU
       | their systems are running on, which suggests they're all
       | ~identical.
       | 
       | Linux has a massive advantage when it comes to hardware support
       | for all kinds of esoteric devices. If you don't need that, and
       | you've got engineers that are capable of patching the OS to
       | support your hardware, yep, have at it. Good call.
        
       ___________________________________________________________________
       (page generated 2024-09-12 23:01 UTC)