[HN Gopher] Writing a Unix clone in about a month
       ___________________________________________________________________
        
       Writing a Unix clone in about a month
        
       Author : drewdevault
       Score  : 241 points
       Date   : 2024-05-24 15:41 UTC (7 hours ago)
        
 (HTM) web link (drewdevault.com)
        
       | andsoitis wrote:
       | Impressive, super cool, and inspiring!
       | 
       | An example of how "creating something impressive in X days"
       | requires a lot of experience and talent that is built over _years_.
        
         | PaulDavisThe1st wrote:
         | ... and also a previous kernel implementation called Helios
         | to provide a lot of the lowest level code. Not trying to knock
         | down the accomplishment, but DD is pretty open about the fact
         | that a lot of the speed of this project was dependent on having
         | done Helios first (and reusing code from it).
        
           | palata wrote:
           | ...which is part of the "experience and talent built over
           | years", I guess? :-)
        
         | pushedx wrote:
         | Also the creator of KnightOS, written entirely in Z80 assembly,
         | more than 12 years ago!
         | 
         | https://www.ticalc.org/archives/files/fileinfo/463/46387.htm...
        
           | ruined wrote:
           | holy shit. they're already a living legend but somehow i
           | didn't make this connection
        
         | beryilma wrote:
         | Versus now... I changed the text on a button with an
         | internationalized string. It only took me about a week.
         | 
         | I put the English string in the catalog, updated a number of
         | tests, ran the tests on the local system, pushed the change to
         | the staging cluster, fixed unanticipated test failures, pushed
         | the change to production, contacted the translators to have the
         | string translated into a number of languages, and had the
         | documentation updated.
        
           | Muromec wrote:
           | So... It goes to production _before_ you get translations to
           | all the languages?
        
             | beryilma wrote:
             | In my case, the "production" does not really become visible
             | to users right away. Perhaps, I should have called it "pre-
             | production".
        
         | saagarjha wrote:
         | Drew is smart and his timeline is short but I think it's the
         | wrong way to look at it if you just put him on a pedestal for
         | it. Making a UNIX clone is a typical undergrad project at most
         | universities. Extending that to something that is complete is
         | something that requires perseverance, not special genius.
        
         | ezconnect wrote:
         | He already had Helios, so he just integrated a few missing parts.
        
       | lupusreal wrote:
       | Missed opportunity to call it Drewnix.
        
         | jrpelkonen wrote:
         | Fun fact: Linus Torvalds originally named his fledgling OS as
         | "Freax", but it was an FTP site admin who came up with "Linux"
         | and the rest is history. So perhaps the opportunity is not
         | completely missed...
        
           | davisr wrote:
           | No he didn't.
           | 
           | "I called it Linux originally as a working name. That was
           | just because "Linus" and the X has to be there--it's UNIX,
           | it's like, a law--and what happened was that I initially
           | thought that I can't call it "Linux" publicly because it's
           | just too egotistical. That was before I had a big ego."
           | 
           | https://yewtu.be/watch?v=kZlOCHYu1Vk
        
       | 8organicbits wrote:
       | Code is here: https://git.sr.ht/~sircmpwn/bunnix/tree/master
       | 
       | GPLv3 license.
        
       | pjmlp wrote:
       | Quite cool, by making use of Hare instead.
        
       | samatman wrote:
       | I was interested in Hare until I found this immensely self-
       | defeating FAQ item:
       | https://harelang.org/documentation/faq.html#will-hare-suppor...
       | 
       | As a baseline, I support developers using whatever license they
       | would like, and targeting whatever operating systems, indeed,
       | writing _whatever_ code they would like in the process.
       | 
       | That doesn't make this specific policy a good idea. Even FSF,
       | generally considered the most extreme (or, if you prefer,
       | principled) exponents of the Free Software philosophy, support
       | Windows and POSIX. They may grumble and call it Woe32, but
       | Stallman has said some cogent things about how the fight for a
       | world free of proprietary software is more readily advanced by
       | making sure that Free Software projects run on proprietary
       | systems.
       | 
       | They do at least license the library code under MPL, so merely
       | using Hare doesn't lock you into a license. But I wonder about
       | the longevity of a language where the attitude toward 95+% of the
       | desktop is "unsupported, don't ask questions on our forums, we
       | don't want you here".
       | 
       | Ironically, a Google search for "harelang repo" has as the first
       | hit an unofficial macOS port, and the actual SourceHut repo
       | doesn't show up in the first page of results.
       | 
       | Languages either snowball or fizzle out. I'm typing this on a
       | Mac, but I could pick up a Linux machine right now if I were of a
       | mind to. But why would I invest in learning a language which
       | imposes a purity test on developers, when even the FSF doesn't? A
       | great deal of open source _and_ free software gets written on
       | Macs, and in fact, more than you might think on Windows as well.
       | 
       | From where I sit, what differentiates Hare from Odin and Zig, is
       | just this attitude of purity and exclusion. I wish you all happy
       | hacking, of course, and success. But I'm pessimistic about the
       | latter.
        
         | stonogo wrote:
         | Sounds like you and the Hare people have different definitions
         | of success. As for "languages either snowball or fizzle out," I
         | feel like that's pretty dismissive of a lot of languages that
         | have been steadily marching on for decades even without this
         | rockstar status.
         | 
         | Not every band has to hit the Billboard charts to be worth
         | listening to.
        
         | jampekka wrote:
         | "The goal of Hare is not to achieve the broadest possible
         | reach, but to be a part of a broader system which effectively
         | achieves Hare's goals."
        
         | bee_rider wrote:
         | It says they won't officially support Windows or MacOS. Some
         | other project can try to port it if they want, right? It seems
         | good of them to be honest about their intended level of
         | support.
         | 
         | Supporting an OS the devs don't use is a big ask.
        
         | kbolino wrote:
         | On the one hand, I can respect the authors for sticking to what
         | they want to accomplish and not accommodating every demand.
         | 
         | On the other hand, that is hardly the only thing from the FAQ
         | that raises one's eyebrows:
         | 
         | > we have no package manager and encourage less code reuse as a
         | shared value
         | 
         | > qbe generates slower code compared to LLVM, with the
         | performance ranging from 25% to 75% the runtime performance of
         | comparable LLVM-generated code
         | 
         | > Can I use multithreading in Hare? Probably not.
         | 
         | > So I need to implement hash tables myself? Indeed. Hash
         | tables are a common data structure that many Hare programs will
         | need to implement from scratch.
         | 
         | As it stands, this is definitely not a language designed for
         | mass adoption. Which is fine, and at least they're upfront
         | about it.
        
         | skydhash wrote:
         | > _But why would I invest in learning a language which imposes
         | an arbitrary purity test on developers?_
         | 
         | While I understand your concerns, I disagree with the idea
         | of "imposition". Someone doing something for free doesn't owe
         | it to anyone to do it in a particular way (as long as it's not
         | malevolent). You're free to express your opinion, but if the
         | developer has already established his guidelines, criticism
         | like this is not constructive.
        
         | sramsay wrote:
         | "We cannot effectively study, understand, debug, or improve,
         | the underlying operating system if it is non-free. We actively
         | work with the source code for the systems on which we depend,
         | and we are not interested in supporting any platforms for which
         | this is not possible."
         | 
         | I understand that you don't like it, but how do you come to
         | regard a statement like this as "arbitrary?" It's exclusive,
         | for sure. "Purity test" is one way to characterize it. But do
         | you really think that statements like this are just the product
         | of individual caprice? That it's not someone's attempt at a
         | principled intervention, but just an "attitude?"
        
           | PhilipRoman wrote:
           | Ouch, I hadn't really considered it before but that quote
           | deeply resonates with me. The experience of trying to debug
           | the Windows wifi system is night and day compared to
           | wpa_supplicant/mac80211.
        
           | samatman wrote:
           | You're right, it isn't arbitrary. I removed that word from
           | the post and edited it to express my opinion more clearly.
        
           | apantel wrote:
           | I was going to post the same quote. If you have no visibility
           | into the layer you depend on, you really can't reason about
           | it or write optimized code for it.
           | 
           | The Hares are saying they require that, which I totally
           | understand and respect.
        
         | palata wrote:
         | I don't think that Apple particularly cares about porting their
         | software to Linux. Do you feel the same about Apple? That with
         | such an attitude, they surely cannot succeed?
        
           | samatman wrote:
           | Apple releases a great deal of _open source_ software, which,
           | so far as I'm aware, all runs on Linux as well. At least
           | Swift, clang, and LLVM all run on Windows as well. So does
           | their Objective C compiler, so of Apple's _programming
           | languages_, that leaves AppleScript. I would not describe
           | AppleScript as robustly successful.
           | 
           | I believe Apple could probably _get away with_ keeping Swift
           | proprietary, or only supporting Apple platforms. But they
           | don't. I have no inside-track information on why that is, but I
           | suspect the reason is fairly simple: developers wouldn't like
           | it.
        
             | palata wrote:
             | > so of Apple's programming languages
             | 
             | So the whole part of your message about "even the FSF
             | saying that free software should run on proprietary system"
             | works when you want to criticize Hare, but not when looking
             | at Apple proprietary software, right?
             | 
             | A language is just another piece of software, I don't see
             | why you should apply different rules to a programming
             | language than, e.g. to a serializing system like Protobuf.
             | And I don't think Google actively supports swift-protobuf
             | (https://github.com/apple/swift-protobuf).
             | 
             | Hare upstream just says "we are not interested in
             | supporting non-free OSes, but we won't prevent you from
             | doing it". It's your choice to not use Hare because of
             | this, but it's their choice to not support macOS.
        
               | samatman wrote:
               | > _As a baseline, I support developers using whatever
               | license they would like, and targeting whatever operating
               | systems, indeed, writing whatever code they would like in
               | the process._
               | 
               | > _That doesn't make this specific policy a good idea._
        
             | saagarjha wrote:
             | You will note that Apple invests approximately zero effort
             | in making those projects portable.
        
         | 2pEXgD0fZ5cF wrote:
         | > Languages either snowball or fizzle out.
         | 
         | This is not true and a naive statement. There are quite a few
         | languages which are not popular across the board but have a
         | very firm niche in which they thrive and fulfill critical
         | roles.
        
       | anta40 wrote:
       | Very cool. Most of these Unix clones are usually written in C.
       | This one is written in a new programming language.
        
         | balder1991 wrote:
         | I only read part of the FAQ. I find the desire to keep the
         | complexity low by limiting the compiler lines of code and not
         | using LLVM interesting, but I wonder how practical it is. The
         | FAQ admits that because of this, it generates slower code. So
         | it shifts the complexity to the software codebase, by telling
         | the users to "use assembly where needed".
         | 
         | Seems a bit like Python's philosophy of not introducing too
         | many optimizations to prevent the runtime complexity from
         | spiraling out of control.
        
           | PhilipRoman wrote:
           | I doubt it is a real problem for anything other than number
           | crunching. I like to use tcc during development (which does
           | very few, if any, optimizations) to speed up compilation
           | and I never noticed any regressions in performance, even for
           | GUI software. Throughput just isn't that big of a deal for
           | most applications (although latency and resource usage are,
           | but they're not affected by choice of compiler).
        
         | pjmlp wrote:
         | There were UNIXes written in Ada and Pascal; naturally, C has
         | a special relationship.
        
           | trollerator23 wrote:
           | Really??
        
             | JPLeRouzic wrote:
             | In France, in the 1980s, there was a Unix clone written in
             | Pascal by the CNET (the R&D arm of the incumbent phone
             | operator). The CPU was an M68K and the hard disk had
             | 20 MB (if I recall correctly). I don't remember the name
             | of the beast.
        
             | pjmlp wrote:
             | Yep,
             | 
             | https://en.m.wikipedia.org/wiki/Apollo_Computer
             | 
             | https://marte.unican.es/
        
       | nickcw wrote:
       | Hare looks like an interesting language.
       | 
       | Though this limitation will limit its adoption in this multicore
       | age I think:
       | 
       | From the FAQ https://harelang.org/documentation/faq.html
       | 
       | ....
       | 
       | Can I use multithreading in Hare?
       | 
       | Probably not.
       | 
       | We prefer to encourage the use of event loops (see unix::poll or
       | hare-ev) for multiplexing I/O operations, or multiprocessing with
       | shared memory if you need to use CPU resources in parallel.
       | 
       | It is, strictly speaking, possible to create threads in a Hare
       | program. You can link to libc and use pthreads, or you can use
       | the clone(2) syscall directly. Operating systems implemented in
       | Hare, such as Helios, often implement multi-threading.
       | 
       | However, the upstream standard library does not make reentrancy
       | guarantees, so you are solely responsible for not shooting your
       | foot off.
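
A rough illustration of the event-loop style the quoted FAQ answer points to, written in C against POSIX poll(2) rather than Hare's unix::poll (whose API is not shown in this thread). The helper name `poll_once` and the `MAX_FDS` limit are made up for this sketch; it performs one turn of a loop and reads from whichever descriptors are ready.

```c
#include <poll.h>
#include <unistd.h>

#define MAX_FDS 16 /* arbitrary cap for this sketch */

/* Hypothetical helper: one turn of an event loop. Waits up to timeout_ms
 * for any of the nfds descriptors to become readable, reads once from each
 * ready one, and returns the total number of bytes read (-1 on error). */
long poll_once(const int *fds, int nfds, int timeout_ms)
{
    struct pollfd pfds[MAX_FDS];
    char buf[256];
    long total = 0;

    if (nfds < 0 || nfds > MAX_FDS)
        return -1;

    for (int i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN; /* we only care about readability */
        pfds[i].revents = 0;
    }

    /* A single blocking call multiplexes all descriptors; no threads needed. */
    if (poll(pfds, (nfds_t)nfds, timeout_ms) < 0)
        return -1;

    for (int i = 0; i < nfds; i++) {
        if (pfds[i].revents & POLLIN) {
            ssize_t n = read(pfds[i].fd, buf, sizeof buf);
            if (n > 0)
                total += n;
        }
    }
    return total;
}
```

A real loop would call something like this repeatedly and dispatch each ready descriptor to its own callback; the point is only that a single thread can service many I/O sources.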
        
         | senkora wrote:
         | > multiprocessing with shared memory if you need to use CPU
         | resources in parallel
         | 
         | This is actually pretty powerful. I personally prefer it for
         | most purposes, because it restricts the possibility of data
         | races to only the shared memory regions. It's a little like an
         | "unsafe block" of memory with respect to data races.
        
           | pjmlp wrote:
           | I used to be a strong believer in threads and dynamic library
           | plugins, and changed my mind exactly because of attack vectors
           | and host program stability.
        
         | packetlost wrote:
         | I just wish it had closures
        
       | mtillman wrote:
       | This is really cool. Reminds me of how the original Unix was
       | invented in a couple of weeks while Ritchie's family went on
       | vacation to CA to visit his in-laws.
       | 
       | Source: UNIX: A History and a Memoir by Brian W. Kernighan
       | (2019)
        
         | balder1991 wrote:
         | But I think it's relevant to say that before writing Unix he
         | was working on Multics for a long time already. Unix was a
         | "simplified" version of it, if I remember well. So it didn't
         | "spring out of thin air."
        
           | trollerator23 wrote:
           | Absolutely.
        
           | fuzztester wrote:
           | >So it didn't "spring out of thin air."
           | 
           | Right. Almost nothing does.
           | 
           | You see, it's
           | https://en.m.wikipedia.org/wiki/Turtles_all_the_way_down
        
           | eichin wrote:
           | Mmm, even early versions ended up being more the "anti-
           | multics" than actually simplified-from, despite the name
           | pun...
        
         | naitgacem wrote:
         | i thought that story was about 3 programs that were missing, a
         | text editor being one of them.
         | 
         | I'll have to check because my memory is failing me atm.
        
           | Zambyte wrote:
           | > I allocated a week each to the operating system, the shell,
           | the editor and the assembler
           | 
           | http://www.groklaw.net/article.php?story=20050414215646742
        
         | laxd wrote:
         | I think you mean Ken Thompson. I can't be bothered searching
         | through youtube interviews but I'm pretty sure that on more
         | than one occasion, he tells a story something along the lines
         | of having a disk driver, some programs, and maybe some other
         | components. His wife went on a trip and he figured it would be
         | enough time to fill in the gaps and make a complete OS.
        
           | jasone wrote:
           | I'm pretty sure that is mentioned in this interview:
           | 
           | https://www.youtube.com/watch?v=wqI7MrtxPnk
           | 
           | By the way the CHM oral history video series is full of gems.
        
       | LightFog wrote:
       | It was really cool watching the ~daily updates on this on
       | Mastodon - seeing how someone so skilled gradually pieces
       | together a complex piece of software.
        
         | herodoturtle wrote:
         | Link to the mastodon thread (from Drew's article), for those
         | that are interested:
         | 
         | https://fosstodon.org/@drewdevault/112319697309218275
        
       | kpw94 wrote:
       | > I also finally learned how signals work from top to bottom, and
       | boy is it ugly. I've always felt that this was one of the weakest
       | points in the design of Unix and this project did nothing to
       | disabuse me of that notion.
       | 
       | Would love any resources that go into more detail, if any HN-er
       | or the author himself knows of some!
        
         | palata wrote:
         | I wanted to say the exact same thing! I would love to get more
         | details about that.
        
         | eterps wrote:
         | Would love to read a blog post about that.
        
         | retrac wrote:
         | Signals are at the intersection of asynchronous IO/syscalls,
         | and interprocess communication. Async and IPC are _also_ weak
         | points in the original Unix design, not originally present.
         | Signals are an awkward attempt to patch some async IPC into the
         | design. They're prone to race conditions. What happens when
         | you get a signal when handling a signal? And what to do with a
         | signal when the process is in the middle of a system call, is
         | also a bit unclear. Delay? Queue? Pull process out of the
         | syscall?
         | 
         | If all syscalls are async (a design principle of many modern
         | OSes) then that aspect is solved. And if there is a reliable
         | channel-like system for IPC (also a design principle of many
         | modern OSes) then you can implement not only signals but also
         | more sophisticated async inter-process communication/procedure
         | calls.
        
           | Joker_vD wrote:
           | As I wrote in some older discussion about UNIX signals on HN,
           | the root problem (IMHO, of course) is that signals conflate
           | three different useful concepts. The first is asynchronous
           | external events (SIGHUP, SIGINT) that the process should be
           | notified about in a timely manner and given an opportunity to
           | react; the second is synchronous internal events (SIGILL,
           | SIGSEGV) caused by the process itself, so it's basically low-
           | level exceptions; and the third is process/scheduling
           | management (SIGKILL, SIGSTOP, SIGCONT) to which the process
           | has no chance to react so it's basically a way to save up on
           | syscalls/ioctls on pidfds. An interesting special case is
           | SIGALRM which is an _a_synchronous _in_ternal event.
           | 
           | See the original comment [0] for slightly more spelled-out
           | ideas on better designs for those three-and-a-half concepts.
           | 
           | [0] https://news.ycombinator.com/item?id=39595904
        
           | chasil wrote:
           | IPC was actually introduced in "Columbus UNIX."
           | 
           | https://en.wikipedia.org/wiki/CB_UNIX
        
         | NikkiA wrote:
         | I always felt VMS' mailbox system was much more elegant, but I
         | imagine it's an ugly mess under the surface too.
         | 
         | https://wiki.vmssoftware.com/Mailbox
        
           | MisterTea wrote:
           | I like Plan 9's notes: http://man.postnix.pw/9front/2/notify
        
         | pcwalton wrote:
         | "signalfd is useless" is a good article:
         | https://ldpreload.com/blog/signalfd-is-useless
         | 
         | It goes into the problems with Unix signals, and then explains
         | why Linux's attempt to solve them, signalfd, doesn't work well.
        
         | chubot wrote:
         | If you haven't already, I would start with Advanced Programming
         | in the Unix Environment by Stevens
         | 
         | https://www.amazon.com/Advanced-Programming-UNIX-Environment...
         | 
         | It is about using all Unix APIs from user space, including
         | signals and processes.
         | 
         | (I am not sure what to recommend if you want to implement
         | signals in the kernel, maybe
         | https://pdos.csail.mit.edu/6.828/2012/xv6.html )
         | 
         | ---
         | 
         | It's honestly a breath of fresh air to simply read a book that
         | explains clearly how Unix works, with self-contained examples,
         | and which is comprehensive and organized. (If you don't know C,
         | that can be a barrier, but that's also a barrier reading blog
         | posts)
         | 
         | I don't believe the equivalent information is anywhere on the
         | web. (I have a lot of Unix trivia on my blog, which people
         | still read, but it's not the same)
         | 
         | IMO there are some things for which it's really inefficient to
         | use blog posts or Google or LLMs, and if you want to understand
         | Unix signals that's probably one of them.
         | 
         | (This book isn't "cheap" even used, but IMO it survives with a
         | high price precisely because the information is valuable. You
         | get what you pay for, etc. And for a working programmer it is
         | cheap, relatively speaking.)
        
           | balder1991 wrote:
           | I believe this was the 3rd time I've seen this book being
           | recommended this week. It must mean something.
        
             | Terr_ wrote:
             | It might mean the Baader-Meinhof effect.
        
             | madhadron wrote:
             | It's been the standard reference for decades for a reason.
             | I learned from it, too. There's really nothing else quite
             | like it available.
        
             | pjmlp wrote:
             | It is a must for anyone serious about UNIX programming.
             | 
             | Additionally one should get the TCP/IP and UNIX streams
             | books from the same collection.
        
         | chasil wrote:
         | There were differences between BSD and SYSV signal handling
         | that were problematic in writing portable applications.
         | 
         | https://pubs.opengroup.org/onlinepubs/009604499/functions/bs...
         | 
         | It's important to remember that code in a signal handler must
         | be reentrant. "Nonreentrant functions are generally unsafe to
         | call from a signal handler."
         | 
         | https://man7.org/linux/man-pages/man7/signal-safety.7.html
        
           | convolvatron wrote:
           | reentrancy is not sufficient here - at least that provided by
           | mutex style exclusion. the interrupted thread may have
           | actually been the one holding the lock, so if the signal
           | handler enters a queue to wait for it, it may be waiting
           | quite a while
        
             | tedunangst wrote:
             | That's why the word reentrant is used, not thread safe.
        
       | amelius wrote:
       | Waiting for an OS that treats GPU(s) as a first class citizen ...
        
         | eterps wrote:
         | That wouldn't be too hard if GPUs had a stable
         | interface. Try programming a GPU in Assembly language and see
         | how that goes. The experience sucks, but that's the level that
         | needs to be targeted in case of an OS.
        
           | eterps wrote:
           | For example, in the past Amiga computers had a 'GPU'
           | (although much less powerful than today's GPUs) with a stable
           | interface. It was a first class citizen in its OS. It also
           | was incredibly easy to target in Assembly language.
        
             | pjmlp wrote:
             | Blitter was great, but those were simpler times.
             | 
             | The best we have nowadays is using compute shaders for the
             | same purpose.
             | 
             | Just like when using a TMS34010 with its C SDK.
        
             | mepian wrote:
             | Amiga died because it was stuck with the same old "GPU" for
             | too long, among other reasons.
        
         | saagarjha wrote:
         | What should this OS do?
        
       | calvinmorrison wrote:
       | hey drew! did writing this project give you any Hare-y situations
       | you hadn't run into before, or maybe - reached into corners not
       | yet probed by Hare and gave you ideas for a new feature or edge
       | case that was scary?
        
       | thefaux wrote:
       | Impressive work but I feel this approach is the hard and brittle
       | way to write an os. The easier and more portable way is to write
       | the os as a guest in a host language. You start with a simple
       | shell with the print command and build from there.
        
         | palata wrote:
         | I hope it's not too easy then... imagine what he could do in 27
         | days if this was the "hard and brittle way" :-).
        
         | tmountain wrote:
         | Where's your OS? How long did it take?
        
       ___________________________________________________________________
       (page generated 2024-05-24 23:01 UTC)