[HN Gopher] FreeBSD spends 7% of its boot time running a bubbles...
       ___________________________________________________________________
        
       FreeBSD spends 7% of its boot time running a bubblesort on its
       SYSINITs
        
       Author : gslin
       Score  : 292 points
       Date   : 2023-05-19 14:44 UTC (8 hours ago)
        
 (HTM) web link (twitter.com)
 (TXT) w3m dump (twitter.com)
        
       | jgrahamc wrote:
       | Many years ago I worked on a piece of software that simulated
       | networks. As networks got bigger it would run out of memory.
       | Turned out it did something like                  CONNECTION *c =
       | malloc(num_nodes * num_nodes * sizeof(CONNECT_STRUCT));
       | 
       | And almost the entire matrix was empty because most things in a
       | network aren't connected to each other. I spent a a few minutes
       | replacing that with a hash table on the (i, j) node pair.
        
       | loeg wrote:
       | So you might be shocked that 7% is so high, but the reason it is
       | so high is that cperciva has been driving down the overall boot
       | time to the tune of ~99%+ faster than two major releases ago.
       | Bubblesort used to be <1% of total boot time because the rest of
       | boot was so slow. Now that many of the gratuitous sleeps are
       | gone, the next thing to look at is stupid algorithms like this.
        
       | selimnairb wrote:
       | When I was doing my PhD in hydrology modeling, I found a bubble
       | sort in one of our pre-processing programs, which calculated the
       | topographic flow network. This worked okay when the watersheds
       | being modeled were small or the grid resolution was coarse. I
       | however, was modeling at much finer resolution (3m^2) resulting
       | in runtimes of more than an hour. I replaced the bubble sort with
       | merge sort from libc and my runtimes fell to less than 5 minutes.
       | Fun times.
        
         | tombert wrote:
         | Tangential, but in my Java class recently I was teaching
         | teaching the students some basic Big O stuff. I wrote a few
         | versions of a program that found the "biggest delta" of two
         | numbers in a list. The first version was O(n^2) time, the next
         | version was just O(n).
         | 
         | I showed them how long it took to run the O(n^2) on 100000
         | elements, about two minutes on my beefy server. Then I showed
         | them the O(n) version running on my comparatively weaker
         | laptop, and it took about ten seconds to run a _million_
         | elements. The students were genuinely kind of impressed, and I
         | stressed the point to say that  "these are the kind of numbers
         | that make it important to take theory seriously."
        
           | saagarjha wrote:
           | O(n) for 1 million elements taking ten seconds is pretty
           | long. What's the constant factor?
        
       | tonetheman wrote:
       | [dead]
        
       | xp84 wrote:
       | Naive question here: Sorting algorithms aside, is there a reason
       | why the ordering of these even needs to be (re)determined at
       | runtime every boot as opposed to the order being recomputed when
       | the kernel is updated, and written to a file?
        
       | Tade0 wrote:
       | I've seen this referred to as the "BBBB problem", as in: "nobody
       | types BBBB into the search box, so it makes no sense to optimize
       | the algorithm so it doesn't have worst-case complexity with
       | repeated characters".
       | 
       | That is, until someone searches for e.g. git conflict markers.
        
         | arnsholt wrote:
         | At a previous job, I debugged a close relative of that exact
         | problem: sorting list views was very slow when you wanted to
         | sort on a column with very few distinct values in a large
         | table. The culprit was of course a quick sort degenerating to n
         | squared performance.
        
           | epx wrote:
           | Had this problem in xBase (Clipper with NTX indexes): sorting
           | by date but there were many null (empty) dates in that
           | particular column. Added another date column (that was never
           | null) to the index just to provide entropy.
        
           | kzrdude wrote:
           | And pattern-defeating quicksort is one example of a solution
           | to that
        
       | daniel-thompson wrote:
       | Dawson's law strikes again!
       | 
       | > O(n^2) is the sweet spot of badly scaling algorithms: fast
       | enough to make it into production, but slow enough to make things
       | fall down once it gets there.
       | 
       | https://randomascii.wordpress.com/2021/02/16/arranging-invis...
        
         | thomasmg wrote:
         | O(n^2) algorithms often cause performance issues. The main
         | cases I have seen in business logic are: (A) offset pagination
         | (select ... offset n) and then paginate over all entries, and
         | (B) read a text value, append something, store, repeat.
        
           | djbusby wrote:
           | So, how to do offset w/o using OFFSET?
        
             | paulddraper wrote:
             | Use a sorted index (e.g. Btree), and use those values to
             | quickly find the start of the next page of results.
             | 
             | For good performance this also requires that your search
             | criteria (if any) be compatible with this index.
        
             | vbezhenar wrote:
             | If you need to iterate over all records, just do it? Why do
             | you need offset.
             | 
             | Otherwise using offset usually is OK idea. Because users
             | very rarely will inspect page #2153. They're interested
             | with page 1, sometimes page 2. limit/offset works fine for
             | those cases and it'll work for page 2153 for those who
             | visit it once in a decade. Using ids makes logic to trac
             | prev/next/page number incredibly complex and generally you
             | don't need it.
        
               | wruza wrote:
               | I regularly change 3 to 2 in /new/345689 links when bored
               | with today's content.
               | 
               |  _Using ids makes logic to trac prev /next/page number
               | incredibly complex and generally you don't need it._
               | 
               | When it's a public site, users may post so fast that this
               | "next" can show some previous page. Paging via ids is a
               | must there.
        
               | marcosdumay wrote:
               | > "next" can show some previous page
               | 
               | That is usually a non-issue. The cost in DB operations is
               | usually much more relevant than it.
               | 
               | When people do actually care about fully enumeration and
               | unicity of the items they they are querying, "pagination"
               | itself tends to be a too messy concept.
        
               | wruza wrote:
               | _The cost in DB operations is usually much more relevant
               | than it._
               | 
               | As a result, a user (all of them) hits "next" again until
               | a page looks like containing new posts. It's multiple
               | requests wasted.
               | 
               | Anyway, what exactly becomes messy?
        
               | marcosdumay wrote:
               | What a "page" means when enough things appear between
               | them that they push stuff a page away? Are those things
               | reordered? Do people expect to see the new things before
               | they get to the old?
               | 
               | A "page" is a very ill-defined thing that can only exist
               | if your stuff is mostly static. Queries are very
               | different on dynamic content.
        
               | derefr wrote:
               | > If you need to iterate over all records, just do it?
               | 
               | Who is "you" here?
               | 
               | Usually what happens is that party A builds a REST API
               | (or other connectionless protocol) for fetching a list of
               | some entity-type; and they limit the number of items that
               | can be fetched in one request (because they have a
               | billion such entities, and they don't want to even try to
               | imagine what kind of system would be necessary to
               | generate and stream back a 5TB JSON response); which
               | implies pagination, to get at "the rest of" the entities.
               | 
               | Then party B, a user of this API, decides that they want
               | to suck party A's whole billion-entity database out
               | through the straw of that REST API, by scraping their way
               | through each page.
               | 
               | > it'll work for page 2153 for those who visit it once in
               | a decade
               | 
               | To be looking at page 2153, the user probably first
               | visited every page _before_ 2153. Which means they didn
               | 't do _one_ O(N) query (which would by itself be fine);
               | but rather, they made O(N) requests that each did an O(N)
               | query.
        
             | kdmytro wrote:
             | Make a query whose parameters exclude the previous page
             | results altogether. I learned about this from here:
             | https://www.citusdata.com/blog/2016/03/30/five-ways-to-
             | pagin...
        
             | eatonphil wrote:
             | Include the last id (which should be indexed) of the
             | previous page in the next where filter.
             | 
             | https://use-the-index-luke.com/sql/partial-results/fetch-
             | nex...
        
           | xorcist wrote:
           | Ummm.. Sure, but what's happening here is that FreeBSD stores
           | a list in a file _not_ in the desired order, then sorts it
           | before use.
           | 
           | It seems to be any prolonged discussion about _which_ sorting
           | algorithm should be used is sort of skipping the elephant.
           | Why is the list not sorted to begin with?
           | 
           | Without that basic knowledge it isn't very productive to
           | focus on implementation details, no matter how fun the
           | sorting exercise is. Deleted code is fast code.
        
         | dangerlibrary wrote:
         | For a collection of similar stories:
         | 
         | https://accidentallyquadratic.tumblr.com/
        
           | tgv wrote:
           | About the regexp one: I once wrote a regexp for CSS. It
           | wasn't complete, but you would be able to pinpoint a
           | syntactic error. I hooked it up to a text field, and started
           | entering CSS. All went fine, until I hit a limit, and adding
           | a single character froze Chrome for more than a minute (at
           | which point I killed it).
           | 
           | I don't think it was accidentally quadratic. More likely, it
           | was exponential.
        
           | ouid wrote:
           | the funniest comment ever posted on HN was something like:
           | 
           | "everytime this blog is linked, I end up reading the whole
           | thing"
        
             | [deleted]
        
         | dheera wrote:
         | Probably because it's MUCH easier to code bubblesort without
         | making mistakes that cause it to not terminate or some such.
         | Especially if they are writing the bootloader in assembly.
         | 
         | For something mission critical like a bootloader that's more
         | valuable than turning O(n^2) into O(n log n). People running
         | systems like BSD largely don't care how long the system takes
         | to boot, once it's booted the system runs for years.
        
           | tinus_hn wrote:
           | You shouldn't write a sort algorithm because much smarter
           | people have already done so and their work is included in the
           | standard library.
        
             | toast0 wrote:
             | Sure. In this case, the smarter people wrote the sort in
             | ~1995 and it was good enough for nearly 30 years, but now
             | someone has to step up and be smart again.
             | 
             | You can't always rely on smarter people to be there to do
             | things for you. And you also can't rely on the standard
             | library to have been written by smarter people anyway, and
             | even if so, to have been written by smarter people in order
             | to handle the situation you find yourself in. There's lots
             | of ways to sort, and lots of good reasons to choose ways
             | that are less than optimal for the use cases they end up
             | being used in.
        
             | rrdharan wrote:
             | The "standard library" often isn't available in early boot
             | scenarios.
        
               | thomastjeffery wrote:
               | The standard library can be statically linked...
        
               | xorvoid wrote:
               | People who think this way haven't written boot code.. I
               | suppose you're gonna link the c runtime too and it's
               | assumption of being a "process" under some "operating
               | system".. oh wait.
               | 
               | Compared to the rest of the task, writing a sort is
               | pretty darn trivial.
        
               | tedunangst wrote:
               | This is true if we're talking about the first stage bios
               | boot that needs to fit in 512 bytes, but there aren't any
               | particular restraints on kernel size at the point in
               | question. Link in anything you want, including qsort.
        
               | pjmlp wrote:
               | True, but there are plenty of CS books about algoriths
               | and data structures.
        
               | saagarjha wrote:
               | Converting those into code is not always trivial.
        
               | pjmlp wrote:
               | Depends on the quality of the degree.
               | 
               | Besides I would assume anyone getting ready for their
               | leetcode assignments goes through them anyway.
        
               | naniwaduni wrote:
               | Copying a convenient-looking one out of a CS book is _how
               | you end up with a bubble sort_. Approximately nobody
               | comes up with a bubble sort from scratch; it 's
               | obviously, gratuitously bad in a way that there's no
               | practical reason to think of on your own. The sorts that
               | people come up with without reference are usually
               | insertion and selection sorts--those two were discovered
               | _before_ bubble sort.
        
               | pjmlp wrote:
               | Only if they never managed beyond first chapter.
        
               | [deleted]
        
               | tedunangst wrote:
               | Bubble sort occupies a weird space where people assume
               | it's dumb and simple so it'll be the least code, but even
               | insertion sort tends to beat it.
        
               | naniwaduni wrote:
               | I mean yeah, bubble sort is basically insertion sort but
               | weirdly pessimized in a way that makes negative sense.
        
               | anikan_vader wrote:
               | Plenty of children come up with bubble sort as a sorting
               | algorithm. It's intuitive to go down the list swapping
               | any pairs that happen to be in the wrong order.
        
               | naniwaduni wrote:
               | It's also very intuitive to pick out a misplaced item and
               | put it in the right position. In the context of physical
               | objects (e.g. playing cards), it's _even more intuitive_
               | to come up with a (two-way) insertion sort than a bubble
               | sort.
        
           | dmitrygr wrote:
           | The easiest sort to implement in assembly is probably
           | selection sort. Bubble sort is actually messier :)
           | 
           | Source: written sorts in assembly more times than I would
           | care to count.
        
           | naniwaduni wrote:
           | The funny thing is that, in my experience, bubble sort is
           | actually pretty hard to write, because the finicky details of
           | the swap step _don 't make sense_ (in a "wait, this can't
           | possibly be right, there's no way you'd want to write it like
           | that" kind of way). Better than even odds that if you have
           | someone write a "bubble sort" without reference, you get an
           | accidentally insertion sort variant.
        
         | Avshalom wrote:
         | Off Topic: "Laying out icons on a grid should be an inherently
         | linear operation"
         | 
         | it doesn't seem mentioned in the HN thread the cause here is
         | probably the same thing O(n^2): sorting. laying out icons is
         | only linear if the icons are read in the order that they're
         | placed. It's been a long time since I used windows regularly
         | but my memory is the default placement is by creation time. So
         | if they're read off disk by filename (or some sort of hash)
         | they'd need to be sorted by timestamp.
        
         | TYMorningCoffee wrote:
         | GTA online was struck too
         | 
         | How I cut GTA Online loading times by 70%
         | https://nee.lv/2021/02/28/How-I-cut-GTA-Online-loading-times...
        
           | szatkus wrote:
           | Lesson here is to profile your software at some point.
           | 
           | I'm playing a game (Fenyx Rising) and after launching it I
           | always wonder why "Checking for additional content" screen
           | takes 20-30 seconds. I'm pretty sure it should be just a
           | single request.
        
             | canucker2016 wrote:
             | I doubt that GTA Online released the game with 60K+ items
             | in their online store.
             | 
             | The runtime for the release day store inventory count may
             | have been okay, but I don't think that Rockstar kept
             | profiling their game every time that they modified their
             | store inventory.
             | 
             | The amount of pain Rockstar inflicted on their 90K+ users
             | shows that Rockstar didn't care that their game took more
             | than 3 mins to startup for the majority of their users.
        
           | ahns wrote:
           | Every few months or so I come back to that article always in
           | awe
        
         | IshKebab wrote:
         | True but I think the real cause of this is surely that C makes
         | it too hard to use a sorting library that someone competent has
         | written. I would not be surprised if the author was fully aware
         | of the N^2 complexity but opted for a simpler implementation
         | anyway.
        
           | somat wrote:
           | Slight scarcasm. yes too hard.
           | 
           | http://man.openbsd.org/qsort
           | 
           | having said that this specific sort is somewhere deep in
           | kernel boot land. and kernel code can't really use the
           | standard library. I am not sure if there is a standard kernel
           | sort.
        
           | pjc50 wrote:
           | qsort() not good enough for you?
           | 
           | (More realistically, below people are discussing that in the
           | kernel environment the set of standard or third party library
           | available may be unavoidably limited)
        
             | jart wrote:
             | qsort() is pretty slow if you're sorting something like
             | long[]. In that case, radix sort goes 5x faster and
             | vectorized quicksort goes 10x faster. If you're sorting
             | int[] then vectorized quicksort goes 25x faster than
             | qsort(). Nothing goes faster. The issue is it's a big ugly
             | c++ monstrosity that's a huge pain to compile.
        
               | rcxdude wrote:
               | That's fair if the constant factor is relevant, but if
               | bubble sort is terminating in any reasonable timescale
               | then the difference between qsort, C++ std::sort, and a
               | custom implementation is really not a factor.
        
               | jart wrote:
               | People who don't compute much data don't need computer
               | science.
        
               | paulddraper wrote:
               | But if you're comparing to bubblesort.....
        
             | JdeBP wrote:
             | Nonetheless, qsort() is available in libkern.
        
               | canucker2016 wrote:
               | Not at the time that the bubble-sort code was used
               | though. see https://news.ycombinator.com/item?id=36005209
        
               | JdeBP wrote:
               | You mean written. It's been available for many of the
               | years that it has been _used_. (-:
        
             | nice2meetu wrote:
             | There is a quirk (almost a bug?) in FreeBSD's qsort where
             | it will switch to insertsort for the whole array under
             | certain conditions, which we hit in production given how
             | our data was arranged.
             | 
             | (I think this was the check)
             | https://github.com/freebsd/freebsd-
             | src/blob/main/lib/libc/st...
        
             | IshKebab wrote:
             | No. In most languages sorting a container is `foo.sort()`
             | or something similar. `qsort()` is much more faff.
             | 
             | I mean, clearly it wasn't good enough otherwise they would
             | have used it, no? Perhaps integrating it was a huge faff.
        
           | cultofmetatron wrote:
           | downvote me if you want but this is probably the best
           | argument for rust I've ever seen
        
             | EuropeOverlords wrote:
             | world is not simple, rust wont help you make world simpler.
             | so moot point. read OP again, you misunderstood.
        
               | cultofmetatron wrote:
               | in what way did I misunderstand?
               | 
               | > C makes it too hard to use a sorting library that
               | someone competent has written.
               | 
               | rust makes it easy to create generic data structures via
               | generics. It would be trivial to swap between different
               | underlying sorting algos as long as it was given a type
               | compatible function for how to compare the contained
               | items.
               | 
               | what exactly am I not understanding here?
        
               | TingPing wrote:
               | The fact that this was a developers mistake and has
               | nothing to do with the language.
        
           | JohnFen wrote:
           | > that C makes it too hard to use a sorting library that
           | someone competent has written.
           | 
           | This makes no sense to me. What about C makes it hard to use
           | a good library?
        
             | IshKebab wrote:
             | There's no standard build system. Think about how you add a
             | dependency in Rust, Go, JavaScript, even Python.
             | 
             | Now do it in C/C++. Absolute nightmare.
             | 
             | Look at how many header-only C++ libraries there are out
             | there. They're making compilation time drastically worse
             | _purely_ to avoid the need to faff with a shitty build
             | system. It 's a good trade-off too. I use header-only
             | libraries where possible. They're convenient and popular.
             | 
             | Actually vcpkg seems to be going some way to fixing that
             | but I have yet to convince my co-workers to use it.
        
               | JohnFen wrote:
               | > to avoid the need to faff with a shitty build system.
               | 
               | Then maybe don't use a shitty build system?
               | 
               | It's true, C is not trying to be a programming
               | environment or tech stack. It's a language, that's it.
               | Whether or not that's desirable depends on what you're
               | trying to do, it's not something that is good or bad in
               | some absolute sense.
               | 
               | You have your choice of build systems, so pick one that
               | meets your needs.
               | 
               | Vcpkg isn't for me, either, because it doesn't solve any
               | problem I have. If it does for you, awesome!
        
               | IshKebab wrote:
               | > You have your choice of build systems, so pick one that
               | meets your needs.
               | 
               | And can I pick the one that my dependencies use too?
               | Didn't think so.
        
               | JohnFen wrote:
               | Your dependencies have already been built. You're just
               | linking to them. So yes, you can.
        
               | inferiorhuman wrote:
               | And what if the dependencies haven't been built yet?
        
               | icedchai wrote:
               | Build them and install them in /usr/local, just like we
               | did 30 years ago.
        
             | jjoonathan wrote:
             | C arrays suck on the best of days and qsort requires that
             | you understand function pointers and types which are some
             | of the hairiest C syntax in common use. The C Clockwise
             | Spiral rule is truly special.
             | 
             | It's easy to lose sight of the climb once you're at the
             | top.
        
               | devnullbrain wrote:
               | I'd suspect kernel devs know about function pointers.
        
               | jjoonathan wrote:
               | I'd expect kernel devs to carry lots of bad habits and
               | poorly calibrated intuition from their noob days.
               | Example: "for loops are fine."
        
               | JohnFen wrote:
               | Well, we could debate this, but it's all irrelevant to
               | the assertion that C makes it hard to use good libraries.
        
               | jjoonathan wrote:
               | Oh, right, I forgot that C's library/package management
               | situation sucks so hard that makes the awful syntax look
               | like a comparatively small problem.
        
               | JohnFen wrote:
               | This actually made me laugh out loud. Yes, if your
               | problem with C is that it doesn't need a package
               | management mechanism like some other languages, then C is
               | clearly not for you. But C is very far from the only
               | language like this.
               | 
               | It's a bit like criticizing a fish for having no legs.
        
               | IshKebab wrote:
               | > if your problem with C is that it doesn't need a
               | package management mechanism like some other languages
               | 
               | The problem isn't that it doesn't _need_ one, it 's that
               | it doesn't _have_ one. I have no idea why you would think
               | that it doesn 't need one.
               | 
               | Well there is vcpkg now anyway so it finally does have
               | one.
        
               | JohnFen wrote:
               | I sense that we have some sort of real miscommunication
               | going on here, because the only response I can think of
               | to
               | 
               | > I have no idea why you would think that it doesn't need
               | one.
               | 
               | Is that I have no idea why anyone would think that it
               | _does_ need one.
               | 
               | Perhaps the disconnect is that you are wishing C
               | addresses different use cases than it addresses? That you
               | wish it were a different language? If so, that's fine.
               | Use a more appropriate language for your task. I just
               | find it odd if the criticism of C is that it isn't a
               | different kind of language.
        
               | cozzyd wrote:
               | C package management works great on my system.
               | # dnf install foo-devel
        
             | zrm wrote:
             | > This makes no sense to me. What about C makes it hard to
             | use a good library?
             | 
             | It doesn't have templates/generics.
        
               | acuozzo wrote:
               | The lack of namespaces is of far greater consequence.
        
               | Jorengarenar wrote:
               | LIBNAME_actual_function_name()
        
               | [deleted]
        
               | kevin_thibedeau wrote:
               | C11 added (limited) generic support
        
               | zrm wrote:
               | They used "_Generic" as a keyword but it doesn't really
               | do that.
               | 
               | Suppose I need to define a copy assignment operator for
               | the library's sort function to use. Is there a good way
               | to overload it? Can the library know what the size of
               | each element is based on its type without having to pass
               | it in as a parameter?
               | 
               | You can pass function pointers to the library, but that
               | quickly becomes awful.                 /* sort for basic
               | integer types */       void lib_sort_u8(uint8_t *a,
               | size_t count);       void lib_sort_i8(int8_t *a, size_t
               | count);       void lib_sort_u16(uint16_t *a, size_t
               | count);       void lib_sort_i16(int16_t *a, size_t
               | count);       void lib_sort_u32(uint32_t *a, size_t
               | count);       void lib_sort_i32(int32_t *a, size_t
               | count);       void lib_sort_u64(uint64_t *a, size_t
               | count);       void lib_sort_i64(int64_t *a, size_t
               | count);       /* specify a comparator */       void
               | lib_sort_comp(void *start, size_t nmemb, size_t size, int
               | (*compar), const void *, const void *));       /* specify
               | a comparator and an assignment operator */       void
               | lib_sort_assign_comp(void *start, size_t nmemb, size_t
               | size, void (*assign)(void *, const void *), int
               | (*compar)(const void *, const void *));       /* specify
               | a comparator, an assignment operator, and ... */
               | 
               | Or, you get one function that takes all of the arguments
               | and have to define and pass in a bunch of function
               | pointers and type size parameters that are each an
               | opportunity for bugs or UB in order to sort a simple
               | array of integers.
               | 
               | If my type needs a custom assignment operator, I need
               | each library I use to take that as an argument. One
               | expects the function pointer to take the arguments in the
               | order (src, dst), another as (dst, src), a third
               | specifies the return value as int instead of void, a
               | fourth takes the source argument as "void *" instead of
               | "const void *" in case you want to implement move
               | semantics and a fifth doesn't support custom assignment
               | operators at all.
               | 
               | It's no surprise that people prefer to avoid this.
        
               | remexre wrote:
               | You can't specialize a generic algorithm for specific
               | operations and data structures; in C++ terms, it gives
               | you overloading, not templates.
        
             | me_again wrote:
             | I think it comes down to ecosystem fragmentation.
             | 
             | A lot of languages now have common tooling (cargo for rust,
             | pip for python, etc) which makes it easier to find and
             | incorporate the libraries you want. Apparently there are
             | tools like https://conan.io/ but they're not as widely-
             | adopted.
             | 
             | C's build system is similarly non-uniform. Many packages
             | use Makefiles, others use different build mechanisms.
             | 
             | C has no universally-agreed error handling mechanism or
             | convention. Whether exceptions or golang's error interface,
             | you can generally assume that a random package in most
             | languages will handle errors the way you expect. In C it's
             | a lot more varied.
             | 
             | Similarly memory allocation - sometimes in a larger
             | application you want to customize how malloc and friends
             | work. How and whether you can do that for a C package is
             | non-uniform.
             | 
             | Mind you the C standard library has a sort() function which
             | will have sensible big-O behavior on pretty much any
             | platform. I suspect this specific problem is more to do
             | with this being kernel-mode code which has a lot of special
             | conditions.
        
               | JohnFen wrote:
               | But none of that supports the assertion that C makes it
               | hard to use good libraries. You can even use libraries
               | not written in C if you want.
               | 
               | If the argument is really "it's impossible to make a good
               | library in C", that's different. I'd very much disagree
               | with that, but it would be to the point.
        
               | giantrobot wrote:
               | The original assertion was about difficulty of using C
               | libraries in the kernel or bootloader. In the bootloader
               | you're the OS. There's no file system, no dynamic linker,
               | and no processes. There's no guarantee some third party
               | library will work in that environment. It might but it's
               | work to make sure it does or adapt it if it doesn't.
        
               | me_again wrote:
               | I'm saying "it is harder to consume good libraries in C,
               | because it is harder to find them & harder to build them;
               | and once you have done both, you find that good library A
               | and good library B work in very different ways, so you
               | have to do more work to adapt".
               | 
               | And I haven't mentioned the lack of a strong universal
               | string type, the way many libraries typedef their own
               | family of different kinds of integer, the way one library
               | will require you to free() a returned structure and
               | another will require you to call my_library_free()...
               | 
               | It all adds up to additional friction.
               | 
               | You don't have to agree! Maybe I am out of date, I
               | haven't really dealt with this since the mid 2000's. I'd
               | be thrilled to hear this isn't an issue any more.
        
               | pjmlp wrote:
               | Although I usually rant about C, using libraries is
               | surely not a problem specifically in the world of UNIX
               | package managers.
        
               | JohnFen wrote:
               | > You don't have to agree!
               | 
               | It's not really a matter of whether or not I agree. I was
               | just trying to understand what the assertion was!
               | 
               | I was baffled by the notion because I couldn't think of
               | anything inherent in the language that made it hard to
               | use good libraries. Now I understand that's not really
               | what the assertion was.
        
               | dmitrygr wrote:
               | > cargo for rust, pip for python, etc
               | 
               | GOOD! I love that C lacks this pollution. Means that code
               | is written for-purpose and tuned for-purpose.
        
               | palata wrote:
               | I am always amazed by arguments that say that not having
               | a language package manager like cargo or pip makes it
               | hard.
               | 
               | Really? Is it really considered hard to link a library
               | without them? Am I so old that somehow I grew up in a
               | world where linking a library was not considered black
               | magic?
        
               | Dylan16807 wrote:
               | Custom memory allocation is pretty optional and a lot of
               | the time could be handled with a single buffer.
               | 
               | Outside of that you don't need to deal with exceptions in
               | a sorting library, and you can happily make it a single
               | .c and .h
        
         | markus_zhang wrote:
         | Thanks, that's a fun read although I don't understand much of
         | it. I do understand the gist.
        
         | jug wrote:
         | This made me recall modern AI and the issue with quadratic
         | complexity in its transformers. Ooof! A breakthrough here would
         | be a true Breakthrough(tm) with remarkably larger context
         | sizes. Like it would barely even be a limit anymore and be
         | transformative (har har) to what they can be used for.
        
         | 908B64B197 wrote:
         | Discoveries and analysis like this blog post and the parent
         | show the difference between programmers and engineers.
        
       | bombcar wrote:
       | I would love to see a detailed analysis of various OS boot
       | sequences and times, it is kind of insane that modern computers
       | and phones can take minutes to boot; what ever are they doing?
        
         | silverwind wrote:
         | Could be worse. Some network routers take up to 30 minutes to
         | cold boot from what I hear, but they essentially never cold
         | boot because they are on UPS.
        
         | MarkSweep wrote:
         | The author of the tweet has given talks and made written
         | articles about his work on speed up the FreeBSD boot. Some
         | examples:
         | 
         | https://www.daemonology.net/papers/bootprofiling.pdf
         | 
         | https://youtu.be/10XRCiBtyhA
        
         | PhilipRoman wrote:
         | Mostly just questionable hardware design and spoiled
         | programmers who want to do too much in early boot. There is no
         | technical reason why you couldn't have a modern desktop booting
         | in a couple of milliseconds. It's a tradeoff between being fast
         | and being compatible with the rest of the world.
         | 
         | At the end of the day, boot times are determined not by
         | technical reasons but by how much the users are willing to
         | tolerate. If the market demanded low boot times, the engineers
         | would deliver.
         | 
         | But seriously, there is a router that I shall not name here
         | that spends 10 seconds sleeping to avoid a race condition after
         | initializing some irrelevant service. And of course the entire
         | boot process depends on this service.
        
           | xp84 wrote:
           | > not by technical reasons but by how much the users are
           | willing to tolerate.
           | 
           | Realizing this one clause is what helped me understand so
           | much, including economics and traffic (induced demand). I
           | call it the "misery constant."
           | 
           | Number of lanes doesn't matter: People will adjust their
           | lifestyles until they arrive at the highest level of misery
           | they can tolerate. Add a lane and wait for equilibrium to
           | reestablish at exactly the same level of congestion as
           | before.
           | 
           | Add a 10,000% (more??) increase in CPU and disk speed from
           | the 128K Macintosh to the MacBook Pro, and oh look, it takes
           | basically the same amount of time to boot or to launch an
           | application as it did then, as the developers adjusted their
           | wastefulness until they achieved the maximum level of delay
           | the customers were willing to bear.
        
           | pjc50 wrote:
           | > there is a router that I shall not name here that spends 10
           | seconds sleeping to avoid a race condition after initializing
           | some irrelevant service
           | 
           | Reliability is an issue here. There's all sorts of bits and
           | pieces where it's miserably difficult to get a signal or
           | event that something has come up, or there is such a
           | mechanism and it's unreliable. Especially when there's third-
           | party closed drivers or firmware involved. So somebody ends
           | up giving up and putting the delay in, testing it a few
           | hundred times, and declaring it done.
           | 
           | But yes, lots of systems could come up faster if enough
           | people cared.
        
           | ilyt wrote:
           | > Mostly just questionable hardware design and spoiled
           | programmers who want to do too much in early boot. There is
           | no technical reason why you couldn't have a modern desktop
           | booting in a couple of milliseconds. It's a tradeoff between
           | being fast and being compatible with the rest of the world.
           | 
           | You wouldn't even be able to initialize CPU & DRAM in "couple
           | of milliseconds"
           | 
           | Loading BIOS from flash alone would eat that budget few times
           | over
        
         | secondcoming wrote:
         | initialising hardware mostly.
        
         | EscapeFromNY wrote:
         | On linux you can use systemd-analyze and get a nice svg
         | waterfall chart of the boot-time dependency chains.
        
           | HeckFeck wrote:
           | So systemd comes with its own SVG, and presumably some sort
           | of XML composer. TIL.
        
           | yjftsjthsd-h wrote:
           | On the subset of Linux distros that use systemd, rather.
           | Thankfully, there are other options
           | (https://elinux.org/Bootchart seems a decent overview)
        
             | freedomben wrote:
             | > _On the subset of Linux distros that use systemd,
             | rather._
             | 
             | Worth noting that Fedora, RHEL/CentOS/et al, Debian,
             | Ubuntu, Arch and derivatives use systemd, so odds are
             | pretty high if you use "Linux" you also get systemd
        
               | taneliv wrote:
               | Android seems to be the most popular Linux distro on the
               | planet, though? Does it use systemd? I was under the
               | impression that it doesn't, but as it often is, I might
               | be wrong.
        
               | freedomben wrote:
               | Technically Android runs the Linux kernel, but I wouldn't
               | consider it a "Linux distro" in anywhere near the same
               | sense as I would an Ubuntu or Fedora, etc. The vast
               | majority of Android users don't even know they're running
               | the Linux kernel, which is not true of "distros."
        
               | taneliv wrote:
               | I would imagine the vast majority of OpenWRT users don't
               | know they're running the Linux kernel. It was installed
               | by the manufacturer or the operator. Assuming that's
               | true, does it make OpenWRT not a distro?
               | 
               | I understand your usage of the word "distro". I think
               | it's useful, but also quite a bit misleading (for
               | example, Wikipedia[1] lists Android as a distro).
               | 
               | Perhaps there ought to be another word for distributions
               | of Linux that require user to take active part in either
               | replacing a manufacturers' choice or setting it up from
               | scratch on an otherwise "empty" system. But I don't know
               | what it would be.
               | 
               | [1]
               | https://en.wikipedia.org/wiki/List_of_Linux_distributions
        
               | freedomben wrote:
               | > _I would imagine the vast majority of OpenWRT users don
               | 't know they're running the Linux kernel. It was
               | installed by the manufacturer or the operator. Assuming
               | that's true, does it make OpenWRT not a distro?_
               | 
               | My gut is to not consider OpenWRT a distro since it's
               | usually just the software part of a network appliance and
               | many/most users don't know they're running it, although I
               | concede that it does qualify in most senses as a distro:
               | downloadable image, a package manager, is a complete
               | product that uses the linux kernel, etc.
               | 
               | I fully concede that my opinion of "what makes a distro"
               | is subjective and arbitrary. It's very much the Linux
               | distro version of Stewart's test for obscenity[1] and is
               | a pretty unhelpful standard which value is quite low :-)
               | 
               | [1]:
               | https://en.wikipedia.org/wiki/I_know_it_when_I_see_it
        
               | yjftsjthsd-h wrote:
               | That's true; I think the only major user-facing Linux
               | distros without systemd (or with non-systemd versions)
               | these days are Gentoo, Void, Alpine, and Android (which
               | may or may not count). Embedded stuff is probably more of
               | a mixture (openwrt doesn't use systemd, yocto and
               | buildroot _can_ use it but optionally), and firecracker
               | feels closer to embedded than desktop but both ends are
               | interesting. I dunno, I mostly just object to conflating
               | "Linux" with systemd even if it is the most popular
               | option (but bear in mind that I have strong opinions
               | about "Linux" vs "GNU/Linux", so YYMV;]).
        
               | bombcar wrote:
               | I'd just like to interject for a moment. What you're
               | referring to as Linux, is in fact, systemd/Linux, or as
               | I've recently taken to calling it, systemd plus Linux.
        
             | ilyt wrote:
             | So 95% of the market.
        
         | rwmj wrote:
         | PCI enumeration turns out to be surprisingly slow, especially
         | when virtualized. Access to PCI config space is done by writing
         | an address to one register and reading the data at that address
         | from another register, repeat for every word in the PCI config
         | space for every device. Each of those register reads and writes
         | takes a VMEXIT, so goes through the host kernel and into qemu
         | and back. Oh yeah, and it used to be done twice, once by
         | SeaBIOS and once by the kernel (the SeaBIOS stuff has been
         | optimized since.)
        
           | xp84 wrote:
           | I'm basically a 5yo when it comes to hardware - so my
           | apologies for the dumb question - but I wonder naively, could
           | this enumeration be completely cached by default? Most
           | computers have most of the important stuff either soldered
           | down (laptops) or at least not swapped regularly. I think I'd
           | sign up immediately for a system where I had to hold a key at
           | startup the first time I installed a new GPU or something,
           | but otherwise it could boot into Windows in 3 seconds. Since
           | most devices that are likely to change are hot-pluggable, I
           | would imagine that those don't need to be checked on at boot
           | time. I'd love to know where I'm wrong on this, though.
           | 
           | Perhaps this is part of what Windows does -- I know that it
           | basically keeps a hibernated image of the non-logged-in RAM
           | state, for use when you shut down (and it throws it out if
           | you do an explicit Restart, for troubleshooting reasons)
        
           | toast0 wrote:
           | If you're dealing with modern systems (which would hopefully
           | include virtual machines), the PCI config space can be memory
           | mapped, and then you don't need to read it through the PCI
           | config registers.
        
         | jeffbee wrote:
         | Large systems need to initialize a lot of DRAM at power-on but
         | a reboot of a running Linux desktop these days should take
         | single-digit seconds.
        
       | tombert wrote:
       | Makes me realize how spoiled I am by standard libraries nowadays.
       | 
       | Outside of toy projects, interviews, and teaching them, I've
       | never actually written a sort function. I never saw the point;
       | the sort function built into the standard library of whatever
       | language I'm using is probably faster and/or more space efficient
       | than whatever thing I whip up in a few seconds.
       | 
       | Of course, that's a privileged opinion that I am afforded by the
       | fact that my entire career has been "Java and higher" languages.
        
         | repiret wrote:
         | In my first computer science class in college, after finishing
         | the unit on Red-Black Trees, the professor said something like
         | "Now you should never implement a Red-Black Tree again. If you
         | need a balanced binary tree, use std::map or whatever your
         | standard library provides. Same goes for all the other sorting
         | and data structures we've done in this class."
         | 
         | Relatedly:
         | 
         | Q: When is Python faster than C?
         | 
         | A: When the C programmer used a linked list or some O(n^2)
         | algorithm because they didn't have a standard library full of
         | easy to use well implemented data structures and algorithms.
        
           | norir wrote:
           | Even though I kind of mostly agree that you almost never
           | should implement your own rb tree, this advice is also kind
           | of disempowering and I worry that it is part of a mindset
           | that accepts the status quo when it is so often the case that
           | we could have better things if we just got our hands a little
           | dirty.
        
             | Dylan16807 wrote:
             | If you want to reinvent a wheel, you had better be
             | expecting and willing to spend significant time on your new
             | wheel.
             | 
             | It's not just getting your hands dirty, you need to be
             | strategic about where you want to spend your time.
        
             | tombert wrote:
             | I think better verbiage might be "99% of the time you
             | probably don't want to write one of these from scratch...at
             | least not for production".
        
           | tombert wrote:
           | That's actually something I've been saying for quite awhile
           | when people bring up microbenchmarks complaining about how a
           | language like Haskell is slow.
           | 
           | Like, yes, no question, if you try and implement a tightloop
           | in Haskell vs C, C is going to come out a lot faster.
           | 
           | However, no one writes Haskell like that! People rely pretty
           | heavily on standard libraries (or even third party
           | libraries), and on the optimizations afforded to them by the
           | "rich" language of Haskell.
           | 
           | C might be _capable_ of being faster, but that doesn 't imply
           | that a C program will _inherently_ be faster than a similar
           | Haskell program, especially if the programs are relatively
           | large.
        
       | sn9 wrote:
       | I see a lot of discussion here about quicksort as an alternative.
       | 
       | If I'm understanding correctly, we're talking about sorting
       | integers. Would something like radix sort not be more
       | appropriate? Or is there a reason to still prefer a more generic
       | sorting algorithm?
        
       | klyrs wrote:
       | > O(N^2) can bite hard when you're sorting over a thousand items.
       | Time to replace the bubblesort with something faster.
       | 
       | ... downthread ...
       | 
       | > I suspect quicksort will be enough though.
       | 
       | Somebody hasn't taken a freshman complexity theory class in a
       | while...
       | 
       | (Note, this is tongue in cheek. Worst case is not average case
       | and all that. I'm sure quicksort will do fine)
        
         | cperciva wrote:
         | [Deleted, humour doesn't come through in text.]
        
           | cpp_frog wrote:
           | Not even noon and Colin is already roasting. Reminds me of
           | [0], also by him.
           | 
           | [0] https://news.ycombinator.com/item?id=35079
        
             | cperciva wrote:
             | I haven't recovered from jetlag yet. Sorry, my snark went
             | overboard.
        
           | klyrs wrote:
           | I was mostly joking, but worst-cases include sorted and
           | reverse-sorted lists. I've seen those happen accidentally, in
           | practice, resulting in accidentally-quadratic runtime. This
           | is why, when I taught that class, I made students compare
           | sorting algorithms on real world data sets.
        
             | sn9 wrote:
             | Isn't the point of quicksort that the randomization of the
             | pivot selection means that you're extremely unlikely to run
             | into a pathological case for the specific pivot chosen?
             | 
             | And perhaps even more unlikely if you permute the input
             | sequence before sorting it.
             | 
             | I'm basing this off of my understanding of Skiena's
             | discussion of quicksort in _The Algorithm Design Manual_.
        
             | cperciva wrote:
             | Sure. In this case though the list doesn't change from boot
             | to boot, or for that matter from system to system; if we
             | hit a worst-case we would notice before the release
             | shipped.
             | 
             | Also, the worst case would be "systems take 2 ms extra time
             | to boot" so it's not exactly a disaster.
        
               | whimsicalism wrote:
               | > though the list doesn't change from boot to boot, or
               | for that matter from system to system; if we hit a worst-
               | case we would notice before the release shipped.
               | 
               | Wait what - why don't you just sort it at build time
               | then? I'm probably misunderstanding what you are saying
               | or maybe just a premature optimization.
               | 
               | e: Nevermind, read further in the thread and saw you
               | answered [0]
               | 
               | [0]:
               | https://twitter.com/cperciva/status/1659561673278226432
        
               | cperciva wrote:
               | We might end up doing that. A quicksort takes it from 2
               | ms down to 40 us though, and doing the sorting at build
               | time would take a lot of linker magic.
               | 
               | (The SYSINITs are records in separate object files which
               | are marked as going into a particular ELF section, so we
               | don't actually have a complete list of them until we
               | finish linking the kernel.)
        
         | pimbrah wrote:
         | he means quicksort will be enough to make further boot time
         | improvements due to precomputing and storing the sorted list
         | insignificants
        
         | whimsicalism wrote:
         | What freshman complexity class would teach you to conflate how
         | fast an algorithm is in practice with the asymptotic
         | complexity?
        
           | CaptainNegative wrote:
           | Not to mention that QuickSort isn't a single algorithm but
           | rather a class of algorithms (as the choice of pivot is left
           | unspecified), and there are methods to choose the pivot that
           | give O(n log n) worst case or O(n log n) expected running
           | time (e.g. median of medians or soft heaps for the former,
           | random choice for the latter).
        
           | yjftsjthsd-h wrote:
           | Most of them? If the instructor doesn't call out the
           | difference and point out that all terms of a O() expression
           | have invisible but important constants tacked on, it's easy
           | to accidentally conflate scaling complexity with actual total
           | speed.
        
             | bee_rider wrote:
             | Why would the instructor skip that part of the the
             | definition? Giving out wrong definitions as a prank or
             | something?
        
               | yjftsjthsd-h wrote:
               | They're not _intentionally_ avoiding it, they just get
               | caught up in the mathematical angle and only mention its
               | (dis)connection to real execution time in a passing
               | comment if at all. Or they might even mention that there
               | are constants in play... and then say that of course
               | since they 're constants we can ignore them for the
               | purposes of O() notation. It's the difference between
               | learning from PHDs and engineers, I suppose.
        
       | freedomben wrote:
       | At the risk of tempting the systemd flame wars back to life, is
       | there data on how systemd performs? I know faster boot time was a
       | major goal of the project[1].
       | 
       | [1]: http://0pointer.de/blog/projects/systemd.html
        
         | ilyt wrote:
         | My personal/corporate, n=~500 experience is that it boots
         | faster, shuts down slower (technically because it is doing it
         | "properly"), and if you have any problem during shutdown it is
         | absolute PITA to debug why.
         | 
         | We did managed to throw away few thousand lines of code of
         | fixed init scripts after moving to systemd and get rid of few
         | other stuff so it has been positive but definitely some of the
         | bad with a lot of good. Journald logging format is also kinda
         | fucked that causes some nasty edge cases
         | 
         | Except of systemd-resolved, this utterly radioactive piece of
         | shit can go fuck itself.
        
           | taneliv wrote:
           | Do you have more details on the journald issues?
        
             | ilyt wrote:
             | Here is 9 year old bug that is still not fixed:
             | 
             | https://github.com/systemd/systemd/issues/2460
             | 
             | Long story short: there is no useful indexing in journald
             | DB format which means looking for say "last message app
             | emitted" means looking thru every file.
             | 
             | As long as it is mmaped in or on tmpfs it is reasonably
             | fast, but if you say have last 1GB of logs stored on
             | spinning rust on NAS.... you either have 3-4 second wait
             | till it finds those few log lines, or buffer cache littered
             | with that GB of logs, pushing out actually useful stuff.
             | 
             | It literally have worse performance than grepping entire
             | /var/log last time I checked.
             | 
             | And it seems in its entirety it was "Lennart wanted to make
             | a binary log format" vs something actually useful... just
             | having SQLite DB instead would on top of being faster far
             | more useful (ability to SQL query logs on local node with
             | no extra fuss would be awesome)
        
       | [deleted]
        
       | RcouF1uZ4gsC wrote:
       | This is one of the cases where C's low-level nature bites.
       | 
       | Both C++ and Rust have pretty nice sorting algorithms in the
       | standard library. C does not (qsort doesn't really count as a
       | nice sorting algorithm which is why so many C programmers took
       | their own).
        
         | [deleted]
        
         | thomasmg wrote:
         | Why do you think qsort is not nice?
         | 
         | I could imagine that qsort is not an option in FreeBSD for
         | example because it is recursive and there is no stack at that
         | level, or there is a risk of stack overflow. If that's the
         | case, shell sort should be fine I guess.
        
         | BearOso wrote:
         | Can't use the standard library in the kernel, though. That's
         | why there's special implementations of a lot of things you take
         | for granted in kernel source, things like "print".
        
           | floxy wrote:
           | Seems like we are due for the cyclic interest in
           | microkernels. I wonder what's the latest research on that
           | front.
        
           | erk__ wrote:
           | As PHK points out [0] qsort was imported [1] into the kernel
           | in 1995, it was simply never used for the boot process, but
           | it is availible.
           | 
           | [0]: https://twitter.com/bsdphk/status/1659568675899228163
           | 
           | [1]: https://github.com/freebsd/freebsd-
           | src/commit/d89a1b600fc071...
        
             | canucker2016 wrote:
             | Looks like the code to bubble-sort the SYSINIT objects was
             | committed a few months BEFORE this kernel qsort code was
             | committed (late Aug 1995 vs early Nov 1995).
             | 
             | So that's why the sort-SYSINIT objects code didn't use
             | qsort.
             | 
             | I'd also guess that there were fewer SYSINIT objects to
             | sort back in 1995 than today.
             | 
             | see ~line 150 of sys/kern/init_main.c in
             | https://github.com/freebsd/freebsd-
             | src/commit/2b14f991e64ebe...
        
           | tczMUFlmoNk wrote:
           | Rust's `sort_unstable` is available in `core` and does not
           | require the standard library or the `alloc` crate, so, if my
           | understanding is correct, even `no_std` programs like the
           | kernel could use it?
           | 
           | https://doc.rust-
           | lang.org/core/primitive.slice.html#method.s...
           | 
           | (It's interesting that the _stable_ `sort`, by contrast, is
           | only available in `std` because it uses an allocating
           | timsort-variant.)
        
             | steveklabnik wrote:
             | So you're not wrong, but also, the Linux kernel uses a
             | modified variant of alloc, so not only can it use
             | sort_unstable, it also has access to sort https://github.co
             | m/torvalds/linux/blob/2d1bcbc6cd703e64caf8d...
        
             | tialaramex wrote:
             | Most relevant element of what's going on here is that
             | Rust's standard library is deliberately split three ways:
             | Core has stuff that should "just work" on your target
             | platform, Alloc adds stuff that further requires a Memory
             | Allocator, which is something target hardware doesn't come
             | with, but any non-trivial system is likely to build, then
             | Std adds stuff which requires an Operating System, such as
             | Files, and Threading
             | 
             | C is technically also split, so called "freestanding" C
             | implementations don't provide features like stdio but they
             | do have e.g. math functions. C++ attempts something similar
             | but it's a bit hit-and-miss as a truly "free standing" C++
             | doesn't entirely make sense, as systems that try to use C++
             | for very low level work often discover.
             | 
             | The next less relevant observation is that Rust chooses to
             | call these sort and sort_unstable - the one that's
             | guaranteed to be available doesn't get the easier name,
             | that's reserved for the one that definitely doesn't have
             | unexpected behaviour. If you don't know what an "unstable"
             | sort is, then sort_unstable's behaviour might in some cases
             | be astonishing. Some languages make the opposite choice,
             | which means programmers may use sort, not realising it's
             | actually an _unstable_ sort because the entire concept was
             | never explained to them.
             | 
             | This is one of many examples of Rust's safety achieved not
             | via some sophisticated technology but just by thinking
             | about why people screw up. Every Rust programmer who
             | without thinking writes sort gets a stable sort, if that
             | was too slow, they can choose sort_unstable, but doing so
             | prompts them to consider, wait, _is_ an unstable sort
             | correct here?
             | 
             | Finally the least relevant but hopefully still interesting
             | observation is that it is quite possible to have a stable
             | sort which doesn't require allocation, but the performance
             | is not so hot. It seems pretty unlikely that you must have
             | a stable sort and you can't afford an allocator and yet you
             | don't mind the performance hit. As I understand it though
             | some C++ libraries do offer this.
        
         | aidenn0 wrote:
         | FWIW, qsort (despite its name) does _not_ specify a quicksort
         | be used.
         | 
         | IIRC qsort will use a merge-sort on glibc when it can do so
         | without growing the heap.
        
           | JdeBP wrote:
           | However FreeBSD qsort(), in both libc and libkern, is
           | quicksort, falling back to insertion sort when there are
           | fewer than 7 items.
        
       | [deleted]
        
       | daneel_w wrote:
       | So a few milliseconds in total. Big whoop. OpenBSD's ahci(4)
       | driver stalls _5-6 seconds for each SATA device on the bus_ ,
       | just to make sure the device really is there and that the link
       | speed is correct...or something. My two OpenBSD machines, which
       | incidentally have 3 SATA drives each, spend almost 20 seconds on
       | that one segment alone during kernel startup.
        
         | 0x457 wrote:
         | Well, FreeBSD has other issues too. I had re-cable all of my
         | USB devices because it was taking forever (linux and windows
         | booted a lot faster).
        
           | ilyt wrote:
           | Huh ? Were they resetting under FreeBSD or something ?
        
             | 0x457 wrote:
             | I had it setup like this: some devices plugged into PC
             | directly, KVM plugged into the PC, hub plugged into PC, hub
             | on monitor plugged into hub and a bunch of things plugged
             | into hubs.
             | 
             | Half of the devices were just powered by USB and not using
             | any data, but only needed to be ON when the PC is ON.
             | 
             | The way it worked:
             | 
             | - FreeBSD would scan all USB devices
             | 
             | - get held up on some for no reason
             | 
             | - find a hub
             | 
             | - scan everything on a hub
             | 
             | - find another hub
             | 
             | - scan everything on that hub as well
             | 
             | Anytime during the scan it could run into a device that
             | takes longer (no clue which or why, only happened on
             | FreeBSD and never on Linux or Windows)
             | 
             | I'm sure windows and linux did the same thing, but both
             | shapely continued booting while FreeBSD waited and waited
             | and waited and waited.
             | 
             | Bias light on a monitor would sometimes completely stop the
             | process until unplugged.
             | 
             | USB stack on FreeBSD is forever cursed, I still remember
             | kernel panics when you unplug a keyboard or a mouse.
        
               | toast0 wrote:
               | > I'm sure windows and linux did the same thing, but both
               | shapely continued booting while FreeBSD waited and waited
               | and waited and waited.
               | 
               | My understanding is FreeBSD will by default wait until
               | it's found all the fixed disks before it figures out
               | which one to mount as root. This could be a USB drive, so
               | it needs to enumerate all the USB first. Adding
               | hw.usb.no_boot_wait=1 to /boot/loader.conf.local or
               | thereabouts if you don't need that will skip that wait
               | and speed up the boot process a lot (especially on
               | hardware where USB is slow to finish enumeration).
        
               | ilyt wrote:
               | IIRC most linux distros just get what they need to boot
               | from initrd/fstab and only complain if something from
               | there is missing
        
         | count wrote:
         | Colin's focus has been on speeding up EC2 boot time. You pay
         | _per second_ from time on EC2. A few milliseconds at scale
         | probably ads up to a decent amount of savings - easily
         | imaginable it 's enough to cover the work it took to find the
         | time savings.
        
           | taeric wrote:
           | Yes, you pay per second. But, per other notes, this is slated
           | to save at most 2ms. Realistically, it won't save that much,
           | as they are still doing a sort, not completely skipping it,
           | but lets assume qsort does get this down to 0 and they
           | somehow saved it all. That is trivially 500 boots before you
           | see a second. Which... is a lot of reboots. If you have
           | processes that are spawning off 500 instances, you would
           | almost certainly see better savings by condensing those down
           | to fewer instances.
           | 
           | So, realistically, assuming that Colin is a decently paid
           | software engineer, I find it doubtable the savings from this
           | particular change will ever add up to mean anything even
           | close to what it costs to have just one person assigned to
           | it.
           | 
           | Now, every other change they found up to this point may help
           | sway that needle, but at this point, though 7% is a lot of
           | percent, they are well into the area where savings they are
           | finding are very unlikely to ever be worth finding.
           | 
           | Edit: I saw that qsort does indeed get it into basically zero
           | at this range of discussion. I'd love to see more of the
           | concrete numbers.
        
             | vidarh wrote:
             | Missing the point, which is to shave off cold start times
             | for things like Lambdas, where shorter start times means
             | you can more aggressively shut them down when idle, which
             | means you can pack more onto the same machine.
        
         | jhalstead wrote:
         | Colin (and others?) have been working to reduce the boot time
         | of FreeBSD for a while now. At this point shaving off a few
         | milliseconds is probably a nice win for them.
         | 
         | https://twitter.com/cperciva/status/1659391725427720195?t=0y...
        
           | taeric wrote:
           | I mean, I get it. But shaving 4.5ms does seem to fall into
           | the realm of not most peoples problems?
           | 
           | Note that I want to stress that that is no reason for some
           | folks to stop focusing on it. And kudos on the improvement.
           | Criticism, if any, here is that statistics once again rears
           | its head to make something feel bigger by using a percent.
        
             | aidenn0 wrote:
             | My math says ~2ms (28ms*0.07 = 1.96ms)? Still, if it
             | mattered to get it down to 28ms, it might matter to get it
             | down to 26ms...
        
               | taeric wrote:
               | Agreed, but my math also says that if this is your pain
               | point, you'd save even more time by not firing off more
               | VMs? That is, skip the startup time entirely by just not
               | doing it that often.
        
               | aidenn0 wrote:
               | I don't startup VMs multiple times per hour, much less
               | per minute, so I don't assume to know what tradeoffs the
               | people using firecracker are making when deciding how
               | often to startup VMs.
        
             | MuffinFlavored wrote:
             | if 4.5ms is 7%, the entire boot time is 64ms?
        
               | taeric wrote:
               | Ha, I thought I saw 4.5ms from another post. Not sure
               | where I got that number. :(
               | 
               | Realizing this is why someone said the number was more
               | like 2ms. I don't think that changes much of my view
               | here. Again, kudos on making it faster, if they do.
        
             | maccard wrote:
             | 4.5ms on what hardware, in what scenario? Would I like to
             | save 5ms off the startup time of my VMs? You betcha. Does
             | that 5ms turn into 200ms on a low power device? Probably
        
               | taeric wrote:
               | I'm not even clear that I care on my VMs. I don't start
               | up enough for that to save me more than... a second a
               | month or so? Maybe. Probably not even.
        
               | maccard wrote:
               | Great! Then this doesn't affect you!
        
               | taeric wrote:
               | Right, but again pointing this at most people on this
               | forum, that answer is probably the same. Very few of us
               | are in a situation where this can save seconds a year,
               | much less seconds a month.
               | 
               | For the folks this could potentially save time, I'm
               | curious if it is better than taking an alternative route.
               | Would be delighted to see an analysis showing impact.
               | 
               | And again, kudos to the folks for possibly speeding this
               | up. I'm assuming qsort or some such will be faster, but
               | that isn't a given. For that matter, if it is being
               | sorted so that searches are faster later, than the number
               | of searches should be considered, as you could probably
               | switch out to sentinel scanning to save the sort time,
               | and then you are down to how many scans you are doing
               | times the number of elements you are looking against.
        
               | daneel_w wrote:
               | How would you even discern this delay from all the other
               | delays happening before you're done booting, when there
               | are so many other natural variances of a couple of
               | milliseconds here and there? Every "on metal" boot is
               | unique, no two are exactly as fast, and certainly never
               | within 1.98 milliseconds of eachother even in the case of
               | a VM with lots of spare resources on the host. You're
               | painting a too pretty picture of this.
        
             | nextaccountic wrote:
             | > I mean, I get it. But shaving 4.5ms does seem to fall
             | into the realm of not most peoples problems?
             | 
             | If whoever is fixing this depend on quickly launching
             | instances in Firecracker, it's their problem. And that's
             | how open source is usually done
        
               | taeric wrote:
               | Sorta. They have to rely on this on a repeated basis such
               | that it matters. And there, if startup time is still 10x
               | this, roughly, why push for solutions that require that
               | many kernel startups?
        
               | vasco wrote:
               | Why do anything, we're all going to die!!
               | 
               | Super fast boots of containers is a good effort, would be
               | cool to be as fast as spawning a process or a thread!
        
               | taeric wrote:
               | That is silly. I already said I do not intend this as a
               | criticism of the folks working on it. Kudos on the
               | improvement.
               | 
               | That said... for most people, you are better off learning
               | to use threads or processes. It is almost certainly less
               | effort than spinning up a VM. Not to mention everything
               | else you will, by necessity, have to do if you are using
               | VMs for everything. Unless you find ways to do shared
               | network connections across VMs and such, but... I'm
               | curious how we can make things closer to threads, without
               | eventually bringing all of the dangers of threads along
               | with it?
        
               | rcoveson wrote:
               | Why make "for most people" points when we're already
               | talking about FreeBSD?
               | 
               | There's a sort of uni-/light-kernel race quietly
               | happening right now since the industry doesn't really
               | fully trust process isolation within an OS for running
               | completely untrusted code. At least not as much as it
               | trusts VT-X/AMD-V. In that race, kernel startup time is
               | just like JVM startup time, or Julia startup time, or
               | Python startup time, all of which are things that people
               | work on shaving milliseconds from.
        
               | taeric wrote:
               | Ouch. That feels even harsher to the idea than what I'm
               | saying. :D
               | 
               | To your point, my "for most people" is aimed at this
               | forum. Not the folks doing the work. Is incredibly cool
               | that the folks doing this are also on the forum, but I
               | think it is safe to say they are not most of the folks on
               | here. That not the case?
        
               | rcoveson wrote:
               | Say the post had been "How Netflix uses FreeBSD to
               | achieve network latencies less than 10 microseconds" or
               | something. How helpful would it be to comment about how
               | "for most people" this doesn't matter?
               | 
               | Did anybody who read the tweet think it mattered to them
               | when it didn't? It mentions Firecracker explicitly. How
               | many people on HN do you think upvote this because they
               | themselves are running FreeBSD on Firecracker, versus the
               | number who upvoted because it's just interesting in and
               | of itself?
        
               | [deleted]
        
               | lacrosse_tannin wrote:
               | why not boot a good os in firecracker instead?
        
         | ilyt wrote:
         | ...wat? How something that bad could linger for so long?
        
           | aidenn0 wrote:
           | It adds 2ms to the boot time on modern hardware, and the list
           | it is sorting has grown by about 2 orders of magnitude since
           | it was introduced. Seems reasonable to me that it was
           | unnoticed.
        
             | mardifoufs wrote:
             | I think the comment you are replying to was referring to
             | the openbsd issue that apparently adds 10-20secs per boot
             | in the GP comment.
        
           | tedunangst wrote:
           | Because none of the people afflicted have done anything about
           | it.
        
         | kevingadd wrote:
         | If you boot OpenBSD thousands of times per day it adds up. I
         | can imagine this being the case for people running big server
         | farms or doing virtualization stuff.
        
           | daneel_w wrote:
           | These are servers and I reboot them only when necessary. It's
           | a pet peeve, but a big one.
        
             | ilyt wrote:
             | In my experience time to boot everything else on server far
             | exceeds any OS shenaningans.
             | 
             | I had servers that took up to 5 minutes from start/reboot
             | to GRUB prompt. And they were not some 60 drive monsters
             | but typical dual CPU 1U servers...
        
           | trzy wrote:
           | At 7% of 28ms, if you boot it a thousand times per day you
           | get a whopping 2 seconds.
        
             | ilyt wrote:
             | no, if your SATA/SAS stalls for 20s you get 20s
        
         | EuropeOverlords wrote:
         | OS should save all serial numbers of devices and after they are
         | all found continue. we should make it declarative or how do you
         | want to call it.
         | 
         | also why cant freaking boot process be optimized after first
         | boot? bsd is essentialy sorting SAME thing every boot, THAT is
         | ridiculous. sort once, save order ( list, sysinit000000000000
         | ), boot fast next time. hardware or other change or failed
         | boot, can trigger start sorting bull for safety.
         | 
         | you know what youre booting into, so sort it once then run it
         | from saved order next time. how many times you change hardware
         | on computer ? and if you do, you can just restart with grub
         | flag, toggle switch in control panel before restart, etc
        
           | Denvercoder9 wrote:
           | > toggle switch in control panel before restart
           | 
           | Ah yes, let's brick the OS when hardware fails and has to be
           | replaced without advance notice.
        
           | AdamH12113 wrote:
           | Windows XP had a tool called BootVis[1] that would try to
           | optimize the startup sequence. Supposedly newer versions of
           | windows do this sort of thing automatically.
           | 
           | I suspect much of the delay comes from the nature of plug-
           | and-play. The OS has to poll every bus and every port to find
           | every device, then query those devices to determine what they
           | are, then load a driver and actually configure them. It
           | simply can't rely on the hardware being exactly the same from
           | one boot to the next.
           | 
           | [1] https://en.wikipedia.org/wiki/BootVis
        
           | exmadscientist wrote:
           | The code to manage "do I sort or do I cache" is probably
           | worse than the code to just sort it unconditionally.
           | 
           | And you really want to do this automatically, with no manual
           | anything, because (among other reasons) if you need to swap a
           | part on a broken machine, you do not want that to needlessly
           | break the software. So you're sorting regardless, and so you
           | might as well just be unconditional about it.
        
           | ilyt wrote:
           | That's nice way to add fuckloads of useless code while
           | wasting memory and making now-read-only process into read-
           | write process (kernel have no place _to_ write cache before
           | userspace initializes and sets up mounts and filesystems).
           | 
           | ...vs just putting quicksort in
        
         | MichaelZuo wrote:
         | How do the other OS's get around this?
        
           | ilyt wrote:
           | Well, for one I'd imagine you can do all ports at once. Or
           | start doing it and allow rest of drivers to initialize their
           | stuff while it is waiting
           | 
           | At least looking in my dmesg on linux all SATA drives are
           | getting ready rougly at same time 0.6 s into boot AHCI driver
           | gets initialized and 3.2 seconds into boot they are all up.
           | Looks parallel too as ordering does not match log order
           | [    3.192274] sd 6:0:0:0: [sdc] 500118192 512-byte logical
           | blocks: (256 GB/238 GiB)         [    3.192277] sd 7:0:0:0:
           | [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
           | [    3.192277] sd 8:0:0:0: [sda] 5860533168 512-byte logical
           | blocks: (3.00 TB/2.73 TiB)
        
           | daneel_w wrote:
           | I don't know, but I don't think there's an actual case that
           | Linux/FreeBSD/etc. happens to have a workaround for. I think
           | it's just OpenBSD charging towards a windmill.
           | 
           | It's not impossible that it's some archaic and since long
           | invalid legacy thing along the lines of "let's wait 5 seconds
           | here just in case the drive hasn't managed to spin up,
           | despite the computer having been powered on for 15 seconds
           | already, because that happened once in 2003 on someone's 1989
           | HPPA system so we need to support that case". I'm not joking,
           | this is really what I imagine it being about.
        
             | adrianmonk wrote:
             | That's believable and maybe even reasonable.
             | 
             | There were (or still are?) SCSI drives that would not spin
             | up until told to over the bus. I think the idea is to not
             | do them all at once since the motor draws a lot of power.
             | 
             | I'm fairly sure I've run into drives that would not respond
             | on the bus while spinning up.
             | 
             | And if it happens with SCSI drives, it may happen with
             | other types.
        
               | daneel_w wrote:
               | _> "There were (or still are?) SCSI drives that would not
               | spin up until told to over the bus"_
               | 
               | Surely they'd spin up as soon as the BIOS started probing
               | the bus to init the hardware, long before the kernel was
               | running, or they'd be infamous for being "the HDDs you
               | cannot boot an OS from"...
               | 
               | In my 25 years of using PCs I have not once come across a
               | drive that did not spin up as soon as the computer was
               | powered on. But whatever the case is, Linux and FreeBSD
               | never had this behavior. Waiting some arbitrary amount of
               | time isn't an appropriate solution (to what I insist is
               | just an imagined problem), it's just a poor bandaid.
        
               | jasomill wrote:
               | Spinning up only upon request is common behavior for SCSI
               | drives. Per the Ultrastar DC HC550 manual I have handy,
               | 
               |  _After power on, the drive enters the Active Wait state.
               | The Drive will not spin up its spindle motor after power
               | on until it receives a NOTIFY (Enable Spinup) primitive
               | on either port to enter the Active state. If a NOTIFY
               | (Enable Spinup) primitive is received prior to receiving
               | a StartStop Unit command with the Start bit set to one,
               | spin up will begin immediately. For SAS, this is
               | analogous to auto-spinup function in legacy SCSI. This
               | provision allows the system to control the power spikes
               | typically incurred with multiple drives powering on (and
               | spinning up) simultaneously._
               | 
               |  _If a StartStop command with the Start bit set to one is
               | received prior to receiving a NOTIFY (Enable Spinup), the
               | drive will not start its spindle motor until Notify
               | (Enable Spinup) is received on either port. Successful
               | receipt of a NOTIFY (Enable Spinup) is a prerequisite to
               | spin up._
               | 
               | Code in the SCSI controller boot ROM, _if enabled_ , does
               | typically handle this before the OS starts, often with an
               | option to stagger spin-up for the reason mentioned above.
               | 
               | For what it's worth, to speed up boot, I typically
               | disable the OPROM on any storage controller not hosting a
               | boot device.
               | 
               | Moreover, as BIOS, UEFI, Power Mac, ..., all require
               | different, incompatible firmware bits, enabling
               | controller firmware isn't an option for many otherwise
               | compatible controllers.
               | 
               | Regardless, possible spin up delays in no way justify
               | introducing explicit delays in device enumeration; if the
               | OS needs to ensure a drive is spun up, multiple
               | mechanisms exist to allow it to do so without introducing
               | artificial delays (TEST UNIT READY, non-immediate START
               | STOP UNIT, simply paying attention to additional sense
               | information provided by commands known to fail before
               | spin up and retrying as appropriate, etc.).
               | 
               | IIRC, explicit multi-second delays between controller
               | initialization and device enumeration were traditionally
               | introduced because some (broken) devices ignored commands
               | entirely -- and were therefore effectively invisible --
               | for a significant period of time after a bus reset.
               | 
               | With that said, introducing a multi-second delay to
               | handle rare broken devices is sufficiently strange
               | _default behavior_ that I assume something else I 'm not
               | aware of is going on.
        
               | toast0 wrote:
               | > Surely they'd spin up as soon as the BIOS started
               | probing the bus to init the hardware, long before the
               | kernel was running, or they'd be infamous for being "the
               | HDDs you cannot boot an OS from"...
               | 
               | Depends on the market they sell into, might not be a big
               | deal for enterprise drives that you don't expect to boot
               | from; especially if you're attaching them to a controller
               | that you also don't expect to boot from. I had some giant
               | quad processor pentium II beast after it was retired, and
               | the BIOS could not boot from the fancy drive array at
               | all; and even if it could, the drives were on a staggered
               | startup system, so it would have had to wait forever for
               | that.
        
               | daneel_w wrote:
               | Contrived case. Such a storage solution would still not
               | be something that ahci(4) on OpenBSD - a basic SATA
               | controller driver - could wrangle, no matter how many
               | seconds it patiently waited.
        
       | [deleted]
        
       | chaxor wrote:
       | Man that's crazy. Think of all that extra time you'll have if
       | timsort was used! An extra 2ms to go on that walk while your
       | system is booting.
        
         | operator-name wrote:
         | I wanted to figure out the cost to all users.
         | 
         | BSD is sometimes used in some iot-like devices, of which there
         | are 3 billion. If I randomly assume 0.5% of those are running a
         | fork of FreeBSD, that's 15 million devices. I'm also going to
         | randomly assume that devices reboot once a month on average.
         | 
         | That all comes out to 4.2 days/year saved! Wow!
         | 
         | https://www.wolframalpha.com/input/?i=2ms+*+3+billion+*+0.5%...
        
         | pkaler wrote:
         | This is for FreeBSD Firecracker. This means this is used for
         | AWS Lambda and Fargate. A whole bunch of 2ms wins is how one
         | would improve cold start time.
        
         | noisy_boy wrote:
         | This is talking about Firecracker[0] which is for microvm
         | deployments so I would think it is the aggregate effect of
         | losing that 7% across a large number of spin-ups that would
         | make a difference. Occasional reboot on your personal machines
         | obviously doesn't make a dent.
         | 
         | [0]: https://firecracker-microvm.github.io/
        
           | vidarh wrote:
           | The lower the boot time, the more aggressively you can also
           | afford to shut them down when not in use, freeing up
           | capacity.
        
           | ilyt wrote:
           | I'd imagine if you wanted absolute boot time you'd just use a
           | container
        
       | 0xbadcafebee wrote:
       | Funny that this is what trends, and not the flame chart of the
       | different boot times of different FBSD versions:
       | https://twitter.com/cperciva/status/1659059445685420032
       | 
       | At a glance looks like 13.1-RELEASE booted 3x faster than
       | 11.1-RELEASE, and 14.0-CURRENT on Firecracker boots 45x faster
       | than 11.1-RELEASE on a c5.xlarge
        
       | aflag wrote:
       | Is there no link for the actual code?
        
         | JdeBP wrote:
         | It's FreeBSD. The source for base comes with the operating
         | system, in this particular case in
         | /usr/src/sys/kern/init_main.c .
        
           | canucker2016 wrote:
           | Starting around line 255 in init_main.c (or search for
           | bubble). see https://github.com/freebsd/freebsd-
           | src/blob/40b287054521f0a9...
        
             | jodrellblank wrote:
             | (btw If you click on the line numbers in Github, you get a
             | three-dots popup, click that and you can "Copy Permalink"
             | to get a link directly to the line which has #L255 on the
             | end, e.g. https://github.com/freebsd/freebsd-
             | src/blob/40b287054521f0a9... )
        
       | asveikau wrote:
       | One of the replies:
       | 
       | https://twitter.com/cperciva/status/1659577625289928705
       | 
       | So this bubble sort is just under 2ms.
       | 
       | I think it's cool to find and optimize that, he says he'll put in
       | a qsort and that's great. But 2ms? Probably not a huge deal.
        
       | gkbrk wrote:
       | Based on the kernel boot time (28 ms) given by the author, this
       | process takes 1.97 milliseconds.
       | 
       | I'd consider this less "haha dum code is slow" and more "you are
       | booting the kernel way too much, even a 1.97 millisecond delay is
       | noticable".
        
         | cperciva wrote:
         | 2.031 ms, I rounded down.
        
         | pjc50 wrote:
         | So that's a very impressive number, but how long does it take
         | to bring up userland?
        
           | loeg wrote:
           | Years (to be a little hyperbolic). FreeBSD still doesn't have
           | a good pid 1 story. It still has a very simple /sbin/init
           | that just spawns whatever shell script happens to live in
           | /etc/rc. That shell script essentially does a topology sort
           | of service dependencies and then starts each one individually
           | and serially.
        
         | lkbm wrote:
         | Probably true. My question was "how often is this run?"
         | 
         | I assume it's not _just_ once when I first turn on my laptop
         | from fully off. Is it once every time I restart a Docker
         | container? (Assuming FreeBSD images.)
         | 
         | Is it something that runs a couple times a day, a dozen times a
         | day, a thousand times a day? Fifty times every time I open an
         | Electron app?
         | 
         | I suspect this doesn't matter and is more a curiosity, but I
         | wouldn't be shocked to learn otherwise.
        
         | shpx wrote:
         | You lack the imagination to come up with a use for some
         | improvement off the cuff, that doesn't mean everyone else (some
         | of whom are more motivated) does as well.
         | 
         | Why do we even need to waste any consciousness-seconds on
         | watching electric circuits initialize? Computers should just
         | start working
        
         | Qweiuu wrote:
         | We now boot so often kernels this is important.
         | 
         | My tv boots a kernel, Alexa's, laptops, cars etc.
         | 
         | And this might be not specifically critical but it's a sign of
         | people just doing something less optimal
        
           | [deleted]
        
           | taeric wrote:
           | Do we? And... do we know that we can make it significantly
           | faster? To wit, will the code be bigger to do a smarter sort
           | such that that itself causes problems? Or, even with a faster
           | sort, is it still 1.6ms?
        
           | LeifCarrotson wrote:
           | Do you think those devices boot every time you send them an
           | input, or just when they're plugged in? Unless you're turning
           | the TV on and off thousands of times per day this would be
           | unnoticeable.
        
             | Qweiuu wrote:
             | I probably boot a few embedded devices per day.
             | 
             | If someone did not optimize this this might indicate other
             | optimizations
        
               | ilyt wrote:
               | Well, a microcontroller in a microwave is a bit different
               | than fully fledged POSIX system and both are "embedded
               | devices".
               | 
               | But yes, I also hate when home router takes forever to
               | reboot and reconnect
        
             | treesknees wrote:
             | Certain systems on a vehicle make sense to power off. Do
             | you really need the unit controlling your backup camera to
             | be booted into standby while the car is turned off?
        
               | LeifCarrotson wrote:
               | No, but I'm not taking thousands of six-inch family
               | vacations per day, I'd walk if I wanted to travel a
               | distance for which two milliseconds was noticeable.
        
               | treesknees wrote:
               | I think you're mixing (bad) analogies to make some type
               | of point but I'm not sure what it is.
               | 
               | The original comment you replied to mentioned cars in the
               | context of booting devices often, so I simply pointed out
               | that yes, actually it makes sense to boot up some devices
               | when they go to receive input and not to leave them in
               | standby as you suggested. In a vehicle that may sit
               | off/idle for several weeks, it's best not to drain the
               | battery unnecessarily.
        
         | rwmj wrote:
         | Sure but Firecracker is used for AWS lambdas, so you're going
         | to be booting those extremely frequently. (Now I wonder if
         | Amazon is using FreeBSD for AWS lambda ..?)
        
           | cperciva wrote:
           | AWS Lambda does not support FreeBSD.
           | 
           | It would be great if that changed though...
        
         | MrDOS wrote:
         | I think the "Firecracker" referenced in this Tweet is Amazon's
         | "microVM" environment[0]. VMs running in this environment are
         | sometimes booted to service scheduled/batch jobs, but are, I
         | think, more often launched in direct response to (and with the
         | intent of serving) an incoming HTTP request (and then stick
         | around for a few minutes afterwards in case there are any
         | subsequent requests). Network latency from the continental US
         | to the nearest AWS datacentre is typically <30ms, so even after
         | you add the response time of the application running in the VM,
         | an extra 2ms boot time still probably makes up >2% of the total
         | cold response time.
         | 
         | [0]: https://firecracker-microvm.github.io/
        
           | nerpderp82 wrote:
           | The problem here isn't the boot time, but the booting _at
           | all_!!
        
             | giantrobot wrote:
             | Having a microVM handle a rare endpoint and then exit is
             | often cheaper than leaving an EC2 instance running
             | constantly. An EC2 instance also has extra administration
             | needs. At some scales the savings can be significant.
             | 
             | Shaving ms off the boot of that VM is a performance boost
             | of that lambda (or whatever the cloud vendor calls it).
        
           | chongli wrote:
           | So wouldn't this fall under the category of "booting the
           | kernel too much"? Why not find a way to keep already-booted
           | kernels laying around and recycle them, rather than rebooting
           | them afresh on every request? This pattern has already played
           | out in other situations and been solved this way, such as
           | with object pools or thread pools.
        
             | NovemberWhiskey wrote:
             | This is the "AWS Lambda cold start" use case: you're not
             | doing this _every_ time, but you are doing it on scale-up,
             | initial deploy, etc. Depending on complexity, those costs
             | start to hit your p99 as well as your p100.
        
               | [deleted]
        
             | pavon wrote:
             | All that sounds like it adds _much_ more complexity than
             | swapping out bubble sort for qsort, and the other various
             | optimizations that have been going in to reduce boot time.
             | Plus optimizing common code improves boot time for everyone
             | running FreeBSD - routers, firewalls, etc, whereas pooling
             | just helps that one use case. If you can make it startup up
             | fast enough that you don 't need any of that extra
             | machinery, why wouldn't you?
        
           | trasz4 wrote:
           | This will also make a lot of difference once we can run
           | lightweight, userspace kernels ("rump kernels").
        
             | yjftsjthsd-h wrote:
             | > once we can run lightweight, userspace kernels ("rump
             | kernels").
             | 
             | Er, didn't NetBSD accomplish that... like a decade ago,
             | now? Or is FreeBSD looking into supporting rump kernels?
        
               | trasz4 wrote:
               | Yes, NetBSD did accomplish that, but I expect it to
               | become more commonplace.
        
               | EuropeOverlords wrote:
               | but question is, is this solution for people running
               | software on other peoples computers ? or solution for
               | running my own code on my own computer only ?
        
               | trasz4 wrote:
               | For my particular case it's about something similar in
               | purpose to sandboxing, but with providing the compartment
               | (ie a process subtree) with an alternative kernel to talk
               | to, to minimise the attack surface between that container
               | and the host kernel.
        
               | yjftsjthsd-h wrote:
               | Does it matter? Running kernel pieces in userspace is
               | useful in both directions (rather like docker), and in
               | both cases we'd like it to run as quickly as possible.
        
         | Paul-Craft wrote:
         | Yeah, and Ahmdal's law comes directly into play here. At 7% of
         | the total boot time, this bubble sort may be the slowest part
         | of the kernel boot. But, even if you optimized away the sort
         | entirely somehow, you could still only save 1.97 ms. And once
         | you do that, the next slowest thing probably doesn't take more
         | than ~1ms to run. Eventually, you get to a state that resists
         | these kinds of targeted optimizations, where there are no
         | bottlenecks, and the only improvements can come from new
         | algorithms and data structures, _i.e._ "uniformly slow code."
         | 
         | Of course, calling anything that takes place in less than a
         | blink of an eye "slow" is overstating things... lol :-)
         | 
         | If only I could get my desktop environment to come up in less
         | than 28ms :-)
         | 
         | https://en.wikipedia.org/wiki/Amdahl%27s_law
         | 
         | https://wiki.c2.com/?UniformlySlowCode
        
           | EuropeOverlords wrote:
           | exactly, it is not problem of making it slower / faster or
           | which is slower/ faster.
           | 
           | BUT
           | 
           | that we use solutions across multiple problems instead
           | providing solution specifically for that problem. if you have
           | KNOWN vm, on KNOWN hardware then why you need all kinds of
           | hardware and other bull.. initializations, checks.... ? just
           | provide list of things it has to have then after list is run
           | over without problem, continue. also why would
           | amazon/gcp/azure care ? they are making money from you
           | running inefficient solutions....
        
             | EuropeOverlords wrote:
             | main point of running unixy stuff is exactly that i can
             | mold my solution to specific problem EASILY. thats why it
             | is so common in embedded market ( routers, kiosks,drones,
             | car infotainment, TV... )
        
       | erdaniels wrote:
       | But how long on average does it take to boot in wall clock time?
        
         | Skunkleton wrote:
         | This is firecracker, so less than a second I guess?
        
         | cwillu wrote:
         | This is from a year old tweet so it may not be correct anymore,
         | but:
         | 
         | "FreeBSD/Firecracker now boots to a login prompt in 782 ms. 170
         | ms of this is spent generating a self-signed TLS certificate
         | for sendmail, because of course it is."
        
           | pjc50 wrote:
           | It's 2023, I don't think any system calling itself "minimal"
           | should have an MTA.
        
             | stephan-cr wrote:
             | Recently, I tested FreeBSD and it seems one can disable
             | sendmail. See https://docs.freebsd.org/en/books/handbook/bs
             | dinstall/#bsdin....
        
             | JdeBP wrote:
             | The notion that the system mails things to certain local
             | accounts is still quite strong in FreeBSD. In fairness,
             | it's a notion that has a long history in Unices in general;
             | so it isn't FreeBSD Think.
             | 
             | On the gripping hand, it's mad that it's still Sendmail,
             | although that is going to change. It's 2023, and the fact
             | that a FreeBSD system still has the traditional ~2000 lines
             | of indistinguishable-from-line-noise "DO NOT EDIT THIS"
             | stuff in /etc/mail/sendmail.cf, complete with whole bunches
             | of rules for UUCP, is rather amazing.
             | 
             | I wonder how many small FreeBSD systems out there are
             | spending their time double-bouncing cron-job mail that
             | cannot be delivered. (-:
        
           | yakubin wrote:
           | FreeBSD 14 is going to replace sendmail with DragonFlyBSD's
           | dma.
        
         | cperciva wrote:
         | In that particular case, 620 ms for the entire boot process. 28
         | ms for the kernel boot (of which 7% or 2 ms is the bubblesort).
        
           | senko wrote:
           | Looking from the outside, this is astonishing.
           | 
           | I suspect Firecracker is set up to be as quick to initialize
           | as possible, but still that a vanilla (ie not an optimized
           | fork) can do that ... My phone takes a two orders of
           | magnitude more!
           | 
           | I hope your BSDCan talk is getting recorded.
        
             | maccard wrote:
             | Your phone boots in 2.8 seconds? Mine is closer to 2.8
             | minutes.
        
               | senko wrote:
               | No, closer to 62 seconds (which is 100x the total boot
               | time, not just the kernel)
        
             | cperciva wrote:
             | Yes, talks are being recorded, should be up on YouTube in a
             | month or two depending on how busy the (volunteer) video
             | editors are.
        
         | throw0101c wrote:
         | > _The kernel boot (of which 7% is bubblesort) is 28 ms._
         | 
         | * https://twitter.com/cperciva/status/1659577625289928705
        
       ___________________________________________________________________
       (page generated 2023-05-19 23:01 UTC)