[HN Gopher] FreeBSD replaces bubblesort with mergesort on SYSINTs
___________________________________________________________________
FreeBSD replaces bubblesort with mergesort on SYSINTs
Author : gslin
Score : 321 points
Date : 2023-08-21 03:09 UTC (19 hours ago)
(HTM) web link (twitter.com)
(TXT) w3m dump (twitter.com)
| asicsp wrote:
| Related discussion: "FreeBSD spends 7% of its boot time running a
| bubblesort on its SYSINITs"
|
| https://news.ycombinator.com/item?id=36002574 _(381 points | 3
| months ago | 358 comments)_
| upon_drumhead wrote:
| Why not sort this in a precompile step and not sort the static
| list every boot?
| loeg wrote:
| No good reason. It would just require something fiddly with the
| linker / build process. This is good enough.
| cperciva wrote:
| Two reasons: First, it's not worth screwing with the linker to
| save 20 microseconds. Second, you need to merge lists anyway
| because kernel modules.
| JoshTriplett wrote:
| > First, it's not worth screwing with the linker to save 20
| microseconds.
|
| ...yet. Certainly not when boot is 28ms; might be worth it
| when boot is 1ms.
|
| > Second, you need to merge lists anyway because kernel
| modules.
|
| Does FreeBSD have the equivalent of a non-modular kernel
| build, for kernels where you know every piece of hardware
| it'll ever talk to?
| cperciva wrote:
| If I get the entire kernel boot down to 1 ms I might take
| another look at this. ;-)
|
| You absolutely can compile everything you need into the
| kernel -- in fact for Firecracker you have to since you
| don't have a boot loader handing you multiple kernel
| modules. But I needed to write the code to support more
| than just Firecracker.
| JoshTriplett wrote:
| > If I get the entire kernel boot down to 1 ms I might
| take another look at this. ;-)
|
| I look forward to the upcoming boot-time competitions
| between FreeBSD and Linux. :)
|
| > You absolutely can compile everything you need into the
| kernel -- in fact for Firecracker you have to since you
| don't have a boot loader handing you multiple kernel
| modules. But I needed to write the code to support more
| than just Firecracker.
|
| Absolutely. But it seems reasonable to have, for
| instance, link-time sorting support that _only_ works
          | when building a kernel that doesn't support kernel
| modules.
| cperciva wrote:
| Maybe. The tradeoff is that having two different ways of
| doing something makes it more likely that one will break
| without anyone noticing.
| JoshTriplett wrote:
| Valid. Especially in early boot.
|
| Another possibility: _always_ use link-time sorting for
| the kernel, and for individual modules, and then do
| runtime merging if and only if you have kernel modules.
| That way, the link-time sorting path is _always_ tested,
| and the runtime merging is tested anytime you have kernel
| modules (which sounds like the more common case).
|
| (All of this, though, is gated under "is it worth saving
| the microseconds yet".)
| gizmo686 wrote:
| Why not boot as a compilation step, and save a memory instance
| to be loaded on actual boots.
| midoridensha wrote:
| Probably because that would assume far too much about the
| state of the hardware.
| timmaxw wrote:
| For people (like me) who are wondering why a kernel needs to boot
| in under 28ms: It's for virtual machines that get launched on-
| demand in services like AWS Lambda.
| https://www.daemonology.net/blog/2022-10-18-FreeBSD-Firecrac...
| contrarian1234 wrote:
    | Very naive question... But why can't you "just" memcopy and
| exec an image of an already booted system?
| crest wrote:
| You sort of can, but drivers have to be in sync with the
| (emulated) devices they're controlling. The fastest way to go
| about this would be to snapshot an already booted virtual
| system. It could be as simple as instructing the restored
| snapshot to reseed its RNG and (re-)attach to the network,
      | but you'll probably run into a few more assumptions broken
      | by cloning multiple VMs from the same snapshot. If pre-booting
| and suspending/resuming each individual VM is good enough it
| could look similar to a pre-forking HTTP server like old
| Apache versions.
| Shorel wrote:
| Yeah, for real hardware it is important to be flexible.
|
| But these virtual machines can all be exactly the same,
| except for the network MAC and IP.
|
| It could even be possible to compile executables that
| bypass the need for a kernel, just have some libraries with
| the required stuff, and boot to the program, MS-DOS style.
| Terretta wrote:
| > _could even be possible to compile executables that
| bypass the need for a kernel, just have some libraries
| with the required stuff, and boot to the program_
|
| Or consider: https://github.com/uniqernel/awesome-
| unikernels
| throwawaylinux wrote:
| Simplistic answer is because hardware and firmware
| environment gets configured during boot, even for virtual
| machines.
|
| Some virtual machine environments do have facilities to
| snapshot the entire state of a machine and resume it
| (potentially on another machine), but if you look at what is
| involved with snapshot metadata it's more than just the state
| of memory, you have CPU and device state as well.
| PMunch wrote:
| Wouldn't hibernating the machine and then copying the
| hibernation data to another machine do the trick? Sure it
| wouldn't be as instant, but it would skip most of the boot
| configuration.
| throwawaylinux wrote:
| Conceptually at a high level yes (if the hardware was
| ~identical). Resuming from hibernation on the same
          | machine is basically the "load an image into memory and
          | execute it" that OP asked about. But in either the
          | resume or hypothetical hibernation
| migration cases, you can't just start running where the
| previous boot left off, you have to go through
| hibernation resume functions which basically do the same
| thing as boot in terms of configuring hardware.
|
| On virtual machines there probably isn't nearly so much
| real hardware to set up, and hypervisors can optimize
| virtual device setup. The machine image could also be
| provided via COW snapshot of memory. So I don't know,
| possibly at some corners it could be faster to do a
| resume type operation than a full boot. I haven't played
| with hypervisor guest boot at these kinds of speeds
| before.
| Angostura wrote:
| Possibly a bit fragile if there is any situation where
| one day the other machine isn't entirely identical? (just
| hand-waving here)
| j16sdiz wrote:
| The usual culprit is CPU feature flags on different CPU
| generations.
| LeifCarrotson wrote:
              | Well then wouldn't you just have one image file for each
              | CPU generation in your datacenter? Or better yet, have
| one image file on the hypervisor for each individual
| machine! That machine is probably dedicated to Lambda
| instances and boots them 1000x per day, if you're
| spending valuable time that's actually customer-facing
| latency on each call to reboot the system then switch
| that memory<->time tradeoff back towards memory!
| tinus_hn wrote:
| There's things like encryption keys and tokens that need to
| be unique so it's not easy, if you do it like that you would
| run into new, difficult to predict vulnerabilities.
|
| So it's difficult and dangerous for not a lot of gain.
| Perhaps an easier approach would be to boot a few spare
| systems in advance and hibernate them so they can be booted
| quickly.
| abwizz wrote:
| meanwhile our new dishwasher takes between two and four SECONDS
| from pressing the power button to showing something on the
| display
| moring wrote:
| But you still bought this one and not one of the alternatives
| that start faster... which is part of the reason things get
| built this way.
| Narretz wrote:
| Unfortunately stuff like boot time / time to interactive
| isn't in the specifications. You have to check user
| reviews, and not everybody cares about this.
| abwizz wrote:
| came with the house
|
| it's inferior to our old one, which was from the 90s and
| still worked, in other aspects too, most annoyingly
        | cleaning performance. i guess it uses less water though
| maccard wrote:
| I don't think I've ever read a dishwasher review that talks
| about time to interactivity. How do I know which of the 242
        | dishwashers available[0] at my online retailer starts in
| under a second?
|
| [0] https://ao.com/l/standard_dishwashers/1/21-23/
| vladvasiliu wrote:
| I doubt this is an absolute requirement. Would I choose the
| faster booting dishwasher among a set of otherwise
| identical ones? Sure. Would I pay more for it to boot
| faster? It depends. EUR10? Sure. 100? Nah.
|
| Plus, as the sibling poster says, you usually don't know
| this before you first start to use it, which means it's
| already been delivered, connected, and everything.
| Shaanie wrote:
| Would you really pay $10 for your dishwasher to boot in
| 1s instead of 4s? I absolutely wouldn't. _Maybe_ $1, but
| even that is pushing it.
| jusssi wrote:
| For something that's going to annoy you daily for the
| next 10+ years? I'd definitely pay $10. I'd pay another
| $10 for the dishwasher to remember the program selection
| instead of always offering the 4+ hours eco program
| (which I believe is required by some regulation to be the
| default).
|
| I already paid $100 more for 2dB less, according to the
| spec. No regrets.
| vladvasiliu wrote:
| I'd say it depends on how long it takes to boot up.
| Fortunately, my dishwasher boots up instantly. If it
| takes a few seconds, sure it's annoying, but you can work
| around that. Say press the on button, then check the
| detergent is in place, etc, then press start. If it's
| some ridiculous length, like several minutes, I'd
| probably only buy that crap at a steep discount.
|
| > which I believe is required by some regulation to be
| the default
|
| Huh. This must be recent-ish. My parents' dishwasher has
| a rotary button to choose the program, which stays put as
| long as the little grandchildren don't mess with it..
| jusssi wrote:
| Found it: https://eur-lex.europa.eu/legal-
| content/EN/TXT/?uri=uriserv:...
|
| > From 1 March 2021, household dishwashers shall provide
| an eco programme meeting the following requirements:
|
| > (a) this programme shall be: [...] set as the default
| programme for household dishwashers equipped with
| automatic programme selection or any function maintaining
| the selection of a programme, or, if there is no
| automatic programme selection, available for direct
| selection without the need for any other selection such
| as a specific temperature or load;
|
| Ours was bought in late 2020, but manufacturers probably
| take a head start with implementation whenever they renew
| their models.
| bluGill wrote:
| If consumers cared it wouldn't cost anything. They would
| just do it.
| atmosx wrote:
| There was an article about a Linux embedded system handling a
| car back-camera that had to boot within 3-5 seconds or
| something along those lines.
| mongol wrote:
| Yes I remember being surprised that Linux containers were
| used for that purpose.
| srcreigh wrote:
| Very OT, but I concur. I need a DSP for my car audio system.
| I considered a rpi3 with a DAC attachment.
|
| I tried the raspbian OS with some moderate overclocking and
| removing unneeded startup processes. Couldn't get it under 25
| seconds for boot. Kind of a deal breaker (remote start unit
| could help, but...)
|
| Apparently somebody can start the rpi3 in <3 seconds. I'll
| believe it when I try their image myself. Their setup
| disables USB and networking (maybe audio too who knows) which
| is a giant pain for configuring my system.
|
| Maybe I'll try again with an RPi 4 and usb3 nvme. or maybe
| I'll just pay hundreds of dollars for a dedicated DSP unit.
|
| In case anyone cares, my main use case is parametric EQ for
| phon adjusted low bass frequencies. 90dB 20Hz sounds as loud
| as 70dB 50Hz (give or take) and subwoofers are generally
| quieter at lower frequencies. Hence, audio processing unit to
| cut as you go higher than your lowest decently loud
| frequency.
|
| The lowest cost option for this is about $230USD plus some
| electrical components (miniDSP 2x4HD). I figure the rpi would
| cost around $100 including the DAC unit, open source audio
| software, just need some power source to link the battery to
| the rpi power cord.
|
| Edit: I don't want to spam thanks comments for helpful
| replies, so I'll just say it here (possibly in advance):
| thanks!!!
| JeffeFawkes wrote:
| If you make your own buildroot image, you can get VERY
| quick boot times - here's a random vid I found [1] with 3.5
| seconds on a pi zero w (which is considerably less powerful
        | than a Pi3).
|
| I've worked with similarly underpowered boards (licheepi
| zero, only has 64mb ram and uses the cpu from a dashcam!)
| and those will still boot and start an SDL app in under
| five seconds.
|
| If you can identify the bare minimum that you need for your
| EQ solution, you can probably get similar boot times.
|
| [1]: https://www.youtube.com/watch?v=ZehLyumyMvE
| kragen wrote:
| i feel like maybe 3.5 seconds is not VERY quick compared
| to 28 milliseconds?
| yjftsjthsd-h wrote:
| 28ms is a VM; in hardware you can't get through firmware
| in 28ms. Apples to oranges.
| cinntaile wrote:
| The use case is different so the comparison doesn't
| really matter.
| kragen wrote:
| 3.5 seconds might be tolerable for your car radio i guess
| red0point wrote:
| It doesn't always have to be a RPi, sometimes there is
| dedicated hardware for that use case that is also kinda
| cheap.
|
| Check this thing out, it's a programmable audio DSP:
|
| https://www.thomannmusic.ch/the_t.racks_dsp_4x4_mini.htm
| ju-st wrote:
| You could use a ADAU1701 DSP board (cheaper and shittier
| minidsp) or do it directly on a modern/fast
| microcontroller, e.g. ESP32. I used an ESP32 to adjust
| frequency response of a small diy loudspeaker (and added
| Bluetooth).
| [deleted]
| pid-1 wrote:
| Search ebay etc... for Texas Instruments DSP evaluation
| boards.
| SomeoneFromCA wrote:
| Can't you just run on bare metal on rpi? Also a resistor, a
        | capacitor and a relay would work to filter out the extra
| bass.
| jakear wrote:
| Wild that people have accepted a 5000ms delay between
| starting their car and seeing behind them.
|
| I suppose I'm more sensitive to downtime than most - I
| recently (unsuccessfully) tried to add a ton of capacitance
| into my head unit's power supply and acc-signal lines to keep
| it from turning off for a few seconds when starting the
| engine.
| chii wrote:
| probably because it started off as a 1000ms delay, which
| they begrudgingly accepted. Then it started to slow to
| 1100ms....
| lukevp wrote:
| What was unsuccessful? I did this in a 2012 Miata with a
| pioneer head unit (capacitor on the Acc line so it
| discharges while cranking, which keeps the head unit from
| turning off). Has a fused line directly to the battery for
| power. It's not a huge capacitor either... like a 1500 uF
| or so? Just tried a couple diff cap values until it had
| around 5 secs of charge to hold the line open. It also
| makes it so the music doesn't cut off immediately when the
| car is turned off as well... it gives it a few seconds to
| shut down. Feels more premium.
| jakear wrote:
| Well if I knew for sure why it didn't work I'd fix it ;)
|
          | I _think_ it works better now than before, but it's hard
| to be sure because I neglected to measure the downtime
| before/after the modifications. Essentially I had a
| cap/diode to keep the acc line high during startup for
| the head unit, but it would still cut off due to the
| power supply voltage dropping to ~10-11V on the engine
| turning over (4.8L V8). So then I added a cap/diode to
| the power supply line, and it works
| "better...ish...maybe..." now: there are some times when
| it will stay on across startup, but not always. Perhaps
| with a larger cap on the PS line (I'm using ~2500uF), but
| it seemed to be discharging much faster than I'd expect,
          | as if the diode wasn't doing its thing and the line was
| effectively just at the system's general "+12V". I
| wondered if there was perhaps some way the power was
| escaping the diode/cap isolation and making it back into
| the main system, but I didn't have a good way to test or
| fix that (I previously diagnosed a similar issue to a bad
| diode, but this one seemed to measure fine).
|
| And besides all that, the amp is on a separate line in
| the way back of the car that I didn't bother to tear into
| because the head unit turns off in the critical stage
| anyways. So the head unit does stay on for a couple
| seconds after I turn if off completely, but since the amp
| is off it doesn't matter.
| cwalv wrote:
| Up until a couple years ago my car didn't even have a
| camera. We just looked out the window, which is still an
| option.
| Kiro wrote:
          | I'd rather have no car at all than go back to that.
| usr1106 wrote:
| That would be better for the planet anyway.
|
| Disclaimer: Not a joke. I've never owned a car.
| Kiro wrote:
| I agree and in modern countries with proper
| transportation there is no reason to own a car.
| oblio wrote:
| Well, there is a reason, but you won't like it.
|
| Convenience :-)
| Delk wrote:
| If you have e.g. outdoors activities in sparsely
| populated areas as a hobby, having a car gives you a lot
| more freedom than being dependent on public transport.
| Buses may work if you're only going to locations that are
| popular enough but that may limit your options. Of course
| you could also rent but that could become expensive or
| cumbersome if you do it often enough.
|
| Also, if you live in a sparsely populated area, public
| transport with good coverage often isn't really
| economically (or ecologically) feasible, even in
| countries that have good public transport in cities. A
| bicycle might be an option if distances aren't too great
| but relying on it exclusively probably takes more than an
| average person is willing to take on. Weather conditions
| could be poor, you might regularly need to transport
| groceries for an entire family, etc. All of that is
| entirely doable within a few miles in a bikeable city but
| it's potentially a lot less so if distances are greater.
|
| I've never owned a car, I bicycle some thousands of
| kilometres per year (probably more than I drive most
| years), and I wish people didn't automatically consider
| owning a car a necessity even when it actually isn't. But
| I can definitely see reasons for owning one.
| Sharlin wrote:
| A large majority of cars in traffic don't have a back
| camera. Either because they're old or because they're
| cheap (I don't think a back camera is usually a standard
| feature at least where I'm from...)
| kalleboo wrote:
| In the US a backup camera is now required by law
| Someone wrote:
| For new cars, I guess.
|
| In the EU, it's similar.
| https://ec.europa.eu/info/law/better-regulation/have-
| your-sa...:
|
| _All new vehicles sold from May 2022 must be fitted with
| advanced safety features, including:
|
| - monitors that detect when a driver has become drowsy or
| distracted
|
| - emergency stop signal to help prevent rear-end
| collisions
|
| - rear-view camera or parking sensors
|
| - alcohol interlock system to prevent drunk driving.
|
| These systems will help reduce serious accidents on
| Europe's roads._
| jusssi wrote:
| It seems that the only part that got implemented out of
| that draft was drowsiness/inattention detection. At least
| that's what I read from the document in "Commission
| adoption" section of that page.
| L3viathan wrote:
| > alcohol interlock system to prevent drunk driving.
|
| Wait, what? _All_ new vehicles need to have this? Or do
| they just need to be constructed such that one can be
| installed easily?
| oblio wrote:
| Looks like EU law, so considering the wording, probably
| every new car.
|
| Also, see:
|
| https://road-safety.transport.ec.europa.eu/statistics-
| and-an...
|
| Edit:
|
| The actual text:
|
| > Article 6 of Regulation (EU) 2019/2144 of the European
| Parliament and of the Council requires motor vehicles of
| categories M and N to be equipped with certain advanced
| vehicle systems, including driver drowsiness and
| attention warning systems. It lays down in its Annex II
| basic requirements for the type-approval of motor
| vehicles with regard to the driver drowsiness and
| attention warning systems.
| martin8412 wrote:
| The car must be built to allow for the installation of a
| breathalyser, i.e. some standardised interface. Doesn't
| really affect you unless you're ordered by a court to use
| one.
| plorkyeran wrote:
| It is a much worse option than it used to be. Visibility
| out of the rear windows from the driver's seat has gotten
| worse over the years.
| usr1106 wrote:
| So industrial design with poor usability gets
| complemented by bad electronic devices. Progress?
| harry8 wrote:
| what are they getting for the poor visibility tradeoff
| compensated by electronics? I have no idea. If it's
| safety, then yes, that would be progress.
| SideburnsOfDoom wrote:
| Safety, and aerodynamics (i.e. energy efficiency).
|
| Dismissing it as "design with poor usability" is very
| facile. It's trade-offs between different goals, all the
| way down.
| kortilla wrote:
| Or it's just cheaper to build
| SideburnsOfDoom wrote:
| Cost is yet another trade-off.
|
| Those pillars that hold the roof up are safer if they're
| sturdy, but sturdy means wider and that obstructs the
| driver's field of vision a bit, and weighs more. "strong
| and slender" is possible, but it costs more. It all
| inter-relates.
|
| (strong, narrow, cheap), pick any 2.
|
| There does come a point where you start to appreciate
| what Polestar is doing on one of next years models (the
| Polestar 4): give up on the idea and change direction; no
| rear window at all, just a camera.
| tjoff wrote:
| The failure mode is pretty horrible though.
|
| And I'm not only thinking about damaged camera. Humidity,
| weather etc. often makes these cameras completely
| useless.
|
| You'd have to have wipers for your camera-lens and
| probably heating or something as well to even approach
| the utility of a window - short term.
|
| Because none of that is ever going to fail, and it isn't
| cheap either.
| varjag wrote:
| Lens washdown is rather common now on vehicles.
| devit wrote:
| Trucks drive just fine with no rear view at all other
| than side mirrors, so cars can do that as well.
| xmcqdpt2 wrote:
| On the highway, sure. But generally whenever I see a
| truck back up in the city, there will be someone outside
| directing the driver.
|
| You don't see lots of truck parking in parallel
| unassisted on streets with pedestrians and cyclists, but
| I do this all the time.
| nottorp wrote:
| Hmm? They specifically taught me in driving school how to
| parallel park using just the side mirrors.
| oblio wrote:
| Truck drivers are pros, though, while we're just annoying
| traffic jam fodder.
| tharkun__ wrote:
| They can't see what they can't see. Being a pro makes no
| difference.
|
                | Reminds me of the 18-wheeler at the traffic light. He got
                | a red while waiting for his left turn and decided to back
                | up instead of "going for it" once oncoming traffic had
                | ceased.
|
| The minivan behind him got crushed like it was made of
| sticks. Luckily he heard the noise after a while and
| stopped backing up. Not a pretty sight. I had the side
| view so I was like "WTF is he doing?" when he started
| backing up. Well he couldn't the minivan who had snuck up
| behind him.
|
| Lesson: when behind a truck make sure you can see the
                | driver's face through his mirrors. Then you have a chance
| that he is going to see you. If you can't see his face,
| assume that he can't see you. Move away.
| SideburnsOfDoom wrote:
| I don't think that I said that the Polestar 4's lack of
| rear windows is a slam-dunk sure-fire hit. Rather just
| that we can appreciate the attempt to side-step the
| issues with rear-window visibility with a very different
| design.
|
| It is IMHO a bold move, which will attract fans and
| haters. Neither of us know for sure how it will play out;
| if the potential issues outweigh the rest. The
| implementation details clearly matter. In a couple of
| years we will see if it's a genius innovation, a terrible
| misstep or somewhere in-between. Until the vehicle ships
| and is on the road for some time in some volume, we won't
| know for sure.
| tjoff wrote:
| While I agree I kind of think this should be nailed down
| before such a departure.
|
| After all, on a high-end vehicle a back-camera that
| hasn't sorted this is just a gimmick.
|
| Maybe they have, all my experiences indicate otherwise
| though I don't have much experience with new expensive
| cars.
| astrange wrote:
| Americans don't care about the prices of their cars and
| neither do American safety regulations.
| xmcqdpt2 wrote:
| This is correct. It's easy credit all the way down. Cheap
| cars (Chevy Spark, Mitsubishi Mirage, Kia Rio) are
| basically all slowly being discontinued in NA, meanwhile
| luxury pickups sell like hot cakes.
| ilyt wrote:
| It's safety. Same with thicker pillars which are worse
| for visibility but made thicker for rollover protection.
| unethical_ban wrote:
| So cars get safer through more competent structural
| design and are enhanced by reliable camera systems that
| allow for better visibility than craning ones neck around
| 120 degrees.
| oblio wrote:
| > So industrial design with poor usability gets
| complemented by bad electronic devices. Progress?
|
| https://en.wiktionary.org/wiki/Chesterton%27s_fence
|
| It's for safety in case of roll-over and for better
| aerodynamics due to increased emissions standards and in
| the case of EVs, increasing the range.
| bbojan wrote:
| You don't see small kids right behind your car this way.
| I believe this was the main reason why rear-view cameras
| were made mandatory on new cars.
| Shorel wrote:
| Also, because the cars now are gigantic, compared with
| old cars.
|
| Even the Mini Cooper is huge now.
| kevin_thibedeau wrote:
| I wouldn't be surprised if the Mini Cooper is now a light
| truck. Everything is larger because manufacturers are
| gaming the footprint regulations to get by with worse
| emissions and DOT isn't serious about revamping CAFE to
| stop that behavior.
| wmil wrote:
| You're supposed to let your car idle for 30 seconds after
| startup to let the engine lubricate. It's more important in
| cold climates.
|
| So delayed startups aren't some new issue.
| dekhn wrote:
| Probably not necessary in modern cars, whose
| electronically controlled engines have many internal
| details not obvious to typical drivers.
|
| If you believe your idea is true, can you tell me what I
          | can expect to see if I _don't_ do this? I.e., I've never
| done this and I have cars that last 10+ years without any
| engine trouble.
| hnlmorg wrote:
| It's probably less than 5 seconds in perceived terms. Eg
| the car gets power before the engine turns over, and you
| then need to select reverse after the engine is running.
| And you'd need to check your blind spots before checking
| the camera too. So by the time the driver is ready to check
| the camera, it might feel like a much smaller delay or even
| no delay at all.
|
| The delay that I notice the most is the time it takes to
| connect to my phone via Bluetooth (usually so I can play
| music from it). I'm often already pulling out of my parking
| space before it connects.
| maccard wrote:
| > Eg the car gets power before the engine turns over, and
| you then need to select reverse after the engine is
| running.
|
| I can change gears in well under a second when moving. My
| car requires the clutch to be depressed when the car
          | starts, meaning that I can actually have the car started
| in reverse in under a second. Waiting another 4 seconds
| is crazy.
| ilyt wrote:
| I mean, I back into the parking spot instead of out of it
| most of the time so it wouldn't bother me.
|
| Also my usual sequence is "start the engine, set up on
| phone (music, GPS), leave", so it would've booted before I
| need it.
| selimnairb wrote:
| 3-5 seconds sounds like the latency I see in my 2018 Toyota's
| garbage Entune 2.0 system. It's criminal that things like
| this are released for general usage.
| hwd wrote:
| There is an interesting talk from BMW how they did it for
| their cars: https://youtu.be/_cSTBiwY7HE.
| rfmoz wrote:
      | With a custom kernel and without initrd/initramfs it's
      | possible to boot in around 1 second.
| rascul wrote:
| Might be quicker if you put everything you need into an
| initramfs that's built into the kernel. Then there's no
| slow storage needed once the kernel is loaded.
| CalChris wrote:
| My 2015 Model S would lose badly in a quarter mile against a
| good cyclist if you included cold boot time. Worst bank
| robbery car ever.
| ciupicri wrote:
| "Linux Kernel Fastboot On the Way", Linux Plumbers Conference
| 2019, https://www.linuxplumbersconf.org/event/4/contributions
| /281/... / https://www.youtube.com/watch?v=A7N_O8pnyTw
| tiffanyh wrote:
| iOS ~13 seconds to boot
|
| FreeBSD ~10 seconds to boot
|
| As a comparison point, my iPhone 14 Pro (latest phone) running
      | iOS 16.6 (latest OS) boots in 12.7 seconds from when the Apple
      | logo is displayed to when you can see the wallpaper.
| rf15 wrote:
| Does anyone know why they used bubblesort in the first place? It
| is known for its bad performance, so what other factors came into
| play? Number of ops, maybe?
| phoenixreader wrote:
| According to the Twitter thread linked, at the beginning there
| were only 30 items, so using bubblesort didn't matter. The
| number of items has since grown to a thousand.
| professoretc wrote:
| For 30 items bubblesort might very well have been faster than
| mergesort.
| jeltz wrote:
| But it always loses to insertion sort.
| harry8 wrote:
| Bubble/selection sort with early exit can win outright on
| big lists if they're already almost in order. Textbook
| example years ago was cheque numbers which mostly are
| presented in order.
|
| Know your data. (Which in this case, i don't).
| bongobingo1 wrote:
| non xitter mirror:
| https://nitter.net/cperciva/status/1693127769901969772
| chaxor wrote:
| "This is good for LLMs"
| benreesman wrote:
| For everyone challenging Colin on this, by all means, but do
| remember that it's Colin.
|
| If anyone misses less I can't think of them.
| [deleted]
| tiffanyh wrote:
| "1 files changed, 86 insertions, 91 deletions"
|
  | Net reduction of 5 LOC and it's "100x faster".
|
| Nice commit.
| ilyt wrote:
  | You'd think some algorithms would be considered a nearly
  | always wrong solution for the problem and so never used aside
  | from very specific use cases (like tiny microcontrollers), but
  | here we are.
| chris_wot wrote:
| 100x speed on what though? bootup time?
| dpwm wrote:
| > When the FreeBSD kernel boots in Firecracker (1 CPU, 128 MB
| RAM), it now spends 7% of its time running a bubblesort on its
| SYSINITs.
|
| It's a 100x speed increase on 7% of the boot time, so it should
| be approximately a 7% decrease in boot time.
| torstenvl wrote:
| EDIT: I misread--please disregard the below comment.
|
| _3.5% decrease.
|
| A 100% increase in speed is a 50% reduction in time.
|
| E.g., traveling 60km at 30km/h takes two hours, but at 60km/h
| takes one hour._
| dharmab wrote:
| A 100x increase is a 9900% increase, not a 100% increase.
| loeg wrote:
| You're misreading 100x as 100%. 100x is 10,000%.
| [deleted]
| Finn_ wrote:
| 100x not 100%. 3,000 km/h takes 1.2 minutes to go 60km.
| Blikkentrekker wrote:
| Okay, but what exactly is 100 times faster when measured? Simply
| the sorting algorithm? If it not be a bottleneck to something
| then it doesn't matter much.
|
| I assume the entire performance of FreeBSD hasn't improved a
| hundredfold by this small change.
| wkat4242 wrote:
| On the sorting of SYSINIT (not sysint as the title says) calls
| during bootup. The actual speedup is only a couple milliseconds
| but that means the old method was in the order of hundreds of
| milliseconds. Which is indeed a lot for a sorting operation.
| Crestwave wrote:
| The author mentions in the thread that the boot time they
| measured on the device was 28 ms and 7% of it (2 ms) was spent
| sorting.
| reanimus wrote:
| They explain this further on in the Twitter thread. It's the
| portion of time spent sorting the array of system calls during
| boot. This didn't used to matter when it took longer than a
| second to boot up, but now that the boot times are in the sub-
| second range, the amount of time spent sorting the array of
| system calls ended up taking up a notable chunk of the time
| spent booting.
|
| Might not matter for the things you care about, but some people
| certainly do care about boot performance.
| paulddraper wrote:
| 100x faster on 7% of Firecracker boot
|
| So ~5ms speedup
| im3w1l wrote:
| Why do you boot the system anyway? Wouldn't it be faster to dump
| the memory of an already booted system and just read that in?
| bux93 wrote:
| That's what windows fast startup does, it starts from a post-
| boot hibernated state of RAM. Linux mint on the same laptop
| boots fully in less time.
| hakfoo wrote:
| That paradigm used to be a thing for things like home-computer
| games-- the game might take 5 minutes for an anemic Commodore
| 1541 drive to load, so you'd use a "freeze" cartridge that
| snapshotted RAM after the load finished and produced something
| smaller; I suspect a major side appeal was that you bypassed
| copy-protection that involved the loading process.
|
| On a more complex system, with external devices active at all
| times, this gets a lot harder. It would be easy to take a
| snapshot at the "wrong" time when something is deadlocked on a
| timer or external device that won't be in an appropriate state
| at the next power on. A "boot" process ensures everything
| attached has been roused from its slumber and brought back
| to a known state.
|
| OTOH, I could imagine if you were designing for a specific
| enough use case, you could design a system that relied on
| minimal support devices and a CPU that bit-banged almost
| everything, so you knew that if you were in the idle loop,
| everything was safe to snapshot.
| astrange wrote:
| Would be insecure because nothing would be randomized, all your
| systems would have the same ASLR slide, etc.
| RetroTechie wrote:
| Would it?
|
| Read in small amount of code (or execute directly from flash) &
| use that to initialize hardware, might just be faster than read
| memory dump of already-initialized system.
|
| Also the "read memory dump" method would include code & data
| structures which were used on a previous run, but may not be
| needed for the next run (or only much later). And not re-
| initialize hardware which may have gone flaky, or changed
| configuration in the meanwhile (like USB port with different
| device plugged vs. state that memory dump reflects).
|
| Rebooting is just an all around cleaner method to return system
| to a known state. But of course it all depends on hardware
| specifics & what type(s) of initialization is done.
| im3w1l wrote:
| The context here is starting a vm really fast on aws
| firecracker, so those concerns about hardware don't apply.
| Attummm wrote:
| Well, it's a bit of an embarrassing moment.
|
| Bubblesort has been controversial in many CS programs. That
| memo even reached the President ten years ago.
|
| [0]https://youtu.be/koMpGeZpu4Q
| eitland wrote:
| As two persons have already asked:
|
| > When the FreeBSD kernel boots in Firecracker (1 CPU, 128 MB
| RAM), it now spends 7% of its time running a bubblesort on its
| SYSINITs.
|
| > O(N^2) can bite hard when you're sorting over a thousand items.
| Time to replace the bubblesort with something faster.
| _3u10 wrote:
| Boot time was reduced by 2ms, from 28ms to 26ms. Hardly 100x
| speed.
| cperciva wrote:
| The mergesort is roughly 100x faster than the bubblesort was. I
| made no claim in that tweet about the overall speedup.
|
| (But since you mention it, when I started working on speeding
| up the boot process, the kernel took about 10 seconds to boot,
| so I have a kernel booting about 400x faster now than I did a
| few years ago.)
| butterisgood wrote:
| I'd purchase a simpler, easier to debug algorithm over a
| complicated one for 2ms. But that's me...
|
| The overall speedups are still quite impressive - but for
| "thousands" of data items a simple O(n^2) algorithm seems
| fine, doesn't it?
| _3u10 wrote:
| Totally agree but the title makes it sound like it's 100x at
| init. I know you didn't write the title...
| MBCook wrote:
| Way to go.
|
| Do you know how fast Linux starts up? I found the number
| 125ms but that's a 4 year old number.
|
| Just curious how similar they are. This is the first I've
| heard of Firecracker.
| cperciva wrote:
| I believe Linux is at 75-80 ms for the same environment
| where I have FreeBSD booting in 25 ms. That comes from a
| Firecracker benchmark which I _think_ is an apples-to-
| apples comparison with what I 'm measuring in FreeBSD, but
| I'm not 100% certain. Once FreeBSD support is merged into
| mainline Firecracker (it's currently in a feature branch)
| I'm going to see if I can apply their performance tests to
| FreeBSD.
| dataflow wrote:
| Why does _any_ sorting _anywhere_ use bubble sort at all?
| Aren 't there sorts out there that are strictly better?
| cperciva wrote:
| Options for sorting algorithms are severely limited when
| (a) you can't use malloc, (b) you don't have enough stack
| to recurse, and (c) you want a stable sort.
|
| (The first two limitations are because this happens _very_
| early in the boot process.)
| dataflow wrote:
| Thanks! I completely forgot about the malloc/in-place
| aspect.
| aidenn0 wrote:
| Doesn't mergesort require malloc?
| cperciva wrote:
| There are in-place array mergesorts. (Nobody uses them
| because they're horribly complicated, though.)
|
| In this case however I'm putting all of the SYSINITs onto
| a linked list; mergesorting a linked list is easy.
| magicalhippo wrote:
| Why the harsh stack limit? I assume this precludes a
| fixed-size array.
| ahoka wrote:
| I'm not sure when this sort happens, but early boot may
| not have a proper virtual memory implementation yet, so
| the kernel has to use a fixed stack size allocated by the
| boot loader.
| kevans91 wrote:
| Right, in this case specifically we're running off a
| limited bootstack allocated in the kernel's .bss
| (somewhere between 2 to 6 pages, generally); we won't
| finish initializing VM until shortly after this sort is
| done (in some of the SYSINITs that we're sorting).
| knome wrote:
| sure, but bubble sort is trivial to write. they may have
| figured they'd replace it when they needed to. they may
| have figured the particular usage was so trivial they would
| never need to.
|
| grabbed the source and git blamed the file before the patch
| to see when the original code was from.
|     94e9d7c12 (Peter Wemm 1998-10-09 23:42:47 +0000 157)  * The sysinit table itself.  Items are checked off as the are run.
|     94e9d7c12 (Peter Wemm 1998-10-09 23:42:47 +0000 158)  * If we want to register new sysinit types, add them to newsysinit.
|
| So the section was written in 98 and had some bits modified
| in 2001.
|
| The time wasted during boot here was probably irrelevant
| next to a hundred other things the project has worked on
| since.
|
| the bubble sort was good enough that 25 years later they
| replaced it to save only 2ms during boot.
|
| seems reasonable to me.
| naniwaduni wrote:
| You'd still think it'd be easier to write an insertion
| sort though, even accidentally! It's _almost_ the same
| algorithm conceptually, except cutting out an odd
| pessimization.
| ynik wrote:
| Bubble sort is optimal for sorting data stored on tapes (no
| random access possible). If you have RAM, insertion sort is
| strictly superior (both simpler code + better performance).
| dataflow wrote:
| Fascinating, thanks!
| Someone wrote:
| On tapes (plural >= 3), isn't merge sort the way to go?
|
| I just coarsely checked TAOCP volume 3, and didn't see
| bubble sort mentioned in the chapter on external sorting.
| _3u10 wrote:
| Because bubble sort is often faster for small sorts and
| it's easy to write. This is the kernel, there's no libc you
| have to write the sort algo yourself.
| IshKebab wrote:
| It's really because C doesn't have a proper dependency
| management system like modern languages do. If it did
| there's no reason you couldn't import a third party
| sorting algorithm in the kernel. You can do that in Rust
| for example, although you would just use the built in
| sort in this case.
| throwawaylinux wrote:
| It's not because of that. There's plenty of BSD-compatible
| code out there, and much of it gets taken and used in
| projects like this, CRC code being an obvious example.
| There is no reason sort code could not have been grabbed
| from somewhere, no "dependency management system"
| required. The FreeBSD kernel has its own existing sort
| libraries already in the code too -- https://cgit.freebsd
| .org/src/tree/sys/libkern/qsort.c?id=9a7...
|
| The real reason is because this is an early boot
| environment with exact constraints that are unusual if
| not unique to FreeBSD where the facilities to support
| your general language / runtime environment is not up
| yet.
| tialaramex wrote:
| > the facilities to support your general language /
| runtime environment is not up yet.
|
| Rust's [T].sort_unstable() is part of core, that is, the
| code to sort a slice of arbitrary type T doesn't need the
| allocator, operating system etc.
|
| Only if you want a fast _stable_ sort do you need to wait
| for such an environment because their fast _stable_ sort
| needs an allocator, I doubt that FreeBSD needs a stable
| sort here.
| fanf2 wrote:
| It _does_ need to be stable, which is why the replacement
| is mergesort.
|
| The SYSINITs need to bring up the system in a particular
| order. I dunno why there are both explicit ordering (the
| sort keys) and implicit ordering (the stability
| requirement), perhaps cperciva can chime in.
|
| Based on when I last looked at this code, I guess the
| explicit ordering is to do with startup phases, and the
| implicit ordering is because of the way SYSINITs are (or
| were?) implemented using linker sets, which have more
| guarantees about ordering than
| __attribute__((constructor)) and other similar
| functionality.
| tialaramex wrote:
| > I dunno why there are both explicit ordering (the sort
| keys) and implicit ordering (the stability requirement),
| perhaps cperciva can chime in.
|
| In the event Colin is looking at this sub-thread now I'm
| intrigued too.
| cperciva wrote:
| There isn't _supposed_ to be an implicit ordering
| requirement. But a number of bugs have happened in the
| past because people didn't order properly.
|
| What we need isn't actually "stable" so much as
| "consistent from one boot to the next" to make sure that
| ordering bugs don't end up being heisenbugs.
| throwawaylinux wrote:
| I'm not sure what part of my post you are addressing.
|
| If you had a kernel written in Rust, you would have to
| bring up the rust runtime environment as part of the boot
| process. You don't just magically get it because that's
| the language you used, whether it is "core" or not. You
| can't even execute simple functions before you have set
| up stack, and in this case apparently there is a
| constraint on stack usage. And evidently they do need a
| stable sort.
|
| In any case my main point is that it's not anything to do
| with dependency management, not quibbling about whether
| or not some language could support this function at this
| exact point of the boot.
| tialaramex wrote:
| > you would have to bring up the rust runtime environment
| as part of the boot process
|
| Like the existing C runtime environment, this just isn't
| a big deal by the time we've got here. If you're the code
| for initialising the DRAM controller then sure, that's
| pretty intimidating. But this is all happening _much_
| later, we're part of an operating system, we were
| already _loaded from disk_.
|
| > And evidently they do need a stable sort.
|
| How so? They appear to even have a deliberate tie-
| breaking mechanism in their data structure, which
| suggests if order actually matters you're expected to
| specify when creating the data, not rely on stability to
| do what you expected.
| throwawaylinux wrote:
| > > you would have to bring up the rust runtime
| environment as part of the boot process
|
| > Like the existing C runtime environment, this just
| isn't a big deal by the time we've got here. If you're
| the code for initialising the DRAM controller then sure,
| that's pretty intimidating. But this is all happening
| much later, we're part of an operating system, we were
| already loaded from disk.
|
| I don't know what you're trying to say. The code which is
| written here that we are discussing is in a constrained
| environment where the regular kernel runtime facilities
| are not all available. This was cited as one of the
| reasons for open coding a very simple algorithm.
|
| > > And evidently they do need a stable sort.
|
| > How so?
|
| From reading comments from patch author here.
| IshKebab wrote:
| > the facilities to support your general language /
| runtime environment is not up yet.
|
| What facilities does Rust's sort require? It probably
| just needs a bit of stack but that's it.
|
| That qsort implementation doesn't look like it needs
| anything either tbh, though I only skimmed it.
| joshu wrote:
| welcome to amdahl's law
| [deleted]
| ashishgautam0 wrote:
| Hey what's amdahl's law. Can you explain
| trostaft wrote:
| The Wikipedia article is pretty good, but tldr: "the
| overall performance improvement gained by optimizing a
| single part of a system is limited by the fraction of time
| that the improved part is actually used."
|
| https://en.wikipedia.org/wiki/Amdahl%27s_law
|
| Usually used in the parallel computing context, but works
| here too.
| bombcar wrote:
| So if you have something that takes 1 second out of 10
| seconds, you can't improve the overall system more than 1
| second (by removing it entirely).
| astrange wrote:
| That's not true, because in addition to taking 1 second
| it could be emptying all your caches. Performance isn't
| just adding up wallclock times.
| robocat wrote:
| Same thing happens when driving - your trip duration
| usually depends on the slow bits not the faster parts
| e.g. in NZ driving through small towns will often be the
| longest part of your trip time.
| _3u10 wrote:
| Something about 5% of things can't be parallelized which
| then ends up dominating performance after the 95% of things
| that can be parallelized are.
| mumumu wrote:
| In hardware engineering this is called critical path.
| memefrog wrote:
| That is about latency, but Amdahl's law is about
| throughput I think.
| CyberDildonics wrote:
| Amdahl's law is not about latency or throughput, it is
| about optimizing one part of a program having diminishing
| returns relative to the part that isn't optimized. It is
| not really a law; it is the kind of common sense that most
| people would assume never needs to be said explicitly.
| mumumu wrote:
| Amdahl's law is common sense, but also one of the reasons
| premature optimization is the root of all evil.
|
| A small speedup on the longest (by time) part of the
| program is usually better than infinite speedup on a short
| part of the program.
| kunley wrote:
| I've wondered for 25 years or so why bubblesort is considered a
| viable sort at all. Putting aside its terrible complexity and
| thus performance, it is not even intuitive: if you look at it,
| it is not how we humans sort things manually.
|
| It is IMO an example of a "following the herd" confusion: some
| scientist wrote a paper (they must do it to survive, don't they),
| others mentioned it in a textbook (doesn't hurt to have more
| pages, huh), others put it in their other textbooks because,
| well, it was mentioned in the past, and this is how this
| abomination survives decades..
| bradley13 wrote:
| FWIW: As an instructor, I used to include bubblesort when
| talking about sorting algorithms. I stopped, because too many
| students saw it presented as an algorithm, and thought it was
| therefore a valid choice. It is not. As others have pointed
| out, insertion sort and selection sort are just as simple, but
| significantly faster.
|
| As long as the quantity of data is small, there is nothing
| wrong with using n^2 algorithms for sorting. They are
| simple, robust, stable, and easy to implement correctly.
|
| Sure, things may change. In 25 years, you may have 1000
| elements instead of 10. If that happens, your successors can
| change the algorithm to meet the new requirements. That's what
| software maintenance is all about.
| kunley wrote:
| Yeah, great points about insertion and selection sort.
|
| Thank you for the story and for keeping up the quality of
| education.
| Delk wrote:
| To me, bubble sort is intuitive, as far as sorting algorithms
| or their detailed implementations go.
|
| I absolutely wouldn't sort that way if I were to sort something
| as a human, but in that case I wouldn't be even trying to
| figure out an exact systematic logic with its minute details
| that always gets it right. I'd be thinking in terms of ad hoc
| problem solving, and possibly some heuristics for speeding it
| up. When thinking of algorithms in a programming (or CS) sense,
| getting the exact logic right down to the detail is exactly
| what you'll need to do. So, to me, those aren't the same kind
| of intuitive.
|
| As is probably common, a bunch of simple sorting algorithms
| were given as some kinds of introductory examples of algorithms
| during my first university programming courses. I think those
| included selection sort, bubble sort and insertion sort. I
| don't think I initially considered bubble sort the more
| intuitive one (I think selection sort was it for me). But for
| some reason, over the years the basic logic of bubble sort
| became more obvious to me than those other options. So, at
| least personally, maybe there's something else to it than
| following the herd.
|
| Of course I've never actually written bubble sort in any kind
| of real code, but it's still perhaps the most intuitive one to
| me, in terms of detailed algorithms rather than in terms of
| heuristics for human problem solving. (Merge sort is also about
| as intuitive to me at least as a general idea.)
| ggm wrote:
| Small increments add up. Sure, it's not 100x faster booting, it's
| 100x less time spent in one fragment of overall boot time. Do
| another 20, and you've made a LOT of difference.
| hinkley wrote:
| People, including me, used to think I knew some special sauce
| about optimization. It's true that I have a bit more mechanical
| sympathy than most people, due to not only the curriculum at my
| school but which parts fascinated me.
|
| The big thing is that I'm willing to sit down and earn a 50%
| improvement in performance by doing 10x 4% improvements in perf
| time. That's perseverance and foresight. Most of what I know
| about optimization tricks vs optimization _process_ wouldn't
| quite fill up the first Gems book.
|
| It's all about budget. If you're trying to double, quadruple
| the performance of something, you can't look at what the 5
| slowest things are. You have to look at the thing taking 5% of
| the time and ask, "Does this deserve 20% of the CPU?" Because
| if you get 4x without touching this code, that 5% becomes 20%.
|
| Then you don't work on the tall tent poles first, you work on
| the hot spots that relate to each other, and the ones you have
| the time and attention to do well. Because if you get a 20%
| improvement where a 25% improvement is possible, most bosses
| will not let you go back for the 2 x 2.5% later if you don't
| get it the first time. So then you are forever stuck with 50 x
| 1.5% slowdowns. Which is not a problem if you're profitable,
| but definitely is if you're bleeding money on hosting. Or if
| your main competitor is consistently 50% faster than yours.
| nucleardog wrote:
| Yep. When work on boot time started, FreeBSD took 30
| seconds to boot. It's now down under 10 seconds. Many have
| contributed, but Colin's done a lot of it [0].
|
| [0]:
| https://wiki.freebsd.org/BootTime#Past_Performance_Improveme...
| 0xFF0123 wrote:
| How is time measured on things like this? Surely processors
| have also become faster as well.
| maccard wrote:
| From the link,
|
| > An Amazon EC2 c5.xlarge instance is being used as a
| reference platform and measuring the time between the
| instance entering "running" state and when it is possible
| to SSH into the instance.
| legulere wrote:
| This is known as the hotspot or bottleneck model of
| optimization. It has its flaws:
|
| https://lemire.me/blog/2023/04/27/hotspot-performance-engine...
| IshKebab wrote:
| That article is 100% right, but ggm said "do another 20" -
| that's not really what hotspot people expect to do. They
| think there's one or two hotspots that they need to optimise
| and then they'll get a 10x speedup.
|
| You can definitely do 20-100 small optimisations and get a
| 2-3x speedup.
|
| The best example I've seen of that is Nicholas Nethercote's
| work on speeding up the Rust compiler. A ton of tiny speed
| improvements that add up to 2x or 3x speed improvements
| overall.
|
| I've done similar work on optimising a compiler. You're never
| going to get 10x without architectural changes but you can
| get 3x with a lot of work.
|
| Of course the "we'll just optimise the hotspots" people have
| written their code so badly they need at least 10x.
| hinkley wrote:
| I compare this to man-to-man versus zone defense. Zone
| defense, as any parents with three+ children will tell you,
| is the only way to stay sane.
|
| By looking at a module or concern as a whole, you get to do
| small-to-medium architectural improvements instead of none-
| to-small improvements. You get a broader picture, and can
| pull in more small areas of improvement in the process. The
| fastest code is code that doesn't run, and 'tall tent-pole'
| optimizations miss at least half of those opportunities. In
| some cases they just push them around, like broccoli on a
| child's plate.
|
| Many teams get to the end of this road and then have
| absolutely no idea what to do next. I find it infuriating for
| a bunch of my 'betters' to listen to someone on the business
| side say, "Our customers are complaining about our
| speed/leaving for a competitor with a faster app" and then
| pull up a rectangular flame chart to 'prove' "We've done all
| we can". You have only done the easy parts, probably not even
| the best parts. You might have actually broken the best
| parts.
|
| These apps are 'death by a thousand cuts'. No one part of the
| app is particularly slow, it's just uniformly mediocre coding
| and an architecture that squeaks by with 'adequate' (to them,
| but not the business) attention to capacity planning. Either
| a force of nature shows up and fixes it, or the company fades
| away.
|
| The only boss I ever followed to a new job had two sayings
| that I stole. 1) "If you want to complete a meeting in an
| hour you need to complete half the meeting in half an hour."
| The other was about scaling: "The first order of magnitude
| is simple. The second is difficult. The third requires
| talent" (sometimes tacking on "which not all teams have").
|
| You can interpret that a number of ways, but if your founding
| team isn't paying homage to the problem of scaling up to 1000
| times the users you have at the beginning, you probably don't
| have the talent to get there (you didn't buy it, and you
| didn't build it along the way). I don't know what percentage
| of companies that claim "we did all the right things and
| still failed" have this problem, but I sincerely suspect it's
| at least 10%, which is a very big failure mode to ignore.
| ggm wrote:
| Nice article. I didn't see it say "don't do it" I saw it say
| "try not to have to"
| xyzelement wrote:
| To sum up what I learned from the comments:
|
| Is FreeBSD stupid for having used bubblesort to begin with? NO.
| It worked well for decades before surfacing as a problem in this
| extreme and unanticipated use case.
|
| Is optimizing this a waste of time? NO. The use case revolves
| around super frequent bootups to power lambda and for this type
| of thing every ms matters.
| Sharlin wrote:
| As is often said, O(n^2) (or even 3) is fine until it stops
| being fine.
| alkonaut wrote:
| Yeah and it almost always stops being fine. The problem with
| an O(N^2) solution is that you rarely put in an assert that N
| < 100 or something like that to guard your assumption that
| big N "doesn't happen". Because after all it's better that
| the process eventually completes than that the process
| crashes. And if you don't dare to put that in, you perhaps
| shouldn't use O(N^2).
|
| N^2 or N^3 is fine for things that are fixed size and very
| unlikely to stop being fixed size, or at least unlikely to
| grow a lot. But any time you think "oh it's just a few files"
| when there is nothing to stop it from being 1000 files in ten
| years, or at one rare user: don't use an O(N^2) solution.
| masklinn wrote:
| Here it took almost 30 years for it to stop being fine,
| which is a pretty good run.
|
| And it was a blatant O(n^2) algorithm, easily swapped out,
| it's not like they were calling sscanf in a loop and
| whoopsie turns out C strings are shit so now parsing a 4
| characters decimal number is O(buflen^2).
| alkonaut wrote:
| One should indeed be forgiven when getting 30 years out
| of a code base, any code base. It's hard to predict that
| you'd need to shave milliseconds of cloud/VM startups or
| from some embedded device, if you can't imagine Clouds,
| VMs or computers the size of watches.
| kortilla wrote:
| > Yeah and it almost always stops being fine.
|
| No, it's just a non-issue in the 99% of cases where it
| really does stay at small inputs. I come across
| configuration code all of the time that is stupidly
| inefficient like this and it's almost always irrelevant.
| Tons of stuff works on inputs of <10.
| alkonaut wrote:
| It's an observation with heavy selection bias: in 100% of
| the cases where you had to deal with a perf problem it
| had stopped being fine... Since it also almost invariably
| is accidentally quadratic and not deliberately quadratic
| as with BubbleSort, you also have a hard time estimating
| how many quadratic algorithms you have that haven't
| caused problems, and perhaps never will.
|
| Adding complexity because of an anticipation that some
| input will grow to an unreasonable size (at the time of
| writing) may or may not be a good idea. But using a
| different off the shelf function or data structure is
| usually not a high cost for making sure it's not a
| problem down the line. Of course, that also assumes that
| there _is_ a good and efficient standard lib of methods
| and data structures so that "making a more clever
| choice" isn't also a burden or risk in terms of code
| size, complexity, risk of bugs.
| toast0 wrote:
| > Yeah and it almost always stops being fine.
|
| Properly predicting if it will be fine or not is a sign of
| wisdom. But something like this is a fine place to use
| something where the complexity blows up. The sort happens
| in a clear place; early boot is a little hard to debug, but
| still. It's much worse when you put something N^3 in a place
| where, when it blows up, the time spent is diffuse.
| JonChesterfield wrote:
| Old colleague had some wisdom on this which stuck with me.
| Thanks Ben! Spoken at separate times.
|
| - it's constant time or it's crap
|
| - constant time means your best case is as slow as your worst
| case
|
| In this case, constant time would be sorting it before the
| program runs.
| Sharlin wrote:
| > - constant time means your best case is as slow as your
| worst case
|
| Which is in fact important in crypto code to prevent timing
| attacks.
| adam_oxla wrote:
| It is surprising how popular O(N^2) algorithms with simple
| implementations are, even in extremely popular libraries.
| E.g. the ICU search method:
| https://github.com/unicode-org/icu/blob/a7a2fdbcf257fa22c7a2...
| ICU is used by Python, Windows, and OpenOffice.
| kragen wrote:
| yes, it was stupid to use bubble sort to begin with; there is
| never a good reason to use bubble sort
|
|       for (size_t i = 0; i < n; i++) {
|           for (size_t j = 1; j < n - i; j++) {
|               if (a[j-1] > a[j]) {
|                   item_t tmp = a[j];
|                   a[j] = a[j-1];
|                   a[j-1] = tmp;
|               }
|           }
|       }
|
| when you could use insertion sort
|
|       for (size_t i = 1; i < n; i++) {
|           for (size_t j = i; j > 0; j--) {
|               item_t tmp = a[j];
|               if (tmp > a[j-1])
|                   break;
|               a[j] = a[j-1];
|               a[j-1] = tmp;
|           }
|       }
|
| which is no more complicated (both are 17 amd64 instructions
| with -Os), but twice as fast in the random case, and enormously
| faster in the sorted and almost-sorted cases
|
| (above code only minimally tested:
| https://godbolt.org/z/43dqz9Gq8)
|
| this is not much of a criticism of freebsd; if you polished
| your codebase until there was nothing stupid in it, you'd never
| ship anything. freebsd's code quality is quite high, but
| probably we can all agree that this was stupid
|
| of course it wouldn't be nearly as fast as mergesort on large
| inputs, but mergesort is more complicated
|
| (also mergesort is slower on small inputs than insertion sort
| or bubble sort, but in the case where you're only sorting
| something once at startup time, that probably isn't important.
| it might matter if you were sorting a million lists of ten
| items or something)
| SomeoneFromCA wrote:
| Neither of your code snippets are working, or are an actual
| implementation of bubble sort.
|
| Since what insertion sort is faster that bubble sort on
| sorted data?
| kragen wrote:
| i tested both of them on some input data and they both
| sorted it correctly, though of course they could contain
| bugs (if so i'd be grateful if you pointed them out)
|
| admittedly i probably added the godbolt link showing this
| after you wrote your comment
|
| perhaps you can elaborate; i couldn't understand your last
| sentence or why you think the bubblesort function above
| doesn't bubble-sort
| SomeoneFromCA wrote:
| Because it misses a crucial step, early exit if there
| were no swaps. This way it would be extremely efficient
| in case of already sorted data.
|
| You should probably check wikipedia page. Here:
| https://en.wikipedia.org/wiki/Bubble_sort.
| openasocket wrote:
| If you look at the actual change in FreeBSD
| (https://cgit.freebsd.org/src/commit/?id=9a7add6d01f3c5f7eba8...)
| you'll see that the implementation of bubble sort used in
| the FreeBSD kernel was not doing early exits.
|
|       /*
|        * Perform a bubble sort of the system initialization objects by
|        * their subsystem (primary key) and order (secondary key).
|        */
|       TSENTER2("bubblesort");
|       for (sipp = sysinit; sipp < sysinit_end; sipp++) {
|               for (xipp = sipp + 1; xipp < sysinit_end; xipp++) {
|                       if ((*sipp)->subsystem < (*xipp)->subsystem ||
|                           ((*sipp)->subsystem == (*xipp)->subsystem &&
|                            (*sipp)->order <= (*xipp)->order))
|                               continue;       /* skip */
|                       save = *sipp;
|                       *sipp = *xipp;
|                       *xipp = save;
|               }
|       }
|       TSEXIT2("bubblesort");
| jazzyjackson wrote:
| meh, an optimization, hardly "not working"
| SomeoneFromCA wrote:
| It is not "an optimization", it is in the definition of
| Bubble Sort. What you have written is not a bubble sort,
| and therefore, by the fact of being not a bubble sort, is
| not a working bubble sort.
| thaumasiotes wrote:
| That's not part of the definition of bubble sort.
|
| It's also not an improvement over the insertion sort. If
| you look at kragen's code, you'll see that running it on
| a list of e.g. 6 elements in ascending order will mean
| that you make 5 comparisons, swap 0 elements, and exit.
| (You do have to count from 1 to 6 as you do this.)
|
| The insertion sort is implemented as a bubble sort,
| interestingly enough, but one that doesn't consider the
| entire array at every iteration. Given that it's equally
| correct, it's not difficult to understand why it will
| always be faster than a bubble sort that _does_ consider
| the entire array at every iteration.
| SomeoneFromCA wrote:
| It is in the definition in Wikipedia and most other
| sources I have checked. All the implementations I have
| seen so far check if any swaps have happened, and stop
| if they have not.
| kragen wrote:
| not clrs
| SomeoneFromCA wrote:
| In any case, you should retract your claim that Bubble
| Sort is slower in the already-sorted case. It is CS101
| that bubble sort is one of the fastest in this case.
| kragen wrote:
| after i posted the below comment to say 'oh yeah, you're
| right, thanks' someonefromca, perhaps inadvertently,
| edited the parent to say something completely different
| from what i was agreeing with. it previously said that
| there was a bug in the insertion sort code because it
| would swap things repeatedly, which is true and is a bug.
| my reply to their original comment is below
|
| oh yeah, you're right, thanks, it does do useless extra
| work if there are equal values in the list
|
| it should say if (tmp >= a[j-1]) break;
|
| however, this doesn't make it fail to terminate, which is
| a possible reading of your bug report
| kragen wrote:
| i think i wrote that page originally 22 years ago, before
| the switch from usemodwiki to mediawiki; while i've often
| been referred to wikipedia to correct my errors, i think
| this is the first time i've been referred to a wikipedia
| page that i wrote in the first place :)
|
| it's true that, if your data is exactly sorted, or has a
| small number of items that are far too early ( _e.g._ ,
| 43 3 4 5 6 7 8 9 10), you can modify bubble sort to
| notice this and exit early, at the expense of making it
| more complicated and even slower in the average case
|
| however, this doesn't help if instead you have a small
| number of items that are far too late ( _e.g._ , 4 5 6 7
| 8 9 10 43 3), in which case bubble sort takes the same
| number of passes as it normally would, while insertion
| sort takes linear time. of course you can complicate
| bubble sort further into "shaker sort" to handle this,
| which makes it even more complicated but at least doesn't
| slow it down any further
|
| but all of this is sort of futile because it's still half
| as fast as insertion sort in the case of unsorted input,
| and even without the tweaks, no simpler
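The "shaker sort" mentioned above (also called cocktail sort) can be sketched as follows; this is an illustrative sketch of the variant being described, not anyone's production code:

```c
#include <stdbool.h>
#include <stddef.h>

typedef int item_t;

/* Cocktail shaker sort: alternate forward and backward passes, so a
 * small item stranded near the end (a "turtle") moves one slot per
 * backward pass instead of one slot per full forward pass. */
void shaker_sort(item_t *a, size_t n)
{
    if (n < 2)
        return;
    size_t lo = 0, hi = n - 1;
    bool swapped = true;
    while (swapped && lo < hi) {
        swapped = false;
        for (size_t j = lo; j < hi; j++) {        /* forward pass */
            if (a[j] > a[j + 1]) {
                item_t tmp = a[j]; a[j] = a[j + 1]; a[j + 1] = tmp;
                swapped = true;
            }
        }
        hi--;
        for (size_t j = hi; j > lo; j--) {        /* backward pass */
            if (a[j - 1] > a[j]) {
                item_t tmp = a[j]; a[j] = a[j - 1]; a[j - 1] = tmp;
                swapped = true;
            }
        }
        lo++;
    }
}
```

On the "4 5 6 7 8 9 10 43 3" example, one backward pass carries the 3 all the way to the front, which is exactly the case plain bubble sort handles slowly.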
| Flockster wrote:
| You did: https://en.wikipedia.org/w/index.php?title=Bubbl
| e_sort&oldid...
| kragen wrote:
| well, that's just the first mediawiki revision of the
| page; i was working on it earlier than that, which is why
| i was editing the subpage link syntax in that version,
| but the usemodwiki history has been lost
|
| someone else might have written the original version, i
| can't remember any more, but i think it was probably me
| SomeoneFromCA wrote:
| I do not have access to the historic definition of bubble
| sort, but all the modern implementations I've seen so far
| have a check for whether there were no swaps. I absolutely
| cannot wrap my mind around how exactly it would make it
| slower in the average case; I'm almost certain it won't.
| Nor do I see how exactly this would make the code
| significantly more complex to any meaningful extent.
| kragen wrote:
| we're talking about six lines of code if we don't count
| the curly braces (which can be omitted)
|
| bubble sort with early exit is
|             for (size_t i = 0; i < n; i++) {
|                 bool sorted = 1;
|                 for (size_t j = 1; j < n - i; j++) {
|                     if (a[j-1] > a[j]) {
|                         item_t tmp = a[j];
|                         a[j] = a[j-1];
|                         a[j-1] = tmp;
|                         sorted = 0;
|                     }
|                 }
|                 if (sorted) break;
|             }
|
| which is three more lines of code, which is in some crude
| sense a complexity increase of 50% over both the simple
| bubble sort in my earlier comment and over insertion sort
|
| if you compile it https://godbolt.org/z/jdzov8cPn you can
| see that the inner loop goes from 10 instructions to 11.
| that doesn't guarantee that it's 9% slower (that depends
| a lot on the platform, and in particular on a big
| superscalar cpu it probably just uses a little more power
| but the same number of clocks) but it does tend in that
| direction. of course occasionally it will save you a pass
| or two over the array, which will tend to compensate, but
| it won't usually save you 9% of the passes
|
| if some item is in position n and needs to be in position
| m, where m < n, it will require n - m passes of bubble
| sort to get there. a little thought suffices to show
| that, for randomly shuffled input, where m = 0 the item
| has to move about half the size of the array, where m = 1
| it has to move about half the size of the array minus one
| half, where m = 2 it has to move about half the size of
| the array minus one; but it's the _maximum_ of these
| three numbers which determines the number of passes after
| which all three of those positions will have the correct
| item, and that maximum is heavily skewed to be almost the
| entire size of the array. the chance that none of those
| three items is in the last 25% of the array originally is
| only .75^3 ≈ 42%. and as the value of three increases, it
| gets worse. so in general early exit doesn't buy you
| much
|
| as for modern implementations of bubble sort, they're all
| either didactic classroom exercises like this one or
| careless mistakes like the original freebsd code; because
| insertion sort always beats it, there aren't like a
| community of bubble sort implementors who have annual
| bubble sort conferences. so you can expect some variation
| in precise terminology and shouldn't worry about it
|
| (fwiw clrs 3ed. gives bubblesort without an early
| termination test, as i did; their definition is in
| exercise 2-2 on p. 40. but they're counting in the
| opposite direction, so everything i said above about
| items moving forward quickly and backward slowly is
| reversed for the clrs version of bubblesort)
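A concrete way to see the argument above, using the two example arrays from a few comments up ("43 3 4 5 6 7 8 9 10" versus "4 5 6 7 8 9 10 43 3"): instrument the early-exit bubble sort with a pass counter (the counter is my addition) and see how often the exit actually fires.

```c
#include <stdbool.h>
#include <stddef.h>

typedef int item_t;

/* Early-exit bubble sort, modified only to return how many outer
 * passes it made before the exit fired. */
size_t bubble_passes(item_t *a, size_t n)
{
    size_t passes = 0;
    for (size_t i = 0; i < n; i++) {
        bool sorted = true;
        passes++;
        for (size_t j = 1; j < n - i; j++) {
            if (a[j - 1] > a[j]) {
                item_t tmp = a[j]; a[j] = a[j - 1]; a[j - 1] = tmp;
                sorted = false;
            }
        }
        if (sorted)
            break;
    }
    return passes;
}
```

A misplaced big item early ("43 3 4 5 6 7 8 9 10") finishes in 2 passes, but a misplaced small item late ("4 5 6 7 8 9 10 43 3") still needs all 9 passes, because it only moves one slot per pass.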
| bitRAKE wrote:
| The early exit can be done without an extra flag, but at
| increased code size.
| SomeoneFromCA wrote:
| Your original claim that it is more complex or slower is
| unsubstantiated. Yes, it is trivially more complex to add
| a flag, but that does not actually make it any more
| difficult to understand or error-prone. Your argument is
| way too pedantic. The claim about performance is equally
| unsubstantiated. To make a proper proof you need to show
| that, on average, all N outer passes still execute before
| we hit the no-swaps situation, which is almost certainly
| never true, especially with non-random data.
|
| Although I certainly agree that Bubble Sort is almost
| always a bad choice, it is still used in computer
| graphics, or in other situations where you have only one
| or two misplaced pairs of elements. In that case it will
| outperform the other algorithms in the average case while
| maintaining functionality in the less-than-ideal
| situation.
|
| Besides, it is entirely irrelevant for what purposes the
| algorithm is used today (didactic, deliberate or poor
| judgement); the de facto standard way of doing it today
| is to check for the early exit. Your argument about poor
| performance on already-sorted data is a straw man you are
| stubbornly defending. Well, ok, if you believe so...
| kragen wrote:
| if you aren't convinced that adding the early exit
| generally hurts performance, well, it's not surprising;
| measuring performance is difficult and tricky, and you're
| faced here with claims you don't really have the
| knowledge to evaluate properly, so it's entirely proper
| that you are dubious about them, even if you're confused
| about what a proper proof would consist of
|
| it might help your intuition to think of it this way:
|
| 1. with random input, the first ten positions of output
| are pretty likely (65%) to have an item that was
| originally in the last 10% of the input, and almost
| certain (97%) to have an item that was originally in the
| last 30% of the input. so those ten items alone account
| for needing 95% of the passes you'd need without an early
| exit, minus about five, in 65% of cases.
|
| 2. the first 20 positions are pretty likely (64%) to have
| an item that was originally in the last 5% of the input,
| and almost certain (96%) to have an item that was
| originally in the last 15% of the input. so those 20
| items account for needing 97% of the passes you'd need
| without an early exit, minus about ten, in 64% of cases.
|
| 3. the first 30 positions are pretty likely (64%) to have
| an item that was originally in the last 3.3% of the
| input, and almost certain (96%) to have an item that was
| originally in the last 10% of the input. so those 30
| items account for needing 98.5% of the passes you'd need
| without an early exit, minus about 15, in 64% of cases.
|
| of course, if you're only sorting 30 items, the first ten
| positions of output are actually a third of all the
| positions, so "minus five" is pretty significant. and
| probably if you're sorting more than about 30 items at a
| time you ought to use a linearithmic sort or a linear-
| time radix sort instead of a quadratic sort
|
| similarly, it's reasonable to argue that complexity (in
| the sense of containing many parts, not in the sense of
| asymptotic computational work required) is subjective,
| and to disagree about it
|
| but i think these are sort of moot points
|
| insertion sort still beats bubble sort in the situations
| where you have only one or two misplaced pairs of
| elements, and generally replaces it as soon as somebody
| notices
|
| as for 'the de facto standard way of doing it today', i
| suggest you argue with clrs, not with me
| SomeoneFromCA wrote:
| Count number of outer passes of insertion sort (it is N)
| vs bubble sort (with early exit; it is 2) if you have
| only a single pair misplaced.
| kragen wrote:
| insertion sort only ever makes one outer pass; it does
| almost exactly half as many comparisons as bubble sort (n
| rather than 2n-2) in that case, and the same number of
| swaps (1)
|
| the wikipedia article has a cleverer early-exit version
| of bubble sort that keeps track of the last pair it had
| to swap; it will do 1.5n-1 comparisons in that case on
| average (because the swapped pair is on average in the
| middle of the array)
|
| above i linked to working code on godbolt (modulo the
| extra unnecessary swaps on duplicate keys that you were
| kind enough to point out earlier); edit a comparison
| counter and a swap counter into it and you'll see
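The counting suggested here is easy to reproduce; below is a sketch with comparison counters bolted onto both algorithms (the counters are my addition, and the insertion sort uses the corrected `tmp < a[j-1]` comparison discussed elsewhere in the thread):

```c
#include <stdbool.h>
#include <stddef.h>

typedef int item_t;

/* Insertion sort, instrumented to count comparisons. */
size_t insertion_compares(item_t *a, size_t n)
{
    size_t cmps = 0;
    for (size_t i = 1; i < n; i++) {
        item_t tmp = a[i];
        size_t j = i;
        /* the comma expression counts each key comparison made */
        while (j > 0 && (cmps++, tmp < a[j - 1])) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = tmp;
    }
    return cmps;
}

/* Early-exit bubble sort, instrumented the same way. */
size_t bubble_compares(item_t *a, size_t n)
{
    size_t cmps = 0;
    for (size_t i = 0; i < n; i++) {
        bool sorted = true;
        for (size_t j = 1; j < n - i; j++) {
            cmps++;
            if (a[j - 1] > a[j]) {
                item_t tmp = a[j]; a[j] = a[j - 1]; a[j - 1] = tmp;
                sorted = false;
            }
        }
        if (sorted)
            break;
    }
    return cmps;
}
```

On a nine-element array with one adjacent pair swapped, insertion sort makes 9 comparisons and the early-exit bubble sort 15, which is roughly the "n versus 2n" gap claimed above.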
| Closi wrote:
| I think we have different definitions of stupid - the
| original implementation was perfectly fine for the time and
| requirements it was written for.
| winternewt wrote:
| But it could have been better for no additional effort.
| Doing something worse when the better option is just as
| easy suggests incompetence.
| jazzyjackson wrote:
| Doing something better when you already have something
| that works suggests a lack of time management xP
| kragen wrote:
| i wouldn't go so far as to say _incompetence_
|
| even the best of us do stupid things sometimes
| ska wrote:
| > But it could have been better for no additional effort.
|
| Not if it was already written.
|
| I don't know anything about this particular codebase, but
| one thing I have learned over the years is that when you
| are tempted to say "such and such decision was stupid" in
| a production codebase, you are almost always missing
| something.
| Closi wrote:
| Yeah, but it was working.
|
| And maybe it wasn't 'no additional effort' because they
| were more confident with quickly writing a bubble sort.
| throwawaaarrgh wrote:
| Never underestimate incompetence. It usually succeeds
| despite itself.
|
| It also could have been apathy: knowing there was a
| better solution and just not caring what was implemented.
| I feel like this is more common than incompetence,
| especially in larger projects or organizations. Nature
| tends to take the path of least resistance.
| moffkalast wrote:
| Then the requirements were terrible?
| Closi wrote:
| I don't think the 90's developers could have foreseen that
| 2ms on boot would be huge for cloud serverless compute
| like Lambda.
| moffkalast wrote:
| On an OS whose only gimmick is that it's fast and stable?
| Idk, maybe they could've.
|
| Besides there's little reason for any codebase of any
| kind to ever contain bubble sort (except for an academic
| one, showing it as an example of a bad one). We're
| talking 20 years ago, not 2000 years ago, the basic
| sorting algorithms were pretty well researched and
| established by then.
| oblio wrote:
| True, but a lesson we frequently have to learn over and
| over again is that the code is in software in widespread
| use, the only valid values for collection sizes are 0 and
| N -> [?].
| Closi wrote:
| > the only valid values for collection sizes are 0 and N
| → ∞.
|
| Of course there are valid values below that...
|
| If they were designing to a collection size of N → ∞
| then they shouldn't have used Insertion Sort either,
| because it is O(n²) in the worst case.
| oblio wrote:
| True, but make that valid value at least something like 1
| million.
|
| It's trivial to start with something that gets human
| input and someone, somewhere, decides to plug the thing
| into something else that generates that input.
|
| And we all know that if computers are good at something,
| it's doing a lot of small things millions of times.
| erk__ wrote:
| For reference the original code was a bit more complicated
| than just sorting integers which may also have been a reason
| for the choice.
|             /*
|              * Perform a bubble sort of the system initialization objects by
|              * their subsystem (primary key) and order (secondary key).
|              *
|              * Since some things care about execution order, this is the
|              * operation which ensures continued function.
|              */
|             for( sipp = (struct sysinit **)sysinit_set.ls_items; *sipp; sipp++) {
|                     for( xipp = sipp + 1; *xipp; xipp++) {
|                             if( (*sipp)->subsystem < (*xipp)->subsystem ||
|                                 ( (*sipp)->subsystem == (*xipp)->subsystem &&
|                                   (*sipp)->order < (*xipp)->order))
|                                     continue;       /* skip */
|                             save = *sipp;
|                             *sipp = *xipp;
|                             *xipp = save;
|                     }
|             }
|
| https://github.com/freebsd/freebsd-
| src/blob/2b14f991e64ebe31...
| eru wrote:
| I guess the bigger problem is using a language like C that's
| almost incapable of providing useful libraries. So everyone
| is re-implementing everything from scratch all the time.
| yehaaa wrote:
| qsort is part of the ISO C90 standard though.
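For illustration, sorting sysinit-like records with libc's qsort would look roughly like this (the struct and identifier names here are hypothetical stand-ins, not the kernel's; note also that qsort is not guaranteed to be a stable sort, which matters if entries can tie on both keys):

```c
#include <stdlib.h>

/* Hypothetical record with the same two-key ordering the kernel uses:
 * subsystem is the primary key, order the secondary key. */
struct init_rec {
    int subsystem;
    int order;
};

static int cmp_init_rec(const void *pa, const void *pb)
{
    const struct init_rec *a = pa;
    const struct init_rec *b = pb;
    if (a->subsystem != b->subsystem)
        return a->subsystem < b->subsystem ? -1 : 1;
    if (a->order != b->order)
        return a->order < b->order ? -1 : 1;
    return 0;
}

/* usage: qsort(recs, nrecs, sizeof recs[0], cmp_init_rec); */
```

Of course, as the sibling comments point out, kernel code cannot call into libc, so the kernel needed its own sort regardless.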
| mort96 wrote:
| I would trust myself more to implement bubble sort than to
| implement insertion sort. Not that I couldn't implement
| insertion sort, I've done it before, but if I was in a
| context where I had to implement a quick and dirty ad-hoc
| sorting algorithm which I thought would only ever have to
| sort a few items, I would probably go with bubble sort.
|
| If I suspected at all that the algorithms would have to sort
| a decent number of items, I wouldn't consider using an O(n^2)
| sort at all and would take the time to find a good
| implementation of (or implement) one of the O(n log(n))
| algorithms.
| donkeybeer wrote:
| I trust myself most to implement selection sort, its the
| most intuitive to me: just keep finding the next smallest
| element in the array and place it in its position.
|
| I can never remember the loop bounds of bubble sort, always
| have to look it up.
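That description translates almost line for line into code; a minimal sketch:

```c
#include <stddef.h>

typedef int item_t;

/* Selection sort as described: find the next smallest remaining
 * element and swap it into place. The loop bounds are hard to get
 * wrong: i walks the sorted prefix, j scans everything after it. */
void selection_sort(item_t *a, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        size_t min = i;
        for (size_t j = i + 1; j < n; j++)
            if (a[j] < a[min])
                min = j;
        if (min != i) {
            item_t tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
        }
    }
}
```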
| Gibbon1 wrote:
| I just remember something from basic math. Exponential
| curve < linear for small values.
|
| That said a lot of time I have small arrays of objects.
| Instead of keeping them sorted I just grind through them to
| find the next largest value. That tends to be hard to get
| wrong.
| Thiez wrote:
| Exponential _can_ be better than linear for small n
| depending on your constant factors, but it's never a
| given. E.g. `2n` < `10^n + 5` for all n.
| scythe wrote:
| Pratt-Shell is only mildly more complex than insertion sort
| and gives a good worst-case performance O(n log(n)^2). It
| may have a place sometimes.
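A sketch of the idea, assuming "Pratt-Shell" means Shellsort with Pratt's gap sequence (all numbers of the form 2^p·3^q), which is the sequence that gives the O(n log(n)^2) worst case:

```c
#include <stdbool.h>
#include <stddef.h>

typedef int item_t;

/* true iff k has no prime factors other than 2 and 3 */
static bool is_3smooth(size_t k)
{
    while (k % 2 == 0)
        k /= 2;
    while (k % 3 == 0)
        k /= 3;
    return k == 1;
}

/* Shellsort with Pratt's gaps: a gapped insertion sort for every
 * 3-smooth gap below n, largest first. Any gap sequence ending in 1
 * sorts correctly; Pratt's bounds the worst case at O(n log^2 n). */
void pratt_shell_sort(item_t *a, size_t n)
{
    if (n < 2)
        return;
    for (size_t gap = n - 1; gap >= 1; gap--) {
        if (!is_3smooth(gap))
            continue;
        for (size_t i = gap; i < n; i++) {
            item_t tmp = a[i];
            size_t j = i;
            while (j >= gap && a[j - gap] > tmp) {
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = tmp;
        }
    }
}
```

Each inner block is just insertion sort with a stride, which is why the whole thing is "only mildly more complex than insertion sort".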
| kragen wrote:
| my subjective experience writing the above comment as
| documented in https://news.ycombinator.com/item?id=37206998
| may be a relevant data point; note that the insertion-sort
| code is subtly buggy, and the bubble-sort code isn't
| arthur2e5 wrote:
| Personally I find insertion easier to reason about once you
| internalized the idea of a memmove.
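That framing can be sketched as a linear search plus a single memmove (illustrative, not anyone's production code):

```c
#include <stddef.h>
#include <string.h>

typedef int item_t;

/* Insertion sort phrased the memmove way: find where a[i] belongs in
 * the already-sorted prefix, shift that block up one slot in one
 * memmove, and drop a[i] into the hole. */
void insertion_sort(item_t *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        item_t tmp = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > tmp)   /* strict >, so the sort is stable */
            j--;
        memmove(&a[j + 1], &a[j], (i - j) * sizeof a[0]);
        a[j] = tmp;
    }
}
```

memmove (rather than memcpy) is required because the source and destination ranges overlap.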
| mort96 wrote:
| To me, it seems like bubble sort is almost inherently
| easier to reason about.
|
| One iteration of bubble sort is: find two elements in the
| wrong order and swap them.
|
| One iteration of insertion sort is: find an element out
| of order, then find the location in the array where it
| should've been, then move all the items between those two
| slots one slot ahead, then insert the out-of-order
| element into the now free slot where it belongs.
|
| Admittedly, insertion sort is pretty intuitive, since it
| reflects more or less how a human would instinctively try
| to sort a list; take an out-of-order element and move it
| to where it belongs. But when translated to code, I think
| it's very hard to beat "find two items in the wrong order
| and swap them".
| hasmanean wrote:
| A lot of people look at sorting algorithms with the
| mental model of a 1970s era PDP-8. They analyze scalar
| instructions as if they are being executed sequentially.
|
| Once you admit superscalar execution (multiple loadstore
| pipes), out of order execution and caches into the
| picture ( as well as the possibility of almost sorted
| arrays as input) the picture is never as simple.
|
| That said, I'm curious whether insertion sort is
| categorically better than bubble sort. I thought they
| might have been equivalent for the almost sorted case.
| I've heard both sides argued on this thread.
| masklinn wrote:
| Per the wiki:
|
| > Adaptive, i.e., efficient for data sets that are
| already substantially sorted: the time complexity is
| O(kn) when each element in the input is no more than k
| places away from its sorted position
|
| In fact it's often a sub-sort of hybrid sorts, like
| timsort:
|
| > If a run is smaller than this minimum run size,
| insertion sort is used to add more elements to the run
| until the minimum run size is reached.
| eru wrote:
| If you find yourself having to re-implement a sorting
| algorithm ad-hoc, you are probably using the wrong
| language. (Especially these days.)
| kstrauser wrote:
| ...or you're implementing a kernel and don't have full
| access to a libc.
| eru wrote:
| Who says you have to use libc?
|
| Eg Rust has no-std crates that you can use even if you
| can't use the standard library.
| mort96 wrote:
| What language should they have used instead of C?
|
| And given that C's stdlib has a quicksort function, how
| would using a different language have helped?
| eru wrote:
| Back then C might have been the best bet.
| gpvos wrote:
| _> above code only minimally tested_
|
| There you have your problem: any programmer can write a
| bubble sort while half asleep and inebriated at the same
| time, but for an insertion sort you actually have to think a
| little. There's much more chance of errors.
| moffkalast wrote:
| This has gotta be the most bullshit justification for
| writing bad kernel code that will be reviewed and tested by
| 200 people before it reaches production. Literally anything
| written as low level as this has the potential to crash the
| system on a dime and needs to be thoroughly tested.
| Quekid5 wrote:
| If your testing infrastructure is so anemic that a
| _sorting_ algorithm (of all things) is hard to subject to
| e.g. property-based /fuzzing tests then it'd probably be
| worth it to expand _that_ before doing anything else.
|
| I dunno what FreeBSD's test infrastructure is, but I'm
| just saying that "it needs testing" is not a good reason
| not to change to a better sort algorithm in this case.
| kragen wrote:
| if only the people working on bsd in 01993 had had access
| to your wisdom, think how much more successful they would
| have been
| jcelerier wrote:
| have you ever been bitten by typing a year in a C/C++
| program and getting it inadvertently converted to octal?
| Quekid5 wrote:
| This is about _changing_ the sorting algorithm. No snark
| needed. It should be trivial to add a test for the
| current algorithm (whenever it was introduced) and to
| replace it afterwards. Heck, even keep running both for
| bisimulation testing.
| kragen wrote:
| probably colin was more interested in booting faster than
| in improving testing infrastructure
|
| if your interests run the other way, that's great, and
| hopefully you will succeed in improving freebsd's testing
| infrastructure, but it's hardly a reason for colin not to
| speed up booting
|
| but i don't think i saw anyone claiming that the change
| should not be made because it wouldn't be adequately
| tested?
| Quekid5 wrote:
| I don't think you fully read the post I was replying to.
| I was just saying that that poster's argument against
| changing the sorting algorithm was misguided.
|
| Quote:
|
| > will be reviewed and tested by 200 people before it
| reaches production
|
| I was _only_ saying that this is a bad argument for
| changing to any simple-ish sorting algorithm. (Because:
| add tests if you need assurance)
|
| Granted, if you're doing super-complicated sorting algos
| with fallbacks for different sizes, heuristics for
| detecting semi-sorted input, etc. you might want
| something more sophisticated than trivial property-based
| checking.
| astrange wrote:
| Old code in BSD wasn't thoroughly tested except by virtue
| of being old; there's no QA department and traditionally
| UNIX code was tested by thinking about it really hard
| rather than automated testing.
|
| (FreeBSD got a test suite in 10.1, which came out in
| 2014, not that long ago in UNIX terms.)
| oblio wrote:
| Not just in BSD... GNU wasn't much better.
|
| People would be shocked by how little the core tools we
| use were tested automatically. Some still aren't.
| Quekid5 wrote:
| Quite.
|
| Thankfully the practice of just randomly shelling out to
| 'random' programs (like the core tools) has been stemmed
| somewhat, so perhaps we're mostly safe from RCE via that
| vector. Downloaded files might still be a vector, tho.
| Maybe I'm an optimist even though I sound very
| pessimistic.
| [deleted]
| kragen wrote:
| i'm not sure; i made a few mistakes in each of them as i
| was writing them, but they were both working by the time i
| first tested them. i do feel like the insertion sort was
| maybe a _little_ more challenging?
|
| edited to add: someonefromca spotted a bug in the insertion
| sort; where it says
|             if (tmp > a[j-1]) break;
|
| it should say
|             if (tmp >= a[j-1]) break;
|
| which is both a performance bug and also breaks the
| stability property the algorithm otherwise has
|
| this is some evidence in favor of your point that insertion
| sort is more error-prone; even if i could have made the
| same error in the bubble sort, i didn't, and possibly that
| was because the rest of the algorithm required less mental
| effort?
| abecedarius wrote:
| fwiw that comparison was where I went "wait, is that
| right?" in my reading.
|
| Maybe a fair judge of mental difficulty would say "you
| gotta write at least the loop invariant" (termination
| being too pedantically obvious) -- though admittedly I
| leave it out more than I used to.
|
| When I was introduced to sorting as a new programmer, via
| bubblesort, I thought "why _that_ way? " and came up with
| selection sort as a method that made more obvious sense.
| It seems to me like bubblesort's alleged intuitiveness is
| post hoc from the shortness of the code in terms of all-
| ascending for-next loops -- the intuitive reason it
| worked was just that the inner loop did move the next max
| into place just like selection sort, but with extra swaps
| along the way.
| kqr wrote:
| > there is never a good reason to use bubble sort
|
| I used to believe this but then a friend pointed out a real
| use case for bubble sort: when you have an array that is
| always mostly sorted, and real-time performance is more
| important than perfectly correct ordering. Then running one
| pass of bubble sort each cycle around the main loop is
| somewhat satisfyingly ideal.
|
| One pass of insertion sort would be too expensive, and bubble
| sort is guaranteed to make incremental progress each pass.
|
| This is sort of like how a linked list is never the right
| solution... unless it's one of those cases where intrusive
| linked lists are ideal. Except while intrusive linked lists
| intrude in space, intrusive bubble sort kind of intrudes in
| time.
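The scheme described boils down to running exactly one bubble pass per iteration of the main loop; a minimal sketch of that one pass:

```c
#include <stddef.h>

typedef int item_t;

/* One forward bubble pass: call this once per frame/main-loop cycle.
 * Each call makes incremental progress in bounded time, so a
 * mostly-sorted array stays close to sorted without ever blocking. */
void bubble_pass(item_t *a, size_t n)
{
    for (size_t j = 1; j < n; j++) {
        if (a[j - 1] > a[j]) {
            item_t tmp = a[j];
            a[j] = a[j - 1];
            a[j - 1] = tmp;
        }
    }
}
```

On nearly-sorted input one call often suffices; on badly disordered input each call still only moves things a bounded amount, which is exactly the "intrudes in time" trade-off described.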
| ramses0 wrote:
| I remember a website that "raced" different sorting
| algorithms visually but I don't remember which website, or
| which oddball algorithm that stood out to me as being
| especially good for "continuous partial ordering".
|
| Most sorts are designed to "really sort something", however
| I'm very interested in eg: ranking preferences of movies,
| and in cases like that there may never be "the one true
| ordering" ... however the mechanism for getting things
| "closer to sorted" is incredibly close to the mechanism you
| mentioned: multiple passes, and continual, incremental
| progress.
|
| My imaginary GUI for interactive sorting in this manner is
| something like starting with a few "A vs. B" of a generally
| accepted "top 10/100 vs generic v pretty bad", and then
| morph to a "drag this new movie in between a few other
| movies either above, between, or below"
|
| There are sorting algorithms for computers (eg: compare +
| swap), but fewer sorting algorithms for humans (eg: bucket
| sort / postman's sort -
| https://en.wikipedia.org/wiki/Bucket_sort#Postman's_sort )
| which can take advantage of "not just compare and swap" to
| get things "close to correct" quicker.
| janc_ wrote:
| When this code was written there was no "amd64" though, so
| you would have to compare them on the original/typical
| hardware from then?
| [deleted]
| emmelaich wrote:
| Why didn't they just use qsort, which is in libc?
| masklinn wrote:
| It's kernel code, so no libc?
___________________________________________________________________
(page generated 2023-08-21 23:02 UTC)