[HN Gopher] Blitsort is an in-place stable adaptive rotate merge...
___________________________________________________________________
Blitsort is an in-place stable adaptive rotate merge sort
Author : signa11
Score : 91 points
Date : 2021-07-09 10:38 UTC (1 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| scandum wrote:
| Might be worth noting that it's faster than quicksort,
| std::stable_sort, and timsort.
| jtc331 wrote:
| On certain data sets.
| CJefferson wrote:
| I wonder why they don't compare against std::sort? That seems
| like most people's standard choice.
| mlochbaum wrote:
| These are all pretty slow on 32-bit integers, which you're
| using for testing. Could you please include a comparison with
| pdqsort, or at least mention that pdqsort is to be preferred
| when an element fits in a register (as it certainly will be if
| you're only beating timsort by 20% or so)? It's frustrating to
| see sorting algorithms with some effort spent on benchmarking
| and still have to guess at whether it's practical based on how
| much slower the comparison points are than the state of the
| art.
|
| Radix sorts (like your own wolfsort!) will also be much faster
| on the benchmarks presented, although that's sort of misleading
| since they are very good at uniformly random input and not so
| good at other distributions.
| scandum wrote:
| pdqsort isn't stable and still suffers from killer inputs.
|
| Not sure how well this will paste: Name |
| Items | Type | Best | Average | Loops | Samples |
| Distribution -------- | -------- | ---- | -------- |
| -------- | --------- | ------- | ----------------
| blitsort | 100000 | 32 | 0.005962 | 0.006532 | 1
| | 100 | random order pdqsort | 100000 | 32
| | 0.002665 | 0.002802 | 1 | 100 | random
| order | | | |
| | | | blitsort | 100000 | 32 |
| 0.000069 | 0.000078 | 1 | 100 | ascending order
| pdqsort | 100000 | 32 | 0.000098 | 0.000100 | 1 |
| 100 | ascending order | | |
| | | | | blitsort | 100000
| | 32 | 0.001138 | 0.001196 | 1 | 100 |
| ascending saw pdqsort | 100000 | 32 | 0.003245 |
| 0.003317 | 1 | 100 | ascending saw
| | | | | | |
| | blitsort | 100000 | 32 | 0.003777 | 0.003843 |
| 1 | 100 | generic order pdqsort | 100000 |
| 32 | 0.000819 | 0.000851 | 1 | 100 | generic
| order | | | |
| | | | blitsort | 100000 | 32 |
| 0.000051 | 0.000060 | 1 | 100 | descending order
| pdqsort | 100000 | 32 | 0.000202 | 0.000207 | 1 |
| 100 | descending order | | |
| | | | | blitsort | 100000
| | 32 | 0.000815 | 0.000845 | 1 | 100 |
| descending saw pdqsort | 100000 | 32 | 0.002307 |
| 0.002368 | 1 | 100 | descending saw
| | | | | | |
| | blitsort | 100000 | 32 | 0.001761 | 0.001774 |
| 1 | 100 | random tail pdqsort | 100000 |
| 32 | 0.002560 | 0.002572 | 1 | 100 | random
| tail | | | | |
| | | blitsort | 100000 | 32 | 0.003328 |
| 0.003374 | 1 | 100 | random half
| pdqsort | 100000 | 32 | 0.002651 | 0.002854 | 1 |
| 100 | random half | | |
| | | | | blitsort | 100000
| | 32 | 0.000593 | 0.000938 | 1 | 100 |
| ascending tiles pdqsort | 100000 | 32 | 0.002316 |
| 0.002482 | 1 | 100 | ascending tiles
| nightcracker wrote:
| What do you mean by "suffers from killer inputs"? It is
| always O(n log n) by falling back to heapsort, so I don't
| think any input can be described as a "killer".
| mlochbaum wrote:
| You should explain these issues in the repository, not in
| Hacker News comments!
|
| A benchmark of 100,000 32 bit integers suggests that you
| will be comparing against algorithms that do well on that
| benchmark when it's not the case at all. The concept of
| stability doesn't apply when elements that compare the same
| are indistinguishable, which is the case for integers. And
| I think it's an exaggeration to say pdqsort has "killer
| inputs". In your benchmarks the patterned data always sorts
| faster than random, so the most you can say is that merge
| sort exploits some kinds of structure better (I _have_ seen
| patterns where it does worse, but not by much). I expect
| timsort has some similar advantages over blitsort because
| of its ability to find natural runs while blitsort always
| starts at size 32--try an up-down pattern with runs of
| length 50 for instance. The absolute worse case for pdqsort
| is a fallback to heapsort. With pivot randomization it 's
| pretty much impossible on large arrays unless they are
| engineered to make pdqsort fail, and you end up sorting the
| array three times slower or something. Not much of a DOS
| attack.
|
| Having a fast merge sort is definitely useful in a lot of
| situations (my thoughts at [1]), and I've been interested
| in quadsort particularly for that purpose (blitsort seems
| to emphasize lower external memory usage which I've never
| had a use for). Half as fast as pdqsort on random data is
| actually somewhat better than I expected. It feels
| dishonest to see "Blitsort has exceptional performance"
| followed by selective benchmarks showing worse algorithms,
| and it makes it much harder to figure out what blitsort
| _is_ good for.
|
| [1] https://mlochbaum.github.io/BQN/implementation/primitiv
| e/sor...
| scandum wrote:
| Well, blitsort's performance is exceptional in the sense
| that it's 15% faster than the next fastest stable in-
| place sort.
|
| As for stability, it's useful when you need to sort a
| table with an integer index. So it might be of value to
| database software.
|
| I probably should point out some of the weakness on the
| README.
|
| Agreed on pdqsort being 3x slower worst case on "killer"
| input, but it's my understanding that's why it's not
| being used to replace std::sort.
|
| As for runs of 50, haven't benched those, though I assume
| gridsort (https://github.com/scandum/gridsort) would
| handle those rather well.
| nightcracker wrote:
| > Agreed on pdqsort being 3x slower worst case on
| "killer" input, but it's my understanding that's why it's
| not being used to replace std::sort.
|
| This is just not true. Introsort (the current std::sort
| in all C++ compilers I know of) also has 'killer' inputs,
| and will switch to heapsort twice as slow than than
| pdqsort does in those. Worse, the libc++ implementation
| of std::sort has a quadratic killer sequence that I found
| 7 years ago, and it is still not fixed:
| https://bugs.llvm.org/show_bug.cgi?id=20837. In my
| opinion there is no good reason that pdqsort is not
| adopted yet as std::sort as it is faster for many
| patterns, faster for random input and rarely if ever
| slower.
|
| In fact, pdqsort _is_ the standard unstable sorting
| algorithm in Rust: https://doc.rust-
| lang.org/std/vec/struct.Vec.html#method.sor...
|
| Note that any deterministic quicksort variant that does
| not spend a lot of time on choosing its pivots will
| _always_ have some 'killer' pattern that forces it to
| use its fallback algorithm (or worse, go O(n^2)), as you
| can always shuffle the input array such that poor pivots
| are chosen. What pdqsort does is shuffle things around so
| that those 'killer' patterns are a lot less trivial, and
| are not as likely to occur in real-world input. For me
| determinism was very important so I didn't do it in
| https://github.com/orlp/pdqsort, but note that the
| variant implemented in Rust uses non-deterministic
| swapping and thus it is _impossible_ to force a 'killer'
| input against it.
| scandum wrote:
| Given how fast pdq is there's indeed no reason not to
| adopt it. Probably the usual 'not invented here' mindset.
| mlochbaum wrote:
| I guess that makes sense, but I never would have guessed
| that the category is in-place stable algorithms as it's
| only mentioned in the "about" line and not the text
| itself.
|
| Sorting on an integer key isn't the same as sorting a
| list of integers. They have different performance
| considerations.
| yunohn wrote:
| > The concept of stability doesn't apply when elements
| that compare the same are indistinguishable, which is the
| case for integers.
|
| If you're considering the raw comparison value, then
| sure, stability isn't applicable. But it's really rare
| that people are just sorting normal lists of numbers.
|
| Stable sorts come into the picture when the comparisons
| are referring to records, ie rows in a DB.
| amelius wrote:
| As a layman I'd expect advanced sorting algorithms to depend on
| the dimensions of the underlying memory architecture, e.g. size
| of CPU caches, and memory latency, bandwidth and such.
| mosaic_school wrote:
| Love this!
|
| Functions look clean and fas Also great README (documentation,
| evaluation, ..)
|
| I did not see a license file though. Is the repository intended
| as public domain?
|
| UPDATE: thanks, apparently I'm trained to skip file head
| sections.
| detaro wrote:
| All the source files have a license in them?
| kzrdude wrote:
| See the source files for a license.
| kzrdude wrote:
| It's using a qsort-like interface using a function pointer for
| the comparison - surely a C++ like implementation has a lot to
| gain here?
| scandum wrote:
| Correct, if you uncomment //#define cmp(a,b)
| (*(a) > *(b))
|
| in blitsort.h it'll run about 25% faster. Probably still slower
| than a native C++ implementation since it'll evaluate to (a >
| b) > 0 rather than (a > b), not sure if the compiler will
| optimize that.
| beeforpork wrote:
| What is the license for this code?
| hypertele-Xii wrote:
| Copyright (C) 2014-2021 Igor van den Hoven ivdhoven@gmail.com
|
| Permission is hereby granted, free of charge, to any person
| obtaining a copy of this software and associated documentation
| files (the "Software"), to deal in the Software without
| restriction, including without limitation the rights to use,
| copy, modify, merge, publish, distribute, sublicense, and/or
| sell copies of the Software, and to permit persons to whom the
| Software is furnished to do so, subject to the following
| conditions:
|
| The above copyright notice and this permission notice shall be
| included in all copies or substantial portions of the Software.
|
| THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
| EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
| OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
| NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
| HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
| WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
| FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
| OTHER DEALINGS IN THE SOFTWARE.
| StefanKarpinski wrote:
| I.e. MIT license
| lpapez wrote:
| _Blitsort supports long doubles and 8, 16, 32, and 64 bit data
| types. By using 32 or 64 bit pointers it 's possible to sort any
| other data type._
|
| Seems quite limited, wonder how it compares to radix sort.
| acmj wrote:
| In the source code, it is not apparent where the size
| restriction comes from. It seems to me that it can be easily
| modified to work with any data types. Maybe I am wrong. I am
| more concerned with this sentence:
|
| > _Blitsort 's performance is similar to that of quadsort as
| long as the auxiliary memory is greater or equal to the square
| root of the array being sorted, which comes out at 262,144
| elements with the default stack of 512 elements. Performance on
| larger arrays degrades marginally._
|
| I wonder what "marginally" means here. What if we sort 10
| million integers?
| scandum wrote:
| Blitsort can sort strings, but something like a 12 byte data
| structure would require an array with pointer references to
| sort.
|
| As for the degradation against std:stable_sort:
|
| 5% slower at 1 million, 10% slower at 10 million, 20% slower
| at 100 million.
|
| With sqrt n auxiliary you're looking at 2%, 4%, 6% slower.
|
| Against qsort() it remains faster at 10 million, 3% slower at
| 100 million.
| acmj wrote:
| I am thinking more about sorting, say, 20-byte or 24-byte
| custom structs. Directly sorting arrays is probably faster
| than sorting pointers to arrays. At least in my
| applications, I rarely just sort numbers; I more often sort
| numbers along with other data associated with the numbers.
| scandum wrote:
| Hard to say which would be faster, I've made a mental
| note to benchmark sorting 16 byte long doubles through
| pointers as well as directly.
|
| I never tried, but it should be possible (and relatively
| easy) to add custom sizes in the .h file.
| nuclearnice3 wrote:
| Is the included bench.c the right way to answer that
| question? 1m random dist size 128 avg qsort
| 0.206549 blitsort 0.281630 10m random dist size 32 avg
| qsort 2.963479 blitsort 4.394143 100m random dist size
| 128 avg qsort 34.996847 blitsort 54.102616
| scandum wrote:
| It is, those numbers are less favorable than they are on my
| own machine.
|
| The speed of your RAM memory is likely to have an influence
| on performance. My system is running at 2133MHz.
| [deleted]
| bradleyjg wrote:
| Radix sort wouldn't allow for the "possible to sort any other
| data type" part.
|
| I do wonder about not enabling inlining structs. I don't know
| where the crossover happens, and it certainly varies with
| hardware characteristics, but I'm sure a data structure made up
| of say an int and two longs would better be sorted in
| contiguous memory rather than as an array of pointers to random
| places in the heap.
| lpapez wrote:
| Exactly. It has the same limitations as a radix sort, which
| is why I find them comparable
| bradleyjg wrote:
| For a radix sort it's impossible to sort linked data
| structures, at least as I understand it.
|
| Whereas any comparison based sort can follow the pointers
| and do arbitrary comparisons.
___________________________________________________________________
(page generated 2021-07-10 23:01 UTC)