[HN Gopher] Blitsort is an in-place stable adaptive rotate merge...
       ___________________________________________________________________
        
       Blitsort is an in-place stable adaptive rotate merge sort
        
       Author : signa11
       Score  : 91 points
       Date   : 2021-07-09 10:38 UTC (1 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | scandum wrote:
       | Might be worth noting that it's faster than quicksort,
       | std::stable_sort, and timsort.
        
         | jtc331 wrote:
         | On certain data sets.
        
         | CJefferson wrote:
         | I wonder why they don't compare against std::sort? That seems
         | like most people's standard choice.
        
         | mlochbaum wrote:
         | These are all pretty slow on 32-bit integers, which you're
         | using for testing. Could you please include a comparison with
         | pdqsort, or at least mention that pdqsort is to be preferred
         | when an element fits in a register (as it certainly will be if
         | you're only beating timsort by 20% or so)? It's frustrating to
         | see sorting algorithms with some effort spent on benchmarking
         | and still have to guess at whether it's practical based on how
         | much slower the comparison points are than the state of the
         | art.
         | 
         | Radix sorts (like your own wolfsort!) will also be much faster
         | on the benchmarks presented, although that's sort of misleading
         | since they are very good at uniformly random input and not so
         | good at other distributions.
        
           | scandum wrote:
           | pdqsort isn't stable and still suffers from killer inputs.
           | 
           | Not sure how well this will paste:                     Name |
           | Items | Type |     Best |  Average |     Loops | Samples |
           | Distribution       -------- | -------- | ---- | -------- |
           | -------- | --------- | ------- | ----------------
           | blitsort |   100000 |   32 | 0.005962 | 0.006532 |         1
           | |     100 |     random order        pdqsort |   100000 |   32
           | | 0.002665 | 0.002802 |         1 |     100 |     random
           | order                |          |      |          |
           | |           |         |       blitsort |   100000 |   32 |
           | 0.000069 | 0.000078 |         1 |     100 |  ascending order
           | pdqsort |   100000 |   32 | 0.000098 | 0.000100 |         1 |
           | 100 |  ascending order                |          |      |
           | |          |           |         |       blitsort |   100000
           | |   32 | 0.001138 | 0.001196 |         1 |     100 |
           | ascending saw        pdqsort |   100000 |   32 | 0.003245 |
           | 0.003317 |         1 |     100 |    ascending saw
           | |          |      |          |          |           |
           | |       blitsort |   100000 |   32 | 0.003777 | 0.003843 |
           | 1 |     100 |    generic order        pdqsort |   100000 |
           | 32 | 0.000819 | 0.000851 |         1 |     100 |    generic
           | order                |          |      |          |
           | |           |         |       blitsort |   100000 |   32 |
           | 0.000051 | 0.000060 |         1 |     100 | descending order
           | pdqsort |   100000 |   32 | 0.000202 | 0.000207 |         1 |
           | 100 | descending order                |          |      |
           | |          |           |         |       blitsort |   100000
           | |   32 | 0.000815 | 0.000845 |         1 |     100 |
           | descending saw        pdqsort |   100000 |   32 | 0.002307 |
           | 0.002368 |         1 |     100 |   descending saw
           | |          |      |          |          |           |
           | |       blitsort |   100000 |   32 | 0.001761 | 0.001774 |
           | 1 |     100 |      random tail        pdqsort |   100000 |
           | 32 | 0.002560 | 0.002572 |         1 |     100 |      random
           | tail                |          |      |          |          |
           | |         |       blitsort |   100000 |   32 | 0.003328 |
           | 0.003374 |         1 |     100 |      random half
           | pdqsort |   100000 |   32 | 0.002651 | 0.002854 |         1 |
           | 100 |      random half                |          |      |
           | |          |           |         |       blitsort |   100000
           | |   32 | 0.000593 | 0.000938 |         1 |     100 |
           | ascending tiles        pdqsort |   100000 |   32 | 0.002316 |
           | 0.002482 |         1 |     100 |  ascending tiles
        
             | nightcracker wrote:
             | What do you mean by "suffers from killer inputs"? It is
             | always O(n log n) by falling back to heapsort, so I don't
             | think any input can be described as a "killer".
        
             | mlochbaum wrote:
             | You should explain these issues in the repository, not in
             | Hacker News comments!
             | 
             | A benchmark of 100,000 32 bit integers suggests that you
             | will be comparing against algorithms that do well on that
             | benchmark when it's not the case at all. The concept of
             | stability doesn't apply when elements that compare the same
             | are indistinguishable, which is the case for integers. And
             | I think it's an exaggeration to say pdqsort has "killer
             | inputs". In your benchmarks the patterned data always sorts
             | faster than random, so the most you can say is that merge
             | sort exploits some kinds of structure better (I _have_ seen
             | patterns where it does worse, but not by much). I expect
             | timsort has some similar advantages over blitsort because
             | of its ability to find natural runs while blitsort always
             | starts at size 32--try an up-down pattern with runs of
             | length 50 for instance. The absolute worse case for pdqsort
             | is a fallback to heapsort. With pivot randomization it 's
             | pretty much impossible on large arrays unless they are
             | engineered to make pdqsort fail, and you end up sorting the
             | array three times slower or something. Not much of a DOS
             | attack.
             | 
             | Having a fast merge sort is definitely useful in a lot of
             | situations (my thoughts at [1]), and I've been interested
             | in quadsort particularly for that purpose (blitsort seems
             | to emphasize lower external memory usage which I've never
             | had a use for). Half as fast as pdqsort on random data is
             | actually somewhat better than I expected. It feels
             | dishonest to see "Blitsort has exceptional performance"
             | followed by selective benchmarks showing worse algorithms,
             | and it makes it much harder to figure out what blitsort
             | _is_ good for.
             | 
             | [1] https://mlochbaum.github.io/BQN/implementation/primitiv
             | e/sor...
        
               | scandum wrote:
               | Well, blitsort's performance is exceptional in the sense
               | that it's 15% faster than the next fastest stable in-
               | place sort.
               | 
               | As for stability, it's useful when you need to sort a
               | table with an integer index. So it might be of value to
               | database software.
               | 
               | I probably should point out some of the weakness on the
               | README.
               | 
               | Agreed on pdqsort being 3x slower worst case on "killer"
               | input, but it's my understanding that's why it's not
               | being used to replace std::sort.
               | 
               | As for runs of 50, haven't benched those, though I assume
               | gridsort (https://github.com/scandum/gridsort) would
               | handle those rather well.
        
               | nightcracker wrote:
               | > Agreed on pdqsort being 3x slower worst case on
               | "killer" input, but it's my understanding that's why it's
               | not being used to replace std::sort.
               | 
               | This is just not true. Introsort (the current std::sort
               | in all C++ compilers I know of) also has 'killer' inputs,
               | and will switch to heapsort twice as slow than than
               | pdqsort does in those. Worse, the libc++ implementation
               | of std::sort has a quadratic killer sequence that I found
               | 7 years ago, and it is still not fixed:
               | https://bugs.llvm.org/show_bug.cgi?id=20837. In my
               | opinion there is no good reason that pdqsort is not
               | adopted yet as std::sort as it is faster for many
               | patterns, faster for random input and rarely if ever
               | slower.
               | 
               | In fact, pdqsort _is_ the standard unstable sorting
               | algorithm in Rust: https://doc.rust-
               | lang.org/std/vec/struct.Vec.html#method.sor...
               | 
               | Note that any deterministic quicksort variant that does
               | not spend a lot of time on choosing its pivots will
               | _always_ have some  'killer' pattern that forces it to
               | use its fallback algorithm (or worse, go O(n^2)), as you
               | can always shuffle the input array such that poor pivots
               | are chosen. What pdqsort does is shuffle things around so
               | that those 'killer' patterns are a lot less trivial, and
               | are not as likely to occur in real-world input. For me
               | determinism was very important so I didn't do it in
               | https://github.com/orlp/pdqsort, but note that the
               | variant implemented in Rust uses non-deterministic
               | swapping and thus it is _impossible_ to force a  'killer'
               | input against it.
        
               | scandum wrote:
               | Given how fast pdq is there's indeed no reason not to
               | adopt it. Probably the usual 'not invented here' mindset.
        
               | mlochbaum wrote:
               | I guess that makes sense, but I never would have guessed
               | that the category is in-place stable algorithms as it's
               | only mentioned in the "about" line and not the text
               | itself.
               | 
               | Sorting on an integer key isn't the same as sorting a
               | list of integers. They have different performance
               | considerations.
        
               | yunohn wrote:
               | > The concept of stability doesn't apply when elements
               | that compare the same are indistinguishable, which is the
               | case for integers.
               | 
               | If you're considering the raw comparison value, then
               | sure, stability isn't applicable. But it's really rare
               | that people are just sorting normal lists of numbers.
               | 
               | Stable sorts come into the picture when the comparisons
               | are referring to records, ie rows in a DB.
        
       | amelius wrote:
       | As a layman I'd expect advanced sorting algorithms to depend on
       | the dimensions of the underlying memory architecture, e.g. size
       | of CPU caches, and memory latency, bandwidth and such.
        
       | mosaic_school wrote:
       | Love this!
       | 
       | Functions look clean and fas Also great README (documentation,
       | evaluation, ..)
       | 
       | I did not see a license file though. Is the repository intended
       | as public domain?
       | 
       | UPDATE: thanks, apparently I'm trained to skip file head
       | sections.
        
         | detaro wrote:
         | All the source files have a license in them?
        
         | kzrdude wrote:
         | See the source files for a license.
        
       | kzrdude wrote:
       | It's using a qsort-like interface using a function pointer for
       | the comparison - surely a C++ like implementation has a lot to
       | gain here?
        
         | scandum wrote:
         | Correct, if you uncomment                 //#define cmp(a,b)
         | (*(a) > *(b))
         | 
         | in blitsort.h it'll run about 25% faster. Probably still slower
         | than a native C++ implementation since it'll evaluate to (a >
         | b) > 0 rather than (a > b), not sure if the compiler will
         | optimize that.
        
       | beeforpork wrote:
       | What is the license for this code?
        
         | hypertele-Xii wrote:
         | Copyright (C) 2014-2021 Igor van den Hoven ivdhoven@gmail.com
         | 
         | Permission is hereby granted, free of charge, to any person
         | obtaining a copy of this software and associated documentation
         | files (the "Software"), to deal in the Software without
         | restriction, including without limitation the rights to use,
         | copy, modify, merge, publish, distribute, sublicense, and/or
         | sell copies of the Software, and to permit persons to whom the
         | Software is furnished to do so, subject to the following
         | conditions:
         | 
         | The above copyright notice and this permission notice shall be
         | included in all copies or substantial portions of the Software.
         | 
         | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
         | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
         | OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
         | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
         | HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
         | WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
         | FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
         | OTHER DEALINGS IN THE SOFTWARE.
        
           | StefanKarpinski wrote:
           | I.e. MIT license
        
       | lpapez wrote:
       | _Blitsort supports long doubles and 8, 16, 32, and 64 bit data
       | types. By using 32 or 64 bit pointers it 's possible to sort any
       | other data type._
       | 
       | Seems quite limited, wonder how it compares to radix sort.
        
         | acmj wrote:
         | In the source code, it is not apparent where the size
         | restriction comes from. It seems to me that it can be easily
         | modified to work with any data types. Maybe I am wrong. I am
         | more concerned with this sentence:
         | 
         | > _Blitsort 's performance is similar to that of quadsort as
         | long as the auxiliary memory is greater or equal to the square
         | root of the array being sorted, which comes out at 262,144
         | elements with the default stack of 512 elements. Performance on
         | larger arrays degrades marginally._
         | 
         | I wonder what "marginally" means here. What if we sort 10
         | million integers?
        
           | scandum wrote:
           | Blitsort can sort strings, but something like a 12 byte data
           | structure would require an array with pointer references to
           | sort.
           | 
           | As for the degradation against std:stable_sort:
           | 
           | 5% slower at 1 million, 10% slower at 10 million, 20% slower
           | at 100 million.
           | 
           | With sqrt n auxiliary you're looking at 2%, 4%, 6% slower.
           | 
           | Against qsort() it remains faster at 10 million, 3% slower at
           | 100 million.
        
             | acmj wrote:
             | I am thinking more about sorting, say, 20-byte or 24-byte
             | custom structs. Directly sorting arrays is probably faster
             | than sorting pointers to arrays. At least in my
             | applications, I rarely just sort numbers; I more often sort
             | numbers along with other data associated with the numbers.
        
               | scandum wrote:
               | Hard to say which would be faster, I've made a mental
               | note to benchmark sorting 16 byte long doubles through
               | pointers as well as directly.
               | 
               | I never tried, but it should be possible (and relatively
               | easy) to add custom sizes in the .h file.
        
           | nuclearnice3 wrote:
           | Is the included bench.c the right way to answer that
           | question?                 1m random dist size 128 avg qsort
           | 0.206549 blitsort 0.281630       10m random dist size 32 avg
           | qsort 2.963479 blitsort 4.394143       100m random dist size
           | 128 avg qsort 34.996847 blitsort 54.102616
        
             | scandum wrote:
             | It is, those numbers are less favorable than they are on my
             | own machine.
             | 
             | The speed of your RAM memory is likely to have an influence
             | on performance. My system is running at 2133MHz.
        
               | [deleted]
        
         | bradleyjg wrote:
         | Radix sort wouldn't allow for the "possible to sort any other
         | data type" part.
         | 
         | I do wonder about not enabling inlining structs. I don't know
         | where the crossover happens, and it certainly varies with
         | hardware characteristics, but I'm sure a data structure made up
         | of say an int and two longs would better be sorted in
         | contiguous memory rather than as an array of pointers to random
         | places in the heap.
        
           | lpapez wrote:
           | Exactly. It has the same limitations as a radix sort, which
           | is why I find them comparable
        
             | bradleyjg wrote:
             | For a radix sort it's impossible to sort linked data
             | structures, at least as I understand it.
             | 
             | Whereas any comparison based sort can follow the pointers
             | and do arbitrary comparisons.
        
       ___________________________________________________________________
       (page generated 2021-07-10 23:01 UTC)