hngopher.com

       [HN Gopher] My new rust binary search
       ___________________________________________________________________
        
       My new rust binary search
        
       Author : ChadNauseam
       Score  : 33 points
       Date   : 2024-09-07 19:10 UTC (3 hours ago)
        
 (HTM) web link (chadnauseam.com)
 (TXT) w3m dump (chadnauseam.com)
        
       | gnuvince wrote:
       | > Just look at the code. How could it be simpler?
       | 
       | Cannot tell if this is a joke.
        
         | Aeolos wrote:
         | Which part do you find complicated?
         | 
         | The code itself looks very straightforward, two preconditions
         | and a loop. I don't see any significant difference in terms of
         | complexity compared to any other low-level systems programming
         | language. It's obviously a bit more verbose than python or
         | haskell, but those are not targeting the same audience.
        
         | IshKebab wrote:
         | It looks extremely simple to me. I definitely have never seen a
         | simpler binary search implementation (though in fairness it
         | punts the midpoint calculation). Why do you think it's a joke?
        
       | IncreasePosts wrote:
       | > This one also makes that assumption, but in some cases if that
       | assumption is violated it returns an error rather than silently
       | returning a meaningless
       | 
       | Sometimes returning an error and sometimes returning an incorrect
       | value doesn't sound like a binary search that is "absolutely
       | reliable".
       | 
       | Also, s/consolation/consultation/
        
         | fallingsquirrel wrote:
         | It seems to me that it's impossible for a binary search to
         | detect all errors. The precondition is that the list must be
         | sorted, and to fully verify that assumption would be O(n). That
         | would make your binary search no better than a linear search.
         | So I can forgive the best-effort error checking.
        
         | edflsafoiewq wrote:
         | It doesn't seem possible to determine if the predicate is
         | monotonic without doing a full scan, thus not being a binary
         | search.
        
         | Almondsetat wrote:
         | Binary search only works if you suppose the data is well
         | structured. If you perform the search and in the meantime you
         | spot a passing inconsistency, you can interrupt everything and
         | throw an error. But other than that, checking the entire
         | structure negates the entire point of the algorithm
        
       | eknkc wrote:
       | While the implementation is straightforward enough, this post
       | clearly shows me why I don't like Rust.
       | 
       | So, the function accepts two values that implement Mid trait. And
       | there is a blanket implementation for Mid that applies to
       | anything can be added, substracted, divided etc and can be
       | derived from a u8. So now this will make every numeric type a
       | Mid. (Which, as the author states only an example I guess)
       | 
       | I can read and understand how this works here. But when I'm in a
       | larger code base, I can not find what trait implementation made
       | what possible.
       | 
       | With all the trait bounds and implicit conversions happening
       | around and macros implementing traits on stuff it becomes a shit
       | stew in a hurry.
       | 
       | And everyone always reaches to the most clever way to do things.
       | I mean, does Mid need to be a trait? I guess it works beautifully
       | yeah. I'd prefer to pass a midpoint calculator function instead.
       | 
       | I guess this turned into a rant. I actually really like Rust the
       | language but I think I don't like Rust codebases basically.
       | 
       | It is the least readable language I've seen not because of the
       | syntax but because of all the cleverness going on around the
       | traits and macros.
        
         | worik wrote:
         | > I actually really like Rust the language but I think I don't
         | like Rust codebases basically.
         | 
         | And
         | 
         | > everyone always reaches to the most clever way to do things.
         | 
         | Yes oh yes
         | 
         | Rust was a brilliant idea spoilt by too much cleverness
         | 
         | I'm hoping for Rust II. A smaller language, with a bigger
         | standard library and no dependence on run times.
         | 
         | By "no dependence on run times." I mean no async/await. It is
         | an abomination in Rust
        
           | Conscat wrote:
           | There already exist many affine-typed languages with fewer
           | features than Rust. When you phrase it like "Rust but with
           | weaker type safety and higher-overhead abstractions" one
           | might see why this premise fails to gain much traction,
           | though.
        
           | aw1621107 wrote:
           | > By "no dependence on run times." I mean no async/await. It
           | is an abomination in Rust
           | 
           | Out of curiosity, what alternative would you have preferred
           | to see, if any?
        
         | aaronblohowiak wrote:
         | I have good news for you, it gets worse! Look at very large and
         | old ruby codebases :)
        
         | LegionMammal978 wrote:
         | Yeah, trait-heavy code is one of Rust's weaker points in
         | practice. I find tower and some of the database crates to be
         | some of the worst offenders: even knowing the language pretty
         | well, I've often gotten lost in a maze of projections and
         | blanket impls when an error occurs. There's no easy way to
         | trace the implementation back to the intended behavior, to
         | compare the expected and actual outcomes.
        
           | pimeys wrote:
           | Also tracing and rust-opentelemetry gives you a pretty wild
           | ride if needing to do something more special with them.
        
         | forrestthewoods wrote:
         | I like Rust a lot, but I strongly agree. "Advanced" Rust
         | becomes as inscrutable as C++ template crap.
         | 
         | I really wish Rust had a way to define "template style" generic
         | functions rather than "trait style". Some things simply can not
         | be done gracefully with trait spaghetti.
        
           | Ygg2 wrote:
           | > I really wish Rust had a way to define "template style"
           | 
           | It has! It's called proc macro [1]. I'm only half joking.
           | 
           | [1] It can generate anything, errors generally suck ass in it
           | as much in C++ templates and it feels similarly brittle.
           | 
           | > "Advanced" Rust becomes as inscrutable as C++ template
           | crap.
           | 
           | I think people are abusing traits a bit too much. They are
           | very nifty and often compile to very optimized code.
           | 
           | That said this trait is not an example of such abuse. Having
           | a trait to help with overloading is the most benign form of
           | trait use.
           | 
           | Solution to grandparent's problem is called rust analyzer. It
           | will show you what methods are usable on your variable.
        
         | ChadNauseam wrote:
         | Having the function take a midpoint calculator is fine and
         | there are some advantages to it (e.g. it makes it easy to turn
         | into a linear search if you want to do that for some reason).
         | However since writing a midpoint function isn't always totally
         | straightforward (an incorrectly implemented midpoint will cause
         | correctness issues or even infinite loops), I think it's nice
         | to have it as a trait. That way there's little risk of someone
         | using a bad `|x,y| (x + y) / 2` implementation out of laziness.
         | 
         | However I 100% agree with you that traits make it really hard
         | to figure out what actual function you're calling. I wish there
         | was a way to see something like "in your code, this generic
         | function is called with types T1, T2, etc." and see the trait
         | implementations for those types as well. I've actually wanted
         | to contribute this to rust-analyzer for a while.
        
       | rurban wrote:
       | You can omit the runtime sorted checks via proper types. Only
       | accept sorted arrays or lists, and enforce them with such a
       | class. This would make it convenient, faster and safer. It would
       | also take the sort check from the object, and not as extra
       | argument.
        
         | im3w1l wrote:
         | In theory I like this, but it raises the question of how it
         | would work in practice; in practice people create non-sorted
         | lists and then sort them in-place. Maybe you could have the
         | sort method return a sorted-token that certifies that a given
         | array is sorted. And the token would read-only borrow the
         | array, to ensure no one can modify the array as long as the
         | token is valid.
         | 
         | Is it even possible for a rust function to take a mutable
         | borrow as a parameter, decay it to an immutable borrow and
         | store that somewhere, such that other people can borrow the
         | parameter after the function has returned?
        
           | aw1621107 wrote:
           | > Is it even possible for a rust function to take a mutable
           | borrow as a parameter, decay it to an immutable borrow and
           | store that somewhere, such that other people can borrow the
           | parameter after the function has returned?
           | 
           | I think it's possible under some scenarios with the
           | appropriate lifetimes? For example,
           | https://rust.godbolt.org/z/6fe8c4sf4:
           | struct Magic<'a> {             reference: &'a i32         }
           | impl<'a> Magic<'a> {             fn new<'b: 'a>(source: &'b
           | mut i32) -> Magic<'a> {                 Magic { reference:
           | source }             }         }                  fn
           | double(i: &i32) -> i32 {             i * 2         }
           | fn main() {             let mut i = 0;             let m =
           | Magic::new(&mut i);             double(m.reference);
           | }
           | 
           | I think this will stop working once the compiler can no
           | longer see that the lifetimes work out, but I think that
           | makes sense - there's a mutable object underlying the mutable
           | borrow, and the compiler needs to verify that no one else can
           | get another mutable borrow while the derived immutable
           | borrows are live.
        
       | re wrote:
       | > _You want one that doesn 't require a consolation with a
       | manual_ to remember if it returns the first value that's greater
       | or the last value that's lower. ... No more needing to remember
       | "does this return the highest index whose value is lower than the
       | value I'm searching for, or the lowest index whose value is
       | greater than the value I'm searching for?". This one returns
       | both, so you can use whichever you need for your use case.
       | 
       | This ( _italicized_ part) feels like a weird requirement.
       | Function /API documentation (which this author's code lacks) is
       | important, and I don't think I ever write a call to an unfamiliar
       | function without checking the docs for it. Assuming that you know
       | what a function does based on the name alone is a recipe for
       | buggy code.
       | 
       | This is also a kind of unusual formulation of binary search which
       | searches across a monotonic function space rather than a
       | sequence. I get that this is more general and can be used to
       | implement the latter, but a less general interface that only
       | works on sequences is IMO _more_ intuitive: in that case, I
       | expect it to return either the index of the target number, or the
       | index where it should be inserted if not found. It 's somewhat
       | telling that the author doesn't offer any code examples
       | demonstrating how to use this wonderful general function to solve
       | the common use case of finding an element in a sorted vector, or
       | inserting a new one into the correct location.
        
         | edflsafoiewq wrote:
         | The interface is nicer for keyframe interpolation eg. the
         | problem where you have a list of (X,Y) pairs and you want to
         | find the two pairs that straddle a given x and interpolate
         | between them. In that case, search(|&i| v[i].0 >= x, 0, v.len()
         | - 1) handles the three cases "before the beginning", "after the
         | end", and "between two pairs" fairly naturally.
        
       | leni536 wrote:
       | > The only case where it returns (None, None) are those when it
       | detects non-monotonicity in your predicate (returning false when
       | a lower input returned true). So if you know your predicate is
       | monotonic, feel free to hit this case with a panic!("should never
       | happen").
       | 
       | Empty range?
        
         | edflsafoiewq wrote:
         | No such thing, the range is inclusive of the l and r endpoints.
        
       | orlp wrote:
       | This binary search is pretty bad in my opinion. Some issues:
       | 
       | 1. It doesn't support the empty array, because l and r are both
       | inclusive.
       | 
       | 2. It performs way more predicate tests than necessary.
       | 
       | 3. An incorrectly implemented Mid trait causes a silent infinite
       | loop.
       | 
       | 4. If a fundamental programming error in the sanity check is
       | detected it just silently returns Nones instead of panicking.
        
         | ramon156 wrote:
         | 3 is not a good reason. i It's the developer's job to not screw
         | up. What's the alternative?
        
           | orlp wrote:
           | I would normally agree, but the author went out of their way
           | to check for error conditions in their binary search. I think
           | a silent infinite loop is antithetical to that.
        
           | gpm wrote:
           | This is a case where less abstraction would make the bug a
           | lot more obvious. Binary search is simple enough that I don't
           | think a generic implementation of it is worth introducing the
           | possibilities of errors at a distance like that one.
        
       | mmastrac wrote:
       | Style nits: this would be much better with an enum Return:
       | 
       | enum Found { Empty, Before(T), After(T), Between(T, T), }
       | 
       | ... and a range parameter instead of L and R.
       | 
       | And, TBH, this should probably just panic! on an invalid
       | predicate rather than returning an error. Programming errors
       | should panic.
        
         | ChadNauseam wrote:
         | All good points, thank you. A range parameter would be
         | especially good because I think you could use RangeInclusive to
         | emphasize that it's an inclusive range which is not obvious
         | from the type signature.
        
       | dmitrygr wrote:
       | Oh god. A trait to average two numbers? As a separate function
       | that nobody will ever find in a larger codebase since it can live
       | anywhere? I am yet again convinced that C is better. Thanks.
        
         | Ygg2 wrote:
         | Not two numbers. Two somethings. It can average colors, sounds,
         | etc.
        
       | amjoshuamichael wrote:
       | Wow, this binary search implementation is sure to reduce code
       | duplication. It's so elegant! Can I try?! Here's my
       | implementation of a rust program that can do anything!
       | 
       | ```
       | 
       | pub trait Program<T> {                   pub fn run(self);
       | 
       | }
       | 
       | pub fn run_program<T: Program>(program: T) {
       | program.run();
       | 
       | }
       | 
       | ```
       | 
       | Boom. Just look at the code. How could it be simpler? This works
       | for any struct that implements the `Program` trait.
       | 
       | /s
       | 
       | While the code in the original post is elegant, and cool from a
       | "hey look what i can do!" perspective, I really feel like at some
       | point we lost the plot on templates with Rust. When you try to
       | finagle an implementation of a general algorithm, you just end up
       | with a solution that works poorly in all cases, instead of one
       | that works well in most cases.
       | 
       | I've written a lot of Rust, and I definitely know what it's like
       | to get high off of your own template fumes. I once wrote a
       | function with five generics and two trait constraints and I
       | thought it was the coolest shit I'd ever done. But really, I
       | could have just written, like, three discrete functions, and it
       | would have been clearer and smaller.
       | 
       | If someone used something with binary search algorithm _other_
       | than the Mid implementation shown in the post, that would
       | categorically be bad code from someone who was trying to be too
       | clever. If someone has a unique binary search to do, they should
       | write it themselves. That way, the whole algorithm will be on
       | display, and it will be easier to see ways to make it simpler or
       | improve it in the context of the unique search operation. Also,
       | it will be _much_ clearer.
       | 
       | https://en.wiktionary.org/wiki/thneed
        
       ___________________________________________________________________
       (page generated 2024-09-07 23:00 UTC)