[HN Gopher] My new rust binary search
___________________________________________________________________
My new rust binary search
Author : ChadNauseam
Score : 33 points
Date : 2024-09-07 19:10 UTC (3 hours ago)
(HTM) web link (chadnauseam.com)
(TXT) w3m dump (chadnauseam.com)
| gnuvince wrote:
| > Just look at the code. How could it be simpler?
|
| Cannot tell if this is a joke.
| Aeolos wrote:
| Which part do you find complicated?
|
| The code itself looks very straightforward, two preconditions
| and a loop. I don't see any significant difference in terms of
| complexity compared to any other low-level systems programming
| language. It's obviously a bit more verbose than python or
| haskell, but those are not targeting the same audience.
| IshKebab wrote:
| It looks extremely simple to me. I definitely have never seen a
| simpler binary search implementation (though in fairness it
| punts the midpoint calculation). Why do you think it's a joke?
| IncreasePosts wrote:
| > This one also makes that assumption, but in some cases if that
| assumption is violated it returns an error rather than silently
| returning a meaningless
|
| Sometimes returning an error and sometimes returning an incorrect
| value doesn't sound like a binary search that is "absolutely
| reliable".
|
| Also, s/consolation/consultation/
| fallingsquirrel wrote:
| It seems to me that it's impossible for a binary search to
| detect all errors. The precondition is that the list must be
| sorted, and to fully verify that assumption would be O(n). That
| would make your binary search no better than a linear search.
| So I can forgive the best-effort error checking.
| edflsafoiewq wrote:
| It doesn't seem possible to determine if the predicate is
| monotonic without doing a full scan, thus not being a binary
| search.
| Almondsetat wrote:
| Binary search only works if you suppose the data is well
| structured. If you perform the search and in the meantime you
| spot a passing inconsistency, you can interrupt everything and
| throw an error. But other than that, checking the entire
| structure negates the entire point of the algorithm
| eknkc wrote:
| While the implementation is straightforward enough, this post
| clearly shows me why I don't like Rust.
|
| So, the function accepts two values that implement Mid trait. And
| there is a blanket implementation for Mid that applies to
| anything can be added, substracted, divided etc and can be
| derived from a u8. So now this will make every numeric type a
| Mid. (Which, as the author states only an example I guess)
|
| I can read and understand how this works here. But when I'm in a
| larger code base, I can not find what trait implementation made
| what possible.
|
| With all the trait bounds and implicit conversions happening
| around and macros implementing traits on stuff it becomes a shit
| stew in a hurry.
|
| And everyone always reaches to the most clever way to do things.
| I mean, does Mid need to be a trait? I guess it works beautifully
| yeah. I'd prefer to pass a midpoint calculator function instead.
|
| I guess this turned into a rant. I actually really like Rust the
| language but I think I don't like Rust codebases basically.
|
| It is the least readable language I've seen not because of the
| syntax but because of all the cleverness going on around the
| traits and macros.
| worik wrote:
| > I actually really like Rust the language but I think I don't
| like Rust codebases basically.
|
| And
|
| > everyone always reaches to the most clever way to do things.
|
| Yes oh yes
|
| Rust was a brilliant idea spoilt by too much cleverness
|
| I'm hoping for Rust II. A smaller language, with a bigger
| standard library and no dependence on run times.
|
| By "no dependence on run times." I mean no async/await. It is
| an abomination in Rust
| Conscat wrote:
| There already exist many affine-typed languages with fewer
| features than Rust. When you phrase it like "Rust but with
| weaker type safety and higher-overhead abstractions" one
| might see why this premise fails to gain much traction,
| though.
| aw1621107 wrote:
| > By "no dependence on run times." I mean no async/await. It
| is an abomination in Rust
|
| Out of curiosity, what alternative would you have preferred
| to see, if any?
| aaronblohowiak wrote:
| I have good news for you, it gets worse! Look at very large and
| old ruby codebases :)
| LegionMammal978 wrote:
| Yeah, trait-heavy code is one of Rust's weaker points in
| practice. I find tower and some of the database crates to be
| some of the worst offenders: even knowing the language pretty
| well, I've often gotten lost in a maze of projections and
| blanket impls when an error occurs. There's no easy way to
| trace the implementation back to the intended behavior, to
| compare the expected and actual outcomes.
| pimeys wrote:
| Also tracing and rust-opentelemetry gives you a pretty wild
| ride if needing to do something more special with them.
| forrestthewoods wrote:
| I like Rust a lot, but I strongly agree. "Advanced" Rust
| becomes as inscrutable as C++ template crap.
|
| I really wish Rust had a way to define "template style" generic
| functions rather than "trait style". Some things simply can not
| be done gracefully with trait spaghetti.
| Ygg2 wrote:
| > I really wish Rust had a way to define "template style"
|
| It has! It's called proc macro [1]. I'm only half joking.
|
| [1] It can generate anything, errors generally suck ass in it
| as much in C++ templates and it feels similarly brittle.
|
| > "Advanced" Rust becomes as inscrutable as C++ template
| crap.
|
| I think people are abusing traits a bit too much. They are
| very nifty and often compile to very optimized code.
|
| That said this trait is not an example of such abuse. Having
| a trait to help with overloading is the most benign form of
| trait use.
|
| Solution to grandparent's problem is called rust analyzer. It
| will show you what methods are usable on your variable.
| ChadNauseam wrote:
| Having the function take a midpoint calculator is fine and
| there are some advantages to it (e.g. it makes it easy to turn
| into a linear search if you want to do that for some reason).
| However since writing a midpoint function isn't always totally
| straightforward (an incorrectly implemented midpoint will cause
| correctness issues or even infinite loops), I think it's nice
| to have it as a trait. That way there's little risk of someone
| using a bad `|x,y| (x + y) / 2` implementation out of laziness.
|
| However I 100% agree with you that traits make it really hard
| to figure out what actual function you're calling. I wish there
| was a way to see something like "in your code, this generic
| function is called with types T1, T2, etc." and see the trait
| implementations for those types as well. I've actually wanted
| to contribute this to rust-analyzer for a while.
| rurban wrote:
| You can omit the runtime sorted checks via proper types. Only
| accept sorted arrays or lists, and enforce them with such a
| class. This would make it convenient, faster and safer. It would
| also take the sort check from the object, and not as extra
| argument.
| im3w1l wrote:
| In theory I like this, but it raises the question of how it
| would work in practice; in practice people create non-sorted
| lists and then sort them in-place. Maybe you could have the
| sort method return a sorted-token that certifies that a given
| array is sorted. And the token would read-only borrow the
| array, to ensure no one can modify the array as long as the
| token is valid.
|
| Is it even possible for a rust function to take a mutable
| borrow as a parameter, decay it to an immutable borrow and
| store that somewhere, such that other people can borrow the
| parameter after the function has returned?
| aw1621107 wrote:
| > Is it even possible for a rust function to take a mutable
| borrow as a parameter, decay it to an immutable borrow and
| store that somewhere, such that other people can borrow the
| parameter after the function has returned?
|
| I think it's possible under some scenarios with the
| appropriate lifetimes? For example,
| https://rust.godbolt.org/z/6fe8c4sf4:
| struct Magic<'a> { reference: &'a i32 }
| impl<'a> Magic<'a> { fn new<'b: 'a>(source: &'b
| mut i32) -> Magic<'a> { Magic { reference:
| source } } } fn
| double(i: &i32) -> i32 { i * 2 }
| fn main() { let mut i = 0; let m =
| Magic::new(&mut i); double(m.reference);
| }
|
| I think this will stop working once the compiler can no
| longer see that the lifetimes work out, but I think that
| makes sense - there's a mutable object underlying the mutable
| borrow, and the compiler needs to verify that no one else can
| get another mutable borrow while the derived immutable
| borrows are live.
| re wrote:
| > _You want one that doesn 't require a consolation with a
| manual_ to remember if it returns the first value that's greater
| or the last value that's lower. ... No more needing to remember
| "does this return the highest index whose value is lower than the
| value I'm searching for, or the lowest index whose value is
| greater than the value I'm searching for?". This one returns
| both, so you can use whichever you need for your use case.
|
| This ( _italicized_ part) feels like a weird requirement.
| Function /API documentation (which this author's code lacks) is
| important, and I don't think I ever write a call to an unfamiliar
| function without checking the docs for it. Assuming that you know
| what a function does based on the name alone is a recipe for
| buggy code.
|
| This is also a kind of unusual formulation of binary search which
| searches across a monotonic function space rather than a
| sequence. I get that this is more general and can be used to
| implement the latter, but a less general interface that only
| works on sequences is IMO _more_ intuitive: in that case, I
| expect it to return either the index of the target number, or the
| index where it should be inserted if not found. It 's somewhat
| telling that the author doesn't offer any code examples
| demonstrating how to use this wonderful general function to solve
| the common use case of finding an element in a sorted vector, or
| inserting a new one into the correct location.
| edflsafoiewq wrote:
| The interface is nicer for keyframe interpolation eg. the
| problem where you have a list of (X,Y) pairs and you want to
| find the two pairs that straddle a given x and interpolate
| between them. In that case, search(|&i| v[i].0 >= x, 0, v.len()
| - 1) handles the three cases "before the beginning", "after the
| end", and "between two pairs" fairly naturally.
| leni536 wrote:
| > The only case where it returns (None, None) are those when it
| detects non-monotonicity in your predicate (returning false when
| a lower input returned true). So if you know your predicate is
| monotonic, feel free to hit this case with a panic!("should never
| happen").
|
| Empty range?
| edflsafoiewq wrote:
| No such thing, the range is inclusive of the l and r endpoints.
| orlp wrote:
| This binary search is pretty bad in my opinion. Some issues:
|
| 1. It doesn't support the empty array, because l and r are both
| inclusive.
|
| 2. It performs way more predicate tests than necessary.
|
| 3. An incorrectly implemented Mid trait causes a silent infinite
| loop.
|
| 4. If a fundamental programming error in the sanity check is
| detected it just silently returns Nones instead of panicking.
| ramon156 wrote:
| 3 is not a good reason. i It's the developer's job to not screw
| up. What's the alternative?
| orlp wrote:
| I would normally agree, but the author went out of their way
| to check for error conditions in their binary search. I think
| a silent infinite loop is antithetical to that.
| gpm wrote:
| This is a case where less abstraction would make the bug a
| lot more obvious. Binary search is simple enough that I don't
| think a generic implementation of it is worth introducing the
| possibilities of errors at a distance like that one.
| mmastrac wrote:
| Style nits: this would be much better with an enum Return:
|
| enum Found { Empty, Before(T), After(T), Between(T, T), }
|
| ... and a range parameter instead of L and R.
|
| And, TBH, this should probably just panic! on an invalid
| predicate rather than returning an error. Programming errors
| should panic.
| ChadNauseam wrote:
| All good points, thank you. A range parameter would be
| especially good because I think you could use RangeInclusive to
| emphasize that it's an inclusive range which is not obvious
| from the type signature.
| dmitrygr wrote:
| Oh god. A trait to average two numbers? As a separate function
| that nobody will ever find in a larger codebase since it can live
| anywhere? I am yet again convinced that C is better. Thanks.
| Ygg2 wrote:
| Not two numbers. Two somethings. It can average colors, sounds,
| etc.
| amjoshuamichael wrote:
| Wow, this binary search implementation is sure to reduce code
| duplication. It's so elegant! Can I try?! Here's my
| implementation of a rust program that can do anything!
|
| ```
|
| pub trait Program<T> { pub fn run(self);
|
| }
|
| pub fn run_program<T: Program>(program: T) {
| program.run();
|
| }
|
| ```
|
| Boom. Just look at the code. How could it be simpler? This works
| for any struct that implements the `Program` trait.
|
| /s
|
| While the code in the original post is elegant, and cool from a
| "hey look what i can do!" perspective, I really feel like at some
| point we lost the plot on templates with Rust. When you try to
| finagle an implementation of a general algorithm, you just end up
| with a solution that works poorly in all cases, instead of one
| that works well in most cases.
|
| I've written a lot of Rust, and I definitely know what it's like
| to get high off of your own template fumes. I once wrote a
| function with five generics and two trait constraints and I
| thought it was the coolest shit I'd ever done. But really, I
| could have just written, like, three discrete functions, and it
| would have been clearer and smaller.
|
| If someone used something with binary search algorithm _other_
| than the Mid implementation shown in the post, that would
| categorically be bad code from someone who was trying to be too
| clever. If someone has a unique binary search to do, they should
| write it themselves. That way, the whole algorithm will be on
| display, and it will be easier to see ways to make it simpler or
| improve it in the context of the unique search operation. Also,
| it will be _much_ clearer.
|
| https://en.wiktionary.org/wiki/thneed
___________________________________________________________________
(page generated 2024-09-07 23:00 UTC)