[HN Gopher] Comparison Traits - Understanding Equality and Order...
___________________________________________________________________
Comparison Traits - Understanding Equality and Ordering in Rust
Author : rpunkfu
Score : 48 points
Date : 2025-11-02 10:00 UTC (5 days ago)
(HTM) web link (itsfoxstudio.substack.com)
(TXT) w3m dump (itsfoxstudio.substack.com)
| thomasmg wrote:
| I find floating point NaN != NaN quite annoying. But this is not
| related to Rust: this affects all programming languages that
| support floating point. All libraries that want to support
| ordering for floating point need to handle this special case,
| that is, all sort algorithms, hash table implementations, etc.
| Maybe it would cause fewer issues if NaN didn't exist, or if NaN
| == NaN. At least, it would be much easier to understand and more
| consistent with other types.
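A minimal Rust sketch of the special case described above, using only the standard library. The helper name `sort_rejecting_nan` is hypothetical; it shows one way a library might deal with NaN (refusing the input) and is not taken from the article.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: sorts ascending, or returns None if any value is NaN.
fn sort_rejecting_nan(mut values: Vec<f64>) -> Option<Vec<f64>> {
    if values.iter().any(|v| v.is_nan()) {
        return None;
    }
    // partial_cmp only returns None when a NaN is involved, and NaN was ruled out.
    values.sort_by(|a, b| a.partial_cmp(b).expect("NaN was ruled out above"));
    Some(values)
}

fn main() {
    let nan = f64::NAN;
    // NaN is not equal to itself, and it has no ordering relative to anything.
    assert!(nan != nan);
    assert_eq!(nan.partial_cmp(&1.0), None);
    assert_eq!(1.0_f64.partial_cmp(&2.0), Some(Ordering::Less));

    assert_eq!(
        sort_rejecting_nan(vec![2.0, -1.0, 0.5]),
        Some(vec![-1.0, 0.5, 2.0])
    );
    assert_eq!(sort_rejecting_nan(vec![2.0, nan]), None);
}
```

Because `f64` implements `PartialOrd` but not `Ord`, calling plain `.sort()` on a `Vec<f64>` does not compile, so the caller has to pick a NaN policy explicitly.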
| ramon156 wrote:
| I wonder if "any code that would create a NaN would error"
| would suffice here. I don't think it makes sense when you
| actually start to implement it, but I do feel like making a NaN
| error would be helpful. Why would you want to handle a NaN?
| thomasmg wrote:
| Well floating point operations never throw an exception,
| which I kind of like, personally. I would rather go in the
| opposite direction and change integer division by zero to
| return MAX / MIN / 0.
|
| But NaN could be defined to be smaller or higher than any
| other value.
|
| Well, there are multiple NaNs. And NaN isn't actually the only
| weirdness; there's also -0, and we have -0 == 0. I think
| equality for floating point is anyway weird, so then why not
| just define -0 < 0.
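For what it's worth, Rust's standard library already offers an ordering along these lines: `f64::total_cmp` follows the IEEE 754 totalOrder predicate, which places -0 before +0 and gives NaNs a fixed position depending on their sign bit. A small sketch; the helper name `sort_total` is made up for illustration.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: sorts using f64::total_cmp instead of partial_cmp.
fn sort_total(mut values: Vec<f64>) -> Vec<f64> {
    values.sort_by(|a, b| a.total_cmp(b));
    values
}

fn main() {
    // Ordinary equality treats the two zeros as equal...
    assert!(-0.0_f64 == 0.0_f64);
    // ...but the total order puts -0 strictly before +0.
    assert_eq!((-0.0_f64).total_cmp(&0.0), Ordering::Less);

    // Force the sign bit so the example does not depend on the sign of f64::NAN.
    let pos_nan = f64::NAN.copysign(1.0);
    let neg_nan = f64::NAN.copysign(-1.0);

    let sorted = sort_total(vec![0.0, pos_nan, -0.0, f64::NEG_INFINITY, neg_nan, 1.0]);
    // Resulting order: -NaN, -inf, -0, +0, 1, +NaN
    assert!(sorted[0].is_nan() && sorted[0].is_sign_negative());
    assert_eq!(sorted[1], f64::NEG_INFINITY);
    assert!(sorted[2] == 0.0 && sorted[2].is_sign_negative());
    assert!(sorted[3] == 0.0 && sorted[3].is_sign_positive());
    assert_eq!(sorted[4], 1.0);
    assert!(sorted[5].is_nan() && sorted[5].is_sign_positive());
}
```

Note that this ordering deliberately disagrees with `==` for the two zeros and for NaN, which is why it is a separate method rather than the `Ord` implementation for `f64`.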
| westurner wrote:
| If you don't handle NaN values, and there are NaNs in the
| real observations made for example with real sensors that
| sometimes return NaN and outliers, then the sort order there
| is indeterminate regardless of whether NaN==NaN; the identity
| function collides because there isn't enough entropy for
| there to be partial ordering or total ordering if multiple
| records have the same key value of NaN.
|
| How should an algorithm specify that it should sort by
| insertion order instead of memory address order if the sort
| key is NaN for multiple records?
|
| That's the default in SQL Relational Algebra IIRC?
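One possible answer to the insertion-order question, sketched in Rust: use a stable sort, so records whose keys compare as equal keep their input order. The `Reading` type and the helper names are hypothetical, and the sketch assumes all NaN keys share one bit pattern so that `total_cmp` treats them as equal.

```rust
/// Hypothetical record type: a labelled sensor reading.
#[derive(Debug, Clone)]
struct Reading {
    label: &'static str,
    value: f64,
}

/// Sorts by value. slice::sort_by is stable, so readings whose keys compare
/// Equal (here: identical NaN bit patterns) stay in insertion order.
fn sort_readings(mut readings: Vec<Reading>) -> Vec<Reading> {
    readings.sort_by(|a, b| a.value.total_cmp(&b.value));
    readings
}

fn labels(readings: &[Reading]) -> Vec<&'static str> {
    readings.iter().map(|r| r.label).collect()
}

fn main() {
    // A positive-sign NaN sorts after every number under total_cmp.
    let nan = f64::NAN.copysign(1.0);
    let readings = vec![
        Reading { label: "a", value: nan },
        Reading { label: "b", value: 2.0 },
        Reading { label: "c", value: nan },
        Reading { label: "d", value: 1.0 },
        Reading { label: "e", value: nan },
    ];
    let sorted = sort_readings(readings);
    // Numbers first, then the NaN readings in their original order a, c, e.
    assert_eq!(labels(&sorted), vec!["d", "b", "a", "c", "e"]);
}
```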
| westurner wrote:
| What is a good sort key for Photons and Phonons? What is a
| good sort key for H2O water molecules?
| thomasmg wrote:
| > then the sort order there is indeterminate
|
| Well each programming language has a "sort" method that
| sorts arrays. Should this method throw an exception in case
| of NaN? I think the NaN rules were the wrong decision.
| Because of these rules, everywhere there are floating point
| numbers, the libraries have to have special code for NaN,
| even if they don't care about NaN. Otherwise there might be
| ugly bugs, like sorting running into endless loops, data
| loss, etc. But well, it can't be changed now.
|
| The best description of the decision is probably [1], where
| Stephen Canon (former member of the IEEE-754 committee if I
| understand correctly) explains the reasoning.
|
| [1] https://stackoverflow.com/questions/1565164/what-is-the-rati...
| MyOutfitIsVague wrote:
| I mentioned in a sibling comment, there's a crate that does
| this in a pretty simple and obvious way:
| https://docs.rs/ordered-float/latest/ordered_float/
| MyOutfitIsVague wrote:
| There's a helpful crate that abstracts that away:
| https://docs.rs/ordered-float/latest/ordered_float/
|
| You have a totally ordered `NotNan` struct that wraps a float
| that's guaranteed to not be NaN, and an `OrderedFloat` that
| considers all NaN equal, and greater than non-NaN values.
|
| These are basically the special-cases you'd need to handle
| yourself anyway, and probably one of the approaches you'd end
| up taking.
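To make the second approach concrete without pulling in the crate, here is a hand-rolled sketch of a wrapper with the semantics described above (all NaNs equal, and greater than every non-NaN value). `NanLast` is a hypothetical name and this is not the crate's code; it only mirrors the behaviour as the comment describes it.

```rust
use std::cmp::Ordering;

/// Hypothetical wrapper: treats every NaN as equal and greater than any number.
#[derive(Debug, Clone, Copy)]
struct NanLast(f64);

impl Ord for NanLast {
    fn cmp(&self, other: &Self) -> Ordering {
        match (self.0.is_nan(), other.0.is_nan()) {
            (true, true) => Ordering::Equal,
            (true, false) => Ordering::Greater,
            (false, true) => Ordering::Less,
            // Neither side is NaN, so partial_cmp cannot return None.
            (false, false) => self.0.partial_cmp(&other.0).expect("no NaN here"),
        }
    }
}

impl PartialOrd for NanLast {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Equality is defined through cmp so that Eq and Ord agree with each other.
impl PartialEq for NanLast {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for NanLast {}

fn main() {
    assert!(NanLast(f64::NAN) == NanLast(f64::NAN));
    assert!(NanLast(f64::NAN) > NanLast(f64::INFINITY));

    let mut v = vec![
        NanLast(2.0),
        NanLast(f64::NAN),
        NanLast(-1.0),
        NanLast(f64::INFINITY),
    ];
    v.sort(); // plain sort() works because NanLast implements Ord
    assert_eq!(
        v,
        vec![
            NanLast(-1.0),
            NanLast(2.0),
            NanLast(f64::INFINITY),
            NanLast(f64::NAN)
        ]
    );
}
```

The sketch leaves out `Hash`; a real wrapper meant for hash maps would need a `Hash` implementation consistent with this `Eq`.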
| newpavlov wrote:
| I agree. In my opinion NaNs were a big mistake in the IEEE 754
| spec. Not only do they introduce a lot of special casing, they
| also consume a relatively big chunk of all values in 32 bit
| floats (~0.4%).
|
| I am not saying we do not need NaNs (I would even love to see
| them in integers, see:
| https://news.ycombinator.com/item?id=45174074), but I would
| prefer if we had fewer of them in floats, with clear sorting
| rules.
| tialaramex wrote:
| One thing that isn't discussed here but seems worth knowing for a
| HN audience is that these are what Rust calls "safe" traits. This
| has several related consequences:
|
| 1. You don't need to utter the keyword "unsafe" to implement
| these traits for your type. If you're not allowed by policy to
| write unsafe Rust (or if you just don't want to risk making any
| mistakes), you can implement these traits anyway. If you do that
| you should do so correctly as with writing any software...
|
| 2. But, because they're safe traits, nobody else's Rust software
| is allowed to rely on your correctness. If you disobey a rule,
| say by deciding that all values of your type are always greater
| than themselves (whether carelessly or because you're a vandal),
| other Rust software mustn't become unsafe as a result.
|
| 3. This has real world implications, for example if your type
| Goose has an Ord implementation which is defective, whether on
| purpose or by mistake, sorting a Vec<Goose> in Rust won't have
| Undefined Behaviour like in C++, it might panic (in debug) and it
| can't necessarily sort your type if your Ord implementation is
| nonsensical, but the "sorted" Vec<Goose> is the same geese as
| your original, just potentially in a sorted state to the extent
| that meant anything. It's not fewer geese, or more geese, or just
| different geese altogether - and it certainly isn't, say, an RCE
| now like it might be in C++
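A small sketch of point 3, assuming a made-up `Goose` type whose `Ord` claims every goose is greater than every goose, itself included. The standard library documents that `sort` may panic when the ordering is not a total order and that all original elements remain in the slice, so the sketch catches a possible panic and then checks that the same geese are still there. The helper names are hypothetical.

```rust
use std::cmp::Ordering;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical type with a deliberately nonsensical ordering.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Goose {
    id: u32,
}

impl PartialOrd for Goose {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for Goose {
    // Broken on purpose: every goose is "greater" than any goose, even itself.
    fn cmp(&self, _other: &Self) -> Ordering {
        Ordering::Greater
    }
}

/// Sorts the geese, tolerating a panic from the sort, and hands them back.
fn sort_geese(mut geese: Vec<Goose>) -> Vec<Goose> {
    // sort() is allowed to panic on a broken Ord, so catch that here.
    let _ = catch_unwind(AssertUnwindSafe(|| geese.sort()));
    geese
}

/// Collects the ids in ascending order so two flocks can be compared.
fn ids_sorted(geese: &[Goose]) -> Vec<u32> {
    let mut ids: Vec<u32> = geese.iter().map(|g| g.id).collect();
    ids.sort();
    ids
}

fn main() {
    let geese: Vec<Goose> = (0..100).rev().map(|id| Goose { id }).collect();
    let before = ids_sorted(&geese);
    let after = ids_sorted(&sort_geese(geese));
    // The resulting order is unspecified, but it is exactly the same geese.
    assert_eq!(before, after);
}
```

No `unsafe` appears anywhere in the sketch, which is the point: a wrong `Ord` can produce a meaningless order or a panic, but not memory unsafety.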
___________________________________________________________________
(page generated 2025-11-07 23:01 UTC)