[HN Gopher] Demystifying NaN for the Working Programmer
___________________________________________________________________
Demystifying NaN for the Working Programmer
Author : Zacru
Score : 54 points
Date : 2022-03-04 18:44 UTC (1 days ago)
(HTM) web link (www.lucidchart.com)
(TXT) w3m dump (www.lucidchart.com)
| bryanrasmussen wrote:
| I did a code assignment for a potential JavaScript-heavy job in
| 2014. For some reason I think isNaN was part of the language then,
| because I have a memory of deciding not to use it (but I could be
| misremembering). At any rate, I did Number(x) !== Number(x) at
| some point.
|
| In the meeting when they went over the code, the guy who did it
| said, "We were wondering why you did this?" So I had to explain NaN
| to him. He really did not know it existed. At any rate, I thought
| this was a weird thing not to know anything at all about.
| pletnes wrote:
| Related: I've met developers who think NaNs are a language or
| library (notably pandas) feature.
| mrlonglong wrote:
| Excellent article, this helped me understand the issues working
| with floating numbers. I work with them quite a lot when
| developing business logic and often times NaN can be a pain.
| Understanding why helps a lot.
| PopePompus wrote:
| I love NaNs, especially their "infectious" quality. Initializing
| float variables to NaN before first assignment can make a lot of
| errors immediately obvious. I wish there were a NaN for integers.
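A minimal Python sketch of the "infectious" behaviour described above. The function and variable names are made up for illustration; the point is only that a NaN used as an "unassigned" sentinel survives every later arithmetic step.

```python
import math


def average_speed(distance_km: float, hours: float) -> float:
    """Hypothetical calculation, used only to show NaN propagation."""
    return distance_km / hours


# Sentinel: start as NaN so that a forgotten assignment is visible later.
distance_km = float("nan")
hours = 2.0

# Bug: distance_km was never given a real value before use.
result = average_speed(distance_km, hours)

# The NaN survives every arithmetic step that touches it.
total = result * 3.0 + 10.0
print(math.isnan(result), math.isnan(total))
```

Python integers have no comparable sentinel value, which matches the wish expressed in the comment.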
| colejohnson66 wrote:
| What about a "nullable" double? In C#, you'd use `double?`,
| Rust would be Option<f64>, C++ would be std::optional<double>.
| Then any operation would throw upon an unset value?
| olliej wrote:
| That would require every operation on a floating point value
| to return an optional, which you'd then need to unwrap and
| branch on.
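A rough Python sketch of the trade-off raised here, assuming `Optional[float]` stands in for a "nullable double". The helpers `add_opt` and `scale_opt` are hypothetical; they show the unwrap-and-branch step each operation needs, next to the NaN version that needs none.

```python
from typing import Optional


def add_opt(a: Optional[float], b: Optional[float]) -> Optional[float]:
    # Every operation has to unwrap and branch before doing arithmetic.
    if a is None or b is None:
        return None
    return a + b


def scale_opt(a: Optional[float], k: float) -> Optional[float]:
    if a is None:
        return None
    return a * k


unset: Optional[float] = None
opt_result = scale_opt(add_opt(unset, 1.5), 2.0)

# With NaN, the same "unset" state propagates through plain arithmetic.
nan_result = (float("nan") + 1.5) * 2.0
```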
| saagarjha wrote:
| Don't initialize them and turn on UBSan :)
| saagarjha wrote:
| > The only reliable way to test for NaN is to use a language-
| dependent built-in function; the expression a === NaN is always
| false
|
| Well, you test for it by comparing the value against itself and
| seeing if that returns false.
|
| (There's also a bit of confusion on by value vs. by reference
| comparison and the actual bit value on a NaN, which isn't quite
| right.)
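The self-comparison test can be sketched in Python as follows. The helper name `is_nan_by_self_compare` is an illustration, not a standard function; `math.isnan` is the built-in equivalent.

```python
import math

nan = float("nan")


def is_nan_by_self_compare(x: float) -> bool:
    # NaN is the only float value for which x != x holds.
    return x != x


print(nan == float("nan"))           # comparing against NaN with == is never true
print(is_nan_by_self_compare(nan))
print(is_nan_by_self_compare(1.0))
print(math.isnan(nan))               # built-in check, same verdict
```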
| ithkuil wrote:
| Signaling NaNs raise exceptions in some operations. Is
| comparison one of these?
| stephencanon wrote:
| They "raise exceptions" in the IEEE 754 sense, which is not
| at all the same thing as what most programming languages mean
| by "raise exception". It means that they set a sticky flag in
| a register that may be queried at a later point, not that
| program control flow is redirected.
| pletnes wrote:
| The only use I saw for this is that you can enable compiler
| flags to crash the program when NaNs are encountered.
| Useful for testing Fortran code, in my experience. I didn't
| see any support for other languages I've used.
| stephencanon wrote:
| C lets you set and query the flag state with the
| `<fenv.h>` functions in theory, but compiler support for
| rigorously adhering to IEEE 754 semantics around these
| operations is pretty limited in most compilers, to say
| the least. Clang has been making some progress on support
| recently.
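Python exposes no direct equivalent of `<fenv.h>` for binary floats, so the sketch below uses the standard `decimal` module as an analogy only: its contexts carry sticky flags and optional traps for conditions such as invalid operations. It illustrates the difference between "set a flag and carry on" and "interrupt control flow"; it does not show what hardware does for binary floating point.

```python
from decimal import Decimal, InvalidOperation, localcontext

# Non-trapping mode: the invalid operation only sets a sticky flag.
with localcontext() as ctx:
    ctx.traps[InvalidOperation] = False
    ctx.clear_flags()
    quiet_result = Decimal(0) / Decimal(0)   # yields NaN, no exception
    flag_was_set = bool(ctx.flags[InvalidOperation])

# Trapping mode: the same operation now interrupts control flow.
with localcontext() as ctx:
    ctx.traps[InvalidOperation] = True
    try:
        Decimal(0) / Decimal(0)
        trapped = False
    except InvalidOperation:
        trapped = True
```

The second block is closer in spirit to the compiler flags mentioned above that crash a program when a NaN is produced.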
| olliej wrote:
| I dislike this article, as it tries repeatedly to imply that the
| use of NaN is somehow a restriction caused by floating point.
|
| No ieee754 operation _ever_ produces a NaN result unless the
| operation has no valid result in the set of real values.
|
| Similarly the behaviour in comparisons: if you want NaN to equal
| NaN you have to come up with a definition of equality that is
| also consistent with NaN < X, NaN > X, and NaN == X.
|
| The logical result of this is that NaN does not equal itself, and
| I believe mathematicians agree on that definition. Again not a
| result of the representation, but a result of the mathematical
| rules of real values.
|
| I want to be very clear here: floating point math always produces
| the correct value rounded (according to rounding mode) to the
| appropriate value in the represented space unless it is
| fundamentally not possible. The only place where floating point
| (or indeed any finite representation) produces an incorrectly
| rounded result is the transcendental functions, where _some_
| values can only be correctly rounded if you compute the exact
| value, but the exact value is irrational.
|
| People seem hell bent on complaining about floating point
| behavior, but it is fundamentally mathematically sound. IEEE754
| also specifies some functions like e^x-1 explicitly to ensure
| that you get the best possible accuracy for the core arithmetic
| operations.
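To make the e^x-1 remark concrete, here is a small Python comparison using `math.expm1` from the standard library. The input `1e-10` and the Taylor-series reference are choices made for this sketch.

```python
import math

x = 1e-10

naive = math.exp(x) - 1.0    # exp(x) rounds to a value near 1, losing digits of x
accurate = math.expm1(x)     # computes e^x - 1 without forming e^x first

# Reference from the Taylor series x + x^2/2 + x^3/6, ample at this size of x.
reference = x + x * x / 2.0 + x**3 / 6.0

naive_rel_err = abs(naive - reference) / reference
accurate_rel_err = abs(accurate - reference) / reference
print(naive_rel_err, accurate_rel_err)
```

The naive form loses accuracy because the subtraction cancels most of the significant digits, not because any single operation was rounded incorrectly.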
| dzaima wrote:
| greater-than and less-than already make no sense around NaN,
| you won't get much worse, I don't get what you're trying to
| point out with them. This is less a question about mathematical
| correctness (which there isn't much around NaN anyway), but
| more practical. There being this annoying NaN that breaks
| everything if it's in an array to be sorted or in a set or a key
| in a map is just pure awful.
| olliej wrote:
| Correct, they don't make sense, but given < and > return a
| Boolean in the ieee environment they need to produce a
| deterministic value.
|
| As you say relations with NaN don't make sense, but given the
| requirement of a single value NaN != NaN makes the most
| "sense" mathematically, and a core principle of ieee754 was
| ensuring the most accurate rendition of true maths with a
| finite representation (see a bunch of papers by Kahan).
|
| Of course x87's ieee754 implementation does actually have
| multiple NaNs, infinities, and representations of the same
| value. For all its quirks remember x87 was what demonstrated
| that the ieee754 specification could be made fast and
| affordable, which non-intel manufacturers were all claiming
| was impossible. The only real "flaw"* in x87 was the explicit
| leading 1, which was an artifact of intel being
| sufficiently ahead of the curve to predate dropping it.
|
| * the x87 transcendentals are known to be hopelessly
| inaccurate, but that in theory could have been fixed, whereas
| the format could not be.
| dzaima wrote:
| mathematically, yes. In practice, NaN!=NaN just kills any
| hope of having any amount of sanity for operations that
| don't care about floating-point and just want to generally
| compare things. It's not very nice to say "sorting,
| hashmaps & hashsets containing NaNs cause the entire
| operation/structure to be completely undefined behavior",
| especially given that NaNs kind of exist to _allow_ noticing
| errors, not cause even more of them.
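The container problems described in this exchange can be reproduced in Python. This is a sketch of Python's behaviour specifically, where the outcome is surprising rather than undefined; the key function at the end is one possible workaround, not a standard recipe.

```python
import math

nan_a = float("nan")
nan_b = float("nan")

# Two NaN objects never compare equal, so a set keeps both of them.
nan_set = {nan_a, nan_b}

# A freshly created NaN is never found as a dict key (the lookup relies on ==).
table = {nan_a: "stored"}
lookup = table.get(float("nan"))

# Plain sorting relies on <, which is always False when NaN is involved,
# so the result may not be in numeric order.
values = [3.0, nan_a, 1.0]
naive = sorted(values)

# One workaround: a key that imposes a total order, with NaN placed last.
robust = sorted(values, key=lambda v: (math.isnan(v), v))
```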
| jameshart wrote:
| This article conflates the representational limits of floating
| point with the concept of NaN in a way that I suspect will lead
| to more confusion, not less.
|
| Zero/zero doesn't return NaN because it isn't representable
| within floating point - it returns NaN because it is an
| expression that has no mathematical meaning.
|
| The fact that sqrt(-1) has two valid nonreal answers has nothing
| to do with why it returns NaN - after all, sqrt(4) has two valid
| real answers so is also technically not representable by a single
| floating point value, but that doesn't typically result in NaN.
|
| NaN is just an error value you get when you ask floating point
| math a dumb question it can't usefully answer.
|
| Far more interesting and subtle are the ways in which positive
| and negative infinity and positive and negative zero let you
| actually still obtain useful (at least for purposes of things
| like comparison) results to certain calculations even if they
| overflow the representable range.
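A short Python illustration of that last point, using ordinary doubles. The specific magnitudes are arbitrary choices that force overflow and underflow.

```python
import math

big = 1e308
overflowed = big * 10.0          # exceeds the double range, becomes +inf
neg_overflowed = -big * 10.0     # becomes -inf

# Comparisons against the overflowed values still order sensibly.
still_larger = overflowed > big
still_smaller = neg_overflowed < -big

# Underflow keeps the sign: a tiny negative product rounds to -0.0.
tiny = -1e-300 * 1e-300
sign_kept = math.copysign(1.0, tiny)   # reads the sign bit of the zero

# Signed zeros compare equal to each other, so ordinary code is unaffected.
zeros_equal = (tiny == 0.0)
```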
| pierrebai wrote:
| NaN is a cancer. The choice to make NaN == NaN false is _just
| wrong_. Every type, every variable can have multiple reasons for
| being invalid. Yet, no other type has ever chosen to make invalid
| values not equal to themselves.
|
| Pointers can be invalid. They can be invalid for any number of
| reasons: lack of memory, object not found, etc. No one ever
| suggests that null should not equal null.
|
| File handles can be invalid. They can be invalid for any number of
| reasons: file not found, access denied, file server is offline.
| No one has ever made invalid handles not equal to
| themselves.
|
| The justification for NaN not being equal to themselves is just
| bonk.
| Fire-Dragon-DoL wrote:
| Note (without disagreeing): in SQL, NULL != NULL.
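The SQL behaviour can be checked from Python with the standard `sqlite3` module. SQLite is an assumption made for this sketch; the thread does not name a database. Strictly speaking, both `NULL = NULL` and `NULL != NULL` evaluate to NULL (unknown) rather than true or false.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute(
    "SELECT NULL = NULL, NULL != NULL, NULL IS NULL"
).fetchone()
conn.close()

# SQL NULL arrives in Python as None; IS NULL is the reliable test.
eq_result, ne_result, is_null_result = row
```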
| Lascaille wrote:
| >The justification for NaN not being equal to themselves is
| just bonk.
|
| It makes a lot of sense to me. NaN indicates data has been
| lost. You did something and you stored the result in a number
| datatype but the result isn't a number. Data was lost. You lost
| the data and have only 'your answer wasn't a number.'
|
| Comparing NaN with NaN is asking the computer 'we have two
| buckets that have overflowed, were their contents the same?'
| The answer is 'we don't know' which means, to err on the side
| of safety, the answer is 'no.'
|
| No?
| Dylan16807 wrote:
| Let's say you make a particular NaN equal to itself.
|
| But then it's sensible for different operations to give you
| different NaN values.
|
| And you still wouldn't say that 4 < NaN is true, or NaN < 4 is
| true, would you?
|
| So it's still going to confuse the user. Is just changing
| equality going to give you a better system overall?
| ynik wrote:
| In a world without generic programming, NaN not being equal to
| itself makes a certain amount of sense for some kinds of
| numeric code. But in a world with reusable generic algorithms
| the calculation changes -- here equality/ordering relations
| really must be transitive or weird shit happens. In C++ it's
| undefined behavior to call `std::sort` or `std::unique` on a list
| of floats containing NaN.
|
| Most languages nowadays have standard-library functions/types
| that require well-behaved equality, so why have a builtin type
| for which equality is not well-behaved?
| amelius wrote:
| Imagine doing if(x) ..., where x can be NaN. Shouldn't that throw
| an exception in most cases? Why are our compilers not doing it
| that way?
| xen0 wrote:
| Should it? It isn't obvious to me at all that throwing an
| exception in this case is the best behaviour. Throwing an
| exception when testing a value for 'truthiness' is extremely
| surprising.
|
| On the other hand, I would strongly discourage 'if(x)' where x
| is a float that may be NaN purely because the 'correct'
| behaviour here isn't clear to me.
| amelius wrote:
| How about the case where x is (y > 0)? If y is NaN, shouldn't
| x be boolean-NaN? And shouldn't if(x) throw an exception? Or
| shouldn't (y > 0) throw an exception if you don't want
| boolean-NaNs?
| xen0 wrote:
| That's easy: y > 0 is False, not NaN.
|
| You may not think this is wise, but this is very much how
| comparisons with NaN are defined.
|
| And I think this is better than exception raising. Again, I
| think it would be _really_ weird for simple value
| comparisons to throw.
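The comparison rules discussed here, written out in Python. `classify` is a hypothetical function for illustration; the behaviour shown is that of Python floats.

```python
nan = float("nan")

comparisons = {
    "nan > 0": nan > 0,
    "nan < 0": nan < 0,
    "nan >= 0": nan >= 0,
    "nan <= 0": nan <= 0,
    "nan == nan": nan == nan,
    "nan != nan": nan != nan,
}


# An if/else silently takes the else branch for a NaN input.
def classify(y: float) -> str:
    return "positive" if y > 0 else "not positive"


label = classify(nan)

# Truthiness is a separate question: in Python a NaN counts as "true"
# because it is not equal to zero.
truthy = bool(nan)
```

Every ordered comparison and `==` is false, and only `!=` is true, so no exception is raised at any point.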
| amelius wrote:
| > Again, I think it would be _really_ weird for simple
| value comparisons to throw.
|
| But ... why?
|
| You may say that NaN > 0 is defined as False, but we know
| that's not how programmers think, most of the time.
|
| In code like if(y > 0) steer_car_to_left() I don't want
| the compiler or the IEEE standard to make any choices for
| me! Let it throw, so emergency systems can kick in.
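Where throwing is the desired behaviour, it can be opted into with a wrapper. The sketch below is one hypothetical way to do that in Python; `NaNComparisonError`, `checked_gt`, and `steer` are invented names, and the "emergency" return value is a stand-in for whatever fallback a real system would use.

```python
import math


class NaNComparisonError(ValueError):
    """Hypothetical error type for this sketch."""


def checked_gt(a: float, b: float) -> bool:
    # Refuse to answer when either operand is NaN instead of returning False.
    if math.isnan(a) or math.isnan(b):
        raise NaNComparisonError(f"cannot order {a!r} and {b!r}")
    return a > b


def steer(y: float) -> str:
    try:
        return "left" if checked_gt(y, 0.0) else "straight"
    except NaNComparisonError:
        return "emergency"   # stand-in for the fallback path
```

The check costs an extra branch per comparison, which is the kind of trade-off the reply below refers to.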
| ElevenLathe wrote:
| The compiler presumably can't know in most cases, but the
| runtime might be able to throw. It depends on the language
| implementation and the tradeoffs.
___________________________________________________________________
(page generated 2022-03-05 23:01 UTC)