[HN Gopher] Demystifying NaN for the Working Programmer
       ___________________________________________________________________
        
       Demystifying NaN for the Working Programmer
        
       Author : Zacru
       Score  : 54 points
       Date   : 2022-03-04 18:44 UTC (1 days ago)
        
 (HTM) web link (www.lucidchart.com)
 (TXT) w3m dump (www.lucidchart.com)
        
       | bryanrasmussen wrote:
       | I did a code assignment for a potential JavaScript-heavy job in
       | 2014. For some reason I think isNaN was part of the language then,
       | because I have a memory of deciding not to use it (but I could be
       | misremembering). At any rate, I did Number(x) !== Number(x) at
       | some point.
       | 
       | In the meeting when they went over the code, the guy who did it
       | said, "We were wondering why you did this." So I had to explain
       | NaN to him. He really did not know it existed. At any rate, I
       | thought this was a weird thing to know nothing at all about.
        
         | pletnes wrote:
         | Related: I've met developers who think NaNs are a language or
         | library (notably pandas) feature.
        
       | mrlonglong wrote:
       | Excellent article, this helped me understand the issues of working
       | with floating-point numbers. I work with them quite a lot when
       | developing business logic, and oftentimes NaN can be a pain.
       | Understanding why helps a lot.
        
       | PopePompus wrote:
       | I love NaNs, especially their "infectious" quality. Initializing
       | float variables to NaN before first assignment can make a lot of
       | errors immediately obvious. I wish there were a NaN for integers.
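A minimal Python sketch of that "infectious" behaviour, assuming a made-up `mean_speed` calculation (the function and variable names are hypothetical, not from the article). The NaN sentinel survives every later arithmetic step, so the forgotten assignment shows up in the final result.

```python
import math

def mean_speed(distance_m: float, elapsed_s: float) -> float:
    """Hypothetical calculation used only for illustration."""
    return distance_m / elapsed_s

elapsed_s = math.nan                     # sentinel: "not assigned yet"
speed = mean_speed(100.0, elapsed_s)     # NaN in, NaN out
report_kmh = speed * 3.6 + 1.0           # still NaN after further arithmetic

print(math.isnan(report_kmh))  # True
```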
        
         | colejohnson66 wrote:
         | What about a "nullable" double? In C#, you'd use `double?`,
         | Rust would be Option<f64>, C++ would be std::optional<double>.
         | Then any operation would throw upon an unset value?
        
           | olliej wrote:
           | That would require every operation on a floating point value
           | to return an optional, which you'd then need to unwrap and
           | branch on.
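A rough Python sketch of the cost described here, using `Optional[float]` as a stand-in for `double?` / `Option<f64>`. The helpers `safe_div` and `safe_add` are hypothetical; the point is only that each operation has to branch on "unset" itself.

```python
from typing import Optional

def safe_div(a: Optional[float], b: Optional[float]) -> Optional[float]:
    """Hypothetical helper: propagate 'unset' (None) by hand."""
    if a is None or b is None:
        return None
    return a / b

def safe_add(a: Optional[float], b: Optional[float]) -> Optional[float]:
    """Hypothetical helper: the same unwrap-and-branch, repeated per operation."""
    if a is None or b is None:
        return None
    return a + b

unset_result = safe_add(safe_div(1.0, None), 2.0)
set_result = safe_add(safe_div(1.0, 2.0), 2.0)

print(unset_result)  # None
print(set_result)    # 2.5
```

With NaN the same propagation happens inside the arithmetic itself, without the explicit checks.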
        
         | saagarjha wrote:
         | Don't initialize them and turn on UBSan :)
        
       | saagarjha wrote:
       | > The only reliable way to test for NaN is to use a language-
       | dependent built-in function; the expression a === NaN is always
       | false
       | 
       | Well, you test for it by comparing the value against itself and
       | seeing if that returns false.
       | 
       | (There's also a bit of confusion in the article about by-value
       | vs. by-reference comparison and the actual bit value of a NaN,
       | which isn't quite right.)
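A small Python sketch of the self-comparison test, next to the built-in `math.isnan`. The helper name `is_nan` is made up for illustration.

```python
import math

def is_nan(x: float) -> bool:
    """Hypothetical helper: NaN is the only float value not equal to itself."""
    return x != x

nan = float("nan")

print(is_nan(nan))            # True
print(is_nan(1.5))            # False
print(is_nan(float("inf")))   # False
print(nan == float("nan"))    # False: comparing against NaN never succeeds
print(math.isnan(nan))        # True: the built-in check agrees
```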
        
         | ithkuil wrote:
         | Signaling NaNs raise exceptions in some operations. Is
         | comparison one of these?
        
           | stephencanon wrote:
           | They "raise exceptions" in the IEEE 754 sense, which is not
           | at all the same thing as what most programming languages mean
           | by "raise exception". It means that they set a sticky flag in
           | a register that may be queried at a later point, not that
           | program control flow is redirected.
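Python's standard library does not expose the binary floating-point status flags, so the sketch below uses the `decimal` module as an analogy only: its contexts follow a similar flag-and-trap model. With no traps, an invalid operation quietly returns NaN and sets a sticky flag; with the trap enabled, the same operation redirects control flow.

```python
from decimal import Context, Decimal, InvalidOperation

# No traps enabled: invalid operations return NaN and only record a flag.
quiet = Context(traps=[])
result = quiet.divide(Decimal(0), Decimal(0))
flag_after_invalid = bool(quiet.flags[InvalidOperation])

# Later, valid work does not clear the flag; it is "sticky".
quiet.divide(Decimal(1), Decimal(3))
flag_after_more_work = bool(quiet.flags[InvalidOperation])

# The flag stays set until the program clears it explicitly.
quiet.clear_flags()
flag_after_clear = bool(quiet.flags[InvalidOperation])

# Trap enabled: the same operation now raises instead of returning NaN.
strict = Context(traps=[InvalidOperation])
try:
    strict.divide(Decimal(0), Decimal(0))
    trapped = False
except InvalidOperation:
    trapped = True

print(result.is_nan(), flag_after_invalid, flag_after_more_work,
      flag_after_clear, trapped)
```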
        
             | pletnes wrote:
             | The only use I saw for this is that you can enable compiler
             | flags to crash the program when NaNs are encountered.
             | Useful for testing Fortran code, in my experience. I didn't
             | see any support for other languages I've used.
        
               | stephencanon wrote:
               | C lets you set and query the flag state with the
               | `<fenv.h>` functions in theory, but compiler support for
               | rigorously adhering to IEEE 754 semantics around these
               | operations is pretty limited in most compilers, to say
               | the least. Clang has been making some progress on support
               | recently.
        
       | olliej wrote:
       | I dislike this article, as it tries repeatedly to imply that the
       | use of NaN is somehow a restriction caused by floating point.
       | 
       | No IEEE 754 operation _ever_ produces a NaN result unless the
       | operation has no valid result in the set of real values.
       | 
       | Similarly the behaviour in comparisons: if you want NaN to equal
       | NaN you have to come up with a definition of equality that is
       | also consistent with
       | 
       |     NaN < X
       |     NaN > X
       |     NaN == X
       | 
       | The logical result of this is that NaN does not equal itself, and
       | I believe mathematicians agree on that definition. Again not a
       | result of the representation, but a result of the mathematical
       | rules of real values.
       | 
       | I want to be very clear here: floating point math always produces
       | the correct value rounded (according to rounding mode) to the
       | appropriate value in the represented space unless it is
       | fundamentally not possible. The only place where floating point
       | (or indeed any finite representation) produces an incorrectly
       | rounded result is the transcendental functions, where _some_
       | values can only be correctly rounded if you compute the exact
       | value, but the exact value is irrational.
       | 
       | People seem hell bent on complaining about floating point
       | behavior, but it is fundamentally mathematically sound. IEEE754
       | also specifies some functions like e^x-1 explicitly to ensure
       | that you get the best possible accuracy for the core arithmetic
       | operations.
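A hedged Python illustration of the two claims above. `Fraction` converts a float to its exact rational value, so the exact sum of two doubles can be compared with what `+` returns. The second half compares `math.expm1` (Python's e^x-1 function) with the naive `exp(x) - 1` for a tiny `x`, using the first two Taylor terms as an assumed reference value.

```python
import math
from fractions import Fraction

# Exact rational sum of the two doubles nearest to 0.1 and 0.2.
exact_sum = Fraction(0.1) + Fraction(0.2)

# Converting the exact sum back to a float rounds it once; float addition
# returns that same correctly rounded value.
addition_is_correctly_rounded = float(exact_sum) == 0.1 + 0.2
print(addition_is_correctly_rounded)  # True

# e^x - 1 for tiny x: subtracting 1 from exp(x) cancels most of the digits.
x = 1e-10
reference = x + x * x / 2      # assumed reference: higher terms are negligible
naive = math.exp(x) - 1.0
careful = math.expm1(x)

naive_rel_error = abs(naive - reference) / reference
careful_rel_error = abs(careful - reference) / reference
print(naive_rel_error > careful_rel_error)  # True
```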
        
         | dzaima wrote:
         | greater-than and less-than already make no sense around NaN,
         | you won't get much worse, I don't get what you're trying to
         | point out with them. This is less a question about mathematical
         | correctness (which there isn't much around NaN anyway), but
         | more practical. There being this annoying NaN that breaks
         | everything if it's in an array to be sorted or in a set or a
         | key in a map is just pure awful.
        
           | olliej wrote:
           | Correct they don't make sense, but given < and > return a
           | Boolean in the ieee environment they need to produce a
           | deterministic value.
           | 
           | As you say relations with NaN don't make sense, but given the
           | requirement of a single value NaN != NaN makes the most
           | "sense" mathematically, and a core principle of ieee754 was
           | ensuring the most accurate rendition of true maths with a
           | finite representation (see a bunch of papers by Kahan).
           | 
           | Of course x87's ieee754 implementation does actually have
           | multiple NaNs, infinities, and representations of the same
           | value. For all its quirks, remember x87 was what demonstrated
           | that the ieee754 specification could be made fast and
           | affordable, which non-Intel manufacturers were all claiming
           | was impossible. The only real "flaw"* in x87 was the explicit
           | leading 1, which was an artifact of Intel being sufficiently
           | ahead of the curve to predate dropping it.
           | 
           | * the x87 transcendentals are known to be hopelessly
           | inaccurate, but that in theory could have been fixed, whereas
           | the format could not be.
        
             | dzaima wrote:
             | mathematically, yes. In practice, NaN!=NaN just kills any
             | hope of having any amount of sanity for operations that
             | don't care about floating-point and just want to generally
             | compare things. It's not very nice to say "sorting,
             | hashmaps & hashsets containing NaNs cause the entire
             | operation/structure to be completely undefined behavior",
             | especially given that NaNs kind of exist to _allow_ noticing
             | errors, not cause even more of them.
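One common workaround for the sorting problem, sketched in Python: sort on a key that is totally ordered even when NaNs are present. The helper `nan_last_key` is hypothetical, and putting NaNs at the end is an arbitrary choice for this sketch.

```python
import math

def nan_last_key(value: float):
    """Hypothetical sort key: real numbers in order, NaNs grouped at the end."""
    if math.isnan(value):
        return (True, 0.0)     # every NaN gets the same, comparable key
    return (False, value)

values = [3.0, float("nan"), 1.0, float("nan"), 2.0]
ordered = sorted(values, key=nan_last_key)

print(ordered[:3])                             # [1.0, 2.0, 3.0]
print(all(math.isnan(v) for v in ordered[3:])) # True
```

Sorting the raw values directly gives the comparison function inconsistent answers, so the resulting order is not something to rely on.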
        
       | jameshart wrote:
       | This article conflates the representational limits of floating
       | point with the concept of NaN in a way that I suspect will lead
       | to more confusion, not less.
       | 
       | Zero/zero doesn't return NaN because it isn't representable
       | within floating point - it returns NaN because it is an
       | expression that has no mathematical meaning.
       | 
       | The fact that sqrt(-1) has two valid nonreal answers has nothing
       | to do with why it returns NaN - after all, sqrt(4) has two valid
       | real answers so is also technically not representable by a single
       | floating point value, but that doesn't typically result in NaN.
       | 
       | NaN is just an error value you get when you ask floating point
       | math a dumb question it can't usefully answer.
       | 
       | Far more interesting and subtle are the ways in which positive
       | and negative infinity and positive and negative zero let you
       | actually still obtain useful (at least for purposes of things
       | like comparison) results to certain calculations even if they
       | overflow the representable range.
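A short Python sketch of that point: after an overflow, the infinities and signed zeros still compare sensibly, and NaN only appears once the question has no answer.

```python
import math

big = 1e308
overflowed = big * 10            # too large for a double: becomes +inf
neg_overflowed = -big * 10       # becomes -inf

print(overflowed)                     # inf
print(overflowed > big)               # True: ordering still works
print(neg_overflowed < -big)          # True

pos_zero = 1.0 / overflowed           # 0.0
neg_zero = -1.0 / overflowed          # -0.0: the sign of the result survives
print(pos_zero == neg_zero)           # True: the two zeros compare equal
print(math.copysign(1.0, neg_zero))   # -1.0: but the sign can be recovered

# Only a question with no meaningful answer yields NaN.
print(math.isnan(overflowed - overflowed))  # True
```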
        
       | pierrebai wrote:
       | NaN is a cancer. The choice that NaN == NaN is false is _just
       | wrong_. Every type, every variable can have multiple reasons for
       | being invalid. Yet no other type has ever chosen to make invalid
       | values not equal to themselves.
       | 
       | Pointers can be invalid. They can be invalid for any number of
       | reasons: lack of memory, object not found, etc. No one ever
       | suggests that null should not equal null.
       | 
       | File handles can be invalid. They can be invalid for any number of
       | reasons: file not found, access denied, file server is offline.
       | No one has ever made invalid handles not equal to themselves.
       | 
       | The justification for NaN not being equal to themselves is just
       | bonk.
        
         | Fire-Dragon-DoL wrote:
         | Note (without disagreeing): in SQL, NULL != NULL.
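A quick way to see the SQL behaviour from Python, using the standard-library `sqlite3` module with an in-memory database. Strictly speaking, both `NULL = NULL` and `NULL != NULL` evaluate to NULL ("unknown") rather than to true or false; `IS NULL` is the test that actually answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute(
    "SELECT NULL = NULL, NULL != NULL, NULL IS NULL"
).fetchone()
conn.close()

# SQL NULL comes back as Python None; a true result comes back as 1.
print(row)  # (None, None, 1)
```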
        
         | Lascaille wrote:
         | >The justification for NaN not being equal to themselves is
         | just bonk.
         | 
         | It makes a lot of sense to me. NaN indicates data has been
         | lost. You did something and you stored the result in a number
         | datatype but the result isn't a number. Data was lost. You lost
         | the data and have only 'your answer wasn't a number.'
         | 
         | Comparing NaN with NaN is asking the computer 'we have two
         | buckets that have overflowed, were their contents the same?'
         | The answer is 'we don't know' which means, to err on the side
         | of safety, the answer is 'no.'
         | 
         | No?
        
         | Dylan16807 wrote:
         | Let's say you make a particular NaN equal to itself.
         | 
         | But then it's sensible for different operations to give you
         | different NaN values.
         | 
         | And you still wouldn't say that 4 < NaN is true, or NaN < 4 is
         | true, would you?
         | 
         | So it's still going to confuse the user. Is just changing
         | equality going to give you a better system overall?
        
         | ynik wrote:
         | In a world without generic programming, NaN not being equal to
         | itself makes a certain amount of sense for some kinds of
         | numeric code. But in a world with reusable generic algorithms
         | the calculation changes -- here equality/ordering relations
         | really must be transitive or weird shit happens. In C++ it's
         | undefined behavior to call `std::sort` or `std::unique` on a
         | list of floats containing NaN.
         | 
         | Most languages nowadays have standard-library functions/types
         | that require well-behaved equality, so why have a builtin type
         | for which equality is not well-behaved?
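The C++ case cannot be shown safely, but Python's containers illustrate the same tension. In this sketch, `a` and `b` are two separately created NaN objects; whether a container treats them as "the same element" depends on object identity rather than on value, because the equality test itself always fails.

```python
a = float("nan")
b = float("nan")

distinct_count = len({a, b})   # two NaN objects never compare equal
same_count = len({a, a})       # the same object is matched by identity

print(distinct_count)  # 2
print(same_count)      # 1
print(a in [a])        # True: identity shortcut
print(a in [b])        # False: equality fails

lookup = {a: "first"}
print(b in lookup)     # False: a "NaN" key cannot be found via another NaN
```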
        
       | amelius wrote:
       | Imagine doing if(x) ..., where x can be NaN. Shouldn't that throw
       | an exception in most cases? Why are our compilers not doing it
       | that way?
        
         | xen0 wrote:
         | Should it? It isn't obvious to me at all that throwing an
         | exception in this case is the best behaviour. Throwing an
         | exception when testing a value for 'truthiness' is extremely
         | surprising.
         | 
         | On the other hand, I would strongly discourage 'if(x)' where x
         | is a float that may be NaN purely because the 'correct'
         | behaviour here isn't clear to me.
        
           | amelius wrote:
           | How about the case where x is (y > 0)? If y is NaN, shouldn't
           | x be boolean-NaN? And shouldn't if(x) throw an exception? Or
           | shouldn't (y > 0) throw an exception if you don't want
           | boolean-NaNs?
        
             | xen0 wrote:
             | That's easy: y > 0 is False, not NaN.
             | 
             | You may not think this is wise, but this is very much how
             | comparisons with NaN are defined.
             | 
             | And I think this is better than exception raising. Again, I
             | think it would be _really_ weird for simple value
             | comparisons to throw.
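For reference, this is how the comparisons discussed in this subthread come out in Python, which follows the IEEE 754 rules here. Every ordered comparison and every equality test involving NaN is False; only `!=` is True. The truthiness of a bare NaN is a separate, language-specific rule.

```python
nan = float("nan")

comparisons = {
    "nan > 0": nan > 0,
    "nan < 0": nan < 0,
    "nan == 0": nan == 0,
    "nan == nan": nan == nan,
    "4 < nan": 4 < nan,
    "nan < 4": nan < 4,
    "nan != nan": nan != nan,
}
for expression, value in comparisons.items():
    print(f"{expression:12} -> {value}")

# In Python any nonzero float is truthy, and NaN is not equal to zero,
# so `if x:` takes the branch when x is NaN.
nan_is_truthy = bool(nan)
print(nan_is_truthy)  # True
```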
        
               | amelius wrote:
               | > Again, I think it would be _really_ weird for simple
               | value comparisons to throw.
               | 
               | But ... why?
               | 
               | You may say that NaN > 0 is defined as False, but we know
               | that's not how programmers think, most of the time.
               | 
               | In code like if(y > 0) steer_car_to_left() I don't want
               | the compiler or the IEEE standard to make any choices for
               | me! Let it throw, so emergency systems can kick in.
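The behaviour asked for here can be opted into at the call site. A minimal Python sketch, assuming a hypothetical `checked_gt` helper and exception type (neither is a standard API), with `should_steer_left` standing in for the example above:

```python
import math

class NanComparisonError(ArithmeticError):
    """Hypothetical exception type for comparisons that involve NaN."""

def checked_gt(value: float, threshold: float) -> bool:
    """Return value > threshold, but refuse to answer if either side is NaN."""
    if math.isnan(value) or math.isnan(threshold):
        raise NanComparisonError(
            f"cannot order {value!r} against {threshold!r}"
        )
    return value > threshold

def should_steer_left(y: float) -> bool:
    """Hypothetical decision function: raises instead of defaulting to False."""
    return checked_gt(y, 0.0)

print(should_steer_left(0.5))    # True
print(should_steer_left(-0.5))   # False
try:
    should_steer_left(float("nan"))
except NanComparisonError as error:
    print("escalate:", error)
```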
        
         | ElevenLathe wrote:
         | The compiler presumably can't know in most cases, but the
         | runtime might be able to throw. It depends on the language
         | implementation and the tradeoffs.
        
       ___________________________________________________________________
       (page generated 2022-03-05 23:01 UTC)