[HN Gopher] How to compare floats in Python
___________________________________________________________________
How to compare floats in Python
Author : EntICOnc
Score : 84 points
Date : 2022-03-30 15:23 UTC (7 hours ago)
(HTM) web link (davidamos.dev)
(TXT) w3m dump (davidamos.dev)
| PennRobotics wrote:
| I never noticed math.isclose() and have often done the naive a -
| b < eps
|
| There is also (for rolling your own fn):
|
|     import sys
|     sys.float_info.epsilon
|
| although I can't immediately think of an advantage over the
| module.
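| A quick sketch of the difference (values here purely illustrative): a
| fixed eps works near 1.0 but not across magnitudes, while
| math.isclose() scales its tolerance with the inputs.

```python
import math

# A fixed absolute epsilon works fine for numbers near 1.0 ...
eps = 1e-9
assert abs((0.1 + 0.2) - 0.3) < eps

# ... but breaks down for large magnitudes, where even a tiny
# *relative* difference is a huge *absolute* one.
x = 1e20
y = x * (1 + 1e-10)          # differs from x by roughly 1e10
print(abs(x - y) < eps)      # False: the absolute check fails
print(math.isclose(x, y))    # True: default rel_tol=1e-09 scales with x
```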
| gotaquestion wrote:
| Why do you consider that naive?
|
| Also, I wouldn't use epsilon, because that is a very small
| number and is related to representation of numbers. I would
| instead use a larger "tolerance" value, in `if |a-b| < tol ...`
| because that indicates the expected accuracy of your algorithm
| (unless you're really working with 1e-80 tolerance?). But TFA
| explains that.
|
| But back to point (a) is that not reasonable? (If not, I've got
| some code to change... :)
| PennRobotics wrote:
| Naive because it's just the easiest way. (Also, right, I
| forgot abs.)
|
| I imagine the downside with a chosen "epsilon" (which I used
| as an alias for any tolerance) is that it can be truncated on
| some machines, plus this equation on its own doesn't do
| relative accuracy.
|
| On a LEGO EV3, Python's float is 1e38 max with 7 significant
| digits.
|
| I've never run into problems, which is why I keep using it,
| too.
| nabla9 wrote:
| Not mentioned in the article, but numpy.isclose() works for the
| 0.0 case, unlike math.isclose().
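| A minimal illustration of that difference (default tolerances):

```python
import math
import numpy as np

# A purely relative tolerance can never match 0.0: rel_tol * 0.0 is 0.0.
print(math.isclose(1e-10, 0.0))                # False out of the box
print(bool(np.isclose(1e-10, 0.0)))            # True: default atol=1e-08

# math.isclose() handles it too, but only with an explicit abs_tol:
print(math.isclose(1e-10, 0.0, abs_tol=1e-8))  # True
```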
| PennRobotics wrote:
| An explanation for the atol difference between numpy and math,
| by the author of math.isclose():
|
| https://github.com/numpy/numpy/issues/10161#issuecomment-350...
|
| plus a ton of flogging/benchmarking common libraries (numpy,
| scipy, scikit-learn) and discussion about np.isclose()
|
| ... and also this gem:
|
|     import numpy as np
|     a = 1
|     b = 10
|     rtol = 1
|     atol = 0
|     np.isclose(a, b, rtol=rtol, atol=atol)  # True
|     np.isclose(b, a, rtol=rtol, atol=atol)  # False
|
| (For the record:)
|
|     import math
|     math.isclose(a, b, rel_tol=rtol, abs_tol=atol)  # True
|     math.isclose(b, a, rel_tol=rtol, abs_tol=atol)  # True
| da12 wrote:
| Could you explain what you mean by the 0.0 case? math.isclose()
| has an abs_tol parameter for, among other things, handling
| comparisons to 0.
| macintux wrote:
| I assume "out of the box" is the distinction.
| sacrosancty wrote:
| I'm not comfortable they recommend using an epsilon (isclose()).
| That's begging for mysterious bugs when the numbers are near the
| threshold. Better to just not compare floats for equality except
| when you're confident they can be exactly equal. Usually, that
| doesn't make sense, any more than comparing measurements of
| weight for equality.
| worik wrote:
| Once again, I am glad I am not a Python programmer.
|
| What is this: "math.isclose()"???
|
| In computing floating point numbers cannot be compared for
| equality. It is obvious why.
|
| So "math.isclose()" is for people who cannot use "<" or ">"
| operators?
|
| Golly. Do any programmers need it explained to them why floating
| point numbers are not precise and equality does not apply?
|
| Decimal types are an abomination. The Fraction type blows my
| mind. Why? It is the sort of nonsense C++ programmers loved to do
| in the twentieth century, and we all (?) learnt from that and do
| not do such opaque things any more.
|
| This is all very elementary mathematics, and a language is
| wasting its time making easy things easy. Programmers must learn
| the easy things.
| Veedrac wrote:
| Either people are doing very strange things with floating point
| numbers or this is generally pretty poor advice. You should not
| generally try to ham-fistedly pretend that approximate floating
| point operations are producing exact results. You should not
| generally be looking for exact equality of floating point values
| except in the case where you have a justified reason to believe
| that a floating point computation would arrive at an exact
| result. You should not generally be adding approximate equality
| terms to your directional comparisons because all that does is
| switch the precise point at which you are dividing the input
| space, except now it is at a location that isn't consistent with
| the opposite test and isn't specified explicitly in the program
| text. You also shouldn't treat exact floating point values as if
| they are approximate--floating point is _not_ approximate, some
| floating point operations are (conditionally) approximate. A
| value of exactly 1.0, say from the result of min(1.0, f), is not
| going to arbitrarily change to a different value.
|
| Decimal is also not a solution to floating point approximation.
| Decimal has all of the same rounding properties that binary
| floating point has, and actually it's a bit less well behaved.
| The differences are that Decimal round trips between textual
| decimal representations, and that Decimal supports an arbitrary
| choice of bit length.
|
| There are cases where one does need to check for convergence to
| arbitrary values, but these are probably not what you are doing,
| and if they are you should probably know more about the precise
| numerical guarantees that you have.
| dmurray wrote:
| > You also shouldn't treat exact floating point values as if
| they are approximate--floating point is not approximate, some
| floating point operations are (conditionally) approximate.
|
| Depending on your domain, they usually are. Perhaps your
| floating point number represents a reading from a sensor which
| has much less precision than your floating point type.
|
| It would be more correct to model all of your approximate
| variables with error bars (and perhaps even error
| distributions) but in practice, treating the closest floating
| point number as a point estimate and doing all your comparisons
| to within some reasonable tolerance may be an acceptable
| approach.
| sacrosancty wrote:
| But why would you need to compare such measurements for
| equality with an arbitrary tolerance? If your algorithm is
| doing something like producing a yes/no answer from some
| measured data then it should be obvious what tolerance to use
| according to the domain. I would code that explicitly using
| inequalities instead of depending on some mystery default
| epsilon.
| sfvisser wrote:
| You're right here, I think, but it is a _hard_ problem in
| practice, I've noticed. Hard in the engineering sense, because
| you're sometimes writing generic library-level code that
| needs a domain-specific notion of error bars and
| tolerances. So you need to parametrize everything,
| sometimes across multiple dimensions. It becomes messy quickly.
|
| I must admit putting a mystery epsilon in my code here and
| there for that reason. Admittedly the wrong thing to do.
| Veedrac wrote:
| This is a completely legitimate concern but it isn't about
| floating point. You could be getting integer readings from
| your sensor and you would have exactly the same concern. In
| such cases you need to provide tolerances in your code that
| correspond to the tolerances of your measurement. You might
| need to implement hysteresis, you might need to take multiple
| samples and track moving averages, you might need to take
| empirical measurements of the variance you expect and
| suppress readings below a threshold. It is very rare however
| for the solution to just be to say two values compare equal
| if they are within a billionth part from the other.
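| For what it's worth, the hysteresis idea can be sketched in a few
| lines (names and thresholds entirely made up): instead of comparing
| a noisy reading for equality against a cutoff, use two thresholds
| and keep the previous state in between.

```python
def make_threshold(on_at: float, off_at: float):
    """Return a stateful comparator with hysteresis (illustrative only)."""
    assert off_at < on_at
    state = False

    def update(reading: float) -> bool:
        nonlocal state
        if reading >= on_at:
            state = True
        elif reading <= off_at:
            state = False
        # Between the two thresholds: keep the last state, so noise
        # near a single cutoff can't make the output flap.
        return state

    return update

fan_on = make_threshold(on_at=30.0, off_at=28.0)
print(fan_on(29.0))  # False: starts off, reading is between thresholds
print(fan_on(30.5))  # True: crossed the upper threshold
print(fan_on(29.0))  # True: same reading as before, but state is kept
print(fan_on(27.5))  # False: crossed the lower threshold
```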
| Enginerrrd wrote:
| >You should not generally be looking for exact equality of
| floating point values except in the case where you have a
| justified reason to believe that a floating point computation
| would arrive at an exact result.
|
| I mean, yes, but there are plenty of cases where a result would
| be equal from an analytic perspective, but the implementation
| never realizes this due to floating point arithmetic.
|
| >You also shouldn't treat exact floating point values as if
| they are approximate--floating point is not approximate, some
| floating point operations are (conditionally) approximate. A
| value of exactly 1.0, say from the result of min(1.0, f), is
| not going to arbitrarily change to a different value.
|
| You're not wrong, but I think you also need to be careful here
| in how you communicate this. I've encountered plenty of
| situations where low-level details of floating point arithmetic
| mattered because it was only (in a sense) approximate, and I've
| even gotten non-deterministic behavior out of what should have
| been a deterministic program due to floating point issues.
| [deleted]
| Veedrac wrote:
| I agree that you need to be cautious asserting that you know
| the exact value of a floating point value, because it isn't a
| property you get by default.
|
| Generally, though, if it isn't fairly obvious that equality
| should be preserved, and if you don't have exactness as a
| requirement, then you probably shouldn't be building your
| solution around equality checks at all.
| noobermin wrote:
| What a strange comment to say "A value of exactly 1.0, say from
| the result of min(1.0,f), is not going to arbitrarily change to
| a different value." Christ man, floating point arithmetic is on
| computers; it's definitely _deterministic_, no one is saying
| it's indeterminable, lol, including the author. It's not like
| we're off in LSD space where numbers change meaning from moment
| to moment; it's only that there is information lost in each
| operation, which leads to accumulating error, such that exact
| comparisons usually do not express what you mean to compare in
| computer programs. That's all, and I don't think the OP is
| saying anything else.
|
| Other than that, I'm not sure what you're adding. The point is
| that people are unclear when they write code because they do not
| understand floats, and because few people really understand
| round-off error, they should as a rule compare with np.isclose
| unless, as you say, they know for a fact they expect a specific
| floating point number (although the times that happens are far
| and away the minority in actual code).
| cuteboy19 wrote:
| > Decimal has all of the same rounding properties that binary
| floating point has
|
| This is not true.
|
| > Unlike hardware based binary floating point, the decimal
| module has a user alterable precision (defaulting to 28 places)
| which can be as large as needed for a given problem
|
| > Both binary and decimal floating point are implemented in
| terms of published standards. While the built-in float type
| exposes only a modest portion of its capabilities, the decimal
| module exposes all required parts of the standard. When needed,
| the programmer has full control over rounding and signal
| handling. This includes an option to enforce exact arithmetic
| by using exceptions to block any inexact operations.
|
| https://docs.python.org/3/library/decimal.html
| noobermin wrote:
| I think they mean in the abstract. With a fixed precision,
| you have round-off error as long as you only have a fixed
| number of digits, and thus have all the attendant issues in
| computations with them.
| jsmith45 wrote:
| Decimal representations have one really minor advantage: they
| behave more the way people expect with respect to which numbers
| are exactly representable especially in the size ranges of
| every-day numbers, since they exhibit mostly the same behavior
| as calculators do.
|
| Plenty of people find that 0.2 not being exactly representable
| in binary floating point is not intuitive.
| [deleted]
| cdavid wrote:
| This is not a very good article, and I would not recommend it
| beyond simple use cases. The problem is that there is no right
| way; it depends on the use case and the magnitude of the numbers
| you're comparing. See e.g.
| https://bitbashing.io/comparing-floats.html as a better
| reference.
|
| The fundamental difficulty of comparing floats is that the format
| ensures a near-constant number of digits of precision _regardless
| of the scale_. This is very useful for most calculations because
| it means you can calculate without worrying too much about the
| magnitude of your numbers. But it means that the smallest
| representable difference between two numbers gets bigger as the
| numbers get bigger: that's why they are called floating point.
|
| That's why using a tolerance, etc. is not so reliable: the
| tolerance will depend on the magnitude of the numbers, even with
| those simple tricks. In particular, it is important to understand
| that epsilon is only "correct" around 1, i.e. a + eps != a
| is only true if a is close to 1. More precisely, epsilon is the
| smallest number such that 1 + eps != 1.
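| That last point is easy to demonstrate (a small sketch using
| sys.float_info.epsilon):

```python
import sys

eps = sys.float_info.epsilon  # 2**-52: the gap between 1.0 and the next float

print(1.0 + eps != 1.0)      # True: eps is exactly one ulp at 1.0
print(1e16 + eps == 1e16)    # True: at 1e16 the gap between floats is 2.0,
                             # so adding ~2.2e-16 rounds straight back
print(1e-16 + eps != 1e-16)  # True, but here eps is over twice the number itself
```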
| nealabq wrote:
| I have another problem with ==. I have some code that needs to
| know if two (nested) tuples of Any are equivalent. I'd like to
| test (tupleA == tupleB), but that doesn't work if NaN appears
| anywhere in tuples because (NaN != NaN).
|
| This is because == is an overloaded concept. Python has chosen to
| define == in the way that makes the most mathematical sense in
| the case of NaN.
|
| Maybe there's a metaprogramming way to override how == is
| interpreted for floats?
|
|     with float_equality_test(math.isclose):
|         assert (0.1 + 0.2) == 0.3
|
|     # This still allows (+0.0 == -0.0):
|     def my_float_eq(a: float, b: float) -> bool:
|         return (math.isnan(a) and math.isnan(b)) or (a == b)
|
|     with float_equality_test(my_float_eq):
|         assert (1, 2.0, math.nan) == (1, 2.0, math.nan)
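| As far as I know there is no such hook for float's ==, but a small
| recursive helper (hypothetical name, sketch only) gets the same
| effect without any metaprogramming:

```python
import math

def tuples_equivalent(a, b) -> bool:
    """Compare nested tuples, treating NaN as equal to NaN (sketch only)."""
    if isinstance(a, tuple) and isinstance(b, tuple):
        return len(a) == len(b) and all(
            tuples_equivalent(x, y) for x, y in zip(a, b))
    if isinstance(a, float) and isinstance(b, float):
        return (math.isnan(a) and math.isnan(b)) or a == b
    return a == b

# Distinct NaN objects defeat plain tuple equality ...
print((1, 2.0, float("nan")) == (1, 2.0, float("nan")))  # False
# ... but the helper recurses and special-cases NaN:
print(tuples_equivalent((1, (2.0, float("nan"))),
                        (1, (2.0, float("nan")))))       # True
```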
| [deleted]
| cuteboy19 wrote:
| This is true for all languages, not just Python.
|
| Unfortunately C++ STL does not offer anything close to isclose
| (boost does however). Neither does Java. Float.compare is just ==
| in disguise.
|
| But while going through the docs I stumbled upon this Gem
|
| > If f1 and f2 both represent Float.NaN, then the equals method
| returns true, even though Float.NaN==Float.NaN has the value
| false.
|
| https://docs.oracle.com/javase/7/docs/api/java/lang/Float.ht...
| saurik wrote:
| > But while going through the docs I stumbled upon this Gem
|
| You make it sound like that's crazy or something but the goal
| of .compare is to be used as a "comparator" to provide an
| _ordering_ of all possible values in the implementation of
| various data structures or sorting algorithms, while the goal
| of the various operators is to support programmatic logic.
| cuteboy19 wrote:
| We need a little bit of background for this. In Java the ==
| operator checks if two variables are pointing to the same
| object. The .equals() method checks for actual value
| equality.
|
|     new String("Hi") == new String("Hi");       // False
|     new String("Hi").equals(new String("Hi"));  // True
|
| Except for primitives like ints and floats, where == does
| value comparison:
|
|     1 == 1;  // True
|
| For NaN the value comparison is false. Even then .equals
| returns true:
|
|     Float.NaN == Float.NaN;                     // False
|     new Float(0f/0f).equals(new Float(0f/0f));  // True!!!
| mlochbaum wrote:
| > This is true for all languages, not just Python.
|
| Not quite: APL does comparisons with a configurable relative
| tolerance, which is nonzero by default. My article [0] about
| comparing one number to many quickly in this system starts with
| a discussion of the reasoning and mathematics there. It brings
| a lot of implementation difficulty, particularly with hash
| tables. It's possible to build a hash on tolerant doubles by
| making two hashes per lookup, and even complex numbers with
| four per lookup, but no one's figured out how to deal with
| entries that contain multiple numbers yet.
|
| Nonetheless I would say it does make it easier to write working
| programs overall. But not by much: I've been working on an APL
| derivative and initially left out comparison tolerance as there
| were some specifics I was unsure of. A year or so later I
| noticed I hardly ever needed it even for numerical work and
| dropped any plans to add it to the language.
|
| [0] https://www.dyalog.com/blog/2018/11/tolerated-comparison-
| par...
| kgm wrote:
| For another breakdown of the details of floating-point precision,
| I recommend qntm's "0.1 + 0.2 returns 0.30000000000000004":
|
| https://qntm.org/notpointthree
| daenz wrote:
| Floating point numbers are a common source of issues in large
| open world games. Because of how a float is stored, there is more
| precision for values closer to the "origin" of the world (0, 0,
| 0). Which means there is less precision for very far away things.
| This can result in things like "z-fighting"[0], which I'm sure
| most gamers have seen, and also issues with physics.
|
| One solution to address some of the precision issues (not
| necessarily z-fighting, as it operates in a different coordinate
| system) is to dynamically re-adjust the player's world origin as
| they move throughout the world. This way, they are always getting
| the highest float precision for things near to them.
|
| 0. https://en.wikipedia.org/wiki/Z-fighting
| jbay808 wrote:
| Is there a reason that game developers don't use fixed-point
| math for Cartesian coordinates?
|
| A 64 bit integer and a 64 bit float both chop up your
| coordinate system into the same number of points, but with the
| integer, those points are equally spaced which is the behaviour
| you'd expect from a Cartesian coordinate system (based on the
| symmetry group of translational invariance).
|
| And even a 32-bit integer is still fine enough resolution to
| support four kilometres at one-micrometer resolution. With 64
| bits per axis you can represent the entire solar system with 15
| nm resolution, while maintaining _equal resolution at any
| location_ , and exact distance calculations between any points
| no matter how close or how far.
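| A back-of-the-envelope sketch of those numbers (Python ints standing
| in for fixed-point storage):

```python
# Coordinates stored as integer micrometres: resolution is uniform
# everywhere, unlike floats, whose ulp grows with magnitude.
print(2**31 // 1_000_000)  # 2147: a signed 32-bit axis spans ~2 km at 1 um

# Squared distances between integer points are exact at any offset
# from the origin:
ax, ay = 10**12, 10**12        # a point ~1000 km out, in micrometres
bx, by = ax + 3, ay + 4        # 5 um away
print((bx - ax) ** 2 + (by - ay) ** 2)  # 25, exactly, regardless of origin
```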
| heinrich5991 wrote:
| Rendering is usually done with floating point on graphics
| cards, but I don't know if this is a requirement.
| eropple wrote:
| Having asked this myself once, and tried to write it: it is
| _hilariously_ slow to render. Graphics cards are float
| crunchers. Changing one's frame of reference is not trivial
| but isn't impossible, and it is much faster.
| jbay808 wrote:
| The rendering can be done relative to the camera position
| though, can't it?
|
| So for the graphics you just subtract all world coordinates
| from the camera coordinate, and cast the result to float;
| for the game physics and AI, you work directly in fixed
| point.
| eropple wrote:
| That is typically done with matrix transformations, which
| all end up in floating-point space anyway. Having to do
| integer-to-float transforms for _everything_ to get you
| there is bad news.
| [deleted]
| dimatura wrote:
| Not just in games, but also the real world. I've had issues
| implementing robotic mapping code coming from the naive use of
| GPS coordinates, for which the solution was the same as you
| just said: dynamically adjusting the origin.
| ryanianian wrote:
| This is manifested as "far lands" in Java editions of
| minecraft. Locations far from the origin run into the precision
| limitations of floats, resulting in jittering and other
| rendering oddities.
|
| https://www.youtube.com/watch?v=crAa9-5tPEI
| cyber_kinetist wrote:
| I've heard that the Star Citizen engine devs had to change all
| of their math operations in CryEngine from single to double
| precision to add support for seamless large worlds (the player
| origin hack has limits...) Don't even want to imagine how awful
| of a nightmare that would have been.
| leetrout wrote:
| I abuse `Decimal` all the time in python. I usually make a little
| helper with a very short name, like:
|
|     def d(value: Any) -> Decimal: ...
|
| Then use that everywhere I expect a float (or just anywhere).
| Decimal's constructor is so forgiving (allowing strings, ints,
| floats, other Decimals) that it is very convenient to use. Of
| course there is the perf penalty, but I always think the
| precision is worth the tradeoff.
| caturopath wrote:
| Decimal has the same fundamental representation issues that
| float does, albeit with greater configuration.
| bsder wrote:
| It does, but those issues manifest themselves in ways that
| humans have been trained to operate.
|
| "Round to nearest even" for binary floating point is weird to
| anyone who doesn't have a numerics background. "Round 0.5 up"
| is normal to humans because that's what most have been
| taught.
|
| Future programming languages should probably default to
| Decimal Floating Point and allow people to opt-in to binary
| floating point on request.
| caturopath wrote:
| > It does, but those issues manifest themselves in ways
| that humans have been trained to operate.
|
| Not really. If you do computations like "convert Fahrenheit
| to Celsius" or "pay 6 days of interest at this APR" or a
| million other things, you run into the same basic faulty
| assumptions as ever.
| zokier wrote:
| wouldn't just
|
|     from decimal import Decimal as d
|
| work as well?
| leetrout wrote:
| Sorry, I am doing stuff in the function... here is one of the
| old ones from code I can share:
|
|     def dec(value, prec=4):
|         """Return the given value as a decimal rounded to the
|         given precision."""
|         if value is None:
|             return Decimal(0)
|         value = Decimal(value)
|         ret = Decimal(str(round(value, prec)))
|         if ret.is_zero():
|             # this avoids stuff like Decimal('0E-8')
|             return Decimal(0)
|         return ret
| petercooper wrote:
| I know the answer is "yes" because the majority tends to have a
| point.. but are there good reasons, beyond backwards
| compatibility, for languages to be using IEEE 754 floating point
| arithmetic nowadays rather than just storing decimals "precisely"
| (to a specific degree of resolution)? Or are any new languages
| eschewing IEEE 754 entirely? (I'm aware of BigDecimal, etc. but
| these still seem to be treated as a bonus feature rather than
| 'the way'.)
| bee_rider wrote:
| They should have convenient hardware support (vector extensions
| for example). I'm sure you _could_ design floating-point
| decimals that work nicely with vector extensions (which usually
| do cover integers), but it would be a significant project.
|
| If your language is going to plug into the numerical computing
| stack (BLAS+LAPACK then all the fun stuff built up on that)
| then it'll need to talk binary floats.
|
| All the annoying stuff that numericists understand but would
| like to not mess around with, like rounding directions and
| denormals, are handled nicely with 754 floats.
|
| These are all sort of backward compatibility/legacy issues in
| the sense that they are based on decisions made in the past,
| but the hardware and libraries aren't going anywhere I bet!
|
| Also, note that IEEE 754 _does_ define decimal interchange
| formats. I bet they aren't handled as nicely in hardware,
| though.
| rich_sasha wrote:
| Floats have an enormous range, with fixed relative precision.
| Even a single-precision float can store numbers up to about
| 3.4e38.
|
| Now of course you pay for that by losing absolute precision,
| but chances are that if you're working with numbers like 1e20
| you don't much care about anything after the decimal point.
| andrewmcwatters wrote:
| I remember having to write something similar for Lua, because
| there was no math.approximately() function and I was dealing with
| networking floats and comparing values over the wire to predicted
| movement in 2D space.
|
| https://github.com/Planimeter/grid-sdk/blob/master/engine/sh...
|
| https://en.wikipedia.org/wiki/Machine_epsilon
|
| https://pubs.opengroup.org/onlinepubs/009696899/basedefs/flo...
| wheelerof4te wrote:
| Step one:
|
| You don't. At least, not exact numbers. You almost always have to
| round the numbers, or just approximate the value.
|
| So, while 0.1 + 0.2 != 0.3
|
| this is always correct: 0.1 + 0.2 < 0.4
| SloopJon wrote:
| See issue 1580 for a description of the change that displays
| floats using the shortestish decimal expansion that round trips:
|
| https://bugs.python.org/issue1580
|
| I thought there was a PEP for it, but evidently not.
|
| Although this is generally an improvement, it can contribute to
| confusion.
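| The behaviour is easy to see from the REPL (a small sketch):

```python
# Since Python 3.1 (issue 1580), repr() picks the shortest decimal
# string that round-trips to the same float.
x = 0.1
print(repr(x))              # 0.1 -- short, but still round-trips:
print(float(repr(x)) == x)  # True

# The stored value is not exactly one tenth, as a fixed-width format shows:
print(f"{x:.20f}")          # 0.10000000000000000555
```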
___________________________________________________________________
(page generated 2022-03-30 23:01 UTC)