[HN Gopher] How to compare floats in Python
___________________________________________________________________
How to compare floats in Python
Author : EntICOnc
Score : 84 points
Date : 2022-03-30 15:23 UTC (7 hours ago)
(HTM) web link (davidamos.dev)
(TXT) w3m dump (davidamos.dev)
| PennRobotics wrote:
| I never noticed math.isclose() and have often done the naive a -
| b < eps
|
| There is also (for rolling your own fn):
|
|     import sys
|     sys.float_info.epsilon
|
| although I can't immediately think of an advantage over the
| module.
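| A quick sketch of the difference (values here purely illustrative): a
| fixed eps works near 1.0 but not across magnitudes, while
| math.isclose() scales its tolerance with the inputs.

```python
import math

# A fixed absolute epsilon works fine for numbers near 1.0 ...
eps = 1e-9
assert abs((0.1 + 0.2) - 0.3) < eps

# ... but breaks down for large magnitudes, where even a tiny
# *relative* difference is a huge *absolute* one.
x = 1e20
y = x * (1 + 1e-10)          # differs from x by roughly 1e10
print(abs(x - y) < eps)      # False: the absolute check fails
print(math.isclose(x, y))    # True: default rel_tol=1e-09 scales with x
```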
| gotaquestion wrote:
| Why do you consider that naive?
|
| Also, I wouldn't use epsilon, because that is a very small
| number and is related to representation of numbers. I would
| instead use a larger "tolerance" value, in `if |a-b| < tol ...`
| because that indicates the expected accuracy of your algorithm
| (unless you're really working with 1e-80 tolerance?). But TFA
| explains that.
|
| But back to point (a) is that not reasonable? (If not, I've got
| some code to change... :)
| PennRobotics wrote:
| Naive because it's just the easiest way. (Also, right, I
| forgot abs.)
|
| I imagine the downside with a chosen "epsilon" (which I used
| as an alias for any tolerance) is that it can be truncated on
| some machines, plus this equation on its own doesn't do
| relative accuracy.
|
| On a LEGO EV3, Python's float is 1e38 max with 7 significant
| digits.
|
| I've never run into problems, which is why I keep using it,
| too.
| nabla9 wrote:
| Not mentioned in the article, but numpy.isclose() works for the
| 0.0 case, unlike math.isclose().
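| A minimal illustration of that difference (default tolerances):

```python
import math
import numpy as np

# A purely relative tolerance can never match 0.0: rel_tol * 0.0 is 0.0.
print(math.isclose(1e-10, 0.0))                # False out of the box
print(bool(np.isclose(1e-10, 0.0)))            # True: default atol=1e-08

# math.isclose() handles it too, but only with an explicit abs_tol:
print(math.isclose(1e-10, 0.0, abs_tol=1e-8))  # True
```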
| PennRobotics wrote:
| An explanation for the atol difference between numpy and math,
| by the author of math.isclose():
|
| https://github.com/numpy/numpy/issues/10161#issuecomment-350...
|
| plus a ton of flogging/benchmarking common libraries (numpy,
| scipy, scikit-learn) and discussion about np.isclose()
|
| ... and also this gem:
|
|     import numpy as np
|     a = 1
|     b = 10
|     rtol = 1
|     atol = 0
|     np.isclose(a, b, rtol=rtol, atol=atol)  # True
|     np.isclose(b, a, rtol=rtol, atol=atol)  # False
|
| (For the record:)
|
|     import math
|     math.isclose(a, b, rel_tol=rtol, abs_tol=atol)  # True
|     math.isclose(b, a, rel_tol=rtol, abs_tol=atol)  # True
| da12 wrote:
| Could you explain what you mean by the 0.0 case? math.isclose()
| has an abs_tol parameter for, among other things, handling
| comparisons to 0.
| macintux wrote:
| I assume "out of the box" is the distinction.
| sacrosancty wrote:
| I'm not comfortable they recommend using an epsilon (isclose()).
| That's begging for mysterious bugs when the numbers are near the
| threshold. Better to just not compare floats for equality except
| when you're confident they can be exactly equal. Usually, that
| doesn't make sense, any more than comparing measurements of
| weight for equality.
| worik wrote:
| Once again, I am glad I am not a Python programmer.
|
| What is this: "math.isclose()"???
|
| In computing floating point numbers cannot be compared for
| equality. It is obvious why.
|
| So "math.isclose()" is for people who cannot use "<" or ">"
| operators?
|
| Golly. Do any programmers need it explained to them why floating
| point numbers are not precise and equality does not apply?
|
| Decimal types are an abomination. The Fraction type blows my
| mind. Why? It is the sort of nonsense C++ programmers loved to do
| in the twentieth century, and we all (?) learnt from that and do
| not do such opaque things any more.
|
| This is all very elementary mathematics, and a language is
| wasting its time making easy things easy. Programmers must learn
| the easy things.
| Veedrac wrote:
| Either people are doing very strange things with floating point
| numbers or this is generally pretty poor advice. You should not
| generally try to ham-fistedly pretend that approximate floating
| point operations are producing exact results. You should not
| generally be looking for exact equality of floating point values
| except in the case where you have a justified reason to believe
| that a floating point computation would arrive at an exact
| result. You should not generally be adding approximate equality
| terms to your directional comparisons because all that does is
| switch the precise point at which you are dividing the input
| space, except now it is at a location that isn't consistent with
| the opposite test and isn't specified explicitly in the program
| text. You also shouldn't treat exact floating point values as if
| they are approximate--floating point is _not_ approximate, some
| floating point operations are (conditionally) approximate. A
| value of exactly 1.0, say from the result of min(1.0, f), is not
| going to arbitrarily change to a different value.
|
| Decimal is also not a solution to floating point approximation.
| Decimal has all of the same rounding properties that binary
| floating point has, and actually it's a bit less well behaved.
| The differences are that Decimal round trips between textual
| decimal representations, and that Decimal supports an arbitrary
| choice of bit length.
|
| There are cases where one does need to check for convergence to
| arbitrary values, but these are probably not what you are doing,
| and if they are you should probably know more about the precise
| numerical guarantees that you have.
| dmurray wrote:
| > You also shouldn't treat exact floating point values as if
| they are approximate--floating point is not approximate, some
| floating point operations are (conditionally) approximate.
|
| Depending on your domain, they usually are. Perhaps your
| floating point number represents a reading from a sensor which
| has much less precision than your floating point type.
|
| It would be more correct to model all of your approximate
| variables with error bars (and perhaps even error
| distributions) but in practice, treating the closest floating
| point number as a point estimate and doing all your comparisons
| to within some reasonable tolerance may be an acceptable
| approach.
| sacrosancty wrote:
| But why would you need to compare such measurements for
| equality with an arbitrary tolerance? If your algorithm is
| doing something like producing a yes/no answer from some
| measured data then it should be obvious what tolerance to use
| according to the domain. I would code that explicitly using
| inequalities instead of depending on some mystery default
| epsilon.
| sfvisser wrote:
| You're right here, I think, but it is a _hard_ problem in
| practice, I've noticed. Hard in the engineering sense, because
| you're sometimes writing generic library-level code that
| needs a domain-specific notion of error bars and
| tolerances. So you need to parametrize everything,
| sometimes across multiple dimensions. It becomes messy quickly.
|
| I must admit putting a mystery epsilon in my code here and
| there for that reason. Admittedly the wrong thing to do.
| Veedrac wrote:
| This is a completely legitimate concern but it isn't about
| floating point. You could be getting integer readings from
| your sensor and you would have exactly the same concern. In
| such cases you need to provide tolerances in your code that
| correspond to the tolerances of your measurement. You might
| need to implement hysteresis, you might need to take multiple
| samples and track moving averages, you might need to take
| empirical measurements of the variance you expect and
| suppress readings below a threshold. It is very rare however
| for the solution to just be to say two values compare equal
| if they are within a billionth part from the other.
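| For what it's worth, the hysteresis idea can be sketched in a few
| lines (names and thresholds entirely made up): instead of comparing
| a noisy reading for equality against a cutoff, use two thresholds
| and keep the previous state in between.

```python
def make_threshold(on_at: float, off_at: float):
    """Return a stateful comparator with hysteresis (illustrative only)."""
    assert off_at < on_at
    state = False

    def update(reading: float) -> bool:
        nonlocal state
        if reading >= on_at:
            state = True
        elif reading <= off_at:
            state = False
        # Between the two thresholds: keep the last state, so noise
        # near a single cutoff can't make the output flap.
        return state

    return update

fan_on = make_threshold(on_at=30.0, off_at=28.0)
print(fan_on(29.0))  # False: starts off, reading is between thresholds
print(fan_on(30.5))  # True: crossed the upper threshold
print(fan_on(29.0))  # True: same reading as before, but state is kept
print(fan_on(27.5))  # False: crossed the lower threshold
```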
| Enginerrrd wrote:
| >You should not generally be looking for exact equality of
| floating point values except in the case where you have a
| justified reason to believe that a floating point computation
| would arrive at an exact result.
|
| I mean, yes, but there are plenty of cases where a result would
| be equal from an analytic perspective, but the implementation
| never realizes this due to floating point arithmetic.
|
| >You also shouldn't treat exact floating point values as if
| they are approximate--floating point is not approximate, some
| floating point operations are (conditionally) approximate. A
| value of exactly 1.0, say from the result of min(1.0, f), is
| not going to arbitrarily change to a different value.
|
| You're not wrong, but I think you also need to be careful here
| in how you communicate this. I've encountered plenty of
| situations where low-level details of floating point arithmetic
| mattered because it was only (in a sense) approximate, and I've
| even gotten non-deterministic behavior out of what should have
| been a deterministic program due to floating point issues.
| [deleted]
| Veedrac wrote:
| I agree that you need to be cautious asserting that you know
| the exact value of a floating point value, because it isn't a
| property you get by default.
|
| Generally, though, if it isn't fairly obvious that equality
| should be preserved, and if you don't have exactness as a
| requirement, then you probably shouldn't be building your
| solution around equality checks at all.
| noobermin wrote:
| What a strange comment to say "A value of exactly 1.0, say from
| the result of min(1.0,f), is not going to arbitrarily change to
| a different value." Christ man, floating point arithmetic is on
| computers; it's definitely _deterministic_, no one is saying
| it's indeterminable, lol, including the author. It's not like
| we're off in LSD space where numbers change meaning from moment
| to moment; it's only that there is information lost in each
| operation, which leads to accumulating error, such that exact
| comparisons usually do not express what you mean to compare in
| computer programs. That's all, and I don't think the OP is
| saying anything else.
|
| Other than that, I'm not sure what you're adding. The point is
| that people are unclear when they write code because they do not
| understand floats, and because few people really understand
| round-off error, they should as a rule compare with np.isclose
| unless, as you say, they know for a fact they expect a specific
| floating point number (although the times that happens are far
| and away the minority in actual code).
| cuteboy19 wrote:
| > Decimal has all of the same rounding properties that binary
| floating point has
|
| This is not true.
|
| > Unlike hardware based binary floating point, the decimal
| module has a user alterable precision (defaulting to 28 places)
| which can be as large as needed for a given problem
|
| > Both binary and decimal floating point are implemented in
| terms of published standards. While the built-in float type
| exposes only a modest portion of its capabilities, the decimal
| module exposes all required parts of the standard. When needed,
| the programmer has full control over rounding and signal
| handling. This includes an option to enforce exact arithmetic
| by using exceptions to block any inexact operations.
|
| https://docs.python.org/3/library/decimal.html
| noobermin wrote:
| I think they mean in the abstract. With a fixed precision,
| you have round-off error as long as you only have a fixed
| number of digits, and thus have all the attendant issues in
| computations with them.
| jsmith45 wrote:
| Decimal representations have one really minor advantage: they
| behave more the way people expect with respect to which numbers
| are exactly representable especially in the size ranges of
| every-day numbers, since they exhibit mostly the same behavior
| as calculators do.
|
| Plenty of people find that 0.2 not being exactly representable
| in binary floating point is not intuitive.
| [deleted]
| cdavid wrote:
| This is not a very good article, and I would not recommend it
| beyond simple use cases. The problem is that there is no right
| way; it depends on the use case and the magnitude of the numbers
| you're comparing. See e.g.
| https://bitbashing.io/comparing-floats.html as a better
| reference.
|
| The fundamental difficulty of comparing floats is that the format
| ensures a near-constant number of digits of precision _regardless
| of the scale_. This is very useful for most calculations because
| it means you can calculate without worrying too much about the
| magnitude of your numbers. But it means that the smallest
| representable difference between two numbers gets bigger as the
| numbers get bigger: that's why they are called floating point.
|
| That's why using a tolerance, etc. is not so reliable: the
| tolerance will depend on the magnitude of the numbers, even with
| those simple tricks. In particular, it is important to understand
| that epsilon is only "correct" around 1, i.e. a + eps != a
| is only true if a is close to 1. More precisely, epsilon is the
| smallest number such that 1 + eps != 1.
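| That last point is easy to demonstrate (a small sketch using
| sys.float_info.epsilon):

```python
import sys

eps = sys.float_info.epsilon  # 2**-52: the gap between 1.0 and the next float

print(1.0 + eps != 1.0)      # True: eps is exactly one ulp at 1.0
print(1e16 + eps == 1e16)    # True: at 1e16 the gap between floats is 2.0,
                             # so adding ~2.2e-16 rounds straight back
print(1e-16 + eps != 1e-16)  # True, but here eps is over twice the number itself
```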
| nealabq wrote:
| I have another problem with ==. I have some code that needs to
| know if two (nested) tuples of Any are equivalent. I'd like to
| test (tupleA == tupleB), but that doesn't work if NaN appears
| anywhere in tuples because (NaN != NaN).
|
| This is because == is an overloaded concept. Python has chosen to
| define == in the way that makes the most mathematical sense in
| the case of NaN.
|
| Maybe there's a metaprogramming way to override how == is
| interpreted for floats?
|
|     with float_equality_test(math.isclose):
|         assert (0.1 + 0.2) == 0.3
|
|     # This still allows (+0.0 == -0.0):
|     def my_float_eq(a: float, b: float) -> bool:
|         return (math.isnan(a) and math.isnan(b)) or (a == b)
|
|     with float_equality_test(my_float_eq):
|         assert (1, 2.0, math.nan) == (1, 2.0, math.nan)
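| As far as I know there is no such hook for float's ==, but a small
| recursive helper (hypothetical name, sketch only) gets the same
| effect without any metaprogramming:

```python
import math

def tuples_equivalent(a, b) -> bool:
    """Compare nested tuples, treating NaN as equal to NaN (sketch only)."""
    if isinstance(a, tuple) and isinstance(b, tuple):
        return len(a) == len(b) and all(
            tuples_equivalent(x, y) for x, y in zip(a, b))
    if isinstance(a, float) and isinstance(b, float):
        return (math.isnan(a) and math.isnan(b)) or a == b
    return a == b

# Distinct NaN objects defeat plain tuple equality ...
print((1, 2.0, float("nan")) == (1, 2.0, float("nan")))  # False
# ... but the helper recurses and special-cases NaN:
print(tuples_equivalent((1, (2.0, float("nan"))),
                        (1, (2.0, float("nan")))))       # True
```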
| [deleted]
| cuteboy19 wrote:
| This is true for all languages, not just Python.
|
| Unfortunately C++ STL does not offer anything close to isclose
| (boost does however). Neither does Java. Float.compare is just ==
| in disguise.
|
| But while going through the docs I stumbled upon this Gem
|
| > If f1 and f2 both represent Float.NaN, then the equals method
| returns true, even though Float.NaN==Float.NaN has the value
| false.
|
| https://docs.oracle.com/javase/7/docs/api/java/lang/Float.ht...
| saurik wrote:
| > But while going through the docs I stumbled upon this Gem
|
| You make it sound like that's crazy or something but the goal
| of .compare is to be used as a "comparator" to provide an
| _ordering_ of all possible values in the implementation of
| various data structures or sorting algorithms, while the goal
| of the various operators is to support programmatic logic.
| cuteboy19 wrote:
| We need a little bit of background for this. In Java the ==
| operator checks if two variables are pointing to the same
| object. The .equals() method checks for actual value
| equality.
|
|     new String("Hi") == new String("Hi");       // False
|     new String("Hi").equals(new String("Hi"));  // True
|
| Except for primitives like ints and floats, where == does
| value comparison:
|
|     1 == 1;  // True
|
| For NaN the value comparison is false. Even then .equals
| returns true:
|
|     Float.NaN == Float.NaN;                     // False
|     new Float(0f/0f).equals(new Float(0f/0f));  // True!!!
| mlochbaum wrote:
| > This is true for all languages, not just Python.
|
| Not quite: APL does comparisons with a configurable relative
| tolerance, which is nonzero by default. My article [0] about
| comparing one number to many quickly in this system starts with
| a discussion of the reasoning and mathematics there. It brings
| a lot of implementation difficulty, particularly with hash
| tables. It's possible to build a hash on tolerant doubles by
| making two hashes per lookup, and even complex numbers with
| four per lookup, but no one's figured out how to deal with
| entries that contain multiple numbers yet.
|
| Nonetheless I would say it does make it easier to write working
| programs overall. But not by much: I've been working on an APL
| derivative and initially left out comparison tolerance as there
| were some specifics I was unsure of. A year or so later I
| noticed I hardly ever needed it even for numerical work and
| dropped any plans to add it to the language.
|
| [0] https://www.dyalog.com/blog/2018/11/tolerated-comparison-
| par...
| kgm wrote:
| For another breakdown of the details of floating-point precision,
| I recommend qntm's "0.1 + 0.2 returns 0.30000000000000004":
|
| https://qntm.org/notpointthree
| daenz wrote:
| Floating point numbers are a common source of issues in large
| open world games. Because of how a float is stored, there is more
| precision for values closer to the "origin" of the world (0, 0,
| 0). Which means there is less precision for very far away things.
| This can result in things like "z-fighting"[0], which I'm sure
| most gamers have seen, and also issues with physics.
|
| One solution to address some of the precision issues (not
| necessarily z-fighting, as it operates in a different coordinate
| system) is to dynamically re-adjust the player's world origin as
| they move throughout the world. This way, they are always getting
| the highest float precision for things near to them.
|
| 0. https://en.wikipedia.org/wiki/Z-fighting
| jbay808 wrote:
| Is there a reason that game developers don't use fixed-point
| math for Cartesian coordinates?
|
| A 64 bit integer and a 64 bit float both chop up your
| coordinate system into the same number of points, but with the
| integer, those points are equally spaced which is the behaviour
| you'd expect from a Cartesian coordinate system (based on the
| symmetry group of translational invariance).
|
| And even a 32-bit integer is still fine enough resolution to
| support four kilometres at one-micrometer resolution. With 64
| bits per axis you can represent the entire solar system with 15
| nm resolution, while maintaining _equal resolution at any
| location_ , and exact distance calculations between any points
| no matter how close or how far.
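| A back-of-the-envelope sketch of those numbers (Python ints standing
| in for fixed-point storage):

```python
# Coordinates stored as integer micrometres: resolution is uniform
# everywhere, unlike floats, whose ulp grows with magnitude.
print(2**31 // 1_000_000)  # 2147: a signed 32-bit axis spans ~2 km at 1 um

# Squared distances between integer points are exact at any offset
# from the origin:
ax, ay = 10**12, 10**12        # a point ~1000 km out, in micrometres
bx, by = ax + 3, ay + 4        # 5 um away
print((bx - ax) ** 2 + (by - ay) ** 2)  # 25, exactly, regardless of origin
```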
| heinrich5991 wrote:
| Rendering is usually done with floating point on graphics
| cards, but I don't know if this is a requirement.
| eropple wrote:
| Having asked this myself once, and tried to write it: it is
| _hilariously_ slow to render. Graphics cards are float
| crunchers. Changing one's frame of reference is not trivial
| but isn't impossible, and it is much faster.
| jbay808 wrote:
| The rendering can be done relative to the camera position
| though, can't it?
|
| So for the graphics you just subtract all world coordinates
| from the camera coordinate, and cast the result to float;
| for the game physics and AI, you work directly in fixed
| point.
| eropple wrote:
| That is typically done with matrix transformations, which
| all end up in floating-point space anyway. Having to do
| integer-to-float transforms for _everything_ to get you
| there is bad news.
| [deleted]
| dimatura wrote:
| Not just in games, but also the real world. I've had issues
| implementing robotic mapping code coming from the naive use of
| GPS coordinates, for which the solution was the same as you
| just said: dynamically adjusting the origin.
| ryanianian wrote:
| This is manifested as "far lands" in Java editions of
| minecraft. Locations far from the origin run into the precision
| limitations of floats, resulting in jittering and other
| rendering oddities.
|
| https://www.youtube.com/watch?v=crAa9-5tPEI
| cyber_kinetist wrote:
| I've heard that the Star Citizen engine devs had to change all
| of their math operations in CryEngine from single to double
| precision to add support for seamless large worlds (the player
| origin hack has limits...) Don't even want to imagine how awful
| of a nightmare that would have been.
| leetrout wrote:
| I abuse `Decimal` all the time in python. I usually make a little
| helper with a very short name, like:
|
|     def d(value: Any) -> Decimal: ...
|
| Then use that everywhere I expect a float (or just anywhere).
| Decimal's constructor is so forgiving (allowing strings, ints,
| floats, other Decimals) that it is very convenient to use. Of
| course there is the perf penalty, but I always think the
| precision is worth the tradeoff.
| caturopath wrote:
| Decimal has the same fundamental representation issues that
| float does, albeit with greater configuration.
| bsder wrote:
| It does, but those issues manifest themselves in ways that
| humans have been trained to operate.
|
| "Round to nearest even" for binary floating point is weird to
| anyone who doesn't have a numerics background. "Round 0.5 up"
| is normal to humans because that's what most have been
| taught.
|
| Future programming languages should probably default to
| Decimal Floating Point and allow people to opt-in to binary
| floating point on request.
| caturopath wrote:
| > It does, but those issues manifest themselves in ways
| that humans have been trained to operate.
|
| Not really. If you do computations like "convert Fahrenheit
| to Celsius" or "pay 6 days of interest at this APR" or a
| million other things, you run into the same basic faulty
| assumptions as ever.
| zokier wrote:
| wouldn't just
|
|     from decimal import Decimal as d
|
| work as well?
| leetrout wrote:
| Sorry, I am doing stuff in the function... here is one of the
| old ones from code I can share:
|
|     def dec(value, prec=4):
|         """Return the given value as a decimal rounded to the
|         given precision."""
|         if value is None:
|             return Decimal(0)
|         value = Decimal(value)
|         ret = Decimal(str(round(value, prec)))
|         if ret.is_zero():
|             # this avoids stuff like Decimal('0E-8')
|             return Decimal(0)
|         return ret
| petercooper wrote:
| I know the answer is "yes" because the majority tends to have a
| point.. but are there good reasons, beyond backwards
| compatibility, for languages to be using IEEE 754 floating point
| arithmetic nowadays rather than just storing decimals "precisely"
| (to a specific degree of resolution)? Or are any new languages
| eschewing IEEE 754 entirely? (I'm aware of BigDecimal, etc. but
| these still seem to be treated as a bonus feature rather than
| 'the way'.)
| bee_rider wrote:
| They should have convenient hardware support (vector extensions
| for example). I'm sure you _could_ design floating-point
| decimals that work nicely with vector extensions (which usually
| do cover integers), but it would be a significant project.
|
| If your language is going to plug into the numerical computing
| stack (BLAS+LAPACK then all the fun stuff built up on that)
| then it'll need to talk binary floats.
|
| All the annoying stuff that numericists understand but would
| like to not mess around with, like rounding directions and
| denormals, are handled nicely with 754 floats.
|
| These are all sort of backward compatibility/legacy issues in
| the sense that they are based on decisions made in the past,
| but the hardware and libraries aren't going anywhere I bet!
|
| Also, note that IEEE 754 _does_ define decimal interchange
| formats. I bet they aren't handled as nicely in hardware,
| though.
| rich_sasha wrote:
| Floats have an enormous range, with fixed relative precision.
| Even a single-precision float can store numbers up to about
| 3.4e38.
|
| Now of course you pay for that by losing absolute precision,
| but chances are that if you're working with numbers like 1e20
| you don't much care about anything after the decimal point.
| andrewmcwatters wrote:
| I remember having to write something similar for Lua, because
| there was no math.approximately() function and I was dealing with
| networking floats and comparing values over the wire to predicted
| movement in 2D space.
|
| https://github.com/Planimeter/grid-sdk/blob/master/engine/sh...
|
| https://en.wikipedia.org/wiki/Machine_epsilon
|
| https://pubs.opengroup.org/onlinepubs/009696899/basedefs/flo...
| wheelerof4te wrote:
| Step one:
|
| You don't. At least, not exact numbers. You almost always have to
| round the numbers, or just approximate the value.
|
| So, while 0.1 + 0.2 != 0.3
|
| this is always correct: 0.1 + 0.2 < 0.4
| SloopJon wrote:
| See issue 1580 for a description of the change that displays
| floats using the shortestish decimal expansion that round trips:
|
| https://bugs.python.org/issue1580
|
| I thought there was a PEP for it, but evidently not.
|
| Although this is generally an improvement, it can contribute to
| confusion.
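| The behaviour is easy to see from the REPL (a small sketch):

```python
# Since Python 3.1 (issue 1580), repr() picks the shortest decimal
# string that round-trips to the same float.
x = 0.1
print(repr(x))              # 0.1 -- short, but still round-trips:
print(float(repr(x)) == x)  # True

# The stored value is not exactly one tenth, as a fixed-width format shows:
print(f"{x:.20f}")          # 0.10000000000000000555
```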
___________________________________________________________________
(page generated 2022-03-30 23:01 UTC)