[HN Gopher] Catastrophic Cancellation (2020)
___________________________________________________________________
Catastrophic Cancellation (2020)
Author : _ZeD_
Score : 69 points
Date : 2021-03-19 08:05 UTC (14 hours ago)
(HTM) web link (twitter.com)
(TXT) w3m dump (twitter.com)
| ForHackernews wrote:
| Does Python's Decimal type handle this correctly?
| lazypenguin wrote:
| Yes it does and should be used when precision is important.
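|
| A quick sketch of the difference (values picked only to
| illustrate; near 1.0, doubles are spaced ~2.2e-16 apart):
|
|     from decimal import Decimal
|
|     # The true difference is 1e-16, but one operand rounds to
|     # 1.0 as a float, so the float subtraction is off by ~2.2x.
|     print(1.0000000000000002 - 1.0000000000000001)
|     # -> 2.220446049250313e-16
|
|     # Decimal keeps the digits as given (when built from
|     # strings), so the subtraction is exact.
|     print(Decimal("1.0000000000000002")
|           - Decimal("1.0000000000000001"))
|     # -> 1E-16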
| fchu wrote:
| Am I the only one who thinks the concept of relative error is not
| meaningful in this context?
|
| It gives a disproportionate meaning to 0 without real physical
| consideration, eg:
|
| - 0.10C +- 0.1 (wow, 100% relative error)
|
| - 273.25K +- 0.1 (meh, 0.04% relative error)
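|
| (Spelled out: 0.1 / 0.10 = 100%, while 0.1 / 273.25 ~ 0.00037,
| i.e. about 0.04%.)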
| jcheng wrote:
| This is talking about the error in the _difference between_ two
| values with the same units though. For temperature, it wouldn't
| matter if you're using C, K, or F for your starting values, the
| % error of the difference would be the same (I think).
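|
| A worked check with made-up numbers: 25 C - 20 C = 5 C; with,
| say, +-0.2 on the difference, that's 4% relative error. The
| same interval in Fahrenheit is 9 F +- 0.36 F, still 4%,
| because the unit conversion scales the difference and its
| uncertainty by the same factor (9/5).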
| Archelaos wrote:
| I am sometimes joking with friends by suggesting: Let's meet at
| 12 o'clock +- 5%.
| andrepd wrote:
| Yes, because there is an arbitrary choice of origin which
| renders the relative error dependant on units. If you're
| measuring a length, for instance, or an interval of time, the
| relative error is independent of which units you choose. If
| you're measuring e.g. a distance to some point, then again you
| have an arbitrary choice of origin.
| _Microft wrote:
| The Celsius temperature scale is an interval scale [0] which
| means that it is possible to calculate differences but not
| ratios. The Kelvin temperature scale is a ratio scale [0] (it
| has an "absolute zero"), which does allow ratios.
|
| Besides that, if there are uncertainties involved, one should
| do proper propagation of uncertainty anyway. [1]
|
| [0] https://en.wikipedia.org/wiki/Level_of_measurement
|
| [1] https://en.wikipedia.org/wiki/Propagation_of_uncertainty
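|
| A minimal sketch of [1] for the difference case, with
| illustrative numbers (independent errors add in quadrature):
|
|     import math
|
|     def subtract_with_uncertainty(x1, s1, x2, s2):
|         """x1 +- s1 minus x2 +- s2, errors independent."""
|         return x1 - x2, math.sqrt(s1 ** 2 + s2 ** 2)
|
|     # (25.0 +- 0.1) - (20.0 +- 0.1) -> (5.0, ~0.141)
|     print(subtract_with_uncertainty(25.0, 0.1, 20.0, 0.1))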
| amelius wrote:
| If you add a really large number, then your relative error will
| decrease!
| chias wrote:
| If you add a function which increments a dummy value 4000
| times, your test coverage will increase!
| choeger wrote:
| It is kind of a relief to not have this thread talk about the
| latest stunt of cancel culture activists.
|
| But I have to wonder: Why should I use a floating point number
| for something like a micro benchmark? Admittedly, counting from
| 1970 wastes some bits, but I am usually interested in some
| discrete quantity (ns, us, ...), so why introduce numerical
| problems in the first place?
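|
| For a concrete sense of the problem: a float holding seconds
| since 1970 has, in 2021, an ulp of about 2.4e-7 s, so
| nanoseconds silently vanish (timestamp below is made up):
|
|     t = 1_616_000_000.0   # seconds since 1970, early 2021
|     print(t + 1e-9 == t)  # True: a nanosecond is below the ulp
|     print(t + 1e-6 == t)  # False: a microsecond still survives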
| barbazoo wrote:
| I think you're right if you are actually able to get ns or us,
| but as far as I know the unit you're getting most of the time
| in Python is seconds.
| formerly_proven wrote:
| perf_counter generally has nanosecond-ish resolution, but it
| gives you a float in seconds so you can just drop it in as a
| replacement for time.time() or other timers (perf_counter is
| like two decades more recent than time.time()). Newer Pythons
| have xxx_ns variants that give you an int instead.
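|
| A quick sketch of the two flavors (the _ns variant needs
| Python 3.7+):
|
|     import time
|
|     t0 = time.perf_counter()     # float seconds
|     t1 = time.perf_counter_ns()  # int nanoseconds
|
|     # Subtracting two nearby float readings is exactly the
|     # cancellation-prone case; the _ns variant subtracts
|     # exact integers instead.
|     elapsed = time.perf_counter() - t0
|     elapsed_ns = time.perf_counter_ns() - t1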
| Ancapistani wrote:
| Threaded version:
| https://threadreaderapp.com/thread/1275924648132149249.html
| SkyMarshal wrote:
| These tweets are already threaded on twitter, why is this app
| even needed?
|
| It tends to pollute the replies more often than not, with more
| people invoking it than actually replying.
| draw_down wrote:
| I agree, that is so annoying, 8000 replies like
| "@myCoolThreadBot unroll"
| lupire wrote:
| Twitter doesn't thread properly. It does weird things based
| on likes.
| s_gourichon wrote:
| threadreaderapp.com opens quickly, works on a browser without
| Javascript, doesn't nag about installing an app. twitter.com
| on a browser fails those criteria (especially on mobile). So,
| win for threadreaderapp.com.
| majormajor wrote:
| I don't understand how rearranging operations, as suggested
| later in the thread, would avoid the large relative error in
| "how much older is the earth than the oceans", given that our
| estimates for the ages of the earth and oceans are only so
| precise.
| mam2 wrote:
| Maybe he wants to implicitly capture correlations. If you
| subtract directly, you make the implicit assumption that the
| two event dates are uncorrelated.
| Someone wrote:
| It wouldn't and cannot, because there's only one operation in
| that calculation, so there's nothing to rearrange.
|
| That example only shows that relative errors can explode even
| when doing simple calculations.
|
| Another thing is that, in computers, calculations often are
| imprecise.
|
| Because of catastrophic cancellation and similar issues, the
| computer result of a calculation can be quite different from
| the mathematical result.
|
| To make matters worse, in real life, we often don't know the
| exact values of things we measure, so even if our calculations
| are mathematically perfect, the outcome of a calculation by
| computer can be quite different from the real result.
|
| So, if you do a computer calculation, say to compute how strong
| a bridge has to be, you really, really need to know how close
| the computed value, at worst, is to the mathematically exact
| result.
|
| That's what numerical analysis is about. For a given
| calculation, it might say such things as
|
| _"if the input is between 100 and 200, to get a result with n
| decimal digits of precision, you'll have to compute all
| intermediate results with 4 x n digits."_
|
| or
|
| _"but if you rearrange the computation like this, you only
| need to use 2 x n digits for n digits of precision in your
| result"_
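|
| A standard example of such a rearrangement, sketched in
| Python: 1 - cos(x) loses essentially all its digits for small
| x, while the algebraically identical 2*sin(x/2)**2 keeps them.
|
|     import math
|
|     x = 1e-8
|     naive = 1.0 - math.cos(x)            # cos(x) rounds to 1.0 -> 0.0
|     stable = 2.0 * math.sin(x / 2) ** 2  # ~5.0e-17, fully accurate
|     print(naive, stable)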
| tylerhou wrote:
| Doesn't help for the "how much older" question. That's just
| used as an intuitive example of how imprecision can arise from
| subtraction.
|
| Rearranging helps when intermediate results can only be stored
| with finite precision, even though you could compute them to
| arbitrary precision.
| majormajor wrote:
| I see. I might quibble that here the imprecision isn't just
| from the subtraction, but because we originally just have
| very rough estimates for the dates (though 'mam2 brings up a
| very interesting point about correlation if estimates are
| based on each other and using them like that isn't the ideal
| way of answering the question).
|
| It threw me off enough that I didn't get the original point
| of it re: floating point numbers until your comment - in our
| floating point formats, the imprecision is often an
| accidental effect of the limits of the format, vs a true
| unknown, and then magnifying the relative size of that
| arbitrary limitation is a problem.
| skybrian wrote:
| Whether it's imprecise measurements or imprecise
| calculations, there are still things you can do if you're
| aware of the problem. When measurement is imprecise, it
| might be possible to improve accuracy by using a different
| measurement.
|
| For geology, it's often easier to put things in the right
| order using rock layers than to figure out how long ago they
| were from the present.
|
| In this case, obviously, the Earth is older than the
| oceans, even if the estimated ages were even rougher and
| the error bars implied they could be in the opposite order.
|
| For history, you may be able to figure out the relative
| order of events without knowing what year they were on our
| calendar.
| dragontamer wrote:
| This is a strange tweet. It assumes people are familiar with
| classical cancellation error, but not familiar with error
| analysis. Which in my experience... people either understand
| both, or are ignorant of both concepts.
|
| The general point is that "cancellation error" happens not
| just in floating-point operations, but also in "classic
| scientific sig-fig error analysis".
|
| ---------
|
| The tweet should either be dumbed down to discuss cancellation
| error in floating-point arithmetic, or elevated up and assume
| people know about sig-fig analysis. It sits at a weird point in
| the "assumed knowledge" curve.
|
| ---------
|
| For people unfamiliar with cancellation error, try the two
| following statements in Python3 (which defaults to double-
| precision... aka 53-bits of mantissa).
|     poor_ordering = (9007199254740992.0 + 1.0 + 1.0 + 1.0
|                      + 1.0 - 9007199254740992.0)
|     good_ordering = (9007199254740992.0 - 9007199254740992.0
|                      + 1.0 + 1.0 + 1.0 + 1.0)
|
| What are the values of "poor_ordering" vs "good_ordering" ??
| What does this tell us about double-precision?
|
| 9007199254740992.0 == 2^53. So it is impossible for a double-
| precision number to accurately represent +/- 1.0 at 2^53. (Note
| that +/- 2.0 will work out just fine).
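|
| A quick check of that claim (True, then False):
|
|     print(9007199254740992.0 + 1.0 == 9007199254740992.0)
|     print(9007199254740992.0 + 2.0 == 9007199254740992.0)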
|
| Play around with 9007199254740992.0 +/- 1.0, or 2.0, and other
| values for about 15 minutes, and you'll probably learn
| everything you need to know about cancellation error from that
| playtime alone.
|
| Double-precision numbers are composed of 52 explicit mantissa
| bits + 1 implicit bit + 1 sign bit + 11 exponent bits (yes, 65
| bits total. The implicit bit "doesn't count", but makes 0.0
| and subnormal numbers harder to deal with).
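|
| One way to actually look at that layout (a sketch; struct
| gives the raw 64-bit pattern, and the implicit leading 1 is
| the bit that isn't stored):
|
|     import struct
|
|     def bits(x: float) -> str:
|         (n,) = struct.unpack(">Q", struct.pack(">d", x))
|         b = f"{n:064b}"
|         # sign | 11 exponent bits | 52 explicit mantissa bits
|         return f"{b[0]} {b[1:12]} {b[12:]}"
|
|     print(bits(1.0))                 # exponent 01111111111
|     print(bits(9007199254740992.0))  # 2**53: mantissa all 0s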
| khawkins wrote:
| I think it's a bad example, but it illustrates an important
| point he doesn't make explicit: sometimes the variable you
| need to estimate is DX itself, not just X1 and X2 to produce
| X2 - X1 = DX. With sufficiently high variance in your
| approximations of X1 and X2, their difference will tell you
| little to nothing about DX.
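|
| A toy simulation of that point (all numbers made up): true DX
| = 0.01, but X1 and X2 each carry sigma = 1.0, so the
| difference carries sigma ~ 1.41 and any single pair says
| almost nothing about DX.
|
|     import random, statistics
|
|     true_dx = 0.01
|     diffs = [random.gauss(10.0 + true_dx, 1.0)
|              - random.gauss(10.0, 1.0)
|              for _ in range(100_000)]
|     print(statistics.mean(diffs))   # ~0.01, visible only in bulk
|     print(statistics.stdev(diffs))  # ~1.41 = sqrt(1**2 + 1**2)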
___________________________________________________________________
(page generated 2021-03-19 23:01 UTC)