[HN Gopher] When Double Precision Is Not Enough
       ___________________________________________________________________
        
       When Double Precision Is Not Enough
        
       Author : leonry
       Score  : 87 points
       Date   : 2022-03-19 17:47 UTC (5 hours ago)
        
 (HTM) web link (adambaskerville.github.io)
 (TXT) w3m dump (adambaskerville.github.io)
        
       | peter303 wrote:
        | There are algorithms taught in numerical analysis class for
        | re-ordering calculations to maximize precision. Computing in
        | the order you wrote on paper, or the way you were initially
        | taught, may not be the best.
        
       | akimball wrote:
       | Rational arithmetic needs better matrix libraries. No loss of
       | precision then.
        
         | klodolph wrote:
         | People have got to stop suggesting this. It doesn't make any
         | sense and it's usually a waste of time.
         | 
         | First of all, the example in the article is computing
         | eigenvalues. Let me ask you: how do you expect to calculate
         | eigenvalues using rational numbers? As a refresher, if it's
         | been years since you took a look at linear algebra, the
         | eigenvalues of a matrix are the roots of the matrix's
         | characteristic polynomial, and we remember from high-school
         | algebra that the roots of a polynomial are often not rational
         | numbers, even if the coefficients are rational. Therefore,
         | using rational arithmetic here is a complete non-starter. It's
         | just not even possible, right from the very start.
         | 
         | However, let's suppose that the particular problem that we are
         | interested in solving does admit a rational solution. What
          | then? Well, our real-world experience tells us that we often
          | need an excessive amount of computational resources and
          | memory for exact, rational answers to linear algebra
          | problems even of modest size.
         | 
         | As you perform more computations on rational numbers, the
         | memory required continues to grow.
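          | 
          | As a concrete illustration (a minimal sketch using only
          | Python's standard-library fractions module), iterating a
          | simple rational map shows the exact representation
          | ballooning:
          | 
          |     from fractions import Fraction
          | 
          |     x = Fraction(1, 3)
          |     for i in range(10):
          |         # one exact step of the map x -> (7/2) * x * (1 - x)
          |         x = Fraction(7, 2) * x * (1 - x)
          |         digits = len(str(x.numerator)) + len(str(x.denominator))
          |         print(f"step {i+1}: {digits} digits to store x exactly")
          |     # the digit count roughly doubles each step, so memory
          |     # grows exponentially with the number of operations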
        
           | thechao wrote:
           | Floating point numbers are a particular way to represent (and
           | modestly compress) rational numbers. I'm against rational
           | number libraries because they use non-power-of-2 denominators
           | and are, thus, fundamentally less efficient with respect to
           | the actual cost of calculation.
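            | 
            | A quick way to see the power-of-2 denominators, if it
            | helps (standard-library Python; the numbers below are the
            | exact value of the double nearest to 0.1):
            | 
            |     from fractions import Fraction
            | 
            |     # every finite double is exactly an integer divided by
            |     # a power of two
            |     print(Fraction(0.1))   # 3602879701896397/36028797018963968
            |     print((0.1).as_integer_ratio())     # same pair
            |     print(36028797018963968 == 2**55)   # True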
        
           | throwaway858 wrote:
           | I agree with this from a practical perspective.
           | 
           | But it is worth pointing out that it is actually possible to
           | maintain "perfect" accuracy when doing arithmetic even when
           | sqrts (or other irrational numbers) are involved.
           | 
            | The idea is to store each number as an expression AST. Then
            | you can perform any arithmetic on it by growing the AST.
            | After each operation you simplify as much as you can. Once
            | you have your final result you can evaluate the AST to any
            | precision that you need (and sometimes your final AST will
            | already be a rational number because the sqrts cancelled
            | out).
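            | 
            | SymPy's symbolic expressions work roughly this way, if you
            | want to play with the idea (just a sketch, not the only
            | way to do it):
            | 
            |     import sympy as sp
            | 
            |     a = sp.sqrt(2) + sp.sqrt(3)   # stored as expression trees
            |     b = sp.sqrt(2) - sp.sqrt(3)
            |     print(sp.expand(a * b))       # -1: the sqrts cancelled exactly
            |     print(sp.sqrt(2).evalf(50))   # evaluate to 50 digits on demand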
        
             | aSanchezStern wrote:
             | I believe you're basically referring to the computable
             | reals; for the curious, here's the wikipedia page (kind of
             | abstract): https://en.wikipedia.org/wiki/Computable_number
             | and here's the paper where they implement them
             | programmatically, including for the android calculator
             | application (I believe this is how the android calculator
             | works now):
             | https://dl.acm.org/doi/abs/10.1145/3385412.3386037
        
           | irchans wrote:
           | I still like to occasionally use rational number arithmetic
           | because I "know" that the answer is correct. It is very true
           | that it can be horribly slow, but if it does finish, then you
           | do have a correct answer. I have only rarely used it in
           | production code. In those rare cases, I think that I would
           | have checked to make sure that the running time was
           | acceptable.
        
             | klodolph wrote:
             | In this case, we know it would be incorrect (in general)
             | because eigenvalues are often irrational.
             | 
             | In other cases, there are often better ways to figure out
             | that the answer is correct. Figuring out that the answer is
             | correct using rational arithmetic is often just too
             | inefficient, compared to doing a bit of numerical analysis
             | or figuring out a way to validate the answer you got. That
             | is, unless the problem is very small.
        
         | kadoban wrote:
         | True, but then they run arbitrarily slowly, and it's trivial to
         | reach cases where this actually happens.
         | 
          | And for anything without a rational result, you're stuck again.
        
           | irchans wrote:
           | Often you can bound the running time --- certainly if there
           | are no loops and you only use + - * and / operations. But,
           | you are often screwed if there is a sin, cos, sqrt or other
           | function that gives irrational results. You can sometimes get
            | around the sqrt function or other root functions by using an
           | extension field, but I think I could count on one hand how
           | many times I've done that.
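            | 
            | For anyone curious what the extension-field trick looks
            | like, here is a bare-bones sketch for Q(sqrt(2)), i.e.
            | exact numbers of the form a + b*sqrt(2) with rational a
            | and b (multiplication only, to keep it short):
            | 
            |     from fractions import Fraction as F
            | 
            |     class QSqrt2:
            |         """Exact arithmetic in Q(sqrt(2)): a + b*sqrt(2)."""
            |         def __init__(self, a, b=0):
            |             self.a, self.b = F(a), F(b)
            |         def __mul__(self, o):
            |             # (a + b*r)(c + d*r) = (ac + 2bd) + (ad + bc)*r
            |             # where r = sqrt(2), so r*r = 2
            |             return QSqrt2(self.a * o.a + 2 * self.b * o.b,
            |                           self.a * o.b + self.b * o.a)
            |         def __repr__(self):
            |             return f"{self.a} + {self.b}*sqrt(2)"
            | 
            |     r = QSqrt2(0, 1)   # sqrt(2), represented exactly
            |     print(r * r)       # 2 + 0*sqrt(2): no rounding anywhere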
        
       | dekhn wrote:
        | IIUC most people instead precondition their matrices and stick
        | to double precision instead of quad or higher precision.
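        | 
        | A tiny illustration of the idea with plain diagonal (Jacobi)
        | scaling, assuming numpy is available (only a sketch; real
        | preconditioning is usually tied to the solver):
        | 
        |     import numpy as np
        | 
        |     # a badly scaled symmetric matrix
        |     A = np.array([[1e8, 5e3],
        |                   [5e3, 1.0]])
        |     d = 1.0 / np.sqrt(np.diag(A))
        |     M = d[:, None] * A * d[None, :]   # D^-1/2 A D^-1/2
        | 
        |     print(np.linalg.cond(A))   # ~1e8
        |     print(np.linalg.cond(M))   # 3.0 after scaling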
        
         | boulos wrote:
         | Yeah, they also allude to "better ways to do it" both in the
         | main text and the last paragraph:
         | 
         | > The next post will discuss better ways to implement higher
         | precision in numerical calculations.
         | 
          | There are lots of automatic row / column swap techniques,
          | particularly for sparse matrices, that ensure you avoid
          | catastrophic cancellation. You wouldn't apply them
          | unconditionally, because most matrices are reasonably
          | conditioned (and figuring out the optimal swaps is
          | NP-complete and so on), but there are lots of algorithms that
          | do well enough when you need to.
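          | 
          | The classic toy example of why the swaps matter, as a sketch
          | in plain numpy (elimination written out by hand just for
          | illustration):
          | 
          |     import numpy as np
          | 
          |     A = np.array([[1e-20, 1.0],
          |                   [1.0,   1.0]])
          |     b = np.array([1.0, 2.0])   # exact solution is ~(1, 1)
          | 
          |     # no row swap: eliminate using the tiny pivot 1e-20
          |     m = A[1, 0] / A[0, 0]      # huge multiplier, 1e20
          |     x2 = (b[1] - m * b[0]) / (A[1, 1] - m * A[0, 1])
          |     x1 = (b[0] - A[0, 1] * x2) / A[0, 0]
          |     print(x1, x2)              # x1 comes out as 0.0: catastrophic
          | 
          |     # swap rows first (partial pivoting) and it's fine
          |     print(np.linalg.solve(A, b))   # ~[1., 1.]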
        
       | willis936 wrote:
        | If anyone ever runs out of precision using IIR filters: convert
        | to second-order sections (SOS). I stumbled upon this trick a few
        | years ago and it's a game changer. No need for FIR.
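        | 
        | In scipy, for example, the change is roughly one keyword
        | argument (a sketch; the filter specs here are made up):
        | 
        |     import numpy as np
        |     from scipy import signal
        | 
        |     x = np.random.randn(10_000)
        | 
        |     # high-order narrowband IIR as (b, a) polynomial
        |     # coefficients: precision suffers
        |     b, a = signal.ellip(12, 0.5, 80, 0.01, output='ba')
        |     y_ba = signal.lfilter(b, a, x)
        | 
        |     # the same filter factored into second-order sections
        |     sos = signal.ellip(12, 0.5, 80, 0.01, output='sos')
        |     y_sos = signal.sosfilt(sos, x)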
        
       | vlmutolo wrote:
       | An interesting tool to help preserve floating point precision is
       | Herbie (https://herbie.uwplse.org/), which takes an expression as
       | input and outputs an optimized version of it that will retain
       | precision.
       | 
       | The example they give on their site is:
       | sqrt(x+1) - sqrt(x)  =>  1/(sqrt(x+1) + sqrt(x))
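        | 
        | You can see the cancellation it's fixing in a few lines of
        | Python (the mpmath value is only there as a high-precision
        | reference):
        | 
        |     from math import sqrt
        |     from mpmath import mp, mpf
        | 
        |     x = 1e12
        |     naive   = sqrt(x + 1) - sqrt(x)        # nearly equal operands
        |     rewrite = 1 / (sqrt(x + 1) + sqrt(x))  # Herbie's version
        | 
        |     mp.dps = 50                            # 50-digit reference
        |     exact = mp.sqrt(mpf(x) + 1) - mp.sqrt(mpf(x))
        |     print(naive, rewrite, exact)
        |     # the naive form keeps only a few correct digits here;
        |     # the rewrite is accurate to full double precision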
        
         | gruez wrote:
         | I'm glad that there's a tool to automatically do this. When I
         | was in university, this was something that was on
         | tests/assignments.
        
           | ranit wrote:
           | Somebody has to make the tool :-)
        
         | aSanchezStern wrote:
         | For those curious, this tool uses arbitrary precision floating-
         | point arithmetic (up to 3000 bits) to sample the input space
         | and measure the error of different implementations of the
         | function, then some clever numerical program synthesis to
         | generate candidates and search for the most accurate
         | implementation.
        
         | [deleted]
        
       | mistrial9 wrote:
       | use 80 bits?
       | https://en.wikipedia.org/wiki/Standard_Apple_Numerics_Enviro...
        
         | liversage wrote:
          | Or 128 bits. FORTRAN has a REAL*16 type (quadruple precision
          | floating point number). I encountered this at university in
          | a physics department that studied non-linear systems, also
          | known as chaos theory. Rounding errors can quickly become a
          | problem in these systems.
        
         | ip26 wrote:
         | I believe x87 still offers 80 bits in your PC today.
        
           | mark-r wrote:
            | Nobody uses the x87 instructions any more; they've been
            | superseded by MMX/SSE/SSE2, which can't do 80-bit operations.
            | And it's not clear whether 80 bits would be enough of an
            | improvement over 64 to fix the problem.
        
             | mhh__ wrote:
              | D (dlang.org) has a "real" type alongside float and double
              | which represents the widest float available on the target
              | (so 80 bits if x87 is available). It does see some use.
              | 
              | (But don't use it without a valid reason, or your program
              | is going to be slow because there's no SIMD for it.)
        
               | Someone wrote:
               | > a "real" type alongside float and double which
               | represents the widest float available on the target
               | 
               | Initial question: what would that be on a system without
               | hardware floating point?
               | 
               | Luckily, https://dlang.org/spec/type.html says (not 100%
               | unambiguously) it isn't necessarily "the widest float
               | available on the target", but either that or, if that
               | doesn't have both the range and precision of double,
               | double.
               | 
               | So, if you decide to use it, you're guaranteed not to
               | give up range or precision, may gain some, but also may
               | give up performance.
        
             | vardump wrote:
             | A slight correction, no one uses MMX anymore.
             | 
             | It's all SSE and AVX (wider and more comprehensive
             | instruction set) nowadays.
        
         | aSanchezStern wrote:
          | This is actually fewer bits than they're using in the article,
          | but that's not clear because they refer to their numbers by how
          | many "[decimal] digits" they carry, not how many bits. The
          | hardware doubles they're starting with are 64 bits wide,
          | representing about 16 decimal digits, but then they move into
          | 128- and 256-bit representations. 80-bit math, such as that
          | available on the old x87 floating point co-processors, works
          | out to about 19 decimal digits.
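          | 
          | The conversion is just significand bits times log10(2), so
          | roughly (a quick sketch):
          | 
          |     from math import log10
          | 
          |     for name, bits in [("double (64-bit)", 53),
          |                        ("x87 extended (80-bit)", 64),
          |                        ("quad (128-bit)", 113)]:
          |         print(name, round(bits * log10(2), 1), "decimal digits")
          |     # double: ~16.0, x87 extended: ~19.3, quad: ~34.0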
        
           | shmageggy wrote:
           | I found that very unintuitive. Is it typical in numerical
           | analysis to talk about the number of decimal digits
           | represented by a binary floating point number? I don't see
           | what is gained by changing base.
        
       | layer8 wrote:
       | There is a whole field about that, called numerical analysis:
       | https://en.m.wikipedia.org/wiki/Numerical_analysis
        
       | jandrewrogers wrote:
       | There are a few edge cases in geospatial computational geometry
       | that require greater than quad precision to correctly compute a
       | double precision result. This fact is sometimes used as a litmus
        | test for implementation correctness. Many double-precision
        | implementations don't have a conditional escape for these edge
        | cases that switches to arbitrary-precision algorithms, and this
        | is easy to verify.
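        | 
        | A toy version of what such a conditional escape looks like for
        | the 2D orientation predicate (only a sketch; the threshold
        | below is crude and illustrative, not a rigorous error bound):
        | 
        |     from fractions import Fraction
        | 
        |     def orient2d(ax, ay, bx, by, cx, cy):
        |         """+1 if A,B,C turn left, -1 if right, 0 if collinear."""
        |         det = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
        |         eps = 1e-12 * (abs(bx - ax) * abs(cy - ay)
        |                        + abs(by - ay) * abs(cx - ax))
        |         if abs(det) > eps:          # fast double-precision path
        |             return (det > 0) - (det < 0)
        |         # escape hatch: redo the test in exact rational arithmetic
        |         ax, ay, bx, by, cx, cy = map(Fraction,
        |                                      (ax, ay, bx, by, cx, cy))
        |         det = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
        |         return (det > 0) - (det < 0)
        | 
        |     # the float determinant is exactly 0 here, but the points
        |     # are not collinear; the exact path correctly returns -1
        |     print(orient2d(0.0, 0.0, 1e8, 1e8 + 1, 1e8 + 1, 1e8 + 2))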
        
         | s2mcallis wrote:
         | I'd be interested in hearing if you have more info on this.
        
           | alar44 wrote:
           | There was an article here maybe about a month ago about
           | testing Maya vs CAD vs Blender or something. Had to do with a
           | litmus test where you'd create a screw and inspect how the
           | threads terminated at the tip.
        
         | kevinventullo wrote:
         | Conversely, if you use double-precision floating point
         | arithmetic near "null island", the lat/lon coordinate system
         | becomes more precise than the Planck length.
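          | 
          | Back-of-the-envelope (a sketch assuming ~111 km per degree
          | and a Planck length of ~1.6e-35 m), for a point extremely
          | close to (0, 0):
          | 
          |     import numpy as np
          | 
          |     deg_to_m = 111_000.0                 # metres per degree
          |     lat = 1e-30                          # essentially on null island
          |     step_m = np.spacing(lat) * deg_to_m  # gap between adjacent doubles
          |     print(step_m)                        # ~2e-41 m
          |     print(step_m < 1.6e-35)              # True: finer than Planck length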
        
         | jacobolus wrote:
         | "Robustness" of results is a serious problem for any kind of
         | computational geometry, because computational geometry
         | algorithms generally only work when various assumed invariants
         | (based on ideal arithmetic) actually hold, but those can easily
         | be violated by arithmetic with rounding. When invariants are
         | violated, the result can be an infinite loop, a program crash,
          | or a comically incorrect result.
         | 
         | https://www.cs.cmu.edu/~quake/robust.html is some famous work
         | from the mid 1990s showing how to achieve provably correct
         | results in a computationally efficient way (on a CPU).
        
       | vHMtsdf wrote:
        | I remember a class I had where we were shown a method for exact
        | PC arithmetic with integers, rationals (e.g., 1/2) and algebraic
        | numbers (e.g., sqrt(2)). The only thing it couldn't deal with
        | was transcendental numbers (e.g., pi).
        | 
        | I think it worked by representing the numbers by integer
        | matrices; all operations were then matrix operations.
        | Unfortunately, the matrices gained dimensions when new
        | algebraic numbers got involved, so it probably wasn't useful
        | for anything :-).
        | 
        | Anyway, it blew me away at the time; one of the few things
        | that stuck with me from school.
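        | 
        | That sounds like it may have been the companion-matrix
        | representation: a root of a polynomial is stood in for by the
        | companion matrix of that polynomial, and arithmetic becomes
        | integer matrix arithmetic. A tiny sketch for sqrt(2), using
        | numpy just for the matrix products:
        | 
        |     import numpy as np
        | 
        |     # sqrt(2) is a root of x^2 - 2; its companion matrix plays
        |     # the role of sqrt(2)
        |     R = np.array([[0, 2],
        |                   [1, 0]])
        |     I = np.eye(2, dtype=int)
        | 
        |     print(R @ R)          # 2*I, i.e. "sqrt(2) squared is 2"
        |     # a number like 3 + 5*sqrt(2) is the matrix 3*I + 5*R;
        |     # adding and multiplying such numbers is just integer
        |     # matrix addition and multiplication
        |     print(3 * I + 5 * R)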
        
         | aSanchezStern wrote:
         | Yeah there are a few number systems that can do this kind of
         | thing, ranging from rational implementations where the
         | numerator and denominator are stored, all the way to the
         | computable reals
         | (https://en.wikipedia.org/wiki/Computable_number), which can
          | handle even transcendentals exactly! Fun fact: the built-in
         | calculator on android actually uses the computable reals
         | (https://dl.acm.org/doi/abs/10.1145/3385412.3386037), so you
         | can do any calculation on it, and then ask for any number of
         | digits (keep scrolling the result), and it will always be
         | correct. The only downside is that they're a little too slow to
         | use in a lot of performance-sensitive numerical applications.
        
       | frozenport wrote:
       | >> Numerically stable algorithms exist for such problems, but
       | these occasionally fail leaving us to brute force the calculation
       | with higher precision to minimize floating point rounding errors.
       | 
       | Do algorithm authors really implement two code paths?
       | 
       | One that uses a cost function iterative algorithm to solve the
       | matrix (this is what everybody does), and a second algorithm
       | using the brute force approach the author showed (known not to
       | work).
        
         | irchans wrote:
         | >> Do algorithm authors really implement two code paths?
         | 
          | I do sometimes. Sometimes I test the result, and if it is not
          | correct, then I use a slower, more numerically stable
          | algorithm. I'm not sure if I have ever written code that
          | repeats the same computation in quad precision, but of course
          | that would often work. Sometimes you can compute a correction
          | from the "residual".
         | 
         | >>One that uses a cost function iterative algorithm to solve
         | the matrix (this is what everybody does), and a second
         | algorithm using the brute force approach the author showed
         | (known not to work).
         | 
         | We do often use Jacobi or Gauss-Seidel, or something similar
         | (esp for computing estimates of A^(-1)b ). In those cases we
         | never revert back to QR. We do sometimes use Jacobi or GS to
          | improve a result obtained from an ill-conditioned QR, but
         | rarely.
         | 
         | Many years ago, I would have the code raise an exception if the
         | result is bad so that I would get a call. When I got such a
         | call, I would think about how to make the algorithm more
         | stable.
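          | 
          | A bare-bones sketch of the test-then-fall-back pattern
          | (using mpmath for the slow path purely as an illustration;
          | the tolerance is arbitrary):
          | 
          |     import numpy as np
          |     from mpmath import mp, matrix, lu_solve
          | 
          |     def solve_with_fallback(A, b, tol=1e-8):
          |         # fast path: ordinary double-precision solve
          |         x = np.linalg.solve(A, b)
          |         if np.linalg.norm(A @ x - b) <= tol * np.linalg.norm(b):
          |             return x
          |         # slow path: repeat the solve in 50-digit arithmetic
          |         mp.dps = 50
          |         X = lu_solve(matrix(A.tolist()), matrix(b.tolist()))
          |         return np.array([float(X[i, 0]) for i in range(X.rows)])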
        
       | bluetwo wrote:
        | I'll admit to being lost in some of the math here, but I wonder,
        | perhaps naively, if a language like Julia, which does fractions
        | as fractions, would handle it better.
        
         | salty_biscuits wrote:
          | It's really just a floating point precision issue. Even if you
          | represented the input matrix with a rational element type, the
          | output eigenvalues would in general be irrational reals and
          | would still be calculated in floating point. You could get
          | around it if you had a special matrix type for which you knew
          | you were definitely going to get rational eigenvalues, and you
          | could dispatch on that type.
        
       | magicalhippo wrote:
       | This reminded me of Posits[1] and Unums[2]. I dabbled with my own
       | implementation some years ago, but nothing serious.
       | 
       | Anyone tried them out? I see there's some FPGA implementations
       | and a RISC-V extension idea[3], would be fun to try it on a
       | softcore.
       | 
       | [1]: https://posithub.org/docs/posit_standard.pdf
       | 
       | [2]: http://johngustafson.net/unums.html
       | 
       | [3]: https://www.posithub.org/khub_doc
        
         | aSanchezStern wrote:
         | Unfortunately while Posits (and Unums) are more accurate for
         | some computations, they are still designed around using small
         | numbers of bits to represent numbers, so they have error in
         | computations just like floating point. The argument for posits
          | is mostly that they produce less error on the inputs you care
         | about, and more on the ones you don't, but it really depends on
         | your computation; you really have to know what you're doing to
         | use them safely.
        
         | Dylan16807 wrote:
         | Which version?
         | 
         | With current posits, they're a clever way to give more bits to
         | precision near 1 and more bits to exponent near 0 and infinity,
         | but if you're not overflowing or underflowing then it's a
         | marginal improvement at best. Then Unum glues intervals on top,
         | right? But you could do intervals with normal floats too. And
         | you still won't get the correct answer, you'll just know the
         | error exploded. Is the giant accumulator feature able to help
         | here?
         | 
         | For the other versions I tend to get completely lost.
        
       | a-dub wrote:
       | also worth noting that you can ask the fpu to trap to catch some
       | types of precision loss or fp underflows.
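        | 
        | numpy, for instance, exposes something similar at the library
        | level via np.seterr (a quick sketch):
        | 
        |     import numpy as np
        | 
        |     np.seterr(under='raise', over='raise', invalid='raise')
        |     x = np.array([1e-300])
        |     try:
        |         print(x * 1e-100)   # would silently underflow to zero
        |     except FloatingPointError as e:
        |         print("trapped:", e)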
        
         | bayindirh wrote:
          | If you're playing around with subnormal numbers [0], most modern
         | processors automatically switch to a microcode based
         | calculation mode to preserve these numbers. However, these
         | algorithms are very expensive, so it's not desired unless
         | explicitly needed and generally used as an escape hatch for
         | edge cases.
         | 
         | [0]: https://en.wikipedia.org/wiki/Subnormal_number
        
           | a-dub wrote:
           | i'd be curious what matlab would do with finding the eigs for
           | op's matrix. i'd bet a coffee that it (one of the eigs
           | functions) prints a warning. entirely possible it may even
           | switch search methods if precision loss is detected.
        
       ___________________________________________________________________
       (page generated 2022-03-19 23:00 UTC)