[HN Gopher] Beware of Fast-Math
___________________________________________________________________
Beware of Fast-Math
Author : blobcode
Score : 277 points
Date : 2025-05-31 07:05 UTC (15 hours ago)
(HTM) web link (simonbyrne.github.io)
(TXT) w3m dump (simonbyrne.github.io)
| Sophira wrote:
| Previously discussed at
| https://news.ycombinator.com/item?id=29201473 (which the article
| itself links to at the end).
| anthk wrote:
| In Forth, there's the philosophy of fixed point:
|
| https://www.forth.com/starting-forth/5-fixed-point-arithmeti...
|
| With 32- and 64-bit numbers, you can just scale decimals up. So,
| Torvalds was right. (In dangerous contexts such as super-precise
| medical doses, FP may have good reasons to exist; I am not
| completely sure.)
|
| Also, both the Forth and Lisp traditions suggest using exact
| rationals before floating point numbers. Even toy lisps from
| https://t3x.org have rationals too. In Scheme, you have both
| exact->inexact and inexact->exact, which convert rationals to FP
| and vice versa.
|
| If you have a Linux/BSD distro, you may already have Guile
| installed as a dependency.
|
| Hence, run it and then:
|
|     scheme@(guile-user)> (inexact->exact 2.5)
|     $2 = 5/2
|     scheme@(guile-user)> (exact->inexact (/ 5 2))
|     $3 = 2.5
|
| Thus, in Forth, I have a good set of q{+,-,*,/} operations for
| rationals (custom coded, literally four lines) and they work great
| for a good 99% of the cases.
|
| As for irrational numbers, NASA used up to 16 decimals, and the
| old 355/113 can be precise enough for 99.99% of the pieces
| built on Earth. Maybe not for astronomical distances, but
| hey...
|
| In Scheme:
|
|     scheme@(guile-user)> (exact->inexact (/ 355 113))
|     $5 = 3.1415929203539825
|
| In Forth, you would just use:
|
|     : pi* 355 113 m*/ ;
|
| with great precision for most of the objects being measured
| against.
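The exact-rational workflow described above isn't limited to Scheme and Forth; here is a minimal sketch using Python's standard fractions module (Python chosen purely as a neutral illustration):

```python
from fractions import Fraction

# exact->inexact: a rational converts to the nearest float
assert float(Fraction(5, 2)) == 2.5

# inexact->exact: 2.5 is exactly representable, so this round-trips
assert Fraction(2.5) == Fraction(5, 2)

# The classic 355/113 approximation of pi, kept exact until printing
pi_approx = Fraction(355, 113)
print(float(pi_approx))  # 3.1415929203539825
```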
| eqvinox wrote:
| Those rational numbers fly out the window as soon as your
| math involves any kind of more complicated trigonometry, or
| even a square root...
| stassats wrote:
| You can turn them back into rationals:
|
|     (rational (sqrt 2d0)) => 6369051672525773/4503599627370496
|
| Or write your own operations that compute to the precision
| you want.
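The same trick works in any language with exact rationals, since every finite double is a dyadic rational; a quick Python sketch of the idea (assuming IEEE-754 doubles, as in CPython):

```python
from fractions import Fraction

# sqrt(2) as a double is an exact dyadic rational, so it converts
# losslessly; this is the same value the Lisp snippet above prints.
r = Fraction(2 ** 0.5)
print(r)  # 6369051672525773/4503599627370496
assert r.denominator == 2 ** 52
```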
| anthk wrote:
| My post already covered inexact->exact:
|
|     scheme@(guile-user)> (inexact->exact (sqrt 2.0))
|     $1 = 6369051672525773/4503599627370496
|
| s9 Scheme fails on this, as it's an irrational number, but
| other Schemes such as STKlos, Guile, and MIT Scheme will do
| it right.
|
| With Forth (and even EForth, if the image is compiled with
| FP support), you are on your own to find (or rewrite) an
| fsqrt function with arbitrary precision.
|
| Also, on trig, your parent commenter should check out
| CORDIC.
|
| https://en.wikipedia.org/wiki/CORDIC
| anthk wrote:
| Check CORDIC, please.
|
| https://en.wikipedia.org/wiki/CORDIC
|
| Also, on sqrt functions, even an FP-enabled toy EForth under
| the Subleq VM (just as a toy, again, but it works) provides
| some sort of fsqrt function:
|
|     2 f fsqrt f. 1.414  ok
|
| Under PFE Forth, something 'bigger':
|
|     40 set-precision  ok
|     2e0 fsqrt f. 1.4142135623730951454746218587388284504414  ok
|
| EForth's FP precision is tiny but good enough for very
| small microcontrollers. And it isn't so far from the
| precision the '80s engineers worked with to create properly
| usable machinery/hardware and even software.
| dreamcompiler wrote:
| If you want high precision trig functions on rationals,
| nothing's stopping you from writing a Taylor series library
| for them. Or some other polynomial approximation, or a lookup
| table or CORDIC.
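For the curious, a rough sketch of rotation-mode CORDIC in Python (floats are used here only for readability; a real implementation would use scaled integers and bit shifts, which is the whole point of CORDIC):

```python
import math

def cordic_sincos(theta, iters=40):
    """Approximate (sin(theta), cos(theta)) for |theta| <= ~1.74 rad
    using only adds, "shifts" (multiplies by 2**-i here), and a small
    table of precomputed angles atan(2**-i)."""
    angles = [math.atan(2.0 ** -i) for i in range(iters)]
    # Combined gain of all the pseudo-rotations, folded in up front.
    k = 1.0
    for i in range(iters):
        k /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = k, 0.0, theta
    for i in range(iters):
        d = 1.0 if z >= 0.0 else -1.0
        # Pseudo-rotation by +/- atan(2**-i); old x feeds the y update.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x  # (sin, cos)

s, c = cordic_sincos(0.5)
print(s, c)  # close to math.sin(0.5) and math.cos(0.5)
```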
| AlotOfReading wrote:
| Floats _are_ fixed point, just done in log space. The main
| change is that the designers dedicated a few bits to variable
| exponents, which introduces alignment and normalization steps
| before/after the operation. If you don't mix exponents, you
| can essentially treat it as identical to a lower precision
| fixed point system.
| anthk wrote:
| No, not even close. Scaling integers to mimic decimals on
| 32 and 64 bits can be much faster. And if with 32-bit scaled
| numbers you can cover Planck-scale values, then with 64-bit
| numbers you can do any field.
| orlp wrote:
| I helped design an API for "algebraic operations" in Rust:
| <https://github.com/rust-lang/rust/issues/136469>, which are
| coming along nicely.
|
| These operations are
|
| 1. Localized, not a function-wide or program-wide flag.
|
| 2. Completely safe; -ffast-math includes assumptions such as
| that there are no NaNs, and violating that is _undefined behavior_.
|
| So what do these algebraic operations do? Well, one by itself
| doesn't do much of anything compared to a regular operation. But
| a sequence of them is allowed to be transformed using
| optimizations which are algebraically justified, as-if all
| operations are done using real arithmetic.
| eqvinox wrote:
| Are these calls going to clear the FTZ and DAZ flags in the
| MXCSR on x86? And FZ & FIZ in the FPCR on ARM?
| orlp wrote:
| I don't believe so, no. Currently these operations only set
| the LLVM flags to allow reassociation, contraction, division
| replaced by reciprocal multiplication, and the assumption of
| no signed zeroes.
|
| This can be expanded in the future as LLVM offers more flags
| that fall within the scope of algebraically motivated
| optimizations.
| eqvinox wrote:
| Ah sorry I misunderstood and thought this API was for the
| other way around, i.e. forbidding "unsafe" operations. (I
| guess the question reverses to _setting_ those flags)
|
| ('Naming: "algebraic" is not very descriptive of what this
| does since the operations themselves are algebraic.' :D)
| nextaccountic wrote:
| > ('Naming: "algebraic" is not very descriptive of what
| this does since the operations themselves are algebraic.'
| :D)
|
| Okay, the floating point operations are literally
| algebraic (they form an algebra) but they don't follow
| some common algebraic properties like associativity. The
| linked tracking issue itself acknowledges that:
|
| > Naming: "algebraic" is not very descriptive of what
| this does since the operations themselves are algebraic.
|
| Also this comment https://github.com/rust-
| lang/rust/issues/136469#issuecomment...
|
| > On that note I added an unresolved question for naming
| > since algebraic isn't the most clear indicator of what is
| > going on.
|
| > I think it is fairly clear. The operations allow
| > algebraically justified optimizations, as-if the arithmetic
| > was real arithmetic.
|
| > I don't think you're going to find a clearer name, but feel
| > free to provide suggestions. One alternative one might
| > consider is real_add, real_sub, etc.
|
| Then retorted here https://github.com/rust-
| lang/rust/issues/136469#issuecomment...
|
| > These names suggest that the operations are more accurate
| > than normal, where really they are less accurate. One might
| > misinterpret that these are infinite-precision operations
| > (perhaps with rounding after a whole sequence of
| > operations).
|
| > The actual meaning isn't that these are real number
| > operations, it's quite the opposite: they have best-effort
| > precision with no strict guarantees.
|
| > I find "algebraic" confusing for the same reason.
|
| > How about approximate_add, approximate_sub?
|
| And the next comment
|
| > Saying "approximate" feels imperfect, as while these
| operations don't promise to produce the exact IEEE result
| on a per-operation basis, the overall result might well
| be more accurate algebraically. E.g.: > > (...)
|
| So there's a discussion going on about the naming
| eqvinox wrote:
| It doesn't feel appropriate to comment there for me not
| knowing any Rust really, but "lax_" (or "relax_") would
| have the extra benefit of being very short.
|
| (Is this going to overload operators or are people going
| to have to type this... a lot... ?)
| Sharlin wrote:
| Rust has some precedent for adding convenience newtypes
| with overloaded operators (e.g. `Wrapping<I>` for
| `I.wrapping_add(I)` etc.). Such a wrapper isn't currently
| proposed AFAIK, but there's no reason one couldn't be
| added in the future, I believe.
| eqvinox wrote:
| Right, as long as the LLVM intrinsics are exposed you
| could just put that in a crate somewhere.
| Measter wrote:
| For giggles, here's one I whipped up, along with an
| example use: https://godbolt.org/z/Eezj35dzc
| Sharlin wrote:
| Wow, that's some hardcore unrolling.
| CryZe wrote:
| WebAssembly also ended up calling its set of similar
| instructions relaxed.
| evrimoztamur wrote:
| Does that mean that a physics engine written with these
| operations will always compile to yield the same deterministic
| outcomes across different platforms (assuming they correctly
| implement algebraic operations, or are able to do so)?
| orlp wrote:
| No, there is no guarantee which (if any) optimizations are
| applied, only that they _may_ be applied. For example a fused
| multiply-add instruction may be emitted for a*b + c on
| platforms which support it, which is not cross-platform.
| SkiFire13 wrote:
| No, the result may depend on how the compiler reorders them,
| which could be different on different platforms.
| Sharlin wrote:
| It's more like the opposite. These tell the compiler to
| assume for optimization purposes that floats are associative
| and so on (ie. algebraic), even when in reality they aren't.
| So the results may vary depending on what transformations the
| compiler performs - in particular, they may vary between
| optimized and non-optimized builds, which normally isn't
| allowed.
| vanderZwan wrote:
| > _These tell the compiler to assume for optimization
| purposes that floats are associative and so on (ie.
| algebraic), even when in reality they aren 't._
|
| I wonder if it is possible to add an additional constraint
| that guarantees the transformation has equal or fewer
| numerical rounding errors. E.g. for floating point doubles
| (0.2 + 0.1) - 0.1 results in 0.20000000000000004, so I
| would expect that transforming some (A + B) - B to just A
| would always reduce numerical error. OTOH, it's floating
| point maths, there's probably some kind of weird gotcha
| here as well.
| legobmw99 wrote:
| Kahan summation is an example (also described in the top
| level article) of one such "gotcha". It involves adding a
| term that - if floats were algebraic in this sense -
| would always be zero, so ffast-math often deletes it, but
| this actually completely removes the accuracy improvement
| of the algorithm
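To make the gotcha concrete, here is a minimal Kahan summation sketch in Python; the compensation variable `c` holds exactly the term that is algebraically zero and therefore fair game for -ffast-math-style reasoning:

```python
def kahan_sum(xs):
    total = 0.0
    c = 0.0  # running compensation for lost low-order bits
    for x in xs:
        y = x - c
        t = total + y
        # In real arithmetic, (t - total) - y == 0, so a compiler
        # allowed to reassociate may delete this line entirely.
        c = (t - total) - y
        total = t
    return total

xs = [1e16] + [1.0] * 1000 + [-1e16]
print(sum(xs))        # 0.0    -- each 1.0 is absorbed by the 1e16 term
print(kahan_sum(xs))  # 1000.0 -- the compensation recovers them
```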
| anthk wrote:
| Under EForth with FP done in software:
|
|     2 f 1 f 1 f f+ f- f. 0.000  ok
|
| PFE, I think reusing the glibc math library:
|
|     2e0 1e0 1e0 f+ f- f. 0.000000  ok
| StefanKarpinski wrote:
| Pretty sure that's not possible. More accurate for some
| inputs will be less accurate for others. There's a very
| tricky tension in float optimization that the most
| predictable operation structure is a fully skewed op
| tree, as in naive left-to-right summation, but this is
| the slowest and least accurate order of operations. Using
| a more balanced tree is faster and more accurate (great),
| but unfortunately which tree shape is fastest depends
| very much on hardware-specific factors like SIMD width
| (less great). And no tree shape is universally guaranteed
| to be fully accurate, although a full binary tree tends
| to have the best accuracy, but has bad base case
| performance, so the actual shape that tends to get used
| in high performance kernels is SIMD-width parallel in a
| loop up to some fixed size like 256 elements, then
| pairwise recursive reduction above that. The recursive
| reduction can also be threaded. Anyway, there's no silver
| bullet here.
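A toy version of the pairwise/recursive reduction described above (the cutoff of 8 is arbitrary here; production kernels use SIMD-width strided loops and a larger base case such as 256):

```python
def pairwise_sum(xs, base=8):
    """Sum by recursively splitting in half; rounding error grows
    roughly O(log n) instead of the O(n) of naive left-to-right
    accumulation."""
    n = len(xs)
    if n <= base:
        total = 0.0
        for x in xs:  # naive base case
            total += x
        return total
    mid = n // 2
    return pairwise_sum(xs[:mid], base) + pairwise_sum(xs[mid:], base)

xs = [0.1] * 100_000
print(abs(sum(xs) - 10_000.0))           # naive left-to-right error
print(abs(pairwise_sum(xs) - 10_000.0))  # noticeably smaller
```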
| scythe wrote:
| I think a restricted version might be possible to
| implement: only allow transformations if the transformed
| version has strictly fewer numerical rounding errors on
| some inputs. This will usually only mean canceling terms
| and collecting expressions like "x+x+x" into 3x.
|
| In general, rules that allow fewer transformations are
| probably easier to understand and use. Trying to optimize
| everything is where you run into trouble.
| glkindlmann wrote:
| That sounds neat. What would be really neat is if the language
| helped to expose the consequences of the ensuing rounding error
| by automating things that are otherwise clumsy for programmers
| to do manually, like running twice with opposite rounding
| directions, or running many many times with internally
| randomized directions (two of the options in Sec 4 of *). That
| is, it would be cool if Rust enabled people to learn about the
| subtleties of floating point, instead of hiding them away.
|
| * https://people.eecs.berkeley.edu/~wkahan/Mindless.pdf
| pclmulqdq wrote:
| -ffast-math is actually something like 15 separate flags, and
| you can use them individually if you want. 3 of them are "no
| NaNs," "no infinities," and "no subnormals." Several of the
| other flags allow you to treat math as associative or
| distributive if you want that.
|
| The library has some merit, but the goal you've stated here is
| given to you with 5 compiler flags. The benefit of the library
| is choosing when these apply.
| eqvinox wrote:
| I wish the Twitter links in this article weren't broken.
| Smaug123 wrote:
| They aren't, at least for the spot-check I performed; probably
| you need to be logged in.
| SunlitCat wrote:
| Maybe an unpopular opinion, but having to be logged in is
| being broken. ;)
| eqvinox wrote:
| All it says is "Something went wrong. Try reloading." -- no
| indication having an account logged in would help (...and I
| don't feel like creating an account just to check...)
| genewitch wrote:
| Change X to xcancel
| rlpb wrote:
| > I mean, the whole point of fast-math is trading off speed with
| correctness. If fast-math was to give always the correct results,
| it wouldn't be fast-math, it would be the standard way of doing
| math.
|
| A similar warning applies to -O3. If an optimization in -O3 were
| to reliably always give better results, it wouldn't be in -O3;
| it'd be in -O2. So blindly compiling with -O3 also doesn't seem
| like a great idea.
| CamouflagedKiwi wrote:
| The optimisations in -O3 aren't supposed to give incorrect
| results. They're not in -O2 because they make a more aggressive
| space/speed tradeoff or increase compile times more
| significantly. In the same way, the optimisations in -O2 are
| not meant to be less correct than -O1, but they aren't in that
| group for similar reasons.
|
| -Ofast is the 'dangerous' one. (It includes -ffast-math).
| rlpb wrote:
| > The optimisations in -O3 aren't supposed to give incorrect
| results.
|
| I didn't mean to imply that they result in incorrect results.
|
| > they make a more aggressive space/speed tradeoff...
|
| Right... so "better" becomes subjective and depends on the
| use case, so it doesn't make sense to choose -O3 blindly
| unless you understand the trade-offs and want that side of
| them for the particular builds you're doing. Things that
| everyone wants would be in -O2. That's all I'm saying.
| eqvinox wrote:
| It doesn't become subjective; things in -O3 can objectively
| be understood to produce equal or faster code for a higher
| build cost in the vast majority of cases, roughly averaged
| across platforms. (Without loss in correctness.)
|
| If you know your exact target and details about your input
| expectations, of course you can optimize further, which
| might involve turning off some things in -O3 (or even -O2).
| On a whole bunch of systems, -Os can be faster than -O3 due
| to I-cache size limits. But at-large, you can expect -O3 to
| be faster.
|
| Similar considerations apply for LTO and PGO. LTO is
| commonly default for release builds these days, it just
| costs a whole lot of compile time. PGO is done when
| possible (i.e. known majority inputs).
| CamouflagedKiwi wrote:
| If they're things that everyone wants, why aren't they in
| -O1?
| wffurr wrote:
| If the answer can be wrong, you can make it as fast as you
| want.
| zinekeller wrote:
| (2021)
|
| Previous discussion: Beware of fast-math (Nov 12, 2021,
| https://news.ycombinator.com/item?id=29201473)
| Affric wrote:
| For non-associativity what is the best way to order operations?
| Is there an optimal order for precision whereby more similar
| values are added/multiplied first?
|
| EDIT: I am now reading Goldberg 1991
|
| Double edit: Kahan Summation formula. Goldberg is always worth
| going back to.
| zokier wrote:
| Herbie can optimize arbitrary floating point expressions for
| accuracy
|
| https://herbie.uwplse.org/
| Sharlin wrote:
| > -funsafe-math-optimizations
|
| What's wrong with fun, safe math optimizations?!
|
| (:
| keybored wrote:
| Hah! I was just about to comment that I immediately read it as
| fun-safe, everytime I see it.
|
| I guess that happens when I don't deal with compiler flags
| daily.
| vardump wrote:
| "This roller coaster is optimized to be Fun and Safe!"
| Sharlin wrote:
| Many funroll loops in that coaster.
| emn13 wrote:
| I get the feeling that the real problem here are the IEEE specs
| themselves. They include a huge bunch of restrictions that each
| individually aren't relevant to something like 99.9% of floating
| point code, and probably even in aggregate not a single one is
| relevant to a large majority of code segments out in the wild.
| That doesn't mean they're not important - but some of these
| features should have been locally opt-in, not opt out. And at the
| very least, standards need to evolve to support hardware
| realities of today.
|
| Not being able to auto-vectorize seems like a pretty critical bug
| given hardware trends that have been going on for decades now; on
| the other hand sacrificing platform-independent determinism isn't
| a trivial cost to pay either.
|
| I'm not familiar with the details of OpenCL and CUDA on this
| front - do they have some way to guarantee a specific order-of-
| operations such that code always has a predictable result on all
| platforms and nevertheless parallelizes well on a GPU?
| Affric wrote:
| How does IEEE 754 prevent auto-vectorisation?
| Kubuxu wrote:
| IIRC reordering additions can cause the result to change
| which makes auto-vectorisation tricky.
| kzrdude wrote:
| If you write a loop `for x in array { sum += x }` Then your
| program is a specification that you want to add the elements
| in exactly that order, one by one. Vectorization would change
| the order.
| stingraycharles wrote:
| Yup, because of the imprecision of floating point, you
| cannot just assume that "(a + c) + (b + d)" is the same
| as "a + b + c + d".
|
| It would be pretty ironic if at some point fixed point /
| bignum implementations end up being faster because of this.
| anthk wrote:
| They are, just check anything fixed-point for the 486SX
| vs anything floating under a 486DX. It's faster scaling
| and sum and print the desired precision than operating on
| floats.
| einpoklum wrote:
| I wonder... couldn't there just be some library type for
| this, e.g. `associative::float` and `associative::double`
| and such (in C++ terms), so that compilers can ignore
| non-associativity for actions on values of these types?
| Or attributes one can place on variables to force
| assumption of associativity?
| dahart wrote:
| The bigger problem there is the language not offering a way
| to signal the author's intent. If an author doesn't care
| about the order of operations in a sum, they will still
| write the exact same code as the author who does care. This
| is a failure of the language to be expressive enough, and
| doesn't reflect on the IEEE spec. (The spec even does
| suggest that languages should offer and define these sorts
| of semantics.) Whether the program is specifying an order
| of operations is lost when the language offers no way for a
| coder to distinguish between caring about order and not
| caring. This is especially difficult since the vast
| majority of people don't care and don't consider their own
| code to be a specification on order of operations. Worse,
| most people would even be surprised and/or annoyed if the
| compiler didn't do certain simplifications and constant
| folding, which change the results. The few cases where
| people do care about order can be extremely important, but
| they are rare nonetheless.
| goalieca wrote:
| Floating point arithmetic is neither commutative nor
| associative, so you shouldn't.
| lo0dot0 wrote:
| While it is technically correct to say this, it also gives
| the wrong impression, because it leaves out the fact that
| ordering changes create only a small difference. Other
| examples where arithmetic is not commutative, e.g. matrix
| multiplication, can create much larger differences.
| kstrauser wrote:
| > ordering changes create only a small difference.
|
| That can't be assumed.
|
| You can easily fall into a situation like:
|
|     total = large_float_value
|     for _ in range(1_000_000_000):
|         total += .01
|     assert total == large_float_value
|
| Without knowing the specific situation, it's impossible
| to say whether that's a tolerably small difference.
| eapriv wrote:
| Why is it not commutative?
| layer8 wrote:
| It actually is commutative according to IEEE-754, except
| that in the case of a NaN result you might get a
| different NaN representation.
| adgjlsfhk1 wrote:
| having multiple NaNs and no spec for how they should
| behave feels like such an unforced error to me
| layer8 wrote:
| IEEE-754 addition and multiplication is commutative. It
| isn't distributive, though.
| dahart wrote:
| The spec doesn't prevent auto-vectorization, it only says the
| language should avoid it when it wants to opt in to producing
| "reproducible floating-point results" (section 11 of IEEE
| 754-2019). Vectorizing can be implemented in different ways,
| so whether a language avoids vectorizing in order to opt in
| to reproducible results is implementation dependent. It also
| depends on whether there is an option to not vectorize. If a
| language only had auto-vectorization, and the vectorization
| result was deterministic and reproducible, and if the
| language offered no serial mode, this could adhere to the
| IEEE spec. But since C++ (for example) offers serial
| reductions in debug & non-optimized code, and it wants to
| offer reproducible results, then it has to be careful about
| vectorizing without the user's explicit consent.
| ajross wrote:
| > I get the feeling that the real problem here are the IEEE
| specs themselves.
|
| Well, all standards are bad when you really get into them,
| sure.
|
| But no, the problem here is that floating point code is often
| sensitive to precision errors. Relying on rigorous adherence to
| a specification doesn't fix precision errors, but it does
| guarantee that software behavior in the face of them is
| deterministic. Which 90%+ of the time is enough to let you
| ignore the problem as a "tuning" thing.
|
| But no, precision errors _are bugs_. And the proper treatment
| for bugs is to fix the bugs and not ignore them via tricks with
| determinism. But that 's hard, as it often involves design
| decisions and complicated math (consider gimbal lock: "fixing"
| that requires understanding quaternions or some other
| orthogonal orientation space, and that's hard!).
|
| So we just deal with it. But IMHO --ffast-math is more good
| than bad, and projects should absolutely enable it, because the
| "problems" it discovers are bugs you want to fix anyway.
| chuckadams wrote:
| > (consider gimbal lock: "fixing" that requires understanding
| quaternions or some other orthogonal orientation space, and
| that's hard!)
|
| Or just avoiding gimbal lock by other means. We went to the
| moon using Euler angles, but I don't suppose there's much of
| a choice when you're using real mechanical gimbals.
| ajross wrote:
| That is the "tuning" solution. And mostly it works by
| limiting scope of execution ("just don't do that") and if
| that doesn't work by having some kind of recovery method
| ("push this button to reset", probably along with "use this
| backup to recalibrate"). And it... works. But the bug is
| still a bug. In software we prefer more robust techniques.
|
| FWIW, my memory is that this was _exactly_ what happened
| with Apollo 13. It lost its gyro calibration after the
| accident (it did the thing that was the "just don't do
| that") and they had to do a bunch of iterative contortions
| to recover it from things like the sun position (because
| they couldn't see stars out the iced-over windows).
|
| NASA would have strongly preferred IEEE doubles and
| quaternions, in hindsight.
| adrian_b wrote:
| Not being able to auto-vectorize is not the fault of the IEEE
| standard, but the fault of those programming languages which do
| not have ways to express that the order of some operations is
| irrelevant, so they may be executed concurrently.
|
| Most popular programming languages have the defect that they
| impose a sequential semantics even where it is not needed.
| There have been programming languages without this defect, e.g.
| Occam, but they have not become widespread.
|
| Because nowadays only a relatively small number of users care
| about computational applications, this defect has not been
| corrected in any mainline programming language, though for some
| programming languages there are extensions that can achieve
| this effect, e.g. OpenMP for C/C++ and Fortran. CUDA is similar
| to OpenMP, even if it has a very different syntax.
|
| The IEEE standard for floating-point arithmetic has been one of
| the most useful standards in all history. The reason is that
| both hardware designers and naive programmers have always had
| the incentive to cheat in order to obtain better results in
| speed benchmarks, i.e. to introduce errors in the results with
| the hope that this will not matter for users, which will be
| more impressed by the great benchmark results.
|
| There are always users who need correct results more than
| anything else and it can be even a matter of life and death.
| For the very limited in scope uses where correctness does not
| matter, i.e. mainly graphics and ML/AI, it is better to use
| dedicated accelerators, GPUs and NPUs, which are designed by
| prioritizing speed over correctness. For general-purpose CPUs,
| being not fully-compliant with the IEEE standard is a serious
| mistake, because in most cases the consequences of such a
| choice are impossible to predict, especially not by the people
| without experience in floating-point computation who are the
| most likely to attempt to bypass the standard.
|
| Regarding CUDA, OpenMP and the like, by definition if some
| operations are parallelizable, then the order of their
| execution does not matter. If the order matters, then it is
| impossible to provide guarantees about the results, on any
| platform. If the order matters, it is the responsibility of the
| programmer to enforce it, by synchronization of the parallel
| threads, wherever necessary.
|
| Whoever wants vectorized code should never rely on programming
| languages like C/C++ and the like, but they should always use
| one of the programming language extensions that have been
| developed for this purpose, e.g. OpenMP, CUDA, OpenCL, where
| vectorization is not left to chance.
| emn13 wrote:
| If you care about absolute accuracy, I'm skeptical you want
| floats at all. I'm sure it depends on the use case.
|
| Whether it's the standards fault or the languages fault for
| following the standard in terms of preventing auto-
| vectorization is splitting hairs; the whole point of the
| standard is to have predictable and usually fairly low-error
| ways of performing these operations, which only works when
| the order of operations is defined. That very aim is the
| problem; to the extent the standard is harmless when ordering
| guarantees don't exist, you're essentially applying some of
| those tricky -ffast-math suboptimizations.
|
| But to be clear in any case: there are obviously cases
| whereby order-of-operations is relevant enough and accuracy
| altering reorderings are not valid. It's just that those are
| rare enough that for many of these features I'd much prefer
| that to be the opt-in behavior, not opt-out. There's
| absolutely nothing wrong with having a classic IEEE 754 mode,
| and I expect it's an essential feature in some niche corner
| cases.
|
| However, given the obviously huge application of massively
| parallel processors and algorithms that accept rounding
| errors (or sometimes conversely overly precise results!),
| clearly most software is willing to generally accept rounding
| errors to be able to run efficiently on modern chips. It just
| so happens that none of the computer languages that map
| floats to IEEE 754 floats in a straightforward fashion are
| any good at that, which seems like a bad trade-off.
|
| There could be multiple types of floats instead; or code-
| local flags that delineate special sections that need precise
| ordering; or perhaps even expressions that clarify how much
| error the user is willing to accept and then just let the
| compiler do some but not all transformations; and perhaps
| even other solutions.
| dzaima wrote:
| The precise requirements of IEEE-754 may not be important for
| any given program, but as long as you want your numbers to have
| _any_ form of well-defined semantics beyond "numbers exist,
| and here's a list of functions that do Something(tm) that may
| or may not be related to their name", any number format that's
| capable of (approximately) storing both 10^20 and 10^-20 in 64
| bits is gonna have those drawbacks.
|
| AFAIK GPU code is basically always written as scalar code
| acting on each "thing" separately, that's, as a whole,
| semantically looped over by the hardware, same way as
| multithreading would (i.e. no order guaranteed at all), so you
| physically cannot write code that'd need operation reordering
| to vectorize. You just can't write an equivalent to "for (each
| element in list) accumulator += element;" (or, well, you can,
| by writing that and running just one thread of it, but that's
| gonna be slower than even the non-vectorized CPU equivalent
| (assuming the driver respects IEEE-754)).
| cycomanic wrote:
| I think this article overstates the importance of the problems
| even for scientific software. In the scientific code I've
| written, noise processes are often orders of magnitude larger
| than what what is discussed here and I believe this applies to
| many (most?) simulations modelling the real world (i.e. Physics
| chemistry,..). At the same time enabling fast-math has often
| yielded a very significant (>10%) performance boost.
|
| I find the discussion of -fassociative-math particularly
| interesting, because I assume that most writers of code that
| translates a mathematical formula into a simulation will not
| know which would be the most accurate order of operations and
| will simply codify their derivation of the equation to be
| simulated (which could have operations in any order). So if
| this switch changes your results, it probably means that you
| should have a long hard look at the equations you're
| simulating and which ordering will give you the most correct
| results.
|
| That said I appreciate that the considerations might be quite
| different for libraries and in particular simulations for
| mathematics.
| londons_explore wrote:
| It would be nice if there was some syntax for "math order
| matters, this is the order I want it done in".
|
| Then all other math will be fast-math, except where annotated.
| hansvm wrote:
| The article mentioned that gcc and clang have such
| extensions. Having it in the language is nice though, and
| that's the approach Zig took.
| sfn42 wrote:
| I thought most languages have this? If you simply write a
| formula, operations are ordered according to the language
| specification. If you want a different ordering, you use
| parentheses.
|
| Not sure how that interacts with this fast math thing, I
| don't use C
| kstrauser wrote:
| That's a different kind of ordering.
|
| Imagine a function like Python's `sum(list)`. In the
| abstract, Python should be able to add those values in any
| order it wants. Maybe it could spawn a thread so that one
| thread sums the first half of the list, another sums the
| second half at the same time, and then you return the sum
| of those intermediate values. You could imagine a clever
| `sum()` being many times faster, especially using SIMD
| instructions or a GPU or something.
|
| But alas, you can't optimize like that with common IEEE-754
| floats and expect to get the same answer out as when using
| the simple one-at-a-time addition. The result depends on
| what order you add the numbers together. Order them
| differently and you very well may get a different answer.
|
| That's the kind of ordering we're talking about here.
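A minimal Python sketch of the ordering effect described above (the values are hypothetical, chosen to make the rounding visible):

```python
# Adding the same three floats in a different order gives a
# different IEEE-754 result, because each addition rounds.
vals = [1e16, 1.0, -1e16]

left_to_right = (vals[0] + vals[1]) + vals[2]  # 1.0 is absorbed by 1e16
reordered = (vals[0] + vals[2]) + vals[1]      # large terms cancel first

print(left_to_right)  # 0.0
print(reordered)      # 1.0
```

This is exactly why a parallel or SIMD `sum()` that regroups the additions cannot promise bit-identical results.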
| on_the_train wrote:
| I worked in cad, robotics and now semiconductor optics. In
| every single field, floating precision down to the very last
| digits was a huge issue
| cycomanic wrote:
| Interesting, I stand corrected. In most of the fields I'm
| aware of, one could easily work in 32 bit without any issues.
|
| I find the robotics example quite surprising in particular. I
| think the precision of most input sensors is less than 16 bits
| or so. If your inputs have this much noise on them, how come
| you need so much precision in your calculations?
| spookie wrote:
| The precision isn't uniform across the range of possible
| inputs. This means you need a higher bit depth, even though
| "you aren't really using it", just so you can establish a
| good base precision you are sure you are hitting at every
| range. The phrase "most sensors" is doing a lot of heavy
| lifting here.
| AlotOfReading wrote:
| "precision" is an ambiguous term here. There's
| reproducibility (getting the same results every time),
| accuracy (getting as close as possible to same results
| computed with infinite precision), and the native format
| precision.
|
| ffast-math is sacrificing both the first and the second for
| performance. Compilers usually sacrifice the first for the
| second by default with things like automatic FMA
| contraction. This isn't a necessary trade-off, it's just
| easier.
|
| There's very few cases where you actually need accuracy down
| to the ULP though. No robot can do anything meaningful with
| femtometer+ precision, for example. Instead you choose a
| development balance between reproducibility (relatively easy)
| and accuracy (extremely hard). In robotics, that will usually
| swing a bit towards reproducibility. CAD would swing more
| towards accuracy.
| quotemstr wrote:
| All I want for Christmas is a programming language that uses
| dependent typing to make floating-point precision part of the
| type system. Catastrophic cancellation should be a compiler error
| if you assign the output to a float with better ulps than you get
| with worst case operands.
| thesuperbigfrog wrote:
| Ada might have what you want:
|
| https://www.jviotti.com/2017/12/05/an-introduction-to-adas-s...
|
| http://www.ada-auth.org/standards/22rm/html/RM-3-5-7.html
|
| http://www.ada-auth.org/standards/22rm/html/RM-A-5-3.html
|
| Ada also has fixed point types:
|
| http://www.ada-auth.org/standards/22rm/html/RM-3-5-9.html
| storus wrote:
| This problem is happening even on Apple MPS with PyTorch in deep
| learning, where fast math is used by default in many operations,
| leading to a garbage output. I hit it recently while training an
| autoregressive image generation model. Here is a discussion by
| folks that hit it as well:
|
| https://github.com/pytorch/pytorch/issues/84936
| JKCalhoun wrote:
| > Even compiler developers can't agree.
|
| > This is perhaps the single most frequent cause of fast-math-
| related StackOverflow questions and GitHub bug reports
|
| The second line above should settle the first.
| layer8 wrote:
| The first line points out that it doesn't, even if one thinks
| that it should. Also, note the "perhaps".
| teleforce wrote:
| "Nothing brings fear to my heart more than a floating point
| number." - Gerald Jay Sussman
|
| Is there any IEEE standards committee working on FP
| alternatives, for example Unum and Posit [1], [2]?
|
| [1] Unum & Posit:
|
| https://posithub.org/about
|
| [2] The End of Error:
|
| https://www.oreilly.com/library/view/the-end-of/978148223986...
| Q6T46nT668w6i3m wrote:
| Is this sarcasm? If not: the proposed posit standard is
| IEEE P3109.
| teleforce wrote:
| Great, didn't know that it exists.
| pclmulqdq wrote:
| The current P3109 draft has no posits in it.
| kvemkon wrote:
| I'm wondering why there are still no announcements of
| hardware support for such approaches in CPUs.
| neepi wrote:
| HP had proper deterministic decimal arithmetic since the
| 1970s.
| datameta wrote:
| Luckily outside of mission critical systems, like in demoscene
| coding, I can happily use "44/7" as a 2pi approximation (my
| beloved)
| razighter777 wrote:
| The worst thing that strikes fear into me is seeing floating
| points used for real world currency. Dear god. So many things can
| go wrong. I always use unsigned integers counting number of
| cents. And if I gotta handle multiple currencies, then I'll use
| or make a wrapper class.
| knert wrote:
| How do you store negative numbers?
| psychoslave wrote:
| Maybe as in accounting, one column for benefits, one for
| debts?
| MobiusHorizons wrote:
| You use a signed integer type, so you just store a negative
| number.
|
| You can think of fixed point as equivalent to ieee754 floats
| with a fixed exponent and a two's complement mantissa instead
| of a sign bit.
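A small illustration of the point above, with made-up amounts: signed integer cents represent negative balances directly, and conversion to whole currency units happens only at the display boundary.

```python
# Fixed-point money as signed integer cents (hypothetical values).
balance_cents = -12_345  # owes $123.45

# Convert back to whole currency units only when formatting.
dollars, cents = divmod(abs(balance_cents), 100)
sign = "-" if balance_cents < 0 else ""
print(f"{sign}${dollars}.{cents:02d}")  # -$123.45
```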
| rcleveng wrote:
| Wrappers are good even when not dealing with multiple
| currencies, since in many places some transactions are in
| fractions of cents, so depending on the use case you may
| need to push that decimal a few places out.
|
| I always have a wrapper class to put the logic of converting to
| whole currency units when and if needed, as well as when
| requirements change and now you need 4 digits past the decimal
| instead of 2, etc.
| simonw wrote:
| I've been having an interesting challenge relating to this
| recently. I'm trying to calculate costs for LLM usage, but the
| amounts of money involved are _so tiny_. Gemini 1.5 Flash 8B is
| $0.0375 per million tokens!
|
| Should I be running my accounting system on units of 10
| billionths of a dollar?
| outurnate wrote:
| You're better off representing values as rationals; a ratio
| between two different numbers. For example, 0.0375 would be
| represented as 375 over 10000, or 3 over 80
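In Python this maps directly onto the standard library's `fractions.Fraction`, which normalizes automatically; the rate is the one from the thread, the token count is hypothetical:

```python
from fractions import Fraction

rate = Fraction(375, 10000)   # $0.0375 per million tokens
print(rate)                   # 3/80 -- automatically reduced

# Exact cost for 12,345 tokens, rounded only when displayed.
cost = rate * Fraction(12_345, 1_000_000)
print(cost)                   # 7407/16000000
print(f"${float(cost):.10f}")
```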
| simonw wrote:
| Sounds hard to model in SQLite.
| teaearlgraycold wrote:
| Two columns?
| anthk wrote:
| From Forth, here's how I'd set the rationals:
| : gcd    begin dup while tuck mod repeat drop ;
| : lcm    2dup * abs -rot gcd / ;
| : reduce 2dup gcd tuck / >r / r> ;
| : q+     rot 2dup * >r rot * -rot * + r> reduce ;
| : q-     swap negate swap q+ ;
| : q*     rot * >r * r> reduce ;
| : q/     >r * swap r> * swap reduce ;
|
| Example: to compute 70 * 0.25 = 35/2
|
| 70 1 1 4 q* reduce .s 35 2 ok
|
| Stack-juggling words like 2dup, rot and such are easily
| looked up via Google/DDG, or in any Forth that provides
| the words "see" and/or "help".
|
| As a hint, q- swaps the top two numbers on the stack
| (which compose a rational), negates the numerator, and
| swaps them back. Then it calls q+.
|
| So, 2/5 - 3/2 = 2/5 + -3/2.
| marcosdumay wrote:
| Accounting happens on the units people pay in, not the ones
| that generate expenses.
|
| But you probably should run your billing in fixed point or
| floating decimals with a billionth of a dollar precision,
| yes. Either that or you should consolidate the expenses into
| larger bunches.
| scott_w wrote:
| Fixed point Decimal is your friend here. I'm guessing you buy
| tokens in increments of 1,000,000 so it isn't too much of an
| issue to account for. You can then normalise in your
| accounting so 1,000,000 is just "1 unit," or you can just
| account in increments of 1,000,000 but that does start
| looking weird (but might be necessary!)
| Filligree wrote:
| No, billing happens per-token. It's entirely necessary to
| use billionths of a dollar here, if you don't use floating
| point.
| scott_w wrote:
| In which case, I'd look at this thread
| https://news.ycombinator.com/item?id=44145263
| latchkey wrote:
| Ethereum is 1e18 or 1 wei.
|
| https://ethereum.stackexchange.com/questions/158517/does-
| sol...
| kolbe wrote:
| I've used Auroa Units to do this. You can define the dollars
| dimension, and then all the nano-micro-whatever scale comes
| with.
| klysm wrote:
| Convert to money as late as possible
| roryirvine wrote:
| This is surely the right answer: simply count the number of
| tokens used, and do the billing reconciliation as a
| separate step.
|
| As an added benefit, it makes it much easier to deal with
| price changes.
| osigurdson wrote:
| Wouldn't it be better to use a decimal type?
| MobiusHorizons wrote:
| This is what's called a fixed point decimal type. If you need
| variable precision, then a decimal type might be a good idea,
| but fixed point removes a lot of potential foot guns if the
| constraints work for you.
| osigurdson wrote:
| I meant fixed point decimal type (like C#) 128 bit. I don't
| understand why the parent commenter (top voted comment?)
| used unsigned integers to track individual cents. Why roll
| your own decimal type?
|
| Using arbitrary precision doesn't make sense if the data
| needs to be stored in a database (for most situations at
| least). Regardless, infinite precision is magical thinking
| anyway: try adding Pi to your bank account without loss of
| precision.
| MobiusHorizons wrote:
| The C# decimal type is not fixed point; it's a floating-
| point implementation that uses a base-10 exponent instead
| of a base-2 one like IEEE 754 floats.
|
| Fixed point is a general technique that is commonly done
| with machine integers when the necessary precision is
| known at compile time. It is frequently used on embedded
| devices that don't have a floating point unit to avoid
| slow software based floating point implementations.
| Limiting the precision to $0.01 makes sense if you only
| do addition or subtraction. Precision of $0.001 (Tenths
| of a cent also called mils) may be necessary when
| calculating taxes or applying other percentages although
| this is typically called out in the relevant laws or
| regulations.
| osigurdson wrote:
| Good to know. I'm in a scientific domain, so I haven't
| used it previously.
| MobiusHorizons wrote:
| Fun fact: there is a decimal type on some hardware. I
| believe PowerPC, and presumably mainframes. You can
| actually use it from C, although it's a software
| implementation on most hardware. See IEEE 754-2008 if
| you're curious.
| jjmarr wrote:
| IEEE754 defines a floating point decimal type. What are
| your opinions on that?
| MobiusHorizons wrote:
| It's very cool, but not present on most hardware. Fixed
| point is a lot simpler though if you are dealing with
| something with inherent granularity like currency
| nurettin wrote:
| I inherited systems that trade real world money using f64. They
| work surprisingly well, and the errors and bugs are almost
| never due to rounding. Those that are also have easy fixes. So
| I'm always baffled by this "expert opinion" of using integers
| for cents. It is pretty much up there with "never use python
| pickle it is unsafe" and "never use http, even if the program
| will never leave the subnet".
| dataangel wrote:
| you can't accurately represent 10 cents with floats, 0.1 is
| not directly representable. same with 1 cent, 0.01. Seems
| like if you do any significant math on prices you should run
| into rounding issues pretty quickly?
| adgjlsfhk1 wrote:
| no. Float64 has 16 digits of precision. Therefore even if
| you're dealing with trillions of dollars, you have accuracy
| down to the thousandth of a cent.
| cstrahan wrote:
| You might want to re-study this topic.
|
| The decimal number 0.1 has an infinitely repeating binary
| fraction.
|
| Consider how 1/3 in decimal is 0.33333... If you truncate
| that to some finite prefix, you no longer have 1/3. Now
| let's suppose we know, in some context, that we'll only
| ever have a finite number of digits -- let's say 5 digits
| after the decimal point. Then, if someone asks "what
| fraction is equivalent to 0.33333?", then it is
| reasonable to reply with "1/3". That might sound like
| we're lying, but remember that we agreed that, in this
| context of discussion, we have a finite number of digits
| -- so the value 1/3 _outside_ of this context has no way
| of being represented faithfully _inside_ this context, so
| we can only assume that the person is asking about the
| nearest approximation of "1/3 as it means outside this
| context". If the person asking feels lied to, that's on
| them for not keeping the base assumptions straight.
|
| So back to floating point, and the case of 0.1
| represented as 64 bit floating point number. In base 2,
| the decimal number 0.1 looks like 0.0001100110011... (the
| 0011 being repeated infinitely). But we don't have an
| infinite number of digits. The finite truncation of that
| is the closest we can get to the decimal number 0.1, and
| by the same rationale as earlier (where I said that
| equating 1/3 with 0.33333 is reasonable), your
| programming language will likely parse "0.1" as a f64 and
| print it back out as such. However, if you try something
| like (a=0.1; a+a+a) you'll likely be surprised at what
| you find.
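The "surprise" mentioned above, spelled out in Python:

```python
a = 0.1
print(a + a + a)         # 0.30000000000000004
print(a + a + a == 0.3)  # False: each side approximates a
                         # different exact binary value

# Rounding to a fixed number of decimals restores the expected value.
print(round(a + a + a, 2) == 0.3)  # True
```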
| adgjlsfhk1 wrote:
| > you'll likely be surprised at what you find.
|
| I very much doubt it. My day job is writing symbolic-
| numeric code. The result of 0.1+0.1+0.1 != 0.3, but for
| rounding to bring it up to 0.31 (i.e. rounding causing an
| error of 1 cent), you would need to accumulate at least
| .005 error, which will not happen unless you lose 13 out
| of your 16 digits of precision, which will not happen
| unless you do something incredibly stupid.
| nulld3v wrote:
| I'm curious where you got this idea from because it is
| trivially disprovable by typing 0.1 or 0.01 into any python
| or JS REPL?
| krapht wrote:
| https://docs.python.org/3/tutorial/floatingpoint.html
|
| Stop at any finite number of bits, and you get an
| approximation. On most machines today, floats are
| approximated using a binary fraction with the numerator
| using the first 53 bits starting with the most
| significant bit and with the denominator as a power of
| two. In the case of 1/10, the binary fraction is
| 3602879701896397 / 2**55, which is close to but not
| exactly equal to the true value of 1/10.
|
| Many users are not aware of the approximation because of
| the way values are displayed. Python only prints a
| decimal approximation to the true decimal value of the
| binary approximation stored by the machine. On most
| machines, if Python were to print the true decimal value
| of the binary approximation stored for 0.1, it would have
| to display:
| >>> 0.1
| 0.1000000000000000055511151231257827021181583404541015625
|
| That is more digits than most people find useful, so
| Python keeps the number of digits manageable by
| displaying a rounded value instead:
| >>> 1 / 10
| 0.1
|
| That being said, double should be fine unless you're
| aggregating trillions of low cost transactions. (API
| calls?)
| Izkata wrote:
| For anyone curious about testing it themselves and/or
| wanting to try other numbers:
| >>> from decimal import Decimal
| >>> Decimal(0.1)
| Decimal('0.1000000000000000055511151231257827021181583404541015625')
| chowells wrote:
| Do you believe that the way the REPL prints a number is
| the way it's stored internally? If so, explaining this
| will be a fun exercise:
| $ python3
| Python 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0] on linux
| Type "help", "copyright", "credits" or "license" for more information.
| >>> a = 0.1
| >>> a + a + a
| 0.30000000000000004
|
| By way of explanation, the algorithm used to render a
| floating point number to text used in most languages
| these days is to find the shortest string representation
| that will parse back to an identical bit pattern. This
| has the direct effect of causing a REPL to print what you
| typed in. (Well, within certain ranges of "reasonable"
| inputs.) But this doesn't mean that the language stores
| what you typed in - just an approximation of it.
| anthk wrote:
| Oddly, tcl prints 0.30000000000000004 while jimtcl prints
| 0.3, while with 1/7 both crap out and round it to a
| simple 0.
|
| Edit: Now it does it fine after inputting floats:
|
| puts [ expr { 1.0/7.0 } ]
|
| Eforth on top of Subleq, a very small and dumb virtual
| machine:
| 1 f 7 f f/ f.  0.143  ok
|
| Still, using rationals where possible (and mod operations
| otherwise) gives a great 'precision', except for
| irrationals.
| nulld3v wrote:
| :facepalm: my bad, I completely missed the more rational
| interpretation of OP's comment...
|
| I interpreted "directly representable" as "uniquely
| representable", all < 15 digit decimals are uniquely
| represented in fp64 so it is always safe to roundtrip
| between those decimals <-> f64, though indeed this
| guarantee is lost once you perform any math.
| CamperBob2 wrote:
| At the end of a long chain of calculations you're going to
| round to the nearest 0.01. It will be a LONG time before
| errors caused by double-precision floats cause you to gain
| or lose a penny.
| SonOfLilit wrote:
| You can make money modeling buy/sell decisions in floats and
| then having the bank execute them, but if the bank models
| your account as a float and loses a cent here and there, it
| will be sued into bankruptcy.
| kccqzy wrote:
| You will not lose a cent here and there just by using
| float64, for the range of values that banks deal with. For
| added assurance, just round to the nearest cent after each
| operation.
| jcranmer wrote:
| A double-precision float has ~16 decimal digits of
| precision. Which means as long as your bank account is less
| than a quadrillion dollars, it can accurately store the
| balance to the nearest cent.
| scott_w wrote:
| For far too many years I had inherited a billing system that
| used floats for all calculations then rounded up or down. Also
| doing some calculations in JS and mirroring them on the Python
| backend, so "just switch to Decimal" wasn't an easy change to
| make...
| layer8 wrote:
| Wait until you learn that Excel calculates everything using
| floating-point, and doesn't even fully observe IEEE 754.
|
| https://learn.microsoft.com/en-us/office/troubleshoot/excel/...
|
| (It nevertheless happens to work just fine for most of what
| Excel is used for.)
| jksflkjl3jk3 wrote:
| Floating point math shouldn't be that scary. The rules are well
| defined in standards, and for many domains are the only
| realistic option for performance reasons.
|
| I've spent most of my career writing trading systems that have
| executed 100's of billions of dollars worth of trades, and have
| never had any floating point related bugs.
|
| Using some kind of fixed point math would be entirely
| inappropriate for most HFT or scientific computing
| applications.
| eddd-ddde wrote:
| How do you handle the lack of commutativity? I've always
| wondered about the practical implications.
| jcranmer wrote:
| Floating-point is completely commutative (ignoring NaN
| payloads).
|
| It's the associativity law that it fails to uphold.
| jakevoytko wrote:
| I asked an ex-Bloomberg coder this question once after he
| told me he used floating points to represent currency all
| the time, and his response was along the lines of "unless
| you have blindingly-obvious problems like doing operations
| on near-zero numbers against very large numbers, these
| calculations are off by small amounts on their least-
| significant digits. Why would you waste the time or the
| electricity dealing with a discrepancy that's not even
| worth the money to fix?"
| BeetleB wrote:
| Nitpick: FP arithmetic is commutative. It's not
| associative.
| kolbe wrote:
| All your price field messages are sent to the exchange and
| back via fixed point, so you are using fixed point for at
| least some of the process (unless you're targeting those few
| crypto exchanges that use fp prices).
|
| If you need to be extremely fast (like fpga fast), you don't
| waste compute transforming their fixed point representation
| into floating.
| djrj477dhsnv wrote:
| Sure, string encodings are used for most APIs and ultra HFT
| may pattern match on the raw bytes, but for regular HFT if
| you're doing much math, it's going to be floating point
| math.
| usefulcat wrote:
| You can certainly make trading systems that work using
| floating point, but there are just so many fewer edge cases
| to consider when using fixed point.
|
| With fixed point and at least 2 decimal places, 10.01 + 0.01
| is always _exactly_ equal to 10.02. But with FP you may end
| up with something like 10.0199999999, and then you have to be
| extra careful anywhere you convert that to a string that it
| doesn't get truncated to 10.01. That could be logging (not
| great but maybe not the end of the world if that goes wrong),
| or you could be generating an order message and then it is a
| real problem. And either way, you have to take care every
| time you do that, as opposed to solving the problem once at
| the source, in the way the value is represented.
|
| > Using some kind of fixed point math would be entirely
| inappropriate for most HFT or scientific computing
| applications.
|
| In the case of HFT, this would have to depend very greatly on
| the particulars. I know the systems I write are almost never
| limited by arithmetical operations, either FP or integer.
| kolbe wrote:
| It depends on what you're doing. If your system is a linear
| regression on 30 features, you should probably use floating
| point. My recollection is that fixed point is prohibitively
| slower and with far less FOSS support.
| gamescr wrote:
| I work on game engines and the problem with floats isn't on
| small values like 10.01 but on large ones like 400,010.01
| that's when the precision wildly varies.
| malfist wrote:
| Not only that but the precision loss accumulates.
| Multiply too many numbers with small inaccuracies and you
| wind up with numbers with large inaccuracies
| osigurdson wrote:
| The issue with floats is the mental model. The best way
| to think about them is like a ruler with many points
| clustered around 0 and exponentially fewer as the
| magnitude grows. Don't think of it like a real value -
| assume that there are hardly any values represented with
| perfect precision. Even "normalish" numbers like 10.1 are
| not in the set actually. When values are converted to
| strings, even in debuggers sometimes, they are often
| rounded which throws people off further ("hey, the value
| is exactly 10.1 - it is right there in the debugger").
| What you can count on however is that integers are
| represented with perfect precision up to a point (e.g.
| 2^53 -1 for f64).
|
| The other "mental model" issue is associativity: in math,
| a + (b + c) == (a + b) + c, but in floating point the two
| can differ due to rounding. This is where fp-precise vs
| fp-fast comes in. Let's not talk about 80-bit registers
| (though those used to be another thing to think about).
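The integer-exactness point above can be checked directly in Python (2**53 is the limit for a 64-bit float):

```python
# Every integer up to 2**53 is exactly representable in a float64;
# beyond that, consecutive integers start to collide.
limit = 2**53
print(float(limit - 1) == limit - 1)     # True: still exact
print(float(limit) + 1 == float(limit))  # True: 2**53 + 1 rounds back down

# And "normalish" decimals like 10.1 really are not in the set:
print(f"{10.1:.20f}")  # shows 10.0999..., not exactly 10.1
```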
| 01HNNWZ0MV43FF wrote:
| Lua is telling me 0.1 + 0.1 == 0.2, but 0.1 + 0.2 != 0.3.
| That's 64-bit precision. The issue is not with precision,
| but with 1/10th being a repeating decimal in binary.
| anthk wrote:
| Not an issue on Scheme and Common Lisp and even Forth
| operating directly with rationals with custom words.
| phendrenad2 wrote:
| I'm wondering if trading systems would run into the same
| issues as a bank or scientific calculation. You might not be
| making as many repeated calculations, and might not care if
| things are "off" by a tiny amount, because you're trading
| between money and securities, and the "loss" is part of your
| overhead. If a bank lost $0.01 after every 1 million
| transactions it would be a minor scandal.
| usefulcat wrote:
| Personally, I would be more concerned about something like
| determining whether the spread is more than a penny.
| Something like:
| if (ask - bid > 0.01) {
|     // etc
| }
|
| With floating point, I have to think about the following
| questions:
| * What if the constant 0.01 is actually slightly greater
|   than mathematical 0.01?
| * What if the constant 0.01 is actually slightly less than
|   mathematical 0.01?
| * What if ask - bid is actually slightly greater than the
|   mathematical result?
| * What if ask - bid is actually slightly less than the
|   mathematical result?
|
| With floating point, that seemingly obvious code is
| anything but. With fixed point, you have none of those
| problems.
|
| Granted, this only works for things that are priced in
| specific denominations (typically hundredths, thousandths,
| or ten thousandths), which is most securities.
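A sketch of that contrast in Python, with hypothetical quotes (integer cents vs floats):

```python
# Prices as integer cents: the spread comparison is exact.
bid_cents, ask_cents = 1001, 1002
print(ask_cents - bid_cents > 1)  # False: spread is exactly one cent

# The "same" comparison on floats is already subtly wrong:
bid, ask = 10.01, 10.02
print(ask - bid)         # not exactly 0.01
print(ask - bid > 0.01)  # depends on rounding, not on the prices
```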
| CamperBob2 wrote:
| So the spread is 0.0099999 instead of 0.01. When will
| that difference matter?
| usefulcat wrote:
| It matters if the strategy is designed to do very
| different things depending on whether or not the offers
| are locked (when bid == ask, or spread is less than
| 0.01).
|
| In this example, I'm talking about securities that are
| priced in whole cents. If you represent prices as floats,
| then it's possible that the spread appears to be less (or
| greater) than 0.01 when it's actually not, due to the
| inability of floats to exactly represent most real
| numbers.
| CamperBob2 wrote:
| But I'm still not understanding the real-world
| consequences. What will those be, exactly? Any good
| examples or case studies to look at?
| ljosifov wrote:
| I can imagine sth like: if (bid ask blah blah) { send
| order to buy 10 million of AAPL; }
| T0Bi wrote:
| > Using some kind of fixed point math would be entirely
| inappropriate for most HFT or scientific computing
| applications.
|
| May I ask why? (generally curious)
| jcranmer wrote:
| For starters, it's giving up a lot of performance, since
| fixed-point isn't accelerated by hardware like floating-
| point is.
| rendaw wrote:
| Isn't fixed point just integer?
| mitthrowaway2 wrote:
| Yes, integer combined with bit-shifts.
| Athas wrote:
| Yes, but you're not going to have efficient
| transcendental functions implemented in hardware.
| rendaw wrote:
| Ah okay, fair enough. But what sort of transcendental
| functions would you use for HFT?
|
| I guess I understood GGGGP's comment about using fixed
| point for interacting with currency to be about
| accounting. I'd expect floating point to be used for
| trading algorithms, but that's mostly statistics and I
| presume you'd switch back to fixed point before making
| trades etc.
| Athas wrote:
| The problem with fixed point is in its, well, fixed point.
| You assign a fixed number of bits to the fractional part of
| the number. This gives you the same absolute precision
| everywhere, but the relative precision (distance to the
| next highest or lowest number) is worse for small numbers -
| which is a problem, because those tend to be pretty
| important. It's just overall a less efficient use of the
| bit encoding space (not just performance-wise, but also in
| the accuracy of the results you get back). Remember that
| fixed point does not mean absence of rounding errors, and
| if you use binary fixed point, you still cannot represent
| many decimal fractions such as 0.1.
| anthk wrote:
| With fixed point you either scale it up or use rationals.
| osigurdson wrote:
| Fundamentally there is uncertainty associated with any
| physical measurement, which is usually proportional to the
| magnitude being measured. As long as floating-point error
| is << this uncertainty, results are equally predictive.
| Floating point numbers bake these assumptions in.
| f33d5173 wrote:
| It's the front of house/back of house distinction. Front of
| house should use fixed point, back of house should use
| floating point. Unless you're doing trading, you want really
| strict rules with regards to rounding and such, which are
| going to be easier to achieve with fixed point.
| pasc1878 wrote:
| I don't think it is that clear. The split I think is
| between calculating settlement amounts which lead to real
| transfers of money and so should be fixed point whilst
| risk, pricing (thus trading) and valuation use models which
| need many calculations so need to be floating point.
| pie_flavor wrote:
| One of the things I always appreciate about the crypto
| community is that you do not have to ask what numeric type is
| being used for money, it is _always_ 8-digit fixed-point. No
| floating-point rounding errors to be found anywhere.
| immibis wrote:
| Correction: Bitcoin is 8-digit fixed-point. But Lightning is
| 10, IIRC. Other currencies have different conventions. Still,
| it's fixed within a given system and always fixed-point. As
| far as I'm aware, there are no floating-point
| cryptocurrencies at all, because it would be an obvious
| exploit vector - keep withdrawing 0.000000001 units from your
| account that has 1.0 units.
| Athas wrote:
| How does this avoid rounding error? Division and
| multiplication can still result in nonrepresentable
| numbers, right?
| jcranmer wrote:
| I've found fear of the use of floating-point in finance to be a
| good litmus test for how knowledgeable people are about
| floating-point. Because as far as I can tell, finance people
| almost exclusively uses (binary) floating-point [1], whereas a
| lot of floating-point FUD focuses on how disastrous it is for
| finance. And honestly, it's a bit baffling to me why so many
| people seem to think that floating-point is disastrous.
|
| My best guess for the latter proposition is that people are
| reacting to the default float printing logic of languages like
| Java, which display a float as the shortest base-10 number that
| would correctly round to that value, which extremely
| exaggerates the effect of being off by a few ULPs. By contrast,
| C-style printf specifies the number of decimal digits to round
| to, so all the numbers that are off by a few ULPs are still
| correct.
|
| [1] I'm not entirely sure about the COBOL mainframe
| applications, given that COBOL itself predates binary floating-
| point. I know that modern COBOL does have some support for IEEE
| 754, but that tells me very little about what the applications
| running around in COBOL do with it.
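The printing difference described above, side by side in Python (`repr` uses shortest round-trip printing; format strings use printf-style fixed digits):

```python
x = 0.1 + 0.2

# Shortest-round-trip printing exposes the ULP-level error:
print(repr(x))      # '0.30000000000000004'

# printf-style fixed-digit printing rounds it away:
print(f"{x:.2f}")   # 0.30
print(f"{x:.10f}")  # 0.3000000000
```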
| pgwhalen wrote:
| I agree overall, but my take is that it shows more
| ignorance about the domain of finance (or a particular
| subdomain) than about floating point itself.
|
| It's really more of a concern in accounting, when monetary
| amounts are concrete and represent real money movement
| between distinct parties. A ton of financial software systems
| (HFT, trading in general) deal with money in a more abstract
| way in most of their code, and the particular kinds of
| imprecision that FP introduces doesn't result in bad business
| outcomes that outweigh its convenience and other benefits.
| munch117 wrote:
| FP does not introduce imprecision. Quite the contrary: The
| continuous rounding (or truncation) triggered by using
| scaled integers is what introduces imprecision. Whereas
| exponent scaling in floating point ensures that all the
| bits in the mantissa are put to good use.
|
| It's a trade-off between precision and predictability.
| Floating point provides the former. Scaled integers provide
| the latter.
| pgwhalen wrote:
| I was using imprecision in a more general and less
| mathematical sense than the way you're interpreting it,
| but yes this is a good point about why FP is useful in
| many financial contexts, when the monetary amount is
| derived from some model.
| munch117 wrote:
| The answer is accounting. In accounting you want
| predictability and reproducibility more than anything, and
| you are prepared to throw away precision on that altar.
|
| If you're summing up the cost of items in a webshop, then
| you're in the domain of accounting. If the result appears to
| be off by a single cent because of a rounding subtlety, then
| you're in trouble, because even though no one should care
| about that single cent, it will give the appearance that you
| don't know what you're doing. Not to mention the trouble you
| could get in for computing taxes wrong.
|
| If, on the other hand, you're doing financial forecasting or
| computing stock price targets, then you're not in the domain
| of accounting, and using floating point for money is just
| fine.
|
| I'm guessing from your post that your finance people are more
| like the latter. I could be wrong though - accountants do
| tend to use Excel.
| jcranmer wrote:
| To get the right answers for accounting, all you have to do
| is pay attention to how you're doing rounding, which is no
| harder for floating-point than it is for fixed-point.
| Actually, it might be slightly easier for floating-point,
| since you're probably not as likely to skip over the part
| of the contract that tells you what the rounding rules you
| have to follow are.
| munch117 wrote:
| Agreed. To do accounting, you need to employ some kind of
| discipline to ensure that you get rounding right. So many
| people erroneously believe that such a discipline has to
| be based on fixed point or decimal floating point
| numbers. But binary floating point can work just fine.
| chuckadams wrote:
| I haven't worked with C in nearly 20 years and even I remember
| warnings against -ffast-math. It really ought not to exist: it's
| just a super-flag for things like -funsafe-math-optimizations,
| and the latter makes it really clear that it's, well, unsafe (or
| maybe it's actually _fun_ safe!)
| smcameron wrote:
| One thing I did not see mentioned in the article, or in these
| comments (according to ctrl-f anyway) is the use of
| feenableexcept()[1] to track down the source of NaNs in your
| code:
|
|     feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW);
|
| will cause your code to get a SIGFPE whenever a NaN crawls out
| from under a rock. Of course it doesn't work with fast-math
| enabled, but if you're unknowingly getting NaNs _without_ fast-
| math enabled, you obviously need to fix those before even trying
| fast-math, and they can be hard to find, and feenableexcept()
| makes finding them a lot easier.
|
| [1] https://linux.die.net/man/3/feenableexcept
| hyghjiyhu wrote:
| One thing I wonder is what happens if you have an inline function
| in a header that is compiled with fast math by one translation
| unit and without in another.
| sholladay wrote:
| Correctness > performance, almost always. It's easier to notice
| that you need more performance than to notice that you need more
| correctness. Though performance outliers can definitely be a
| hidden problem that will bite you.
|
| Make it work. Make it right. Make it fast.
| cbarrick wrote:
| This page consistently crashes on Vivaldi for Android.
|
| Vivaldi 7.4.3691.52
|
| Android 15; ASUS_AI2302 Build/AQ3A.240812.002
| dirtyhippiefree wrote:
| I'm stunned by the following admission: "If fast-math was to give
| always the correct results, it wouldn't be fast-math"
|
| If it's not always correct, whoever chooses to use it chooses to
| allow error...
|
| Sounds worse than worthless to me.
| mg794613 wrote:
| Haha, the neverending cycle.
|
| Stop trying. Let their story unfold. Let the pain commence.
|
| Wait 30 years and see them being frustrated trying to tell the
| next generation.
| boulos wrote:
| I've also come around to -ffast-math considered harmful. It's
| useful though to help find optimization _opportunities_ , but in
| the modern (AVX2+) world, I think the risks outweigh the
| benefits.
|
| I'm surprised by the take that FTZ is worse than reassociation.
| FTZ being environmental rather than per instruction is certainly
| unfortunate, but that's true of rounding modes generally in x86.
| And I would argue that _most_ programs are unprepared to handle
| subnormals anyway.
|
| By contrast, reassociation definitely allows more optimization,
| but it also prohibits you from specifying the order precisely:
|
| > Allow re-association of operands in series of floating-point
| operations. This violates the ISO C and C++ language standard by
| possibly changing computation result.
|
| I haven't followed standards work in forever, but I imagine that
| the introduction of std::fma gets people most of the benefit.
| That combined with something akin to volatile (if it actually
| worked) would probably be good enough for most people. Known,
| numerically sensitive code paths would be carefully written,
| while the rest of the code base can effectively be "meh, don't
| care".
| leephillips wrote:
| This part was fascinating:
|
| "The problem is how FTZ is actually implemented on most hardware:
| it is not set per-instruction, but instead controlled by the
| floating point environment: more specifically, it is controlled
| by the floating point control register, which on most systems is
| set at the thread level: enabling FTZ will affect all other
| operations in the same thread.
|
| "GCC with -funsafe-math-optimizations enables FTZ (and its close
| relation, denormals-are-zero, or DAZ), even when building shared
| libraries. That means simply loading a shared library can change
| the results in completely unrelated code, which is a fun
| debugging experience."
___________________________________________________________________
(page generated 2025-05-31 23:00 UTC)