[HN Gopher] Floating points between zero and one
___________________________________________________________________
Floating points between zero and one
Author : ChadNauseam
Score : 70 points
Date : 2024-08-29 13:42 UTC (9 hours ago)
(HTM) web link (chadnauseam.com)
(TXT) w3m dump (chadnauseam.com)
| antishatter wrote:
| There are infinitely many numbers between any two numbers...
| different sizes of infinity and all that...
| davidt84 wrote:
| Not between floating point numbers.
| jimhefferon wrote:
| This post is not about real numbers, though. It's about
| floating point numbers.
| ot wrote:
| Even if we were talking about real numbers (which we're not),
| it would still be true that (0, 1) and (1, +inf) have the
| same cardinality :)
|
| As touched on in the post, 1/x is an explicit bijection (but
| any set of real numbers that contains an interval has the
| same cardinality)
| mbauman wrote:
| But perhaps that's the interesting point -- 1/x *cannot
| be* an exact bijection for floating point numbers, because
| there's no one-to-one mapping between the two sets.
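|
| A quick numerical check of that (Python sketch with numpy;
| the sampling range is arbitrary): round-tripping through the
| reciprocal in float32 fails for many inputs, so x -> 1/x
| cannot pair the two sets up exactly.
|
|     import numpy as np
|
|     rng = np.random.default_rng(0)
|     x = rng.uniform(1.0, 1e6, 100_000).astype(np.float32)
|     y = np.float32(1.0) / (np.float32(1.0) / x)  # x -> 1/x -> 1/(1/x)
|     print((y != x).mean())  # a nonzero fraction does not round-trip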
| ohnoitsahuman wrote:
| Counterpoint: Pick a random number between 0 and Infinity.
|
| What set of infinite numbers are you more likely to wind up with?
|
| Do this a million times.
|
| Therefore 1 to Infinity is significantly larger than 0 to 1
| mkl wrote:
| Not in floating point, which is the whole point here. Floats
| can only represent a finite subset of the real numbers (there
| are only 2^32 or 2^64 possible bit patterns, after all). They
| can represent lots of small numbers close together around 0,
| and lots of big numbers with increasingly large gaps between
| them, and there's an infinitely large gap between ~10^308 and
| inf. They are designed so that the possible floating point
| numbers are among the most useful numbers for most human
| calculations.
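|
| Those growing gaps are easy to see with math.ulp (a Python
| sketch; ulp is the distance to the next representable double):
|
|     import math
|
|     for x in [1e-10, 1.0, 1e10, 1e100, 1e300]:
|         print(x, math.ulp(x))  # the gap grows with the magnitude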
|
| Even ignoring the realities of floating point, your argument
| depends on sampling uniformly from an infinite interval, which
| is not possible:
| https://math.stackexchange.com/questions/14777/why-isnt-ther...
| alexarnesen wrote:
| I like the approach, but infinities can be counterintuitive.
| See Cantor, Dedekind. The problem is your experiment. We don't
| say "there are N reals between 0 and 1"; we talk about
| cardinality. The sets (0,1] and (1,3] in R don't have "the
| same number of items" in them; they have the same cardinality.
| jepler wrote:
| That may feel like an intuitive declaration, but that's not how
| it works in standard mathematics. In standard set theory, the
| cardinality of the reals is equal to the cardinality of any
| non-degenerate interval of reals. Wikipedia quotes this fact
| without proof (https://en.wikipedia.org/wiki/Cardinality_of_the
| _continuum#S...).
|
| Here's one hand-wavy proof of why the cardinality of the real
| interval P=[0,1] is the same as the cardinality of the real
| interval Q=[0, infinity]: The function f(x) = 1/x-1 is a
| bijective function that maps the P interval onto the Q
| interval, which also proves the cardinality of the two sets is
| equal. (https://en.wikipedia.org/wiki/Bijection#Cardinality).
|
| If you're not comfortable with 1/0 = infinity as a general
| matter, then simply replace the f(x) I gave with an explicit
| piecewise function f(x) = { infinity if x = 0, else 1/x - 1 }
| and the proof still works.
| jcranmer wrote:
| Pick a random number according to _which distribution_?
|
| You sound like you want a uniform distribution (i.e., P(x in
| [a, b]) = (b - a) / (size of the total support of D)), but
| when you have an infinite support, the denominator is
| infinity - 0 = infinity, so the probability that x is in any
| finite interval in that set is 0. I've never taken real
| analysis, so my knowledge here is quite shaky, but I'm not
| even certain that the resulting function is well-defined as a
| probability distribution in the first place.
|
| Real analysis, i.e. the real numbers (and consequently
| infinite sets), is far weirder than you ever expected it to be.
| roywiggins wrote:
| Yes, you can't define a uniform distribution on the reals at
| all; there's no way to make it sum to 1, which is required.
| Either it's 0 everywhere and the cumulative probability is 0,
| or it's a positive constant everywhere and the integral
| diverges, or it's not uniform.
| kccqzy wrote:
| That's not how cardinality is measured. The very first class
| of an introductory set theory course will teach you that
| bijections are used to measure the size of a set.
|
| Consider the tan function. When you give it a number between 0
| and pi/2, it gives you a number between 0 and infinity, and it
| does so in a bijective way. Therefore there are as many
| numbers between 0 and pi/2 as between 0 and infinity. Now
| consider a simple linear function that multiplies its input by
| pi/2. From it we know that there are as many numbers between 0
| and 1 as between 0 and pi/2.
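|
| The composition is easy to write down explicitly (Python
| sketch; f is my name for it):
|
|     import math
|
|     def f(u):
|         # (0, 1) -> (0, pi/2) -> (0, infinity), both steps bijective
|         return math.tan(u * math.pi / 2)
|
|     print(f(0.5), f(0.999999))  # ~1.0, then a very large number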
| mitthrowaway2 wrote:
| There are several other objections posted, but only one that
| really refutes the heart of your claim.
|
| To judge what set of numbers you're _more likely_ to end up
| with, you need to specify a probability distribution. Without
| any specific information to prefer one number over another, you
| want the highest entropy distribution. It turns out that the
| most "natural" probability distribution for x extending
| between 0 and infinity is not uniform over x, but over its
| _logarithm_; therefore, by symmetry arguments, the probability
| is actually more like 50% that you draw a number between 0 and
| 1.
|
| And it turns out that floating point numbers more or less
| respect this property.
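|
| One way to see that: count bit patterns per binade (Python
| sketch; bits() is my helper). Every interval [2^k, 2^(k+1))
| holds the same number of float32 values, which is exactly
| uniformity in the logarithm.
|
|     import struct
|
|     def bits(x):
|         # bit pattern of x as a float32
|         return struct.unpack('<I', struct.pack('<f', x))[0]
|
|     print(bits(2.0) - bits(1.0))  # 2**23 values in [1, 2)
|     print(bits(4.0) - bits(2.0))  # 2**23 values in [2, 4)
|     print(bits(1.0) - bits(0.5))  # 2**23 values in [0.5, 1)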
| samatman wrote:
| Allow me to offer a brief proof of the contrary.
|
| You may select any real number between zero and infinity. We
| will call this R.
|
| I will give you 1/R, which is between zero and one.
|
| QED.
|
| I believe you're conflating range, where it is trivially true
| that [0, infinity] is of greater _extent_ than (0,1), with
| quantity, where it is not the case that there exists a greater
| quantity of values in the former range than in the latter.
| grodriguez100 wrote:
| > You may select any real number between zero and infinity.
|
| Should that have been "between one and infinity"? Otherwise
| you cannot claim that 1/R is between zero and one.
| mkl wrote:
| > I will note that the author says there are 1,056,964,610
| "normal floating-point numbers" in [0,1) and I am not entirely
| sure why this disagrees with the number I produced.
|
| The reason is there in the quote. You counted both normal and
| subnormal [1] floating point numbers, and they just counted
| normal floating point numbers. On mobile so can't check precise
| details, but that would explain it.
|
| [1] https://en.wikipedia.org/wiki/Subnormal_number
| tialaramex wrote:
| Huh, that Wiki link taught me something I didn't know - I
| hadn't realised a different FP representation could have
| Denormals (a word I knew) which weren't Subnormal (a new word
| to me, more precisely defining what's different about these
| values).
|
| The Subnormals (presumably any Denormals, regardless?) tend to
| be annoyingly slow, because the FP hardware usually special-
| cases them to some slow-but-correct silicon, not the fast path
| for the usual case. So, in some code it can make sense to
| "zap" subnormals, converting them to zero since they're
| _almost_ zero anyway. Obviously in other code this is a
| terrible idea, so your domain knowledge comes into play when
| authoring such code. For example, if we're using -1 .. +1 as
| the 0dB range for our Karaoke software, a subnormal is clearly
| inaudible noise, but it might make our sophisticated reverb
| simulator (so you can pretend you're singing in a big
| auditorium) a thousand times slower than normal. So, zap those
| subnormals before the simulator and we hear the same results
| but don't get punished with miserable performance.
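|
| A sketch of such a "zap" in numpy (a manual flush-to-zero;
| real audio code would usually set the hardware FTZ/DAZ flags
| instead):
|
|     import numpy as np
|
|     x = np.array([1e-42, 3e-39, 0.5], dtype=np.float32)  # first two are subnormal
|     tiny = np.finfo(np.float32).tiny  # smallest positive normal float32
|     x = np.where(np.abs(x) < tiny, np.float32(0.0), x)
|     print(x)  # subnormals flushed to zero, normal values untouched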
| mkl wrote:
| Denormal and subnormal are synonyms as far as I know and can
| find. That first sentence is awkwardly phrased: "subnormal
| numbers are the subset of denormalized numbers (sometimes
| called denormals)" would be better as "subnormal numbers
| (sometimes called denormals) are the subset of denormalized
| numbers". It does imply there are other "denormalized" numbers
| though; not sure what they are.
| jcranmer wrote:
| Terminology is confusing!
|
| A floating-point number is generally defined as (-1)^sign *
| (mantissa) * base^exponent, where mantissa is a sequence of
| digits in the base.
|
| A _normalized_ floating-point number has mantissa be a
| fixed number of digits for a given floating-point type.
|
| A _denormalized_ floating-point number is one with a
| different number of digits, where the exponent is the
| minimum possible value.
|
| An _unnormalized_ floating-point number is one with a
| different number of digits, but the exponent is not the
| minimum possible value. (This doesn't occur in the
| standard IEEE 754 types, but the PPC and the x86 long
| double floating-point types have these numbers.)
|
| There are also _noncanonical_ floating-point numbers, which
| are different representations for the same value. These
| occur, for example, with IEEE 754 decimal FP, as well as
| many pre-IEEE 754 types (where instead of subnormals, you
| tend to get noncanonical representations of 0). There's
| also another kind of scenario where you have multiple
| values that are all still canonical -- for example, 0.0 and
| 0.00 are different canonical decimal FP values in IEEE 754
| (they have different exponents, and I'm not going to
| attempt to remember what they are off the top of my head),
| and of course -0 and +0 are distinct values of 0 that work
| differently.
| Asraelite wrote:
| You didn't cover what exactly "subnormal" means.
| jcranmer wrote:
| In all practical terms, subnormal is the same as denormal
| (a non-normalized number with the smallest possible
| exponent).
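|
| In code terms (Python sketch; is_subnormal is my helper):
|
|     import sys
|
|     def is_subnormal(x: float) -> bool:
|         # nonzero, but smaller in magnitude than the smallest normal double
|         return x != 0.0 and abs(x) < sys.float_info.min
|
|     print(is_subnormal(5e-324), is_subnormal(1e-300))  # True False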
|
| I think this is a situation where there's confusion over the
| precision of terminology, so things have shifted to a new
| term which is consistently used. Subnormal is now the
| preferred term for denormal, I think out of confusion as to
| whether or not unnormal numbers were also denormal (I have
| never heard the term so used, but I'm also not a great
| connoisseur of non-IEEE 754 floating-point).
|
| There's a similar attempt to shift use of the term
| "mantissa" to "significand", since a mantissa implies
| that the number is in the range [0, 1), whereas the usual
| implementation in a floating point type is to use the
| range [1, base).
| librasteve wrote:
| this is wrong(!)
|
| > A normalized floating-point number has mantissa be a
| fixed number of digits for a given floating-point type.
|
| well, no. in fact...
|
| > A normalized [binary] floating-point number has a
| mantissa which is bit-shifted so that the first bit to
| the right of the binary point is '1'.
|
| (not a great look in an expert explainer comment)
| stephencanon wrote:
| For IEEE 754-2019 basic formats, subnormal and denormal are
| synonyms.
|
| There are formats (like x87 80-bit as implemented in
| pre-387 hardware [1]) that are not 754 basic formats which
| admit non-normal encodings that are not subnormal (i.e. not
| smaller in magnitude than the smallest normal number). Some
| people used to call these "denormal" as well (even though
| the x87 docs call them "unnormal"), which caused confusion.
| "Subnormal" is unambiguous.
|
| In decimal IEEE 754 formats there are also non-canonical
| encodings, but these are neither subnormal nor denormal
| (decimal encodings are not, in general, normalized the same
| way binary floating-point is).
|
| -----
|
| [1] On 387 and later these are invalid encodings, and behave
| like a signaling NaN.
| titzer wrote:
| > FP hardware usually special cases them to some slow but
| correct silicon
|
| This is basically only true on Intel chips. Subnormals are
| not that slow on modern ARM or AMD hardware.
| zokier wrote:
| I was going to comment that floats have a nextUp operation
| which allows iterating them without needing to go through
| u32/from_bits, but apparently that feature is still behind a
| feature gate in Rust:
|
| https://github.com/rust-lang/rust/issues/91399
|
| That is despite the fact that it is an IEEE-754 required
| operation! C and friends have an equivalent available as
| nextafter: https://en.cppreference.com/w/c/numeric/math/nextafter
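|
| Python grew the same pair in 3.9 (math.nextafter and
| math.ulp), which makes the stepping easy to try; a minimal
| sketch:
|
|     import math
|
|     x = 1.0
|     print(math.nextafter(x, math.inf))  # 1.0000000000000002, the next double up
|     print(math.ulp(x))                  # 2.220446049250313e-16, the gap at 1.0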
| lifthrasiir wrote:
| I was more surprised that the proposed method does not include
| nextafter itself, which comes with its own distinct edge cases.
| vlovich123 wrote:
| Reasoning explained here:
| https://github.com/rust-lang/rfcs/blob/master/text/3173-floa...
|
| TL;DR: no one really uses it; the spec is ambiguous and
| deprecated; those that do use it seem to be calling it with
| constants that are usually infinities, which is better
| served and clearer by next_up/next_down; and those that
| don't use infinities tend to assume the range of the value
| and pick a constant outside that range, which can be
| dangerous if the range changes.
| jcranmer wrote:
| IEEE 754-1985 had a nextAfter function (which C implemented
| as nextafter and nexttoward), but that was removed in IEEE
| 754-2008 in favor of nextUp and nextDown, which C implemented
| as nextup and nextdown, but only as of C23, which still hasn't
| officially been published by ISO.
|
| There's also another wrinkle: next* functions go from
| -subnormal to -0 to +subnormal, and don't give you an
| iteration step for both -0 and +0, which can be useful for
| some iteration purposes.
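|
| The -0/+0 behaviour is visible directly (Python sketch):
|
|     import math
|
|     print(math.nextafter(-5e-324, math.inf))  # -0.0
|     print(math.nextafter(-0.0, math.inf))     # 5e-324: +0.0 is skipped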
| tialaramex wrote:
| I think even with these operations, it's more intuitive for a
| programmer to just try all the bit patterns using
| f32::from_bits on integers, _especially_ if they aren't
| already deeply familiar with how the IEEE representation
| works, which seems like a prerequisite for this experiment.
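|
| The from_bits trick translates to most languages (Python
| sketch using struct; f32_from_bits is my name for it):
|
|     import struct
|
|     def f32_from_bits(b):
|         # reinterpret a 32-bit integer as a float32
|         return struct.unpack('<f', struct.pack('<I', b))[0]
|
|     print(f32_from_bits(0x3f800000))  # 1.0
|     print(f32_from_bits(0x3f800001))  # 1.0000001192092896, i.e. 1 + 2**-23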
| ryandrake wrote:
| I thought it was well known among programmers that (at least
| using IEEE floating point representation) half the set of
| storable floats are between -1.0 and 1.0, a quarter are
| between -inf and -1.0, and another quarter are between 1.0 and
| inf.
|
| One practical application of this is: if you have, say, a
| percentage value, and fixed point is out of the question, and
| you want to retain as much precision as possible and still use
| floats, _don't_ store it in a float with range [0.0,100.0].
| Store it with the range [0.0,1.0]. Also why, if you're dealing
| with angles, you should not store them in the range
| [0.0,360.0), but instead store them either as radians [0,2pi),
| or better: [-pi,pi), or store them as [-1.0,1.0) and use trig
| routines designed to work with that range.
|
| I always thought this made intuitive sense when you understand
| how the number is encoded. Then again, when I learned
| programming, we weren't "allowed" to use floats until we
| demonstrated that we understood how they were represented in
| memory.
| _aavaa_ wrote:
| Depends on the field; in numerical simulations this is well
| known, and a lot of effort goes into normalizing everything to
| fit into that range to minimize numerical issues.
|
| Many new programmers, and those that deal with languages such
| as Python, don't really think about such things, treating
| floats as mathematically precise.
| lifthrasiir wrote:
| More accurately speaking, the goal is to reduce the total
| inaccuracy contributed by all operations. The inaccuracy will
| (generally) be proportional to the number of operations, but
| keep in mind that some operations can be exceptionally
| inaccurate due to, for example, catastrophic cancellation. So
| do not normalize a percentage into [0,1] if the input was
| already in [0,100] and an additional operation is needed for
| normalization. (By the way, cospi etc. counts as a single
| instruction here, which is why it is so good to have.)
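|
| A classic instance of that exceptional inaccuracy (Python
| sketch): 1 - cos(x) for small x cancels catastrophically,
| while an algebraically equal form does not.
|
|     import math
|
|     x = 1e-8
|     print(1 - math.cos(x))         # 0.0: every significant digit cancelled
|     print(2 * math.sin(x / 2)**2)  # 5e-17: the mathematically equal rewrite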
| jacobolus wrote:
| What you are saying here is expressing some
| misunderstandings/misconceptions, and may confuse readers.
|
| There's no reason to prefer floating point values with any
| particular exponent, as long as you are not getting too close
| to the ends, which for double precision is roughly googol^3 or
| 1/googol^3. (These numbers are _absurdly_ big/small, and you
| are generally only going to get to them by multiplying a long
| list of big or small numbers together; if you need to do that
| you might need to occasionally renormalize the result and track
| the exponent separately, or work with logarithms instead.) Even
| for single precision, the exponent limits are about 10^38 or
| 10^(-38), which is very very big.
|
| > _want to retain as much precision as possible and still use
| floats, don't store it in a float with range [0.0,100.0].
| Store it with the range [0.0,1.0]_
|
| This doesn't make sense to me. There are just as many floating
| point numbers between 64 and 128 as there are between 0.5 and
| 1.0, and the same number between 32 and 64, between 0.25 and
| 0.5, etc. All you did in multiplying by a constant is twirl up
| the mantissa bits and shift the exponent by ~7. Unless you care
| about the precise rounding in the ~16th decimal digit, there is
| limited practical difference. (Well, one tiny difference is you
| are preventing some of the integer-valued percentages from
| having an exact representation, if for some reason you care
| about that. On the flip side, if you need to compose these
| percentages or apply them to some other quantity the range 0-1
| is generally more convenient because you won't have to do an
| extra division by 100.)
|
| > _if you're dealing with angles, you should not store them in
| the range [0.0,360.0), but instead store them either as
| radians [0,2pi), or better: [-pi,pi), or store them as
| [-1.0,1.0) and use trig routines designed to work with that
| range._
|
| Floats from 0 to 360 is a perfectly fine representation for
| angles, though you may want to use -180 to 180 if you want to
| accumulate or compare many very small angles in either
| direction, since there is much more precision near e.g.
| -0.00001 than near 359.99999. (Of course, if whatever software
| libraries you are using expect radians, it can be convenient to
| use radians as a representation, but it won't be any more or
| less precise.)
|
| The reason pure mathematicians (and as a consequence most
| scientists) use radians instead is because the trig functions
| are easier to write down as power series and easier to do
| calculus with (using pen and paper) when expressed in terms of
| radians, because it eliminates an annoying extra constant.
|
| Using numbers in the range -1 to 1 can be more convenient than
| radians mainly because pi is not exactly representable in
| floating point (it can sometimes be nice to get an exact answer
| for arcsin(1) or the like), and because there are other
| mathematical tools which are nice to express in the interval
| [-1, 1].
|
| Aside: If you are using your angles and trig functions for
| doing geometry (rather than, say, approximating periodic
| functions), let me instead recommend representing your angles
| as a pair of numbers (cos a, sin a), and then using vector
| algebra instead of trigonometry, ideally avoiding angle
| measures altogether except at interfaces with people or code
| expecting them. You'll save a lot of transcendental function
| evaluations and your code will be easier to write and reason
| about.
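|
| A tiny sketch of that vector style (Python; the angle lives as
| (c, s) = (cos a, sin a), so no trig calls are needed per use):
|
|     import math
|
|     def rotate(p, c, s):
|         # rotate point p = (x, y) by the angle with cosine c and sine s
|         x, y = p
|         return (c * x - s * y, s * x + c * y)
|
|     c, s = math.cos(0.3), math.sin(0.3)  # trig only at the interface
|     print(rotate((1.0, 0.0), c, s))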
|
| Aside #2: The biggest thing you should worry about with
| floating point arithmetic is places in your code where two
| nearly equal numbers get subtracted. This results in
| "catastrophic cancellation" that can eat up most of your
| precision. For example, you need to be careful when writing
| code to find the roots of quadratic equations, and shouldn't
| just naively use the "quadratic formula" or one of the two
| roots will often be very imprecise.
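|
| The standard fix is to compute the larger-magnitude root first
| and derive the other from the product of the roots (Python
| sketch; assumes real roots, i.e. b*b >= 4*a*c):
|
|     import math
|
|     def quadratic_roots(a, b, c):
|         d = math.sqrt(b * b - 4 * a * c)
|         # pick the sign that avoids subtracting nearly equal numbers
|         q = -0.5 * (b + math.copysign(d, b))
|         return q / a, c / q
|
|     # the naive (-b + d) / (2a) loses most digits of the small root here
|     print(quadratic_roots(1.0, 1e8, 1.0))  # (-100000000.0, -1e-08)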
| akira2501 wrote:
| Accurate representation of a single quantity is one thing.
| Doing several mathematical operations with that quantity
| while _maintaining_ accuracy is another.
| raphlinus wrote:
| The quadratic solver implementation in kurbo [1] is designed
| to be fast and reasonably precise for a wide range of inputs.
| But for a definitive treatment of how to solve quadratic
| equations, see "The Ins and Outs of Solving Quadratic
| Equations with Floating-Point Arithmetic" by Goualard [2]. I
| thought I understood the problem space pretty well, then I
| came across that.
|
| [1]: https://github.com/linebender/kurbo/blob/b5bd2aa856781c6cf46...
|
| [2]: https://cnrs.hal.science/hal-04116310/document
| samatman wrote:
| A stronger argument for storing 'percentages' as [0.0,1.0] can
| be made: the precision involved is seldom the limiting factor.
|
| It has to do with why I put percentage in scare quotes. A
| percentage is a certain way of writing a fraction, one which
| is more convenient in some cases for humans to work with.
| Generally, whatever you're doing with that number is more
| effectively done with the range [0.0,1.0], other than perhaps
| printing it, which is trivial. Carrying it around inside the
| calculation imposes a conceptual burden which is of no use in
| implementing whatever problem one is trying to solve.
|
| It's true that you avoid a strange discontinuity of gradient
| at 1%, but it hardly matters; the real driving force here is
| that any practical use of 60% is realized using 0.6, so if
| you're going to divide your 'percentage' by 100 in order to do
| N * M%, just do it when you read in the number and get it out
| of the way.
| ealloc wrote:
| > [if] you want to retain as much precision as possible and
| still use floats, don't store it in a float with range
| [0.0,100.0]. Store it with the range [0.0,1.0].
|
| I just tested this out and it doesn't seem true.
|
| The two storing methods seem similarly precise over most of the
| range of fractions [0,1], sometimes one gives lower spacing,
| sometimes the other. For instance, for fractions from 0.5 to
| 0.638 we get smaller spacing if using [0,100], but for 0.638 to
| 1 the spacing is smaller if storing in [0,1].
|
| For very small fractions (< 1e-38), it also seems more
| accurate to store in the range [0,100], since the same bit
| patterns then represent smaller fractions. That is, the
| smallest nonzero positive float32 is 1.40129846e-45; if you
| store the fraction directly in the range [0,1], that's the
| smallest representable fraction, but if you store it scaled to
| [0,100], that same value represents a fraction of
| 1.40129846e-47, which is smaller.
|
| For the general result, see for yourself in Python/numpy:
|
|     import numpy as np
|     import matplotlib.pyplot as plt
|
|     x = np.linspace(0, 1, 10000)
|     # spacing when stored as [0,100], rescaled back to a fraction:
|     plt.plot(x, np.float64(np.spacing(np.float32(x * 100))) / 100)
|     # spacing when stored as [0,1]:
|     plt.plot(x, np.float64(np.spacing(np.float32(x))))
|     plt.show()
| dirtdobber wrote:
| This is pretty easy to verify by just knowing how 32 bit floats
| are represented.
|
| A 32-bit float is represented as:
|
| - 1 bit = sign bit
|
| - 8 bits = exponent; represents a signed value in [-128, 127].
| Note that an exponent value 127 is special and reserved for
| infinity.
|
| - 23 bits = mantissa
|
| Numbers >= 1.0 ==> (sign = 1, 0 <= exponent <= 126, mantissa
| >=0). That's (2^23 * 127) + 1 (to include infinity). This comes
| out to exactly 1,065,353,217 for numbers in the range [1.0,
| +inf].
|
| Numbers < 1.0 and > 0 ==> (sign = 1, exponent < 0, mantissa >=
| 0). That is 2^23 * 128 = 2^30. This comes out to exactly
| 1,073,741,824 for numbers in the range (0.0, 1.0).
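|
| A quick cross-check via bit patterns (Python sketch; counts
| include the subnormals, and +inf as the closed endpoint):
|
|     import struct
|
|     def bits(x):
|         return struct.unpack('<I', struct.pack('<f', x))[0]
|
|     print(bits(1.0) - 1)                       # 1065353215 in (0.0, 1.0)
|     print(bits(float('inf')) - bits(1.0) + 1)  # 1073741825 in [1.0, +inf]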
| chillingeffect wrote:
| Tiny caveat: this only applies to float formats that include
| subnormals, such as IEEE 754. Other systems may or may not
| include subnormals.
| samatman wrote:
| I think we've reached the point where saying "float" should
| mean IEEE754, and the burden of being more specific falls on
| those working with various legacy formats.
| bell-cot wrote:
| Don't recall where I heard it, but: "Representing numbers with
| floating point is somewhere between lossy compression and a crap
| hash algorithm."
| jacobolus wrote:
| This seems excessively glib. In my view floating point numbers
| are one of the most valuable and significant innovations in the
| history of science, and the widespread availability of
| IEEE-754-compliant float hardware is one of the best things
| about computers. We all owe an enormous debt of gratitude to
| the folks who spent a ton of (technical and political) work to
| make it happen.
|
| Instead I would say: numerical analysis is hard irrespective of
| the number representation, and we should try to spread at least
| basic knowledge about it so people can accomplish their goals
| while avoiding some of the pitfalls.
| jjmarr wrote:
| "Floating points are the worst method of representing
| rational numbers, except for all the others." --Winston
| Churchill probably
| lmm wrote:
| Nah, arbitrary precision rationals are much better if you
| actually only need to represent rationals. And frankly even
| for reals, decimal types are better for most use cases.
| bell-cot wrote:
| Motor vehicles are one of the most valuable and significant
| innovations in history, etc. etc.
|
| But that doesn't make them safe.
| peterhull90 wrote:
| In the HN comment thread that the article discusses [0], is
| the conclusion that commenter a1369209993 is correct (there
| are as many between 0 & 1 as between 1 & +INF) and llm_trw is
| not? I got a bit confused.
|
| Also, the article links to a blog by Daniel Lemire [1] in
| which he says (with regard to producing an unbiased random
| float) that "picking an integer in [0,2^32) at random and
| dividing it by 2^32, was equivalent to picking a number at
| random in [0,1)" is incorrect, and that there is a ratio of up
| to 257:1 in the distribution so obtained. Not wanting to
| disagree with Daniel Lemire, but I can't see why, and a quick
| experiment in Python didn't give this ratio.
|
| [0]: https://news.ycombinator.com/item?id=41112688
|
| [1]: https://lemire.me/blog/2017/02/28/how-many-floating-point-nu...
| kccqzy wrote:
| The blog post explained it perfectly. There are 2^32 integers
| when you pick from [0,2^32). But there are 0x3f800000 floating
| point numbers between 0 and 1. And the former number is not
| divisible by the latter number. Therefore using division by
| 2^32 cannot be unbiased.
|
| It's helpful if you first look at smaller examples. If we were
| to generate random integers in [0,10) by first generating
| random integers in [0,50) and then dividing by 5, that's
| valid. Exactly 5 numbers get mapped to one number each: the
| numbers [0,5) get mapped to 0, [5,10) get mapped to 1, etc.
| But what if you instead want numbers in [0,3)? If you do the
| same division trick, the probability of the number 2 appearing
| is less than that of 0 or 1.
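|
| The [0,3) case, spelled out (Python sketch):
|
|     from collections import Counter
|
|     # map 50 equally likely integers onto {0, 1, 2} by division
|     print(Counter(n // 17 for n in range(50)))
|     # Counter({0: 17, 1: 17, 2: 16}): 2 is slightly less likely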
| mbauman wrote:
| So where is the even split point? It's [0, 1.5) and [1.5, Inf)!
| There are `2^62-2^51` values in both of those sets (counting 0
| but not Inf itself). Or `2^30-2^22` for 32 bit.
|
| Or perhaps even more interestingly, there's the same number --
| 2^62 or 2^30 -- in [0, 2) as [1, Inf).
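|
| Checking the 32-bit version with bit patterns (Python sketch):
|
|     import struct
|
|     def bits(x):
|         return struct.unpack('<I', struct.pack('<f', x))[0]
|
|     print(bits(1.5))                       # 2**30 - 2**22 values in [0.0, 1.5)
|     print(bits(float('inf')) - bits(1.5))  # the same count in [1.5, inf)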
| TZubiri wrote:
| This is why programmers using percentages in code is a pet
| peeve of mine.
|
| Applying a 15% discount by multiplying by 15 and dividing by
| 100 is unhinged to me.
___________________________________________________________________
(page generated 2024-08-29 23:00 UTC)