[HN Gopher] Floating points between zero and one
       ___________________________________________________________________
        
       Floating points between zero and one
        
       Author : ChadNauseam
       Score  : 70 points
       Date   : 2024-08-29 13:42 UTC (9 hours ago)
        
 (HTM) web link (chadnauseam.com)
 (TXT) w3m dump (chadnauseam.com)
        
       | antishatter wrote:
        | There are infinitely many numbers between any two numbers...
        | different sizes of infinity and all that...
        
         | davidt84 wrote:
         | Not between floating point numbers.
        
         | jimhefferon wrote:
         | This post is not about real numbers, though. It's about
         | floating point numbers.
        
           | ot wrote:
           | Even if we were talking about real numbers (which we're not),
           | it would still be true that (0, 1) and (1, +inf) have the
           | same cardinality :)
           | 
            | As touched on in the post, 1/x is an explicit bijection (but
            | any set of real numbers that contains an interval has the
            | cardinality of the continuum)
        
             | mbauman wrote:
              | But perhaps that's the interesting point -- 1/x *cannot
              | be* an exact bijection for floating point numbers, because
              | there's no one-to-one mapping between the two sets.
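              | 
              | A quick illustration of that in Python (doubles): near the
              | top of the range, reciprocals land among the subnormals,
              | whose grid is so coarse that distinct inputs collapse onto
              | the same output. A minimal sketch:
              | 
              |   import math, sys  # math.nextafter needs Python 3.9+
              | 
              |   # the 8 largest finite doubles
              |   xs = [sys.float_info.max]
              |   for _ in range(7):
              |       xs.append(math.nextafter(xs[-1], 0.0))
              | 
              |   recips = [1.0 / x for x in xs]
              |   # 8 distinct inputs, far fewer distinct reciprocals
              |   print(len(set(xs)), len(set(recips)))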
        
       | ohnoitsahuman wrote:
       | Counterpoint: Pick a random number between 0 and Infinity.
       | 
        | Which of the two infinite sets are you more likely to wind up in?
       | 
       | Do this a million times.
       | 
       | Therefore 1 to Infinity is significantly larger than 0 to 1
        
         | mkl wrote:
         | Not in floating point, which is the whole point here. Floats
         | can only represent a finite subset of the real numbers (there
         | are only 2^32 or 2^64 possible bit patterns, after all). They
         | can represent lots of small numbers close together around 0,
         | and lots of big numbers with increasingly large gaps between
         | them, and there's an infinitely large gap between ~10^308 and
         | inf. They are designed so that the possible floating point
         | numbers are among the most useful numbers for most human
         | calculations.
         | 
         | Even ignoring the realities of floating point, your argument
         | depends on sampling uniformly from an infinite interval, which
         | is not possible:
         | https://math.stackexchange.com/questions/14777/why-isnt-ther...
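          | 
          | For a feel of those gaps in doubles, a small Python sketch
          | (math.ulp gives the gap to the next representable value;
          | Python 3.9+):
          | 
          |   import math
          | 
          |   for x in (1e-10, 1.0, 1e10, 1e300):
          |       print(x, math.ulp(x))   # the gap grows with magnitude
          | 
          |   # beyond the largest finite double the next stop is inf
          |   print(math.nextafter(math.inf, 0.0))  # ~1.7976931e308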
        
         | alexarnesen wrote:
          | I like the approach, but infinities can be counterintuitive.
          | See Cantor, Dedekind. The problem is your experiment. We don't
          | say "there are N reals between 0 and 1"; we talk about
          | cardinality. The sets (0,1] and (1,3] in R^1 don't have "the
          | same number of items" in them; they have the same cardinality.
        
         | jepler wrote:
         | That may feel like an intuitive declaration, but that's not how
         | it works in standard mathematics. In standard set theory, the
         | cardinality of the reals is equal to the cardinality of any
         | non-degenerate interval of reals. Wikipedia quotes this fact
         | without proof (https://en.wikipedia.org/wiki/Cardinality_of_the
         | _continuum#S...).
         | 
         | Here's one hand-wavy proof of why the cardinality of the real
         | interval P=[0,1] is the same as the cardinality of the real
         | interval Q=[0, infinity]: The function f(x) = 1/x-1 is a
         | bijective function that maps the P interval onto the Q
         | interval, which also proves the cardinality of the two sets is
         | equal. (https://en.wikipedia.org/wiki/Bijection#Cardinality).
         | 
          | If you're not comfortable with 1/0 = infinity as a general
          | matter, then simply replace the f(x) I gave with an explicit
          | piecewise function f(x) = { infinity if x = 0, else (1/x-1) }
          | and the proof still works.
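          | 
          | A quick numeric spot check of that map (a small Python sketch;
          | here g(y) = 1/(y + 1) is the inverse on the finite part):
          | 
          |   f = lambda x: 1.0 / x - 1.0    # (0, 1] -> [0, inf)
          |   g = lambda y: 1.0 / (y + 1.0)  # inverse direction
          | 
          |   for x in (1.0, 0.5, 0.25, 0.125):
          |       # round trip is exact for these power-of-two inputs
          |       print(x, f(x), g(f(x)))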
        
         | jcranmer wrote:
         | Pick a random number according to _which distribution_?
         | 
          | You sound like you want a uniform distribution (i.e., P(x in
          | [a, b]) = (b - a) / (length of the support of D)), but when
          | you have an infinite support, the denominator is infinity - 0
          | = infinity, so the probability that x lands in any finite
          | interval is 0. I've never taken real analysis, so my knowledge
          | here is quite shaky, but I'm not even certain that the
          | resulting function is well-defined as a probability
          | distribution in the first place.
         | 
          | Real analysis, a.k.a. the study of real numbers (and
          | consequently infinite sets), is far weirder than you ever
          | expected it to be.
        
           | roywiggins wrote:
            | Yes, you can't define a uniform distribution on the reals at
            | all; there's no way to make it integrate to 1, which is
            | required. Either it's 0 everywhere and the total probability
            | is 0, or it's a positive constant everywhere and the
            | integral diverges, or it's not uniform.
        
         | kccqzy wrote:
          | That's not how cardinality is measured. The very first class of
          | an introductory set theory course will teach you that
          | bijections are used to measure the size of a set.
          | 
          | Consider the tan function. When you give it a number between 0
          | and pi/2, it gives you a number between 0 and infinity, and it
          | does so in a bijective way. Therefore there are exactly as many
          | numbers between 0 and pi/2 as between 0 and infinity. Now
          | consider a simple linear function that multiplies its input by
          | pi/2. From there we know that there are as many numbers between
          | 0 and 1 as between 0 and pi/2.
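          | 
          | Composing the two steps, as a small Python sketch:
          | 
          |   import math
          | 
          |   # x -> tan(pi/2 * x) stretches (0, 1) onto (0, infinity)
          |   stretch = lambda x: math.tan(math.pi / 2 * x)
          | 
          |   for x in (0.1, 0.5, 0.9, 0.999):
          |       print(x, stretch(x))   # blows up as x approaches 1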
        
         | mitthrowaway2 wrote:
         | There are several other objections posted, but only one that
         | really refutes the heart of your claim.
         | 
         | To judge what set of numbers you're _more likely_ to end up
         | with, you need to specify a probability distribution. Without
         | any specific information to prefer one number over another, you
          | want the highest entropy distribution. It turns out that the
          | most "natural" probability distribution for x extending
          | between 0 and infinity is not uniform over x, but over its
          | _logarithm_; therefore, by symmetry arguments, the probability
          | is actually more like 50% that you draw a number between 0 and
          | 1.
         | 
         | And it turns out that floating point numbers more or less
         | respect this property.
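          | 
          | A rough empirical check in Python (a sketch: draw bit patterns
          | uniformly from all positive finite float32 values and see how
          | often the result is below 1.0):
          | 
          |   import random, struct
          | 
          |   def random_positive_f32():
          |       # bit patterns 1 .. 0x7f7fffff are the positive finites
          |       b = random.randint(1, 0x7f7fffff)
          |       return struct.unpack('<f', struct.pack('<I', b))[0]
          | 
          |   n = 100_000
          |   below = sum(random_positive_f32() < 1.0 for _ in range(n))
          |   print(below / n)   # comes out close to 0.5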
        
         | samatman wrote:
         | Allow me to offer a brief proof of the contrary.
         | 
         | You may select any real number between zero and infinity. We
         | will call this R.
         | 
         | I will give you 1/R, which is between zero and one.
         | 
         | QED.
         | 
         | I believe you're conflating range, where it is trivially true
          | that [0,∞] is of greater _extent_ than (0,1), with quantity,
         | where it is not the case that there exists a greater quantity
         | of values in the former range than in the latter.
        
           | grodriguez100 wrote:
           | > You may select any real number between zero and infinity.
           | 
           | Should that have been "between one and infinity"? Otherwise
           | you cannot claim that 1/R is between zero and one.
        
       | mkl wrote:
       | > I will note that the author says there are 1,056,964,610
       | "normal floating-point numbers" in [0,1) and I am not entirely
       | sure why this disagrees with the number I produced.
       | 
       | The reason is there in the quote. You counted both normal and
       | subnormal [1] floating point numbers, and they just counted
       | normal floating point numbers. On mobile so can't check precise
       | details, but that would explain it.
       | 
       | [1] https://en.wikipedia.org/wiki/Subnormal_number
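        | 
        | For reference, the normal/subnormal boundary for doubles, as a
        | quick Python check:
        | 
        |   import math, sys
        | 
        |   print(sys.float_info.min)  # 2.2250738585072014e-308,
        |                              # smallest positive normal
        |   print(math.ulp(0.0))       # 5e-324, smallest subnormal
        |   print(5e-324 / 2)          # 0.0: below that, nothing left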
        
         | tialaramex wrote:
         | Huh, that Wiki link taught me something I didn't know - I
         | hadn't realised a different FP representation could have
         | Denormals (a word I knew) which weren't Subnormal (a new word
         | to me, more precisely defining what's different about these
         | values).
         | 
          | The Subnormals (presumably any Denormals regardless?) tend to
          | be annoyingly slow, because the FP hardware usually
          | special-cases them to some slow but correct silicon rather
          | than the fast path for the usual case. So, in some code it can
          | make sense to "zap" subnormals, converting them to zero since
          | they're _almost_ zero anyway. Obviously in other code this is
          | a terrible idea, so your domain knowledge comes into play when
          | authoring such code. For example, if we're using -1 .. +1 as
          | the 0 dB range for our karaoke software, a subnormal is
          | clearly inaudible noise, but it might make our sophisticated
          | reverb simulator (so you can pretend you're singing in a big
          | auditorium) a thousand times slower than normal. So zap those
          | subnormals before the simulator and we hear the same results
          | without getting punished with miserable performance.
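          | 
          | A minimal sketch of the "zap" in Python (real audio code would
          | more likely set the CPU's flush-to-zero / denormals-are-zero
          | flags, but the idea is the same):
          | 
          |   import sys
          | 
          |   TINY = sys.float_info.min   # smallest positive normal
          | 
          |   def zap_subnormals(x):
          |       # flush anything subnormal to zero
          |       return x if abs(x) >= TINY else 0.0
          | 
          |   print(zap_subnormals(1e-310))  # 0.0 (subnormal input)
          |   print(zap_subnormals(1e-300))  # 1e-300 (normal, untouched)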
        
           | mkl wrote:
           | Denormal and subnormal are synonyms as far as I know and can
           | find. That first sentence is awkwardly phrased: "subnormal
           | numbers are the subset of denormalized numbers (sometimes
           | called denormals)" would be better as "subnormal numbers
           | (sometimes called denormals) are the subset of denormalized
           | numbers". It does imply there are other "denormal _ized_ "
           | numbers though; not sure what they are.
        
             | jcranmer wrote:
             | Terminology is confusing!
             | 
             | A floating-point number is generally defined as (-1)^sign *
             | (mantissa) * base^exponent, where mantissa is a sequence of
             | digits in the base.
             | 
             | A _normalized_ floating-point number has mantissa be a
             | fixed number of digits for a given floating-point type.
             | 
             | A _denormalized_ floating-point number is one with a
             | different number of digits, where the exponent is the
             | minimum possible value.
             | 
             | An _unnormalized_ floating-point number is one with a
             | different number of digits, but the exponent is not the
              | minimum possible value. (This doesn't occur in the
             | standard IEEE 754 types, but the PPC and the x86 long
             | double floating-point types have these numbers).
             | 
             | There's also _noncanonical_ floating-point numbers, which
             | are different representations for the same value. These
             | occur, for example, with IEEE 754 decimal FP, as well as
             | many pre-IEEE 754 types (where instead of subnormals, you
              | tend to get noncanonical representations of 0). There's
             | also another kind of scenario where you have multiple
             | values that are all still canonical--for example, 0.0 and
             | 0.00 are different canonical decimal FP values in IEEE 754
             | (they have different exponents, and I'm not going to
             | attempt to remember what they are off the top of my head),
             | and of course -0 and +0 are distinct values of 0 that work
             | differently.
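              | 
              | The two zeros are easy to poke at in Python, as a small
              | sketch:
              | 
              |   import math, struct
              | 
              |   print(0.0 == -0.0)   # True: they compare equal
              |   print(struct.pack('<d', 0.0).hex(),
              |         struct.pack('<d', -0.0).hex())  # bits differ
              |   print(math.copysign(1.0, 0.0),
              |         math.copysign(1.0, -0.0))  # 1.0 -1.0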
        
               | Asraelite wrote:
               | You didn't cover what exactly "subnormal" means
        
               | jcranmer wrote:
               | In all practical terms, subnormal is the same as denormal
               | (a non-normalized number with the smallest possible
               | exponent).
               | 
               | I think this is a situation where there's confusion over
               | precision of terminology, so things have shifted to a new
               | term which is consistently used. Subnormal is now the new
               | preferred terminology for denormal, I think out of
               | confusion as to whether or not unnormal numbers were also
               | denormal (I have never heard the term so used, but I'm
               | also not a great connoisseur of non-IEEE 754 floating-
               | point).
               | 
               | There's a similar attempt to shift use of the term
               | "mantissa" to "significand", since a mantissa implies
               | that the number is in the range [0, 1), whereas the usual
               | implementation in a floating point type is to use the
               | range [1, base).
        
               | librasteve wrote:
               | this is wrong(!)
               | 
               | > A normalized floating-point number has mantissa be a
               | fixed number of digits for a given floating-point type.
               | 
               | well, no. in fact...
               | 
               | > A normalized [binary] floating-point number has
               | mantissa which is bit shifted so that the first bit to
               | the right of the binary point is '1'.
               | 
               | (not a great look in an expert explainer comment)
        
             | stephencanon wrote:
             | For IEEE 754-2019 basic formats, subnormal and denormal are
             | synonyms.
             | 
             | There are formats (like x87 80-bit as implemented in
              | pre-387 hardware [1]) that are not 754 basic formats which
             | admit non-normal encodings that are not subnormal (i.e. not
             | smaller in magnitude than the smallest normal number). Some
             | people used to call these "denormal" as well (even though
             | the x87 docs call them "unnormal"), which caused confusion.
             | "Subnormal" is unambiguous.
             | 
             | In decimal IEEE 754 formats there are also non-canonical
             | encodings, but these are neither subnormal nor denormal
             | (decimal encodings are not, in general, normalized the same
             | way binary floating-point is).
             | 
             | -----
             | 
              | [1] on 387 and later these are invalid encodings, and behave
             | like a signaling NaN.
        
           | titzer wrote:
           | > FP hardware usually special cases them to some slow but
           | correct silicon
           | 
           | This is basically only true on Intel chips. Subnormals are
           | not that slow on modern ARM or AMD hardware.
        
       | zokier wrote:
        | I was going to comment that floats have a nextUp operation which
        | allows iterating them without needing to go through
        | u32/from_bits, but apparently that feature is still behind a
        | feature gate in Rust:
       | 
       | https://github.com/rust-lang/rust/issues/91399
       | 
        | That is despite the fact that it is an IEEE-754 required
        | operation! C and friends have an equivalent available as
        | nextafter:
       | https://en.cppreference.com/w/c/numeric/math/nextafter
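        | 
        | For comparison, Python grew math.nextafter in 3.9, and for
        | positive finite doubles the from_bits-style trick gives the
        | same answer. A small sketch:
        | 
        |   import math, struct
        | 
        |   def next_up_bits(x):
        |       # positive finite doubles order like their bit patterns
        |       b = struct.unpack('<Q', struct.pack('<d', x))[0]
        |       return struct.unpack('<d', struct.pack('<Q', b + 1))[0]
        | 
        |   print(math.nextafter(1.0, math.inf))  # 1.0000000000000002
        |   print(next_up_bits(1.0))              # same value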
        
         | lifthrasiir wrote:
         | I was more surprised that the proposed method does not include
         | nextafter itself, which comes with its own distinct edge cases.
        
           | vlovich123 wrote:
           | Reasoning explained here: https://github.com/rust-
           | lang/rfcs/blob/master/text/3173-floa...
           | 
            | TLDR: no one really uses it; the spec is ambiguous and
            | deprecated; those that do use it seem to use it with
            | constants that are usually infinities, which is better
            | served and made clearer by next_up/next_down; and those that
            | don't use infinities tend to assume the range of the value
            | and pick a constant outside the range, which can be
            | dangerous if the range changes.
        
         | jcranmer wrote:
          | IEEE 754-1985 had a nextAfter function (which C implemented
          | as nextafter and nexttoward), but that was removed in IEEE
         | 754-2008 in favor of nextUp and nextDown, which C implemented
         | as nextup and nextdown, but only as of C23, which still hasn't
         | officially been published by ISO.
         | 
          | There's also another wrinkle: the next* functions go from
          | -subnormal to -0 to +subnormal, so they never give you an
          | iteration step for both -0 and +0, even though visiting both
          | zeros can be useful for some iteration purposes.
        
         | tialaramex wrote:
         | I think even with these operations, it's more intuitive to a
         | programmer to just try all the bit patterns using
          | f32::from_bits on integers, _especially_ if they aren't already
         | deeply familiar with how the IEEE representation works, which
         | seems like a pre-requisite for this experiment.
        
       | ryandrake wrote:
        | I thought it was well known among programmers that (at least
        | using IEEE floating point representation) half the set of
        | storable floats are between -1.0 and 1.0, 1/4 are between -inf
        | and -1.0 and another quarter are between 1.0 and inf.
        | 
        | One practical application of this is: if you have, say, a
        | percentage value, and fixed point is out of the question, and you
        | want to retain as much precision as possible and still use
        | floats, _don't_ store it in a float with range [0.0,100.0].
        | Store it with the range [0.0,1.0]. This is also why, if you're
        | dealing with angles, you should not store them in the range
        | [0.0,360.0), but instead store them either as radians [0,2π), or
        | better: [-π,π), or store them as [-1.0,1.0) and use trig routines
        | designed to work with that range.
       | 
       | I always thought this made intuitive sense when you understand
       | how the number is encoded. Then again, when I learned
       | programming, we weren't "allowed" to use floats until we
       | demonstrated that we understood how they were represented in
       | memory.
        
         | _aavaa_ wrote:
          | Depends on the field; in numerical simulations this is well
          | known, and a lot of effort goes into normalizing everything to
          | fit into that range to minimize numerical issues.
         | 
         | Many new programmers, and those that deal with languages such
         | as Python, don't really think about such things, treating
         | floats as mathematically precise.
        
         | lifthrasiir wrote:
          | More accurately speaking, reduce the total inaccuracy
          | contributed by all operations. The inaccuracy will be
          | (generally) proportional to the number of operations, but keep
          | in mind that some operations can be exceptionally inaccurate
          | due to, for example, catastrophic cancellation. So do not
          | normalize a percentage into [0,1] if the input was already in
          | [0,100] and an additional operation is needed for the
          | normalization. (By the way, cospi etc. counts as a single
          | operation here, which is why it is so good to have.)
        
         | jacobolus wrote:
         | What you are saying here is expressing some
         | misunderstandings/misconceptions, and may confuse readers.
         | 
         | There's no reason to prefer floating point values with any
         | particular exponent, as long as you are not getting too close
         | to the ends, which for double precision is roughly googol^3 or
          | 1/googol^3. (These numbers are _absurdly_ big/small, and you
         | are generally only going to get to them by multiplying a long
         | list of big or small numbers together; if you need to do that
         | you might need to occasionally renormalize the result and track
         | the exponent separately, or work with logarithms instead.) Even
         | for single precision, the exponent limits are about 10^38 or
         | 10^(-38), which is very very big.
         | 
          | > _want to retain as much precision as possible and still use
          | floats, don't store it in a float with range [0.0,100.0].
          | Store it with the range [0.0,1.0]_
         | 
         | This doesn't make sense to me. There are just as many floating
         | point numbers between 64 and 128 as there are between 0.5 and
         | 1.0, and the same number between 32 and 64, between 0.25 and
         | 0.5, etc. All you did in multiplying by a constant is twirl up
         | the mantissa bits and shift the exponent by ~7. Unless you care
         | about the precise rounding in the ~16th decimal digit, there is
         | limited practical difference. (Well, one tiny difference is you
         | are preventing some of the integer-valued percentages from
         | having an exact representation, if for some reason you care
         | about that. On the flip side, if you need to compose these
         | percentages or apply them to some other quantity the range 0-1
         | is generally more convenient because you won't have to do an
         | extra division by 100.)
         | 
          | > _if you're dealing with angles, you should not store them in
          | the range [0.0,360.0), but instead store them either as radians
          | [0,2π), or better: [-π,π), or store them as [-1.0,1.0) and use
          | trig routines designed to work with that range._
         | 
         | Floats from 0 to 360 is a perfectly fine representation for
         | angles, though you may want to use -180 to 180 if you want to
         | accumulate or compare many very small angles in either
         | direction, since there is much more precision near e.g.
         | -0.00001 than near 359.99999. (Of course, if whatever software
         | libraries you are using expect radians, it can be convenient to
         | use radians as a representation, but it won't be any more or
         | less precise.)
         | 
          | The reason pure mathematicians (and as a consequence most
          | scientists) use radians instead is that the trig functions
          | are easier to write down as power series and easier to do
          | calculus with (using pen and paper) when expressed in terms of
          | radians: it eliminates an annoying extra constant.
         | 
         | Using numbers in the range -1 to 1 can be more convenient than
          | radians mainly because π is not exactly representable in
         | floating point (it can sometimes be nice to get an exact answer
         | for arcsin(1) or the like), and because there are other
         | mathematical tools which are nice to express in the interval
         | [-1, 1].
         | 
         | Aside: If you are using your angles and trig functions for
         | doing geometry (rather than, say, approximating periodic
         | functions), let me instead recommend representing your angles
         | as a pair of numbers (cos a, sin a), and then using vector
         | algebra instead of trigonometry, ideally avoiding angle
         | measures altogether except at interfaces with people or code
         | expecting them. You'll save a lot of transcendental function
         | evaluations and your code will be easier to write and reason
         | about.
         | 
         | Aside #2: The biggest thing you should worry about with
         | floating point arithmetic is places in your code where two
         | nearly equal numbers get subtracted. This results in
         | "catastrophic cancellation" that can eat up most of your
         | precision. For example, you need to be careful when writing
         | code to find the roots of quadratic equations, and shouldn't
         | just naively use the "quadratic formula" or one of the two
         | roots will often be very imprecise.
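          | 
          | A small Python sketch of that last point (the usual fix:
          | compute the root that does not cancel, then recover the other
          | one from the product of the roots, c/a = r1*r2):
          | 
          |   import math
          | 
          |   def roots_naive(a, b, c):
          |       d = math.sqrt(b*b - 4*a*c)
          |       return (-b + d) / (2*a), (-b - d) / (2*a)
          | 
          |   def roots_stable(a, b, c):
          |       d = math.sqrt(b*b - 4*a*c)
          |       q = -0.5 * (b + math.copysign(d, b))
          |       return q / a, c / q
          | 
          |   # true roots are ~1e8 and ~1e-8; the naive small root
          |   # loses most of its digits to cancellation
          |   print(roots_naive(1.0, -1e8, 1.0))
          |   print(roots_stable(1.0, -1e8, 1.0))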
        
           | akira2501 wrote:
           | Accurate representation of a single quantity is one thing.
           | Doing several mathematical operations with that quantity
           | while _maintaining_ accuracy is another.
        
           | raphlinus wrote:
            | The quadratic solver implementation in kurbo[1] is designed
            | to be fast and reasonably precise for a wide range of inputs.
           | But for a definitive treatment of how to solve quadratic
           | equations, see "The Ins and Outs of Solving Quadratic
           | Equations with Floating-Point Arithmetic" by Goualard[2]. I
           | thought I understood the problem space pretty well, then I
           | came across that.
           | 
           | [1]: https://github.com/linebender/kurbo/blob/b5bd2aa856781c6
           | cf46...
           | 
           | [2]: https://cnrs.hal.science/hal-04116310/document
        
         | samatman wrote:
          | A stronger argument for storing 'percentages' as [0.0,1.0] can
          | be made; the precision involved is seldom a limiting factor.
         | 
         | It has to do with why I put percentage in scare quotes. A
         | percentage is a certain way of writing a fraction, one which is
         | more convenient in some cases for humans to work with.
         | Generally, whatever you're doing with that number is more
         | effectively done with the range [0.0,1.0], other than perhaps
         | printing it, which is trivial. Carrying it around inside the
         | calculation imposes a conceptual burden which is of no use in
         | implementing whatever problem one is trying to solve.
         | 
          | It's true that you avoid a strange discontinuity of gradient at
          | 1%, but it hardly matters; the real driving force here is that
          | any practical use of 60% is realized using 0.6. So if you're
          | going to divide your 'percentage' by 100 in order to do N * M%,
          | just do it when you read in the number and get it out of the
          | way.
        
         | ealloc wrote:
         | > [if] you want to retain as much precision as possible and
         | still use floats, don't store it in a float with range
         | [0.0,100.0]. Store it with the range [0.0,1.0].
         | 
         | I just tested this out and it doesn't seem true.
         | 
         | The two storing methods seem similarly precise over most of the
         | range of fractions [0,1], sometimes one gives lower spacing,
         | sometimes the other. For instance, for fractions from 0.5 to
         | 0.638 we get smaller spacing if using [0,100], but for 0.638 to
         | 1 the spacing is smaller if storing in [0,1].
         | 
          | For very small fractions (< 1e-38), it also seems more accurate
          | to store in the range [0,100], since you are representing
          | smaller numbers with the same bit patterns. The smallest
          | nonzero positive float32 is 1.40129846e-45, so if you store the
          | fraction directly in the range [0,1] that's the smallest
          | possible representable fraction, but if you store it scaled
          | into the range [0,100], that same float actually represents a
          | fraction of 1.40129846e-47, which is smaller.
         | 
          | For the general result, see for yourself in python/numpy:
          | 
          |   import numpy as np
          |   import matplotlib.pyplot as plt
          | 
          |   x = np.linspace(0, 1, 10000)
          |   # spacing when the fraction is stored as [0, 100]
          |   plt.plot(x, np.float64(np.spacing(np.float32(x*100)))/100)
          |   # spacing when the fraction is stored as [0, 1]
          |   plt.plot(x, np.float64(np.spacing(np.float32(x))))
          |   plt.show()
        
       | dirtdobber wrote:
       | This is pretty easy to verify by just knowing how 32 bit floats
       | are represented.
       | 
        | A 32-bit float is represented as:
        | 
        | - 1 bit = sign bit (0 for positive numbers)
        | 
        | - 8 bits = biased exponent; the stored value is in [0, 255].
        | Note that the stored value 255 is special and reserved for
        | infinity/NaN, and 0 is reserved for zero and the subnormals.
        | 
        | - 23 bits = mantissa
        | 
        | Numbers >= 1.0 ==> (sign = 0, 127 <= stored exponent <= 254, any
        | mantissa). That's (2^23 * 128) + 1 (to include infinity). This
        | comes out to exactly 1,073,741,825 for numbers in the range
        | [1.0, +inf].
        | 
        | Numbers < 1.0 and > 0 ==> (sign = 0, stored exponent <= 126, any
        | mantissa, minus 1 for zero itself). That is 2^23 * 127 - 1. This
        | comes out to exactly 1,065,353,215 for numbers in the range
        | (0.0, 1.0).
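        | 
        | A quick way to double-check those counts (a Python sketch;
        | positive IEEE 754 floats sort in the same order as their bit
        | patterns, so counting a range is just a subtraction):
        | 
        |   import struct
        | 
        |   def f32_bits(x):
        |       return struct.unpack('<I', struct.pack('<f', x))[0]
        | 
        |   one = f32_bits(1.0)           # 0x3f800000
        |   inf = f32_bits(float('inf'))  # 0x7f800000
        | 
        |   print(one - 1)        # 1,065,353,215 floats in (0.0, 1.0)
        |   print(inf - one + 1)  # 1,073,741,825 floats in [1.0, +inf]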
        
       | chillingeffect wrote:
        | Tiny caveat: this only applies to float formats which include
        | subnormals, such as IEEE 754. Other systems may or may not
        | include subnormals.
        
         | samatman wrote:
         | I think we've reached the point where saying "float" should
         | mean IEEE754, and the burden of being more specific falls on
         | those working with various legacy formats.
        
       | bell-cot wrote:
       | Don't recall where I heard it, but: "Representing numbers with
       | floating point is somewhere between lossy compression and a crap
       | hash algorithm."
        
         | jacobolus wrote:
         | This seems excessively glib. In my view floating point numbers
         | are one of the most valuable and significant innovations in the
         | history of science, and the widespread availability of
         | IEEE-754-compliant float hardware is one of the best things
         | about computers. We all owe an enormous debt of gratitude to
         | the folks who spent a ton of (technical and political) work to
         | make it happen.
         | 
         | Instead I would say: numerical analysis is hard irrespective of
         | the number representation, and we should try to spread at least
         | basic knowledge about it so people can accomplish their goals
         | while avoiding some of the pitfalls.
        
           | jjmarr wrote:
           | "Floating points are the worst method of representing
           | rational numbers, except for all the others." --Winston
           | Churchill probably
        
             | lmm wrote:
             | Nah, arbitrary precision rationals are much better if you
             | actually only need to represent rationals. And frankly even
             | for reals, decimal types are better for most use cases.
        
           | bell-cot wrote:
           | Motor vehicles are one of the most valuable and significant
           | innovations in history, etc. etc.
           | 
           | But that doesn't make them safe.
        
       | peterhull90 wrote:
        | In the HN comment that the article discusses [0], is the
        | conclusion that commenter a1369209993 is correct (there are as
        | many floats between 0 & 1 as between 1 & +INF) and llm_trw is
        | not correct? I got a bit confused.
       | 
        | Also, the article links to a blog by Daniel Lemire [1] in which
        | he says (with regard to producing an unbiased random float) that
        | the idea that "picking an integer in [0,2^32) at random and
        | dividing it by 2^32 was equivalent to picking a number at random
        | in [0,1)" is incorrect, and that there is a ratio of up to 257:1
        | in the distribution so obtained. Not wanting to disagree with
        | Daniel Lemire, but I can't see why, and a quick experiment in
        | Python didn't give this ratio.
       | 
       | [0]: https://news.ycombinator.com/item?id=41112688
       | 
       | [1]: https://lemire.me/blog/2017/02/28/how-many-floating-point-
       | nu...
        
         | kccqzy wrote:
          | The blog post explained it perfectly. There are 2^32 integers
          | when you pick from [0,2^32). But there are 0x3f800000 floating
          | point numbers in [0,1). And the former number is not divisible
          | by the latter number. Therefore using division by 2^32 cannot
          | be unbiased.
         | 
          | It's helpful if you first look at smaller examples. If we were
          | to generate random integers in [0,10) by first generating
          | random integers in [0,50) and then dividing by 5, that's valid.
          | Exactly 5 numbers get mapped to each output: the numbers [0,5)
          | get mapped to 0, [5,10) get mapped to 1, etc. But what if you
          | instead want to get numbers in [0,3)? If you do the same
          | division trick, the probability of the number 2 appearing is
          | less than that of 0 or 1.
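          | 
          | A tiny sketch of that last case in Python:
          | 
          |   from collections import Counter
          | 
          |   # "same division trick" for [0,3): divide [0,50) by 17
          |   print(Counter(n // 17 for n in range(50)))
          |   # counts come out 17, 17, 16 -> 2 is less likely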
        
       | mbauman wrote:
        | So where is the even split point? It's (0, 1.5) and (1.5, Inf)!
        | There are `2^62-2^51-1` values in both those sets. Or
        | `2^30-2^22-1` for 32 bit.
        | 
        | Or perhaps even more interestingly, there's the same number --
        | 2^62 or 2^30 -- in [0, 2) as in [1, Inf).
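        | 
        | A quick bit-pattern check of the 32-bit numbers, as a Python
        | sketch:
        | 
        |   import struct
        | 
        |   def f32_bits(x):
        |       return struct.unpack('<I', struct.pack('<f', x))[0]
        | 
        |   inf = f32_bits(float('inf'))
        |   # open intervals (0, 1.5) and (1.5, Inf)
        |   print(f32_bits(1.5) - 1)        # 2**30 - 2**22 - 1
        |   print(inf - f32_bits(1.5) - 1)  # 2**30 - 2**22 - 1
        |   # half-open [0, 2) and [1, Inf)
        |   print(f32_bits(2.0))            # 2**30
        |   print(inf - f32_bits(1.0))      # 2**30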
        
       | TZubiri wrote:
       | This is why programmers using percentages in code is a pet peeve
       | of mine.
       | 
        | Applying a 15% discount by multiplying by 15 and dividing by 100
        | is unhinged to me.
        
       ___________________________________________________________________
       (page generated 2024-08-29 23:00 UTC)