[HN Gopher] Ideal divisors: when a division compiles down to jus...
___________________________________________________________________
Ideal divisors: when a division compiles down to just a
multiplication
Author : chmaynard
Score : 85 points
Date : 2021-04-28 17:18 UTC (5 hours ago)
(HTM) web link (lemire.me)
(TXT) w3m dump (lemire.me)
| whalesalad wrote:
| I have been seeing this Wordpress theme a lot lately and it is so
| bad the way the content is pushed so far to the right. On my 27"
| monitor the left margin of the text is essentially the center of
| the display.
| jlarocco wrote:
| I have a 27" monitor also, and it looks fine to me.
| tachyonbeam wrote:
| 27" monitor on Chrome here. Looks pretty far to the right.
| tzs wrote:
| How about if the browser window is maximized?
|
| What I get in that case is 23 cm for the darkish blue left
| sidebar and 36 cm for the light blue right area that contains
| the article. Within that right area, the article is in 19 cm
| white rectangle with just under 2 cm margins.
|
| So, starting from the left here is what I see:
|
| 15 cm empty dark blue background,
|
| 6 cm of sidebar text on dark blue background,
|
| 2 more cm of empty dark blue background,
|
| 2 cm of empty light blue background,
|
| A tad under 2 cm of empty white background,
|
| About 14.5 cm of text on white background,
|
| A tad under 2 cm of empty white background,
|
| 15.5 cm of empty light blue background.
|
| It really is quite ugly. It looks like something I would do
| because I completely suck at CSS.
| dhosek wrote:
| I almost never1 run a browser maximized on anything bigger
| than my 16" laptop screen (and even that not so much). Most
| of the time, I have browser windows that are 1/2 (on my 24"
| monitor) or 1/3 (on my ultrawide monitor) of the width and
| the full height (2/3 width on my laptop)).
|
| 1 The exception being browser windows to run sites like
| Netflix where I'm really using it as a video interface.
| tyingq wrote:
| One thing that often works in this situation with chrome is the
| "Emulate CSS Media Type: Print" option from the rendering tab
| in Dev Tools. The theme appears to be the wordpress default
| theme from 2015, so you should see less of it as times goes on.
| omgJustTest wrote:
| Floating point, while convenient does have substantial drawbacks
| in the implementations of circuits that we have today. Efficiency
| gains like this somewhat expose facets that are normally hidden.
| It will be interesting to see the transition to either analog NN
| or quantum computers, which inherently have better
| representations of continuous values.
|
| Edit: Additionally the serial vs parallel processing paradigm
| shift is a main advantage of the switch too! Queued systems seem
| to have severe scheduling limitations, which the real-world does
| not always inherently have.
| svat wrote:
| The linked article is about integer division and has nothing to
| do with floating point, and in fact I can't see how this idea
| could be applied to floating-point arithmetic. Can it?
| wizzwizz4 wrote:
| Not directly, no. The underlying mathematics _kind of_ can,
| but you 'd need dedicated circuitry (or bit-hacking) for it
| because of the floating point parts, and it would fall over
| around subnormals.
| andrewfromx wrote:
| i'd like to find these numbers for base9 vs base10. see
| https://stackoverflow.com/questions/67226727/is-there-a-way-...
| if you look at that sample golang code, is that the right way to
| go about this? Or is there a much simpler way already built into
| a high level language like go?
| [deleted]
| jvickers wrote:
| In JavaScript (specifically node / V8) if dividing multiple
| values by the same divisor, is it fastest to calculate the
| reciprocal and then multiply it?
|
| How about other languages (and compiler systems)?
|
| Are there binary tricks worth knowing about to quickly calculate
| reciprocals?
| brrrrrm wrote:
| It's typically faster to calculate a reciprocal (especially at
| compile time) and then multiply it, but this is often at the
| expense of precision. In C++, for example, the -O3 flag on gcc
| will still emit a division (divss) to preserve compliant
| floating point precision in this example:
| https://godbolt.org/z/PY3Wjno4P
|
| However, when we enable `-ffast-math`, we permit the compiler
| to do a bit more aggressive optimization in this domain
| (trading off precision), and we see multiplication (mulss) is
| emitted: https://godbolt.org/z/KEb1nc43z
| svat wrote:
| If you've never seen it before, this is a beautiful blog post on
| related matters: "Optimising pointer subtraction with 2-adic
| integers" http://blog.sigfpe.com/2010/05/optimising-pointer-
| subtractio...
| kevinventullo wrote:
| A few months back I wrote a blog post very much inspired by
| Dan's post, though a little more academic:
| https://kevinventullo.com/2020/12/21/2-adic-logarithms-and-f...
|
| The idea is to exploit deeper structures in p-adic number
| theory to develop useful and novel algorithms with unsigned
| integer arithmetic, e.g. fast integer exponentiation. Feedback
| welcome!
|
| (Warning: math heavy)
| thechao wrote:
| > I don't know how gcc generates its approximate 2-adic
| reciprocals. Possibly it uses something based on the Euclidean
| GCD algorithm. I wasn't able to find the precise line of source
| in a reasonable time.
|
| It's 0x1FFF...F / 7 = 0x249...249249249.
|
| Just about the oddest thing I've seen; I guess, if I were the
| author, it'd be my "1 in 10000" day?
| xyzzy_plugh wrote:
| Makes more sense in binary: 0xFFF..F
| 111111111111111111111111 0x7
| 000000000000000000000111 0x249..9
| 001001001001001001001001
| tragomaskhalos wrote:
| Genuine question: why would any compiler bother to optimise
| division with such obtuse dividends, other than a purists'
| satisfaction ?
| warkdarrior wrote:
| Division is orders of magnitude slower than multiplication.
| Tuna-Fish wrote:
| Less than one order of magnitude these days. On Tiger Lake,
| integer multiply is 3 cycles, while integer division is 12
| for 32-bit integers and 15 for 64-bit.
|
| Of course, the multiply is fully pipelined while the division
| is only partially, but even that leaves only a 10-fold
| difference.
|
| Division is much faster these days than it used to be. It's
| still worth it to have compiler optimizations for it, but
| it's not important to avoid it anymore.
| gowld wrote:
| Why do some people assume decimal ten as the base of the
| orders of magnitude when discussing binary arithmetic?
| cma wrote:
| Timespan/cycle count is almost always denoted in base 10
| in conversation.
| Fordec wrote:
| > Less than one order of magnitude these days.
|
| > only a 10-fold difference.
|
| A ten fold difference is _literally_ one order of
| magnitude. Which is what the post you 're replying to
| factually stated.
| cma wrote:
| orders usually means 2 or more
| jbay808 wrote:
| Embedded systems programmers don't always have such luxury!
| martincmartin wrote:
| They're often special cases of general algorithms. You can do
| many divisions as combinations of multiplications & shifts; in
| this case, the resulting code simplifies down to the smallest
| possible.
| Denvercoder9 wrote:
| There's no specific optimisation for these specific numbers.
| There's a more general optimisation to turn division by a fixed
| number into a multiplication and shift where possible (because
| division is a _lot_ slower than multiplication), and there is
| another optimisation that eliminates useless shifts. Combined
| they yield this result for these specific numbers.
| joe_the_user wrote:
| Yeah, a more detailed description of this general method
| seems missing from the article. Anyone care to provide it?
| chrchang523 wrote:
| See the "Labor of Division" blog posts by the author of
| libdivide:
|
| https://ridiculousfish.com/blog/posts/labor-of-division-
| epis...
|
| https://ridiculousfish.com/blog/posts/labor-of-division-
| epis...
| joe_the_user wrote:
| So basically calculating the multiplicative inverse of a
| number and storing it as something like a float (digits +
| exponent) with the digits and exponent stored separately
| to avoid overflow, underflow, accuracy issues.
|
| And it's faster if and only if you're repeating division
| by a single number since you're still doing the
| equivalent of divide to get the invest (but repeated
| division is actually quite common so this is useful).
| pkaye wrote:
| The book "Hackers Delight" goes over a lot of these kinds
| of software algorithms for integers.
| layoutIfNeeded wrote:
| >I call the corresponding divisors ideal
|
| Not a particularly good name, as "ideal" already has an
| established meaning in algebra:
| https://en.m.wikipedia.org/wiki/Ideal_(ring_theory)
| gowld wrote:
| It's not an ideal name, even though it's ideal.
| withinboredom wrote:
| This is computer science, not algebra.
| Tainnor wrote:
| algebra plays a huge role in CS, e.g. cryptography
| chrchang523 wrote:
| See also https://libdivide.com/ , when you need to repeatedly
| divide by a constant that isn't known until runtime.
___________________________________________________________________
(page generated 2021-04-28 23:00 UTC)