[HN Gopher] Everything I know about the fast inverse square root...
___________________________________________________________________
Everything I know about the fast inverse square root algorithm
Author : atan2
Score : 91 points
Date : 2024-06-01 11:16 UTC (1 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| rogerallen wrote:
| If you're interested, Wikipedia also has a decent discussion of
| this function and it's history.
| https://en.wikipedia.org/wiki/Fast_inverse_square_root
| schmorptron wrote:
| I really liked this video about it when I saw it a while ago:
| https://www.youtube.com/watch?v=p8u_k2LIZyo
| jxyxfinite wrote:
| Did John not write the code? Or did he copy paste it from
| somewhere?
| jb1991 wrote:
| wikipedia says:
|
| > Brian Hook may have brought the algorithm from 3dfx to id
| Software.
| Rinzler89 wrote:
| And at 3dfx he probably learned it from some ex-SGI guy, and
| that guy learned it doing his PhD at Stanford from someone
| who worked for ed Ed Catmull and so on. The lore goes deep
| with stuff like this. There's rarely just one author but many
| who improve the formula over time.
| fragmede wrote:
| there's an unrecognized genius out there that realized you
| can just apply a bit flip to get that out there. That John
| Carmack hasn't claimed it as his invention when he could
| have, speaks volumes about his character.
| Rinzler89 wrote:
| Except he couldn't have even if he wanted to. There's
| always the risk some former 3dfx or SGI graybeard comes
| out with some paper binder with the original
| implementation, and calls you out on your bullshit if you
| do. When you're such a famous public figure no way you
| ever risk lying in public. Even if he were to getaway
| with it there was nothing for him to gain: he already has
| all the fame and money from honest work, there's no point
| risking it all to claim some piece of code.
| kqr wrote:
| Then you call it a parallel discovery!
| Waterluvian wrote:
| I'm not sure anyone should be impressed by someone not
| stealing even if they had a chance to. That's just a
| basic human expectation.
| tsuica wrote:
| You'd think so.
| wongarsu wrote:
| The origin is really unclear. It's not certain which id
| employee added it to Quake 3 and where they got it from. John
| claims it wasn't him. Gary Tarolli, one of the founders of
| 3dfx, claims to remember tweaking the value of the hex constant
| to the value known today. But he says he didn't come up with
| the algorithm either. Maybe somebody else at 3dfx did, maybe he
| got it from somewhere else.
| johndough wrote:
| If your computer was built after 1999, it probably supports the
| SSE instruction set. It contains the _mm_rsqrt_ps instruction,
| which is faster and will give you four reciprocal square roots at
| once: https://www.intel.com/content/www/us/en/docs/intrinsics-
| guid...
|
| That being said, the techniques discussed here are not totally
| irrelevant (yet). There still exists some hardware with fast
| instructions for float/int conversion, but lacking rsqrt, sqrt,
| pow, log instructions, which can all be approximated with this
| nice trick.
| casey2 wrote:
| He posts c code and says that it computes the inverse square
| root, but the code he posted is undefined c, so while it could
| compute the inverse square root of it's input it could also do a
| long list of other stuff.
|
| One of the many reasons you should stay away from real world
| languages when talking about algorithms unless you are an expert
| in the language.
| tadfisher wrote:
| The code he posted is straight from the Quake 3 source release,
| a program which has certainly been installed and used on
| hundreds on millions of real computers. UB pedantry makes no
| difference to the actual behavior of the program as
| demonstrated.
| wizzwizz4 wrote:
| If we're talking _real-world_ languages, the code is perfectly
| well-defined. It 's undefined behaviour in Abstract C, the C of
| the Specification, which to my knowledge nobody has ever
| implemented, but Quake 3 was not written in Abstract C. It was
| written in Visual C++ 2003, gcc 2.95, and some other specific C
| implementation.
|
| The code snippet that the article claims is C:
| int32_t compute_magic(void) { double sigma = 0.0450465;
| double expression = 1.5 * pow(2.0, 23.0) * (127.0 - sigma);
| int32_t i = expression; return i; }
|
| which, as far as I can tell, is perfectly well-defined Abstract
| C.
| TillE wrote:
| Specifically, I think the original code will always do what
| you expect as long as sizeof(long) == sizeof(float), and the
| alignment is the same. In reality no compiler is gonna do
| weird stuff unless your hardware target is weird.
| cgrealy wrote:
| Minor correction: Q3 came out in 99, so it was probably
| written in either VC++ 97 or VC6.
|
| Otherwise, carry on.
| atleastoptimal wrote:
| Something interesting is I've seen this mythologized as evidenced
| of the "cracked engineer" theory, wherein random engineers will
| accidentally stumble upon incredibly complex discoveries in the
| course of their day to day work and expect no fanfare for this,
| besting the scientists and researchers who take the traditional
| route but aren't spurred by necessity. People, like xkcd,
| purported this was either Carmack or some random employee who
| figured it out in a few hours when stuck on a problem then
| waltzed onto the next one. In truth, as noted in the document, it
| has a history that goes back decades in academia, this was simply
| the most notable time it was implemented.
|
| I think it speaks to an issue I see in the software engineering
| world where people assume that collaboration is for low-IQ people
| and all great innovation comes from some super-genius working on
| their own for long enough, and isn't required to share how they
| arrived at their findings. I'm sure the mythology of this
| algorithm is propelled somewhat by the enigmatic character of
| writing "what the fuck" next to adding the constant, implying a
| mystical element to its utility that was arrived at without
| needing to clarify it in some long boring research paper.
| galangalalgol wrote:
| I think the xkcd comic was more about academia vs product
| engineering in for profit entities. If anything, it is the
| collaborative effort of many people working on a product with
| deadlines that ends up with some of them coming up with really
| interesting solutions just due to the number of people looking.
| The lack of sharing between companies does increase the
| incidence of the outsider effect, at the massive cost of
| reinventing basic knowledge over and over. The only reason
| breakthroughs still happen despite that is just the large
| number of people trying random stuff that probably won't work
| motivated by unrealistic deadlines. They don't know it probably
| won't work though...
| fragmede wrote:
| https://xkcd.com/664/ for the curious.
| xeonmc wrote:
| Relevant topic: https://news.ycombinator.com/item?id=40485313
| hypeatei wrote:
| > where people assume that collaboration is for low-IQ people
|
| I've also seen a phenomenon where developers believe their code
| is so valuable and amazing that they don't share it; it's
| really ordinary code but useful to novices nonetheless. Some
| communities I participated in when I was younger (e.g. PS3
| jailbreak scene) were very protective of their code for no
| reason. Executables were obfuscated, nothing was open source,
| and developers were very hostile or intentionally trolled you
| when asking questions about their software.
| layer8 wrote:
| What's up with the diagonal lines in the graph backgrounds?
| zzo38computer wrote:
| I wrote a implementation in MMIX: %
| Constants FISRCON GREG #5FE6EB50C7B537A9 THREHAF GREG
| #3FF8000000000000 % Save half of the original
| number OR $2,$0,0 INCH $2,#FFF0
| % Bit level hacking SRU $1,$0,1 SUBU
| $0,FISRCON,$1 % First iteration FMUL
| $1,$2,$0 FMUL $1,$1,$0 FSUB
| $1,THREHAF,$1 FMUL $0,$0,$1 % Second
| iteration FMUL $1,$2,$0 FMUL $1,$1,$0
| FSUB $1,THREHAF,$1 FMUL $0,$0,$1
|
| This implementation makes an assumption that the original number
| is greater than 2^-1021.
| DrNosferatu wrote:
| Note that you're actually using 3 distinct magic numbers - not
| only the single 0x5f3759df, but also 0.5 and 1.5;
|
| So for further accuracy we can instead have:
|
| conv.i = 0x5F1FFFF9 - ( conv.i >> 1 ); conv.f _= 0.703952253f_ (
| 2.38924456f - x * conv.f * conv.f ); return conv.f;
|
| [from Wikipedia]
| ncruces wrote:
| I collected a few of these here:
|
| https://github.com/ncruces/fastmath/blob/main/fast.go
|
| Also see this StackOverflow:
|
| https://stackoverflow.com/questions/32042673/optimized-low-a...
___________________________________________________________________
(page generated 2024-06-02 23:00 UTC)