[HN Gopher] Why the CORDIC algorithm lives rent-free in my head
       ___________________________________________________________________
        
       Why the CORDIC algorithm lives rent-free in my head
        
       Author : todsacerdoti
       Score  : 267 points
       Date   : 2024-05-11 07:18 UTC (15 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | 1oooqooq wrote:
       | so, it's a https://en.wikipedia.org/wiki/Farey_sequence (aka
       | mediant, aka naive fraction sum for approximation) but for
       | angles?
        
         | kevindamm wrote:
         | More like a binary search, but instead of adjusting an array
         | index you're iterating an orthonormal matrix which represents
         | rotation about the origin.
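That binary-search-like iteration can be sketched in a few lines of Python (an editorial illustration, not code from the article): each step applies a shift-add rotation by ±atan(2^-i), choosing the sign that drives the remaining angle toward zero.

```python
import math

def cordic_sin_cos(theta, iterations=32):
    """Rotation-mode CORDIC sketch: start at (1, 0) and apply a fixed
    sequence of shrinking rotations atan(2^-i), picking each sign like
    a binary search on the remaining angle. Valid for |theta| <~ 1.74."""
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    x, y, z = 1.0, 0.0, theta
    for i, a in enumerate(angles):
        sigma = 1.0 if z >= 0 else -1.0
        d = 2.0 ** -i
        # Shift-add rotation; in fixed-point hardware the *d is a shift.
        x, y = x - sigma * y * d, y + sigma * x * d
        z -= sigma * a
    # Undo the constant gain accumulated by the un-normalized rotations.
    gain = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i))
                     for i in range(iterations))
    return y / gain, x / gain  # (sin(theta), cos(theta))
```

In fixed-point hardware the multiply by 2^-i becomes a right shift, which is why the loop needs no multiplier at all.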
        
           | tapatio wrote:
           | Or a second order step response.
        
         | nico wrote:
         | That's super cool
         | 
         | Thank you for that link, very interesting diagrams and
         | structures
         | 
         | They look very similar to what a neural network might be in
         | certain cases, and it seems like the structures are trying to
         | map some sort of duality
        
         | ykonstant wrote:
         | It is not, because the subdivisions in CORDIC do not possess
         | the best approximation properties of Farey fractions. For that,
         | you would have to partition the circle into major/minor arcs
         | instead, in the sense of Hardy-Littlewood's circle method. But
         | that would be computationally very expensive, whereas this
         | binary subdivision is fast, and is made faster using the shift-
         | add tricks.
        
           | 1oooqooq wrote:
           | i see. thanks. assumed it was just bit shift tricks
           | implementation. but indeed, no mediant.
        
       | nico wrote:
       | Cool algorithm. Might come in handy soon for running neural
       | networks on less powerful hardware :)
        
         | KeplerBoy wrote:
         | Neural networks usually don't use trigonometry or complex
         | numbers.
        
           | nico wrote:
           | From the neural network Wikipedia page:
           | 
           | > The signal each neuron outputs is calculated from this
           | number, according to its activation function
           | 
           | The activation function is usually a sigmoid, which is also
           | usually defined in terms of trigonometric functions
           | 
           | Which neural networks don't use trigonometric functions or
           | equivalent?
        
             | KeplerBoy wrote:
             | Most activations functions are based on exponential
             | functions.
             | 
             | I don't immediately see how cordic would help you calculate
             | e^(x) unless x is complex valued (which it is usually not).
        
               | adgjlsfhk1 wrote:
               | see section B of https://eprints.soton.ac.uk/267873/1/tca
               | s1_cordic_review.pdf. Generalized cordic can compute
               | hyperbolic trig functions which gives you exponentials.
               | That said, it's still not useful for ML because it's
               | pretty hard to beat polynomials and tables.
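The hyperbolic mode described in that paper can be sketched as follows (an illustrative Python version, assuming the standard repeated-iteration schedule at i = 4, 13, 40, ...): it accumulates cosh(t) and sinh(t), whose sum is e^t.

```python
import math

def cordic_exp(t, n=30):
    """Hyperbolic rotation-mode CORDIC sketch: steps of atanh(2^-i)
    accumulate cosh(t) and sinh(t); their sum is e^t. Iterations
    4, 13, 40, ... must be repeated for convergence. Valid for
    |t| <~ 1.1; larger arguments need range reduction first."""
    schedule, repeat = [], 4
    for i in range(1, n + 1):
        schedule.append(i)
        if i == repeat:            # classic repeated-iteration rule
            schedule.append(i)
            repeat = 3 * repeat + 1
    gain = math.prod(math.sqrt(1.0 - 2.0 ** (-2 * i)) for i in schedule)
    x, y, z = 1.0 / gain, 0.0, t   # pre-scale so the gain cancels
    for i in schedule:
        sigma = 1.0 if z >= 0 else -1.0
        d = 2.0 ** -i
        x, y = x + sigma * y * d, y + sigma * x * d
        z -= sigma * math.atanh(d)
    return x + y                   # cosh(t) + sinh(t) = e^t
```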
        
               | KeplerBoy wrote:
               | Interesting, thanks.
        
             | PeterisP wrote:
             | In current neural networks the activation function usually
             | is not sigmoid but something like ReLU ( y = 0 if x<0 else
             | x ), and in any case the computation of activations is not
             | meaningful part of total compute - for non-tiny networks,
              | almost all the effort is spent on the large matrix
             | multiplications of the large layers before the activation
             | function, and as the network size grows, they become even
             | less relevant, as the amount of activation calculations
             | grows linearly with layer size but the core computation
             | grows superlinearly (n^2.8 perhaps?).
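As a back-of-the-envelope sketch of that point (my illustration, not from the comment): a layer of width n spends on the order of n^2 multiply-adds in the matvec but only n comparisons in the ReLU.

```python
def dense_relu_layer(x, W, b):
    """One dense layer plus ReLU. The matvec costs len(W) * len(x)
    multiply-adds; the ReLU pass costs only len(W) comparisons, so
    its share of the work shrinks as the layer widens."""
    y = [sum(wij * xj for wij, xj in zip(row, x)) + bi
         for row, bi in zip(W, b)]
    return [v if v > 0 else 0.0 for v in y]  # ReLU: max(0, v)
```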
        
         | SJC_Hacker wrote:
         | The primary advantage of CORDIC is that it saves space in the
         | design.
         | 
         | Lookup tables with interpolation are much faster
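A minimal sketch of the table-plus-interpolation approach (the 256-entry size and the linear scheme are illustrative choices, not from the comment):

```python
import math

# Quarter-wave sine table; symmetry extends it to the full circle.
TABLE_SIZE = 256
SIN_TABLE = [math.sin(i * (math.pi / 2) / TABLE_SIZE)
             for i in range(TABLE_SIZE + 1)]

def sin_lut(theta):
    """Approximate sin(theta) for theta in [0, pi/2]: index into the
    table and interpolate linearly between adjacent entries."""
    pos = theta / (math.pi / 2) * TABLE_SIZE
    i = min(int(pos), TABLE_SIZE - 1)
    frac = pos - i
    return SIN_TABLE[i] + frac * (SIN_TABLE[i + 1] - SIN_TABLE[i])
```

The trade-off is exactly the one named above: a lookup costs one memory access and a couple of arithmetic ops, but the table occupies storage that CORDIC's handful of constants does not.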
        
           | nimish wrote:
           | Polynomial interpolation for sin/cos is even better when you
           | have access to an FPU: https://gist.github.com/publik-
           | void/067f7f2fef32dbe5c27d6e21...
        
       | sunpazed wrote:
       | I have a few vintage programmable HP calculators that implement
       | CORDIC on their 4-bit CPUs. You can also program them to
       | calculate the Taylor expansion of sin() as another approximate
       | method.
       | 
        | If you liked this, it's worth reading Donald Knuth's seminal
        | "The Art of Computer Programming", which explains a number of
        | mathematical algorithms by example.
        
       | aftbit wrote:
        | > Now, it's fairly obvious that rotating by say 22.75° is the
        | same as rotating by 45° and then -22.5° - i.e. we can break up a
       | rotation into smaller parts, with both positive and negative
       | components.
       | 
       | Wouldn't that be rotating by 22.5deg? Is this an error in the
       | article or my understanding?
        
         | AlexSW wrote:
         | An error in the article.
        
           | aftbit wrote:
           | I made a PR
           | 
           | https://github.com/francisrstokes/githublog/pull/7
        
       | blowski wrote:
       | "Lives rent-free in my head" is a horrible cliche. By now, it has
       | the same effect as saying "it's nice" but uses lots more letters.
        
         | rnmmrnm wrote:
         | I didn't finish yet but I was thinking he's talking about
         | memorizing the algorithm to calculate sines in your head.
        
         | Der_Einzige wrote:
         | It's the best kind of cliche, it's Marxist in nature. Left wing
         | radicals know how to meme!
        
           | __MatrixMan__ wrote:
           | What kind of Marxist keeps their ideas in line by collecting
           | rent from some of them but not others?
        
         | junon wrote:
         | And what does this pedantry solve, exactly? It clearly is
         | something the author thinks often enough about and enjoys
         | talking about.
         | 
         | Let people enjoy things.
        
         | wodenokoto wrote:
         | I never liked that phrase either. So what exactly lives in your
         | head and pays rent?
        
           | sublinear wrote:
           | Knowledge work
        
           | __MatrixMan__ wrote:
           | Kubernetes, which I would evict if it stopped paying rent.
        
           | pynappo wrote:
           | the "rent" in the saying is the effort to learn and upkeep a
           | certain piece of knowledge in your head, to the point where
           | you can recall it as needed
           | 
           | living rent free generally means that you are able to easily
           | recall a piece of knowledge (and often do, even if you don't
           | want to).
           | 
           | something that pays rent might be like an important concept
           | for an upcoming school test, for a class you're not
           | interested in.
        
             | strken wrote:
             | I always thought rent was utility, and a piece of knowledge
             | or an opinion that didn't pay rent was one that provided
             | little or no practical benefit to you.
             | 
             | The term in its original use described an obsession with a
             | particular group or thing and called it out as pointless. I
             | don't think it would make sense for rent to mean effort in
             | this original sense, nor in the broader way it's used here.
        
         | alexjplant wrote:
         | > "Lives rent-free in my head" is a horrible cliche.
         | 
         | Yes it is, much like "dumpster fire" or "toxic" or
         | "gaslighting" or any other contemporary Reddit-isms that are
         | associated with hyper-reductionist cultural hot takes. My
         | personal distaste for them, however, has no bearing on the fact
         | that they have very real meaning and people enjoy using them.
         | 
         | > By now, it has the same effect as saying "it's nice" but uses
         | lots more letters.
         | 
         | The commonly-held meaning of this is that it describes a
         | pervasive, intrusive thought, not a nice one. This particular
         | turn of phrase is often used to describe how politicians devote
         | energy to talking about other politicians that they don't like,
         | for instance.
        
         | B1FF_PSUVM wrote:
         | I take it as short for the "why are you still carrying her?"
         | parable https://medium.com/@soninilucas/two-monks-and-a-woman-
         | zen-st... - something you should have let go and forgotten.
         | 
         | Doesn't seem to be what the author meant in this case.
        
           | magicalhippo wrote:
           | Having never heard the expression before, I thought it meant
           | it wouldn't be forgotten, like learning to ride a bicycle
           | kinda thing.
        
         | beefnugs wrote:
          | I thought it was more often "negative", like you can't stop
          | thinking about some dumbass having a disproportionate effect
          | on the world
        
       | brcmthrowaway wrote:
       | This used to be all the rage when folks were into DSP
        
         | tapatio wrote:
         | cos + jsin baby.
        
       | apitman wrote:
       | While the author mentions this is mostly applicable to things
       | like FPGAs, there's also an application in gamedev (or any
       | distributed physics simulation). Floating point calculations are
       | tricky to get deterministic across platforms[0]. One solution is
       | to skip floats entirely and implement a fixed-point physics
       | engine. You'd need something like CORDIC to implement all the
       | trig functionality.
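For context, a deterministic fixed-point engine builds on integer-only helpers like these (a hypothetical Q16.16 sketch, not the commenter's actual project):

```python
# Hypothetical Q16.16 fixed-point helpers, sketching the kind of
# integer-only, deterministic arithmetic such an engine builds on.
FRAC_BITS = 16
ONE = 1 << FRAC_BITS

def to_fixed(f):
    """Convert a float to Q16.16 (done once, e.g. at asset-load time)."""
    return int(round(f * ONE))

def fx_mul(a, b):
    """Multiply two Q16.16 values: the raw product has 32 fractional
    bits, so shift half of them back out."""
    return (a * b) >> FRAC_BITS

def to_float(a):
    """Convert back to float for display only, never for simulation."""
    return a / ONE
```

Because every operation is integer arithmetic, results are bit-identical on any platform, which is the determinism property floats make hard to get.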
       | 
       | I started working on such a thing as a fun side project a few
       | years ago but never finished it. One of these days I hope to get
       | back to it.
       | 
       | [0]: https://randomascii.wordpress.com/2013/07/16/floating-
       | point-...
        
         | boulos wrote:
         | That blog post is now a decade old, but includes an important
         | quote:
         | 
         | > The IEEE standard does guarantee some things. It guarantees
         | more than the floating-point-math-is-mystical crowd realizes,
         | but less than some programmers might think.
         | 
          | Summarizing the blog post, it highlights a few things, though
          | some less clearly than I would like:
          | 
          |   * x87 was wonky
          |   * You need to ensure rounding modes, flush-to-zero, etc.
          |     are consistently set
          |   * Some older processors don't have FMA
          |   * Approximate instructions (mmsqrtps et al.) don't have a
          |     consistent spec
          |   * Compilers may reassociate expressions
         | 
          | For _small_ routines and self-written libraries, it's
          | straightforward, if painful, to ensure you avoid all of that.
         | 
          | Briefly mentioned in the blog post is that IEEE-754 (2008)
          | made the spec more explicit, and effectively assumed the death
          | of x87. It's 2024 now, so you can definitely avoid x87.
          | Similarly, FMA is part of the IEEE-754 2008 spec, and has been
          | built into all modern processors since (Haswell and later on
          | Intel).
         | 
         | There are still cross-architecture differences like 8-wide AVX2
         | vs 4-wide NEON that can trip you up, but if you are writing
          | assembly or intrinsics or just C that you inspect with Compiler
         | Explorer or objdump, you can look at the output and say "Yep,
         | that'll be consistent".
        
           | Someone wrote:
            | > but if you are writing assembly or intrinsics or just C
            | that you inspect with Compiler Explorer or objdump, you can
            | look at the output and say "Yep, that'll be consistent".
           | 
           | Surely people have written tooling for those checks for
           | various CPUs?
           | 
           | Also, is it that 'simple'? Reading
           | https://github.com/llvm/llvm-project/issues/62479,
           | calculations that the compiler does and that only end up in
           | the executable as constants can make results different
           | between architectures or compilers (possibly even compiler
           | runs, if it runs multi-threaded and constant folding order
           | depends on timing, but I find it hard to imagine how exactly
           | that would happen).
           | 
           | So, you'd want to check constants in the code, too, but then,
           | there's no guarantee that compilers do the same constant
            | folding. You can try to get more consistent by being really
            | diligent in using _constexpr_, but that doesn't guarantee
            | that, either.
        
             | boulos wrote:
             | The same reasoning applies though. The compiler is just
             | another program. Outside of doing constant folding on
             | things that are unspec'ed or not required (like mmsqrtps
             | and most transcendentals), you should get consistent
             | results even between architectures.
             | 
             | Of course the specific line linked to in that GH issue _is_
             | showing that LLVM will attempt constant folding of various
             | trig functions:
             | 
             | https://github.com/llvm/llvm-
             | project/blob/faa43a80f70c639f40...
             | 
             | but the IEEE-754 spec does also _recommend_ correctly
             | rounded results for those (https://en.wikipedia.org/wiki/IE
             | EE_754#Recommended_operation...).
             | 
             | The majority of code I'm talking about though uses
             | constants that are some long, explicit number, and doesn't
             | do any math on them that would then be amenable to constant
             | folding itself.
             | 
             | That said, lines like:
             | 
             | https://github.com/llvm/llvm-
             | project/blob/faa43a80f70c639f40...
             | 
             | are more worrying, since that may differ from what people
             | expect dynamically (though the underlying stuff supports
             | different denormal rules).
             | 
             | Either way, thanks for highlighting this! Clearly the
             | answer is to just use LLVM/clang regardless of backend :).
        
             | mjcohen wrote:
             | Years ago, I was programming in Ada and ran across a case
             | where the value of a constant in a program differed from
             | the same constant being converted at runtime. Took a while
             | to track that one down.
        
         | teleforce wrote:
          | The author did mention that fixed point was very popular for
          | gamedev before floating point took over as hardware capability
          | increased, and most likely CORDIC was used together with fixed
          | point as well.
         | 
         | > In fact, before IEEE 754 became the popular standard that it
         | is today, fixed point was used all the time (go and ask any
         | gamedev who worked on stuff between 1980 and 2000ish and
         | they'll tell you all about it).
        
           | aappleby wrote:
           | Was gamedev between 1980 and 2000ish, can confirm. PS1 had no
           | floating point unit.
        
       | rmorey wrote:
       | I remember in high school precalc, we learned about the Taylor
       | series, and my teacher told us that it was how trig functions
       | were actually implemented on calculators. Well I looked it up and
       | found that it was actually CORDIC, and went and had some fun
       | implementing it in TI Basic
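The Taylor-series approach the teacher described can be sketched like this (in Python rather than TI Basic):

```python
import math

def sin_taylor(x, terms=6):
    """Truncated Taylor series: sin(x) = x - x^3/3! + x^5/5! - ...
    Each term is built from the previous one. Accurate near 0; a real
    implementation reduces the argument to a small range first."""
    total, term = 0.0, x
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total
```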
        
       | the__alchemist wrote:
       | Side note for 2023: Some modern MCUs are low cost, but have FPUs.
       | A good example is the STM32G4. So, unlike on an M0 MCU or w/e,
       | you can freely use f32, if you don't want fixed point. And you
       | can get these for ~$1-2 / MCU.
       | 
       | That said... The G4 also has a hardware CORDIC peripheral that
       | implements this algo for fixed-point uses. Is this mainly to
       | avoid floating point precision losses? You program it using
       | registers, but aren't implementing CORDIC yourself on the CPU;
       | dedicated hardware inside the IC does it.
        
       | teleforce wrote:
        | Fun fact: in addition to sine and cosine calculation and
        | generation, CORDIC can also be used for many operations
        | including log, exponent, square root, vector magnitude, polar-
        | cartesian conversion and vector rotation. Spoiler alert: the
        | author provides a teaser regarding these prospects in the
        | conclusions.
       | 
        | I've got the feeling that by using quaternions instead of
        | conventional orthonormal matrices, CORDIC-based operations can
        | be executed more efficiently (fewer compute cycles, less
        | memory) and effectively (fewer errors) [1].
       | 
       | [1] CORDIC Framework for Quaternion-based Joint Angle Computation
       | to Classify Arm Movements:
       | 
       | https://core.ac.uk/works/8439118
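Vector magnitude and polar conversion fall out of CORDIC's vectoring mode, sketched here (illustrative Python, assuming the input lies in the right half-plane):

```python
import math

def cordic_vector(x, y, n=32):
    """Vectoring-mode CORDIC sketch: rotate (x, y) onto the positive
    x-axis with shift-add steps. The final x (gain-corrected) is the
    magnitude, and the accumulated angle is atan2(y, x). Assumes
    x > 0 (right half-plane) for simplicity."""
    gain, z = 1.0, 0.0
    for i in range(n):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
        sigma = -1.0 if y >= 0 else 1.0  # rotate toward the x-axis
        d = 2.0 ** -i
        x, y = x - sigma * y * d, y + sigma * x * d
        z -= sigma * math.atan(d)
    return x / gain, z  # (magnitude, angle)
```

One pass thus yields both halves of a cartesian-to-polar conversion.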
        
         | nimish wrote:
         | It can be extended to arbitrary Lie groups IIRC
        
       | pintor-viejo wrote:
       | Meagher's octree system notably used only integer arithmetic with
       | no integer multiplication/division:
       | 
       | > Efficient (linear time) algorithms have been developed for the
       | Boolean operations (union, intersection and difference),
       | geometric operations (translation, scaling and rotation),
       | N-dimensional interference detection, and display from any point
       | in space with hidden surfaces removed. The algorithms require
       | neither floating-point operations, integer multiplications, nor
       | integer divisions.
       | 
       | https://doi.org/10.1016/0146-664X(82)90104-6
       | 
       | This facilitated creation of fast, custom VLSI graphics
       | acceleration hardware for octree representation.
        
       ___________________________________________________________________
       (page generated 2024-05-11 23:00 UTC)