[HN Gopher] The Impossible Optimization, and the Metaprogramming...
       ___________________________________________________________________
        
       The Impossible Optimization, and the Metaprogramming to Achieve It
        
       Author : melodyogonna
       Score  : 62 points
       Date   : 2025-10-28 10:53 UTC (4 days ago)
        
 (HTM) web link (verdagon.dev)
 (TXT) w3m dump (verdagon.dev)
        
       | Xcelerate wrote:
       | > Eliminate redundant matrix operations (like two transposes next
       | to each other)
       | 
       | In 2016, I was trying to construct orthogonal irreducible matrix
       | representations of various groups ("irreps"). The problem was
       | that most of the papers describing how to construct these
       | matrices used a recursive approach that depended on having
       | already constructed the matrix elements of a lower dimensional
       | irrep. Thus the irrep dimension n became quite an annoying
       | parameter, and function calls were very slow because you had to
       | construct the irrep for each new group element from the ground up
       | on every single call.
       | 
       | I ended up using Julia's @generated functions to dynamically
       | create new versions of the matrix construction code for each
       | distinct value of n for each type of group. So essentially it
       | would _generate_ "unrolled" code on the fly and then use LLVM to
       | compile that a single time, after which all successive calls for
       | a specific group and irrep dimension were extremely fast. Was
       | really quite cool. The only downside was that you couldn't
       | generate very high dimensional irreps because LLVM would begin to
       | struggle with the sheer volume of code it needed to compile, but
       | for my project at the time that wasn't much of a concern.
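The compile-once-per-parameter pattern described above can be sketched in Python as a rough stand-in for Julia's @generated functions. Everything here is hypothetical illustration (the names and the toy "scale by 2" body are not the original irrep code): generate specialized, unrolled source for each distinct n, compile it once, and cache the result so later calls skip code generation entirely.

```python
# Hypothetical sketch: generate unrolled source per value of n,
# compile once, cache, and reuse on every later call.
_cache = {}

def specialized_scale(n):
    """Return a function whose body is hard-coded (unrolled) for this n."""
    if n not in _cache:
        # One statement per element - no loop left at call time.
        lines = [f"    out[{i}] = v[{i}] * 2" for i in range(n)]
        src = (f"def f(v):\n    out = [0] * {n}\n"
               + "\n".join(lines) + "\n    return out")
        namespace = {}
        exec(compile(src, f"<generated n={n}>", "exec"), namespace)
        _cache[n] = namespace["f"]
    return _cache[n]

double3 = specialized_scale(3)
print(double3([1, 2, 3]))  # -> [2, 4, 6]
```

After the first call for a given n, every subsequent call reuses the already-compiled function, mirroring the compile-once behaviour the comment describes.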
        
       | fragmede wrote:
       | Depending on the userbase of the site, simply checking for
       | @gmail.com at the end, I'd bet, would result in a quick win, as
       | well as restricting the username's alphabet to allowed Gmail
       | characters.
       | 
       | The other optimization I'd guess at would be to
       | async/thread/process the checking before and after the @ symbol,
       | so they can run in parallel (ish). Extra cpu time, but speed >
       | CPU cycle count for this benchmark.
        
         | embedding-shape wrote:
         | Tell us you used to work at Google, without telling us.
         | 
          | "simply do X" is such a programmer fallacy at this point that
          | I'm surprised we don't have a catchy name for it yet, together
          | with an XKCD to make the point extra clear.
        
           | fragmede wrote:
           | Tell us you don't actually work with any Google engineers...
           | blah blah blah
           | 
            | The trope is "At Google we..." and then casually mention
            | "violating" the CAP theorem with Spanner or something.
           | 
           | It _is_ simple, and I really do hope any first year CS
           | student could extract a substring from a string. Have LLMs so
           | atrophied our programming ability that extraction of a
           | substring is considered evidence of a superior programmer?
        
         | gus_massa wrote:
         | [Rehashing an old comment]
         | 
          | In the math department, we had a Moodle for the first-year
          | students of my university in Argentina.
          | 
          | When we started, about 15 years ago, the emails of the students
          | and TAs were evenly split: 30% Gmail, 30% Yahoo!, 30% Hotmail,
          | and 10% others (very approximate numbers).
         | 
          | Now the students have like 80% Gmail, 10% Live/Outlook/Hotmail
          | and 10% others/Yahoo. Some of the TAs are much older, so perhaps
          | "only" 50% use Gmail.
         | 
         | The difference is huge. I blame the mandatory gmail account for
         | the cell phone.
         | 
          | So, checking only @gmail.com would be too strict, but a fast
          | first check for @gmail.com followed by the complete regex could
          | improve the speed a lot in the real world.
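The two-tier check described above can be sketched as follows. Both patterns here are simplified illustrations (nowhere near RFC-grade email validation), and the function name is hypothetical: a cheap string suffix test handles the dominant @gmail.com case, and only other addresses fall through to the general regex.

```python
import re

# Illustrative general pattern - deliberately simplified, not RFC-complete.
FULL_EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")
# Gmail local parts: letters, digits, and dots (also simplified).
GMAIL_LOCAL_RE = re.compile(r"^[a-z0-9.]+$")

def is_valid_email(addr: str) -> bool:
    # Fast path: if most of the user base is on Gmail, a suffix check
    # plus a tiny local-part pattern avoids the full regex entirely.
    if addr.endswith("@gmail.com"):
        local = addr[:-len("@gmail.com")].lower()
        return GMAIL_LOCAL_RE.fullmatch(local) is not None
    # Slow path: general regex for everything else.
    return FULL_EMAIL_RE.fullmatch(addr) is not None
```

Whether the fast path actually wins depends on the address mix; the point is only that the common case gets the cheap check.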
        
           | bee_rider wrote:
           | Maybe I am old, but I like to keep as much communication as
           | possible going through the university email. It just feels
           | more official somehow.
        
       | omnicognate wrote:
       | The language here is Mojo, which the article seems to assume you
        | know and doesn't say enough for you to deduce until halfway
       | through and after multiple code examples. I don't know how you're
       | supposed to know this as even the blog it's on is mostly about
       | Vale. From the intro I was expecting it to be about C++.
        
         | totalperspectiv wrote:
         | The author works for Modular. He shared the write up on the
         | Mojo Discord. I think Mojo users were the intended audience.
        
       | SuperV1234 wrote:
       | https://github.com/hanickadot/compile-time-regular-expressio...
        
         | canucker2016 wrote:
         | FYI: This is a C++ template version of compile time regex
         | class.
         | 
         | A 54:47 presentation at CppCon 2018 is worth more than a
         | thousand words...
         | 
         | see https://www.youtube.com/watch?v=QM3W36COnE4
         | 
         | followup CppCon 2019 video at
         | https://www.youtube.com/watch?v=8dKWdJzPwHw
         | 
         | As the above github repo mentions, more info at
         | https://www.compile-time.re/
        
       | taeric wrote:
       | Gave me a smile to see the shout out to LISP in there.
       | 
       | Reading this take on it, it feels like a JIT compiler could also
       | accomplish a fair bit of this? I'm also reminded of the way a lot
       | of older programs would generate tables during build time. I'm
       | assuming that is still fairly common?
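The build-time table generation mentioned above can be sketched in Python, with module load standing in for the build step (the names are illustrative): the table is computed once, and each call is reduced to cheap indexing.

```python
# Precompute a 256-entry popcount table once, at module load time -
# a stand-in for generating the table during the build.
POPCOUNT8 = [bin(i).count("1") for i in range(256)]

def popcount32(x: int) -> int:
    """Count set bits in a 32-bit value using four table lookups."""
    return (POPCOUNT8[x & 0xFF]
            + POPCOUNT8[(x >> 8) & 0xFF]
            + POPCOUNT8[(x >> 16) & 0xFF]
            + POPCOUNT8[(x >> 24) & 0xFF])

print(popcount32(0xFF00FF00))  # -> 16
```

In C the same table would typically be emitted by a small generator program (or a constexpr function) so it lands in the binary as static data.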
        
         | pfdietz wrote:
         | Yes, this is all straightforward with Lisp macros. Beyond that,
         | you can call the compile function in Common Lisp and do all
         | this at run time too.
        
         | BoingBoomTschak wrote:
         | In fact, there's https://github.com/telekons/one-more-re-
         | nightmare for CL.
        
       | Archit3ch wrote:
       | > Mojo, D, Nim, and Zig can do it, and C++ as of C++20. There are
       | likely some other languages that can do it, but these are the
       | only ones that can truly run normal run-time code at compile time
       | 
       | Pretty sure Julia can do it.
        
         | jburgy wrote:
         | https://bur.gy/2022/05/27/what-makes-julia-delightful.html
         | confirms your hunch
        
       | spectraldrift wrote:
       | Having never heard of mojo before, I found this article
       | fascinating. It provides a great example of how a toy regex
       | parser works and an excellent explanation of why vanilla regex
       | tends to be slow. It also presents a novel solution: compiling
       | the regex into regular code, which can then be optimized by the
       | compiler.
        
         | convolvatron wrote:
         | this is literally how 'lex' works. the one written in 1987 by
         | Vern Paxson.
        
           | jlokier wrote:
           | The original is 'lex', written in 1975 by Mike Lesk and Eric
           | Schmidt.
           | 
           | Yes, that Eric Schmidt, CEO of Google.
           | 
           | 1987 was the clone, 'flex' :-)
           | 
           | It did "compiling the regex into regular code, which can then
           | be optimized by the compiler" before the C programming
           | language as we know it was created. I think 'lex' was
           | compiling regex to C before the C language even had 'struct'
           | types, 'printf' or 'malloc'.
        
       | abeppu wrote:
       | To tie this specific example to a larger framework: In scala
       | land, Tiark Rompf's Lightweight Modular Staging system handled
       | this class of metaprogramming elegantly, and the 'modular' part
       | included support of multiple compilation targets. The idea was
       | that one could incrementally define/extend DSLs that produce an
       | IR, optimizations in that IR, and code generation for chunks of
        | DSLs. Distinctions about stage are straightforward type-signature
        | changes. The worked example in this post is very
       | similar to one of the tutorials for that system: https://scala-
       | lms.github.io/tutorials/regex.html
       | 
       | Unfortunately, so far as I can tell:
       | 
       | - LMS has not been updated for years and never moved to scala 3.
       | https://github.com/TiarkRompf/virtualization-lms-core
       | 
       | - LMS was written to also use "scala-virtualized" which is in a
       | similar situation
       | 
       | There's a small project to attempt to support it with
       | virtualization implemented in scala 3 macros, but it's missing
       | some components:
       | https://github.com/metareflection/scala3-lms?tab=readme-ov-f...
       | 
       | I'd love to see this fully working again.
        
       | aappleby wrote:
        | I did something similar to this with C++ templates - it's Parsing
        | Expression Grammar based, so not full regex, but enough for a lot
        | of tasks:
        | 
        |     using sign      = Atoms<'+', '-'>;
        |     using digit     = Range<'0', '9'>;
        |     using onenine   = Range<'1', '9'>;
        |     using digits    = Some<digit>;
        |     using integer   = Seq<Opt<Atom<'-'>>, Oneof<Seq<onenine, digits>, digit>>;
        |     using fraction  = Seq<Atom<'.'>, digits>;
        |     using exponent  = Seq<Atoms<'e', 'E'>, Opt<sign>, digits>;
        |     using number    = Seq<integer, Opt<fraction>, Opt<exponent>>;
        | 
        | and I've confirmed that it does all get inlined and optimized on
        | -O3.
       | 
       | JSON parser example here -
       | https://github.com/aappleby/matcheroni/blob/main/examples/js...
        
       | lisper wrote:
       | > can compilers really execute general code at compile-time?
       | 
       | Cue the smug Lisp weenies laughing quietly in the background.
        
       | cadamsdotcom wrote:
       | Seems like an optimization that could be applied quite generally
       | - as the author mentions at the end there's lots of places this
       | could be used.
       | 
        | The problem with applying this technique generally is the amount
        | of code generated. But what if you could optimize that too?
        | Perhaps share the common parts of the AST between the generated
        | copies of the code, and overlay the differences with some data
        | structure.
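One established way to share common subtrees, in the spirit of the suggestion above, is hash-consing (interning): structurally identical subtrees are constructed once and reused by reference. A minimal sketch with hypothetical names:

```python
# Minimal hash-consing ("interning") sketch: structurally identical
# subtrees are stored once and shared by reference thereafter.
_interned = {}

def node(kind, *children):
    # Children are themselves interned nodes (or leaf strings), so the
    # tuple key captures the whole subtree's structure.
    key = (kind, children)
    if key not in _interned:
        _interned[key] = (kind, children)
    return _interned[key]

# Two expressions that both contain the subtree (a + b):
shared = node("add", node("var", "a"), node("var", "b"))
expr1 = node("mul", shared, node("var", "c"))
expr2 = node("sub", shared, node("var", "d"))

# Both trees point at the exact same (a + b) object - stored once.
assert expr1[1][0] is expr2[1][0]
```

The same idea underlies BDD packages and some compiler IRs, where sharing also makes structural equality an O(1) pointer comparison.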
        
       ___________________________________________________________________
       (page generated 2025-11-01 23:01 UTC)