hngopher.com

       [HN Gopher] We rewrote our Rust WASM parser in TypeScript and it...
       ___________________________________________________________________
        
       We rewrote our Rust WASM parser in TypeScript and it got faster
        
       Author : zahlekhan
       Score  : 279 points
       Date   : 2026-03-20 21:48 UTC (1 days ago)
        
 (HTM) web link (www.openui.com)
        
       | blundergoat wrote:
       | The real win here isn't TS over Rust, it's the O(N^2) -> O(N)
       | streaming fix via statement-level caching. That's a 3.3x
       | improvement on its own, independent of language choice. The WASM
       | boundary elimination is 2-4x, but the algorithmic fix is what
       | actually matters for user-perceived latency during streaming.
       | Title undersells the more interesting engineering imo.
        
         | shmerl wrote:
         | More like a misleading clickbait.
        
         | sroussey wrote:
         | Yeah, though the n^2 is overstating things.
         | 
         | One thing I noticed was that they time each call and then use a
         | median. Sigh. In a browser. :/ With timing attack defenses
         | built into the JS engine.
        
           | fn-mote wrote:
           | For those of us not in the know, what are we expecting the
           | results of the defenses to be here?
        
             | sroussey wrote:
             | Jitter. It makes precise timings unreliable. Time the whole
             | batch of 1000 runs and divide by 1000 instead of starting
             | and stopping 1000 timers.
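
A minimal sketch of the two measurement styles sroussey contrasts. Everything here is an assumption made for illustration: the helper names (`medianPerCallMs`, `meanBatchMs`, `makeCoarseClock`) are hypothetical, and the simulated clock models only coarsened resolution (0.1 ms ticks), not the exact defenses any particular browser applies. The clock is passed in so the comparison is deterministic; in a browser you would pass `() => performance.now()`.

```typescript
// A clock returns milliseconds. It is injected so the sketch can be run
// deterministically; in a browser you would pass () => performance.now().
type Clock = () => number;

// Per-call timing: one start/stop pair per run, then take the median sample.
function medianPerCallMs(fn: () => void, runs: number, now: Clock): number {
  if (runs <= 0) throw new RangeError("runs must be positive");
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const t0 = now();
    fn();
    samples.push(now() - t0);
  }
  samples.sort((a, b) => a - b);
  const mid = Math.floor(samples.length / 2);
  return samples.length % 2 === 1
    ? samples[mid]
    : (samples[mid - 1] + samples[mid]) / 2;
}

// Batch timing: one start/stop pair around all runs, then divide.
function meanBatchMs(fn: () => void, runs: number, now: Clock): number {
  if (runs <= 0) throw new RangeError("runs must be positive");
  const t0 = now();
  for (let i = 0; i < runs; i++) fn();
  return (now() - t0) / runs;
}

// Hypothetical simulated environment: each call to work() takes stepUs of
// "true" time, but the clock only reports multiples of resolutionUs.
function makeCoarseClock(stepUs: number, resolutionUs: number) {
  let trueUs = 0;
  return {
    work: (): void => { trueUs += stepUs; },
    now: (): number => (Math.floor(trueUs / resolutionUs) * resolutionUs) / 1000,
  };
}

const simA = makeCoarseClock(30, 100); // 0.03 ms of work, 0.1 ms clock ticks
const perCall = medianPerCallMs(simA.work, 1000, simA.now);

const simB = makeCoarseClock(30, 100);
const batch = meanBatchMs(simB.work, 1000, simB.now);
```

In this simulation most per-call samples fall inside a single clock tick and read as zero, so the median (`perCall`) is 0, while the batch figure (`batch`) recovers the true 0.03 ms per call. Random jitter behaves similarly: it is paid once per start/stop pair, so a single pair around the whole batch dilutes it by the number of runs.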
        
         | Aurornis wrote:
         | > Title undersells the more interesting engineering imo.
         | 
         | Thanks for cutting through the clickbait. The post is
         | interesting, but I'm so tired of being unnecessarily
         | clickbaited into reading articles.
        
         | socalgal2 wrote:
         | same for uv but no one takes that message. They just think
         | "rust rulez!" and ignore that all of uv's benefits are algo,
         | not lang.
        
           | estebank wrote:
           | Some architectures are made easier by the choice of
           | implementation language.
        
             | crubier wrote:
             | In my experience Rust typically makes it a little bit
             | harder to write the most efficient algo actually.
        
               | catlifeonmars wrote:
               | That's usually ok bc in most code your N is small and
               | compiler optimizations dominate.
        
               | Defletter wrote:
               | Would you be willing to give an example of this?
        
               | lukeweston1234 wrote:
               | Not OP, but one example where it is a bit harder to do
               | something in Rust than in C, C++, Zig, etc. is mutability
               | on disjoint slices of an array. Rust offers a few
               | utilities, like chunks_by, split_at, etc. but for certain
               | data structures and algorithms it can be a bit annoying.
               | 
               | It's also worth noting that unsafe Rust != C, and you are
               | still battling these rules. With enough experience you
               | gain an understanding of these patterns and it goes away,
               | and you also have these really solid tools like Miri for
               | finding undefined behavior, but it can be a bit of a
               | hassle.
        
               | catlifeonmars wrote:
               | Has no one written a python! macro for this use case?
        
               | foldr wrote:
               | Mutating tree structures tends to be a fiddle (especially
               | if you want parent pointers).
        
             | EdwardDiego wrote:
             | UV also has the distinct advantage in dependency resolution
             | that it didn't have to implement the backwards compatible
             | stuff Pip does, I think Astral blogged on it. If I can find
             | it, I'll edit the link in.
             | 
             | _edit_: wasn't Astral, but here's the blog post I was
             | thinking of:
             | https://nesbitt.io/2025/12/26/how-uv-got-so-fast.html
             | 
             | That said, your point is very much correct, if you watch or
             | read the Jane Street tech talk Astral gave, you can see how
             | they really leveraged Rust for performance like turning
             | Python version identifiers into u64s.
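
A rough sketch of why packing a version into a single integer helps: comparison becomes one numeric `<` instead of a component-by-component walk. The thread only says that uv turns version identifiers into u64s; the layout below (16 bits each for major, minor, patch) and the names `packVersion` and `parseVersion` are assumptions for illustration, not uv's actual encoding, and real Python version strings carry more than three numeric parts. A JavaScript number is exact up to 2^53, so 48 bits fit.

```typescript
const MAX_PART = 0xffff; // each component is limited to 16 bits in this sketch

// Pack three components into one number so that numeric order matches
// version order. Hypothetical layout, not uv's real one.
function packVersion(major: number, minor: number, patch: number): number {
  for (const part of [major, minor, patch]) {
    if (!Number.isInteger(part) || part < 0 || part > MAX_PART) {
      throw new RangeError(`version component out of range: ${part}`);
    }
  }
  // major occupies bits 32..47, minor bits 16..31, patch bits 0..15.
  return major * 0x100000000 + minor * 0x10000 + patch;
}

// Parse "major.minor.patch"; missing components default to 0.
function parseVersion(text: string): number {
  const [major = 0, minor = 0, patch = 0] = text.split(".").map(Number);
  return packVersion(major, minor, patch);
}

const older = parseVersion("3.9.18");
const newer = parseVersion("3.10.0");
const numericOrder = older < newer;      // packed comparison
const stringOrder = "3.9.18" < "3.10.0"; // naive string comparison
```

Here `numericOrder` is true while `stringOrder` is false, because the string comparison looks at "9" versus "1" character by character. The packed form gets the ordering right with a single comparison.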
        
           | rowanG077 wrote:
           | That's a pretty big claim. I don't doubt that a lot of uv's
           | benefits are algo. But everything? Considering that running
           | non IO-bound native code should be an order of magnitude
           | faster than python.
        
             | thfuran wrote:
             | More than one, I'd think.
        
             | jeremyjh wrote:
             | It's a pretty well-supported claim. uv skips doing a number
             | of things that generate file I/O. File I/O is far more
             | costly than the difference in raw computation. pip can't
             | drop those for compatibility reasons.
             | 
             | https://nesbitt.io/2025/12/26/how-uv-got-so-fast.html
        
               | rowanG077 wrote:
               | I don't think the article you linked supports the claim
               | that none of UV performance improvements are related to
               | using rust over python at all. In fact it directly states
               | the exact opposite. They have an entire section dedicated
               | to why using Rust has direct performance advantages for
               | UV.
        
               | jeremyjh wrote:
               | What it says is this:
               | 
               | > uv is fast because of what it doesn't do, not because
               | of what language it's written in. The standards work of
               | PEP 518, 517, 621, and 658 made fast package management
               | possible. Dropping eggs, pip.conf, and permissive parsing
               | made it achievable. Rust makes it a bit faster still.
        
               | rowanG077 wrote:
               | Yes exactly! That quote directly disproves that all of
               | the improvements UV has over competitors is because of
               | algos, not because of rust.
               | 
               | So the claim is not well supported at all by the article
               | as you stated, in fact the claim is literally disproven
               | by the article.
        
               | jeremyjh wrote:
               | You are right. 99% is not 100%.
        
               | rowanG077 wrote:
               | I don't think the article has substantive numbers. You'd
               | have to re-implement UV in python to do that. I don't
               | think anyone did that. It would be interesting at least
               | to see how much UV spends in syscalls vs PIP and make a
               | relative estimate based on that.
        
               | kyralis wrote:
               | This is either an overly pedantic take or a disingenuous
               | one. The very first line that the parent quoted is
               | 
               | > uv is fast because of what it doesn't do, not because
               | of what language it's written in.
               | 
               | The fact that the language had a small effect ("a bit")
               | does not invalidate the statement that algorithmic
               | improvements are the reason for the relative speed. In
               | fact, there's no reason to believe that rust without the
               | algorithmic version would be notably faster at all. Sure,
               | "all" is an exaggeration, but the point made still stands
               | in the form that most readers would understand it:
               | algorithmic improvements are the important difference
               | between the systems.
        
               | rowanG077 wrote:
               | I think we might be talking past each other a bit.
               | 
               | The specific claim I was responding to was that all of
               | uv's performance improvements come from algorithms rather
               | than the language. My point was just that this is a
               | stronger claim than what the article supports, the
               | article itself says Rust contributes "a bit" to the
               | speed, so it's not purely algorithmic.
               | 
               | I do agree with the broader point that algorithmic and
               | architectural choices are the main reason uv is fast, and
               | I tried to acknowledge that, apparently unsuccessfully,
               | in my very first comment ("I don't doubt that a lot of
               | uv's benefits are algo. But everything?").
        
               | ambicapter wrote:
               | You are being very pedantic here.
        
               | staticassertion wrote:
               | Do you actually believe that UV would be as fast if it
               | were written in Python?
        
               | tinco wrote:
               | It would come pretty close, probably close enough that
               | you wouldn't be able to tell the difference on 90% of
               | projects.
        
               | staticassertion wrote:
               | Vague. What's pretty close? I mean, even for IO bound
               | tasks you can pretty quickly validate that the
               | performance between languages is not close at all - 10 to
               | 100x difference.
        
               | tinco wrote:
               | Sure, within 100ms. Who cares what the performance
               | multiples are?
        
               | staticassertion wrote:
               | That literally makes no sense. 100ms... out of what? Is
               | it 1ms vs 100ms? 100000ms vs 100100ms?
               | 
               | Anyway, dubious claim since a Python interpreter will
               | take 10s of milliseconds just to print out its version.
               | 
               | Do you have any evidence? I can point at techempower
               | benchmarks showing IO bound tasks are still 10-100x
               | faster in native languages vs Python/JS.
        
               | tinco wrote:
               | I'm saying that the Rust might execute in 50ms and the
               | Python in 150ms. You are the one not making sense, we are
               | talking about application performance, why are you _not_
               | measuring that in milliseconds.
               | 
               | That is assuming Rust is 100x faster than Python btw,
               | 49ms of I/O, 1ms of Rust, 100ms of Python.
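
The arithmetic behind tinco's figures, written out. All numbers are the hypothetical ones from the comment above, and the calculation assumes the I/O time is identical for both languages, which is exactly the premise staticassertion disputes below.

```typescript
// All figures are tinco's hypothetical numbers, in milliseconds.
const ioMs = 49;         // I/O time, assumed to be the same in both languages
const rustCpuMs = 1;     // compute time in Rust
const pythonCpuMs = 100; // compute time in Python, i.e. 100x slower

const rustTotalMs = ioMs + rustCpuMs;     // 49 + 1
const pythonTotalMs = ioMs + pythonCpuMs; // 49 + 100

// Ratio of compute time alone versus ratio of what the user waits for.
const computeRatio = pythonCpuMs / rustCpuMs;
const endToEndRatio = pythonTotalMs / rustTotalMs;
```

The totals are 50 ms and 149 ms (tinco rounds the latter to 150), so a 100x difference in compute shows up as roughly a 3x difference end to end. Under this model the shared I/O time caps how much of the language speedup the user can see.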
        
               | staticassertion wrote:
               | > I'm saying that the Rust might execute in 50ms and the
               | Python in 150ms.
               | 
               | Okay, so the Rust code would be 3x as fast. Feels
               | arbitrary, but sure.
               | 
               | > You are the one not making sense, we are talking about
               | application performance, why are you not measuring that
               | in milliseconds.
               | 
               | I explained why your post made no sense already...
               | 
               | > That is assuming Rust is 100x faster than Python btw,
               | 49ms of I/O, 1ms of Rust, 100ms of Python.
               | 
               | That's not how anything works. Different languages will
               | perform differently on IO work, different runtimes will
               | degrade under IO differently, etc. That's why even basic
               | echo HTTP servers perform radically differently in Python
               | vs Rust.
               | 
               | This isn't how computers work and it's not even how math
               | works.
               | 
               | This conversation has become nonsensical. The thing we
               | can agree with is this - no, uv would not be as fast if
               | it were written in Python.
        
               | jeremyjh wrote:
               | > Different languages will perform differently on IO
               | work,
               | 
               | IO is executed by kernel, file system or network drivers.
               | IO performance is not dependent at all on which language
               | makes the syscalls.
               | 
               | > The thing we can agree with is this - no, uv would not
               | be as fast if it were written in Python.
               | 
               | In this thread, we are talking about the speed of uv in
               | terms of user experience - how long a person waits for
               | command line operations to complete. Things that pip
               | takes multiple seconds to do, uv will do in dozens of
               | milliseconds. If uv were written in python, it would take
               | dozens of ms + a few dozens more, which means absolutely
               | fuck all nothing in the context of the thousands of
               | milliseconds saved over pip.
               | 
               | It's possible a user might perceive a slight difference in
               | larger projects, but if pip had been uv-but-in-python,
               | the uv-in-rust project would never have been started in
               | the first place because no one would have bothered
               | switching.
               | 
               | > This conversation has become nonsensical.
               | 
               | Agreed. No one in this thread is disputing that Rust code
               | is faster than Python, only that in this case it is
               | completely insignificant in the face of all the useless
               | file and network I/O that pip is doing, and uv is not.
        
               | tinco wrote:
               | > That's not how anything works. Different languages will
               | perform differently on IO work, different runtimes will
               | degrade under IO differently, etc. That's why even basic
               | echo HTTP servers perform radically differently in Python
               | vs Rust.
               | 
               | > This isn't how computers work and it's not even how
               | math works.
               | 
               | What are you disagreeing with? There's some baseline
               | amount of I/O that the kernel does for you, that's what
               | I'm assuming is 50ms, and everything else like runtime
               | degrading is overhead due to the language/platform
               | choice. I'm saying Rust is upwards of 100x faster in that
               | regard thanks to its zero cost abstraction philosophy.
               | You can't just include the I/O baseline in a claim about
               | Rust's performance advantage. You'll be really
               | disappointed when Rust doesn't download your files 100x
               | as fast as the Python file downloader.
               | 
               | Anyway, I'm sorry I provoked your antagonism with my
               | terse messages, I wasn't trying to be blase. I believe uv
               | is the sort of tool that wouldn't suffer much from the
               | downsides of Python and that in most situations the
               | reduced runtime overhead of Rust would have a negligible
               | impact on the user experience. I'm not arguing that they
               | shouldn't build uv in Rust. Most situations is not all
               | situations, and when a tool is used so widely you'll hit
               | all edge cases, from the point where the 10s of
               | milliseconds of startup time matters to the point where
               | Python's I/O overhead matters at scale.
        
           | coldtea wrote:
           | Just the fact that I can install a single binary is 10x
           | better than an equally fast Python implementation.
        
         | azakai wrote:
         | O(N^2) -> O(N) was 3.3x faster, but before that, eliminating the
         | boundary (replacing wasm with JS) led to speedups of 2.2x,
         | 4.6x, 3.0x (see one table back).
         | 
         | It looks like neither is the "real win". Both the language and
         | the algorithm made a big difference, as you can see in the
         | first column in the last table - going to wasm was a big
         | speedup, and improving the algorithm on top of that was another
         | big speedup.
        
         | nulltrace wrote:
         | Yeah the algorithmic fix is doing most of the work here. But
         | call that parser hundreds of times on tiny streaming chunks and
         | the WASM boundary cost per call adds up fast. Same thing would
         | happen with C++ compiled to WASM.
        
           | hrmtst93837 wrote:
           | WASM boundary overhead is only half the story. Once you start
           | bouncing tiny chunks across JS and WASM over and over, the
           | data shuffling and memory layout mismatch can trash cache
           | behavior, pile on allocation churn, and turn a nice benchmark
           | into something that looks nothing like a parser living inside
           | a streaming pipeline. That's why most 'language duel' posts
           | feel beside the point.
        
         | catlifeonmars wrote:
         | You're not wrong, but that win would not get as many views.
         | It's not clickbaity enough
        
         | adastra22 wrote:
         | No AI generated comments on HN please.
        
         | wolvesechoes wrote:
         | > The real win here isn't TS over Rust
         | 
         | Kinda is. We came up with abstractions to help reason about
         | what really matters. The more you need to deal with auxiliary
         | stuff (allocations, lifetimes), the more likely you will miss the
         | big issue.
        
           | coldtea wrote:
           | The opposite: the more you rely on abstractions the more you
           | miss the lower level optimization opportunities and lose
           | understanding of algorithms and hardware.
        
             | wolvesechoes wrote:
             | > of algorithms
             | 
             | Yes, sprinkling your code logic with malloc, .clone() or
             | lifetime annotations on the other hand brings algorithmic
             | enlightenment.
        
               | coldtea wrote:
               | Dealing and having to think about the cost of malloc,
               | clone() and lifetimes, brings algorithmic enlightenment
               | more than working in a high-abstraction ivory tower
               | where things "magically happen".
               | 
               | Is your argument that the average Python or Typescript
               | dev gets to think and care more about algorithms than the
               | average C/C++/Rust dev?
        
         | zahrevsky wrote:
         | They even directly conclude at the end of the article that
         | improvements in algorithm are more important than the choice of
         | language:
         | 
         | > Algorithmic complexity improvements dominate language-level
         | optimisations. Going from O(N^2) to O(N) in the streaming case
         | had a larger practical impact than switching from WASM to
         | TypeScript.
         | 
         | Yet they still have chosen to put the "Rust rewrite" part in
         | the title. I almost think it's a click bait.
        
       | dmix wrote:
       | That blog post design is very nice. I like the 'scrollspy'
       | sidebar which highlights all visible headings.
       | 
       | Claude tells me this is https://www.fumadocs.dev/
        
         | sroussey wrote:
         | Interesting, thanks. I need to make some good docs soon.
        
           | dmix wrote:
           | Good documentation is always worth the effort. Markdown
           | explaining your products is gold these days with LLMs.
        
       | nine_k wrote:
       | "We rewrote this code from language _L_ to language _M_ , and the
       | result is better!" No wonder: it was a chance to rectify
       | everything that was tangled or crooked, avoid every known bad
       | decision, and apply newly-invented better approaches.
       | 
       | So this holds even for _L = M_. The speedup is not in the
       | language, but in the rewriting and rethinking.
        
         | MiddleEndian wrote:
         | Now they just need a third party who's never seen the original
         | to rewrite their TypeScript solution in Rust for even more
         | gains.
        
           | nine_k wrote:
           | Indeed! But only after a year or so of using it in
           | production, so that the drawbacks would be discovered.
        
         | baranul wrote:
         | Truth. You can see improvement, even rewriting code in the same
         | language.
        
         | azakai wrote:
         | You're generally right - rewrites let you improve the code -
         | but they do have an actual reason the new language was better:
         | avoiding copies on the boundary.
         | 
         | They say they measured that cost, and it was most of the
         | runtime in the old version (though they don't give exact
         | numbers). That cost does not exist at all in the new version,
         | simply because of the language.
        
           | necovek wrote:
           | It's doing copies and (de)serialization on both sides into
           | native data types.
           | 
           | If they used raw byte structures, implemented the caching
           | improvements on the wasm side, the copies might not be as
           | bad.
           | 
           | But they still have an issue with multi-language stack:
           | complexity also has a cost.
           | 
           | Python/C combo does not have this issue because you can work
           | with Python types natively in C, but otherwise, this is a
           | cross-language conversion issue, and not a Rust issue at all.
        
         | awesome_dude wrote:
         | I think that they were honest about that to a degree, they
         | pointed out that one source of the speed up was caused by the
         | python fixing a bug they hadn't noticed in the C++
         | 
         | Edit: fixed phone typos
        
         | rabisg wrote:
         | One of the authors here. While that's generally true, in this
         | case it wasn't time that helped us learn what worked. It was a
         | nagging sense that the architecture wasn't right, just days
         | before launch, along with heavy instrumentation to test our
         | assumptions.
        
         | johnisgood wrote:
         | I have been saying this for a while now (thought it was
         | obvious), and often I get downvoted when I point this out.
        
       | spankalee wrote:
       | I was wondering why I hadn't heard of Open UI doing anything with
       | WASM.
       | 
       | This new company chose a very confusing name that has been used
       | by the Open UI W3C Community Group for over 5 years.
       | 
       | https://open-ui.org/
       | 
       | Open UI is the standards group responsible for HTML having
       | popovers, customizable select, invoker commands, and accordions.
       | They're doing great work.
        
       | caderosche wrote:
       | What is the purpose of the Rust WASM parser? Didn't understand
       | that easily from the article. Would love a better explanation.
        
         | joshuanapoli wrote:
         | They use a bespoke language to define LLM-generated UI
         | components. I think that this is supposed to prevent
         | exfiltration if the LLM is prompt-injected. In any case, the
         | parser compiles chunks streaming from the LLM to build a live
         | UI. The WASM parser restarted from the beginning upon each
         | chunk received. Fixing this algorithm to work more
         | incrementally (while porting from Rust to TypeScript) improved
         | performance a lot.
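
A small sketch of the difference between restarting on every chunk and caching completed statements. The article's actual grammar and parser are not shown in this thread, so everything below is assumed: statements are taken to be newline-terminated, `parseStatement` is a hypothetical stand-in that just trims text, and the sample input syntax is invented. The counter exists only to make the amount of parsing work visible.

```typescript
interface Stmt { source: string }

let parseCalls = 0; // instrumentation only: counts statement parses

// Hypothetical stand-in for real per-statement parsing.
function parseStatement(source: string): Stmt {
  parseCalls++;
  return { source: source.trim() };
}

// Naive streaming: re-parse every statement from the start on each chunk.
class NaiveStreamParser {
  private buffer = "";
  push(chunk: string): Stmt[] {
    this.buffer += chunk;
    return this.buffer
      .split("\n")
      .filter((s) => s.trim() !== "")
      .map(parseStatement);
  }
}

// Cached streaming: a completed statement is parsed once and kept;
// only the still-open trailing statement is re-parsed per chunk.
class CachedStreamParser {
  private done: Stmt[] = [];
  private tail = "";
  push(chunk: string): Stmt[] {
    this.tail += chunk;
    let nl = this.tail.indexOf("\n");
    while (nl !== -1) {
      const complete = this.tail.slice(0, nl);
      if (complete.trim() !== "") this.done.push(parseStatement(complete));
      this.tail = this.tail.slice(nl + 1);
      nl = this.tail.indexOf("\n");
    }
    const partial = this.tail.trim() !== "" ? [parseStatement(this.tail)] : [];
    return [...this.done, ...partial];
  }
}

// Feed all chunks to a parser and report the final result and parse count.
function feed(
  parser: { push(chunk: string): Stmt[] },
  chunks: string[],
): { result: Stmt[]; calls: number } {
  parseCalls = 0;
  let result: Stmt[] = [];
  for (const c of chunks) result = parser.push(c);
  return { result, calls: parseCalls };
}

// Invented sample input, split into 4-character chunks like a token stream.
const input = "a = Button()\nb = Text()\nc = Row(a, b)\nroot = c\n";
const chunks: string[] = input.match(/[\s\S]{1,4}/g) ?? [];
const naive = feed(new NaiveStreamParser(), chunks);
const cached = feed(new CachedStreamParser(), chunks);
```

Both parsers end with the same four statements, but `naive.calls` grows with the number of statements already seen on every chunk, while `cached.calls` stays close to one parse per chunk plus one per completed statement. That is the quadratic-versus-linear gap the thread is discussing, independent of the implementation language.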
        
       | evmar wrote:
       | By the way, I did a deeper dive on the problem of serializing
       | objects across the Rust/JS boundary, noticed the approach used by
       | serde wasn't great for performance, and explored improving it
       | here: https://neugierig.org/software/blog/2024/04/rust-wasm-to-
       | js....
        
         | slopinthebag wrote:
         | Did you try something like msgpack or bebop?
        
       | SCLeo wrote:
       | They should rewrite it in rust again to get another 3x
       | performance increase /s
        
       | slowhadoken wrote:
       | Am I mistaken or isn't TypeScript just Golang under the hood
       | these days?
        
         | iainmerrick wrote:
         | Hmm, there's an in-progress rewrite of the TypeScript compiler
         | in Go; is that what you mean?
         | 
         | I don't think that's actually out yet, and more importantly, it
         | doesn't change anything at runtime -- your code still runs in a
         | JS engine (V8, JSC etc).
        
           | koakuma-chan wrote:
           | npm i -D @typescript/native-preview
           | 
           | You can use it today.
        
         | jeremyjh wrote:
         | There is too much wrong here to call it a mistake.
        
         | wiseowise wrote:
         | Yes, you've uncovered grand conspiracy.
        
       | szmarczak wrote:
       | > Attempted Fix: Skip the JSON Round-Trip
       | > We integrated serde-wasm-bindgen
       | 
       | So you're reinventing JSON but binary? V8 JSON nowadays is highly
       | optimized [1] and can process gigabytes per second [2], I doubt
       | it is a bottleneck here.
       | 
       | [1] https://v8.dev/blog/json-stringify
       | [2] https://github.com/simdjson/simdjson
        
         | kam wrote:
         | No, serde-wasm-bindgen implements the serde Serializer
         | interface by calling into JS to directly construct the JS
         | objects on the JS heap without an intermediate
         | serialization/deserialization. You pay the cost of one or more
         | FFI calls for every object though.
         | 
         | https://docs.rs/serde-wasm-bindgen/
        
           | szmarczak wrote:
           | Indeed, you're right. However, it still needs to encode and
           | decode strings. WASM just needs native interop.
        
       | neuropacabra wrote:
       | This is a very unusual statement :-D
        
       | nallana wrote:
       | Why not a shared buffer? Serializing into JSON on this hot path
       | should be entirely avoidable
        
         | mavdol04 wrote:
         | I think a shared array just avoids the copy, not the
         | serialization which is the main problem as they showed with
         | serde-wasm-bindgen test
        
           | notnullorvoid wrote:
           | You can avoid the serialization in WASM by pushing structured
           | bytes to the SharedArrayBuffer, then do serialization in JS
           | which should be relatively cheap compared to pushing JSON
           | strings across the boundary.
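
One way to picture the "structured bytes" idea, as a sketch only. The record layout (kind as u8, start and end as little-endian u32) and the names `writeRecords` and `readRecords` are hypothetical, and a plain `ArrayBuffer` stands in for WASM linear memory or a `SharedArrayBuffer`. The writer below is ordinary TypeScript playing the role of the WASM side. Offsets are assumed to be in UTF-16 code units (or the source is ASCII); a Rust parser would naturally produce UTF-8 byte offsets, which would need converting.

```typescript
// Hypothetical fixed-width record: kind (u8), start (u32), end (u32).
const RECORD_BYTES = 9;

interface TokenRecord { kind: number; start: number; end: number }

// Stand-in for the WASM side: write records straight into a byte buffer.
function writeRecords(tokens: TokenRecord[]): ArrayBuffer {
  const buf = new ArrayBuffer(tokens.length * RECORD_BYTES);
  const view = new DataView(buf);
  tokens.forEach((t, i) => {
    const off = i * RECORD_BYTES;
    view.setUint8(off, t.kind);
    view.setUint32(off + 1, t.start, true); // little-endian
    view.setUint32(off + 5, t.end, true);
  });
  return buf;
}

// JS side: decode the records and slice the text out of the source string
// that JS already holds, so no strings cross the boundary at all.
function readRecords(
  buf: ArrayBuffer,
  source: string,
): { kind: number; text: string }[] {
  const view = new DataView(buf);
  const out: { kind: number; text: string }[] = [];
  for (let off = 0; off + RECORD_BYTES <= view.byteLength; off += RECORD_BYTES) {
    const kind = view.getUint8(off);
    const start = view.getUint32(off + 1, true);
    const end = view.getUint32(off + 5, true);
    out.push({ kind, text: source.slice(start, end) });
  }
  return out;
}

const sourceText = "root = Row(a, b)";
const packed = writeRecords([
  { kind: 1, start: 0, end: 4 },
  { kind: 2, start: 7, end: 10 },
]);
const decoded = readRecords(packed, sourceText);
```

The design choice here is to send only numbers across the boundary and let the JS side build its own objects, which avoids both JSON encoding and per-object FFI calls. Whether that beats `JSON.parse` in practice would need measuring.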
        
       | ivanjermakov wrote:
       | Good software is usually written on 2nd+ try.
        
       | joaohaas wrote:
       | God I hate AI writing.
       | 
       | That final summary benchmark means nothing. It mentions
       | 'baseline' value for the 'Full-stream total' for the rust
       | implementation, and then says the `serde-wasm-bindgen` is '+9-29%
       | slower', but it never gives us the baseline value, because
       | clearly the only benchmark it did against the Rust codebase was
       | the per-call one.
       | 
       | Then it mentions: "End result: 2.2-4.6x faster per call and
       | 2.6-3.3x lower total streaming cost."
       | 
       | But the "2.6-3.3x" is by their own definition a comparison
       | against the naive TS implementation.
       | 
       | I really think the guy just prompted claude to "get this shit
       | fast and then publish a blog post".
        
         | chvish wrote:
         | This. It's so annoying to read these types of blogs now where
         | the writer clearly didn't put the effort to understand things
         | fully or at least review the blog their LLM wrote. Who is this
         | useful for?
        
         | JimDabell wrote:
         | The article as a whole makes no sense. They are generating UI
         | with an LLM. How fast the UI appears to the user is going to be
         | completely dictated by the speed of the LLM, not the speed of
         | the serialisation.
        
         | rabisg wrote:
         | as an author of the blog - ouch did a little bit more than
         | prompt claude but a lot of claude prompting was definitely
         | involved
         | 
         | I understand your frustration with AI writing though. We are a
         | small team and given our roadmap it was either use LLMs to help
         | collate all the internal benchmark results file into a blog or
         | never write it so we chose the former. This was a genuinely
         | surprising and counterintuitive result for us, which is why we
         | wanted to share it. Happy to clarify any of the numbers if
         | helpful.
        
       | nssnsjsjsjs wrote:
       | Rewrite bias. You want to also rewrite the Rust one in Rust for
       | comparison.
        
         | jeremyjh wrote:
         | It would be surprising if rewriting in Rust could change the
         | WASM boundary tax that the article identified as the actual
         | problem.
        
           | rabisg wrote:
           | (author here) We'd be really surprised if a rewrite could fix
           | the boundary tax but if it does, we'd happily move over to
           | it. People (including me) really underestimate how insanely
           | fast the browser's JSON.parse is.
        
       | rented_mule wrote:
       | Something not unlike this happened to me when moving some batch
       | processing code from C++ to Python 1.4 (this was 1997). The batch
       | started finishing about 10x faster. We refused to believe it at
       | first and started looking to make sure the work was actually
       | being done. It was.
       | 
       | The port had been done in a weekend just to see if we could use
       | Python in production. The C++ code had taken a few months to
       | write. The port was pretty direct, function for function. It was
       | even line for line where language and library differences didn't
       | offer an easier way.
       | 
       | A couple of us worked together for a day to find the reason for
       | the speedup. Just looking at the code didn't give us any clues,
       | so we started profiling both versions. We found out that the port
       | had accidentally fixed a previously unknown bug in some code that
       | built and compared cache keys. After identifying the small
       | misbehaving function, we had to study the C++ code pretty hard to
       | even understand what the problem was. I don't remember the exact
       | nature of the bug, but I do remember thinking that particular
       | type of bug would be hard to express in Python, and that's
       | exactly why it was accidentally fixed.
       | 
       | We immediately started moving the rest of our back end to Python.
       | Most things were slower, but not by much because most of our back
       | end was i/o bound. We soon found out that we could make
       | algorithmic improvements so much more quickly, so a lot of the
       | slowest things got a lot faster than they had ever been. And,
       | most importantly, we (the software developers) got quite a bit
       | faster.
        
         | asa400 wrote:
         | Fun story! Performance is often highly unintuitive, and even
         | counterintuitive (e.g. going from C++ to Python). Very much an
         | art as well as a science.
         | 
         | Crazy how many stories like this I've heard of how doing
         | performance work helped people uncover bugs and/or hidden
         | assumptions about their systems.
        
           | staticassertion wrote:
           | It doesn't come off as unintuitive by my read. They had a bug
           | that led to a massive performance regression. Rewriting the
           | code didn't have that bug so it led to a performance
           | improvement.
           | 
           | They found that they had fewer bugs in Python so they
           | continued with it.
        
             | harpiaharpyja wrote:
             | I think a lot of people (especially those who are only
             | peripherally involved in development, like management)
             | don't really consider performance regressions at all when
             | thinking about how to get software to go faster.
             | 
             | Meanwhile my experience has been that whenever there has
             | been a performance issue severe enough to actually matter,
             | it's often been the result of some kind of performance bug,
             | not so much language, runtime, or even algorithm choices
             | for that matter.
             | 
             | Hence whenever the topic of how to improve performance
             | comes up, I always, _always_ insist that we profile first.
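
A minimal sketch of the "profile first" advice, assuming a Python code base and using only the standard-library cProfile and pstats modules. The functions `slow_key` and `batch` are hypothetical stand-ins for a real workload, not anything from the stories above.

```python
import cProfile
import io
import pstats


def slow_key(n):
    # Hypothetical hot spot: rebuilds a string key from scratch on every call.
    return "".join(str(i) for i in range(n))


def batch(items):
    # Hypothetical batch job that hits the hot spot once per item.
    return [len(slow_key(200)) for _ in range(items)]


def profile(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, out.getvalue()


result, report = profile(batch, 500)
print(report)
```

The report lists where the time actually went, which is usually a better starting point than guessing at language or runtime overhead.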
        
               | staticassertion wrote:
               | My experience has been that performance bugs show up in
               | lots of places and I'm very lucky when it's just a bug.
               | The far more painful performance issues are language and
               | runtime limitations.
               | 
               | But, of course, profiling is always step one.
        
         | asveikau wrote:
         | > After identifying the small misbehaving function, we had to
         | study the C++ code pretty hard to even understand what the
         | problem was. I don't remember the exact nature of the bug, but
         | I do remember thinking that particular type of bug would be
         | hard to express in Python, and that's exactly why it was
         | accidentally fixed.
         | 
         | Pure speculation, but I would guess this has something to do
         | with a copy constructor getting invoked in a place you wouldn't
         | guess, that ends up in a critical path.
        
           | NooneAtAll3 wrote:
           | good ol' shallow-vs-deep copy
        
           | andrewflnr wrote:
           | Given the context, I'm thinking bad cache keys resulting in
           | spurious cache misses, where the keys are built in some low-
           | level way. Cache misses almost certainly have a bigger
           | asymptotic impact than extra copies, unless that copy
           | constructor is really heavy.
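
A minimal sketch of how a badly built cache key can cause spurious misses. The original bug is unknown, so this is only an illustration under an assumption: the buggy key is derived from object identity (roughly what keying on an address would do), while the fixed key is derived from the contents. All names here are hypothetical.

```python
class Cache:
    """Tiny dict-backed cache that counts hits and misses."""

    def __init__(self, key_func):
        self.key_func = key_func
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, request, compute):
        key = self.key_func(request)
        if key in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = compute(request)
        return self.store[key]


def buggy_key(request):
    # Assumed bug: the key is the object's identity, not its contents,
    # so two equal requests never share a cache entry.
    return id(request)


def canonical_key(request):
    # Fix: build the key from the sorted contents of the request.
    return tuple(sorted(request.items()))


def expensive(request):
    return sum(request.values())


# 100 distinct objects with identical contents, all kept alive in a list.
requests = [{"a": 1, "b": 2} for _ in range(100)]

buggy = Cache(buggy_key)
fixed = Cache(canonical_key)
for r in requests:
    buggy.get(r, expensive)
    fixed.get(r, expensive)

print("buggy misses:", buggy.misses, "fixed misses:", fixed.misses)
```

Both caches return correct results, which is why this kind of bug tends to stay hidden until someone profiles.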
        
             | asveikau wrote:
             | I'm just remembering a performance issue I heard of eons
             | ago where a sorting function comparison callback
             | inadvertently allocated memory. It made sorting very slow.
             | Someone said in a meeting that sorting was slow, and we all
             | had a laugh about "shouldn't have used the bubble sort!"
             | But it was the key comparison doing something stupid.
        
           | branko_d wrote:
           | My guess would be bad hashing, resulting in too many
           | collisions.
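
A minimal sketch of the bad-hashing guess, assuming a hash table that falls back to equality comparisons when hashes collide (as Python's dict does). The `Key` class and `count_comparisons` helper are hypothetical; they only count how often `__eq__` runs.

```python
class Key:
    """Wrapper whose hash function is chosen by the caller."""

    eq_calls = 0  # counts equality comparisons across all instances

    def __init__(self, value, hash_func):
        self.value = value
        self.hash_func = hash_func

    def __hash__(self):
        return self.hash_func(self.value)

    def __eq__(self, other):
        Key.eq_calls += 1
        return self.value == other.value


def count_comparisons(hash_func, n=200):
    """Insert n keys, look each one up again, return the __eq__ call count."""
    Key.eq_calls = 0
    table = {Key(i, hash_func): i for i in range(n)}
    for i in range(n):
        assert table[Key(i, hash_func)] == i
    return Key.eq_calls


good = count_comparisons(hash)          # distinct hashes per key
bad = count_comparisons(lambda v: 42)   # every key collides

print("comparisons with good hash:", good)
print("comparisons with constant hash:", bad)
```

With a constant hash every insert and lookup has to walk past the earlier keys, so the work grows roughly quadratically with the number of keys instead of linearly.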
        
         | ameixaseca wrote:
         | My experience is the exact opposite.
         | 
         | This was particularly true for one of the projects I've worked
         | with in the past, where Python was chosen as the main language
         | for a monitoring service.
         | 
         | In short, it proved itself to be a disaster: just the Python
         | process collecting and parsing the metrics of all programs
         | consumed 30-40% of the processing power of the lower end boxes.
         | 
         | In the end, the project went ahead for a while more, and we had
         | to do all sorts of mitigations to get the performance impact to
         | be less of an issue.
         | 
         | We did consider replacing it all with a few open source tools
         | written in C and some glue code. The initial prototype used a
         | few MBs instead of dozens (or even hundreds) of MBs of memory
         | while barely registering any CPU load, but in the end it was
         | deemed a waste of time when the whole project was terminated.
        
           | serial_dev wrote:
           | Another anecdote, the team couldn't improve concurrency
           | reliably in Python, they rewrote the service in about a month
           | (ten years ago) in Go, everything ran about 20x faster.
        
           | wiseowise wrote:
           | > but in the end it was deemed a waste of time when the whole
           | project was terminated.
           | 
           | The main lesson of the story. Just pick Python and move fast,
           | kids. It doesn't matter how fast your software is if nobody
           | uses it.
        
             | stephantul wrote:
             | This is it. Getting something on the table for stakeholders
             | to look at trumps anything else.
        
               | ameixaseca wrote:
               | It would have taken the same time, if not less, given the
               | extra time for mitigations, trying different optimization
               | techniques, runtimes, etc.
               | 
               | One of the reasons the project was killed was that we
               | couldn't port it to our line of low powered devices
               | without a full rewrite in C.
               | 
               | Please note this was more than a decade ago, way before
               | Rust was the language it is today. I wouldn't choose
               | anything else besides Rust today since it gives the best
               | of both worlds: a truly high level language with low
               | level resource controls.
        
             | littlestymaar wrote:
             | And this is why pretty much all commercial software is
             | terrible and runs slower than the equivalent 20 years ago
             | despite incredible advances in hardware.
        
               | philipallstar wrote:
               | For lots of software there wasn't an equivalent 20 years
               | ago because there wasn't a language that would let
               | developers explore semi-specified domains fast enough to
               | create something useful. Unless it was visual basic, but
               | we can't use that, because what would all the UX people
               | be for?
        
             | Aeolun wrote:
             | You can use Go and get the best of both worlds.
        
               | nickserv wrote:
               | One of the slowest, most inefficient code bases I've ever
               | worked on was in Go.
               | 
               | The mentality was "the language is fast, so as long as it
               | compiles we're good"... Yeah that worked out about as
               | well as you'd expect.
        
               | zeroc8 wrote:
               | But that has nothing to do with the language.
        
               | nickserv wrote:
               | Absolutely, and it's a good language when used properly.
               | This was more of a problem with the hype surrounding it.
        
             | lowbloodsugar wrote:
             | I would agree except for the python part. Sure, you gotta
             | move fast, but if you survive a year you still gotta move
             | fast, and I've never seen a python code base that was still
             | coherent after a year. Expert pythonistas will claim,
             | truthfully, that they have such a code base but the same
             | can be said of expert rustaceans. I would stick to
             | typescript or even Java. It will still be a shitshow after
             | a year but not quite as fucked as python.
        
               | miki123211 wrote:
               | https://github.com/polarsource/polar/tree/main/server
               | 
               | If you're writing FastAPI (and you should be if you're
               | doing a greenfield REST API project in Python in 2026),
               | just s/copy/steal/ what those guys are doing and you'll
               | be fine.
        
             | Someone wrote:
             | > Just pick Python and move fast, kids. It doesn't matter
             | how fast your software is if nobody uses it.
             | 
             | The reason nobody uses your software could be that it is
             | too slow. As an example, if you write a video encoder or
             | decoder, using pure Python might work for postage-stamp
             | sized video because today's hardware is insanely fast, but
             | even then, it will likely be easier to get the same speed in
             | a language that's better suited to the task.
        
               | wiseowise wrote:
               | Learning that it's too slow takes users.
        
             | bjoli wrote:
             | if input() == "dynamic scope?": defined = "happyhappy"
             | print(defined)
             | 
             | I'd rather not use python. The ick gets me every time.
        
           | czhu12 wrote:
           | Ditto for me. I had gotten so used to building web backends
           | in Ruby and running at 700MB minimum. When I finally got
           | around to writing a rust backend, it registered in the
           | metrics as 0MB, so I thought for sure the application had
           | crashed.
           | 
           | Turns out the metrics just rounded to the nearest 5MB
        
           | naasking wrote:
           | > just the Python process collecting and parsing the metrics
           | of all programs consumed 30-40% of the processing power of
           | the lower end boxes.
           | 
           | Just write the parsing loop in something faster like C or
           | Rust, instead of the whole thing.
        
           | Traubenfuchs wrote:
           | He struggled with the algorithms, you struggled with the
           | runtime.
           | 
           | You are not the same.
        
         | tda wrote:
         | One advantage of python is that it is so slow that if you
         | choose the wrong algorithm or data structure that soon gets
         | obvious. And for complicated stuff this is exactly where I find
         | the LLMs struggle. So I make a first version in Python, and
         | only when I am happy with the results and the speed feels
         | reasonable compared to the problem complexity, I ask Claude
         | Code to port the critical parts to Rust.
        
           | rabisg wrote:
           | The last part is really interesting. It feels like the whole
           | world will soon become Python/JS because thats what LLMs are
           | good at. Very few people will then take the pain of
           | optimizing it
        
             | eru wrote:
             | The LLMs are pretty good at optimising.
             | 
             | Not because they are brilliant, but because they are pretty
             | good at throwing pretty much all known techniques at a
             | problem. And they also don't tire of profiling and running
             | experiments.
        
               | elcritch wrote:
               | Not just profiling, but decoding protocols too.
               | 
               | Recently I tried Codex/GPT5 with updating a bluetooth
               | library for batteries and it was able to start capturing
               | bluetooth packets and comparing them with the library's
               | other models. It was indefatigable. I didn't even know it
               | was so easy to capture BLE packets.
        
               | anthk wrote:
               | Wireshark would do that. But you need to understand low
               | level tools, because in case of some BGP attack all you
               | LLM developers will be fired on the spot.
               | 
               | Flakey internet connection: most of current 'soy devs'
               | would be useless. Even more with boosted up chatbots.
        
               | mirsadm wrote:
               | Not in my experience. They're pretty good at getting
               | average performance which is often better than most
               | programmers seem to be willing to aim for.
        
               | miki123211 wrote:
               | If there's one thing LLMs are really, really good at,
               | it's having a target and then hitting / improving upon
               | that target.
               | 
               | If you have a comprehensive test suite or a realistic
               | benchmark, saying "make tests pass" or "make benchmark go
               | up" works wonders.
               | 
               | LLMs are really good at knowing patterns, we still need
               | programmers to know which pattern to apply when. We'll
               | soon reach a point where you'll be able to say "X is
               | slow, do autoresearch on X" and X will just magically get
               | faster.
               | 
               | The reason we can't yet isn't because LLMs are stupid,
               | it's because autoresearch is a relatively new (last month
               | or so) concept and hasn't yet entered into LLM
               | pretraining corpora. LLMs can already do this, you just
               | need to be a little bit more explicit in explaining
               | exactly what you need them to do.
        
               | philipallstar wrote:
               | I've not tried this yet, but doesn't it use up loads of
               | tokens? How do you do it efficiently?
        
             | 9rx wrote:
             | _> JS because thats what LLMs are good at._
             | 
             | That has not been my experience. JS/TS requires the most
             | hand-holding, by far. LLMs are no doubt assumed to be good
             | at JS due to the sheer amount of training data, but a lot
             | of those inputs are of really poor quality, and even among
             | the high quality inputs there isn't a whole lot of
             | consistency in how they are written. That seems to trip up
             | the LLMs. If anything, LLMs might finally be what breaks
             | the JS camel's back. Although browser dominance still makes
             | that unlikely.
             | 
             |  _> Very few people will then take the pain of optimizing
             | it_
             | 
             | Today's LLMs rarely take the initiative to write
             | benchmarks, but if you ask it will and then will iterate on
             | optimizing using the benchmark results as feedback. It
             | works fairly well. There is a conceivable near future where
             | LLMs or LLM tools will start doing this automatically.
        
               | rabisg wrote:
               | My experience is from trying to get the React Native
               | example to work with OpenUI. Felt Sonnet/Opus was much
               | better at figuring out what's wrong with the current React
               | implementation and fixing it than it was with React
               | Native
               | 
               | But yes I see what you mean and I think people are trying
               | to solve it with skills and harnesses at the application
               | layer but it's not there yet
        
         | shevy-java wrote:
         | > We immediately started moving the rest of our back end to
         | Python. Most things were slower, but not by much because most
         | of our back end was i/o bound.
         | 
         | Would be kind of cool if e. g. python or ruby could be as fast
         | as C or C++.
         | 
         | I wonder if this could be possible, assuming we could modify
         | both to achieve that as outcome. But without having a language
         | that would be like C or C++. Right now there is a strange
         | divide between "scripting" languages and compiled ones.
        
           | nubg wrote:
           | @dang this is an ai slop account, check his other comments
        
         | peter_retief wrote:
         | I suspect that you used highly optimized algorithms written for
         | python, like the vector algorithms in numpy? You will struggle
         | to write better code, at least I would.
        
           | masklinn wrote:
           | Python 1.4 would be mid-late 90s long before numpy and vector
           | algorithms would have been available.
           | 
           | I suspect it's more likely to be something like passing
           | std::string by value not realising that would copy the string
           | every time, especially with the statement that the mistake
           | would be hard to express in Python.
        
             | johnisgood wrote:
             | Everything is new to the uninitiated. :P
        
         | WalterBright wrote:
         | > We soon found out that we could make algorithmic improvements
         | so much more quickly
         | 
         | It's true that writing code in C doesn't automatically make it
         | faster.
         | 
         | For example, string manipulation. 0-terminated strings (the
         | default in C) are, frankly, an abomination. String processing
         | code is a tangle of strlen, strcpy, strncpy, strcat, all of
         | which require repeated passes over the string looking for the
         | 0. (Even worse, reloading the string into the cache just to
         | find its length makes things even slower.)
         | 
         | Worse is the problem that, in order to slice a string, you have
         | to malloc some memory and copy the string. And then carefully
         | manage the lifetime of that slice.
         | 
         | The fix is simple - use length-delimited strings. D relies on
         | them to great effect. You can do them in C, but you get no
         | succor from the language. I've proposed a simple enhancement
         | for C to make them work
         | https://www.digitalmars.com/articles/C-biggest-mistake.html but
         | nobody in the C world has any interest in it (which baffles me,
         | it is so simple!).
         | 
         | Another source of slowdown in C is I've discovered over the
         | years that C is not a plastic language, it is a brittle one.
         | The first algorithm you select for a C project gets so welded
         | into it that it cannot be changed without great difficulty.
         | (And we all know that algorithms are the key to speed, not
         | coding details.) Why isn't C plastic?
         | 
         | It's because one cannot switch back and forth between a
         | reference type and a value type without extensively rewriting
         | every use of it. For example:
         | 
         |     struct S { int a; };
         |     int foo(struct S s) { return s.a; }
         |     int bar(struct S *s) { return s->a; }
         | 
         | If you want to switch between reference and value, you've got
         | to go through all your code swapping . and ->. It's just too
         | tedious and never happens. In D:
         | 
         |     struct S { int a; }
         |     int foo(S s) { return s.a; }
         |     int bar(S *s) { return s.a; }
         | 
         | I discovered while working on D that there is _no reason_ for
         | the C and C++ -> operator to even exist, the . operator covers
         | both bases!
        
         | zeroonetwothree wrote:
         | I ported Python to C++ one time and it ran 10x faster with 10x
         | less memory usage with no architectural changes
        
       | slopinthebag wrote:
       | This article is obviously AI generated and besides being jarring
       | to read, it makes me really doubt its validity. You can get
       | substantially faster parsing versus `JSON.parse()` by parsing
       | structured binary data, and it's also faster to pass a byte array
       | compared to a JSON string from wasm to the browser. My guess is
       | not only this article was AI generated, but also their
       | benchmarks, and perhaps the implementation as well.
        
         | StilesCrisis wrote:
         | It's vibe code all the way down!
        
       | jeremyjh wrote:
       | > The openui-lang parser converts a custom DSL emitted by an LLM
       | into a React component tree.
       | 
       | > converts internal AST into the public OutputNode format
       | consumed by the React renderer
       | 
       | Why not just have the LLM emit the JSON for OutputNode ? Why is a
       | custom "language" and parser needed at all? And yes, there is a
       | cost for marshaling data, so you should avoid doing it where
       | possible, and do it in large chunks when it's not possible to
       | avoid. This is not an unknown phenomenon.
        
       | envguard wrote:
       | The WASM story is interesting from a security angle too. WASM
       | modules inheriting the host's memory model means any parsing bugs
       | that trigger buffer overreads in the Rust code could surface in
       | ways that are harder to audit at the JS boundary. Moving to
       | native TS at least keeps the attack surface in one runtime, even
       | if the theoretical memory safety guarantees go down.
        
       | marcosdumay wrote:
       | It would be great if people stopped dismissing the problem that
       | WASM not being a first-class runtime for the web causes.
        
       | kennykartman wrote:
       | I dream of the day in which there is no need to pass by JS and
       | Wasm can do all the job by itself. Meanwhile, we are stuck.
        
       | vmsp wrote:
       | Not directly related to the post but what does OpenUI do? I'm
       | finding it interesting but hard to understand. Is it an
       | intermediate layer that makes LLMs generate better UI?
        
         | rabisg wrote:
         | It's the library that bridges the gap between LLMs and live UI.
         | Best example would be to imagine you want to build interactive
         | charts within your AI agent (like Claude)
         | 
         | The most obvious approach would be to let LLMs generate code
         | and render it but that introduces problems like safety, UI
         | consistency and speed. OpenUI solves those problems and
         | provides a safe, consistent and token optimized runtime for the
         | LLMs to render live UI
        
           | aquariusDue wrote:
           | Is it kinda similar to the new GenUI SDK for Flutter in that
           | sense?
           | 
           | https://docs.flutter.dev/ai/genui
        
             | rabisg wrote:
             | Haven't looked in depth but yes it feels like they are
             | solving the same problem.
             | 
             | This is an alternative to json-render by Vercel or A2UI by
             | Google which I'm guessing the flutter implementation is
             | based on
        
       | owenpalmer wrote:
       | So this is an issue with WASM/JS interop, not with Rust per se?
        
       | measurablefunc wrote:
       | I tried a similar experiment recently w/ FFT transform for wav
       | files in the browser and javascript was faster than wasm. It was
       | mostly vibe coded Rust to wasm but FFT is a well-known algorithm
       | so I don't think there were any low hanging performance
       | improvements left to pick.
        
         | wintermute4282 wrote:
         | It looks like FFTW3 is working on wasm support:
         | https://github.com/FFTW/fftw3/issues/293
         | 
         | You could also try pretty fast fft:
         | https://github.com/JorenSix/pffft.wasm
        
       | simonbw wrote:
       | Yeah if you're serializing and deserializing data across the JS-
       | WASM boundary (or actually between web workers in general whether
       | they're WASM or not) the data marshaling costs can add up. There
       | is a way of sharing memory across the boundary though without any
       | marshaling: TypedArrays and SharedArrayBuffers. TypedArrays let
       | you transfer ownership of the underlying memory from one worker
       | (or the main thread) to another without any copying.
       | SharedArrayBuffers allow multiple workers to read and write to
       | the same contiguous chunk of memory. The downside is that you
       | lose all the niceties of any JavaScript types and you're
       | basically stuck working with raw bytes.
       | 
       | You still do get some latency from the event loop, because
       | postMessage gets queued as a MacroTask, which is probably on the
       | order of 10ms. But this is the price you have to pay if you want
       | to run some code in a non-blocking way.
        
         | jesse__ wrote:
         | This should be the top comment
        
         | osullivj wrote:
         | Strongly agree from an Emscripten C++ wasm pov: it's key to
         | minimise emscripten::val roundtrips. Caches must be designed
         | for rectilinear data geometry, and SharedArrayBuffers are the
         | way for bulk data. But only JS allows us to express asynchrony,
         | so we need an on_completion callback design at the lang
         | boundary.
        
           | tankenmate wrote:
           | Indeed a whole class of issues become moot if you just don't
           | use javascript anywhere. In the browser world this is
           | obviously difficult/impossible; I look forward to the day
           | when WASM can run natively in a browser and doesn't need
           | javascript at all, DOM, network, etc, etc. On the server
           | side? Just steer clear of the javascript ecosystem
           | altogether.
        
         | fHr wrote:
         | So the actual processing is faster in rust/c/c++ but the
         | marshaling costs are so big that ts is faster in this case? No
         | clue how something like swc does this, but there it's way
         | faster than babel.
        
       | sakesun wrote:
       | I heard a lot of similar stories in the past when I started using
       | Python 20+ years ago. A number of people claimed their solutions
       | got faster when developed in Python, mainly because Python makes
       | it easier to quickly pivot and experiment with various
       | alternative methods, and hence to arrive at a more efficient
       | outcome in the end.
        
       | horacemorace wrote:
       | I'm more of a dabbler dev/script guy than a dev but Every.
       | single. thing I ever write in javascript ends up being incredibly
       | fast. It forces me to think in callbacks and events and promises.
       | Python and C (or async!) seem easy and sorta lazy in comparison.
        
       | jesse__ wrote:
       | This somehow reminds me of the days when the fastest way to deep
       | copy an object in javascript was to round trip through toString.
       | I thought that was gross then, and I think this is gross now
        
       | athrowaway3z wrote:
       | It's also worth underlining that it's not just "The parsing
       | computation is fast enough that V8's JIT eliminates any Rust
       | advantage", but specifically that this kind of straight-forward
       | well-defined data structures and mutation, without any strange
       | eval paths or global access is going to be JITed to near native
       | speed relatively easily.
        
       | mwcampbell wrote:
       | I hope we can still get to a point where wasm modules can
       | directly access the web platform APIs and get JS out of the
       | picture entirely. After all, those APIs themselves are
       | implemented in C++ (and maybe some Rust now).
        
       | shevy-java wrote:
       | So ...
       | 
       | Rust.
       | 
       | WASM.
       | 
       | TypeScript.
       | 
       | I am slowly beginning to understand why WASM did not really
       | succeed.
        
       | bulbar wrote:
       | Is this an outlier or has Rust started to be part of the
       | establishment and being 'old' so that people want to share their
       | "moving away from Rust" stories?
       | 
       | I didn't mind reading articles that are not about how Rust is
       | great in theory (and maybe practice).
        
         | quotemstr wrote:
         | There's a certain segment of the industry that's always chasing
         | the newest thing. Many of them like Zig for some ghastly
         | reason.
         | 
         | That said, Rust does have real problems. Manual memory
         | management _sucks_. People think GC is expensive? Well, keep in
         | mind malloc() and free() take global locks! People just have
         | totally bogus mental models of what drives performance. These
         | models lead them to technical nonsense.
        
         | zozbot234 wrote:
         | This story is about moving away from WASM for an application
         | that's unsuitable for it. It's not really about Rust.
        
           | notnullorvoid wrote:
           | It's not an unsuitable application for WASM. They could've
           | drastically reduced the WASM boundary impact if instead of
           | mapping to JSON in Rust they streamed out structured bytes to
           | JS then mapped to JSON there. And the streaming fix was
           | language independent.
           | 
           | So it's more so a story about architectural mistakes.
        
       | Dwedit wrote:
       | JS and WASM share the main arraybuffer. It's just very not-
       | javascript-like to try to use an arraybuffer heap, because then
       | you don't have strings or objects, just index,size pairs into
       | that arraybuffer.
       | 
       | Anyway, Javascript is no stranger to breaking changes. Compare
       | Chromium 47 to today. Just add actual integers as another
       | breaking change, then WASM becomes almost unnecessary.
        
       | fHr wrote:
       | I almost can't believe this; swc, for example, is 80x faster than
       | babeljs.
        
       | gettingoverit wrote:
       | In ye olden days of WASM just added to the browser, the
       | difference between native JS and boost::spirit in WASM was x200.
       | 
       | In their worst case it was just x5. We clearly have some progress
       | here.
        
       | pjmlp wrote:
       | This is why, when a programming language already has tooling for
       | compilers, being it ahead of time, or dynamic, it pays off to
       | first go around validating algorithms and data structures before
       | a full rewrite.
       | 
       | Additionally, even after those options are exhausted, only a few
       | key parts might need a rewrite, not the whole thing.
       | 
       | However, I wonder how many care about actually learning about
       | algorithms, data structures and mechanical sympathy in the age of
       | Electron apps.
       | 
       | It feels quite often that a rewrite is chosen, because knowing
       | how to actually apply those skills is the CS stuff many think
       | isn't worthwhile learning about.
        
         | coldtea wrote:
         | > _However, I wonder how many care about actually learning
         | about algorithms, data structures and mechanical sympathy in
         | the age of Electron apps._
         | 
         | Never mind the age of Electron apps, even fewer care about
         | those in the age of agents.
        
           | pjmlp wrote:
           | Agreed, however I would assert that in the age of agents,
           | programming languages will become irrelevant to most, other
           | than those druids lucky enough to write the AI runtime
           | stack at the AI overlords.
           | 
           | And those will still care about CS.
        
       | moomin wrote:
       | "We saw huge speed-ups when changing technology."
       | 
       | Looks inside
       | 
       | "The old implementation had some really inappropriate choices."
       | 
       | Every time.
        
       | LunaSea wrote:
       | This has been known by Node.js developers for a while with many
       | C++ core and NPM modules being rewritten in JavaScript to improve
       | performance.
        
       | bluelightning2k wrote:
       | Great write up. It feels like craft in the age of slop.
       | 
       | Not sold about the fundamental idea of OpenUI though. XML is a
       | great fit for DSLs and UI snippets.
        
         | twoodfin wrote:
         | Are you kidding? To the extent this was "crafted" it was by an
         | LLM from somebody's notes in a prompt.
         | 
         | The other day, someone linked back to this 2018 post on finding
         | a cache coherency bug in the Xbox 360 CPU:
         | 
         | https://randomascii.wordpress.com/2018/01/07/finding-a-cpu-d...
         | 
         | So much more genuinely engaging than _any_ of the AI-"enhanced"
         | sloppy, confused, trite writing that gets to the front page
         | here daily because it's been hyper-optimized for upvotes.
        
         | rabisg wrote:
         | We tried all formats - XML, json, jsonl, even toon - before
         | deciding that we need to invest in OpenUI Lang
         | 
         | The primary motivation was speed and schema cohesion. We were
         | running a JSON based format, Thesys C1, in production for a
         | year before we realized we cannot add features fast enough
         | because we were fighting the LLMs at multiple levels. It's
         | probably too much to write in a comment but we'd like to write
         | about the motivation and all the things we tried in a separate
         | blog post soon.
        
       | diablevv wrote:
       | The real lesson here isn't "TypeScript beats Rust" - it's that
       | WASM has non-trivial overhead that's easy to underestimate. The
       | JS engine has spent decades being optimized specifically for the
       | patterns JS/TS code tends to produce. When you cross the WASM
       | boundary, you pay for it: serialization, memory copies, the
       | impedance mismatch between WASM's linear memory model and JS's
       | garbage-collected heap.
       | 
       | For a parser specifically, you're probably spending a lot of time
       | creating and discarding small AST nodes. That's exactly the kind
       | of workload where V8's generational GC shines and where WASM's
       | manual memory management becomes a liability rather than an
       | asset.
       | 
       | The interesting question is whether this scales. A parser that
       | runs on small inputs in a browser is a very different beast from
       | one processing multi-megabyte files in a tight loop. At some
       | point the WASM version probably wins - the question is whether
       | that workload actually exists in your product.
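
A rough TypeScript sketch of the per-call boundary cost the comment above lists (serialization plus memory copies). No real WASM module is loaded here: a plain `WebAssembly.Memory` stands in for the module's linear memory, and `copyIn`, `copyOut` and `roundTrip` are hypothetical helpers, not anything from the article's code.

```typescript
// Sketch only: the linear memory stands in for a WASM module's heap so
// that the copy steps of a single call are visible.
const memory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Copy a JS string into linear memory as UTF-8; returns the byte length.
function copyIn(text: string, offset: number): number {
  const bytes = encoder.encode(text);
  new Uint8Array(memory.buffer).set(bytes, offset); // throws if it does not fit
  return bytes.length;
}

// Copy bytes out of linear memory and decode them into a fresh JS string.
function copyOut(offset: number, length: number): string {
  return decoder.decode(new Uint8Array(memory.buffer, offset, length));
}

function roundTrip<T>(value: T): T {
  const json = JSON.stringify(value); // 1. serialize on the JS side
  const length = copyIn(json, 0);     // 2. copy into linear memory
  // (a real module would parse here and write its result back)
  const result = copyOut(0, length);  // 3. copy back out
  return JSON.parse(result) as T;     // 4. rebuild garbage-collected objects
}

const ast = { kind: "Text", value: "hi", children: [] as unknown[] };
const copy = roundTrip(ast);
```

Each of the four steps is linear in the payload size and is paid on every call, which is why small, frequent calls tend to be the worst case for this kind of boundary.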
        
       | mohsen1 wrote:
       | When there is a solid test harness, AI Coding can do magic!
       | 
       | It was able to beat XZ on its own game by a good margin:
       | 
       | https://github.com/mohsen1/fesh
        
         | applfanboysbgon wrote:
         | > I had no idea how any of this works.
         | 
         | This is apparent. xz's own game is not "a specialized
         | compression pre-processor for x86_64 ELF binaries". xz's own
         | game is a general-purpose compression utility suited for a
         | range of tasks, not optimized for one ridiculously specific
         | domain. Also, any compression benchmark really ought to include
         | speed of de/compression, not only compression ratio, as
         | compression algorithms sit along a scale, each trying to
         | maximize one trade-off or another.
        
           | mohsen1 wrote:
           | I never claimed to beat xz as a general-purpose compressor.
           | .tar.xz is the dominant format for Linux source tarballs and
           | distro packages. So optimizing for ELF + x86_64 is optimizing
           | for a very real and common case, not some toy benchmark.
           | 
           | btw the goal of the project was _not_ building a production
           | ready solution. It was a curious case of black box software
           | development. Compression is great because input and output
           | are precise bits. As for speed, I think it's comparable
           | since it's using most of XZ infra anyways.
        
       | rpodraza wrote:
       | Press x to doubt
        
       | gavinray wrote:
       | Why weren't you able to use WASM shared heaps to get zero-copy
       | behavior?
       | 
       | AFAIK, you can create a shared memory block between WASM <-> JS:
       | 
       | https://developer.mozilla.org/en-US/docs/WebAssembly/Referen...
       | 
       | Then you'd only need to parse the SharedArrayBuffer at the end on
       | the JS side
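
A small TypeScript sketch of the shared-memory idea from the comment above, assuming an environment where `SharedArrayBuffer` is available (in browsers that generally means a cross-origin isolated page). No WASM module is instantiated; `producerView` stands in for the WASM side, and the three-word record layout is invented for illustration.

```typescript
// Sketch: a shared WebAssembly.Memory is backed by a SharedArrayBuffer,
// so views created on either side see the same bytes without a copy.
// Shared memories must declare a maximum size.
const shared = new WebAssembly.Memory({ initial: 1, maximum: 1, shared: true });

// Two independent views over the same buffer: one standing in for the
// WASM side (producer), one for the JS side (consumer).
const producerView = new Uint32Array(shared.buffer);
const consumerView = new Uint32Array(shared.buffer);

const WORDS_PER_RECORD = 3; // hypothetical layout: [kind, start, end]

function writeRecord(index: number, kind: number, start: number, end: number): void {
  producerView.set([kind, start, end], index * WORDS_PER_RECORD);
}

function readRecord(index: number): { kind: number; start: number; end: number } {
  const base = index * WORDS_PER_RECORD;
  return {
    kind: consumerView[base],
    start: consumerView[base + 1],
    end: consumerView[base + 2],
  };
}

writeRecord(0, 1, 0, 12);
writeRecord(1, 2, 12, 40);
const second = readRecord(1);
```

Only the numeric records are zero-copy. Turning them into JS strings or objects at the end still costs a decode, which is the "parse the SharedArrayBuffer at the end" step the comment mentions.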
        
       ___________________________________________________________________
       (page generated 2026-03-21 23:01 UTC)