[HN Gopher] Optimizing Ruby's JSON, Part 1
       ___________________________________________________________________
        
       Optimizing Ruby's JSON, Part 1
        
       Author : todsacerdoti
       Score  : 224 points
       Date   : 2024-12-18 00:08 UTC (22 hours ago)
        
 (HTM) web link (byroot.github.io)
 (TXT) w3m dump (byroot.github.io)
        
       | hahahacorn wrote:
        | Great read & great work from the author. Is there any reason to
        | use Oj going forward?
        
         | byroot wrote:
         | Author here.
         | 
          | Oj has an extremely large API that I have no intention of
          | emulating in the default json gem, things such as "SAJ" (SAX-
          | style parsing), various escaping schemes, etc.
          | 
          | My goal is only to make it unnecessary for the 95% or so use
          | case, so yes, Oj will remain useful to some people for a bunch
          | of use cases.
        
           | onli wrote:
            | SAX-style parsing is a godsend when dealing with large files,
            | whether JSON or XML. It's indeed what made me switch to a
            | different json library in a Ruby project of mine (I'd have to
            | look it up, but probably to oj).
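            | 
            | For anyone curious, the event-driven style looks roughly
            | like this (a minimal sketch assuming Oj's SAJ callback API;
            | the handler class and file name are made up):
            | 
            |     require "oj"
            |     
            |     # Counts leaf values without ever building the full
            |     # Ruby object tree in memory.
            |     class CountingHandler
            |       attr_reader :count
            |     
            |       def initialize
            |         @count = 0
            |       end
            |     
            |       def add_value(value, key)
            |         @count += 1
            |       end
            |     
            |       # No-op structural callbacks.
            |       def hash_start(key); end
            |       def hash_end(key); end
            |       def array_start(key); end
            |       def array_end(key); end
            |       def error(message, line, column); end
            |     end
            |     
            |     handler = CountingHandler.new
            |     Oj.saj_parse(handler, File.read("large.json"))
            |     puts handler.count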
        
       | thiago_fm wrote:
        | I love byroot's work. I'm always surprised not only by the kind
        | of contributions he makes but also by the sheer volume of them.
        | Insane productivity.
       | 
        | Wish he would write more often. I've tried to get into ruby-core
        | type of work more than once, but never found something that
        | matched my skills well enough to contribute positively, and
        | after a few weeks of no results the motivation would wear off;
        | it's really difficult to build the kind of context he shares in
        | the article, for example.
       | 
        | If more of the Ruby C folks wrote more often, I bet there'd be
        | more people with the skills needed to improve Ruby further.
       | 
        | The C profiler advice was great. Maybe I could just grab a Ruby
        | gem with C code and start playing with optimizations again :-)
        
         | benoittgt wrote:
         | There is this great serie of Peter Zhu too.
         | https://blog.peterzhu.ca/ruby-c-ext/ Even if it's C extension,
         | it helps understanding some concepts.
         | 
         | But I agree with you.
        
           | thiago_fm wrote:
           | That's awesome, I personally wasn't aware of that series from
           | Peter Zhu. Thanks!
        
         | wkjagt wrote:
         | > insane productivity
         | 
         | He's insanely productive, but also insanely smart. I used to
         | work in the same office as him at Shopify, and he's the kind of
         | person whose level just seems unattainable.
        
           | richardlblair wrote:
           | Fully agree. He's also really patient and kind. He's the type
           | of person who's really really smart and doesn't make you feel
           | really really dumb. He takes the time to thoroughly explain
           | things, without being condescending. I always enjoyed when
           | our paths crossed because I knew I was going to learn
           | something.
        
       | mfkp wrote:
       | Love the write-up on this topic, very easy to follow and makes me
       | want to benchmark and optimize some of my ruby code now. Thanks
       | for putting in the effort and also writing the post, byroot!
        
       | bornelsewhere wrote:
       | Does ruby json use intrinsics? Could it?
       | 
       | Also, how does this play with the various JITs?
        
         | byroot wrote:
          | Not too sure what you mean by intrinsics.
         | 
         | The `json` gem is implemented in C, so it's a black box for
         | YJIT (the reference implementation's JIT).
         | 
          | The TruffleRuby JIT used to interpret C extensions with Sulong
          | so it could JIT across the language barrier, but AFAIK they
          | recently stopped doing that because of various compatibility
          | issues.
         | 
          | Also, on TruffleRuby the JSON parser is implemented in C, but
          | the encoder is in pure Ruby [0].
         | 
         | [0]
         | https://github.com/ruby/json/blob/e1f6456499d497f33f69ae4c1a...
        
           | bornelsewhere wrote:
           | Thanks!
           | 
           | Sorry about misuse of "intrinsics". There is a simdjson
           | library that uses SIMD instructions for speed. Would such an
           | approach be feasible in the ruby json library?
        
             | byroot wrote:
             | Ah I see.
             | 
              | TL;DR: it's possible, but it's a lot of work, and not that
              | huge of a gain in the context of a Ruby JSON parser.
             | 
              | `ruby/json` doesn't use explicit SIMD instructions; some
              | routines are written in a way that somewhat expects
              | compilers to be able to auto-vectorize them, but that's
              | never a given.
             | 
              | In theory using SIMD would be possible, as proven by
              | simdjson, but it's very (edit) UNlikely we'll do it, for
              | multiple reasons.
             | 
              | First, for portability we have to stick with raw C99, no
              | C++ allowed, so that prevents using simdjson outright.
             | 
              | In theory we could implement the same sort of logic
              | ourselves, but supporting the various processors that have
              | various levels of SIMD support, plus the runtime dispatch
              | for it, would be terribly tedious. It's not a reasonable
              | amount of complexity for the amount of time I and other
              | people are willing to spend on the library.
             | 
              | Then there's the fact that it wouldn't make as big a
              | difference as you'd think. I do happen to have made some
              | bindings for simdjson in
              | https://github.com/Shopify/heap-profiler, because I had a
              | use case for parsing gigabytes of JSON, and it helps quite
              | a bit there.
             | 
              | But as I'll hopefully touch on in a future blog post, the
              | actual JSON parsing part is entirely dwarfed by the work
              | needed to build the resulting tree of Ruby objects.
        
               | thiago_fm wrote:
               | Curious about the next post.
               | 
                | My naive/clueless mind always wonders if it wouldn't make
                | sense to introduce a new, much simpler class of Ruby
                | objects that would allow both lower memory consumption
                | and GC optimizations for such cases.
                | 
                | Without a different object model it's hard to imagine
                | optimizations that could greatly improve Ruby execution
                | speed for CRuby, or make the GC much faster (a huge issue
                | for big applications), but maybe that's because I don't
                | know much :-)
        
       | izietto wrote:
       | First, if the author is going to read this, let me thank you for
       | your work. As a Rails developer, I find the premises very
       | relatable.
       | 
        | Again, as a Rails developer, a pain point for me is the
        | different naming conventions for Ruby hash keys versus JS/JSON
        | object keys: JavaScript/JSON typically uses camelCase, while
        | Ruby uses snake_case. This forces me to perform tedious and
        | often disliked transformations between these conventions in my
        | Rails projects, requiring remapping for every JSON object. This
        | process is both annoying and potentially performance-intensive.
        | What alternative approaches exist, and are there ways to improve
        | the performance of these transformations?
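        | 
        | For illustration, this is the kind of per-object remapping I
        | mean (a minimal sketch assuming ActiveSupport's
        | `deep_transform_keys`, `underscore` and `camelize`; the payload
        | is made up):
        | 
        |     require "active_support/core_ext/hash/keys"
        |     require "active_support/core_ext/string/inflections"
        |     
        |     payload = {
        |       "userName" => "Ada",
        |       "contactInfo" => { "emailAddress" => "ada@example.com" }
        |     }
        |     
        |     # camelCase -> snake_case on the way in
        |     ruby_hash =
        |       payload.deep_transform_keys { |k| k.to_s.underscore }
        |     # keys become "user_name", "contact_info", "email_address"
        |     
        |     # snake_case -> camelCase on the way out
        |     json_hash =
        |       ruby_hash.deep_transform_keys { |k| k.to_s.camelize(:lower) }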
        
         | rajaravivarma_r wrote:
         | I don't have a solution for the performance problem. But for
         | the camelCase to snake_case conversion, I can see potential
         | solutions.
         | 
         | 1. If you are using axios or other fetch based library, then
         | you can use an interceptor that converts the camelCase
         | JavaScript objects to 'snake_case' for request and vice versa
         | for response.
         | 
          | 2. If you want to control that on the app side, then you can
          | use a helper method in ApplicationController, say
          | `json_params`, that returns the JSON object with snake_case
          | keys. Similarly, wrap the `render json: json_object` call in a
          | helper method like `render_camel_case_json_response` and use
          | that in all the controllers. You can write a custom RuboCop
          | cop to make this behaviour consistent. (A rough sketch of this
          | option follows after this list.)
         | 
         | 3. Handle the case transformation in a Rack middleware. This
         | way you don't have to enforce developers to use those helper
         | methods.
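          | 
          | Roughly, option 2 could look like this (a minimal sketch
          | assuming ActiveSupport's key and inflection helpers; method
          | names are just examples):
          | 
          |     class ApplicationController < ActionController::Base
          |       private
          |     
          |       # Incoming camelCase params -> snake_case keys.
          |       def json_params
          |         @json_params ||= params.to_unsafe_h
          |           .deep_transform_keys { |k| k.to_s.underscore }
          |       end
          |     
          |       # Outgoing snake_case data -> camelCase JSON.
          |       def render_camel_case_json_response(object, status: :ok)
          |         payload = object.as_json
          |           .deep_transform_keys { |k| k.to_s.camelize(:lower) }
          |         render json: payload, status: status
          |       end
          |     end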
        
           | thiago_fm wrote:
            | I believe his point is that this transformation could maybe
            | be done in C and therefore perform better; it could be a
            | flag to the JSON conversion.
            | 
            | I find the idea good, maybe it even already exists?
        
             | byroot wrote:
             | It could be done relatively efficiently in C indeed, but it
             | would be yet another option, imposing et another
             | conditional, and as I mention in the post (and will keep
             | hammering in the followups) conditions is something you
             | want to avoid for performance.
             | 
             | IMO that's the sort of conversion that would be better
             | handled by the "presentation" layer (as in
             | ActiveModel::Serializers and al).
             | 
              | In these gems you usually define something like:
              | 
              |     class UserSerializer < AMS::Serializer
              |       attributes :first_name, :email
              |     end
             | 
             | It wouldn't be hard for these libraries to apply a
             | transformation on the attribute name at almost zero cost.
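              | 
              | A minimal sketch of what I mean, with the key transformed
              | once at class-definition time rather than on every render
              | (hypothetical serializer internals, assuming ActiveSupport's
              | `camelize`):
              | 
              |     require "active_support/core_ext/string/inflections"
              |     
              |     # The JSON key is computed once per class, not once
              |     # per rendered object.
              |     class Serializer
              |       def self.attributes(*names)
              |         @attribute_keys ||= {}
              |         names.each do |name|
              |           @attribute_keys[name] = name.to_s.camelize(:lower)
              |         end
              |       end
              |     end
              |     
              |     class UserSerializer < Serializer
              |       attributes :first_name, :email
              |     end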
        
         | xcskier56 wrote:
         | We use a gem called olive branch. Yes it's going to give you a
         | performance hit, but it keeps you sane which is very worthwhile
        
           | izietto wrote:
            | This one? https://github.com/vigetlabs/olive_branch Looks
            | interesting, but unfortunately its latest update is from 3
            | years ago.
        
             | sensanaty wrote:
             | If you look at the source [1], you'll see what it's doing
             | is very simple (the file I linked is basically the whole
             | library, everything else is Gem-specific things and tests).
             | You can even skip the gem and implement it yourself, not a
             | big dependency at all, so no need for constant maintenance
             | in this case :p
             | 
             | [1] https://github.com/vigetlabs/olive_branch/blob/main/lib
             | /oliv...
        
             | werdnapk wrote:
             | I don't understand the issue with it not being updated for
             | 3 years. Perhaps it's stable and requires no updates?
             | 
             | If the author says it's no longer maintained, then that's
             | something different.
        
               | izietto wrote:
                | This is my concern:
                | https://github.com/vigetlabs/olive_branch/blob/8bd792610945f...
               | 
               | No guarantee it's going to work with Ruby > 2.7 || Rails
               | > 6.1.
               | 
               | When I upgrade components I want to know other components
               | are not going to break anything.
        
               | werdnapk wrote:
                | Ok, updates regarding dependencies are definitely a good
                | point.
        
         | revskill wrote:
         | Me too. And even crystal language has the same issue.
        
         | SkyPuncher wrote:
         | I love Rails, but if I could go back in time and tell them to
         | avoid one thing it would be their strict adherence to naming
         | conventions.
         | 
         | I've spent more time in my career debugging the magic than I
         | would have by simply defining explicit references.
        
           | caseyohara wrote:
           | > if I could go back in time and tell them to avoid one thing
           | it would be their strict adherence to naming conventions
           | 
           |  _Monkey paw curls._ Rails probably wouldn't have reached
           | popularity were it not for the strict adherence to naming
           | conventions. The Rails value prop is productivity and the
           | ethos of "convention over configuration" is what makes that
           | possible.
           | 
           | "Convention over configuration" was coined by DHH himself.
           | https://en.wikipedia.org/wiki/Convention_over_configuration
           | Without it, you don't have Rails.
        
             | SkyPuncher wrote:
             | The problem is the convention is not always clear or
             | consistent. When it breaks, it can be very difficult to
             | debug.
             | 
              | In most cases, we're talking about a trivial amount of
              | extra code. Things like a handful of config lines at the
              | top of a class.
        
           | jkmcf wrote:
           | I'm ok with most of the naming conventions, but the
           | pluralization is one I loathe. The necessity of custom
           | inflections should have been a strong smell, IMO.
        
             | regularfry wrote:
             | Semantically it makes sense. It's not Rails' fault the
             | English language is a terrible serialisation format.
        
               | dmurray wrote:
               | But it can be Rails' fault for choosing to default to a
               | terrible serialisation format.
        
               | ysavir wrote:
               | What would you propose as an alternative?
        
             | SkyPuncher wrote:
              | I agree. I think it also would have been better to
              | explicitly name "collections" as distinct from "items".
             | 
             | GooseCollection is more meaningful than Geese (or Gooses).
        
         | ysavir wrote:
         | > Again, as a Rails developer, a pain point is different naming
         | conventions regarding Ruby hash keys versus JS/JSON object
         | keys. JavaScript/JSON typically uses camelCase, while Ruby uses
         | snake_case.
         | 
         | Most APIs I've come across use snake_case for their keys in
         | JSON requests and responses. I rarely come across camelCase in
         | JSON keys. So I'm happy to just write snake_case keys and let
         | my backend stay simple and easy, and let the API consumer
         | handle any transformations.
         | 
         | I use the same approach another comment points out, using Axios
         | transformers to convert back and forth as necessary.
        
         | BurningFrog wrote:
        | Sounds like something you could fix once and for all with some
        | metaprogramming converting between the two casing conventions?
        
       | rfl890 wrote:
       | IIRC the branch predictor hint is useless on modern CPUs
        
         | Someone wrote:
         | It _was_ useless on modern CPUs, but has become somewhat useful
         | again on some CPUs. https://www.phoronix.com/news/GCC-Clang-
         | Intel-x86-Branch-Hin...:
         | 
         |  _"Starting with the Redwood Cove microarchitecture, if the
         | predictor has no stored information about a branch, the branch
         | has the Intel SSE2 branch taken hint (i.e., instruction prefix
         | 3EH), When the codec decodes the branch, it flips the branch's
         | prediction from not-taken to taken. It then flushes the
         | pipeline in front of it and steers this pipeline to fetch the
         | taken path of the branch.
         | 
         | ...
         | 
         | The hint is only used when the predictor does not have stored
         | information about the branch. To avoid code bloat and reducing
         | the instruction fetch bandwidth, don't add the hint to a branch
         | in hot code--for example, a branch inside a loop with a high
         | iteration count--because the predictor will likely have stored
         | information about that branch. Ideally, the hint should only be
         | added to infrequently executed branches that are mostly taken,
         | but identifying those branches may be difficult. Compilers are
         | advised to add the hints as part of profile-guided
         | optimization, where the one-sided execution path cannot be laid
         | out as a fall-through. The Redwood Cove microarchitecture
         | introduces new performance monitoring events to guide hint
         | placement."_
        
       | meisel wrote:
       | Very fun read! I'm curious though, when it comes to non-Ruby-
       | specific optimizations, like the lookup table for escape
       | characters, why not instead leverage an existing library like
       | simdjson that's already doing this sort of thing?
        
         | byroot wrote:
         | I somewhat answered that in
         | https://news.ycombinator.com/item?id=42450085
         | 
          | In short, since `ruby/json` ships with Ruby, it has to be
          | compatible with its constraints, which today means plain C99
          | and no C++. There would also probably be a licensing issue
          | with simdjson (Apache 2), but I'm not sure.
         | 
          | Overall there's a bunch of really nice C++ libraries I'd love
          | to use, like dragonbox, but I just can't.
         | 
          | Another thing is that, last time I checked, simdjson only
          | provided a parser; the ruby/json gem does both parsing and
          | encoding, so it would only help with half the problem space.
        
           | meisel wrote:
            | yyjson is a very fast, C89-compliant C library that can both
            | parse and generate JSON.
        
             | mort96 wrote:
             | The benefit of a Ruby-specific JSON parser is that the
             | parser can directly output Ruby objects. Generic C JSON
             | parsers generally have their own data model, so instead of
             | just parsing JSON text into Ruby objects you'd be parsing
             | JSON text into an intermediate data structure and then walk
             | that to generate Ruby objects. That'd necessarily use more
             | memory, and it'd probably be slower too unless the parser
             | is _way_ faster.
             | 
             | Same applies to generating JSON: you'd have to first walk
             | the Ruby object graph to build a yyjson JSON tree, then
             | hand that over to yyjson.
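              | 
              | Conceptually (with a made-up Ruby binding for yyjson; none
              | of these classes or methods are real), the extra walk
              | looks like:
              | 
              |     doc = YYJSON.parse(json_text)   # 1. yyjson's own tree
              |     
              |     # 2. walk that tree a second time to build Ruby objects
              |     def to_ruby(node)
              |       case node.type
              |       when :object
              |         node.each_pair.to_h { |k, v| [k, to_ruby(v)] }
              |       when :array
              |         node.map { |v| to_ruby(v) }
              |       else
              |         node.value
              |       end
              |     end
              |     
              |     result = to_ruby(doc.root)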
        
               | meisel wrote:
                | All of that would be a big savings in code complexity and
               | a win for reliability, compared to doing new untested
               | optimizations. If memory usage is a concern, I'm sure
               | there's a fast C SAX parser out there (or maybe one
               | within yyjson)
        
               | mort96 wrote:
               | I don't understand what you're getting at. If performance
               | is a concern, integrating a different parser written in C
               | isn't desirable, as it would probably be slower than the
               | existing parser for the reasons I mentioned (or at least
               | be severely slowed down by the conversion step), so you
               | need to optimize the Ruby-specific parser. If performance
               | isn't a concern, keeping the old, battle-tested Ruby
               | parser unmodified would surely be better for reliability
               | than trying to integrate yyjson.
        
               | meisel wrote:
               | Take a look at SAX parsers
        
         | akira2501 wrote:
         | What I love about this article is it's actual engineering work
         | on an existing code base. It doesn't seek to just replace
         | things or swap libraries in an effort to be marginally faster.
         | It digs into the actual code and seeks to genuinely improve it
         | not only for speed but for efficiency. This simply does not get
         | done enough in modern projects.
         | 
          | I wonder, if it were done more regularly, would we even end up
          | with libraries like simdjson or oj in the first place? The
          | problem domain simply isn't _that_ hard.
        
       | semiquaver wrote:
       | Part 2 is available:
       | https://byroot.github.io/ruby/json/2024/12/18/optimizing-rub...
        
       | lexicality wrote:
       | Possibly I've missed it, but is there anything saying how long
       | the new version takes to parse/encode the Twitter JSON dump with
       | all optimisations applied?
        
         | byroot wrote:
         | There's quite a few more commits to go before the final result,
         | but you can see it in the release notes:                 -
         | https://github.com/ruby/json/releases/tag/v2.7.3       -
         | https://github.com/ruby/json/releases/tag/v2.8.0
        
           | _gtly wrote:
           | Unrelated to json, but a Rails commit by you 2024-10-29:
           | ActiveModel::Type::Integer#serialize 3.17x - 9.67x faster!
           | 
           | https://github.com/rails/rails/commit/70ca4ab91af15714cae4e8.
           | ..
        
           | kristianp wrote:
           | Note that urls in preformatted text are not rendered as
           | links.
           | 
           | Clickable:
           | 
           | https://github.com/ruby/json/releases/tag/v2.7.3
           | 
           | https://github.com/ruby/json/releases/tag/v2.8.0
        
       | Lammy wrote:
       | Great work and great articles.
       | 
       | > Yet another patch in Mame's PR was to use one of my favorite
       | performance tricks, what's called a "lookup table".
       | 
       | One thing stood out to me here as a fellow lookup-table-liker
       | that I would like to mention even though it is probably not
       | relevant to a generic JSON generator/parser which has to handle
       | arbitrary String Encodings.
       | 
       | The example optimized code uses `String#each_char` which incurs
       | an extra object allocation for each iteration compared to
       | `String#each_codepoint` which works with Immediates. If you are
       | parsing/generating something where the Encoding is guaranteed,
       | defining the LUT in terms of codepoints saves a bunch of GC
       | pressure from the throwaway single-character String objects which
       | have to be collected. One of the pieces of example code even uses
        | `#each_char` and _then_ compares its `#ord`, so it's already
       | halfway there.
       | 
        | I don't feel like compiling 3.4-master to test it too, but I just
        | verified this in Ruby 3.3.6:
        | 
        |     irb(main):001> "aaa".-@.each_char { p _1.object_id }
        |     4440
        |     4460
        |     4480
        |     => "aaa"
        |     irb(main):002> "aaa".-@.each_codepoint { p _1.object_id }
        |     24709
        |     24709
        |     24709
        |     => "aaa"
        |     irb(main):003> RUBY_VERSION
        |     => "3.3.6"
       | 
        | Apologies for linking my own hobby codebase, but here are two
        | examples of my own simple LUT parsers/generators where I achieved
        | even more performance by collecting the codepoints and then
        | turning them into a `String` in a single shot with `Array#pack`
        | (a tiny standalone sketch of that shape follows the links):
       | 
       | - One from my filetype-guessing library that turns Media Type
       | strings into key `Structs` both when parsing the shared-mime-info
       | Type definitions and when taking user input to get the Type
       | object for something like `image/jpeg`:
       | https://github.com/okeeblow/DistorteD/blob/fbb987428ed14d710...
       | (Comment contains some before-and-after allocation comparisons)
       | 
       | - One from my support library that turns POSIX Globs into Ruby
       | `Regexp` objects like Python's stdlib `fnmatch.translate`:
       | https://github.com/okeeblow/DistorteD/blob/NEW%E2%80%85SENSA...
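        | 
        | The general shape, as a minimal runnable sketch (the table here
        | just uppercases ASCII as a stand-in for a real translation
        | table):
        | 
        |     # Hypothetical LUT: maps each ASCII codepoint to a
        |     # replacement codepoint.
        |     LUT = Array.new(128) { |cp| cp.between?(97, 122) ? cp - 32 : cp }
        |     
        |     def translate(str)
        |       out = []
        |       str.each_codepoint do |cp|   # Integers only, no 1-char Strings
        |         out << (LUT[cp] || cp)     # pass through codepoints beyond the table
        |       end
        |       out.pack("U*")               # build the result String in one shot
        |     end
        |     
        |     translate("hello, json")       # => "HELLO, JSON"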
       | 
       | Disclaimer: I haven't benchmarked this with YJIT which might
       | render my entire experience invalid :)
        
         | byroot wrote:
         | > The example optimized code uses `String#each_char` which
         | incurs an extra object allocation for each iteration
         | 
          | You're right. To be closer to the real C code, I should have
          | used `each_byte`, given the C code works with bytes, not
          | codepoints or characters.
         | 
         | But this is mostly meant as pseudo-code to convey the general
         | idea, not as actually efficient code, so not a big deal.
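          | 
          | In that spirit, a byte-oriented version of the pseudo-code
          | would look something like this (just a sketch, not the gem's
          | actual code):
          | 
          |     # Table of bytes that force escaping in a JSON string:
          |     # control characters, '"' and '\'.
          |     ESCAPE = Array.new(256, false)
          |     (0x00..0x1F).each { |b| ESCAPE[b] = true }
          |     ESCAPE['"'.ord] = true
          |     ESCAPE['\\'.ord] = true
          |     
          |     # Bytes only: no per-character String allocations.
          |     def needs_escape?(str)
          |       str.each_byte { |b| return true if ESCAPE[b] }
          |       false
          |     end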
        
       ___________________________________________________________________
       (page generated 2024-12-18 23:00 UTC)