[HN Gopher] Optimizing Ruby's JSON, Part 1
___________________________________________________________________
Optimizing Ruby's JSON, Part 1
Author : todsacerdoti
Score : 224 points
Date : 2024-12-18 00:08 UTC (22 hours ago)
(HTM) web link (byroot.github.io)
(TXT) w3m dump (byroot.github.io)
| hahahacorn wrote:
| Great read & great work from the author. Is there any reason to
| use Oj going forward?
| byroot wrote:
| Author here.
|
| Oj has an extremely large API that I have no intention of
| emulating in the default json gem: things such as "SAJ" (SAX-
| style parsing), various escaping schemes, etc.
|
| My goal is only to make it unnecessary for the 95% or so use
| case, so yes, Oj will remain useful to some people for a bunch
| of use cases.
| onli wrote:
| SAX-style parsing is a godsend when dealing with large files,
| whether it's JSON or XML. It's indeed what made me switch to
| a different json library in a Ruby project of mine (I'd have
| to look it up, but it was probably Oj).
| thiago_fm wrote:
| I love byroot's work. I'm always surprised not only by the kind
| of contributions he makes but by their sheer volume. Insane
| productivity.
|
| Wish he would write more often. I've tried to get into ruby-core
| type of work more than once, but never found something that
| matched my skills well enough to contribute positively, and
| after a few weeks of no results the motivation would wear off.
| It's really difficult to build the kind of context he shares in
| the article, for example.
|
| If more Ruby C people would write more often, I bet there'd be
| more people with the skills that are needed to improve Ruby
| further.
|
| The C profiler advice was great. Maybe I could just get a Ruby
| gem with C code and start playing again on optimizations :-)
| benoittgt wrote:
| There is this great series by Peter Zhu too.
| https://blog.peterzhu.ca/ruby-c-ext/ Even though it's about C
| extensions, it helps with understanding some concepts.
|
| But I agree with you.
| thiago_fm wrote:
| That's awesome, I personally wasn't aware of that series from
| Peter Zhu. Thanks!
| wkjagt wrote:
| > insane productivity
|
| He's insanely productive, but also insanely smart. I used to
| work in the same office as him at Shopify, and he's the kind of
| person whose level just seems unattainable.
| richardlblair wrote:
| Fully agree. He's also really patient and kind. He's the type
| of person who's really really smart and doesn't make you feel
| really really dumb. He takes the time to thoroughly explain
| things, without being condescending. I always enjoyed when
| our paths crossed because I knew I was going to learn
| something.
| mfkp wrote:
| Love the write-up on this topic, very easy to follow and makes me
| want to benchmark and optimize some of my ruby code now. Thanks
| for putting in the effort and also writing the post, byroot!
| bornelsewhere wrote:
| Does ruby json use intrinsics? Could it?
|
| Also, how does this play with the various JITs?
| byroot wrote:
| Not too sure what you mean by intrinsics.
|
| The `json` gem is implemented in C, so it's a black box for
| YJIT (the reference implementation's JIT).
|
| The TruffleRuby JIT used to interpret C extensions with sulong
| so it could JIT across languages barrier, but AFAIK they
| recently stopped doing that because of various compatibility
| issues.
|
| Also on TruffleRuby the JSON parser is implemented in C, but
| the encoder is in pure Ruby [0].
|
| [0]
| https://github.com/ruby/json/blob/e1f6456499d497f33f69ae4c1a...
| bornelsewhere wrote:
| Thanks!
|
| Sorry about misuse of "intrinsics". There is a simdjson
| library that uses SIMD instructions for speed. Would such an
| approach be feasible in the ruby json library?
| byroot wrote:
| Ah I see.
|
| TL;DR: it's possible, but it's a lot of work, and not that
| huge of a gain in the context of a Ruby JSON parser.
|
| `ruby/json` doesn't use explicit SIMD instructions; some
| routines are written in a way that somewhat expects compilers
| to be able to auto-vectorize, but that's never a given.
|
| In theory using SIMD would be possible, as proven by simdjson,
| but it's very (edit) UNlikely we'll do it, for multiple
| reasons.
|
| First, for portability we have to stick with raw C99 (no C++
| allowed), which rules out using simdjson outright.
|
| In theory we could implement the same sort of logic ourselves,
| with support for the various levels of SIMD that different
| processors offer, plus runtime dispatch between them, but that
| would be terribly tedious. It's not a reasonable amount of
| complexity for the amount of time I and other people are
| willing to spend on the library.
|
| Then there's the fact that it wouldn't make as big a difference
| as you'd think. I do happen to have made some bindings for
| simdjson in https://github.com/Shopify/heap-profiler, because I
| had a use case for parsing gigabytes of JSON, and it helps
| quite a bit there.
|
| But, as I'll hopefully touch on in a future blog post, the
| actual JSON parsing part is entirely dwarfed by the work
| needed to build the resulting Ruby object tree.
| thiago_fm wrote:
| Curious about the next post.
|
| My naive/clueless mind always wonders if it wouldn't make
| sense to make a new class of Ruby objects that are much
| simpler and would yield both less memory consumption and
| GC optimizations that could be used for such cases.
|
| Without a different object model it's hard to imagine
| optimizations that could greatly improve Ruby execution
| speed for CRuby, or make the GC much faster (a huge issue
| for big applications), but maybe that's because I don't
| know much :-)
| izietto wrote:
| First, if the author is going to read this, let me thank you for
| your work. As a Rails developer, I find the premises very
| relatable.
|
| Again, as a Rails developer, a pain point is different naming
| conventions regarding Ruby hash keys versus JS/JSON object keys.
| JavaScript/JSON typically uses camelCase, while Ruby uses
| snake_case. This forces me to perform tedious and often disliked
| transformations between these conventions in my Rails projects,
| requiring remapping for every JSON object. This process is both
| annoying and potentially performance-intensive. What alternative
| approaches exist, and are there ways to improve the performance
| of these transformations?
| rajaravivarma_r wrote:
| I don't have a solution for the performance problem. But for
| the camelCase to snake_case conversion, I can see potential
| solutions.
|
| 1. If you are using axios or other fetch based library, then
| you can use an interceptor that converts the camelCase
| JavaScript objects to 'snake_case' for request and vice versa
| for response.
|
| 2. If you want to control that on the app side, then you can
| use a helper method in ApplicationController, say
| `json_params`, that returns the JSON object with snake_case
| keys. Similarly wrap the `render json: json_object` into a
| helper method like `render_camel_case_json_response` and use
| that in all the controllers. You can write a custom Rubocop to
| make this behaviour consistent.
|
| 3. Handle the case transformation in a Rack middleware. This
| way you don't have to force developers to use those helper
| methods.
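Option 3 above can be sketched as a small Rack middleware. This is a minimal, hypothetical sketch (the `SnakeCaseParams` name and the regex-based conversion are invented for illustration, not taken from any gem): it rewrites incoming JSON bodies so the app only ever sees snake_case keys.

```ruby
require "json"
require "stringio"

# Hypothetical middleware: convert camelCase keys in incoming JSON
# request bodies to snake_case before the app sees them.
class SnakeCaseParams
  def initialize(app)
    @app = app
  end

  def call(env)
    if env["CONTENT_TYPE"].to_s.include?("application/json")
      body = env["rack.input"].read
      unless body.empty?
        converted = JSON.generate(deep_snake_case(JSON.parse(body)))
        env["rack.input"] = StringIO.new(converted)
        env["CONTENT_LENGTH"] = converted.bytesize.to_s
      end
    end
    @app.call(env)
  end

  private

  # Recursively transform hash keys; arrays are walked, scalars pass through.
  def deep_snake_case(obj)
    case obj
    when Hash  then obj.to_h { |k, v| [snake_case(k), deep_snake_case(v)] }
    when Array then obj.map { |e| deep_snake_case(e) }
    else obj
    end
  end

  def snake_case(key)
    key.gsub(/([a-z0-9])([A-Z])/, '\1_\2').downcase
  end
end
```

The symmetric response-side transformation would walk the response body the same way before it leaves the app.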
| thiago_fm wrote:
| I believe their point is that this transformation could be
| done in C and therefore perform better; it could be a flag on
| the JSON conversion.
|
| I find the idea good, maybe it even already exists?
| byroot wrote:
| It could be done relatively efficiently in C indeed, but it
| would be yet another option, imposing yet another conditional,
| and as I mention in the post (and will keep hammering in the
| followups), conditionals are something you want to avoid for
| performance.
|
| IMO that's the sort of conversion that is better handled by
| the "presentation" layer (as in ActiveModel::Serializers et
| al.).
|
| In these gems you usually define something like:
|
|     class UserSerializer < AMS::Serializer
|       attributes :first_name, :email
|     end
|
| It wouldn't be hard for these libraries to apply a
| transformation on the attribute name at almost zero cost.
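A toy illustration of that point (this is not the actual AMS API; `TinySerializer` is made up): compute the camelCase key once, when the attribute is declared, so each serialization is a plain hash walk with no per-request string munging.

```ruby
require "json"

# Hypothetical serializer sketch: the snake_case -> camelCase
# transformation happens once, at class-definition time.
class TinySerializer
  def self.attributes(*names)
    @mapping ||= {}
    names.each do |name|
      parts = name.to_s.split("_")
      # camelize once here, not on every serialization
      @mapping[name] = parts[0] + parts[1..].map(&:capitalize).join
    end
  end

  def self.mapping
    @mapping
  end

  def initialize(object)
    @object = object
  end

  # At serialization time, keys are simply looked up, never recomputed.
  def as_json
    self.class.mapping.to_h { |attr, key| [key, @object.public_send(attr)] }
  end
end

class UserSerializer < TinySerializer
  attributes :first_name, :email
end

User = Struct.new(:first_name, :email)
puts JSON.generate(UserSerializer.new(User.new("Ada", "ada@example.com")).as_json)
# => {"firstName":"Ada","email":"ada@example.com"}
```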
| xcskier56 wrote:
| We use a gem called olive_branch. Yes, it's going to give you a
| performance hit, but it keeps you sane, which is very worthwhile.
| izietto wrote:
| This one? https://github.com/vigetlabs/olive_branch Looks
| interesting, unfortunately its latest update is from 3 years
| ago
| sensanaty wrote:
| If you look at the source [1], you'll see what it's doing
| is very simple (the file I linked is basically the whole
| library, everything else is Gem-specific things and tests).
| You can even skip the gem and implement it yourself, not a
| big dependency at all, so no need for constant maintenance
| in this case :p
|
| [1] https://github.com/vigetlabs/olive_branch/blob/main/lib
| /oliv...
| werdnapk wrote:
| I don't understand the issue with it not being updated for
| 3 years. Perhaps it's stable and requires no updates?
|
| If the author says it's no longer maintained, then that's
| something different.
| izietto wrote:
| This is my concern:
| https://github.com/vigetlabs/olive_branch/blob/8bd792610945f...
|
| No guarantee it's going to work with Ruby > 2.7 || Rails
| > 6.1.
|
| When I upgrade components I want to know other components
| are not going to break anything.
| werdnapk wrote:
| Ok, updates regarding dependencies are definitely a good
| point.
| revskill wrote:
| Me too. And even crystal language has the same issue.
| SkyPuncher wrote:
| I love Rails, but if I could go back in time and tell them to
| avoid one thing it would be their strict adherence to naming
| conventions.
|
| I've spent more time in my career debugging the magic than I
| would have by simply defining explicit references.
| caseyohara wrote:
| > if I could go back in time and tell them to avoid one thing
| it would be their strict adherence to naming conventions
|
| _Monkey paw curls._ Rails probably wouldn't have reached
| popularity were it not for the strict adherence to naming
| conventions. The Rails value prop is productivity and the
| ethos of "convention over configuration" is what makes that
| possible.
|
| "Convention over configuration" was coined by DHH himself.
| https://en.wikipedia.org/wiki/Convention_over_configuration
| Without it, you don't have Rails.
| SkyPuncher wrote:
| The problem is the convention is not always clear or
| consistent. When it breaks, it can be very difficult to
| debug.
|
| In most cases, we're talking about a trivial amount of extra
| code: things like a handful of config lines at the top of a
| class.
| jkmcf wrote:
| I'm ok with most of the naming conventions, but the
| pluralization is one I loathe. The necessity of custom
| inflections should have been a strong smell, IMO.
| regularfry wrote:
| Semantically it makes sense. It's not Rails' fault the
| English language is a terrible serialisation format.
| dmurray wrote:
| But it can be Rails' fault for choosing to default to a
| terrible serialisation format.
| ysavir wrote:
| What would you propose as an alternative?
| SkyPuncher wrote:
| I agree. I think it also would have been better to
| explicitly name "collections" as unique from "items".
|
| GooseCollection is more meaningful than Geese (or Gooses).
| ysavir wrote:
| > Again, as a Rails developer, a pain point is different naming
| conventions regarding Ruby hash keys versus JS/JSON object
| keys. JavaScript/JSON typically uses camelCase, while Ruby uses
| snake_case.
|
| Most APIs I've come across use snake_case for their keys in
| JSON requests and responses. I rarely come across camelCase in
| JSON keys. So I'm happy to just write snake_case keys and let
| my backend stay simple and easy, and let the API consumer
| handle any transformations.
|
| I use the same approach another comment points out, using Axios
| transformers to convert back and forth as necessary.
| BurningFrog wrote:
| Sounds like something you could fix once and for all with some
| metaprogramming converting between the two casing conventions?
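One shape that "fix once and for all" could take, as a hedged sketch (the `KeyCasing` module and its method names are invented): a pair of recursive helpers applied only at the JSON boundary, so the rest of the app stays snake_case.

```ruby
# Hypothetical helpers: convert keys in both directions at the JSON boundary.
module KeyCasing
  module_function

  # camelCase -> snake_case, applied to a whole parsed structure
  def to_snake(obj)
    transform(obj) { |k| k.gsub(/([a-z0-9])([A-Z])/, '\1_\2').downcase }
  end

  # snake_case -> camelCase, for outgoing payloads
  def to_camel(obj)
    transform(obj) { |k| k.gsub(/_([a-z])/) { $1.upcase } }
  end

  # Walk hashes and arrays, rewriting keys with the given block.
  def transform(obj, &block)
    case obj
    when Hash  then obj.to_h { |k, v| [yield(k.to_s), transform(v, &block)] }
    when Array then obj.map { |e| transform(e, &block) }
    else obj
    end
  end
end
```

Wiring these into one request/response hook (a controller concern or middleware) is the "once" part; nothing else in the codebase needs to care.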
| rfl890 wrote:
| IIRC the branch predictor hint is useless on modern CPUs
| Someone wrote:
| It _was_ useless on modern CPUs, but has become somewhat useful
| again on some CPUs. https://www.phoronix.com/news/GCC-Clang-
| Intel-x86-Branch-Hin...:
|
| _"Starting with the Redwood Cove microarchitecture, if the
| predictor has no stored information about a branch, the branch
| has the Intel SSE2 branch taken hint (i.e., instruction prefix
| 3EH). When the codec decodes the branch, it flips the branch's
| prediction from not-taken to taken. It then flushes the
| pipeline in front of it and steers this pipeline to fetch the
| taken path of the branch.
|
| ...
|
| The hint is only used when the predictor does not have stored
| information about the branch. To avoid code bloat and reducing
| the instruction fetch bandwidth, don't add the hint to a branch
| in hot code--for example, a branch inside a loop with a high
| iteration count--because the predictor will likely have stored
| information about that branch. Ideally, the hint should only be
| added to infrequently executed branches that are mostly taken,
| but identifying those branches may be difficult. Compilers are
| advised to add the hints as part of profile-guided
| optimization, where the one-sided execution path cannot be laid
| out as a fall-through. The Redwood Cove microarchitecture
| introduces new performance monitoring events to guide hint
| placement."_
| meisel wrote:
| Very fun read! I'm curious though, when it comes to non-Ruby-
| specific optimizations, like the lookup table for escape
| characters, why not instead leverage an existing library like
| simdjson that's already doing this sort of thing?
| byroot wrote:
| I somewhat answered that in
| https://news.ycombinator.com/item?id=42450085
|
| In short, since `ruby/json` ships with Ruby, it has to be
| compatible with its constraints, which today means plain C99
| and no C++. There would also probably be a licensing issue
| with simdjson (Apache 2), but I'm not sure.
|
| Overall there are a bunch of really nice C++ libraries I'd
| love to use, like dragonbox, but I just can't.
|
| Another thing is that, last time I checked, simdjson only
| provided a parser; the ruby/json gem does both parsing and
| encoding, so it would only help with half the problem space.
| meisel wrote:
| yyjson is a very fast C89 compliant C parser that can both
| parse and generate JSON
| mort96 wrote:
| The benefit of a Ruby-specific JSON parser is that the
| parser can directly output Ruby objects. Generic C JSON
| parsers generally have their own data model, so instead of
| just parsing JSON text into Ruby objects you'd be parsing
| JSON text into an intermediate data structure and then walk
| that to generate Ruby objects. That'd necessarily use more
| memory, and it'd probably be slower too unless the parser
| is _way_ faster.
|
| Same applies to generating JSON: you'd have to first walk
| the Ruby object graph to build a yyjson JSON tree, then
| hand that over to yyjson.
| meisel wrote:
| All of this would be a big savings in code complexity and
| a win for reliability, compared to doing new untested
| optimizations. If memory usage is a concern, I'm sure
| there's a fast C SAX parser out there (or maybe one
| within yyjson)
| mort96 wrote:
| I don't understand what you're getting at. If performance
| is a concern, integrating a different parser written in C
| isn't desirable, as it would probably be slower than the
| existing parser for the reasons I mentioned (or at least
| be severely slowed down by the conversion step), so you
| need to optimize the Ruby-specific parser. If performance
| isn't a concern, keeping the old, battle-tested Ruby
| parser unmodified would surely be better for reliability
| than trying to integrate yyjson.
| meisel wrote:
| Take a look at SAX parsers
| akira2501 wrote:
| What I love about this article is it's actual engineering work
| on an existing code base. It doesn't seek to just replace
| things or swap libraries in an effort to be marginally faster.
| It digs into the actual code and seeks to genuinely improve it
| not only for speed but for efficiency. This simply does not get
| done enough in modern projects.
|
| I wonder: if it were done more regularly, would we even end up
| with libraries like simdjson or oj in the first place? The
| problem domain simply isn't _that_ hard.
| semiquaver wrote:
| Part 2 is available:
| https://byroot.github.io/ruby/json/2024/12/18/optimizing-rub...
| lexicality wrote:
| Possibly I've missed it, but is there anything saying how long
| the new version takes to parse/encode the Twitter JSON dump with
| all optimisations applied?
| byroot wrote:
| There's quite a few more commits to go before the final result,
| but you can see it in the release notes: -
| https://github.com/ruby/json/releases/tag/v2.7.3 -
| https://github.com/ruby/json/releases/tag/v2.8.0
| _gtly wrote:
| Unrelated to json, but a Rails commit by you 2024-10-29:
| ActiveModel::Type::Integer#serialize 3.17x - 9.67x faster!
|
| https://github.com/rails/rails/commit/70ca4ab91af15714cae4e8...
| kristianp wrote:
| Note that urls in preformatted text are not rendered as
| links.
|
| Clickable:
|
| https://github.com/ruby/json/releases/tag/v2.7.3
|
| https://github.com/ruby/json/releases/tag/v2.8.0
| Lammy wrote:
| Great work and great articles.
|
| > Yet another patch in Mame's PR was to use one of my favorite
| performance tricks, what's called a "lookup table".
|
| One thing stood out to me here as a fellow lookup-table-liker
| that I would like to mention even though it is probably not
| relevant to a generic JSON generator/parser which has to handle
| arbitrary String Encodings.
|
| The example optimized code uses `String#each_char` which incurs
| an extra object allocation for each iteration compared to
| `String#each_codepoint` which works with Immediates. If you are
| parsing/generating something where the Encoding is guaranteed,
| defining the LUT in terms of codepoints saves a bunch of GC
| pressure from the throwaway single-character String objects which
| have to be collected. One of the pieces of example code even uses
| `#each_char` and _then_ compares its `#ord`, so it's already
| halfway there.
|
| I don't feel like compiling 3.4-master to test it too, but I
| just verified this in Ruby 3.3.6:
|
|     irb(main):001> "aaa".-@.each_char { p _1.object_id }
|     4440
|     4460
|     4480
|     => "aaa"
|     irb(main):002> "aaa".-@.each_codepoint { p _1.object_id }
|     24709
|     24709
|     24709
|     => "aaa"
|     irb(main):003> RUBY_VERSION
|     => "3.3.6"
|
| Apologies for linking my own hobby codebase, but here are two
| examples of my own simple LUT parsers/generators where I achieved
| even more performance by collecting the codepoints and then
| turning it into a `String` in a single shot with `Array#pack`:
|
| - One from my filetype-guessing library that turns Media Type
| strings into key `Structs` both when parsing the shared-mime-info
| Type definitions and when taking user input to get the Type
| object for something like `image/jpeg`:
| https://github.com/okeeblow/DistorteD/blob/fbb987428ed14d710...
| (Comment contains some before-and-after allocation comparisons)
|
| - One from my support library that turns POSIX Globs into Ruby
| `Regexp` objects like Python's stdlib `fnmatch.translate`:
| https://github.com/okeeblow/DistorteD/blob/NEW%E2%80%85SENSA...
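The codepoint-LUT-plus-`Array#pack` pattern described above can be shown in a few lines. This is a toy ASCII upcaser written for illustration (not code from either linked library): codepoints are mapped through a table and collected as Integers, so the only String allocated is the final result.

```ruby
# Lookup table indexed by codepoint: ASCII lowercase letters map to
# uppercase, everything else maps to itself.
UPCASE_LUT = Array.new(128) { |cp| (97..122).cover?(cp) ? cp - 32 : cp }

def upcase_ascii(str)
  out = []
  # each_codepoint yields Integers (immediates), so no throwaway
  # single-character Strings are allocated per iteration.
  str.each_codepoint { |cp| out << (cp < 128 ? UPCASE_LUT[cp] : cp) }
  out.pack("U*") # one String allocation for the whole result
end

puts upcase_ascii("hello, world") # => "HELLO, WORLD"
```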
|
| Disclaimer: I haven't benchmarked this with YJIT which might
| render my entire experience invalid :)
| byroot wrote:
| > The example optimized code uses `String#each_char` which
| incurs an extra object allocation for each iteration
|
| You're right. To be closer to the real C code, I should have
| used `each_byte`, given the C code works with bytes, not
| codepoints or characters.
|
| But this is mostly meant as pseudo-code to convey the general
| idea, not as actually efficient code, so not a big deal.
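For readers curious what the lookup-table trick from the article looks like in this style, here is a hedged Ruby rendition of the general idea (illustrative pseudo-code made concrete, not the gem's actual C implementation): a 256-entry table answers "does this byte need escaping?" with a single indexed load instead of a chain of comparisons.

```ruby
# 256-entry lookup table: true for bytes that JSON must escape
# (control characters, the double quote, and the backslash).
NEEDS_ESCAPE = Array.new(256, false).tap do |t|
  (0..0x1f).each { |b| t[b] = true } # control characters
  t['"'.ord]  = true
  t['\\'.ord] = true
end

# Scan a string byte by byte; each check is one array index,
# no branching on ranges or character classes.
def needs_escape?(str)
  str.each_byte { |b| return true if NEEDS_ESCAPE[b] }
  false
end

puts needs_escape?("plain text") # => false
puts needs_escape?('say "hi"')   # => true
```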
___________________________________________________________________
(page generated 2024-12-18 23:00 UTC)