[HN Gopher] Tiny JITs for a Faster FFI
___________________________________________________________________
Tiny JITs for a Faster FFI
Author : hahahacorn
Score : 298 points
Date : 2025-02-12 22:20 UTC (1 day ago)
(HTM) web link (railsatscale.com)
(TXT) w3m dump (railsatscale.com)
| pestatije wrote:
| FFI - Foreign Function Interface, or how to call C from Ruby
| tonetegeatinst wrote:
| The totally safe and sane approach is to write C code that gets
| passed data via the command line during execution, then vomits
| results to the command line or just into a memory page.
|
| Then just execute the C program with your flags or data in the
| terminal using Ruby and voilà, Ruby can run C code.
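|
| A rough sketch of that pattern from the Ruby side (the
| ./mytool binary here is hypothetical, taking a number on the
| command line and printing a result):
|
|     require "open3"
|
|     # run the native program and capture its stdout
|     out, status = Open3.capture2("./mytool", "42")
|     raise "mytool failed" unless status.success?
|     puts out.strip.to_i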
| grandempire wrote:
| This. I think many people do not understand Unix processes
| and don't realize how rare it is to need bindings, FFI, and
| many libraries.
|
| How many programs have an https client in them because they
| need to make one request and didn't know they could use curl?
| nirvdrum wrote:
| Can you please elaborate on this because I'm struggling to
| follow your suggestion. Shelling out to psql every time I
| want to run an SQL query is going to be prohibitively slow.
| It seems to me you'd need bindings in almost the exact same
| cases you'd use a shared library if you were writing in C
| and that's really all bindings are anyway -- a bridge
| between the VM and a native library.
| grandempire wrote:
| Spawning a process isn't the right tool for ALL
| X-language communication. But sometimes it is - and the
| bias tends to be to overlook these opportunities. When
| you are comfortable using libraries, you make more
| libraries. When you know how to use programs, you more
| often make programs.
|
| > Shelling out to psql
|
| I would recommend using a Postgres connection library,
| because that's how Postgres is designed.
|
| Note that ongoing communication can still work with a
| multi-process stdin/stdout design. This is how email
| protocols work. So someone could design a SQL client that
| works this way.
|
| I have absolutely written batch import scripts which
| simply spawn psql for every query, with great results.
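|
| A rough sketch of that kind of script (the database name and
| CSV layout here are made up):
|
|     require "csv"
|     require "open3"
|
|     # one psql process per statement
|     CSV.foreach("people.csv", headers: true) do |row|
|       name = row["name"].gsub("'", "''")  # naive quoting
|       sql  = "INSERT INTO people (name) VALUES ('#{name}')"
|       _out, st = Open3.capture2("psql", "mydb", "-c", sql)
|       warn "failed: #{name}" unless st.success?
|     end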
|
| > in almost the exact same cases you'd use a shared
| library
|
| That's the thing. Libraries are an entangling
| relationship (literally in your binary). Programs in
| contrast have a clean, modular interface.
|
| So for example you can choose to load the imagemagick
| library, or you can spawn imagemagick. Which one is
| better depends, but most often you don't need the
| library.
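|
| To make that concrete, the same thumbnail job both ways (file
| names made up; the library calls are only sketched):
|
|     # (a) spawn ImageMagick as a program:
|     ok = system("convert", "in.png",
|                 "-resize", "200x200", "out.png")
|     raise "convert failed" unless ok
|
|     # (b) or link it in via a binding gem (e.g. rmagick):
|     # require "rmagick"
|     # img = Magick::Image.read("in.png").first
|     # img.resize_to_fit(200, 200).write("out.png")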
|
| Here is a list of examples I have seen solved with a
| library that were completely unnecessary:
|
| - identify the format of an image.
|
| - zip some data
|
| - post usage analytics to a web server at startup
|
| - diff two pieces of data
|
| - convert math symbols to images
|
| - convert x format to y
|
| I have even seen discourse online that suggests that if
| you are serious about AI your web stack needs to be in
| python - as if you can't spawn a process for an AI job.
| nirvdrum wrote:
| > Spawning a process isn't the right tool for ALL
| X-language communication. But sometimes it is
|
| I'm with you here.
|
| > ...many people do not understand Unix processes and
| don't realize how rare it is to need bindings, FFI, and
| many libraries
|
| But, this is a much stronger claim.
|
| I can't tell if you're making a meta point or addressing
| something in the Ruby ecosystem. I mentioned database
| library bindings because that's far and away the most
| common usage in the Ruby ecosystem, particularly because
| of its frequent adoption for web applications.
|
| The author is advocating for not using native code at all
| if you can avoid it. Keep as much code in Ruby as you can
| and let the JIT optimize it. But, if you do need
| bindings, it'd be great if you didn't have to write a
| native extension. There are a lot of ways to shoot
| yourself in the foot and they complicate the build
| pipeline. However, historically, FFI has been much slower
| than writing a native extension. The point of this post
| is to explore a way to speed up FFI in the cases where
| you need it.
|
| It needs to be taken on faith that the author is doing
| this work because he either has performance sensitive
| code or needs to work with a library. Spawning a new
| process in that case is going to be slower than any of
| the options explored in the post. Writing and
| distributing a C/C++/Zig/Rust/Go application as part of a
| Rails app is a big hammer to swing and complicates
| deployments (a big part of the reason to move away from
| native extensions). It's possible the author is just
| complicating things unnecessarily, but he's demonstrated
| a clear mastery of the technologies involved so I'm
| willing to give him the benefit of the doubt.
|
| A frequent critique of Ruby is that it's slow. Spawning
| processes for operations on the hot path isn't going to
| help that perception. I agree there are cases where
| shelling out makes sense. E.g., spawning out to use
| ImageMagick has proven to be a better approach than using
| bindings when I want to make image thumbnails. But, those
| are typically handled in asynchronous jobs. I'm all for
| anything we can do to speed up the hot path and it's
| remarkable how much performance was easily picked up.
| grandempire wrote:
| You interpreted my comment as an attack on the author. If
| they have a special case they want to optimize FFI for
| this seems like a great way to do it.
|
| > Spawning processes for operations on the hot path isn't
| going to help that perception.
|
| Yes it will. Because then you do the number crunching in
| C/Java/Rust etc instead of Ruby. And the OS will do
| multi-core scheduling for free. This is playing to Ruby's
| strength as a high level orchestrator.
|
| I think you're vastly overestimating the fork/exec
| overhead on Linux. If your hot path is this hot you
| better not be calling Ruby functions.
|
| > Writing and distributing a C/C++/Zig/Rust/Go
| application as part of a Rails app is a big hammer to
| swing
|
| You're already doing it. Do you have a database? Nginx?
| Memcached? Imagemagick? Do you use containers?
|
| Consider the alternative: is the ideal world the one
| where all software is available as a ruby package?
| nirvdrum wrote:
| > You interpreted my comment as an attack on the author.
| If they have a special case they want to optimize FFI for
| this seems like a great way to do it.
|
| My mistake. I assumed the conversation was relevant to
| the post. I hadn't realized the topic had diverged.
|
| > I think you're vastly overestimating the fork/exec
| overhead on Linux. If your hot path is this hot you
| better not be calling Ruby functions.
|
| We're talking about running a Ruby application. Let's use
| Rails to make it more concrete. That's going to be a lot
| of Ruby on the hot path. Avoiding that is tantamount to
| rewriting the application in another language. But, Ruby
| on the hot path can be fast, particularly when JIT
| compiled. And that's going to be faster than spawning a
| new process. The article even demonstrates this showing
| how `String#bytesize` is faster than all other options.
| That's not to say writing a program in C is going to be
| slower than Ruby, but rather that integrating Ruby and C,
| which will involve translation of data types, is going to
| favor being written in Ruby. And, of course, the
| implementation of `String#bytesize` is in C anyway, but
| we don't have to deal with the overhead of context
| switching.
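|
| A rough sketch of that kind of comparison (not the article's
| exact benchmark; it assumes the ffi and benchmark-ips gems):
|
|     require "ffi"
|     require "benchmark/ips"
|
|     module LibC
|       extend FFI::Library
|       ffi_lib FFI::Library::LIBC
|       attach_function :strlen, [:string], :size_t
|     end
|
|     str = "hello world"
|     Benchmark.ips do |x|
|       x.report("String#bytesize") { str.bytesize }
|       x.report("FFI strlen") { LibC.strlen(str) }
|       x.compare!
|     end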
|
| > You're already doing it. Do you have a database? Nginx?
| Memcached? Imagemagick? Do you use containers?
|
| All of those things ship as packages and are trivial to
| install. I don't need to compile anything. Packaging this
| theoretical application in a container out of band isn't
| going to help. Now, I need to `docker exec` on my hot
| path? No, that won't work either. So, instead, I need to
| add an explicit compilation step to what is otherwise a
| case of SCPing files to the target machine. I need to add
| a new task whether in the Dockerfile or in Rake to build
| an application to shell out to. At least with a native
| extension packaged as a gem there's a standard mechanism
| for building C applications.
|
| There's no getting around that this complicates the
| build. I'm not sure why that's debatable: not having to
| do all of that is easier than doing it.
|
| > Consider the alternative: is the ideal world the one
| where all software is available as a ruby package?
|
| I'm not saying we should rewrite everything in the world
| in Ruby. But, yes, for a Ruby application the ideal world
| is one in which all code is written in Ruby. It's the
| simplest option and gives a JIT compiler the most context
| for making optimization decisions. I'm not interested in
| using Ruby as a glue language for executing C
| applications in response to a routed Rails request. At
| that point I may as well use something other than Ruby
| and Rails.
| fomine3 wrote:
| It's slow
| poisonta wrote:
| I can sense why it didn't go to tenderlovemaking.com
| tenderlove wrote:
| I think tenderworks wrote this post.
| internetter wrote:
| "write as much Ruby as possible, especially because YJIT can
| optimize Ruby code but not C code"
|
| I feel like I'm not getting something. Isn't ruby a pretty slow
| language? If I was dipping into native I'd want to do as much in
| native as possible.
| kevingadd wrote:
| When dealing with a managed language that has a JIT or AOT
| compiler it's often ideal to write lots of stuff in the managed
| language, because that enables inlining and other optimizations
| that aren't possible when calling into C.
|
| This is sometimes referred to as "self-hosting", and browsers
| do it a lot by moving things into privileged JavaScript that
| might normally have been written in C/C++. Surprisingly large
| amounts of the standard library end up being not written in
| native code.
| internetter wrote:
| Oh, I am indeed surprised! I guess I always assumed that most
| of the JavaScript standard library was written in C++
| achierius wrote:
| Well, almost all of the compiler, runtime, allocator, garbage
| collector, object model, etc., are indeed written in C++. And
| so are many special operations (e.g. crypto functions, array
| sorts, walking the stack).
|
| But specifically with regards to library functions, like
| the other commenter said, losing out on inlining sucks,
| and crossing between JS and native code can be pretty
| expensive, so even with things like sorting an array it can
| be better to do it in JS to avoid the overhead... e.g.
| especially in cases where you can provide a callback as your
| comparator, which is JS, and thus you have to cross back into
| JS for every element.
|
| So it's a balancing game, and the various engines have gone
| back and forth on which functions are implemented in which
| language over time
| kenhwang wrote:
| Ruby has realized this as well. When running in YJIT mode,
| some standard library methods switch to using a pure ruby
| implementation instead of the C implementation because the
| YJIT-optimized-ruby is better performing.
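|
| Checking whether YJIT is actually on is simple; a small
| sketch (run with `ruby --yjit`):
|
|     if defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?
|       puts "YJIT is on"
|       # may be nil unless --yjit-stats is enabled
|       p RubyVM::YJIT.runtime_stats
|     else
|       puts "YJIT is off (start Ruby with --yjit)"
|     end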
| pjmlp wrote:
| That is where a JIT enters the picture; ideally a JIT can re-
| optimize to an ideal state.
|
| While this is suboptimal when doing one-shot execution, when an
| application is long lived, as in most desktop or server
| workloads, this work pays off over the life of the application.
|
| For example, Dalvik had a pretty lame JIT, thus it was faster
| to call into C for math functions; eventually, with ART, this
| was no longer needed, as the JIT could outperform the cost of
| calling into C.
|
| https://developer.android.com/reference/android/util/FloatMa...
| genewitch wrote:
| Depending on the math (this is a hedge) you need, FORTRAN is
| probably faster still. Every time I fashion a test and compare
| Python, Fortran, and C, Fortran always wins by a margin:
| Fortran:C:Python is 1:1.2:1.9 or so. I don't count startup; I
| only time the time to return from the function call.
|
| Most recently I did hand-looped matrix math and this ratio
| bore out.
|
| I used gfortran, gcc, and python3.
| pjmlp wrote:
| Sure, but that doesn't fit the desktop or server workloads
| I mentioned; I guess we need to carve stuff like HPC out
| of those server workloads.
|
| I would also add that modern Fortran looks quite sweet, the
| punched card FORTRAN is long gone, and folks should spend
| more time learning it instead of reaching out to Python.
| doppp wrote:
| It's been fast for a while now.
| Thaxll wrote:
| Even a 50% or 2x speed improvement still makes it a pretty slow
| language. It's in the Python range.
| CyberDildonics wrote:
| What is fast here? Ruby has usually been about 1/150th the
| speed of C.
| kenhwang wrote:
| If the code JITs well, Ruby performs somewhere between Go
| and NodeJS. Without the JIT, it's similar to Lua.
| CyberDildonics wrote:
| _If the code JITs well_
|
| What does this mean? Any JIT can make a tight loop of
| math and arrays run well but that doesn't mean a typical
| program runs well.
| kenhwang wrote:
| For Ruby, it's code where variables and method
| input/return types can be inferred to remain static, and
| variable scope/lifetime is finite. From my understanding,
| much of the performance gain was from removing the need
| for a lot of type checking and dynamic lookup/dispatch
| when types were unknown.
|
| So basically, writing code similarly to a statically
| typed compiled language.
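|
| A toy illustration of that idea (not tied to YJIT
| internals):
|
|     def add(a, b)
|       a + b
|     end
|
|     # monomorphic site: + always sees two Integers,
|     # so a JIT can specialize and drop type checks
|     100_000.times { add(1, 2) }
|
|     # polymorphic site: the same code sees Integers,
|     # Floats and Strings, so type dispatch must stay
|     [1, 2.0, "3"].each { |x| add(x, x) }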
| neonsunset wrote:
| Node uses V8 which has a very advanced JIT compiler for
| the hot code, which does a lot of optimizations for
| reducing the impact of JS's highly dynamic type system.
|
| The claim that Ruby YJIT beats this is not supported by
| the data to put it mildly:
|
| https://benchmarksgame-
| team.pages.debian.net/benchmarksgame/... (scroll down on
| each submission and you will see it uses YJIT)
|
| (and Go is not a paragon of absolute performance either)
| kenhwang wrote:
| Not at all saying Ruby's compiler is more capable, more
| that typical Ruby code is easier to optimize by their JIT
| design than typical JS, largely because Ruby's type
| system is more sane.
|
| The whitepapers that inspired Ruby's JIT were first tested
| against a saner subset of JS, and shown to have some
| promising performance improvement. The better
| language/JIT compatibility is why the current Ruby JIT is
| actually showing performance improvements over the
| previous more traditionally designed JIT attempts.
|
| JS can get insanely fast when it's written like low level
| code that can take advantage of its much more advanced
| compiling abilities; like when it's used as a WASM target
| with machine generated code. But humans tend to not write
| JS that way.
|
| Agreed about Go as well, it tends to be on the slow side
| for compiled languages. I called it out not as an example
| of a fast language, but because its typical performance
| is well known and is approximately the upper bound of
| how fast Ruby can get.
| plagiarist wrote:
| I did a quick search for the white papers and couldn't
| find them. Would you be kind enough to leave a link or a
| title? It sounds interesting, I'd like to read more.
| kenhwang wrote:
| I believe this is the paper that started it all (there
| are a couple of follow up papers):
| https://arxiv.org/pdf/1411.0352v1
|
| The author was eventually hired to develop YJIT.
| schneems wrote:
| To add some nuance to the word "fast."
|
| When we optimize Ruby for performance we debate how to
| eliminate X thousand heap allocations. When people in Rust
| optimize for performance, they're talking about how to hint
| to the compiler that the loop would benefit from SIMD.
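|
| One common way to count those allocations for a block of
| code; a sketch, not necessarily how the core team measures
| it:
|
|     def allocations
|       before = GC.stat(:total_allocated_objects)
|       yield
|       GC.stat(:total_allocated_objects) - before
|     end
|
|     puts(allocations { "a" * 10 })
|     puts(allocations { [1, 2, 3].map(&:to_s) })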
|
| Two different communities, two wildly different bars for
| "fast." Ruby is plenty performant. I had a python developer
| tell me they were excited for the JIT work in Ruby as they
| hoped that Python could adopt something similar. For us the
| one to beat (or come closer to) would be Node.js. We are
| still slower than them (lots of browser companies spent a LOT
| of time optimizing javascript JIT), but I feel for the
| relative size of the communities Ruby is able to punch above
| its weight. I also feel that we should be celebrating tides
| that raise all ships. Not everything can (or should be)
| written in C.
|
| I personally celebrate any language getting faster,
| especially when the people doing it share as widely and are
| as good of a communicator as Aaron.
| nicoburns wrote:
| Has it? I thought Ruby was pretty much the benchmark for the
| slowest language. What is it faster than?
| pansa2 wrote:
| Python. At least, it was a few years ago. Both languages
| have added JIT compilers since then, so I'm not sure how
| the most recent versions compare.
| chefandy wrote:
| I see some people say this but I've never seen a
| benchmark support that if you include performance-focused
| distributions like pypy.
|
| https://programming-language-
| benchmarks.vercel.app/amp/pytho...
|
| Is there somewhere with benchmarks that supports the idea
| that Ruby is faster than Python?
| pansa2 wrote:
| Fair point, I was specifically referring to the CPython
| interpreter. I don't know if there's a benchmark that
| compares PyPy to JIT-compiled Ruby.
| chefandy wrote:
| Looks like the one at that link does, unless there's some
| newer versioning thing I'm not aware of. The top results
| seem to be comparing pypy 3.10.14 and ruby/yjit 3.4.1.
| pjmlp wrote:
| Because unfortunately PyPy is largely ignored by the
| community, despite their heroic efforts.
| chefandy wrote:
| I've only seen a handful of people use it for local
| development but I've seen plenty of people use it for
| servers in production. Looks like the latest CPython has a
| JIT built in. It would be cool if it saw the same gains
| that ruby did.
| pjmlp wrote:
| It is pretty much early days, you are supposed to compile
| your own CPython to enable it.
| m00x wrote:
| Python
| epcoa wrote:
| Tcl, Vbscript, bash/sh.
|
| Tcl had its web moment during the first dot-com era within
| AOLserver.
| pjmlp wrote:
| Not only, you are missing Vignette, and our own Safelayer
| (yes I know it isn't public).
|
| However exactly because of the experience writing Tcl
| extensions all the time for performance, since 2003 I no
| longer use programming languages without JIT/AOT other
| than for scripting tasks, or when the decision is
| external.
|
| The founders at our startup went on to create OutSystems,
| with many of the learnings, but using .NET instead, after
| we were given access to .NET during its "Only for MSFT
| partners eyes" early state.
| neonsunset wrote:
| FFI presents an opaque, unoptimizable boundary of code. Having
| chatty code like this is going to cost a lot. To the point
| where this is even a factor in much faster languages with zero-
| cost-ish interop like C# - you still have to make a call,
| sometimes paying the cost of modifying state flags for VM (GC
| transition).
|
| If Ruby YJIT is starting to become a measurable factor (after
| all, it was slower than other, purely interpreted, languages
| until recently), then the same rule as above will become more
| relevant.
| hahahacorn wrote:
| There's a phenomenal breakdown by JPCamara
| (https://jpcamara.com/2024/12/01/speeding-up-ruby.html) on why
| Ruby's #each method was rewritten in Ruby (https://bugs.ruby-
| lang.org/issues/20182). And bonus content from tenderlove:
| https://railsatscale.com/2023-08-29-ruby-outperforms-c/.
|
| TL;dr - JIT rules.
| hinkley wrote:
| There was a little drama that played out as Java was getting a
| proper JIT.
|
| In one major release, there was a bunch of Java code
| responsible for handling some UI element activities. It was
| found to be a bottleneck, and rewritten in C code for the next
| major release.
|
| Then the JIT became properly useful, and the FFI overhead was
| more than the difference between the hand-tuned C code and what
| the JIT would spit out on its own. So in the _next_ major
| release, they rolled back to the all-Java implementation.
|
| Java had a fairly reasonably fast FFI for that generation of
| programming language, but they swapped for a better one a few
| releases after that. And by then I wasn't doing a lot of Java
| UI code so I had stopped paying attention. But around the same
| time they were also making a cleaner interface between the
| platform-specific and the general Java code for UI, so I'm not
| entirely sure how that played out.
|
| But that's exactly the sort of see-sawing you need to at least
| keep an eye out for when doing this sort of work. Would you be
| better off waiting a couple milestones and saving yourself a
| bunch of hand-tuning work, or do you need it right now for
| political or technical reasons?
| kazinator wrote:
| If FFI calls are slow (even slower than Ruby -> Ruby calls),
| that informs the way you use native code. You look for
| workflows whereby frequent calls to an FFI function are avoided:
| e.g. a large number of calls in some inner loop. Suppose such a
| situation cannot be avoided. Then you may have no recourse but
| to move that loop out of Ruby into C: create a custom FFI for
| that use case which you can call once and have it execute the
| loop, calling the function you really wanted to call many
| times.
|
| If the FFI call can be made faster, maybe you can keep the loop
| in Ruby.
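|
| A sketch of the two shapes being described ("libthing" and
| its functions are hypothetical):
|
|     require "ffi"
|
|     module Thing
|       extend FFI::Library
|       ffi_lib "thing"   # hypothetical library
|       attach_function :process_one, [:double], :double
|       attach_function :process_many,
|                       [:pointer, :size_t], :void
|     end
|
|     values = Array.new(10_000) { rand }
|
|     # slow shape: cross the FFI boundary per element
|     values.map { |v| Thing.process_one(v) }
|
|     # fast shape: pack once, cross once, loop runs in C
|     buf = FFI::MemoryPointer.new(:double, values.size)
|     buf.write_array_of_double(values)
|     Thing.process_many(buf, values.size)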
|
| Of course that is attractive to people writing an application
| in Ruby.
|
| That's how I interpret keeping as much code Ruby as possible.
|
| Nobody in their right mind wants to add additional application-
| specific jigs written in C just to use some C piece.
|
| Once you start doing that, why even have FFI; you can just
| create a module.
|
| One attractive point about FFI is that you can take some C
| library and use it in a higher level language without writing a
| line of C.
| shortrounddev2 wrote:
| Does ruby have its equivalent to typescript, with type
| annotations? The language sounds interesting but I tend not to
| give dynamically typed languages the time of day
| kevingadd wrote:
| There's https://sorbet.org/ but it's not clear whether it has
| much adoption.
| zem wrote:
| I continue to think it was a big mistake not to add syntactic
| support for type annotations into the base language. python
| did this right; annotations are not enforced by the
| interpreter, but are accessible both by external tools as
| part of the AST and bytecode, and by the running program via
| introspection, so tools and libraries can do all sorts of
| interesting things with them.
|
| having to add annotations in a separate header file is simply
| too high friction to get widespread adoption.
| Lammy wrote:
| IMHO (and I don't expect most people to agree but please be
| tolerant of my opinion!) annotations are annoying busywork
| that clutter my code and exist just to make people feel
| smart for """doing correctness""". The only check I find
| useful is nil or not-nil, and any halfway-well-designed
| interface should make it impossible for some unexpected
| object type to end up in the wrong place anyway. For
| anything less than halfway-well-defined, you have bigger
| issues than a lack of type annotation.
|
| edit: I am quite fond of `case ::Ractor::receive; when
| SomeClass then ...; when SomeOtherClass then ...; end` as
| the main pattern for my Ractors though :)
| zem wrote:
| as your codebase and number of collaborators get larger,
| it's super useful to have the type checker be able to
| tell you "hey, you said your function arg could be a time
| or an int, but you are calling time-specific methods on
| it" or conversely "the function you are calling says it
| accepts time objects but you are passing it an int"
|
| also once you get into jit compilation you can do some
| nice optimisations if you can treat a variable type as
| statically known rather than dynamic.
|
| and finally, even if you're not writing python at scale
| it can be very nice to use the type annotations to
| document your function parameters.
| Lammy wrote:
| > also once you get into jit compilation you can do some
| nice optimisations if you can treat a variable type as
| statically known rather than dynamic.
|
| This is something I hadn't considered. Thanks for
| mentioning it :)
| FooBarWidget wrote:
| Sorbet is the most mature option. RBS barely has any tooling,
| while Sorbet works well.
|
| It definitely isn't at the level of Typescript adoption, even
| relatively speaking. And it's more clunky than Typescript.
| But it works well enough to be valuable.
| Lio wrote:
| It's interesting that RBS support is used by IRB for type
| completion.
|
| There is also work to settle on an inline form of RBS so I
| could see it taking over from Sorbet annotations in the
| future.
| dragonwriter wrote:
| > Does ruby have its equivalent to typescript, with type
| annotations?
|
| Ruby has a first party external type definition format (RBS) as
| well as third-party typecheckers that check ruby against RBS
| definitions.
|
| There is probably more use of the older, all third-party typing
| solution (Sorbet) though.
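|
| For flavor, a minimal Sorbet-style inline signature (assumes
| the sorbet-runtime gem; RBS would put the same information in
| a separate .rbs file):
|
|     require "sorbet-runtime"
|
|     class Greeter
|       extend T::Sig
|
|       sig { params(name: String).returns(String) }
|       def greet(name)
|         "Hello, #{name}!"
|       end
|     end
|
|     Greeter.new.greet("world")   # ok
|     # Greeter.new.greet(42)      # TypeError at runtime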
| teaearlgraycold wrote:
| This is the main thing keeping me from going back to Ruby. I
| don't want to go back to the stone age where there's no or poor
| static analysis
| nirvdrum wrote:
| If you're looking for static typing a dynamic language is
| going to be a poor fit. I find a place for both. I love Rust,
| but trying to write a tool that consumed a GraphQL API
| was a brutal exercise in frustration. I'd say that goes for
| typing of JSON or YAML or whatever structured format in
| general. It's refreshing being able to just work with data in
| the form I already know it's in. Ruby can be an incredibly
| productive language to work with.
|
| If you're looking for static analysis in general, please note
| that there are mature tools available. Rubocop [1] is probably
| the most popular and allows for linting and code formatting.
| Brakeman [2] is a vulnerability scanner for Rails. Sorbet [3] is
| a static type checker.
|
| The tooling is there if you want to try things out. But, if
| you want a statically typed language then that's a debate
| that's been going since the dawn of programming language
| design. I doubt it's going to get resolved in this thread.
|
| 1 - https://github.com/rubocop/rubocop
|
| 2 - https://brakemanscanner.org/
|
| 3 - https://sorbet.org/
| shortrounddev2 wrote:
| I want to use Lua for more tasks
| teaearlgraycold wrote:
| I've used rubocop and sorbet. But now that I've used
| TypeScript it's clear there's no comparison. TS will even
| analyze your _regex patterns_. Every update gets better.
| I'm eagerly waiting for the day they add analysis for array
| length and ranged numbers.
| Alifatisk wrote:
| > Does ruby have its equivalent to typescript, with type
| annotations?
|
| Someone recommended me this, so I might even spread the word
| further https://github.com/soutaro/rbs-inline?tab=readme-ov-
| file#rbs...
| dhqgekt wrote:
| If you like the Ruby syntax (but want a statically typed
| language), you might want to take a look at Crystal:
| https://crystal-lang.org/
|
| > Crystal is statically typed and type errors are caught early
| by the compiler, eliminating a range of type-related errors at
| runtime.
|
| > Yet type annotations are rarely necessary, thanks to powerful
| type inference. This keeps the code clean and feels like a
| dynamic language.
|
| Why does it remain relatively unpopular and what can be done so
| that more people get to use it?
| dragonwriter wrote:
| > Why does it remain relatively unpopular and what can be
| done so that more people get to use it?
|
| Because Ruby-ish syntax without Ruby's semantics or ecosystem
| isn't actually all that big of a selling point, and if people
| want a statically typed language, there are plenty of options
| with stronger ecosystems, some of which have some Ruby-ish
| syntactic features.
| chris12321 wrote:
| Between Rails At Scale and byroot's blogs, it's currently a
| fantastic time to be interested in in-depth discussions around
| Ruby internals and performance! And with all the recent
| improvements in Ruby and Rails, it's a great time to be a Rubyist
| in general!
| jupp0r wrote:
| Is it? To me it seems like Ruby is declining [1]. It's still
| popular for a specific niche of applications, but to me it
| seems like it's well past its days of glory. Recent
| improvements are nice, but is a JIT really that exciting
| technologically in 2025?
|
| [1]: https://www.tiobe.com/tiobe-index/ruby/
| chris12321 wrote:
| Ruby will probably never again be the most popular language
| in the world, and it doesn't need to be for the people who
| enjoy it to be excited about the recent improvements in
| performance, documentation, tooling, ecosystem, and
| community.
| faizshah wrote:
| I think ruby can get popular again with the sort of
| contrarian things Rails is doing, like helping developers
| exit the cloud.
|
| There isn't really a much more productive web dev setup
| than Rails + your favorite LLM tool. It will take time to win
| Gen Z back to Rails, though, and away from Python/TS or
| Go/Rust.
| jimmaswell wrote:
| My impression is that a Rails app is an unmaintainable
| dynamically-typed ball of mud that might give you the
| fast upfront development to get to a market or get funded
| but will quickly fall apart at scale, e.g. Twitter fail
| whale. And Ruby is too full of "magic" that quickly makes
| it too hard to tell what's going on or accidentally make
| something grossly inefficient if you don't understand the
| magic, which defeats the point of the convenience. Is
| this perception outdated, and if so what changed?
| m00x wrote:
| Rails can become a ball of mud as much as any other
| framework can.
|
| It's not the fastest language, but it's faster than a lot
| of dynamic languages. Other than the lack of native
| types, you can manage pretty large rails apps easily.
| Chime, Stripe, and Shopify all use RoR and they all have
| very complex, high-scale financial systems.
|
| The strength of your tool is limited by the person who
| uses the tool.
| amomchilov wrote:
| The unrefactorable ball of mud problem is real, which is
| why both Stripe and Shopify have highly statically typed
| code bases (via Sorbet).
|
| Btw Stripe uses Ruby, but not Rails.
| byroot wrote:
| I'd say sorbet largely adds to the mud, but to each their
| own.
| fredrikholm wrote:
| > It's not the fastest language, but it's faster than a
| lot of dynamic languages.
|
| Such as?
|
| IME Ruby consistently falls behind, often way behind,
| nearly all popular languages in "benchmark battles".
| Lio wrote:
| Python? Ruby with YJIT, JRuby or Truffle Ruby usually
| beats python code in benchmarks.
|
| I haven't seen a direct comparison but I wouldn't be
| surprised if Truffle Ruby was already faster than either
| elixir, erlang or php for single threaded CPU bound tasks
| too.
|
| Of course that's still way behind other languages but
| it's still surprisingly good.
| relistan wrote:
| In my work I've seen that TruffleRuby codebases merging
| Ruby and Java libraries can easily keep pace with Go in
| terms of requests per second. Of course, the JVM uses
| more memory to do it. I mostly write Go code these days
| but Ruby is not necessarily slow. And it's delightful to
| code in.
| fredrikholm wrote:
| > Python? Ruby with YJIT, JRuby or Truffle Ruby usually
| beats python code in benchmarks.
|
| Isn't that moving the goal post a lot?
|
| We went from 'faster than a lot of others' to 'competing
| for worst in class'.
|
| I'm not trying to be facetious, I'm curious as I often
| read "X is really fast" where X is a functional/OOP
| language that nearly always ends up being some
| combination of slow and with huge memory overhead. Even
| then, most Schemes (or Lisps in general) are faster.
|
| Being faster single threaded against runtimes that are
| built specifically for multithreaded, distributed
| workloads is also perhaps not a fair comparison, esp.
| when both runtimes are heavily used to write webservers.
| And again, Erlang (et al) come out faster even in those
| benchmarks.
|
| Is TruffleRuby production (eg. Rails) ready? If so, is it
| that much faster?
|
| I remember when the infamous "Truffle beats all Ruby
| implementations"-article came out that a lot of Rubyists
| were shooting it down, however this was several years ago
| by now.
| Lio wrote:
| Moving the goal posts? Perhaps I misunderstand what you
| are asking. Python is not the worst-in-class
| scripting language. For example, Perl and Tcl are both
| slower than Python.
|
| Originally you just asked, "such as" [which dynamic
| language ruby is faster than?] Implying ruby is slower
| than every other dynamic language, which is not the case.
|
| JRuby is faster than MRI Ruby for some Rails workloads
| and very much production ready.
|
| Truffle Ruby is said to be about 97% compatible with MRI
| on the rubyspec but IMHO isn't production ready for Rails
| yet. It does work well enough for many stand alone non-
| rails tasks though and could potentially be used for
| running Sidekiq jobs.
|
| The reason to mention the alternative ruby runtimes is to
| show that there's nothing about the language that means
| it can't improve in performance (within limits).
|
| Whilst it's true that ruby is slower than Common Lisp or
| Scheme, ruby is still improving and the gap is going to
| greatly reduce, which is good news for those of us that
| enjoy using it.
| fredrikholm wrote:
| Thank you for a great answer; I did not mean any ill will
| and apologize if that was how it came across.
|
| Perl, Tcl, Smalltalk, etc. are basically non-existent
| where I'm from, so they didn't occur to me.
|
| Perhaps I'm projecting a lot here. I have worked a lot in
| high performance systems and am often triggered by claims
| of performance, e.g. 'X is faster than C', when this is
| 99.9% of the time false by two orders of magnitude. This
| didn't happen here.
|
| Thank you for taking the time to answer.
| Lio wrote:
| > _I did not mean any ill will and apologize if that was
| how it came across._
|
| Oh not at all, no I didn't think that. I'm enjoying the
| conversation.
|
| It's interesting that you mention Smalltalk as I believe
| that some of the JIT ideas we're seeing in YJIT are
| borrowed from there.
|
| As for all the "faster than C" talk here, it is very specific
| to Ruby (or JIT'd) runtimes and their overheads in that
| context.
|
| I think it gets mentioned because it seems so counter
| intuitive at first. It's not to imply C isn't orders of
| magnitude faster in general.
|
| Along with the new out of the box features of Rails 8,
| the work on Ruby infrastructure is making it an exciting
| technology to work with again (IMHO).
| weaksauce wrote:
| The "Ruby is faster than C" result is because of YJIT. They
| are moving a lot of the C standard library and core
| language stuff into Ruby code so YJIT can optimize it
| better. Akin to Java and its bytecode being able to
| optimize things on the fly instead of just once at
| compile time.
| pjmlp wrote:
| Java's HotSpot was originally designed for Smalltalk and
| Self.
|
| Those are two very dynamic systems, designed to be complete
| graphical workstations. Perl, Tcl, Python, and Ruby, as
| originally implemented, were not even close to the original
| Smalltalk JIT from Peter Deutsch's paper "Efficient
| Implementation of the Smalltalk-80 System" in 1984!
| taurknaut wrote:
| > Rails can become a ball of mud as much as any other
| framework can.
|
| ...but rails can fit this dysfunction on a single slide
| ;)
| weaksauce wrote:
| it's faster than python in some tests that i've seen.
| faraaz98 wrote:
| The Twitter fail whale was more a skill issue than a Rails
| shortcoming. If you read the book Hatching Twitter,
| you'll know quickly they weren't great at code
| nirvdrum wrote:
| If the Twitter fail whale is your concern, then your
| perception is outdated. Twitter started moving off Ruby
| in 2009. Both the CRuby VM and Rails have seen extensive
| development during that decade and a half.
|
| I never worked at Twitter, but based on the timeline it
| seems very likely they were running on the old Ruby 1.8.x
| line, which was a pure AST interpreter. The VM is now a
| bytecode interpreter that has been optimized over the
| intervening years. The GC is considerably more robust.
| There's a very fast JIT compiler included. Many libraries
| have been optimized and bugs squashed.
|
| If your concern is Rails, please note that also has seen
| ongoing development and is more performant, more robust,
| and I'd say better architected. I'm not even sure it was
| thread-safe when Twitter was running on it.
|
| You don't have to like Ruby or Rails, but you're really
| working off old data. I'm sure there's a breaking point
| in there somewhere, but I very much doubt most apps will
| hit it before going bust.
| ksec wrote:
| The CRuby VM, or the CRuby interpreter alone is at least
| 2-3x faster since Fail Whale time. And JIT doubles that
| to 4 - 6x. Rails itself has also gotten 1.5x to 2x faster.
|
| And then you have CPUs that are 20 - 30x faster compared to
| 2009, SSDs that are 100x - 1000x faster, and databases that
| are much more battle tested and far easier to scale.
|
| Sometimes I wonder, maybe we could remake Twitter with
| Rails again to see how well it goes.
| caiusdurling wrote:
| > Sometimes I wonder, maybe we could remake Twitter with
| Rails again to see how well it goes.
|
| Mastodon is written in Ruby on Rails (:
| johnmaguire wrote:
| Maybe not the best testimonial. From what I've heard,
| Mastodon is a bit of a beast to scale. While some of this
| is probably due to ActivityPub (a la
| https://lucumr.pocoo.org/2022/11/14/scaling-mastodon/)
| itself, some of it may be related to Ruby's execution
| model: https://lukas.zapletalovi.com/posts/2022/why-
| mastodon-instan...
|
| My issue with Ruby (and Rails) has always been the "ball
| of mud" problem that I feel originates from its extensive
| use of syntactical sugar and automagic.
| taurknaut wrote:
| This is actually pretty accurate, except ruby is just
| slower, not randomly fragile at scale.
| saagarjha wrote:
| Was it ever
| genewitch wrote:
| There was a cycle in 2012 or so. I reckon PHP has more
| lines of code deployed.
|
| But C?
| pier25 wrote:
| It never was "the most popular language in the world".
|
| Rails maybe was popular in the US at some point but it was
| always niche in the rest of the world.
| adamtaylor_13 wrote:
| Rails is experiencing something of a renaissance in recent
| years. It's easily one of the most pleasant programming
| experiences I've had in years.
|
| All my new projects will be Rails. (What about projects that
| don't lend themselves to Rails? I don't take on such projects
| ;)
| cship2 wrote:
| Hmm I thought Crystal was supposed to be a faster Ruby? No?
| mbb70 wrote:
| No one uses Ruby because it is fast. They use it because
| it is an ergonomic language with a battle-tested package
| for every webserver based activity you can code up.
| brigandish wrote:
| > No one uses Ruby because it is fast.
|
| Well, because it isn't.
|
| Crystal is an ergonomic language, too, looking a lot like
| Ruby even beyond a cursory glance. What Ruby has, like
| any longstanding language, is a large number of packages
| to help development along, so languages like Crystal have
| to do a lot of catching up. Looking at the large number
| of abandoned gems though, I'm not sure it's _that_ big a
| difference, the most important ones could be targeted.
|
| I'm not sure that has any relevance when compared with
| Python or JS or Go though, they seem to have thriving
| ecosystems too - is Rails really that much better than
| the alternatives? I wouldn't know but I highly doubt it.
| ksec wrote:
| I am still hoping that once Crystal stabilises on Windows
| (currently it still feels very much beta), they could
| work on making compile speed faster and on incremental
| compilation.
| Alifatisk wrote:
| > Crystal was supposed to be a faster Ruby
|
| No, it was never intended to be a replacement for Ruby. They
| share similarities in syntax, the same way Elixir's syntax
| reminds you of Ruby.
|
| If you want faster Ruby, check out MRuby (not always a
| drop-in replacement though).
| obiefernandez wrote:
| Stable mature technology trumps glory.
| jupp0r wrote:
| That's why the JVM has been using JITs since 1993 while
| it's a renaissance inspiring marvel for Ruby in 2025.
| pjmlp wrote:
| Unfortunately it is, because too many folks still reach out
| to pure interpreted languages for full blown applications,
| instead of plain OS and application scripting tasks.
| nialv7 wrote:
| isn't this exactly what libffi does?
| tenderlove wrote:
| libffi can't know how to unwrap Ruby types (since it doesn't
| know what Ruby is). The advantage presented in this post is
| that the code for type unboxing is basically "cached" in the
| generated machine code based on the information the user passes
| when calling `attach_function`.
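|
| A hedged illustration of where that cached type information
| comes from, namely the signature handed to `attach_function`:
|
|     require "ffi"
|
|     module LibC
|       extend FFI::Library
|       ffi_lib FFI::Library::LIBC
|       # declared once: Ruby String -> char*, C int -> Integer
|       attach_function :atoi, [:string], :int
|     end
|
|     # A generated stub for this binding can hard-code those
|     # conversions instead of switching on type at each call.
|     LibC.atoi("42")  # => 42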
| almostgotcaught wrote:
| you know i thought i knew what libffi was doing (i thought it
| was playing tricks with GOT or something like that) but i think
| you're right
|
| https://github.com/libffi/libffi/blob/master/src/tramp.c
| dzaima wrote:
| libffi doesn't JIT for FFI calls; and it still requires you to
| lay out argument values yourself, i.e. for a string argument
| you'd still need to write code that converts a Ruby string
| object to a C string pointer. And libffi is rather slow.
|
| (the tramp.c linked in a sibling comment is for "reverse-FFI",
| i.e. exposing some dynamic custom operation as a function
| pointer; and its JITting there amounts to a total of 3
| instructions to call into precompiled code)
| kazinator wrote:
| libffi is slow; it doesn't JIT as far as I know.
|
| In libffi you build up _descriptor_ objects for functions.
| These are run-time data structures which indicate the arguments
| and return value types.
|
| When making a FFI call, you must pass in an array of pointers
| to the values you want to pass, and the descriptor.
|
| Inside libffi there is likely a loop which walks the array of
| values, while at the same time traversing the descriptor, and
| places those values onto the stack in the right way according
| to the type indicated in the descriptor. When the function is
| done, it then pulls out the return according to its type. It's
| probably switching on type for all these pieces.
|
| Even if the libffi call mechanism were JITted, the preparation
| of the argument array for it would still be slow. It's less
| direct than an FFI JIT that directly accesses the arguments
| without going through an intermediate array.
|
| FFI JIT code will directly take the argument values, convert
| them from the Ruby (or whatever) type to the C type, and stick
| them into the right place on the stack or in a register, and do
| that with inline code for each value. Then call the function,
| and convert the return value to the Ruby type. Basically as if
| you wrote extension code by hand:
|
|     // Pseudo-code
|     RubyValue *__Generated_Jit_Code(RubyValue *arg1,
|                                     RubyValue *arg2) {
|       return CStringToRuby(
|           __targetCFunction(RubyToCString(arg1),
|                             RubyToCInt(arg2)));
|     }
|
| If there is type inference, the conversion code can skip type
| checks. If we have assurance that arg1 is a Ruby string, we can
| use an unsafe, faster version of the RubyToCString function.
|
| The JIT code doesn't have to _reflect_ over anything other than
| at worst the Ruby types. It doesn't have to have any array or
| list related to the arguments. It knows which C types are being
| converted to and from, and that is hard-coded: there is no data
| structure describing the C side that has to be walked at
| run-time.
| nialv7 wrote:
| I am surprised many don't know how libffi works. Yes, it does
| generate native machine code to handle your call. Look it up.
|
| Yes, it's probably worse than doing the JIT in the Ruby
| interpreter, since there you can also inline the type
| conversion calls, but the principles are the same.
|
| Edit: This is wrong, see comments below.
| dzaima wrote:
| It certainly uses native machine code, but I don't think it
| generates any at runtime outside of the reverse-FFI
| closures (at least on linux)? PROT_EXEC at least isn't used
| outside of them, which'd be a minimum requirement for linux
| JITting.
|
| Running in a debugger an ffi_call to a "int add(int a, int
| b)" leads me to https://github.com/libffi/libffi/blob/1716f
| 81e9a115d34042950... as the assembly directly before the
| function is invoked, and, besides clearly not being JITted
| from me being able to link to it, it is clearly inefficient
| and unnecessarily general for the given call, loading 7
| arguments instead of just the two necessary ones.
| nialv7 wrote:
| Oops, you are right. I think because the other direction
| of libffi - ffi_closure - has a jitted trampoline, I
| mistakenly thought both directions are jitted. Thanks for
| the correction.
| dzaima wrote:
| And the JITting in closures amounts to a total of three
| instructions; certainly not for speed, rather just as the
| bare minimum to generate distinct function pointers at
| runtime.
| haberman wrote:
| > Rather than calling out to a 3rd party library, could we just
| JIT the code required to call the external function?
|
| I am pretty sure this is the basis of the LuaJIT FFI:
| https://luajit.org/ext_ffi.html
|
| I think LuaJIT's FFI is very fast for this reason.
| aidenn0 wrote:
| Why does this need to be JIT compiled? If it could be written in
| C, then it certainly could just be compiled at load time, no?
| nirvdrum wrote:
| If what could be written in C? The FFI library allows for
| dynamic binding of library methods for execution from Ruby
| without the need to write a native extension. That's a huge
| productivity boost and makes for code that can be shared across
| CRuby, JRuby, and TruffleRuby.
|
| I suppose if you could statically determine all of the bindings
| at boot up you could write a stub and insert it into the method
| table. But, that still would happen at runtime, making it JIT.
| And it wouldn't be able to adapt to the types flowing through
| the system, so it'd have to be conservative in what it accepts
| or what it optimizes, which is what libffi already does today.
| The AOT approach is to write a native extension.
| aidenn0 wrote:
| By "it" I meant this part from TFA:
|
| > you should write a native extension with a very very
| limited API where most work is done in Ruby. Any native code
| would be a very thin wrapper around the function we actually
| want to call that just converts Ruby types in to the types
| required by the native function.
|
| I think our main disagreement is your assertion that any
| compilation at runtime qualifiees as JIT. I consider JIT to
| be dynamic compilation (and possibly recompilation) of a
| running program, not merely anything that generates machine
| code at runtime.
| brigandish wrote:
| It's an aside, but
|
| > Now, usually I steer clear of FFI, and to be honest the reason
| is simply that it doesn't provide the same performance as a
| native extension.
|
| I usually avoid it, or in particular, gems that use it, because
| compilation can be such a pain. I've found it easier to build it
| myself and cut out the middleman of Rubygems/bundler.
| cchianel wrote:
| I had to deal with a lot of FFI to enable a Java Constraint
| Solver (Timefold) to call functions defined in CPython. In my
| experience, most of the performance problems from FFI come from
| using proxies to communicate between the host and foreign
| language.
|
| A direct FFI call using JNI or the new foreign interface is fast,
| and has roughly the same speed as calling a Java method directly.
| Alas, the CPython and Java garbage collectors do not play nice,
| and require black magic in order to keep them in sync.
|
| On the other hand, using proxies (such as in JPype or GraalPy)
| causes a significant performance overhead, since the parameters
| and return values need to be converted, and might cause
| additional FFI calls (in the other direction). The fun thing is
| if you pass a CPython object to Java, Java has a proxy to the
| CPython object. And if you pass that proxy back to CPython, a
| proxy to that proxy is created instead of unwrapping it. The
| result: JPype proxies are 1402% slower than calling CPython
| directly using FFI, and GraalPy proxies are 453% slower than
| calling CPython directly using FFI.
|
| What I ultimately end up doing is translating CPython bytecode
| into Java bytecode, and generating Java data structures
| corresponding to the CPython classes used. As a result, I got a
| 100x speedup compared to using proxies. (Side note: if you are
| thinking about translating/reading CPython bytecode, don't; it is
| highly unstable, poorly documented, and its VM has several quirks
| that make it hard to map directly to other bytecodes).
|
| For more details, you can see my blog post on the subject:
| https://timefold.ai/blog/java-vs-python-speed
| LinXitoW wrote:
| Speaking from zero experience, the FFI stories of both Python
| and Java to C seem much better. Wouldn't connecting them
| via a little C bridge be a general solution?
| cchianel wrote:
| JNI/the new Foreign FFI communicate with CPython via
| CPython's C API. The primary issue is getting the garbage
| collectors to work with each other. The Java solver works by
| repeatedly calling user defined functions when calculating
| the score. As a result:
|
| - The Java side needs to store opaque Python pointers which
| may have no references on the CPython side.
|
| - The CPython side need to store generated proxies for some
| Java objects (the result of constraint collectors, which are
| basically aggregations of a solution's data).
|
| Solving runs a long time, typically at least an hour (although
| you can modify how long it runs for). If we don't free memory
| (by releasing the opaque Python Pointer return values), we
| will quickly run out of memory after a couple of minutes. The
| only way to free memory on the Java side is to close the
| arena holding the opaque Python pointer. However, when that
| arena is closed, its memory is zeroed out to prevent use-
| after-free. As a result, if CPython hasn't garbage collected
| that pointer yet, it will cause a segmentation fault on the
| next CPython garbage collection cycle.
|
| JPype (a CPython -> Java bridge) does dark magic to link the
| JVM's and CPython's garbage collector, but has performance
| issues when calling a CPython function inside a Java
| function, since its proxies have to do a lot of work. Even
| GraalPy, where Python is run inside a JVM, has performance
| issues when Python calls Java code which calls Python code.
| high_na_euv wrote:
| How would IPC methods fit such cases?
|
| Like, talk over some queue, file, http, etc
| cchianel wrote:
| IPC methods were actually used when constructing the foreign
| API prototype, since if you do not use JPype, the JVM must be
| launched in its own process. The IPC methods were used on the
| API level, with the JVM starting its own CPython interpreter,
| with CPython and Java using `cloudpickle` to send each other
| functions/objects.
|
| Using IPC for all internal calls would probably take
| significant overhead; the user functions are typically small
| (think `lambda shift: shift.date in
| employee.unavailable_dates` or `lambda lesson:
| lesson.teacher`). Depending on how many constraints you have
| and how complicated your domain model is, there could be
| potentially hundreds of context switches for a single score
| calculation. It might be worth prototyping though.
| ignoramous wrote:
| Also see: _cgo is not Go_
|
| > Go code and C code
| have to agree on how resources like address space, signal
| handlers, and thread TLS slots are to be shared -- and when I
| say agree, I actually mean Go has to work around the C code's
| assumption. C code that can assume it always runs on one
| thread, or blithely be unprepared to work in a multi threaded
| environment at all. C doesn't know anything about
| Go's calling convention or growable stacks, so a call down to C
| code must record all the details of the goroutine stack, switch
| to the C stack, and run C code which has no knowledge of how it
| was invoked, or the larger Go runtime in charge of the program.
| It doesn't matter which language you're writing bindings or
| wrapping C code with; Python, Java with JNI, some language
| using libFFI, or Go via cgo; it is C's world, you're just
| living in it.
|
| https://dave.cheney.net/2016/01/18/cgo-is-not-go /
| https://archive.vn/GZoMK
| eay_dev wrote:
| I've been using Ruby for more than 10 years, and seeing its
| development these days is very exciting. I hope
| IshKebab wrote:
| > Even in those cases, I encourage people to write as much Ruby
| as possible, especially because YJIT can optimize Ruby code but
| not C code.
|
| But the C code is still going to be waaay faster than the Ruby
| code even with YJIT. That seems like an odd reason to avoid C. (I
| think there are other good reasons though.)
| Alifatisk wrote:
| > the C code is still going to be waaay faster than the Ruby
| code even with YJIT.
|
| I can't find it but I remember seeing a talk where they showed
| examples of Ruby + YJIT hitting the same speed and in some
| cases a bit more than C. The downside, though, was that it
| required some warmup time.
| IshKebab wrote:
| I find that hard to believe. I've heard claims JIT can beat C
| for years, but they usually involve highly artificial
| microbenchmarks (like Fibonacci) and even for a high
| performance JITed language like Java it ends up not beating
| C. There's no way YJIT will.
|
| The YJIT website itself only claims it is around twice as
| fast as Ruby, which means it is still slower than a flat
| earth snail.
|
| The benchmarks game has YJIT and it's somewhere between PHP
| and Python. Very slow.
| cpursley wrote:
| Can anyone still make a case to start a new project in Rails in
| 2025 when there is Elixir LiveView?
|
| I enjoy Ruby but ActiveRecord is a mess and the language is slow
| and lacks real-time functionality.
| arrowsmith wrote:
| Rails has more mindshare, it's easier to hire for, there are
| more tutorials etc to help you when you get stuck, and Ruby has
| a more mature ecosystem of libraries/plugins than Elixir has.
|
| I'd still pick Phoenix over Rails any day of the week, but if I
| had to make the case for Rails, that would be it.
| adenta wrote:
| Is anyone using LiveView in production?
| evacchi wrote:
| somewhat related, this library uses the JVMCI (JVM Compiler
| Interface) to generate arm64/amd64 code on the fly to call native
| libraries without JNI https://github.com/apangin/nalim
___________________________________________________________________
(page generated 2025-02-13 23:01 UTC)