[HN Gopher] Ruby YJIT Ported to Rust
___________________________________________________________________
Ruby YJIT Ported to Rust
Author : the_duke
Score : 278 points
Date : 2022-04-20 08:18 UTC (14 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| jfmc wrote:
| "YJIT code ported from C99 to Rust" Beyond passing the test
| suite, are there more numbers to compare both versions? (e.g.,
| compilation time, lines of code, size of binaries, performance,
| etc.)
| matsadler wrote:
| I think the goal of this right now is just to match the C
| version.
|
| The C implementation of YJIT supported x86 Unix/Linux
| platforms, and it sounds like adding Windows and arm64 support,
| plus other improvements was a daunting task with the tools C
| provides.
|
| Now it's in Rust we'll hopefully see further improvements
| quicker.
| anitil wrote:
| This is the first time I've felt that Rust is starting to eat
| C's lunch.
| tormeh wrote:
| Rust will eclipse C++. C is a harder nut to crack,
| particularly for the embedded space where ease of
| implementing and maintaining a compiler back-end/code-
| emitter for your new weird 8-bit architecture is important.
| C is pretty close to an assembly macro and it's barely
| updated, which is great for that use-case. But for use
| cases like interpreters Rust is perfectly suitable.
| pjmlp wrote:
| Good luck with that on anything GPU or HPC related, or
| industries with language standards.
|
| Also until Rust compilers are bootstrapped, they will
| always rely on a C++ infrastructure.
| FullyFunctional wrote:
| I've been around for a long time and I haven't seen a PL
| with this much momentum since Java was launched. Inertia
| is real, but the benefits over C++ are undeniable.
|
| Bootstrapping seems a silly thing to be obsessed about as
| C++ will be around forever still, but it obviously can be
| bootstrapped if that becomes important.
| pjmlp wrote:
| Ada/SPARK already provided such benefits, and NVidia has
| chosen it instead of Rust for automotive firmware.
|
| Rust momentum is meaningless for GPUs unless NVidia
| decides it gets to play in CUDA, and they are now one of
| the companies with more ISO C++ people on their payroll.
|
| It is also meaningless for PlayStation, Nintendo and
| Xbox, unless the respective SDKs integrate Rust.
|
| Bootstrapping isn't silly, because LLVM and GCC are
| written in C++, so there isn't any "Rust will eclipse
| C++", when it depends on it for its existence.
| pcwalton wrote:
| SPARK doesn't provide the same feature set as Rust. If
| you want safe heap allocation in SPARK, then you get a
| garbage collector (unless you're talking really recent
| experimental extensions IIRC). If you want to forego the
| GC and remain memory-safe, then you also forego heap
| allocation. This might work for avionics code, but not
| for most apps.
|
| Besides, the post you're replying to is talking about
| "momentum", and it's obvious in 2022 that Ada doesn't
| have the momentum that Rust does (however you define
| "momentum"). NVIDIA is not the entire industry.
|
| Much of the rest of your post concerns video games, which
| are only a small portion of the total C++ code in
| existence. (And in any case it's not accurate to say that
| languages are "meaningless" unless the platform vendor
| officially supports them--console vendors don't maintain
| C# VMs either and yet Unity titles work just fine.)
| pjmlp wrote:
| What garbage collector? Ada never had one, besides the
| optional one in early standards, never implemented in any
| commercial compiler, thus removed in Ada 2012.
|
| I wasn't the one asserting momentum, and can relate to
| plenty of other industries where Rust isn't even on the
| radar.
|
| Going back to Ada example, Rust certainly doesn't have
| any momentum over Ada in high integrity computing.
|
| Console vendors do happen to collaborate with Unity, and
| make it first party on their SDKs, so yet another lack of
| information.
| pcwalton wrote:
| WebRender is certainly "GPU related" and is shipping to
| millions of happy Firefox users.
|
| And yes, LLVM is written in C++. So what? C++ compilers
| depend on C code in libc. Portions of libc are written in
| assembler. Some assembly instructions are decomposed into
| microcode. Yet nobody doubts that C++ has eclipsed
| assembly language in terms of importance to the industry
| nowadays. We'll always need a way for humans to read the
| actual instructions that the silicon interprets, but
| relatively few people need to be able to do that
| nowadays. That dynamic is what the parent post means by
| one language "eclipsing" another.
| pjmlp wrote:
| For how long? 3% and decreasing.
|
| Libc is UNIX only.
|
| As for the rest, it is useful to tone down hype with some
| cold water reality check.
| Ar-Curunir wrote:
| > tone down hype with some cold water reality check.
|
| I mean, you're the one who keeps mentioning Ada/SPARK on
| every Rust thread, so if anyone needs to stop hyping
| things, it's perhaps you?
| vlovich123 wrote:
| This position is like saying C or C++ won't eat ASM's
| lunch. While technically true since there's a lot of ASM
| code still being written, especially for extremely low-
| level or high performance code, the vast majority of C
| and C++ developers don't actually touch ASM (i.e. C/C++
| dominate ASM in terms of number of developer hours
| spent).
|
| I think you may also be overlooking the GCC backend for
| rustc and gccrs, a ground-up standalone reimplementation
| of the Rust language frontend for GCC. Both of those
| should drastically improve the coverage and availability
| of Rust to all the same platforms you would be using GCC
| to compile C code for.
|
| Depending on the compiler support, you might get that
| architecture for free unless the vendor is providing
| their own C compiler. The harder part is that your new
| weird 8-bit architecture probably won't benefit as much
| from the strong no_std ecosystem of libraries, so the
| overhead of writing Rust won't be counterbalanced. Still,
| like I said at the outset, this is an extremely niche
| use-case. Rust doesn't have to wipe C or C++ from the map
| for it to crack that nut.
|
| The harder nut for Rust to crack I think is actually C++.
| There are extremely large C++ codebases. Industry would
| love for there to be a significantly easier/cheaper story
| to tell in terms of integrating Rust with those
| codebases. That way you could set metrics around
| converting the codebase, new code has to be written in
| Rust etc. However, the challenge is that Rust can only
| replace components with very well-defined boundaries.
| Those boundaries are less clearly defined in C++
| codebases than they are in C codebases (linkage +
| templates in particular are challenging). To truly crack
| the C++ nut probably requires solving this problem unless
| Rust codebases just start eating C++ codebases
| commercially through development velocity (which is a
| much longer and harder path).
| rapsey wrote:
| Very few will actually rewrite code in rust. It is enough
| for Rust to be used for new projects which would
| otherwise be c or c++
| rubyfan wrote:
| Can you say more about why you think that?
| tmikaeld wrote:
| Was looking for the same thing, what does this mean for Ruby
| performance?
| rvz wrote:
| It means completely nothing for performance.
| block_dagger wrote:
| Nothing
| ModernMech wrote:
| According to the post not much. The Rust version performs
| about the same because it generates mostly the same machine
| code.
| Rafert wrote:
| YJIT benchmarks can be found at https://speed.yjit.org/
|
| The Rust port doesn't change performance much according to
| the pull request description.
| npalli wrote:
| Your own link states --
|
| Overall YJIT is 33.4% faster than interpreted CRuby! On
| Railsbench specifically, YJIT is 32.4% faster than CRuby!
| WJW wrote:
| Yes, but YJIT in Rust is the same ~33.4% faster than
| vanilla CRuby as YJIT in C. The rewrite into Rust is
| expected to make YJIT easier to maintain and that _may_
| in turn make possible further improvements to code
| generation, but the rewrite generates the same machine
| code (and therefore the same speedup) as before.
| ewalk153 wrote:
| I didn't know this was public, sweet! Nice that the tooling
| that generates this report is also published:
| https://github.com/Shopify/yjit-metrics
| pizza234 wrote:
| Very likely, the performance of a JIT comes from:
|
| - the architecture of the JIT itself
|
| - the generated code
|
| AFAIK, the Rust YJIT doesn't change either (they explicitly say
| that the generated code is approximately the same), so no
| significant difference in performance should be expected.
| asymmetric wrote:
| The PR itself says:
|
| > The new Rust version of YJIT has reached parity with the C
| version, in that it passes all the CRuby tests, is able to run
| all of the YJIT benchmarks, and performs similarly to the C
| version (because it works the same way and largely generates
| the same machine code). We've even incorporated some design
| improvements, such as a more fine-grained constant invalidation
| mechanism which we expect will make a big difference in Ruby on
| Rails applications.
| faitswulff wrote:
| RoR comparison benchmarks would be nice to see.
| erk__ wrote:
| The proposal for this was previously discussed at:
| https://news.ycombinator.com/item?id=29971360
| dj_gitmo wrote:
| > ... it works the same way and largely generates the same
| machine code
|
| How can they make this determination? Do they just eyeball a few
| sections of the machine code from each output? Is there some tool
| that can compare binaries? Is this just a very literal, function
| by function, translation from C to Rust?
|
| I don't know much about reading/comparing machine code.
| pizza234 wrote:
| I'm not familiar with what YJIT generates, however, in general
| terms, if the ASM code for a given bytecode is small enough
| (which I think it is), one can just compare them side by side,
| or just log them and compare them separately. I think a JIT for
| Ruby should compile relatively small chunks of ASM, not big
| walls of code (but again, this is my guess).
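|
| A minimal sketch of the "log them and compare" idea, assuming each
| implementation can dump the bytes it emitted for the same input.
| The helper name `first_mismatch` and the sample bytes are
| hypothetical; this is not part of YJIT's tooling:

```rust
/// Returns the offset of the first byte at which two emitted code
/// buffers differ, or `None` if they are byte-for-byte identical.
/// (Hypothetical helper for illustration only.)
fn first_mismatch(a: &[u8], b: &[u8]) -> Option<usize> {
    a.iter()
        .zip(b.iter())
        // Compare the overlapping prefix byte by byte.
        .position(|(x, y)| x != y)
        // If the prefix matches but the lengths differ, the mismatch
        // starts where the shorter buffer ends.
        .or_else(|| {
            if a.len() != b.len() {
                Some(a.len().min(b.len()))
            } else {
                None
            }
        })
}
```

| Identical buffers yield `None`; otherwise the returned offset is
| where one would start disassembling both outputs side by side.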
| pjmlp wrote:
| They mean the output of JIT compiler, not the binary itself.
| fuzzythinker wrote:
| Here's a benchmark [1] done in Jan '22 against many Ruby
| implementations. TruffleRuby [2] seems to be way ahead in most,
| and at least ahead in all. Why isn't TruffleRuby talked about much
| here?
|
| [1] https://eregon.me/blog/2022/01/06/benchmarking-cruby-mjit-
| yj...
|
| [2] https://github.com/oracle/truffleruby
| pizza234 wrote:
| When I read about its performance, I had the same thoughts,
| however, I was surprised to read this in the Github project
| readme:
|
| > TruffleRuby might not be fast yet on Rails applications and
| large programs. Notably, large programs currently take a long
| time to warmup on TruffleRuby and this is something the
| TruffleRuby team is currently working on. Large programs often
| involve more performance-critical code so there is a higher
| chance of hitting an area of TruffleRuby which has not been
| optimized yet.
|
| I guess that they have a high-performing JIT that is optimized
| for small but not yet for large programs. I'm curious, though, what
| technically makes such a difference.
| stormbrew wrote:
| This is always a problem for any JIT. Large codebases,
| especially ones as heavy on dynamic code paths as rails, run
| individual pieces of code less frequently than smaller ones
| (because they're just doing more work in general, and in
| rails' case are constantly spawning new code to deal with).
|
| Then you have to instrument the code while it's running under
| a VM to decide what (and how) to JIT, and then you have to
| compile and assemble it. You also probably have to deal with
| some quasi-locking around the call sites as you switch code
| from using the VM to using the JIT.
|
| So, basically by the laws of thermodynamics, all else equal a
| JITing VM will be slower than a non-JITing one, and the
| benefits of JIT won't kick in until you have enough code
| instrumented and compiled to make a dent in that performance
| loss from the extra work.
|
| And then, the cleverer your JIT, and the more you optimize
| the code under compile, the more off-balance this gets,
| because doing those things gets more expensive.
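|
| The break-even argument above can be sketched as simple
| arithmetic. This assumes a fixed one-time compile cost and constant
| per-call costs, which real JITs do not have; the function name and
| all numbers are hypothetical:

```rust
/// Smallest number of calls after which compiling a method pays off,
/// i.e. the smallest n with compile_ns + n * jit_ns < n * interp_ns.
/// Returns `None` when the compiled code is not faster per call, so
/// the compile cost can never be recovered.
fn break_even_calls(interp_ns: u64, jit_ns: u64, compile_ns: u64) -> Option<u64> {
    if jit_ns >= interp_ns {
        return None;
    }
    // Time saved on every call that runs compiled code.
    let saved_per_call = interp_ns - jit_ns;
    // Strict inequality: n must exceed compile_ns / saved_per_call.
    Some(compile_ns / saved_per_call + 1)
}
```

| With 10 ns per interpreted call, 5 ns per compiled call and a
| 1000 ns compile cost, the method has to run 201 times before the
| JIT is a net win. Code that runs less often than that only pays
| the overhead, and a more expensive optimizer pushes the threshold
| higher.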
| epage wrote:
| I found this part odd
|
| > If YJIT is built in dev mode, then cargo is used to fetch
| development dependencies, but when building in release, cargo is
| not required, only rustc
|
| I'm not finding any information on why they bypass cargo and
| build directly with rustc. I'm curious what requirements led to
| this.
| nightpool wrote:
| My guess is that this is a tradeoff made to make it more
| compatible with downstream distro packaging systems like Debian
| and Red Hat, which generally look very negatively on requirements
| to use external package managers like Cargo. Because their set of
| dependencies is very small, they can remove Cargo from the build
| process, which greatly reduces how complex it is to compile the
| Ruby codebase.
| steveklabnik wrote:
| Both Debian and Red Hat package Cargo, and also package
| various Rust crates as their own native packages, and then
| have Rust programs use Cargo to use them.
|
| (Okay actually I'm unsure about Red Hat, but this is how
| Fedora does it...)
| the_duke wrote:
| The Cargo.toml file gives the answer:
| https://github.com/Shopify/ruby/blob/rust-yjit-upstreaming/y...
|
| There is only a single, optional dependency which is apparently
| only used for testing.
| rtpg wrote:
| Maybe the production release has no dependencies to download,
| but the dev release has some helper stuff for running tests
| etc?
| stormbrew wrote:
| Cargo is .. not a pleasant build tool to integrate into other
| build systems. If you have a project (like mainline ruby) where
| the dominant mode is C and you need to integrate some rust into
| it, you will eventually feel like it'd be a useful use of your
| time to bypass cargo and use the compiler directly.
|
| Cargo is a fantastic tool, easily one of the best of its ilk,
| but real talk: it needs to be normalized that sometimes you
| don't want or need to use it. It is fit to a very specific set
| of tasks (mostly producing stand-alone binaries), and that set
| of tasks is a subset of the tasks rust as a whole is fit for.
| steveklabnik wrote:
| Hilariously, years ago I did some Rust/Ruby integration, for
| fun, and had Ruby's makefiles just call Cargo to build the
| rust code. It worked just fine.
|
| It doesn't work for all things in all cases, of course, but
| it can be workable. At work we built a build system on top of
| Cargo to paper over some of its deficiencies. It's not ideal
| but IMHO it's still better than dealing with rustc directly.
| In this case it's easier for them since they have no external
| dependencies.
| stormbrew wrote:
| I mean, calling rustc isn't so bad _other than_ managing
| dependencies. I don't think it's really all that much more
| fraught than the compilers of other complex languages
| (including C++) that people manage to interact with
| directly. But cargo is simultaneously so good at dealing
| with dependencies, and (for lack of a better word)
| parasitic in its integration with nearly every crate in
| existence, that the moment you want to pull something else
| in it gets Hard.
|
| Where it gets _real_ messy is if you want to go back and
| forth (C->rust->C or rust->C->rust where the bookends are
| in the same codebase). This was a thing we wanted to do at
| the job I just left, but we never managed to make it work
| in a way that wasn't very janky. This was in a very mature
| and large C codebase managed by cmake, where we were
| gradually eating parts of it with rust, though.
| steveklabnik wrote:
| Yes, absolutely, the end of your first paragraph is
| really what I mean; you end up having to basically
| rebuild cargo anyway. If you have a self contained code
| base, it's not like rustc is inherently bad to call
| directly, for sure.
| stormbrew wrote:
| Yeah. I really wish cargo had a "create a build
| environment" mode. It would make integrating rust into
| other systems a lot easier.
| hardwaregeek wrote:
| This is so cool! If new contributions to Ruby could be written in
| Rust, I'd be a lot more inclined to contribute. I don't think I'm
| alone here. Andy Kelley noted that the new Zig compiler has
| significantly more contributors, likely due to it being written
| in Zig and not C++.
|
| Some people may roll their eyes at this, but it is a lot more
| enticing to work on a Rust codebase than a C/C++ one. I'm less
| likely to screw up and create a serious bug; I get a lot more
| help from the compiler; the build system is standardized and
| simple; and it's just plain fun.
| faitswulff wrote:
| Unfortunately:
|
| > To be clear, it's OK to use Rust to implement YJIT (and other
| optional features in the future), but mainline CRuby will not
| be implemented in Rust.
|
| - Matz, https://bugs.ruby-lang.org/issues/18481#note-14
|
| On the other hand, there is Artichoke Ruby:
| https://github.com/artichoke/artichoke
| hardwaregeek wrote:
| True, although Matz has changed his mind in the past (type
| annotations come to mind). I wouldn't be surprised if Matz
| reconsiders this once people notice that YJIT contributions are
| far more common.
| MuffinFlavored wrote:
| I wonder if they found any bugs in the C99 version due to Rust's
| "memory/type safety" and all that?
| throwaway-m3232 wrote:
| Why not C++, for better portability? If I want to design my own
| CPU, I will have to add it to GCC. But Rust is LLVM so if I want
| to support Ruby-jit on my CPU, I will also have to support
| LLVM.
| antonvs wrote:
| C++ is a 28-year old language that's been showing its age for
| at least a decade or two. If we want the software world to
| progress we need to move on from such languages.
| cjg wrote:
| Rust has some GCC support.
| cesarb wrote:
| > I will also have to support LLVM.
|
| This won't be an issue for long, as there's already a GCC
| backend for Rust in development.
| brobinson wrote:
| Why not a memory safe language, to avoid those 70% of CVEs?
|
| (67% of 0-days last year:
| https://news.ycombinator.com/item?id=31085539)
| infamouscow wrote:
| Because Ruby is already memory safe and JIT miscompilation is
| a logic bug.
| pizza234 wrote:
| The fact that a language is memory safe doesn't imply that
| the underlying virtual machine/interpreter is.
|
| On the other hand, it's definitely true that the ASM
| generated is as unsafe as it gets, but the first point
| still stands. The memory unsafety of the VM is simply an
| additional attack vector.
| carlmr wrote:
| How is JIT miscompilation or vulnerabilities in the JIT
| compiler not an issue?
| infamouscow wrote:
| A Ruby program can delete all of the files on a computer,
| insert arbitrary rows into a database, drop a table, send
| email with attachments, etc. Am I correct that you're
| concerned the Ruby JIT itself will have a security
| vulnerability in the act of JIT compiling Ruby code? This
| seems extremely myopic.
| criticaltinker wrote:
| JS engines have had many serious vulnerabilities in their
| JIT optimizers, it's not myopic at all and is a well
| known technique in the industry.
|
| I agree that some folks aren't executing untrusted ruby
| code so they wouldn't have to worry about this - but how
| many PaaS/SaaS products out there are? Or how about third
| party dev tools that are blindly downloaded and executed
| on local workstations or CI pipelines?
| infamouscow wrote:
| > JS engines have had many serious vulnerabilities in
| their JIT optimizers, it's not myopic at all and is a
| well known technique in the industry.
|
| HotSpot and V8 are both written in C++ and get more use
| than any other JIT on Earth.
|
| Can you provide a link to a CVE caused by JIT
| miscompilation and explain how Rust would have been able
| to prevent the bug in a way that C++ wouldn't?
|
| > I agree that some folks aren't executing untrusted ruby
| code so they wouldn't have to worry about this - but how
| many PaaS/SaaS products out there are?
|
| This is what Xen, KVM, and Hyper-V do.
|
| > Or how about third party dev tools that are blindly
| downloaded and executed on local workstations or CI
| pipelines?
|
| Are you suggesting a Ruby JIT shouldn't generate machine
| code that corresponds to the Ruby program, but somehow
| magically prevent stupid developers from doing stupid
| things?
| bastawhiz wrote:
| It's a bad look if a malicious HTTP request to your Rails
| app can trigger RCE on your server. It's not about
| running code that's malicious, it's about bad data
| triggering a code path in the VM that is able to change
| the function of the application.
| infamouscow wrote:
| What you're describing is a logic bug.
|
| JITs write instructions to memory in a manner that's only
| slightly different than writing bytes to a file. The
| generation of those instructions can either be correct or
| incorrect and happens regardless of programming language.
|
| A JIT written in Python is equally capable of generating
| bad code as a JIT written in C or Rust or Lisp. A perfect
| port of a buggy JIT written in language A will generate
| the same buggy code even after being ported to language
| B.
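|
| The "writing bytes" point can be made concrete with a minimal
| sketch. It assumes x86-64 and hand-encodes `mov eax, imm32; ret`
| into a plain byte buffer; the function name is hypothetical, this
| is not YJIT's assembler API, and the buffer is never executed
| (that would need executable memory):

```rust
/// Appends machine code for "return this 32-bit constant" to `buf`.
/// Encoding: B8 <imm32, little-endian> = mov eax, imm32; C3 = ret.
/// (Hypothetical emitter for illustration only.)
fn emit_return_const(buf: &mut Vec<u8>, value: u32) {
    buf.push(0xB8); // opcode for mov eax, imm32
    // x86 immediates are little-endian. Using to_be_bytes() here would
    // still compile in any language: the type checker sees only bytes,
    // so a wrong encoding is a logic bug, not a memory-safety bug.
    buf.extend_from_slice(&value.to_le_bytes());
    buf.push(0xC3); // ret
}
```

| The emitter itself is ordinary safe code; whether the bytes it
| produces are correct is a separate question that the host
| language's memory safety does not answer.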
| Tobu wrote:
| Rust's type system is enough to get rid of memory unsafety
| and UB, but it does that by enforcing more invariants,
| invariants which you also use to encode properties you care
| about. 70% of vulnerabilities are memory unsafety
| which is impossible in safe Rust etc etc, but a better type
| system, a language that doesn't disclaim commonly found
| code as unsupported, more productive errors, lower
| cognitive load... also tends to help with the rest of the
| bugs.
| unrealhoang wrote:
| Because Rust is much easier to learn than C++ so the authors
| are more comfortable with Rust?
| matharmin wrote:
| If you want to design your own CPU, supporting LLVM is going to
| give you much greater benefits than supporting Ruby. Never mind
| the fact that you don't even need this to support Ruby.
| Tobu wrote:
| To add to your point, following Woodruff's "Weird
| architectures weren't supported to begin with", Robert
| O'Callahan pointed out[1] that for one definition of the
| open-source platform (looking at the requirements of Linux
| distributions), a new architecture would need to support at
| least: LLVM and GCC targets, a port of the Linux kernel, a V8
| backend, and acceleration for various codecs.
|
| And while at this point a platform needs to have support from
| both compilers, I can see the GCC/glibc ecosystem being made
| redundant; LLVM is more adaptable and has found its way into
| so many specialized compiler stacks.
|
| [1]: https://lwn.net/Articles/847830/
| sanxiyn wrote:
| This is a non-issue. YJIT only targets x86-64. After all, this
| is a JIT. If you designed a new architecture X, you need to
| port YJIT itself to target X, in addition to GCC, LLVM, etc.
| throwaway-m3232 wrote:
| Oh, so YJIT is highly coupled to x86-64? Porting GCC + yjit
| is less work than porting GCC + yjit + LLVM.
| lalaithion wrote:
| Why even port GCC at all, and not simply LLVM?
| lnxg33k1 wrote:
| But why not just buy an existing CPU on amazon
| dkersten wrote:
| So just port LLVM + yjit.
| FooBarWidget wrote:
| It's not like new architectures appear very quickly, much
| less get adopted very quickly. The benefits of reduced
| maintenance overhead and increased development speed far
| outweigh the theoretical downside of having to port LLVM
| to that new architecture.
| byroot wrote:
| It's not that it's highly coupled, just that it's still the
| early days and only x86_64 was on the roadmap. Arm64 is
| planned, and will hopefully make it into Ruby 3.2
| FullyFunctional wrote:
| And with an Arm64 backend, adding RISC-V is probably
| going to be a walk in the park.
| mustache_kimono wrote:
| I'm not sure I understand why some people really hate Rust, but
| when the argument feels like "But can't we be miserable
| forever?" I just have to laugh.
| mustache_kimono wrote:
| FYI -- my technical thinking -- because Rust is a nicer
| language for the people who have to work with it. Full stop.
|
| Rust offers substantial memory safety guarantees, but that
| isn't the only thing it offers. People who don't know this
| are those that haven't tried it. Others have focused on
| security in this thread, and I think that's wrong headed.
| That's obviously not the reason for choosing Rust here. It's
| that it makes things that are important now and in the
| future, like say concurrency, easier and more likely to be
| correct. Yes, ergonomics and a nice dev experience actually
| matter _even for the people writing your compiler_!
|
| Moreover, Rust GCC support is far closer to being a thing
| than yjit is to being a thing. So -- let the kids play.
| bilkow wrote:
| > If I want to design my own CPU, I will have to add it to GCC.
|
| Why do you "have" to add it to GCC? You could only add it to
| LLVM instead.
| xutopia wrote:
| How will this benefit Ruby?
| coder543 wrote:
| Explained here: https://bugs.ruby-lang.org/issues/18481
| riffraff wrote:
| if YJIT is successful, ruby will be faster, which is good(tm).
| The rationale for the rust rewrite is that rust may be better
| suited for writing a JIT than C is.
| kibwen wrote:
| Motivation: https://bugs.ruby-lang.org/issues/18481
|
| _"The motivation behind this is that we are facing challenges
| in terms of code maintainability. As you know, JIT compilers can
| get very complex, and C99 doesn't offer many tools to manage this
| complexity. There are no classes and methods, limited type
| checking, and it's hard to fully separate code into modules, for
| instance."
|
| "We believe that having access to object oriented programming and
| a more expressive type system would help us manage growing
| complexity better and also improve the safety/robustness of YJIT.
| For instance we would like to add Windows support and a new
| backend to YJIT. That means we'll have two separate backends
| (x86, arm64) and we'll need to support two different calling
| conventions (Microsoft, SystemV), but currently, we have limited
| tools to build the abstractions needed, such as preprocessor
| macros and if-statements."_
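|
| A minimal sketch of how two backends and two calling conventions
| might be modeled with enums instead of preprocessor macros. The
| type and function names are hypothetical and not taken from the
| YJIT source; the register lists follow the published System V
| x86-64, Microsoft x64 and AArch64 conventions:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Arch {
    X86_64,
    Arm64,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum CallConv {
    SystemV,
    Microsoft,
}

/// Registers used for the leading integer arguments of a call.
/// The match is exhaustive: adding a new `Arch` or `CallConv`
/// variant is a compile error until every combination is handled.
fn int_arg_regs(arch: Arch, conv: CallConv) -> &'static [&'static str] {
    match (arch, conv) {
        (Arch::X86_64, CallConv::SystemV) => &["rdi", "rsi", "rdx", "rcx", "r8", "r9"],
        (Arch::X86_64, CallConv::Microsoft) => &["rcx", "rdx", "r8", "r9"],
        // AAPCS64 and the Windows ARM64 convention both pass the
        // first eight integer arguments in x0-x7.
        (Arch::Arm64, _) => &["x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7"],
    }
}
```

| The design point is the exhaustive `match`: with `#ifdef`s or
| if-statements in C, a forgotten combination is only found at run
| time, if at all.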
| czbond wrote:
| Thank you for posting the motivation - I was curious about the
| "why"... and maintainability now explains it.
| WolfOliver wrote:
| What is the benefit? Can I run Ruby in the browser now?
| WJW wrote:
| Yes you can run Ruby in the browser if you want, but not
| because of this PR. Ruby-in-WASM was merged a few weeks ago.
|
| This PR rewrites the YJIT just-in-time compiler code from C
| into Rust, because the dev team likes Rust better and expects
| that it will make development of new features easier.
| vinceguidry wrote:
| Don't forget Opal!
|
| https://opalrb.com/
___________________________________________________________________
(page generated 2022-04-20 23:01 UTC)