[HN Gopher] Our Experience Porting the YJIT Ruby Compiler to Rust
___________________________________________________________________
Our Experience Porting the YJIT Ruby Compiler to Rust
Author : thunderbong
Score : 86 points
Date : 2022-05-11 18:41 UTC (4 hours ago)
(HTM) web link (shopify.engineering)
(TXT) w3m dump (shopify.engineering)
| epage wrote:
| Glad to see an explanation for why they bypassed cargo. I had
| wondered about that when the PR was posted as it didn't contain
| any documentation for why they were going off the beaten path.
| However, I'm a bit confused at the reasoning. Because a solution
| felt sub-optimal (vendoring an optional dependency), they went
| the route of invoking rustc directly? That seems a bit scorched
| earth especially when they later talk about needing another
| dependency offline and instead of vendoring both they instead
| worked around that in a different way.
| Thaxll wrote:
| Only 3month for such a project is impressive, how big was the
| original C code base?
| vlang1dot0 wrote:
| According to the article:
|
| > YJIT is a relatively simple JIT compiler that totals about
| 11,000 lines of C code.
| SemanticStrengh wrote:
| the rust port is 16K LOCs despite the crazy C verbosity??
| (I'm reffering to the github addition count)
| pawelduda wrote:
| They stated it's not that big of a project and since it was a
| direct code conversion rather than to idiomatic Rust, it
| probably reduced the workload.
| ibraheemdev wrote:
| The rewrite commit is here for anyone interested:
| https://github.com/ruby/ruby/pull/5826
| steveklabnik wrote:
| I've been tickled pink to watch this work happen. A wild fusion
| of my past and my present. And of course, much cleaner and
| properly done, as opposed to fun experiments I used to like to
| do, like https://github.com/steveklabnik/ruby/tree/rust
| rubyskills wrote:
| Haha, love it. Both of your passions melded together! (We met
| in Michigan at our mutual rust conf btw)
| throwaway38311 wrote:
| avgcorrection wrote:
| You might recognize the author from https://pointersgonewild.com/
| faitswulff wrote:
| I wonder what the author would have preferred to the visual
| litter of `unsafe` blocks? Can you use `unsafe` to mark off
| entire scopes to avoid having to redeclare it? Seems unavoidable
| in a project that is meant to heavily interoperate with a C99
| code base like Ruby.
|
| Cargo having to phone home all the time sounds annoying, though.
| I hope that gets improved, though I know the cargo team is
| swamped right now.
| steveklabnik wrote:
| > Can you use `unsafe` to mark off entire scopes to avoid
| having to redeclare it?
|
| Yes.
|
| > Cargo having to phone home all the time sounds annoying,
| though.
|
| It does not have to phone home, ever. They for some reason
| didn't seem to want to use the right workflow, but that
| workflow does exist, and is an important and well-supported
| use-case.
|
| (It seems from the bug report that they did hit a confusing
| error message, which is unfortunate.)
| faitswulff wrote:
| Thanks for the clarifications!
| steveklabnik wrote:
| No problem. And I am serious about the "for some reason"
| bit, I don't know why `cargo vendor` isn't acceptable for
| them, but they know their requirements better than I do.
| tptacek wrote:
| She gave the reason in the piece, and linked to the ticket
| she filed:
|
| https://github.com/rust-lang/cargo/issues/10352
| steveklabnik wrote:
| I read the piece and the ticket. I still don't understand.
| That's fine, it's not my responsibility to anymore. I'm
| glad they found a solution they're happy with.
|
| Specifically, cargo vendor was suggested in the ticket, but
| there was no response as to why it wasn't adequate for this
| case, just that it doesn't "feel" good. I'm probably
| missing something.
| tptacek wrote:
| The reason is right there in the ticket. They have
| exactly one dependency, and it's optional. They don't
| want to vendor it just so they can build the no-capstone
| artifact in CI without having to phone home to crates.io
| or whatever. You don't have to agree with her, but you
| can't pretend she didn't give a clear reason.
| epage wrote:
| > They have exactly one dependency, and it's optional.
|
| Except they would have two dependencies (the other being
| libc) but they wanted to avoid vendoring for that as
| well.
| tptacek wrote:
| That's a good point, thanks!
| steveklabnik wrote:
| Saying "we don't want to" is technically a reason, sure,
| but it doesn't mean I understand the why. Why is that a
| problem?
|
| (Depending on that why, I have an idea or two for a
| workaround, but you can't suggest solutions until you
| understand the actual issue. I have my own laundry list
| of issues with Cargo that I have to work around; it's
| painful and I have a lot of empathy for that.)
| tptacek wrote:
| For all the same reasons that vendoring isn't the
| _default_ option in any dependency management system in
| any language: because there is a cost to vendoring. She
| didn 't say that, but I wouldn't think she'd have to. You
| don't vendor all of your dependencies either: why not?
| There's your answer.
|
| I'm wondering if maybe you've missed the part where they
| don't need this single dependency in the builds they're
| talking about doing.
|
| It seems like the Rust community people responding on the
| ticket had no trouble understanding what the issue is.
| There's a difference between acknowledging something and
| conceding that it needs to be changed.
| steveklabnik wrote:
| I don't because I don't need to worry about offline
| builds. If I did, then I'd vendor them. Hence not
| understanding. (And even if it literally is that only,
| then I have at least one idea for what they _could_ do
| that would make this work. But maybe that solution would
| fail for other reasons, and I want to acknowledge that
| it's not like I have the full requirements list and am
| saying "oh gosh just do this," because I am not.)
|
| That it's a single dependency is why it feels extreme to
| me; won't that be a very, very tiny thing to vendor?
|
| Anyway I don't really know why you're being so aggro
| here. I don't know how many times I can say "I'm probably
| missing something" and "they know better than I do." I
| really like you, Thomas, but I also hear the passive
| aggression, and don't like it. It makes me want to return
| it, but that doesn't benefit anyone.
| tptacek wrote:
| You said "They for some reason didn't seem to want to use
| the right workflow", which by my lights manages to be
| both loaded and dismissive (if that was the goal, well
| played). I pointed out that the "some reason" is right
| there in the post, and you doubled down. I'm not sure why
| you did; you could have just said, "oh, interesting,
| thanks" and I'd have said "pip pip! cheerio!" but if you
| want instead to... uh... explore the contours of this
| controversy? I'm always game, which is what you're
| experiencing now.
|
| If you want to keep me invested in a programming
| discussion, one very good way to do it is to imply that
| you can't for the life of you understand why someone
| wouldn't want to vendor Capstone.
| tptacek wrote:
| For what it's worth: the complaint here makes sense, but
| "unsafe" isn't just telling the compiler what's going on, it's
| (probably more importantly) telling other programmers what's
| going on.
| VWWHFSfQ wrote:
| > Unlike C, Rust won't automatically promote integer types to
| wider types. It forces you to manually cast any mismatching
| integer types for every operation
|
| > Rust's insistence on manual casting everywhere encourages
| people to write inefficient code, because writing verbose code
| feels uncomfortable and adds friction.
|
| This reads a lot to me like "we like how easy it is to create
| bugs in C. We'd like to be able to do that in Rust, too." But
| maybe I'm misunderstanding. I'm sure these people know a lot more
| about the consequences of implicit integer casting than I do.
| colejohnson66 wrote:
| Yes, but:
|
| > ..there's no reason why you couldn't safely promote a u8,
| u16, or u32 into a usize, and asking programmers to manually
| cast integers everywhere makes the code more noisy and verbose.
|
| They have a valid point. In C#, numerical casts that won't lose
| precision are implicit, and any that could are explicit. I
| think that's a nice compromise. So, int to uint requires a
| cast, but int/uint to long doesn't: int myInt
| = /* ... */; long myLong = myInt; uint myUInt =
| myInt; // error CS0266: Cannot implicitly convert type 'int' to
| 'uint'. An explicit conversion exists (are you missing a cast?)
| ulong myULong = myInt; // error CS0266: Cannot implicitly
| convert type 'int' to 'ulong'. An explicit conversion exists
| (are you missing a cast?) uint myUInt = /* ... */;
| int myInt = myUInt; // error CS0266: Cannot implicitly convert
| type 'uint' to 'int'. An explicit conversion exists (are you
| missing a cast?) long myLong = myUInt; ulont
| myULong = myUInt;
| stonemetal12 wrote:
| u8 max value + 1 means something as a u8. u8 max value + 1
| means something different as a u16. Implicitly changing
| behavior is never cool.
| shepmaster wrote:
| And chiming in:
|
| > there's no reason why you couldn't safely promote a u8,
| u16, or u32 into a usize
|
| The people running Rust on a 16-bit architecture (e.g. AVR)
| would disagree. On those platforms, usize is 16-bit, so
| promoting u32 to usize would not be possible.
|
| I _do_ wish there was a nicer way to declare up-front "I
| *never* want this crate to compile on something smaller than
| 32-bit (or 64-bit), so assume that".
|
| IIRC, similar ideas like this have floated around under the
| name of a "platform compatibility lint".
| wahern wrote:
| But whether implicitly or explicitly converted, the
| compiler would reject the code all the same, no? (Unless an
| explicit conversion would actually succeed, which is even
| _worse_. I 'm not a Rust programmer so don't know. But this
| is why explicit casts are frowned upon in C.)
| JoshTriplett wrote:
| The proposal is that by default the compiler would refuse
| to do an infallible conversion from `u32` to `usize`, but
| if you declare up front "this code assumes at least a
| 32-bit target", then you gain the ability to do
| infallible conversions from u32 to usize. And if you
| declare up front "this code assumes at least a 64-bit
| target", you gain the ability to do infallible
| conversions from u64 to usize.
|
| Also, you can already convert u8 and u16 to usize
| infallibly, today. Only u32 and u64 currently require
| fallible conversions via try_from.
| prirun wrote:
| Nim had this same problem a couple of years ago, and it's very
| annoying. I just checked some stdlib functions and it's still
| there:
|
| https://github.com/nim-lang/Nim/blob/devel/lib/std/varints.n...
|
| For example, in readVu64: 18: proc readVu64*(z:
| openArray[byte]; pResult: var uint64): int = 24: pResult
| = (uint64 z[0] - 241) * 256 + z[1].uint64 + 240 28:
| pResult = 2288u64 + 256u64*z[1].uint64 + z[2].uint64 31:
| pResult = (z[1].uint64 shl 16u64) + (z[2].uint64 shl 8u64) +
| z[3].uint64 33: let x = (z[1].uint64 shl 24) +
| (z[2].uint64 shl 16) + (z[3].uint64 shl 8) + z[4].uint64
|
| I get that there has to be some indication to the compiler to
| distinguish between doing 8-bit unsigned math vs 64-bit
| unsigned math to evaluate these expressions. But having such a
| mish-mash in a stdlib function makes new users think, "There
| must have been a good reason to do it like this since it's in
| stdlib, but dang if I know what it is!"
|
| I ran into this myself a couple of years ago when looking at
| Nim and agree it is annoying, verbose, and ugly to be forced to
| specify types for obvious integer things.
| SemanticStrengh wrote:
| This is a big reason on why I loosed interest on rust
| __s wrote:
| It's more complaining that Rust is verbose when one wants to
| store small integers in structs but have intermediate
| computation use machine width integers
|
| Implicit widening doesn't tend to cause bugs. I don't think
| it'd be so bad if vec[u8] implicitly worked without needing
| vec[u8 as usize]
| estebank wrote:
| FWIW, you can have a newtype wrapper around Vec that provides
| indexing access with any type you want. If you are porting or
| writing code that uses i32 for indices for some reason, then
| you use the wrapper instead of Vec with no runtime cost and
| no inline type conversions.
|
| This hasn't been solved at the language level yet because if
| we `impl Idx for u8` then inference in `vec[42]` breaks.
| Making inference support that in a principled manner will
| require a lot of work, and doing it in a hacky way might lock
| us into a suboptimal place that we haven't fully explored
| yet. I want to see _some_ solution here at some point.
| [deleted]
| SemanticStrengh wrote:
| I hope shopify will keep investing in the disruptive truffleruby
___________________________________________________________________
(page generated 2022-05-11 23:00 UTC)