[HN Gopher] Porting Rust's Std to Rustix
___________________________________________________________________
Porting Rust's Std to Rustix
Author : jpgvm
Score : 204 points
Date : 2022-01-04 10:36 UTC (12 hours ago)
(HTM) web link (blog.sunfishcode.online)
(TXT) w3m dump (blog.sunfishcode.online)
| codeflo wrote:
| As a believer in removing unnecessary layers of abstraction, this
| looks interesting. Is there a technical reason Rustix doesn't
| support Windows, or is that just not a priority at the moment?
|
| (For a bit of background info: It might seem counterintuitive,
| but on Windows, a "native" application "should" actually call the
| programming language agnostic Kernel32.dll functions directly. At
| least that's the documented, stable way of doing things. Instead,
| Rust's std currently goes through libc, which is also fine, but
| on Windows is an abstraction built on top of the Windows APIs,
| and is historically a lot less stable. This can cause code bloat
| and distribution hassles.
|
| This is a bit different from Linux, where the abstractions are
| layered the other way round: the OS (POSIX) spec mostly assumes
| an existing libc. As a result, on Linux, going libc-less is a bit
| harder in my limited experience -- though certainly possible if
| you know what you're doing.)
| sanxiyn wrote:
| > Instead, Rust's std currently goes through libc
|
| Are you sure about this? Rust's std's File::open calls
| CreateFile on Windows, not libc open. Isn't CreateFile the
| correct Kernel32.dll API to call?
|
| https://github.com/rust-lang/rust/blob/master/library/std/sr...
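The per-platform split described above is visible from user code:
the same std File wraps a kernel HANDLE on Windows and a file
descriptor on Unix-like systems. A minimal sketch using only std;
the helper name os_handle_kind is made up for this illustration and
is not part of std.

```rust
use std::fs::File;

/// Hypothetical helper: report which kind of OS resource backs a `File`.
/// On Unix-like targets std hands out a plain integer file descriptor.
#[cfg(unix)]
pub fn os_handle_kind(file: &File) -> (&'static str, i64) {
    use std::os::unix::io::AsRawFd;
    ("fd", file.as_raw_fd() as i64)
}

/// On Windows the same `File` wraps a kernel HANDLE (a pointer-sized value).
#[cfg(windows)]
pub fn os_handle_kind(file: &File) -> (&'static str, i64) {
    use std::os::windows::io::AsRawHandle;
    ("HANDLE", file.as_raw_handle() as usize as i64)
}
```

Calling os_handle_kind(&file) on a freshly opened file returns "fd"
plus a small integer on Linux or macOS, and "HANDLE" plus an opaque
value on Windows.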
| codeflo wrote:
| It is. I haven't checked the complete source code. What I
| know for sure is that a program using Rust's std still
| requires the C runtime to link and run, and from that
| perspective, it doesn't really matter if some or even many
| parts of the std don't actually use it.
| sanxiyn wrote:
| I see. I wonder where Rust's std is using libc on Windows.
| I know for a fact the filesystem portion of std doesn't call
| libc at all on Windows.
| ChrisSD wrote:
| Rust needs the C runtime for startup/shutdown code (and
| vcruntime for panics). It also needs basic memory
| functions such as memcpy and stack probes. There is no
| way round this except rewriting those in Rust.
| sanxiyn wrote:
| Rust already ships with its own memcpy, so it should be
| possible:
| https://github.com/rust-lang/compiler-builtins/blob/master/s...
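For readers wondering what "its own memcpy" can look like, here is a
deliberately naive sketch. It is not the compiler-builtins code
(which is presumably more optimized); the name memcpy_sketch is
invented so that it does not clash with the real symbol. A
freestanding build would also have to keep the optimizer from turning
such a loop back into a call to memcpy.

```rust
/// Byte-at-a-time copy with the same contract as C's `memcpy`.
///
/// # Safety
/// `src` must be valid for reads of `n` bytes, `dest` must be valid for
/// writes of `n` bytes, and the two regions must not overlap.
pub unsafe fn memcpy_sketch(dest: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    let mut i = 0;
    while i < n {
        // SAFETY: `i < n`, and the caller guarantees both regions span `n` bytes.
        unsafe {
            *dest.add(i) = *src.add(i);
        }
        i += 1;
    }
    // Like C's memcpy, hand back the destination pointer.
    dest
}
```

Copying a five-byte array into a zeroed buffer of the same length
with this function leaves the two buffers equal.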
| ectopod wrote:
| The startup/shutdown code is for the C runtime. If you
| avoid the C runtime you don't need it.
| ChrisSD wrote:
| At a minimum Rust would have to run the C initializers as
| these are used even in pure Rust code. It might also make
| use of the security cookie but I'm uncertain about that.
| Also the C runtime is needed by the SEH handling code (in
| vcruntime) so Rust would have to replace that before it
| can replace the C startup/shutdown.
|
| To be clear, this is feasible but it'll require someone
| knowledgeable to put in the work of rewriting this in
| Rust.
| steveklabnik wrote:
| https://crates.io/crates/r0 exists (not everything you're
| talking about but like, there's some of this stuff
| around, in some contexts. We'll see if the whole pile of
| stuff needed ever gets ported or not, I'm guessing yes
| but on a long timeframe.)
| samhw wrote:
| Is there a reason "native" and "should" are in scare quotes? I
| think "should" is pretty well defined - after all, we've got an
| RFC for it!
|
| Edit: I'm not altogether sure why this is being downvoted, but
| in case it wasn't clear: this is a sincere question and not
| some kind of pedantic rhetorical question. (I don't really mind
| whether people upvote or downvote the comment, except to the
| extent that it's a signal that I perhaps wasn't clear in what I
| was asking.)
| codeflo wrote:
| I didn't downvote, in fact, I'm a bit amused because the
| reason I used quotes is exactly because I'm not using any
| official definition! There's no consensus definition of
| "native", and I'm using "should" purely in the colloquial
| sense of "something I'd like to see".
| CodesInChaos wrote:
| There isn't anything objectively wrong with using libc as an
| abstraction on Windows, especially since Win10 ships the
| "Universal C Runtime" out of the box (earlier you had to
| deploy a separate crt for each Visual Studio version).
|
| Personally I'm also in favour of cutting out libc and
| invoking the windows API directly. But others might argue
| that using it as a shared abstraction that's very similar on
| Windows, Linux, OSX, BSD, is worth it.
|
| And "native" is ill defined. It's just one more abstraction
| layer in the libc -> Win32-API -> NT-API -> Kernel chain.
| Libc isn't comparable to a java runtime or electron, which is
| what people usually mean when they talk about applications
| not being "native".
| pjmlp wrote:
| Kind of; it is still a language runtime, especially on
| non-UNIX OSes.
|
| The recent thread about cutting down the startup of C
| applications on Linux shows that it isn't a zero-cost
| library.
| ylyn wrote:
| > This is a bit different from Linux, where the abstractions
| are layered the other way round: the OS (POSIX) spec mostly
| assumes an existing libc.
|
| FWIW the POSIX spec assumes libc, but on Linux specifically the
| syscall interface is stable.
| sanxiyn wrote:
| Note that the macOS syscall interface is explicitly not stable,
| so Rust needs the POSIX/libc path anyway.
| masklinn wrote:
| Also on Solaris/Illumos/SmartOS/..., and on OpenBSD (where
| it's more or less verboten entirely since system-call-
| origin verification).
| codeflo wrote:
| Thanks, that's a great correction/clarification.
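Because the Linux syscall numbers and register conventions are
stable, a program can enter the kernel without any libc. A minimal
sketch, assuming x86_64 Linux (syscall number 39 is getpid there);
on any other target the function simply falls back to std. The name
raw_getpid is made up for this example and is not rustix API.

```rust
/// Ask the kernel for our PID directly, bypassing libc.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
pub fn raw_getpid() -> u32 {
    use std::arch::asm;
    let ret: isize;
    // SAFETY: getpid takes no arguments, touches no user memory and cannot fail.
    unsafe {
        asm!(
            "syscall",
            inlateout("rax") 39isize => ret, // 39 = SYS_getpid on x86_64 Linux
            lateout("rcx") _,                // rcx and r11 are clobbered by `syscall`
            lateout("r11") _,
            options(nostack),
        );
    }
    ret as u32
}

/// Fallback for every other target: let std (and therefore libc) do it.
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
pub fn raw_getpid() -> u32 {
    std::process::id()
}
```

The result should match std::process::id(). The same trick is not
an option on macOS or OpenBSD, for the reasons given in the comments
above.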
| samhw wrote:
| This looks fantastic. In fact, in the best possible way, it looks
| _boring_. I'm amazed that this wasn't always how it was
| implemented. I hope this can be upstreamed into std.
| sanxiyn wrote:
| It was mostly to save time. You need the libc path to support
| macOS anyway, and by using libc you can share most code between
| macOS and Linux (and the BSDs). Once that was done, I think a
| Linux-only syscall path was not justified at the time.
|
| Now I think Rust has enough resources to try this.
| samhw wrote:
| Interesting, thanks for the detail! That makes perfect sense
| as a rationale. I'm glad the Rust team now has enough time to
| refine these things - thanks for all your work :)
| algesten wrote:
| If this was merged into Rust, I assume it means the use of
| `unsafe` in std would be "pushed down" one level into Rustix
| instead.
|
| Wonder if that means it would be easier to verify the correctness
| of `unsafe` use inside Rust itself?
| ansible wrote:
| The goal is to get this merged _into_ Rust's std library;
| they wouldn't write std on top of Rustix.
|
| > _Wonder if that means it would be easier to verify the
| correctness of `unsafe` use inside Rust itself?_
|
| If you mean usage of unsafe inside std (which the Rust compiler
| does depend upon), that is one of the explicit goals of Rustix.
| The usage of unsafe will be much more narrow overall, mostly
| surrounding the system calls themselves.
| Arnavion wrote:
| The port of libstd linked in the article still uses rustix as
| a separate crate.
|
| In any case, I hope it remains separate. Third-party no_std
| code would benefit from it.
| ansible wrote:
| I am under the impression that no-std code is primarily the
| concern of embedded systems that might not have much of an
| OS [1] at all, never mind a suite of system calls to Linux.
|
| [1] Like a typical RTOS, that may provide some basic
| communication primitives, thread creation, and a hardware
| abstraction layer (HAL).
| roblabla wrote:
| That is not necessarily the case. Embedded is a big user,
| but there are other reasons to use no_std, such as
| explicitly wanting to avoid heap allocations, or wanting
| greater control over the native APIs being called.
| algesten wrote:
| Yeah, that's what I mean. Sounds good!
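To make "unsafe narrowed to the system call itself" concrete, here is
a sketch of a safe write wrapper. It assumes a Unix-like target, and
only the x86_64 Linux branch issues the syscall directly (number 1 is
write there). raw_write and write_impl are hypothetical names, not
rustix's actual API. The single unsafe block is the syscall
instruction; the borrowed descriptor and the slice carry the
invariants it relies on.

```rust
use std::io;
use std::os::fd::{AsFd, BorrowedFd};

/// Safe entry point: anything that can lend out a file descriptor may be
/// written to. Callers never touch raw integers or raw pointers.
pub fn raw_write<F: AsFd>(fd: F, buf: &[u8]) -> io::Result<usize> {
    write_impl(fd.as_fd(), buf)
}

#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn write_impl(fd: BorrowedFd<'_>, buf: &[u8]) -> io::Result<usize> {
    use std::arch::asm;
    use std::os::fd::AsRawFd;
    let ret: isize;
    // SAFETY: `fd` stays open for the duration of the borrow, and `buf` is
    // valid for reads of `buf.len()` bytes; the kernel only reads from it.
    unsafe {
        asm!(
            "syscall",
            inlateout("rax") 1isize => ret, // 1 = SYS_write on x86_64 Linux
            in("rdi") fd.as_raw_fd() as isize,
            in("rsi") buf.as_ptr(),
            in("rdx") buf.len(),
            lateout("rcx") _,
            lateout("r11") _,
            options(nostack),
        );
    }
    if ret < 0 {
        // The raw syscall reports failure as a negated errno value.
        Err(io::Error::from_raw_os_error(-ret as i32))
    } else {
        Ok(ret as usize)
    }
}

/// Fallback for other Unix-like targets: go through std (and libc).
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
fn write_impl(fd: BorrowedFd<'_>, buf: &[u8]) -> io::Result<usize> {
    use std::io::Write;
    let mut file = std::fs::File::from(fd.try_clone_to_owned()?);
    file.write(buf)
}
```

Usage is raw_write(&file, b"hello") for any open File. Like
write(2) itself, the sketch may write fewer bytes than requested and
does not retry on EINTR.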
| Icathian wrote:
| This looks like such a fascinating project. I would love to
| contribute to this sort of low-level work, but frankly I don't
| know enough yet. I'm going to have to dig into that repo a bunch.
|
| Anyone have any personal favorite books or articles that would be
| good for getting up to speed on this kind of thing?
| antonok wrote:
| I recommend checking out this issue [1] if you want to get your
| feet wet. Attempting to build a new Rust program and
| documenting what prevented it from working is a great way to
| understand how everything is implemented. Pretty satisfying if
| you can get a program working, too.
|
| [1] https://github.com/sunfishcode/mustang/issues/22
| Icathian wrote:
| This is an excellent place to start. Thank you!
| sanxiyn wrote:
| For this project, you'd want The Linux Programming Interface.
| Information here: https://man7.org/tlpi/
| Icathian wrote:
| I have a copy I grabbed for an OS class. I should take that
| back out. Much appreciated!
| styfle wrote:
| Does this mean we could have a single binary for x86_64 linux
| instead of two (glibc and musl)?
|
| x86_64-unknown-linux-gnu
|
| x86_64-unknown-linux-musl
| CodesInChaos wrote:
| I expect 4 options:
|
| * glibc linked dynamically
|
| * musl linked dynamically
|
| * musl linked statically
|
| * direct syscalls from rust
|
| and possibly additional options choosing if math functions,
| memcpy, etc. should use an implementation shipped with rust or
| one shipped with libc.
|
| The big question is if we'll get to a point where almost all
| Rust applications don't use a libc at all.
| masklinn wrote:
| > * musl linked dynamically
|
| > * musl linked statically
|
| This is handled through the orthogonal "crt-static" target-
| feature.
|
| IIRC each supported CRT has a preference, but for some that
| can be overridden (musl would be one of those, possibly one
| of few of those since statically linking glibc is generally
| recommended against, and so's pretty much every non-linux
| libc, when that's an option at all).
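The two axes mentioned here (which libc, and static versus dynamic
CRT) can both be inspected at compile time. A small sketch; the
function names are invented for illustration. The crt-static feature
is toggled at build time, for example with
RUSTFLAGS="-C target-feature=+crt-static".

```rust
/// Which C library environment this binary was compiled for.
pub fn libc_flavor() -> &'static str {
    if cfg!(target_env = "musl") {
        "musl"
    } else if cfg!(target_env = "gnu") {
        "gnu"
    } else {
        "other"
    }
}

/// Whether the C runtime is linked statically into this binary.
pub fn crt_linkage() -> &'static str {
    // `crt-static` is exposed to code as a target feature.
    if cfg!(target_feature = "crt-static") {
        "static"
    } else {
        "dynamic"
    }
}
```

Printing both values from a build for x86_64-unknown-linux-musl and
from one for x86_64-unknown-linux-gnu is a quick way to check what a
given toolchain defaults to.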
| sanxiyn wrote:
| No, it means there will be three binaries instead of two.
|
| Note that statically linked musl binaries run fine on glibc
| Linux, so you can just distribute musl binaries. That works
| right now. In the future, you will just distribute Rustix
| binaries, which will hopefully be smaller than musl binaries
| (because they don't need to support the C API).
| heythere22 wrote:
| Just a quick reminder, using syscalls without libc is something
| that will cause trouble with OpenBSD. From [1]: "The eventual
| goal would be to disallow system calls from anywhere but the
| region mapped for libc"
|
| [1] https://lwn.net/Articles/806776/
| colonwqbang wrote:
| I think Linux is about the only system where this makes
| sense. I don't know of another system where the syscall ABI is
| rigidly defined and stable like it is on Linux. On most other
| systems the syscall ABI tends to be "call this C function in
| that shared library".
|
| However, most computers in the world run Linux nowadays.
| ink_13 wrote:
| However good you think Linux is on this file, the BSDs,
| particularly OpenBSD, are better.
| FpUser wrote:
| There is a saying: the best camera is the one you have with
| you when needed.
|
| For programming - whatever you use and are comfortable with
| is the best. One does not need to be concerned with what others
| like / use as long as it does not impede one's own work.
| WJW wrote:
| That is irrelevant in the context of the `linux-raw`
| backend of this project, which does not target the BSDs.
| staticassertion wrote:
| > This project promotes several other goals as well, such as
| promoting I/O safety concepts and APIs, helping test some of the
| infrastructure used by cap-std, and helping set the stage for
| future projects related to sandboxing, WASI, nameless, and other
| areas.
|
| I'd be interested in hearing more about this. I have a lot of
| thoughts about sandboxing and Rust, including both build and
| runtime sandboxing. Would be cool to understand if others are
| working on this and chat about it. I'm familiar with cap-std, but
| curious about any other initiatives or where the discussions are
| happening.
| asplake wrote:
| Possibly dumb q, but potentially useful to other languages?
| kibwen wrote:
| It looks like the goal of this project is to allow Rust code to
| avoid having to use FFI to call C code. In order to be useful
| to non-Rust languages, this project would either have to expose
| a C-compatible interface for FFI (which somewhat defeats the
| point of trying to avoid C), or else the other language would
| need to natively support Rust FFI (which would be a large
| amount of ongoing labor, since Rust doesn't have a stable ABI).
| clhodapp wrote:
| I believe that it _would_ open the door to writing a Linux-only
| libc in Rust, which could be cool
| nicoburns wrote:
| There is in fact already a library that does that
| https://github.com/redox-os/relibc
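As a rough idea of what "a libc in Rust" involves at the function
level, here is a C-ABI strlen written in Rust. This is only a sketch
and is not taken from relibc; the name strlen_sketch is invented.
Exporting it under the real C symbol name would additionally need a
no_mangle attribute.

```rust
use std::ffi::c_char;

/// Length of a NUL-terminated C string, callable through the C ABI.
///
/// # Safety
/// `s` must point to a valid, NUL-terminated string.
pub unsafe extern "C" fn strlen_sketch(s: *const c_char) -> usize {
    let mut len = 0;
    // SAFETY: the caller guarantees a terminating NUL, so every byte read
    // up to and including it is in bounds.
    while unsafe { *s.add(len) } != 0 {
        len += 1;
    }
    len
}
```

From Rust it can be exercised with a CString, for example
unsafe { strlen_sketch(c_string.as_ptr()) }.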
___________________________________________________________________
(page generated 2022-01-04 23:01 UTC)