[HN Gopher] A small Rust binary indeed (2022)
___________________________________________________________________
A small Rust binary indeed (2022)
Author : estebank
Score : 68 points
Date : 2024-02-18 17:32 UTC (5 hours ago)
(HTM) web link (darkcoding.net)
(TXT) w3m dump (darkcoding.net)
| blovescoffee wrote:
| Halfway through the article and we have
|
|     unsafe {
|         asm!(
|             "mov edi, 42",
|             "mov eax, 60",
|             "syscall",
|             options(nostack, noreturn)
|         )
|         // nostack prevents `asm!` from push/pop rax
|         // noreturn prevents it putting a 'ret' at the end
|         //  but it does put a ud2 (undefined instruction) instead
|     }
|
| and
|
| > We will need to tell the C compiler that we're providing our
| own entry point, telling it not to include its own start files.
|
| So it's a Rust program but it's just calling inline assembly and
| using a C compiler?
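|
| A minimal sketch of the same raw-syscall idea that can be tried
| from an ordinary program, without the `no_std` and custom entry
| point setup. It is not from the article. It assumes x86-64
| Linux, where syscall number 39 is `getpid`; the helper name
| `raw_getpid` is made up for illustration, and on any other
| target it simply falls back to the standard library.

```rust
/// Hypothetical helper: ask the kernel for our PID with a raw `syscall`.
#[cfg(all(target_arch = "x86_64", target_os = "linux"))]
fn raw_getpid() -> u32 {
    use std::arch::asm;

    let ret: u64;
    // SAFETY: getpid takes no arguments and does not touch user memory.
    unsafe {
        asm!(
            "syscall",
            // rax carries the syscall number in and the result out.
            inlateout("rax") 39_u64 => ret,
            // The `syscall` instruction itself clobbers rcx and r11.
            out("rcx") _,
            out("r11") _,
            options(nostack)
        );
    }
    ret as u32
}

/// Fallback for other targets (assumption: the std value is good enough).
#[cfg(not(all(target_arch = "x86_64", target_os = "linux")))]
fn raw_getpid() -> u32 {
    std::process::id()
}

fn main() {
    println!("raw: {}, std: {}", raw_getpid(), std::process::id());
}
```

| Unlike the exit snippet above, this one returns, so there is no
| `noreturn` option and the result can be compared against
| `std::process::id()`.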
| remexre wrote:
| Rust uses the C compiler as a linker, because this is often the
| only way to ensure all the libraries needed by the system
| toolchain are included. (Compare to the CCLD variable in
| autotools -- it refers to the command to use the C compiler as
| a linker, and exists for this very reason.)
|
| This isn't only libc -- it also includes libgcc (or
| compiler-rt, depending on your system toolchain), which,
| despite the name, may still be called "behind your back" by the
| LLVM toolchain.
|
| > So it's a Rust program but it's just calling inline assembly
| and using a C compiler?
|
| Yeah, I think this article is more in the tradition of [0] (but
| trying hard not to drop rustc) than being completely practical
| advice on making the binary you ship to users smaller.
|
| [0]:
| http://www.muppetlabs.com/~breadbox/software/tiny/teensy.htm...
| andrewaylett wrote:
| Well, not quite --
| https://github.com/grahamking/demeter-deploy/blob/master/see...
| is the Rust version of a program that used to be written
| entirely in assembly, and it seems that it ends up being the
| same size. There are a few bits of asm in amongst the Rust, but
| it's still _definitely_ a Rust program.
|
| 800 lines of ASM file reduced to 600 lines of Rust, including
| comments and constants in both cases. He might be pushing the
| limits, and everything's unsafe Rust, but unsafe Rust is
| still safer than raw assembly.
| vacuity wrote:
| > unsafe Rust is still safer than raw assembly
|
| I don't think I would go that far. Assembly doesn't have
| undefined behavior, and especially not with the strict
| constraints around references as in Rust. The safe/unsafe
| dichotomy in Rust is better than only using C or C++ when
| there are concise, robust encapsulations around broken
| invariants.
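|
| A minimal sketch of what such an encapsulation can look like,
| assuming a made-up helper `first_and_last`: the `unsafe` block
| relies on an invariant (the slice is non-empty) that the safe
| wrapper checks once, so callers cannot violate it.

```rust
/// Hypothetical helper: return the first and last byte of a slice.
fn first_and_last(data: &[u8]) -> Option<(u8, u8)> {
    if data.is_empty() {
        // The invariant is enforced here, in exactly one place.
        return None;
    }
    // SAFETY: the slice is non-empty, so indices 0 and len - 1 are in bounds.
    unsafe {
        Some((
            *data.get_unchecked(0),
            *data.get_unchecked(data.len() - 1),
        ))
    }
}

fn main() {
    let bytes = [1u8, 2, 3, 4];
    println!("{:?}", first_and_last(&bytes)); // prints Some((1, 4))
    println!("{:?}", first_and_last(&[]));    // prints None
}
```

| Skipping the emptiness check would make the unchecked reads
| undefined behavior, which is the kind of invariant the comment
| is talking about.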
| lmm wrote:
| > Assembly doesn't have undefined behavior
|
| Certainly some assembly languages do.
| vacuity wrote:
| Which ones? I assume at least the 1:1 machine code kind
| doesn't, and you mean something more like bytecode, but
| it'd be interesting if I'm wrong on that count.
| vardump wrote:
| > I don't think I would go that far. Assembly doesn't
| have undefined behavior
|
| As someone who has written a fair amount of assembler
| over the years... Yes, it doesn't have undefined
| behavior, but it also lacks practically all guard rails
| and safeties.
|
| The smallest error and you might do things like
| completely messing up your call stack - just need to
| forget one "POP" or mess up with stack pointer
| adjustment. Or for example a computed jump in the middle
| of an instruction.
|
| You can create bugs that can be almost impossible to
| figure out from a crash dump that even something as low
| level as C will effectively protect you from doing.
| vacuity wrote:
| I wonder if those issues can't be somewhat mitigated with
| a linter or interactive emulator. In any case, I think
| assembly is more uniformly difficult (and not portable!),
| while unsafe Rust generally feels less painful but you
| might have no idea which invariants you need to enforce
| unless you're very knowledgeable. Definitely don't write
| a whole application in either!
| estebank wrote:
| Note that that step sheds libc entirely (so the binary needs to
| provide the minimal things that libc does for your platform,
| namely that assembly you mention, and you'd have to do the same
| for a C binary that did that) and gets rid of 3kb (16kb ->
| 13kb), but changing the linker flags to avoid page-aligning the
| binary brings it down to _400 bytes_. I would have loved if the
| author had tried that on the libc version too, just for
| comparison's sake.
|
| In a lot of conversations around Rust binary sizes some people
| extrapolate from the "Hello, World!" size difference as if the
| additional cost on top of a bare C binary was linear, when in
| reality it is (approximately) a constant cost. That is on top
| of completely disregarding that the "bloat" _is_ doing
| something (panic machinery, string formatting, DWARF symbol
| storage, DWARF symbol parsing, etc.).
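|
| To make the constant-versus-linear distinction concrete, here
| is a small arithmetic sketch. All sizes are made-up round
| numbers in kB, not measurements, and both function names are
| hypothetical.

```rust
/// Hypothetical model: every Rust binary pays the same fixed overhead.
fn constant_model(c_size_kb: u64, overhead_kb: u64) -> u64 {
    c_size_kb + overhead_kb
}

/// Hypothetical model: the overhead scales with the size of the program.
fn linear_model(c_size_kb: u64, factor: u64) -> u64 {
    c_size_kb * factor
}

fn main() {
    // Made-up "Hello, World!" sizes: C = 20 kB, Rust = 300 kB.
    let (c_hello, rust_hello) = (20, 300);
    let overhead = rust_hello - c_hello; // 280 kB if the cost is constant
    let factor = rust_hello / c_hello;   // 15x if the cost is linear

    // Extrapolate both models to a made-up 2000 kB C program.
    let c_big = 2_000;
    println!("constant: {} kB", constant_model(c_big, overhead)); // 2280 kB
    println!("linear:   {} kB", linear_model(c_big, factor));     // 30000 kB
}
```

| Both models agree on the toy program by construction; they only
| diverge once the program grows, which is why extrapolating from
| "Hello, World!" alone says little.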
| tremon wrote:
| It's definitely not a constant cost, presumably due to the
| link-time optimization that rustc does. I've had binaries go
| from 800kB to 6MB simply by switching from getopts to the
| clap crate, for example.
| pornel wrote:
| Binary using clap with all the bells and whistles, even
| without LTO, is 900KB _after strip_.
|
| The standard library has 4MB of debug info baked in, which
| due to its special integration with Cargo is always added,
| even when you explicitly configure `debug=false`. This is
| what usually surprises people and makes Rust executables
| seem huge.
| LtWorf wrote:
| So it doesn't strip unneeded stuff?
| cryo wrote:
| Interesting, would be cool to see that applied to a real world
| rust program.
|
| Today I got rid of libc on the Windows version of a commandline
| tool to flash firmware via USB, which freed 7 kB of the .exe
| size.
|
| The original version was done in C++ plus Qt and was ca. 3.5 MB
| (.exe and dependencies).
|
| The optimized C version is 14 kB compressed with upx.
|
| FYI Code: https://github.com/dresden-elektronik/gcfflasher
| Klasiaster wrote:
| One can also create small binaries with
| https://github.com/sunfishcode/origin (e.g.,
| https://github.com/sunfishcode/origin/blob/main/example-crat...
| is in that ~400 bytes range) and select features as wanted
| without having to reimplement everything. Also see
| https://github.com/sunfishcode/origin-studio and
| https://github.com/sunfishcode/mustang - and of course
| https://github.com/sunfishcode/eyra
| r0rshrk wrote:
| So, the way to make your Rust binary small is to make error
| handling more difficult, or to rewrite it in assembly?
| abathologist wrote:
| I am not much concerned with hyper optimizations, but I was
| curious how OCaml would fare with the initial, simple steps,
| before things get crazy. But I opted for a more complex program:
|
|     (* t.ml *)
|     let () = print_endline "Hello, World!"
|
| Then just doing a standard compilation and a strip:
|     $ ocamlopt -o t t.ml && ls -l -h t | cut -d " " -f5
|     1.5M
|     $ strip t && ls -l -h t | cut -d " " -f5
|     356K
|     $ ./t
|     Hello, World!
|
| I may be overlooking something, and would be interested to learn
| what if so, but I was surprised we got a result smaller than the
| rust binary in the first instance.
| estebank wrote:
| It is interesting that before stripping the size of the Rust
| version is bigger, but after only stripping the size of the
| OCaml version is bigger. It'd be nice to try and see what the
| "extra" info that Rust ships by default is.
| abathologist wrote:
| Yeah, I thought the same!
___________________________________________________________________
(page generated 2024-02-18 23:01 UTC)