[HN Gopher] Improving Rust compile times to enable adoption of m...
___________________________________________________________________
Improving Rust compile times to enable adoption of memory safety
Author : todsacerdoti
Score : 211 points
Date : 2023-02-03 09:22 UTC (13 hours ago)
(HTM) web link (www.memorysafety.org)
(TXT) w3m dump (www.memorysafety.org)
| fidgewidge wrote:
| I wonder about the framing of the title here. Rust is great but
| realistically a lot of software with memory safety bugs doesn't
| need to be written in C in the first place.
|
| For example Java has a perfectly serviceable TLS stack written
| entirely in a memory safe language. Although you could try to
| make OpenSSL memory safe by rewriting it in Rust - which
| realistically means yet another fork not many people use - you
| could _also_ do the same thing by implementing the OpenSSL API on
| top of JSSE and Bouncy Castle. The GraalVM native image project
| allows you to export Java symbols as C APIs and to compile
| libraries to standalone native code, so this is technically
| feasible now.
|
| There are also other approaches. GraalVM can also run many
| C/C++ programs in a way that makes them _automatically_ memory
| safe, by JIT compiling LLVM bitcode and replacing allocation
| /free calls with garbage collected allocations. Pointer
| dereferences are also replaced with safe member accesses. It
| works as long as the C is fairly strictly C compliant and doesn't
| rely on undefined behavior. This functionality is unfortunately
| an enterprise feature but the core LLVM execution engine is open
| source, so if you're at the level of major upgrades to Rust you
| could also reimplement the memory safety aspect on top of the
| open source code. You can then compile the result down to a
| shared native library that doesn't rely on any external JVM.
|
| Don't get me wrong, I'm not saying don't improve Rust compile
| times. Faster Rust compiles would be great. I'm just pointing out
| that, well, it's not the only memory safe language in the world,
| and actually using a GC isn't a major problem these days for many
| real world tasks that are still done with C.
| duped wrote:
| That's not feasible for the millions of devices that don't have
| the resources for deploying GraalVM or GraalVM compiled native
| images.
|
| The other thing to consider is that in many applications,
| nearly every single bit of i/o will flow through a buffer and
| cryptographic function to encrypt/decrypt/validate it. This is
| the place where squeezing out every ounce of performance is
| critical. A JIT + GC might cost a lot more money than memory
| safety bugs + AOT optimized compilation.
| fidgewidge wrote:
| Native images are AOT optimized and use way less RAM than a
| normal Java app on HotSpot does. And you can get competitive
| performance from them with PGO.
|
| Using a GC doesn't mean never reusing buffers, and Java has
| had intrinsics for hardware-accelerated cryptography for a
| long time. There's no reason performance has to be worse,
| especially if you're willing to fund full time research
| projects to optimize it.
|
| The belief that performance is more important than everything
| else is exactly how we ended up with pervasive memory safety
| vulns to begin with. Rust doesn't make it free, as you pay in
| developer hours.
| duped wrote:
| How big are GraalVM native images?
| titzer wrote:
| There is Go, too.
| nindalf wrote:
| > you could try to make OpenSSL memory safe by rewriting it in
| Rust
|
| Or just write a better crypto stack without the many legacy
| constraints holding OpenSSL back. Rustls
| (https://github.com/rustls/rustls) does that. It has also been
| audited and found to be excellent - report
| (https://github.com/rustls/rustls/blob/main/audit/TLS-01-repo...).
|
| You're suggesting writing this stack in a GC language. That's
| possible, except most people looking for an OpenSSL solution
| probably won't be willing to take the hit of slower run time
| perf and possible GC pauses (even if these might be small in
| practice). Also, these are hypothetical for now. Rustls exists
| today.
| fidgewidge wrote:
| OpenSSL was just an example. You could also use XML parsing
| or many other tasks.
|
| The point is that the code already exists - it's not hypothetical
| - and has for a long time. It is far easier to write
| bindings from an existing C API to a managed implementation
| than write, audit and maintain a whole new stack from
| scratch. There are also many other cases where apps could
| feasibly be replaced with code written in managed languages
| and then invoked from C or C++.
|
| Anything written in C/C++ can certainly tolerate pauses when
| calling into third party libraries, because malloc/free can
| pause for long periods, libraries are allowed to do IO
| without even documenting that fact, etc.
|
| I think it's fair to be concerned that rewrite-it-in-rust is
| becoming a myopic obsession for security people. That's one
| way to improve memory safety but by no means the only one.
| There are so many cases where you don't need to do that and
| you'll get results faster by not doing so, but it's not being
| considered for handwavy reasons.
| josephg wrote:
| I think the thing you're missing is that opensource people
| love rewriting libraries in their favourite languages.
| Especially something well defined, like tls or an xml
| parser. Rustls is a great example. You won't stop people
| making things like this. Nor should you - they're doing it
| for fun!
|
| It's much more fun to rewrite something in a new language
| than maintain bindings to some external language. You could
| wrap a Java library with a rust crate, but it would depend
| on Java and rust both being installed and sane on every
| operating system. Maintaining something like that would be
| painful. Users would constantly run into problems with Java
| not being installed correctly on macos, or an old version
| of Java on Debian breaking your crate in weird ways. It's
| much more pleasant to just have a rust crate that runs
| everywhere rust runs, where all of the dependencies are
| installed with cargo.
| nindalf wrote:
| > It is far easier to write bindings from an existing C API
| to a managed implementation than write, audit and maintain
| a whole new stack from scratch.
|
| I'd agree, if rustls wasn't already written, audited and
| maintained. And there are other examples as well. The
| internationalisation libraries Icu4c and Icu4j exist, but
| the multi-language, cross-platform library Icu4x is written
| in Rust. Read the announcement post on the Unicode blog
| (http://blog.unicode.org/2022/09/announcing-icu4x-10.html?m=1)
| - security is only one of the reasons
| they chose to write it in Rust. Binary size, memory usage,
| high performance. Also compiles to wasm.
|
| Your comment implies that people rewrite in Rust for
| security alone. But there are so many other benefits to
| doing so.
| e12e wrote:
| > people looking for an OpenSSL solution probably won't be
| willing to take the hit of slower run time perf and possible
| GC pauses
|
| Golang users would?
|
| That aside, excellent points about rustls and libssl legacy
| cruft.
| nindalf wrote:
| No, I'm imagining cross-language usage. Someone not using
| Go isn't going to use the crypto/tls package from the Go
| std lib regardless of its quality. The overhead and
| difficulty of calling into Go make this infeasible.
|
| To include a library written in another language as a
| shared lib, it needs to be C, C++ or Rust.
| burntsushi wrote:
| I originally posted this on reddit[1], but figured I'd share this
| here. I checked out ripgrep 0.8.0 and compiled it with both Rust
| 1.20 (from ~5.5 years ago) and Rust 1.67 (just released):
|       $ git clone https://github.com/BurntSushi/ripgrep
|       $ cd ripgrep
|       $ git checkout 0.8.0
|       $ time cargo +1.20.0 build --release
|       real 34.367  user 1:07.36  sys 1.568  maxmem 520 MB  faults 1575
|       $ time cargo +1.67.0 build --release
|       [... snip sooooo many warnings, lol ...]
|       real 7.761  user 1:32.29  sys 4.489  maxmem 609 MB  faults 7503
|
| As kryps pointed out on reddit, I believe at some point there was
| a change that improved compilation times by making more
| effective use of parallelism. So forcing the build to use a
| single thread produces more sobering results, but still a huge
| win:
|       $ time cargo +1.20.0 build -j1 --release
|       real 1:03.11  user 1:01.90  sys 1.156  maxmem 518 MB  faults 0
|       $ time cargo +1.67.0 build -j1 --release
|       real 46.112  user 44.259  sys 1.930  maxmem 344 MB  faults 0
|
| (My CPU is a i9-12900K.)
|
| These are from-scratch release builds, which probably matter less
| than incremental builds. But they still matter. This is just one
| barometer of many.
|
| [1]:
| https://old.reddit.com/r/rust/comments/10s5nkq/improving_rus...
| ilyagr wrote:
| Re parallelism: I have 12 cores, and cargo indeed effectively
| uses them all. As a result, the computer becomes extremely
| sluggish during a long compilation. Is there a way to tell Rust
| to only use 11 cores or, perhaps, nice its processes/threads to
| a lower priority on a few cores?
|
| I suppose it's not the worst problem to have. Makes me realize
| how spoiled I got after multiple-core computers became the
| norm.
| jrockway wrote:
| Are they real cores or hyperthreads/SMT? I've found that
| hyperthreading doesn't really live up to the hype; if
| interactive software gets scheduled on the same physical core
| as a busy hyperthread, latency suffers. Meanwhile, Linux
| seems to do pretty well these days handling interactive
| workloads while a 32 core compilation goes on in the
| background.
|
| SMT is a throughput thing, and I honestly turn it off on my
| workstation for that reason. It's great for cloud providers
| that want to charge you for a "vCPU" that can't use all of
| that core's features. Not amazing for a workstation where you
| want to chill out on YouTube while something CPU intensive
| happens in the background. (For a bazel C++ build, having SMT
| on, on a Threadripper 3970X, does increase performance by
| 15%. But at the cost of using ~100GB of RAM at peak! I have
| 128GB, so no big deal, but SMT can be pretty expensive. It's
| probably not worth it for most workloads. 32 cores builds my
| Go projects quickly enough, and if I have to build C++ code,
| well, I wait. ;))
| globalreset wrote:
|       #!/bin/sh
|       exec ionice -c 3 nice -n 20 "$@"
|
| Make it a shell script like `takeiteasy`, and run `takeiteasy
| cargo ...`
| kstrauser wrote:
| Partly because of being a Dudeist, and partly because it's
| just fun to say, I just borrowed this and called it "dude"
| on my system.
|       dude cargo ...
| has a nice flow to it.
| mbrubeck wrote:
| `cargo build -j11` will limit parallelism to eleven cores.
| Cargo and rustc use the Make jobserver protocol [0][1][2] to
| coordinate their use of threads and processes, even when
| multiple rustc processes are running (as long as they are
| part of the same `cargo` or `make` invocation):
|
| [0]: https://www.gnu.org/software/make/manual/html_node/Job-Slots...
|
| [1]: https://github.com/rust-lang/cargo/issues/1744
|
| [2]: https://github.com/rust-lang/rust/pull/42682
|
| `nice cargo build` will run _all_ threads at low priority,
| but this is generally a good idea if you want to prioritize
| interactive processes while running a build in the
| background.
| epage wrote:
| To add, in rust 1.63, cargo added support for negative
| numbers, so you can say `cargo build --jobs -2` to leave
| two cores available.
|
| See https://github.com/rust-lang/cargo/blob/master/CHANGELOG.md#...
| [deleted]
| Ygg2 wrote:
| As someone who uses Rust on various hobby projects, I never
| understood why people were complaining about compile times.
|
| Perhaps they were on old builds or some massive projects?
| burntsushi wrote:
| Wait, like, you don't _understand_, or you don't share their
| complaint? I don't really understand how you don't
| understand. If I make a change to ripgrep because I'm
| debugging its perf and need to therefore create a release
| build, it can take several seconds to rebuild. Compared to
| some other projects that probably sounds amazing, but it's
| still annoying enough to impact my flow state.
|
| ripgrep is probably on the smallish side. It's not hard to
| get a lot bigger than that and have those incremental times
| also get correspondingly bigger.
|
| And complaining about compile times doesn't mean compile
| times haven't improved.
| Ygg2 wrote:
| I do understand some factors, but I never noticed it being
| like super slow to build.
|
| My personal project takes seconds to compile. Fair enough,
| it's small, but even bigger projects like a game in Bevy
| don't take that long to compile: a minute or two tops, and
| about 30 seconds when incremental.
|
| People complained of 10x slower perf, i.e. essentially 15min
| build times.
|
| The fact that older versions might be slower to compile fills
| in another part of the puzzle.
|
| That, and the fact that I have a 24-hyperthread monster of a
| CPU.
| TinkersW wrote:
| 30 seconds isn't incremental, that is way too long.
|
| I work on a large'ish C++ project and incremental is
| generally 1-2 seconds.
|
| Incremental must work in release builds (someone else said
| it only works in debug for Rust), although it is fine to
| disable link-time optimizations, as those are obviously
| kinda slow.
| jackmott42 wrote:
| First, compile times can differ wildly based on the code in
| question. Big projects can take minutes where hobby projects
| take seconds.
|
| Also, people have vastly different workflows. Some people
| tend to slowly write a lot of code and compile rarely; maybe
| they have runtime tools to tweak things. Others like to
| iterate really fast: try a code change, see if the UI looks
| better or things run faster. When you work like this, even a
| compile time of 3 seconds can be a little bit annoying, and
| 30 seconds maddening.
| Taywee wrote:
| It's less about "big projects" and more about "what
| features are used". It's entirely possible for a 10kloc
| project to take much more time to build than a 100kloc
| project. Proc macros, heavy generic use, and the like will
| drive compile time way up. It's like comparing a C++
| project that is basically "C with classes" vs one that does
| really heavy template dances.
|
| Notably, serde can drive up compile times a lot, which is
| why miniserde still exists and gets some use.
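|
| For illustration, a sketch of the kind of derive that drives
| this (assuming serde with its derive feature enabled; the
| struct itself is made up): each such derive expands into a
| nontrivial amount of generated trait-impl code that rustc
| must then type-check and compile.
|       use serde::{Deserialize, Serialize};
|
|       // This one attribute line expands, at compile time, into
|       // full Serialize and Deserialize impls for Config.
|       #[derive(Serialize, Deserialize)]
|       struct Config {
|           name: String,
|           retries: u32,
|       }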
| jph wrote:
| Code gen takes quite a while. Diesel features are one way to
| see the effect...
|
|       diesel = { version = "*", features = ["128-column-tables"], ... }
| [deleted]
| twotwotwo wrote:
| This also relates to something not directly about rustc: many-
| core CPUs are much easier to get than five years ago, so a CPU-
| hungry compiler needn't be such a drag if its big jobs can use
| all your cores.
| michaelt wrote:
| It's true!
|
| Steam hardware survey, Jan 2017 [1] vs Jan 2023, "Physical
| CPUs (Windows)":
|                  2017    2023
|       1 CPU      1.9%    0.2%
|       2 CPUs    45.8%    9.6%
|       3 CPUs     2.6%    0.4%
|       4 CPUs    47.8%   29.6%
|       6 CPUs     1.4%   33.0%
|       8 CPUs     0.2%   18.8%
|       More       0.3%    8.4%
|
| [1] https://web.archive.org/web/20170225152808/https://store.ste...
| masklinn wrote:
| However, rustc currently has limited ability to parallelise
| at a sub-crate level, which makes for not-so-great tradeoffs
| on large projects.
| manholio wrote:
| The most annoying thing in my experience is not really the raw
| compilation times, but the lack of - or very rudimentary -
| incremental build feature. If I'm debugging a function and make
| a small local change that does not trickle down to some generic
| type used throughout the project, then 1-second build times
| should be the norm, or better yet, edit & continue debug.
|
| It's beyond frustrating that any "i+=1" change requires
| relinking a 50mb binary from scratch and rebuilding a good
| chunk of the Win32 crate for good measure. Until such
| enterprise features become available, high developer
| productivity in Rust remains elusive.
| burntsushi wrote:
| To be clear, Rust has an "incremental" compilation feature,
| and I believe it is enabled by default for debug builds.
|
| I don't think it's enabled by default in release builds
| (because it might sacrifice perf too much?) and it doesn't
| make linking incremental.
|
| Making the entire pipeline incremental, including release
| builds, probably requires some very fundamental changes to
| how our compilers function. I think Cranelift is making
| inroads in this direction by caching the results of compiling
| individual functions, but I know very little about it and
| might even be describing it incorrectly here in this comment.
| josephg wrote:
| > It's beyond frustrating that any "i+=1" change requires
| relinking a 50mb binary from scratch
|
| It's especially hard to solve this with a language like rust,
| but I agree!
|
| I've long wanted to experiment with a compiler architecture
| which could do fully incremental compilation, maybe down to
| function-level granularity. In the linked (debug) executable,
| use a malloc style library to manage disk space. When a
| function changes, recompile it, free the old copy in the
| binary, allocate space for the new function and update jump
| addresses. You'd need to cache a whole lot of the compiler's
| context between invocations - but honestly that should be
| doable with a little database like LMDB. Or alternately, we
| could run our compiler in "interactive mode", and leave all
| the type information and everything else resident in memory
| between compilation runs. When the compiler notices some
| functions are changed, it flushes the old function
| definitions, compiles the new functions and updates
| everything just like when the DOM updates and needs to
| recompute layout and styles.
|
| A well optimized incremental compiler should be able to do a
| "i += 1" line change faster than my monitor's refresh rate.
| It's crazy we still design compilers to do a mountain of
| processing work, generate a huge amount of state and then
| when they're done throw all that work out. Next time we run
| the compiler, we redo all of that work again. And the work is
| all almost identical.
|
| Unfortunately this would be a particularly difficult change
| to make in the rust compiler. Might want to experiment with a
| simpler language first to figure out the architecture and the
| fully incremental linker. It would be a super fun project
| though!
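|
| For illustration, a toy sketch of the caching idea (not how
| rustc or any real linker works; all names here are made up):
|       use std::collections::HashMap;
|
|       /// Cache of generated machine code, keyed by a hash of a
|       /// function's source plus everything it depends on.
|       struct IncrementalCache {
|           compiled: HashMap<u64, Vec<u8>>,
|       }
|
|       impl IncrementalCache {
|           /// Recompile only when the function's key has changed;
|           /// otherwise reuse the previously generated code.
|           fn get_or_compile(
|               &mut self,
|               key: u64,
|               compile: impl FnOnce() -> Vec<u8>,
|           ) -> &[u8] {
|               self.compiled.entry(key).or_insert_with(compile).as_slice()
|           }
|       }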
| CGamesPlay wrote:
| Can you explain why the user time goes _down_ when using a
| single thread? Does that mean that there's a huge amount of
| contention in the parallelism?
| nequo wrote:
| Faults also drop to zero. Might be worth trying to flush the
| cache before each cargo build?
| twotwotwo wrote:
| There are hardware reasons even if you leave any software
| scaling inefficiency to the side. For tasks that can use lots
| of threads, modern hardware trades off per-thread performance
| for getting more overall throughput from a given amount of
| silicon.
|
| When you max out parallelism, you're using 1) hardware
| threads which "split" a physical core and (ideally) each run
| at a bit more than half the CPU's single-thread speed, and 2)
| the small "efficiency" cores on newer Intel and Apple chips.
| Also, single-threaded runs can feed a ton of watts to the one
| active core since it doesn't have to share much power/cooling
| budget with the others, letting it run at a higher clock
| rate.
|
| All these tricks improve the throughput, or you wouldn't see
| that wall-time reduction and chipmakers wouldn't want to ship
| them, but they do increase how long it takes each thread to
| get a unit of work done in a very multithreaded context,
| which contributes to the total CPU time being higher than it
| is in a single-threaded run.
| celrod wrote:
| User time is the amount of CPU time spent in user mode. It is
| aggregated across threads. If you have 8 threads running at
| 100% in user mode for 1 second, that gives you 8s of user
| time.
|
| Total CPU time in user mode will normally increase when you
| add more threads, unless you're getting perfect or better-
| than-perfect scaling.
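|
| A minimal way to see this yourself (a made-up demo, nothing
| cargo-specific): run the following under `time`; it should
| report roughly 1s of real time but ~8s of user time on a
| machine with 8+ cores.
|       use std::thread;
|       use std::time::{Duration, Instant};
|
|       fn main() {
|           let handles: Vec<_> = (0..8)
|               .map(|_| {
|                   thread::spawn(|| {
|                       // ~1 second of pure user-mode busy work.
|                       let start = Instant::now();
|                       while start.elapsed() < Duration::from_secs(1) {
|                           std::hint::black_box(0u64);
|                       }
|                   })
|               })
|               .collect();
|           for h in handles {
|               h.join().unwrap();
|           }
|       }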
| burntsushi wrote:
| To be honest, I don't know. My understanding of 'user' time
| is that it represents the sum of all CPU time spent in "user
| mode" (as opposed to "kernel mode"). In theory, given that
| understanding and perfect scaling, the user time of a multi-
| threaded task should roughly match the user time of a single-
| threaded task. Of course, "perfect" scaling is unlikely to be
| real, but still, you'd expect better scaling here.
|
| If I had to guess as to what's happening, it's that there's
| some thread pool, and at some point, near the end of
| compilation, only one or two of those threads is busy doing
| anything while the other threads are sitting and idling. Now
| whether and how that "idling" gets interpreted as "CPU being
| actively used in user mode" isn't quite clear to me. (It may
| not, in which case, my guess is bunk.)
|
| Perhaps someone more familiar with what 'user' time actually
| means and how it interplays with multi-threaded programs will
| be able to chime in.
|
| (I do not think faults have anything to do with it. The
| number of faults reported here is quite small, and if I re-
| run the build, the number can change quite a bit---including
| going to zero---and the overall time remains unaffected.)
| Filligree wrote:
| User time is the amount of CPU time spent actually doing
| things. Unless you're using spinlocks, it won't include
| time spent waiting on locks or otherwise sleeping -- though
| it will include time spent setting up for locks, reloading
| cache lines and such.
|
| Extremely parallel programs can improve on this, but it's
| perfectly normal to see 2x overhead for fine-grained
| parallelism.
| fulafel wrote:
| Spinlocks are normal userspace code issuing machine
| instructions in a loop that do memory operations. It is
| counted in user time, unless the platform is unusual and
| for some reason enters the kernel to spin on the lock.
| Spinning is the opposite of sleeping.
|
| edit: misparsed, like corrected below, my bad.
| burntsushi wrote:
| I think you're saying the same thing as the GP. You might
| have parsed their comment incorrectly.
| burntsushi wrote:
| I'd say there's still a gap in my mental model. I agree
| that it's normal to observe this, definitely. I see it in
| other tools that utilize parallelism too. I just can't
| square the 2x overhead part of it in a workload like
| Cargo's, which I assume is _not_ fine-grained. I see the
| same increase in user time with ripgrep too, and its
| parallelism is maybe more fine grained than Cargo's, but
| is still at the level of a single file, so it isn't that
| fine grained.
|
| But maybe for Cargo, parallelism is more fine grained
| than I think it is. Perhaps because of codegen-units. And
| similarly for ripgrep, if it's searching a lot of tiny
| files, that might result in fine grained parallelism in
| practice.
| Filligree wrote:
| Well, as mentioned elsewhere, most of that overhead is
| just hyper threads slowing down when they have active
| siblings.
|
| Which is fine; it's still faster overall. Disable SMT and
| you'll see much lower overhead, but higher time spent
| overall.
| burntsushi wrote:
| Yes, I know it's fine. I just don't understand the full
| details of why hyperthreading slows things down that
| much. There are more experiments that could be done to
| confirm or deny this explanation, e.g., disabling
| hyperthreading. And playing with the thread count a bit
| more.
| ynik wrote:
| Idle time doesn't count as user-time unless it's a spinlock
| (please don't do those in user-mode).
|
| I suspect the answer is: Perfect scaling doesn't happen on
| real CPUs.
|
| Turboboost lets a single thread go to higher frequencies
| than a fully loaded CPU. So you would expect "sum of user
| times" to increase even if "sum of user clock cycles" is
| scaling perfectly.
|
| Hyperthreading is the next issue: multiple threads are not
| running independently, but might be fighting for resources
| on a single CPU core.
|
| In a pure number-crunching algorithm limited by functional
| units, this means using $(nproc) threads instead of 1
| thread should be expected to more than double the user time
| based on these first two points alone!
|
| Compilers of course are rarely limited by functional units:
| they do a decent bit of pointer-chasing, branching, etc.
| and are stalled a good bit of the time. (While OS-level
| blocking doesn't count as user time, the OS isn't aware of
| these CPU-level stalls, so they do count as user time!) This
| is what makes hyperthreading actually helpful.
|
| But compilers also tend to be memory/cache-limited. L1 is
| shared between the hyperthreads, and other caches are
| shared between multiple/all cores. This means running
| multiple threads compiling different parts of the program
| in parallel means each thread of computation gets to work
| with a smaller portion of the cache -- the effective cache
| size is decreasing. That's another reason for the user time
| to go up.
|
| And once you have a significant number of cache misses from
| a bunch of cores, you might be limited on memory bandwidth.
| At that point, also putting the last few remaining idle
| cores to work will not be able to speed up the real-time
| runtime anymore -- but it will make "user time" tick up
| faster.
|
| In particularly unlucky combinations of working set size
| vs. cache size, adding another thread (bringing along
| another working set) may even increase the real time.
| Putting more cores to work isn't always good!
|
| That said, compilers are more limited by memory/cache
| latency than bandwidth, so adding cores is usually pretty
| good. But it's not perfect scaling even if the compiler has
| "perfect parallellism" without any locks.
| burntsushi wrote:
| > Turboboost lets a single thread go to higher
| frequencies than a fully loaded CPU. So you would expect
| "sum of user times" to increase even if "sum of user
| clock cycles" is scaling perfectly.
|
| Ah yes, this is a good one! I did not account for this.
| Mental model updated.
|
| Your other points are good too. I considered some of them
| as well, but maybe not enough in the context of
| competition making many things just a bit slower. Makes
| sense.
| [deleted]
| pornel wrote:
| This is caused by hyperthreading. It's not an actual
| inefficiency, but an artifact of the way CPU time is counted.
|
| The HT cores aren't real CPU cores. They're just an
| opportunistic reuse of hardware cores when another thread is
| waiting for RAM (RAM is relatively so slow that they're
| waiting a lot, for a long time).
|
| So code on the HT "core" doesn't run all the time, only when
| the other thread is blocked. But the time HT threads spend
| waiting for their turn is included in wall-clock time, and
| makes them look slow.
| pjmlp wrote:
| Back in the early days of HT I was so happy to get a
| desktop with it that I enabled it.
|
| The end result was that doing WebSphere development
| actually got slower, because of their virtual nature and
| everything else on the CPU being shared.
|
| So I ended up disabling it again to get the original
| performance back.
| pornel wrote:
| Yeah, the earliest attempts weren't good, but I haven't
| heard of any HT problems post Pentium 4 (apart from
| Spectre-like vulnerabilities).
|
| I assume OSes have since then developed proper support
| for scheduling and pre-empting hyperthreading. Also the
| gap between RAM and CPU speed only got worse, and CPUs
| have grown more various internal compute units, so
| there's even more idle hardware to throw HT threads at.
| fnordpiglet wrote:
| I remember I would spend hours looking at my code change
| because it would take hours to days to build what I was working
| on. I would build small examples to test and debug. I was
| shocked at Netscape with the amazing build system they had that
| could continuously build and tell you within a short few hours
| if you've broken the build on their N platforms they cross
| compiled to. I was bedazzled when I had IDEs that could tell me
| whether I had introduced bugs and could do JIT compilation and
| feedback to me in real time if I had made a mistake and provide
| inline lints. I was floored when I saw what amazing things rust
| was doing in the compiler to make my code awesome and how
| incredibly fast it builds. But what really amazed me more than
| anything was realizing how unhappy folks were that it took 30
| seconds to build their code. :-)
|
| GET OFF MY LAWN
| [deleted]
| [deleted]
| burntsushi wrote:
| I dare to want better tools. And I build them when I can.
| Like ripgrep. ¯\_(ツ)_/¯
| fnordpiglet wrote:
| Keep keeping me amazed and I'll keep loving the life I've
| lived
| burntsushi wrote:
| Someone asked (and then deleted their comment):
|
| > How many LoC there is in ripgrep? 46sec to build a grep like
| tool with a powerful CPU seems crazy.
|
| I wrote out an answer before I knew the comment was deleted,
| so... I'll just post it as a reply to myself...
|
| -----
|
| Well it takes 46 seconds with only a single thread. It takes ~7
| seconds with many threads. In the 0.8.0 checkout, if I run
| `cargo vendor` and then tokei, I get:
|       $ tokei -trust src/ vendor/
|       ===============================================================
|        Language          Files     Lines      Code  Comments  Blanks
|       ===============================================================
|        Rust                765    299692    276218     10274   13200
|        |- Markdown         387     21647      2902     14886    3859
|        (Total)                    321339    279120     25160   17059
|       ===============================================================
|        Total               765    299692    276218     10274   13200
|       ===============================================================
|
| So that's about a quarter million lines. But this is very
| likely to be a poor representation of actual complexity. If I
| had to guess, I'd say the vast majority of those lines are some
| kind of auto-generated thing. (Like Unicode tables.) That count
| also includes tests. Just by excluding winapi, for example, the
| count goes down to ~150,000.
|
| If you _only_ look at the code in the ripgrep repo (in the
| 0.8.0 checkout), then you get something like ~13K:
|       $ tokei -trust src globset grep ignore termcolor wincolor
|       ===============================================================
|        Language          Files     Lines      Code  Comments  Blanks
|       ===============================================================
|        Rust                 34     15484     13205       780    1499
|        |- Markdown          30      2300         6      1905     389
|        (Total)                     17784     13211      2685    1888
|       ===============================================================
|        Total                34     15484     13205       780    1499
|       ===============================================================
|
| It's probably also fair to count the regex engine too (version
| 0.2.6):
|       $ tokei -trust src regex-syntax
|       ===============================================================
|        Language          Files     Lines      Code  Comments  Blanks
|       ===============================================================
|        Rust                 29     22745     18873      2225    1647
|        |- Markdown          23      3250       285      2399     566
|        (Total)                     25995     19158      4624    2213
|       ===============================================================
|        Total                29     22745     18873      2225    1647
|       ===============================================================
|
| Where about 5K of that are Unicode tables.
|
| So I don't know. Answering questions like this is actually a
| little tricky, and presumably you're looking for a barometer of
| how big the project is.
|
| For comparison, GNU grep takes about 17s single threaded to
| build from scratch from its tarball:
|       $ time (./configure --prefix=/usr && make -j1)
|       real 17.639  user 9.948  sys 2.418  maxmem 77 MB  faults 31
|
| Using `-j16` decreases the time to 14s, which is actually
| slower than a from-scratch ripgrep 0.8.0 build, primarily due
| to what appears to be a single-threaded configure script for
| GNU grep.
|
| So I dunno what seems crazy to you here honestly. It's also
| worth pointing out that ripgrep has quite a bit more
| functionality than something like GNU grep, and that
| functionality comes with a fair bit of code. (Gitignore
| matching, transcoding and Unicode come to mind.)
| Thaxll wrote:
| It was me, and thanks for the details. I missed the
| multi-threaded compilation in the second part; I thought it
| was 46sec with -jx.
| kibwen wrote:
| In addition, it's worth mentioning here that the measurement
| is for release builds, which are doing far more work than
| just reading a quarter million lines off of a disk.
| lumb63 wrote:
| I love to see work being done to improve Rust compile times. It's
| one of the biggest barriers to adoption today, IMO.
|
| Package management, one of Rust's biggest strengths, is one of
| its biggest weaknesses here. It's so easy to pull in another
| crate to do almost anything you want. How many of them are well-
| written, optimized, trustworthy, etc.? My guess is, not that
| many. That leads to applications that use them being bloated and
| inefficient. Hopefully, as the ecosystem matures, people will pay
| better attention to this.
| pornel wrote:
| On the contrary, commonly used Rust crates tend to be well
| written and well optimized (source: I have done security audits
| of hundreds of deps and I curate https://lib.rs).
|
| Rust has a culture of splitting dependencies into small
| packages. This helps pull in only focused, tailored
| functionality that you need rather than depending on multi-
| purpose large monoliths. Ahead-of-time compilation + generics +
| LTO means there's no extra overhead to using code from a 3rd
| party dependency vs your own (unlike interpreted or VM
| languages where loading code costs, or C with dynamic libraries
| where you depend on the whole library no matter how little you
| use from it).
|
| I assume people scarred by low-quality dependencies have been
| burned by npm. Unlike JS, Rust has a strong type system, with
| rules that make it hard to cut corners and break things. Rust
| also ships with a good linter, built-in unit testing, and
| standard documentation generator. These features raise the
| quality of average code.
|
| Use of dependencies can improve efficiency of the whole
| application. Shared dependencies-of-dependencies increase code
| reuse: instead of each library rolling its own NIH basics like
| loggers or base64 decode, you can have one shared copy.
|
| You can also easily use very optimized implementations of
| common tasks like JSON, hashmaps, regexes, cryptography, or
| channels. Rust has some world-class crates for these tasks.
| gregwebs wrote:
| Haskell is one of the few languages that can compile slower than
| Rust. But it has a REPL, GHCi, that can be used to fairly
| quickly reload code changes.
|
| I wish there were some efforts at dramatically different
| approaches like this, because all the work going into
| compilation is unlikely to make the development cycle twice
| as fast in most cases.
| sesm wrote:
| Are there any articles/papers that explain how a mix of
| compiled and interpreted code works for Haskell? I wanted to
| play with this idea for my toy language, but don't know where
| to start.
| nkit wrote:
| I've started liking evcxr (https://github.com/google/evcxr) for
| REPL. It's a little slow compared to other REPLs, but still
| good enough to be usable after initial load.
| sitkack wrote:
| I agree, evcxr really needs to be advertised more. It might
| need a new name, I don't even know how to say it.
| antipurist wrote:
| > eee-vic-ser
|
| https://github.com/google/evcxr/issues/215
| laszlokorte wrote:
| Phonetically for a German that sounds like "eww,
| disgusting wanker"
| pornel wrote:
| Another build time improvement coming, especially for fresh CI
| builds, is a new registry protocol. Instead of git-cloning
| metadata for 100,000+ packages, it can download only the data for
| your dependencies.
|
| https://blog.rust-lang.org/inside-rust/2023/01/30/cargo-spar...
| MarkSweep wrote:
| You can also use something like this to cache build artifacts
| and dependencies between builds:
|
| https://github.com/Swatinem/rust-cache
| MuffinFlavored wrote:
| I really wonder how many Dockerfiles are out there that, on
| every PR merge, pull the entire cargo "metadata" without a
| cache, and how wasteful that is from a bandwidth/electricity
| standpoint, or whether in the grand scheme of things it's a
| small drop in the bucket.
| aseipp wrote:
| In my experience it's pretty significant from the bandwidth
| side at reasonable levels of usage. You'd be astounded at how
| many things download packages and their metadata near
| constantly, and the rise of fully automated CI systems has
| really put the stress on bandwidth in particular, since most
| things are "from scratch." And now we have things like
| dependabot automatically creating PRs for downstream
| advisories constantly which can incur rebuilds, closing the
| loop fully.
|
| If you use GitHub as like a storage server and totally
| externalize the costs of the package index onto them, then
| it's workable for free. But if you're running your own
| servers then it's a whole different ballgame.
| kzrdude wrote:
| I think github would have throttled that cargo index
| repository a long time ago if it wasn't used by Rust, i.e.
| they get some kind of special favour. Which is nice but
| maybe not sustainable.
| kibwen wrote:
| Github employees personally reached out to various
| packagers (I know both Cargo and Homebrew for certain)
| asking them not to perform shallow clones on their index
| repos, because of the extra processing it was incurring
| on the server side.
| CodesInChaos wrote:
| Why would a CI build need the index at all? The lock file
| should already contain all the dependencies and their hashes.
| kibwen wrote:
| You're correct that Cargo doesn't check the index if it's
| building using a lockfile, but I think the problem is that a
| freshly-installed copy of Cargo assumes that it needs to get
| the index the first time that any command is run. I assume
| (but haven't verified in the slightest) that this behavior
| will change with the move to an on-demand index by default.
| Vecr wrote:
| Good thing they will continue to support the original protocol.
| I don't like downloading things on demand like that, not good
| for privacy.
| charcircuit wrote:
| How is it bad for privacy?
|
| Before:
|
| Download all metadata, Download xyz package
|
| After:
|
| Download xyz's metadata, Download xyz
|
| They already know you are using xyz.
| throwaway894345 wrote:
| I don't care much either way, but you have the privacy
| argument backwards. If you're downloading all the things,
| then no one knows if you are using xyz, only that you _might_
| be using xyz. If you're just downloading what you need and
| you're downloading xyz, then they know that you're using
| xyz.
| rascul wrote:
| You're downloading specific packages either way, which
| can potentially be tracked, regardless of whether you're
| downloading metadata for all packages or just one.
|
| Edit: A thought occurs to me. Cargo downloads metadata
| from crates.io but clones the package repo from
| GitHub/etc. So unless I'm missing something, downloading
| specific metadata instead of all metadata allows for
| crates.io to track your specific packages in addition to
| GitHub.
| pornel wrote:
| No, repos of packages are not used, at all. Crates don't
| even need to be in any repository, and the repository URL
| in the metadata isn't verified in any way. Crates can
| link to somebody else's repo or a repo full of fake code
| unrelated to what has been published on crates.io.
|
| crates.io crates are tarballs stored in S3. The tarball
| downloads also go through a download-counting service,
| which is how you get download stats for all crates (it's
| not a tracker in the Google-is-watching-you sense, but
| just an integer increment in Postgres).
|
| Use https://lib.rs/cargo-crev or source view on docs.rs
| to see the actual source code that has been uploaded by
| Cargo.
| kibwen wrote:
| This has it backwards. crates.io has always hosted the
| crates themselves, but has used Github for the index. In
| the future, with the sparse HTTP index, crates.io will be
| the only one in the loop, cutting Github out of the
| equation.
| Xorlev wrote:
| I'm not sure I understand. This is talking about Cargo
| metadata download improvements. You still download
| individual packages regardless of receiving a copy of the
| entire registry, so privacy hasn't materially changed
| either way.
|
| If knowing you use a crate is too much, then running your
| own registry with a mirror of packages seems like all you
| could do.
| aseipp wrote:
| Great stuff. Now, if they can just have a globally shared (at
| least per $USER!), content-addressable target/ directory, two
| of my complaints with Cargo would be fixed nicely...
| xiphias2 wrote:
| I see a lot of work going into making the compiler faster (which
| looks hard at this point), but I wish I could at least make
| correct changes without needing to recompile the code.
|
| The extract function tool is very buggy. As I spend a lot of time
| refactoring, maybe putting time in those tools would have a
| better ROI than so much work into making the compiler faster.
| estebank wrote:
| Keep in mind that the people working on rustc are not the same
| working on rust-analyzer, even if there's some overlap in
| contributors and there's a desire to share libraries as much as
| possible. Someone working on speeding up rustc is unlikely to
| have domain expertise in DX and AST manipulation, and vice-
| versa.
| xiphias2 wrote:
| Maybe you're right, but I think both are hard enough that
| people who are smart enough to do one can do the other if
| they really want :)
|
| By the way AST manipulation is easy, the really hard part of
| refactoring (that I had a lot of problem with) is creating
| the lifetime annotations, which requires a deep understanding
| of the type system.
|
| I was trying to learn some type theory and read papers to
| understand how Rust's life times work, but only found long
| research papers that don't even do the same thing as Rust.
|
| I haven't found any documentation that documents exactly when
| a function call is accepted by the lifetime checker (borrow
| checking is easy).
| guipsp wrote:
| Have you read the Rustonomicon section on lifetimes? I found
| it pretty useful
| xiphias2 wrote:
| It's cool, I just looked at the manual.
|
| there's this part though:
|
| // NOTE: `'a: {` and `&'b x` is not valid syntax!
|
| I hate that I can't introduce a new lifetime inside a
| function, it would make refactoring so much easier. Right
| now I have to try to refactor, see if the compiler
| accepts it or not, then revert the change.
|
| Sometimes desugaring would be a great feature in itself,
| sugaring makes interactions between functions much harder
| to understand.
| IshKebab wrote:
| I really wish there was some work on hermetic compilation of
| crates. Ideally crates would be able to opt-in (eventually opt-
| out) to "pure" mode which would mean they can't use `build.rs`,
| proc macros are fully sandboxed, no `env!()` and so on.
|
| Without that you can't really do distributed and cached
| compilation 100% reliably.
| josephg wrote:
| That would help some stuff, but it wouldn't help with
| monomorphized code or macro expansion. Those two are the real
| killers in terms of compilation performance. And in both of
| those cases, most of the compilation work happens at the call
| site - when compiling _your_ library.
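|
| For a concrete picture of the monomorphization point (a
| made-up example): every distinct `T` below forces a separate
| copy of the function to be generated and optimized in the
| calling crate, even if the generic was defined elsewhere.
|       // Generic function, possibly defined in a dependency.
|       fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
|           let mut max = items[0];
|           for &item in &items[1..] {
|               if item > max {
|                   max = item;
|               }
|           }
|           max
|       }
|
|       fn main() {
|           largest(&[1u32, 5, 3]);    // instantiates largest::<u32>
|           largest(&[1.0f64, 5.0]);   // instantiates largest::<f64>
|       }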
| IshKebab wrote:
| Those are simply different problems though. Macro expansion
| is not even exactly a problem.
|
| For monomorphized code the compiler just needs a mode where
| it automatically does what the Momo crate does.
|
| For proc macros the Watt crate (precompiled WASM macros) will
| make a big difference. It just needs official sanction and
| integration.
|
| Anyway yeah those are totally separate problems to caching
| and distributed builds.
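|
| For reference, a hand-written sketch of the pattern the Momo
| crate automates (the function here is made up): only the thin
| generic shim is monomorphized per call-site type, while the
| bulk of the work is compiled once.
|       use std::fs::File;
|       use std::io;
|       use std::path::Path;
|
|       pub fn open_config<P: AsRef<Path>>(path: P) -> io::Result<File> {
|           // Non-generic inner function: compiled a single time
|           // and shared by every instantiation of the outer shim.
|           fn inner(path: &Path) -> io::Result<File> {
|               File::open(path)
|           }
|           inner(path.as_ref())
|       }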
| pjmlp wrote:
| Eiffel, Ada, D, C++ with extern templates (even better if
| modules are also part of the story) show ways how this can be
| improved.
|
| Naturally someone has to spend time analysing how their
| approach maps onto Rust's compilation story.
| nicoburns wrote:
| Would it not allow macro expansion to be cached? Which I
| believe it can't be currently, because macros can run
| arbitrary code and access arbitrary external state.
| mcdonje wrote:
| I don't know much about how the compiler works, so the answer
| here is probably that I should read a book, but can external
| crates from crates.io be precompiled? Or maybe compile my
| reference to a part of an external crate once and then it doesn't
| need to be done on future compilations?
|
| If the concern is that I could change something in a crate,
| could a checksum be created on the first compilation, then
| checked on future compilations? If it matches, the crate
| doesn't need to be recompiled.
| epage wrote:
| Hosted pre-compiled builds would need to account for
|
| - Feature flags
|
| - Platform conditionals
|
| - The specific rust version being used
|
| - (unsure on this) the above for all dependencies of what is
| being pre-compiled
|
| There are also the impediments of designing / agreeing on a
| security model (do you trust the author like PyPI, trust a
| central build authority, etc.) and then funding the continued
| hosting.
|
| Compiling on demand like in sccache is likely the best route
| for not over-building and being able to evict unused items.
| mcdonje wrote:
| sccache seems awesome. I wasn't aware of it. Thanks.
| runevault wrote:
| Because of the lack of a stable ABI, they'd need to pre-compile
| it for however many versions of the Rust compiler they wanted
| to support.
| duped wrote:
| Cargo already does this when building incrementally, and there
| are tools for doing it within an organization like sccache.
|
| > If the concern is that I could change something in a crate
|
| It's possible for a change in one crate to require recompiling
| its dependencies and transitive dependencies, due to
| conditional compilation (aka "features" [0]). Basically you
| can't know which thing to compile until it's referenced by a
| dependent and provided a feature set.
|
| That said, many crates don't have features and have a default
| feature set, but the number of variants to precompile is still
| quite large.
|
| [0] https://doc.rust-lang.org/cargo/reference/features.html
|
| Note that C and C++ have the exact same problem, but it's
| mitigated by people never giving a shit about locking
| dependencies and living with the horrible bugs that result from
| it.
| dcow wrote:
| When people complain about rust compile times are they
| complaining about cold/clean compiles or warm/cached compiles? I
| can never really tell because people just gripe "compile times".
|
| I can see how someone would come to rust, type `cargo run`, wait
| 3-5 minutes while cargo downloads all the dependencies and
| compiles them along with the main package, and then say, "well
| that took awhile it kinda sucks". But if they change a few lines
| in the actual project and compile again it would be near instant.
|
| The fair comparison would be something akin to deleting your node
| or go modules and running a cold build. I am slightly suspicious,
| not in a deliberate foul play way but more in a messy semantics
| and ad-hoc anecdotes way, that many of these compile time
| discrepancies probably boil down more to differences in how the
| cargo tooling handles dependencies and what it decides to include
| in the compile phase, where it decides to store caches and what
| that means for `clean`, etc., compared to similar package
| management tooling from other languages, than to "rustc
| is slow". But I could be wrong.
| codetrotter wrote:
| > But if they change a few lines in the actual project and
| compile again it would be near instant.
|
| If it's a big project and the lines you are changing are in
| something that is being used many other places then the rebuild
| will still take a little while. (30 seconds or a minute, or
| more, depending on the size of the project.)
|
| Likewise, if you work on things in different branches you may
| need to wait more when you switch branch and work on something
| there.
|
| Also if you switch between Rust versions you need to wait a
| while when you rebuild your project.
|
| I love Rust, and I welcome everything that is being done to
| bring the compile times down further!
| dcow wrote:
| I am not discouraging efforts to make compile times faster.
| However, I also see a lot of things that would really make
| Rust soar not being worked on, like syntax quality of life
| reworks that get complex under the hood being dropped,
| partially complete features with half baked PRs, IDE tooling
| and debugging support, interface-types and much of the
| momentum behind wasm, async traits and the sorely lacking
| async_std, etc. It seems like every time I dive into
| something moderately complex I start hitting compiler caveats
| with links to issues that have been open for 5 years and a
| bunch of comments like "what's the status of this can we
| please get this merged?". It can ever so slightly give one
| the impression that the rust community has decided that the
| language is mature and the only thing missing is faster
| compile times.
| insanitybit wrote:
| > "what's the status of this can we please get this
| merged?"
|
| Having written Rust professionally for a number of years,
| this didn't happen too much. Where it did it was stuff like
| "yeah you need to Box the thing today", which... did not
| matter, we just did that and moved on.
|
| > It can ever so slightly give one the impression that the
| rust community has decided that the language is mature and
| the only thing missing is faster compile times.
|
| That is generally my feeling about Rust. There are a few
| areas where I'd like to see things get wrapped up (async
| traits, which are being actively worked on) but otherwise
| everything feels like a bonus. In terms of things that made
| Rust difficult to use, yeah, compile times were probably
| the number one.
| dcow wrote:
| I mean this is what you have to do to access variables
| from an async block:
|       let block = || {
|           let my_a = a.clone();
|           let my_b = b.clone();
|           let my_c = c.clone();
|           async move {
|               // use my_a, my_b, my_c
|               let value = ...;
|               Ok::<success::Type, error::Type>(value)
|           }
|       };
|
| And you can't use `if let ... && let ...` (two lets for
| one if) because it doesn't desugar correctly.
|
| And error handling and backtraces are a beautiful mess.
| Your signatures look like `Result<..., Box<dyn
| std::error::Error>>` unless you use `anyhow::Result`, but
| then half the stuff implements std::error::Error but not
| Into<anyhow::Error>, and you can't add the silly trait
| impl because of _language limitations_, so you have to
| map_err everywhere.
|
| It's not just "oh throw a box around it and you're good".
| It's ideas that were introduced to the language when
| there was lots of steam ultimately not making it to a
| fully polished state (maybe Moz layoffs are partly to
| blame IDK). Anyway I love Rust and we use it in
| production and have been for years, but I think there's
| still quite a bit to polish.
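|
| A contrived sketch of that map_err friction (the error type
| here is made up; anything `!Send + !Sync` shows the same
| problem):
|       use anyhow::anyhow;
|       use std::rc::Rc;
|
|       // Implements std::error::Error, but Rc makes it neither
|       // Send nor Sync, so it has no Into<anyhow::Error> impl.
|       #[derive(Debug)]
|       struct LegacyError(Rc<String>);
|
|       impl std::fmt::Display for LegacyError {
|           fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|               write!(f, "{}", self.0)
|           }
|       }
|       impl std::error::Error for LegacyError {}
|
|       fn legacy_call() -> Result<u32, LegacyError> {
|           Err(LegacyError(Rc::new("boom".into())))
|       }
|
|       fn caller() -> anyhow::Result<u32> {
|           // `legacy_call()?` does not compile here; you have to
|           // flatten the error into a message by hand.
|           legacy_call().map_err(|e| anyhow!("{e}"))
|       }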
| nicoburns wrote:
| There are a few more fundamental missing pieces for me:
|
| - It's impossible to describe a type that "implements
| trait A and may or may not implement trait B"
|
| - It's impossible to be generic over a trait (not a type
| that implements a trait, the trait itself)
| tialaramex wrote:
| > It's impossible to describe a type that "implements
| trait A and may or may not implement trait B"
|
| How is this different from just describing a type that
| only "implements trait A" ?
| nicoburns wrote:
| It would allow you to call a function to check for trait
| B and downcast to "implements trait A and B" in the case
| that it does implement the trait.
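|
| A rough sketch of how this gets emulated today (made-up
| traits; the explicit opt-in method is the part the language
| could conceivably do automatically):
|       trait B {
|           fn b(&self);
|       }
|
|       trait A {
|           fn a(&self);
|           // Manual "query": Some(..) only if the concrete type
|           // also implements B.
|           fn as_b(&self) -> Option<&dyn B> {
|               None
|           }
|       }
|
|       struct Both;
|       impl B for Both {
|           fn b(&self) {}
|       }
|       impl A for Both {
|           fn a(&self) {}
|           fn as_b(&self) -> Option<&dyn B> {
|               Some(self)
|           }
|       }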
| merely-unlikely wrote:
| I'm still learning the language but couldn't you use an
| enum containing two types to accomplish the same thing?
| nicoburns wrote:
| You can if you know all of the possible types in advance.
| But if you want to expose this as an interface from a
| library that allows users to provide their own custom
| implementation then you need to use traits.
| tialaramex wrote:
| It seems like a way to ask "Can this thing implement X
| and if so how?" from say the Any trait would be what you
| want here, I have no idea how hard that would be to
| deliver but I also don't see how the previous trait thing
| is relevant, like, why do we need to say up front that
| _maybe_ we will care whether trait B is implemented?
| insanitybit wrote:
| > - It's impossible to describe a type that "implements
| trait A and may or may not implement trait B"
|
| So, specialization? Or something else? I haven't found a
| need for specialization. I remember when I came from C++
| I had a hard time adjusting to "no specialization, no
| variadics" but idk I haven't missed it in years.
|
| > - It's impossible to be generic over a trait (not a
| type that implements a trait, the trait itself)
|
| Not sure I understand.
| nicoburns wrote:
| > So, specialization?
|
| Basically yes. But that works with dynamic dispatch
| (trait objects) as well as static dispatch (generics).
|
| > Not sure I understand.
|
| A specific pattern I'd like to be able to represent is:
|       trait AlgorithmAInputData { ... }
|
|       trait AlgorithmA {
|           trait InputData = AlgorithmAInputData;
|           ...
|       }
|
|       trait DataStorage<trait AlgorithmA> {
|           type InputData: Algorithm::InputData;
|           fn get_input_data() -> InputData;
|       }
|
|       fn compute_algorithm_a<Storage: DataStorage<AlgorithmA>>() { ... }
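|
| The closest approximation in today's Rust seems to be a type
| parameter bounded by a trait with an associated type (a
| made-up sketch, which loses the "generic over the trait
| itself" part):
|       trait Algorithm {
|           type InputData;
|       }
|
|       trait DataStorage<A: Algorithm> {
|           fn get_input_data(&self) -> A::InputData;
|       }
|
|       fn compute<A: Algorithm, S: DataStorage<A>>(storage: &S) -> A::InputData {
|           storage.get_input_data()
|       }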
| mamcx wrote:
| > It can ever so slightly give one the impression that the
| rust community has decided that the language is mature and
| the only thing missing is faster compile times.
|
| That's not the case; it's that the features are now _good
| enough_ and compile times are the one major, big sore point.
|
| So, if you compare Rust to X you can make a _very good
| case_ until you hit:
|
| "... wait, Rust is THAT SLOW TO COMPILE?"
|
| ":(. Yes"
| wongarsu wrote:
| For the branch-switching use case you might get some mileage
| out of sccache [1]. For local storage it's just one binary
| and two lines of configuration to have a cache around rustc,
| so it's worth testing out.
|
| 1: https://github.com/mozilla/sccache
| Lewton wrote:
| > (30 seconds or a minute, or more, depending on the size of
| the project.)
|
| I'm working on a largeish modern java project using gradle,
| and this sounds great... Every time I start my server it
| takes 40 seconds just for gradle to find out that all the sub
| projects are up to date, nothing has been changed and no
| compilation is necessary...
| insanitybit wrote:
| It's a few things:
|
| 1. Clean builds can happen more often than some may think.
| CI/CD pipelines can end up with a lot of clean builds -
| especially if you use ephemeral instances (to save money), but
| even if you don't it's very likely.
|
| Even locally it can happen sometimes. For example, we used
| Docker to run builds. For various reasons the cache could get
| blown. Also, sometimes weird systemy things happen and 'cargo
| clean' fixes it, but you have to recompile from scratch. This
| can take 10+ minutes on a decent sized codebase.
|
| 2. On a large codebase even small changes can lead to long
| recompile times, especially if you want to run tests - cargo
| check won't be enough, you need to build.
| pjmlp wrote:
| Both, because sometimes a little change implies compiling
| world due to configuration changes.
|
| Also it is quite irritating sometimes seeing the same crate
| being compiled multiple times as it gets referenced from other
| crates.
|
| Ideally Rust could use a dumb compilation mode (or interpreter)
| for change-compile-debug cycles, and proper compilation for
| release, e.g. Haskell and OCaml offer such capabilities on
| their toolchains.
| titzer wrote:
| I primarily develop the Virgil compiler in interpreted mode
| (i.e. running the current source on the stable binary's
| interpreter). Loading and typechecking ~45kloc of compiler
| source takes 80ms, so it is effectively instantaneous.
| pjmlp wrote:
| Yeah it works great having toolchains that support all
| possible execution models.
| sitkack wrote:
| But you aren't on our timeline and haven't opted into our
| bullshit.
|
      | HTML+JS projects used to be testable with the load of a
      | web page, and that community has opted in to long build
      | times.
|
| Most people are so far away from flow-state that they can't
| even imagine another way of being.
| anuraaga wrote:
| cargo run is a command you'd generally use to actually get
| something running. In many cases that's not incremental
| development, which tends to focus on unit tests.
|
| FWIW cold builds of cargo (i.e., in Docker with no cache) are
| much slower than Go's, hanging for a long time on refreshing
| the crates.io index. I don't know exactly what that is doing,
| but I have a feeling it is implemented in a monolithic way
| rather than on-demand. Rust has had plenty of time to make this
| better, but cold cargo builds are still very slow, often
| spending minutes refreshing the crates index. Meanwhile Go
| misses easy optimizations, like avoiding a copy when creating
| strings from a byte slice.
|
| So it is what it is - Go makes explicit promises of fast
| compile times. Thanks to that, build scripts in go are pretty
| fast. Any language that doesn't make that explicit might be
| slow to compile and might run fast - that's totally fine and I
| would rather have two languages optimized to each case than one
| mediocre language.
| zozbot234 wrote:
| You don't even need a separate language, there's already a
| "fast" compiler for Rust based on cranelift which is used in
| debug builds by default.
| burntsushi wrote:
| Cranelift is not used for debug builds by default. I think
    | that's _probably_ a goal (although I'm not actually 100%
| sure about that just because I'm not dialed into what the
| compiler team is doing). Even the OP mentions this:
|
| > We were able to benchmark bjorn3's cranelift codegen
| backend on full crates as well as on the build dependencies
| specifically (since they're also built for cargo check
| builds, and are always built without optimizations): there
| were no issues, and it performed impressively. It's well on
| its way to becoming a viable alternative to the LLVM
| backend for debug builds.
|
| And the Cranelift codegen backend itself is also clear
| about it not being ready yet:
| https://github.com/bjorn3/rustc_codegen_cranelift
|
| (To be clear, I am super excited about using Cranelift for
| debug builds. I just want to clarify that it isn't actually
| used by default yet.)
| nicoburns wrote:
| The more immediate goal of "distribute the cranelift
| backend as a rustup component" has been making good
      | progress and seems like it might happen relatively soon:
      | https://github.com/bjorn3/rustc_codegen_cranelift/milestone/...
| burntsushi wrote:
| That's amazing. Thanks for that update. Can't wait.
| pjmlp wrote:
| Great news.
| dwattttt wrote:
| Refreshing the crates index has gotten quite slow because it
| currently downloads the entire index, regardless of which
| bits you need. There's a trial of a new protocol happening
| now, due for release in March, that should speed this up
  | (https://blog.rust-lang.org/inside-rust/2023/01/30/cargo-spar...)
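  |
  | (For reference, the opt-in described in that post is a
  | one-line cargo config setting, on a recent enough toolchain:)
  |
  |   # ~/.cargo/config.toml
  |   [registries.crates-io]
  |   protocol = "sparse"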
| dmm wrote:
| Incremental builds are what matter to me. On my 1240p, if I
| change one file, a build takes ~11s. Changing one file and
| running tests takes ~3.5s. That's all build time; the tests
| themselves run in <100ms.
|
| The incremental build performance seems to be really dependent
| on single-thread performance. An incremental build on a
| 2014-ish Haswell Xeon (E5-2660 v3) takes ~30s.
| kibwen wrote:
| _> On my 1240p if I change one file and build it takes ~11s
| to build. Changing one file and running tests takes ~3.5_
|
| `cargo test` and default `cargo build` use the same profile,
| so presumably the first number is referring to `cargo build
| --release`. Release builds deliberately forego compilation
| speed in favor of optimization. In practice, most of my
| development involves `cargo check`, which is much faster than
| `cargo build`.
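  |
  | (For reference, the rough trade-off between the three
  | commands:)
  |
  |   cargo check            # type-check only, no codegen: fastest
  |   cargo build            # unoptimized debug build (dev profile)
  |   cargo build --release  # optimized build: slowest to compile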
| dmm wrote:
| > so presumably the first number is referring to `cargo
| build --release`
|
| Both numbers are for debug builds. I don't know why `cargo
| test` is faster but I appreciate it.
|
    | Incremental release builds with `cargo build --release` are
    | even slower, taking ~35s on the 1240p.
| kibwen wrote:
| Honestly, it should be impossible; in the absence of some
| weird configuration, cargo test does strictly more work
| than cargo build. :P Can you reproduce it and file a bug?
| dureuill wrote:
        | I suspect most of that time is link time. Possibly the
        | linker in use is not very parallel, so linking one big
        | executable with cargo build takes longer than linking
        | many smaller test executables, which actually can be
        | done in parallel?
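        |
        | (If linking does dominate, a commonly suggested
        | mitigation is a faster linker; a sketch for Linux,
        | assuming clang and lld are installed:)
        |
        |   # ~/.cargo/config.toml
        |   [target.x86_64-unknown-linux-gnu]
        |   linker = "clang"
        |   rustflags = ["-C", "link-arg=-fuse-ld=lld"]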
| zozbot234 wrote:
| Practically all Rust crates make heavy use of monomorphized
| generics, so every use of them in a new project is bespoke and
| has to be compiled on the spot. This is very different from how
| Go or Node work. You _could_ compile the non-monomorphic
| portions of a Rust crate into a C-compatible system library
| (with a thin, header-like wrapper to translate across ABIs),
| but in practice it wouldn't amount to much.
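|
| (For illustration, a minimal sketch of why this happens; the
| function is hypothetical:)
|
|   // A generic function in a library crate. No machine code is
|   // emitted for it until a downstream crate instantiates it
|   // with concrete types.
|   pub fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
|       let mut max = items[0];
|       for &item in &items[1..] {
|           if item > max {
|               max = item;
|           }
|       }
|       max
|   }
|
|   fn main() {
|       // Each distinct instantiation is monomorphized and
|       // compiled here, in the using crate: largest::<i32> and
|       // largest::<f64> are two separate machine-code bodies.
|       println!("{}", largest(&[1, 5, 3]));
|       println!("{}", largest(&[1.0, 0.5]));
|   }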
| Tobu wrote:
| Some popular proc-macros could be pre-compiled and
| distributed as WASM, and it would be impactful, since they
| tend to bottleneck the early parts of a project build.
  | However, I don't think that could be made entirely
  | transparent, because right now there's a combinatorial
  | explosion of possible syn feature sets. For now I avoid
  | depending on syn/quote if I can.
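  |
  | (For illustration, each proc-macro crate picks its own subset
  | of syn's cargo features, so a single prebuilt artifact can't
  | serve every combination; the manifest below is hypothetical:)
  |
  |   # Cargo.toml of some proc-macro crate
  |   [dependencies]
  |   quote = "1"
  |
  |   [dependencies.syn]
  |   version = "1"
  |   default-features = false
  |   features = ["parsing", "proc-macro"]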
| conaclos wrote:
| I contributed to the Rome tools [1], and the build takes more
| than 1 min. This makes the write-build-test loop frustrating...
| So frustrating that I'm hesitant to start a project in Rust...
|
| My machine is 7 years old. People tell me to buy a new one just
| for compiling a Rust project... That's ecologically
| questionable.
|
| [1] https://rome.tools/
| TheDesolate0 wrote:
| [dead]
| Bjartr wrote:
| It's funny, any other post on HN about improvements to Rust I've
| seen are chock full of comments to the effect of "I guess that
| feature is nice, but when will they improve the compile times?"
| And now many of the replies to this post are "Faster compiles are
| nice, but when will they improve/implement important features?"
|
| The Rust dev team can't win!
| marcosdumay wrote:
  | That's what winning looks like.
| capableweb wrote:
  | 1) HN is maybe one organism if you zoom out far enough, but
  | it consists of people with wildly different opinions; you'll
  | have capitalists arguing with anarchists here, and any post
  | is bound to have both sides, no sides and every side, all on
  | the same page.
  | 2) It's somehow easier to complain about stuff. Not sure why,
  | or if it's extra prominent on HN in particular, but people
  | tend to start by thinking "Why am I against this thing?" and
  | then write their thoughts, rather than "Why do I like this
  | thing?". Maybe writing something that can be challenged
  | drives more engagement, and people like when others engage
  | with them, so they implicitly learn to be that way.
| jchw wrote:
| I think pessimistic and cynical reactions are the literal
| lifeblood of news aggregator comments sections. It's been
| like this for as long as I can remember across as many
| aggregators as I've ever used.
|
    | Part of the problem is that news aggregators reward people
    | who comment early, and the earliest comments are the
    | kneejerk reactions where you braindump thoughts you've had
    | brewing but don't have anywhere else to put. (Probably
    | without actually clicking through.)
| smolder wrote:
      | Another part of it is just psychology. People seem much
      | more inclined to join discourse to make objections than
      | to pile on affirmative comments, for which an upvote
      | generally suffices.
| yadoomerta wrote:
        | It's also partly the site's culture. I'm not saying
        | that's wrong, since such comments add some noise and
        | not much new info, but I've been downvoted before for
        | posting comments like "Thanks for saying this!"
| nindalf wrote:
| I agree in general except HN also rewards quality a bit
| more. All new comments get a few minutes to bask at the top
| of their subthreads. So a really good, late comment can
| still get to the top and stay there.
| jchw wrote:
| To a degree, the comment ranking algorithm helps, though
| long/fast threads do often leave brand new replies buried
| upon posting.
|
| Still, I believe what makes HN unusually nice is just the
| stellar moderation. It is definitely imperfect, but it
| creates a nice atmosphere that I think ultimately does
| encourage people to try to be civil, even though places
| like these definitely have a tendency to bring out the
| worst in people. Having a deft touch with moderation is
| very hard nowadays, especially with increasingly
| difficult demands put against moderators and absolutely
| every single possible subject matter turning into a
| miniature culture war (how in the hell do you turn the
| discussion of gas ranges vs electric ranges into a
| culture war?!) and the unmoderated hellscapes of the
| Internet wrongly painting all lightweight moderation with
| a black mark.
|
| I definitely fear for the future of communities like HN,
      | because the pressure from increasingly vile malicious
      | actors, as well as the counteracting pressure from others
      | to moderate harder, stronger and faster, will eventually
| break the sustainability of this sort of community. When
| I first joined HN, a lot of communities on the Internet
| felt like this. Now, I know of very few.
| jerf wrote:
| This is when it is important not to model the comment section
| as some sort of single composite individual.
|
| Since it is impossible to mentally model them as the number of
| humans they are, I find it helpful to model them as at least a
| few very distinct individuals, or sometimes just as an
| amorphous philosophical gas that will expand to fill all
| available comments, where the only question is really with what
| distribution rather than _whether_ a given point will be
| occupied.
| [deleted]
| boredumb wrote:
  | Rust is amazing. I truly believe a large number of people are
  | intimidated by it and so go out of their way to shit on it
  | and pretend it's only for some niche IoT device... when it's
  | just as easy to write a full CRUD application in Rust as in
  | any other language at this point.
| [deleted]
| thechao wrote:
| I used to be a grad student adjacent to Bjarne Stroustrup; he
| has a quip he's used a bunch: if no one's complaining, no one
| cares.
|
| I see all of these complaints -- both the volume, and the count
| -- as great indicators of Rust's total health.
| guhidalg wrote:
| So true. I use a similar quip at work: "Take the shortcut and
| if we're not out of business when it becomes an issue, then
| it will be a good problem to have".
| rascul wrote:
| I guess that depends on the work. By day I repair houses
| and taking shortcuts can mean I could have an even bigger
| problem to solve in a few months. Luckily I have yet to be
| in such a situation myself but I've fixed other's shortcuts
| a number of times.
| guhidalg wrote:
        | Yes, I should specify I'm talking about software
        | development. Physical products and work rarely have the
        | luxury of making mistakes or taking shortcuts; the
        | universe is a harsh place.
| bufo wrote:
| "There are possible improvements still to be made on bigger
| buffers for example, where we could make better use of SIMD, but
| at the moment rustc still targets baseline x86-64 CPUs (SSE2) so
| that's a work item left for the future."
|
| I don't understand this. The vast majority (I would guess 95%+)
| of people using Rust have CPUs with AVX2 or NEON. Why is that a
| good reason? Why can't there be a fast path with a slow path as
| a fallback?
| pjmlp wrote:
  | Because it requires some kind of fat binary.
  |
  | Some C and C++ compilers offer this, and it requires some
  | infrastructure to make it happen (the simd attribute in GCC),
  | or explicitly loading different kinds of dynamic libraries.
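  |
  | (That said, individual Rust libraries can already do the
  | fast-path/slow-path split themselves via runtime CPU feature
  | detection; a minimal sketch using the standard
  | is_x86_feature_detected! macro, with a hypothetical sum
  | function:)
  |
  |   // The binary still targets baseline x86-64; the AVX2 path
  |   // is only taken if the running CPU supports it.
  |   fn sum(xs: &[f32]) -> f32 {
  |       #[cfg(target_arch = "x86_64")]
  |       {
  |           if is_x86_feature_detected!("avx2") {
  |               // SAFETY: the AVX2 check above makes this call
  |               // sound.
  |               return unsafe { sum_avx2(xs) };
  |           }
  |       }
  |       sum_fallback(xs)
  |   }
  |
  |   #[cfg(target_arch = "x86_64")]
  |   #[target_feature(enable = "avx2")]
  |   unsafe fn sum_avx2(xs: &[f32]) -> f32 {
  |       // With AVX2 enabled, the compiler may auto-vectorize
  |       // this body.
  |       xs.iter().sum()
  |   }
  |
  |   fn sum_fallback(xs: &[f32]) -> f32 {
  |       xs.iter().sum()
  |   }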
___________________________________________________________________
(page generated 2023-02-03 23:01 UTC)