[HN Gopher] Compiling Rust is NP-hard
___________________________________________________________________
Compiling Rust is NP-hard
Author : lovasoa
Score : 367 points
Date : 2021-07-08 09:10 UTC (13 hours ago)
(HTM) web link (niedzejkob.p4.team)
(TXT) w3m dump (niedzejkob.p4.team)
| FartyMcFarter wrote:
| > What's the most efficient algorithm you can write to do this?
| As it turns out, you can't do well in the worst case -- and we'll
| prove this by a reduction from SAT.
|
| Correction: you can't do well in the worst case, assuming P ≠
| NP (where "well" means "in polynomial runtime").
| gizmo wrote:
| Pedantic and wrong. Considering there is no known algorithm
| that can sat solve in P, nobody in the world can do well in the
| worst case. And if nobody can do it, then you can't do it
| either, for any you. QED.
| FartyMcFarter wrote:
| I'm not a native English speaker, but I thought "you can't do
| well" is equivalent to "it's impossible to do well", which is
| how I read the original sentence.
|
| As an aside, I find it a bit ironic that you're complaining
| about pedantry while interpreting the "you" word literally.
| gizmo wrote:
| It's not ironic that my response was pedantic, it was the
| entire point. When you engage in pedantic nerd sniping you
| just open yourself up for even more pedantic nerd sniping.
|
| "Can't" can mean impossible, or it can mean nobody can do
| it. Compare "you can't go back in time" with "you can't
| lift a horse".
| gugagore wrote:
| It's awkward to abbreviate "in polynomial time" as "in P". P
| is a set of (decision) problems. We don't know that 3-SAT is
| in P. We don't know that it's not.
|
| To say that a solver is not in P is a type error.
| OgAstorga wrote:
| Your "proof" it's more like. No one has ever stepped on Mars.
| Therefore no one will.
|
| Yet another pedantic fallacy.
| __s wrote:
| Not that no one will, just that no one can
|
| I give you until the end of today to show me someone
| (anyone!) step on Mars
|
| It was just someone being cheeky about someone being
| cheeky. Now I'm being cheeky about you being cheeky over
| someone being cheeky about someone being cheeky
| sischoel wrote:
| While this is an interesting result, isn't basically every
| compiler NP-hard, because of register allocation?
| srl wrote:
| Compilers don't guarantee (or try to find) the absolute optimal
| allocation, right?
| WJW wrote:
| This depends on the compiler of course. Most will not because
| it is not worth it, but I expect that there is at least one
| compiler out there with the option to exhaustively search the
| allocation space for that very last nanosecond of runtime
| optimization. Probably inside an HFT firm somewhere.
| chrisseaton wrote:
| No - you can express register allocation as an NP-hard problem
| if you want to, but you don't have to and not all compilers do
| it like that. You don't need an optimal register allocation -
| you get into diminishing returns very quickly.
| sischoel wrote:
| Yes, I hadn't thought about that before, but of course the
| difference here is that the Rust compiler cannot just produce
| a "good enough" result for pattern matching.
| kadoban wrote:
| Well, it probably _could_ do "good enough". If a set of
| matches gets too large/hairy it could just output a "here
| be dragons" warning and go ahead with compiling it, no?
|
| That may not count as a correct Rust compiler though, I'm
| not 100% sure how that's defined, or if it is.
| chrisseaton wrote:
| Register allocation is a great example of over-
| theorisation. People found that register allocation can be
      | expressed as graph colouring, so they doubled down and
| worked really hard to solve that pure problem, but then it
| turned out you can just do a very simple linear allocation
| heuristic and the generated code is basically as fast as
| solving the NP-hard problem.
|
| https://dl.acm.org/doi/abs/10.1145/330249.330250
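      | For concreteness, a toy sketch of a linear-scan style allocator
      | (the intervals, register count and names below are made up, and a
      | real allocator would pick a spill victim instead of always
      | spilling the incoming interval):
      |
      |     #[derive(Clone, Copy)]
      |     struct Interval { var: char, start: u32, end: u32 }
      |
      |     fn linear_scan(mut intervals: Vec<Interval>, num_regs: usize)
      |         -> Vec<(char, String)> {
      |         intervals.sort_by_key(|iv| iv.start);
      |         let mut active: Vec<(Interval, usize)> = Vec::new();
      |         let mut free: Vec<usize> = (0..num_regs).rev().collect();
      |         let mut out = Vec::new();
      |         for iv in intervals {
      |             // Expire intervals that ended before this one starts.
      |             active.retain(|(a, reg)| {
      |                 if a.end < iv.start { free.push(*reg); false } else { true }
      |             });
      |             if let Some(reg) = free.pop() {
      |                 out.push((iv.var, format!("r{}", reg)));
      |                 active.push((iv, reg));
      |             } else {
      |                 // No register free: spill to the stack.
      |                 out.push((iv.var, "stack".to_string()));
      |             }
      |         }
      |         out
      |     }
      |
      |     fn main() {
      |         let intervals = vec![
      |             Interval { var: 'a', start: 0, end: 4 },
      |             Interval { var: 'b', start: 1, end: 3 },
      |             Interval { var: 'c', start: 2, end: 6 },
      |             Interval { var: 'd', start: 5, end: 7 },
      |         ];
      |         for (var, loc) in linear_scan(intervals, 2) {
      |             println!("{var} -> {loc}");
      |         }
      |     }
      |
      | One pass over the sorted intervals, no graph colouring, and the
      | result is usually close enough to optimal in practice.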
| imtringued wrote:
| Well, the problem for the NP-hard algorithm is that the
        | baseline is already very efficient. It's like branch
        | prediction. Even the dumbest predictors can be 90%
        | accurate. The perfect solution doesn't have much room to
| be better. I'd say there are more microsecond level
| optimization problems left than on the nanosecond or
| millisecond level.
| adrianN wrote:
  | If you do an SSA transform of your program, the register
  | interference graphs are chordal. It turns out that coloring
  | chordal graphs is easy:
| http://web.cs.ucla.edu/~palsberg/course/cs232/papers/HackGoo...
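  | As a small illustration (the graph and ordering below are made up):
  | for a chordal graph, greedily colouring the vertices along a perfect
  | elimination ordering already uses the minimum number of colours, so
  | no search is needed.
  |
  |     use std::collections::HashMap;
  |
  |     fn color(order: &[usize], adj: &HashMap<usize, Vec<usize>>)
  |         -> HashMap<usize, usize> {
  |         let mut colors: HashMap<usize, usize> = HashMap::new();
  |         // Walk the perfect elimination ordering in reverse, giving each
  |         // vertex the smallest colour not used by its coloured neighbours.
  |         for &v in order.iter().rev() {
  |             let used: Vec<usize> = adj[&v]
  |                 .iter()
  |                 .filter_map(|n| colors.get(n).copied())
  |                 .collect();
  |             let c = (0..).find(|c| !used.contains(c)).unwrap();
  |             colors.insert(v, c);
  |         }
  |         colors
  |     }
  |
  |     fn main() {
  |         // Triangle {0, 1, 2} plus vertex 3 hanging off vertex 2;
  |         // 3, 0, 1, 2 is a perfect elimination ordering for it.
  |         let adj = HashMap::from([
  |             (0, vec![1, 2]),
  |             (1, vec![0, 2]),
  |             (2, vec![0, 1, 3]),
  |             (3, vec![2]),
  |         ]);
  |         println!("{:?}", color(&[3, 0, 1, 2], &adj));
  |     }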
| antientropic wrote:
| Or worse than NP-hard: C++ template instantiation and the C
| preprocessor are Turing complete. (Except that I believe C++
| requires some fixed limit on the maximum instantiation depth,
| but even so it's still exponential complexity.)
| rocqua wrote:
  | I believe you can encode paradoxes in C++ templating.
  | Something about two specializations which are mutually
  | impossible because they reference each other. So you can say:
  |     A is of type T <=> B is not of type T
  |     B is of type T <=> A is not of type T
|
| Where <=> is if and only if.
|
| In which case it is paradoxical to say that A is of type T.
| But also paradoxical to say that A is not of type T.
| MauranKilom wrote:
| Note that "a template for which no valid instantiation is
| possible" is actually ill-formed, no diagnostic required.
|
| That said, it's not clear to me what part of "you can write
| uninstantiable templates" is surprising, so I'm probably
| not understanding correctly. Maybe you could find the C++
| code you're talking about? I would be interested.
| Banana699 wrote:
| >the C preprocessor are Turing complete
|
  | That's surprising; the C preprocessor is pretty crippled.
  | Its creator intentionally made it so that people wouldn't abuse
  | it to create full-blown mini languages.
  |
  | Did you get things mixed up, or is there actually a way the C
  | preprocessor can be Turing-complete?
| namibj wrote:
    | CPP needs to feed back into itself (pipe stdout into stdin) if
    | you want it to be Turing-complete.
| gpderetta wrote:
| I believe that while you can't encode an infinite tape,
| you can very easily define an arbitrarily large tape, so,
| while not technically Turing complete, it is close
| enough.
| mtlmtlmtlmtl wrote:
    | Not sure whether this proves Turing completeness of the
    | preprocessor or not (probably not), but Simon Tatham has a
    | great (and somewhat disturbing) write-up[1] on creating
| arbitrary control structures using macros in C. He even
| goes as far as providing a library of helper macros for
| doing things like making gensyms, etc.
|
| If nothing else, it does demonstrate that you can do a lot
| more in the preprocessor than you might think, and this is
| at least a significant subset of what true MP macros are
| used for in Lisp most of the time. Except much gnarlier...
|
| [1] https://www.chiark.greenend.org.uk/~sgtatham/mp/
| srl wrote:
| TFA notes that rustc is not a particularly _efficient_ SAT solver,
| taking something like 10^4 times as long as a commonly used
| algorithm [1] would on a small example.
|
| So, I'm curious: what's the _strongest_ case one could make for
| including a more robust SAT solver as part of the exhaustiveness
| checking? Is there any reasonable place where this could matter?
| (Extremely autogenerated code?)
|
| [1] https://en.wikipedia.org/wiki/DPLL_algorithm
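| For readers who skipped TFA, the construction is roughly this (the
| formula below is a made-up example): each match arm covers exactly the
| assignments that falsify one clause, so the match is exhaustive if and
| only if the formula is unsatisfiable -- which is what the exhaustiveness
| checker has to decide.
|
|     // Encodes (x1 ∨ x2) ∧ ¬x1 ∧ ¬x2, which is unsatisfiable.
|     fn encode(assignment: (bool, bool)) {
|         match assignment {
|             (false, false) => {} // falsifies (x1 ∨ x2)
|             (true, _) => {}      // falsifies ¬x1
|             (_, true) => {}      // falsifies ¬x2
|             // No assignment escapes all three arms, so rustc accepts
|             // this as exhaustive. For a satisfiable formula it would
|             // instead report the uncovered pattern, i.e. a satisfying
|             // assignment.
|         }
|     }
|
|     fn main() { encode((true, false)); }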
| trissylegs wrote:
  | Rust's proc-macro system makes it pretty easy to create some
  | hard-to-compile code. But there are many ways to generate very
  | slow-compiling code with proc macros.
| Quekid5 wrote:
| One perhaps-interesting use case might be to use the fastest
| possible solver in the case of compiling generated/foreign
| code, so basically code that's expected to always compile (but
| which still needs to be checked for safety).
|
| One could even imagine using it as a first-pass for code and to
| just fall back to a slower-but-with-better-error-messages type
| checker if errors are found.
|
| (Of course you'd end up with two orthogonal implementations of
| the type checker, but that might not even be a bad thing since
| you'd be easily able to check for inconsistencies between them,
| aka. type checker bugs.)
| ygra wrote:
| I was under the impression that cases where it would matter are
| typically code that wouldn't usually be written. E.g. C# and
| Java also can solve SAT as part of overload resolution with
| generic types and it's an interesting peculiarity, but not a
| pattern that is interesting enough to optimize. Perhaps that's
| the case here as well.
|
  | I could also imagine that compiler code being easy to read,
  | understand and modify is often a more valuable trait than
  | compilation speed on highly atypical patterns in a language.
  | But that may well depend on _how_ atypical those are.
| johncolanduoni wrote:
| The fact that most dependently typed languages (which have a
| much higher propensity for complicated exhaustiveness checking
| when pattern matching) don't use a serious SAT solver for this
| makes me doubt Rust would ever need to.
| sanity31415 wrote:
| Doesn't Liquid Haskell[1] use a SAT solver?
|
| [1] https://ucsd-progsys.github.io/liquidhaskell-blog/
| johncolanduoni wrote:
| Like most languages with refinement types it uses an SMT
| solver (a more general problem), though I'm not sure it
| uses it for pattern matching. I was thinking of Coq, Agda,
| Lean, etc. which are full-spectrum dependently typed
| languages. LiquidHaskell is more specialized, though its
| automated proof capabilities are much stronger.
| gpm wrote:
| If you were using rustc as some sort of JIT compiler compile
| times could become extremely important, and error messages less
| important, at which point things like this might be worth it.
|
| If you wanted the SAT solver for some other reason
| (optimization? proving code correct?) it might make sense to
| use it for this too.
| multifasciatus wrote:
| > Obviously, every formula can be transformed into this form, by
| applying the rules of Boolean algebra.
|
| Is it so obvious? Why do so many engineers write with these sorts
| of modifiers (obviously, just, of course, etc.)?
|
| Edit: I am assuming no ill intent from the author. I also use
| these sorts of modifiers without thinking about them much when
| writing. I am just curious as to why we as engineers call deeply
| technical things obvious when they are not.
| rgoulter wrote:
  | I liked the model in this blog post:
  | https://status451.com/2016/01/06/splain-it-to-me/
  |
  | It discusses that there can be different ways sharing
  | information comes across. For some people, sharing information
  | is a demonstration that you're smart enough to know the thing.
  | For others, sharing information is for the others' benefit.
|
  | I'd prefer not to interpret "obviously" as "you're not as
  | smart/cool as me if you don't know this", but I can see why
  | there are examples that could come across like that. (I'd
  | sooner interpret it as "I'm not claiming to be smart by saying
  | this".)
| bombela wrote:
| I don't think it's any malice. Just a way to bring up the idea
| of a sudden realization. Obviously, I could be wrong.
| bruce343434 wrote:
| Roughly related: why not simply[1]
|
| [1] https://news.ycombinator.com/item?id=27415725
| NieDzejkob wrote:
| That sounds like something that could be solved with a linter
| for prose. Though, I can't think of the search terms that would
| let me find such a solution, if it exists.
| guga42k wrote:
  | Usually this is a polite way to say that the proof/derivation is
  | too long to fit on this page; please look somewhere else if you
  | are interested.
|
  | A notable example would be Lev Landau's Course of Theoretical
  | Physics. If you see "obviously" there, it is probably not obvious
  | at all and can take a good few hours to derive the formula.
| joshribakoff wrote:
| The good faith interpretation is that the author means in the
| sense of "it follows then that" as opposed to "if you didn't
| realize this upon reading the previous section then something
| is wrong"
| EdSchouten wrote:
| It's only NP-hard if you want the compiler to give you proper
| warnings/errors, right? As far as I know, the first matching arm
| is used. This means that if a compiler doesn't do any analysis on
| it, but naively generates machine code to compare the value, then
| there is no complexity at all.
|
| (Not saying that that's acceptable...)
| tialaramex wrote:
| Here's what Rust's compiler actually does today:
|
| https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_bu...
|
| Rust lets you say either "There are four specific possible
| Clowns, and I want everybody to explicitly deal with that" or
| you can write #[non_exhaustive] to say "There are several
| Clowns, right now I defined four of them, but maybe I'll add
| more some day" in which case _you_ are allowed to only write
| code for those four Clowns, but everybody _else_ is obliged by
| the compiler to handle the case with more Clowns than that
  | since they can't depend on there only being four, and they
| have no idea how to name the additional ones that don't yet
| exist.
|
| So, a compiler that just punts is not a correct Rust compiler.
| It needs to do this analysis or it's compiling a different less
| useful language.
|
| In the real world the (default) exhaustive case is great for
| types that reflect some truth about the world being modelled,
| where you really do mean that all the code relying on this
| enumerated type should refuse to compile and needs re-
| engineering if your list is subsequently discovered not to
  | really be exhaustive. Teapot notwithstanding, there are
  | definitely only five PlatonicSolids, such as Tetrahedron and
  | Cube. The non-exhaustive case is great for types that reflect a
  | truth you know or strongly suspect will change, where everybody
  | using your system needs to be prepared for that up front. In
  | 2016 the enumerated list of US FederalHolidays didn't need
  | Juneteenth, but in 2021 it does; we don't want random code to
  | blow up because it had never imagined Congress would make new
  | ones.
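  | A minimal sketch of the opt-out mode (names made up; the match is
  | meant to live in a different crate from the enum definition):
  |
  |     // Default enums are exhaustive: downstream code may match all
  |     // variants with no wildcard arm, and adding a variant later is
  |     // a breaking change. With #[non_exhaustive], more variants may
  |     // appear without a breaking change, so matches in *other*
  |     // crates must include a wildcard arm.
  |     #[non_exhaustive]
  |     pub enum FederalHoliday {
  |         NewYearsDay,
  |         IndependenceDay,
  |         Juneteenth,
  |         Thanksgiving,
  |         Christmas,
  |     }
  |
  |     // In a downstream crate:
  |     fn is_summer(h: &FederalHoliday) -> bool {
  |         match h {
  |             FederalHoliday::IndependenceDay | FederalHoliday::Juneteenth => true,
  |             // Mandatory here: the upstream crate may add holidays
  |             // in a future release.
  |             _ => false,
  |         }
  |     }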
| EdSchouten wrote:
| Sure. But a naive compiler whose only purpose is to stomp out
| a poorly optimized executable can safely ignore that
| annotation, right?
| richardwhiuk wrote:
| No, because checking that it is exhaustive is required by
| the language definition.
|
| It's not just not optimized, it's wrong.
| tialaramex wrote:
| You could instead require everything to be treated as non-
| exhaustive in your cheap compiler. This means it would
| reject correct Rust code that explicitly handles each of
| the five PlatonicSolids differently, because that code
| lacks a way to handle a sixth (non-existent) PlatonicSolid.
|
| You can't just go "Eh, shrug emoji" without deciding what
| happens when you don't match.
|
| Suppose you silently treat the first (or last) handled case
| as the default, so I can write code that only handles Cubes
| and unlike the real Rust compiler your naive compiler spits
| out a program. But wait, now what happens for a
| Tetrahedron? I defined Cubes to have a 32-bit floating
| point size, but for whatever reason Tetrahedrons have a
| 16-bit integer length instead. Does your compiler just try
| to treat the 16-bit integer as a 32-bit floating point
| number since it assumes this was a Cube?
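      | Spelled out with made-up types, the situation is roughly:
      |
      |     enum Solid {
      |         Cube { size: f32 },
      |         Tetrahedron { length: i16 },
      |     }
      |
      |     fn size_of(s: Solid) -> f32 {
      |         match s {
      |             Solid::Cube { size } => size,
      |             // Delete this arm and rustc rejects the match as
      |             // non-exhaustive. A compiler that silently fell
      |             // through to the Cube arm instead would be reading
      |             // an i16 payload as if it were an f32.
      |             Solid::Tetrahedron { length } => length as f32,
      |         }
      |     }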
| kevincox wrote:
| Yes, that annotation doesn't affect the codegen of a
| correct program (optimizations aside).
| NieDzejkob wrote:
| It still needs to ensure that one of the arms will always
| match. (just not executing any branch in that case is tricky,
| because the arms can return values)
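  | Concretely (made-up example): in
  |
  |     fn describe(x: Option<u8>) -> &'static str {
  |         match x {
  |             Some(0) => "zero",
  |             Some(_) => "nonzero",
  |             None => "nothing",
  |         }
  |     }
  |
  | the match is the function's return value, so there is no sensible
  | "do nothing" fallback; rustc has to prove the arms cover every
  | possible Option<u8> before it can accept the function.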
| remram wrote:
| It does not need to check, if it _assumes_ it. mrustc works
| that way, it is not trying to check code, only compile valid
| code: https://github.com/thepowersgang/mrustc
|
    | This is similar to C/C++'s "undefined behavior". There are
    | things you can write that aren't valid for the language, but
    | that compilers are not required to catch.
| kadoban wrote:
| I had not thought of pattern-exhaustiveness checking like that.
|
| Isn't compiling several languages, including Rust, actually
| undecidable though? Seems like if your type system or
| metaprogramming systems are Turing complete, this has to be the
| case.
|
| So that's ~worse than NP-hard already (though calling it NP-hard
| is still technically correct).
| trissylegs wrote:
  | I think in Haskell you need to add:
  |     {-# LANGUAGE UndecidableInstances #-}
  | to get an undecidable type system.
| arianvanp wrote:
| There are other language extensions to make the haskell type
| system undecidable. The classic example is `RankNTypes`
|
    | E.g. you can prove that type inference for rank n >= 3 types is
    | undecidable. [0]
|
    | I think that's why Haskell has both Rank2Types and RankNTypes
    | as language extensions. So you can still use higher-rank types
    | up to rank 2 without running into decidability problems.
|
| [0] https://dl.acm.org/doi/10.1145/182409.182456
| arianvanp wrote:
  | Type checking / compiling is often conservative for this
  | purpose. It's sound (Rejects all incorrect programs),
  | decidable (There is an algorithm to reject all incorrect
  | programs) but due to Godel's decidability problem they're
  | incomplete (There are always correct programs which the type
  | checker rejects as incorrect).
|
  | There are exceptions of course (C++ templating is Turing
  | complete, so type-checking is undecidable) but often you want
  | your compiler to be sound and decidable.
|
| Also see https://3fx.ch/typing-is-hard.html
|
  | (edit: Apparently, as the link I posted shows, Rust is indeed
  | undecidable in certain edge cases. However you can usually
  | reason about a subset of the language (restrict yourself to
  | certain features) that is still decidable. I have the feeling
  | Rust might have such a subset but it's just a fuzzy feeling.)
| thaumasiotes wrote:
| > It's sound (Rejects all incorrect programs), decidable
| (There is an algorithm to reject all incorrect programs) but
| due to Godel's decidability problem they're incomplete (There
| are always correct programs which the type checker rejects as
| incorrect).
|
| Isn't that the definition of "semidecidable" as distinct from
| "decidable"? A decidable question is one where you can
| determine the answer, whatever that answer may be. A
| semidecidable question is one where, if the answer is yes,
| you can determine that, and if the answer is no, you may not
| be able to determine that.
|
| If you can always show that a program is incorrect, but you
| can't necessarily show that a program is correct, then the
| correctness of a program isn't decidable.
| arianvanp wrote:
      | Decidability here means "Can we come to an answer in a
      | known number of steps?". If you have an algorithm for type
      | checking, then that answer is yes! You're _always_ able to
      | come up with an answer that is yes or no.
      |
      | For example the following algorithm for typechecking is
      | decidable:
      |     typechecks :: Program -> Bool
      |     typechecks _ = False
      |
      | We always come to a decision. There is no "semi" here. Just a
      | clear yes/no answer.
|
| The algorithm is even sound. It rejects all incorrect
| programs.
|
| However it is not complete. It might also reject correct
| programs (In this case it rejects all correct programs).
|
      | Of course you want a typechecker that is as little
      | conservative as possible without getting into trouble. But
      | it's always conservative to some extent. Preferably you would
      | be both sound _and_ complete. But the problem is that any
      | reasonable logic system can't prove its own consistency. Hence
      | we can't have both. However we can get "as close as we
      | possibly can".
| kmill wrote:
| I believe you're explaining what a computable function
| is. A decidable set S of a type T is defined to be one
| for which there is a computable function f : T -> Bool
| such that the set of values x with f x = true is equal to
| S. A semidecidable set S of a type T is defined to be one
| for which there is a partial computable function f : T ->
| Bool such that whenever x is in S, then f x is computable
| and equals true.
|
| Given a definition for a well-typed program, you get a
| set W of Program of the well-typed programs. You can ask
| whether W is decidable or semidecidable -- i.e., whether
| there is some (partial) computable function f : Program
| -> Bool with the above properties.
|
| The example you give is certainly computable, but it has
| nothing to do with the set W, so it says nothing about
| (semi)decidability of the typechecking decision problem.
|
        | However, it is a valid goal to try to find the largest
        | subset of W that is (semi)decidable. Or to redefine what
        | it means to be well-typed so that it is (semi)decidable!
|
| One interesting case of a programming language without a
| semidecidable typechecking problem is Lean. The
| typechecker they've implemented will both reject or time
| out on valid programs (so the typechecker is at least a
| computable function in a sense), but in practice these
| cases don't really occur.
| thaumasiotes wrote:
| > We always come to decision. There is no "semi" here.
|
| That's not really how I understand the "semi" in the
| terminology. For a decidable question, you can recognize
| when the answer is yes, and you can recognize when the
| answer is no. For a semidecidable question, you can
| recognize when the answer is yes.
|
| It doesn't take much of a stretch to describe that as
| cutting your capabilities in "half".
|
| The way I usually think about this is that with full
| decidability, you can rule something in or out. With
| semidecidability, you can rule something in, but you
| can't rule things out.
|
| That framework extends well to the situation described
| above, where if the compiler chooses to compile a
| program, then the typing in that program was valid, and
| if the compiler chooses to complain about the typing, we
| can't draw any conclusions. It doesn't match your
| example; you can't rule anything in or out.
| cedilla wrote:
      | Semi-decidable means that it will accept all correct formulas,
      | and either reject incorrect formulas or give no answer.
      | It's the same as having a generator that eventually generates
      | every correct formula.
      |
      | In my opinion, OP is incorrect and those systems are sound
      | but undecidable.
| thaumasiotes wrote:
| > Semi-decidable means that will accept all correct
| formulas, and either rejects incorrect formulas or gives
| no answer.
|
| By this definition, program incorrectness isn't
| semidecidable either - the compiler will accept all
| correct [incorrect] formulas, and will either reject or
| accept incorrect [correct] formulas.
| ThreeFx wrote:
| Not really, because for undecidable problems
| semidecidability can only hold one-way. If it held in
| both ways then the language would be decidable.
|
          | Take the halting problem for example: For a program p
          | which halts, you can determine whether it will halt in
          | finite time (because by definition it halts). However,
          | there is no semidecision procedure for "this program
          | runs infinitely long".
| thaumasiotes wrote:
| I don't understand what you're trying to say. What are
| you contradicting with "Not really"? The claim above you
| is "by this definition, program incorrectness isn't
| semidecidable either". You're saying that it is?
| LadyCailin wrote:
| Sort of a side track, but there's a great layman's intro to
    | this in this Veritasium video
| https://m.youtube.com/watch?v=HeQX2HjkcNo
| jcelerier wrote:
| > There are exceptions of course
|
| for a definition of "exceptions" being "virtually all
| languages with a non-trivial static type system and non-zero
| userbase" as shown in your link.
|
    | Java (https://arxiv.org/abs/1605.05274),
    | C# (https://blog.hediet.de/post/how-to-stress-the-csharp-compile...),
    | Scala (https://michid.wordpress.com/2010/01/29/scala-type-level-enc...),
    | Haskell,
    | Rust (https://sdleffler.github.io/RustTypeSystemTuringComplete/),
    | Typescript (https://github.com/Microsoft/TypeScript/issues/14833) ...
|
| Having a type system with generics and not being turing-
| complete as a result _is_ the exception, not the rule.
| octachron wrote:
| Generics is not really the right cutting point: parametric
| polymorphism is perfectly decidable. However, few languages
| stop at just parametric polymorphism. For instance, the
| OCaml issue in the linked blog post happens at the module
| level, which mixes parametric polymorphism and subtyping.
| And typically, while SML has parametric polymorphism, it
| lacks the OCaml extension that makes the module type system
| undecidable.
| gnulinux wrote:
      | For an example of the contrary, one can look at Agda. It is
      | not Turing complete, but a useful language (I personally wrote
      | many parsers in it, and its Haskell FFI is pretty good, so
      | you can essentially write the unsafe parts of your programs in
      | Haskell and the safe parts in Agda, then prove correctness).
| arianvanp wrote:
      | Haskell's System F core is definitely decidable.
      | You need to opt in to undecidable features like RankNTypes
      | (n >= 3) explicitly (which are GHC-specific, not part of
      | Haskell proper).
      |
      | Even then, undecidability is a bit of a spectrum, e.g.
      | RankNTypes become sort of decidable again in the presence
      | of explicit type annotations.
| Zababa wrote:
| Rust's type system is undecidable. You can see a few languages
| listed here: https://3fx.ch/typing-is-hard.html
| silon42 wrote:
| I believe there are limits to recursion (as mentioned in link
| above), so not strictly true. Perhaps they are too big?
| kibwen wrote:
| Rust's recursion limit is fairly low, I believe it's 128.
| However, you can override this if you find it too
| restrictive.
| simiones wrote:
| This ends up in some discussion about what "compiling" actually
| means. Is macroexpanding a part of "compiling"?
|
| Most languages with complex type systems avoid Turing
| completeness by having an explicit limit for all type level
| expansions - at least, that is how Java and C# do it.
| kadoban wrote:
| I think metaprogram expansion has to be considered part of
| compiling. You have sourcecode, you need machinecode. If the
| lang says that the process of getting there requires running
| some metaprogram, so be it.
|
| For Rust I think the type system itself is probably enough
    | though even without that. If neither type checking nor
    | metaprogramming is part of compiling, I think your (for a
    | hypothetical you) definition of compiling is a bit too
    | restrictive.
| aj3 wrote:
| Even makefiles are Turing complete.
| BBC-vs-neolibs wrote:
| Thanks, I couldn't quite put it in words, but that is exactly
| my reaction.
| jacoblambda wrote:
| Compiling a complete language and verifying correctness is
| undecidable but that's not necessarily what Rust does.
|
  | The compiler can still fail to compile/verify otherwise correct
| parts of a program. Because it's only operating on a subset of
| the entire possible language and it's not verifying all
| properties of the language, it's not quite undecidable but it
| is still very much NP-hard.
|
| Languages like C, C++, or Ada on the other hand take the other
| approach. They compile all parts of the language but make
| little to no attempt to enforce correctness on all those parts.
  | You see verifying compilers for those languages allowing only a
  | subset that they can actually verify, which is the same as Rust.
|
| At the moment Rust the language is what the main Rust compiler
| says it is but once there are meaningfully different compilers
| you'll start to notice the issue a bit more and there will
| likely be parts of the language that one compiler can verify
| but the other can't (and therefore fail to compile).
| kibwen wrote:
| Any hypothetical standards-conformant Rust implementation
| would have to be as or more restrictive than rustc, not less.
| Something like mrustc, which performs no borrow checking,
| would not be allowed to call itself a conformant Rust
| implementation.
|
| (I say hypothetical because until there's a spec, it's simply
| not feasible to build a conformant alternative
| implementation, as you'd need to be bug-for-bug compatible
| with rustc.)
| ncmncm wrote:
| This is also not correct. As in, completely wrong.
|
| Presuming a formal spec that rustc happens to enforce,
| another compiler could simply try longer to resolve types
| before giving up, admitting some set of programs that rustc
| gives up on.
| tialaramex wrote:
| The spec. could specify exactly the rules rustc actually
| has including iteration limits, this would freeze rustc
| (and so it would be undesirable) but it is an option.
|
| The thing we're talking about here has changed after Rust
| 1.0, Rust 1.0 shipped with a rule that said if you match
| integers you have to provide a default. In lots of code
| that feels natural. But not everywhere. If you match all
        | possible integers (e.g. for an i8 that's from -128 to
| 127), the compiler would say "Not good enough" until you
| write the default match it will never actually need.
|
| That was fixed with feature(exhaustive_integer_patterns)
| in 2018 or so AIUI.
|
| But you could imagine Rust being standardised with the
| old behaviour and, even though it's clearly _possible_ to
| write a compiler which implements
| feature(exhaustive_integer_patterns) that would then be
| non-standard because Rust programs lacking the default
| match for integers are forbidden in the standard.
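        | For illustration, the kind of match that the old rule
        | rejected and current rustc accepts (function and ranges
        | made up):
        |
        |     fn sign(x: i8) -> &'static str {
        |         match x {
        |             i8::MIN..=-1 => "negative",
        |             0 => "zero",
        |             1..=i8::MAX => "positive",
        |             // No `_` arm needed any more: the ranges above
        |             // provably cover every i8 value.
        |         }
        |     }
        |
        |     fn main() { println!("{}", sign(-7)); }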
| ncmncm wrote:
| The type system is also Turing-complete, therefore non-
| terminating. It would be arbitrarily hard to write a spec
| in a non-Turing-complete language such that all
| implementations would admit the same set of programs, or
| even (responsive to the original claim) reliably only a
| smaller set.
| ncmncm wrote:
| > _[Rust is] not quite undecidable but it is still very much
| NP-hard._
|
| > _Languages like C, C++, or Ada ... make little to no
| attempt to enforce correctness_
|
| Both statements are laughably false, but for different
| reasons.
|
| Both are based on the Doctrine of Rust Exceptionalism, which
| requires that Rust is not just a variation on a theme, but
| fundamentally different from other languages. Like most
| doctrines, this one is indefensible.
|
    | Rust compilation, like C++ and Haskell, is undecidable, not
| just NP-hard. Non-terminating compilation is artificially
| curtailed, thus failing to produce an open set of what would
| have been correct programs.
|
| The Rust compiler performs certain correctness checks that
| other languages do not. It rejects an infinite set of correct
| programs that its correctness checks cannot resolve. In this
| as in all other particulars, Rust is very much on the
| continuum with other languages. All strongly-typed languages
| perform a huge variety of correctness checks, some built-in,
| others programmed, with great success, rejecting the large
| majority of incorrect programs people write. As a
| consequence, it is very common for Rust, C++, and Haskell
| programs to run correctly the first time, once the compiler
    | is satisfied.
| volta83 wrote:
| > Isn't compiling several languages, including Rust, actually
| undecidable though?
|
| While Rust is Turing complete, the compiler has a recursion
| limit, and if you take that into account, then compiling Rust
| is decidable (you only need to evaluate compilation up to the
| recursion limit).
| ghosty141 wrote:
    | Rust's type system is Turing complete if I recall correctly.
    | Obviously, in practice everything is in a way decidable
    | because we have limitations everywhere.
| derefr wrote:
| In other words, the _abstract machine specification_ you're
| programming your compile-time logic against in Rust, _has_
| Turing-complete semantics; but its extant _implementations_
| -- any Rust compiler people would care to use --
| intentionally does not implement compile-time Turing-
| completeness.
|
| But you could in theory write a Rust compiler whose
| compile-time runtime _is_ fully Turing-complete; and that
| compile-time runtime would be conformant to the Rust
| language spec. (Just, nobody would want to use it, because
| "unbounded runtime" isn't a property people tend to want
| from their compilers.)
| dllthomas wrote:
| > But you could in theory write a Rust compiler whose
| compile-time runtime is fully Turing-complete
|
| Only if your theory allows for infinite storage...
| einpoklum wrote:
| Well, my infinite tape seemed to work just fine last time
| I checked.
|
| Of course, I didn't check ad infinitum. <rim shot>
| lovasoa wrote:
| There is a low hardcoded limit to how deep you can nest
| types. But I don't think there is a hardcoded limitation to
| how large match patterns can be. That's the difference.
| smitop wrote:
        | It's possible to change that hardcoded limit. By default
        | it is 256, but you can bring it all the way up to 2^64 - 1
        | with:
        |     #![recursion_limit = "18446744073709551615"]
| pjmlp wrote:
| Same applies to C++ templates by the way, the compilers
| usually provide an expansion depth flag.
| comonoid wrote:
| "I'm not just hard, I'm NP-hard, baby".
| speed_spread wrote:
| Wanna solve my constraints?
| not2b wrote:
| So just use a SAT solver to check whether pattern matching is
| exhaustive. Any decent one will have much better
| performance than the current approach in the Rust compiler for
| complex cases. The interesting thing about SAT is that although
| the worst case is exponential (as far as we know until someone
| proves P = NP), even huge real-life problems (from circuits,
| program analysis etc) tend to have structure that makes them
| quickly solvable with a SAT solver. And if not, have a time
| budget and report "too complex to solve" if exceeded.
| thaunatos wrote:
  | It's worth noting that the author of the article doesn't think
| it's worth integrating a SAT solver:
|
| > Does this mean that rustc should integrate an industrial-
| strength SAT solver? As hilarious as that would be, I'm
| advocating no such thing. This will only be a performance issue
| on pathological examples crafted by bored nerds, and I don't
| think precious engineering time should be spent on that.
| Besides, generalizing a SAT algorithm to handle the full
| expressive power of Rust's patterns might be, to borrow some
| language from mathematicians, non-trivial.
| Ericson2314 wrote:
| > Does this mean that rustc should integrate an industrial-
| strength SAT solver?
|
| Well, this is actually what
| https://github.com/rust-lang/rfcs/blob/master/text/1868-port...
| would require. There is already chalk, a logic programming language
| for traits too.
|
| I don't balk at these things. A good logical / relational toolbox
| is very useful for a lot of tasks.
| cletus wrote:
| Compiling really is the Achilles heel of Rust I feel. This
| article really talks about the worst case. As others have noted,
| Haskell and other languages have this property too.
|
| Here's another perspective on practical issues [1] that's worth
| reading.
|
| Basically, Rust made some understandable but unfortunate design
| decisions that are hard to walk back or change now.
|
| [1]: https://pingcap.com/blog/rust-compilation-model-calamity
| w-m wrote:
| Presumably there's another SAT solver in the realm of the Rust
| tool chain: cargo surely has one for dependency resolution. This
| is a big time sink in other package managers (looking at you,
| Anaconda). Does someone have insight into how solving is
| implemented in Cargo?
| littlestymaar wrote:
| There is one[1] but Rust allows different major versions of the
| same package to exist in parallel, so dependency resolution is
| usually trivial unless someone in the dependency tree
| explicitly adds an upper limit of the acceptable version [2],
| which is pretty uncommon (I don't think I've ever met one in
| practice).
|
  | [1]: https://github.com/rust-lang/cargo/blob/master/src/cargo/cor...
  |
  | [2]: https://doc.rust-lang.org/cargo/reference/specifying-depende...
| donatj wrote:
| I had rare occasion to compile a Rust application recently
| because a prebuilt binary didn't exist for my platform. It took
| just over two hours! Coming from primarily Go development
| recently, I was astonished.
|
| How do you get anything done when the cost of testing an idea is
| several hours?! I'd feel like the sunk cost fallacy would kick in
| for even the most lackluster of changes just because you took the
| time to try it.
| spockz wrote:
  | Compiling linkerd-proxy also took me north of an hour on an
| Intel 6850k. Most of the time was spent compiling dependencies
| though. I also compiled envoy proxy which also took more than
| an hour.
|
| Subsequent rust builds for linkerd-proxy were quite fast. For
| envoy most time was spent in Bazel itself. Probably because I
| just did a Bazel build instead of specifying a more specific
| target.
| mkj wrote:
    | Out of curiosity I tried linkerd2-proxy here on an i5-1145G7
| (I guess similar total CPU? Less cores, laptop model, newer).
| Took 5 minutes but used ~6GB resident. Maybe yours was
| swapping a lot?
| glennpratt wrote:
| I worked on a Rust project at work for several months. Compile
  | times were not a significant issue for developers, where builds
  | were generally cached. For example, repeatedly running a test only
| compiled a small amount of code.
|
| It was a bit slow in CI/CD build environments with no cache,
| but so are the k8s Go projects I work on (go mod pulling in the
| world).
|
| The only thing approaching 2 hours I've ever seen is building
| the compiler from scratch.
| yakubin wrote:
  | Although Rust compile times are rather bad, IME such
  | astronomical compile times are usually a result of terrible code
  | quality: a lot of code doing essentially nothing, just gluing
  | glue, to the point where it's normal to get a 50-60 frame deep
  | stack trace with most functions named in a way that suggests
  | they all do basically the same thing. Then you look
  | at their bodies and... yep... doing the same, aka nothing.
|
| With Rust, procedural macros add a lot of overhead when
| compiling, so I try to avoid them.
|
| At work, we have a C++ project which, without using distributed
| builds with distributed ccache, running on powerful cloud
| computers, takes 3h+. Code quality is adequate. Debugging
| anything is a nightmare in such a codebase.
| spoiler wrote:
| While I agree that the compile times are bizarre, I don't
| think we can jump to conclusions such as "such astronomical
| compile times are rather a result of terrible code quality"
| without knowing more about what's being compiled and under
| what conditions it was being compiled!
| gameswithgo wrote:
| Short answer is it doesn't normally take that long. You also
| don't have the problem of cgo being slow or ever having to wait
| for the data race checker to run.
|
  | The initial compile has to download and compile all dependencies.
  | After that, compiling is incremental. Still slower than Go,
  | though.
| adwn wrote:
| Compiling Servo, a _web browser engine_ , takes 5-6 minutes for
| an optimized build from scratch on my 6C/12T machine. So no, 2
| hour build times are not normal for all but the largest
| software projects.
| genghizkhan wrote:
| Compiling paru, an AUR helper for Arch Linux, on the other
| hand, takes me around 3 - 4 minutes on a 4C/8T i5 for an
| optimised build. I think Rust compile times might just fall
| within a narrow range.
| Kaze404 wrote:
| Compiling for production and compiling for development are two
  | very different processes, with the latter benefiting from fewer
  | optimizations and incremental compilation. It wouldn't take two
| hours to test an idea.
| wongarsu wrote:
| In my experience, Rust compilation times aren't that different
| from what I'm used to in the JavaScript world. Initial
| compilation takes a minute or two (comparable to the runtime of
| `npm install` on Windows), when you make changes you usually
| get incremental compilation within a few seconds.
|
| I guess you can have projects that are just huge and/or run
| into some pathological case that increases compile time a lot
| (just like with C++), but for any subsequent compilations you
| should get very fast incremental builds.
| jackcviers3 wrote:
| I get this question about large scala codebases too - clean
| build and test cycles for things on underpowered containers in
| cloud build pipelines are in the tens of minutes sometimes.
|
| One: my local dev machine has 16 gigs of memory and 16 cores.
| What takes the tiny docker container with 1 gig and 2 vcpus 30
| minutes takes my computer about 1.
|
| Two: Incremental compilation and testOnly make build/test
| cycles maybe a second / twenty seconds max, and most of that is
| test runtime on complex property based tests.
|
| You just get by without a clean compile most of the time after
| you build the project the first time. And really, a lot of
| builds spend an inordinate amount of time just pulling in
| external dependencies (which are also cached on my local
| machine, but not on the containerized builds a lot of the
| time).
| lmilcin wrote:
| I have worked for 3 years on a project where it took a whole
| week to get the code compiled, signed by an external company
| and deployed to the device so that I could see the results.
|
| I just learned to work without compiling for a long time. Over
| time my productivity increased and the number of bugs fell
| dramatically.
|
| Working this way requires you to really think about what you
| are doing, which is always a good idea.
|
| This was over a decade ago and now I work mostly on Java
| backends and I am happy that I typically spend days or even
| weeks without ever compiling the code and that it usually works
| the first time I run it.
|
| I can't think how I could get back. It looks really strange to
| me to observe other developers constantly compiling and running
| their code just to see if it works. It kinda looks as if they
| did not exactly understand what they are doing because if they
| did, they would be confident the implementation works.
|
| The only time when I actually run a lot of compile/execute
| iterations is when I actually don't know how something works. I
| typically do this to learn, and I typically use a separate toy
| project for this.
| nine_k wrote:
    | Often you can be certain that you know what to expect from
| your own code, but not from dependencies or external systems.
| So checking that your assumptions about them are right is a
| major reason to run and rerun your code.
| lmilcin wrote:
      | That is a good reason to minimize the number of dependencies
      | and only use the ones you know and can reason about. It is also
| part of what I do to help me reason about code before I
| execute it.
|
| As I mentioned, if I don't understand a dependency or
| external system I make separate toy project where I can
| rapidly experiment and learn.
|
      | Think of it as having fun in an aircraft simulator. You play
| with it so that you are prepared to fly the actual plane.
|
| Also checking your assumptions by trying to see if the code
| works is a problem in itself. A lot of these broken
| assumptions will not break your code immediately but maybe
| sometime later. Maybe when your application is under load
| or maybe when clock on the server is moved back by an hour
| or maybe when the connection breaks in a certain way.
|
| Base your work on _knowing_ how something works and not
| _assuming_.
|
| Best way to limit the risk of your application failing due
| to broken assumption is to limit your reliance on
| assumptions in the first place.
| marcosdumay wrote:
    | I used to use this approach, but the new heavily typed languages
| bring a lot of really nice tools that you only get to use at
| compile time.
|
| Specifically in Rust, you can use the language to guide you
| through things like refactoring, thread synchronization,
| correctness verification and a lot of other things. But you
| have to run the compiler.
| lmilcin wrote:
| I don't write production code in Rust (though I learn for
| my embedded hobby).
|
| But you can say the same for Java. IntelliJ IDEA internally
| does equivalent of compilation and tells me exactly where
| my code would fail to compile.
|
| So in a sense I am not strictly practicing my approach, but
| I also don't see reason to do so if the tools are reliably
| giving me hints when I made mistake writing something that
| will not compile.
| slunk wrote:
| > So in a sense I am not strictly practicing my approach
|
| Developing in an IDE that compiles almost continuously is
| about as far from the development philosophy you're
| advocating for here as one could get :P
| lmilcin wrote:
| That doesn't make any sense.
|
| This isn't about throwing away tools for some idealized
| goal. It is about using the tools that are available to
| achieve best results without making you reliant on the
| tools to the point you don't know what your program is
| going to do without compiling and running.
|
| IDE helps catch a lot of stupid simple mistakes and that
| helps save time. Why would that be bad?
| slunk wrote:
| I don't think using an IDE to catch lots of stupid simple
| mistakes is bad. It's how I prefer to work.
|
| > It looks really strange to me to observe other
| developers constantly compiling and running their code
| just to see if it works. It kinda looks as if they did
| not exactly understand what they are doing because if
| they did, they would be confident the implementation
| works.
|
| Explain to me how this statement doesn't apply to your
| use of an IDE, but the other engineers you've observed
| don't understand what they're doing.
| [deleted]
| lmilcin wrote:
| If you can't read that sentence with comprehension none
| of my explanations are going to help.
| SV_BubbleTime wrote:
| How do you like embedded Rust?
|
| I'm looking forward to someone making a legit IDE/suite
| with support, no indication of it yet but I assume some
| day!
| lmilcin wrote:
| I mainly work with STM32 Cortex-M3 MCUs (again, these are
| my personal projects).
|
| Rust, well, "works". But there is still a bunch of issues
| so I keep developing using C until I get the kinks ironed
| out.
| steveklabnik wrote:
| It's working really well for us at Oxide. I even do it on
| Windows :)
| hinkley wrote:
| My theory for Java is that it was frog boiling turned cargo
| culting.
|
| Comparatively speaking Java compilation was very fast at the
| beginning, so for instance Red-Green-Refactor (RGR) works
| pretty well. There's a parallel to other creative jobs where
| sometimes shuffling around the things you already have
| reveals a pattern that leads to a breakthrough.
|
| But there are other feedback loops that, with J2EE in
| particular, the cycle times started to creep up, and up, and
| at some point if you haven't stopped and looked at what
| you're doing you don't see how crazy things have gotten. RGR
| still has a place there because you are typically _not_
| recompiling everything and you aren 't spooling up the
| application to run unit tests. But making one line changes to
| see how a page loads is just bonkers amounts of busy work.
|
| One of the bad dynamics is that people more like you also
| tend to memorize the code, which is both bad for new hires
| (circular logic does not reveal itself when you introduce one
    | assertion at a time, but does when you get hit with all of
| them at once), and also incentivizes you push back on
| refactoring. Because those damned smartasses keep moving
| things around and they were Just Fine where they were. If
| that happens you have cemented the entire codebase and
| anything that is really wrong with it is going to stay wrong
| until someone proposes a rewrite. And having learned the
| wrong lessons the first time, we repeat them again in the
| second iteration.
| random_kris wrote:
| I am one of those devs that is always compiling and checking
| if their program is working. I am mainly working with java
| and still doing this.
|
| Less so than when working with JavaScript. But please teach
| me your ways hahha
| ransom_rs wrote:
| In case anyone else does this and is new to Rust - you can
| use `cargo check` to type check/borrow check/everything
| else without doing any codegen
| lmilcin wrote:
| This thread is no longer with regards to Rust or checking
| whether the code compiles or not. It is about how you can
| work with compilation times that are longer than a coffee
| break.
| lmilcin wrote:
| Start by understanding that this compile/run process is a
| crutch. Rather than use your knowledge, experience and
| intelligence to predict if it works you resign yourself to
| just waiting to see.
|
| This is a problem for many reasons. One is that this may
| help you get something working, but with lack of deep
| understanding the outcome will likely be subpar if only
| because not all problems that you could have predicted will
| show themselves on execution.
|
| Another is that it basically shrinks the part of brain that
| is necessary for understanding and predicting behavior of
| your code (figuratively). Sort of like driving with GPS
| makes me helpless without it.
|
| Try to write larger stretches of code without compilation.
|
| Try to focus on modularizing your application so that you
| can reason about modules separately. This is always a good
| idea, but it is even more important when you need to be
| able to predict how something works without trying it.
|
| When you have finally compiled your code and it failed, do
| not immediately go to fix the problem. Try to spend a
| moment to learn from the failure and improve your process
| so that you minimize chance of this happening in the
| future.
|
| Ask yourself, what you could have done to prevent this
| problem from happening? Could you have specified some
| function or module better? Could you have simplified your
| code to be able to better reason about it? Would it help if
| you have spent a bit more time getting acquainted with this
| internal or external library before you decided to use it?
|
| From my experience, most of this comes down to following
| things:
|
| - defining your modules and APIs correctly -- badly defined
| modules make it difficult to predict the behavior,
|
| - finding simple solutions to problems -- complex solutions
| tend to make it difficult to predict behavior,
|
| - using only tools you understand,
|
| - only interacting with existing code after you have
| understood how it works (I typically at least look over the
| code that I plan to use),
|
| - thinking hygiene (make sure you base your work on hard
| facts and not beliefs),
|
| - refactoring, refactoring, refactoring -- first solution
| to the problem is rarely optimal. I write something that
| works and then immediately keep refactoring it removing any
| unnecessary complexity until I am satisfied. Don't leave
| refactoring for later -- when you have just written a piece
| of code it is easiest to change it.
|
| - as much as possible, writing your code in a way that it
| is not even allowed to produce wrong result. This is very
| large topic so I won't explain. There is a lot of
| techniques that you can research.
| random_kris wrote:
| Do you think there is a time and place where it is more
| sensible to just go by trial and error.
|
| For example when I am interacting with a codebase for the
| first time and I want to implement something I just keep
| bashing and throwing shit at the wall untill something
| sticks. After that I start working more in line of what
| you described.
| lmilcin wrote:
| How exactly you arrive at understanding the codebase is
| not as important.
|
| Just make sure you keep actual development separate from
| learning the tool if you care for your results and
| especially reliability.
|
| Now, I use various ways to learn the tools and codebase.
| Running PoC for my idea or maintaining separate toy
| project helps me with maintaining hygiene.
|
| For example, for the past year I have been spending a lot
| of time learning reactive programming with RxJava and
| Reactor. I have created a dozen small projects
| illustrating various ideas for making reactive APIs,
| processing pipelines, separating business logic from
| infrastructure, composing reactive modules, etc.
|
| I did this with aim of purposeful learning rather than
| writing production code, even though some of these in the
| end migrated to be part of production codebase.
|
| I am now at a level where I can, again, write large
| swaths of modules, libraries but now using reactive
| paradigm, with a very good chance of it working
| correctly, which for me validates that I more or less
| understand what is going on.
| throwawayboise wrote:
| First job was in a mainframe environment, compiles were
| pretty quick but the job queue was so big and developer
| compiles were low enough priority that it could take hours of
| wall clock time to turn around.
|
| I don't remember the specifics but while watching the job
| queue on the terminal screen I discovered that the job
| priority was editable. So I bumped it up, my compile job ran,
| and I was happy. I only did this a few times before I got a
| lecture from the system operations guys that was quite
| explicit that I should never do this again.
|
| Yes, you figure out how to run code mentally, how to check
| carefully for typos and syntax errors, etc. No Intellisense
| then either, that's another modern crutch.
| yosefk wrote:
| I doubt your compile times were due to pattern matching which
| TFA demonstrates to be NP-complete, any more than C++'s
| compile-times are due to the undecidability of template
| metaprogramming.
|
| (Though I guess one might argue that slow compile times on
| typical code and rarely-used computationally hard features have
| the same root cause of not treating fast compile times as a top
| priority.)
| tialaramex wrote:
| Right, and C++ got a lot of its compiler performance for
| large projects by making this an embarrassingly parallel
| problem even though the consequences are negative for the
| main consumers of the code (people, not compilers).
|
| One cost of being embarrassingly parallel is the One
| Definition Rule. If we can compile N different code units in
| parallel, but they're all allowed to define things in the
| same namespace, obviously those definitions might contradict
| each other and we wouldn't notice. So, the C++ language
| explicitly forbids this, knowing you'll probably do it
| anyway, at least by mistake. If (when) you do, that isn't a
| valid C++ program, but the compiler isn't expected to produce
| any diagnostic (warning or error). So, you get a binary, but
| the language doesn't care what that binary does. Maybe it
| does exactly what you expected. Maybe it does _almost_
    | exactly what you expected. If not, too bad: the One Definition
    | Rule means your compiler was _fast_ and that's what matters.
| sakex wrote:
      | It usually fails at link time, doesn't it?
| bluGill wrote:
| No. I have intentionally violated the one definition rule
| and nothing broke at all. In my case I wrote a second
| timer, which had hooks my unit test system could use to
| advance time without having to inject timers.
|
| It will fail at link time if you link everything into the
| same library. Even here there is an escape: there are
        | ways to mark something as a weak symbol and then the
| linker won't complain about more than one definition.
|
| See your OS documentation for how this works on your
| implementation. (though don't be surprised if the
| documentation is wrong...)
| brandmeyer wrote:
| > I wrote a second timer, which had hooks my unit test
| system could use to advance time without having to inject
| timers.
|
| This use case isn't an ODR violation. Its just using the
| linker to mock an interface.
|
| > It will fail at link time if you link everything into
| the same library.
|
| _That_ is an ODR violation, although there are
| variations on this pattern that are not required to be
| detectable. Template instantiation is an easy way to get
| a silent ODR violation.
| jcelerier wrote:
| > No. I have intentionally violated the one definition
| rule and nothing broke at all.
|
        | Did you use LTO? It always catches ODR issues for me.
| There is also GNU Gold's --detect-odr-violations switch.
| bluGill wrote:
| No. LTO is something I've been meaning to look at. Though
| if I can't violate the ODR intentionally I might not be
| interested.
| jcelerier wrote:
| If you violate ODR intentionally you are just writing
| code that _will_ break in a future version of GCC /
| Clang / ..
| bluGill wrote:
| That is a risk I know I'm taking.
|
| Though I've been tempted to write a paper for C++ to make
| it defined behavior. I know it works in some form on most
| implementations even though it isn't legal. Thus there
| seems to be some other use cases for it that could/should
| be formalized. If anyone can give me other examples of
| why you want to do this and what the rules are on each
| platform I'll take a shot at it.
| jcelerier wrote:
| > I know it works in some form on most implementations
| even though it isn't legal.
|
| it definitely does not; every time I've had an ODR issue,
| it caused actual bugs. For instance, dynamic_cast not
| working because a typeid was defined in two shared
| objects, etc.
|
| What would be the behaviour you expect if you have
|
|   a.cpp: int constant() { return 123; }
|   b.cpp: int constant() { return 456; }
|   c.cpp: int constant(); int main() { return constant(); }
|
| how could you define this meaningfully other than "hard
| error"?
|
| e.g. here with gcc, if a and b are put into shared
| libraries, the return value depends on _the order in
| which the libraries are passed to g++ when linking_. e.g.
|
|   g++ c.cpp a.so b.so
|
| calls the version in a.so, while
|
|   g++ c.cpp b.so a.so
|
| calls the version in b.so
| bluGill wrote:
| You can do that because the order in which libraries are
| passed to the linker is something you can control. Of
| course linkers don't have to do this, and future versions
| can do something different, but it works and the rules
| for gcc are currently "the order in which the libraries
| are passed to g++ when linking", which is defined. Of
| course gcc has the right to change those rules (I suspect
| the real rules are a bit more complex).
|
| Gcc also has the concept of weak symbols which, if
| invoked (and the linker supports it), would allow you to
| make one of the two definitions weaker than the other, so
| the result doesn't depend on link order. Visual C++ also seems
| to have something like this, but I'm sure it is
| different.
|
| Like I said, I want to write a paper to make it defined -
| but the paper will be a lot longer than would fit in a
| response here, and it depends on information that I
| currently don't know.
| IshKebab wrote:
| Were you compiling Servo on a Raspberry Pi or something? 2
| hours is ridiculous even for Rust.
|
| > How do you get anything done when the cost of testing an idea
| is several hours?
|
| Incremental compilation. Depending on the project you can get
| the edit compile cycle down to somewhere between 1 second and 5
| minutes. It's nowhere near as fast as Go but it's not like
| anyone is actually making edits then waiting 2 hours for them
| to compile.
| weavie wrote:
| Incremental compilation. Once you get the first build done,
| subsequent builds are much faster.
|
| Also a project is likely to be split up over multiple crates,
| so a lot of the changes you make will require building and
| testing just that one crate.
| runevault wrote:
| Isn't incremental compile disabled currently? I thought they
| found a significant bug and so turned it off until they could
| validate a fix.
| kibwen wrote:
| The grandparent was building a binary utility, which means
| they were (hopefully!) building in release mode, which
| doesn't use incremental compilation by default in any
| version of the compiler.
|
| For debug builds, where incremental is usually on by
| default, it was disabled for 1.52.1 due to bugs, and then
| kept off in 1.53 out of an abundance of caution. It should
| be back on in 1.54.
| runevault wrote:
| Okay, that's what I thought (and the fact that it hasn't
| officially been turned back on yet was what I was trying
| to hint at with the abundance of caution).
| ekidd wrote:
| > How do you get anything done when the cost of testing an idea
| is several hours?!
|
| My Rust compile times are usually between 10 and 30 seconds.
| Some tricks that help:
|
| - Get a good development machine. I prefer a Dell Precision
| laptop from the last couple of years, with plenty of cores. A
| 5-year old laptop with 2 cores will be a _lot_ slower.
|
| - Use Visual Studio Code and the rust-analyzer plugin. This
| will give you feedback in the editor while you're working.
|
| - Rely on the type system to identify most problems before ever
| generating code.
|
| - Minimize the use of slow-to-compile libraries that rely on
| complex generic types. Diesel, for example, is a great library
| but it's slow to compile.
|
| - Install cargo-watch, and use it to automatically re-run the
| tests.
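| For example (assuming a stock cargo-watch install), something
| like `cargo watch -x test` will re-run the test suite whenever
| a source file changes.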
|
| Also, remember that incrementally re-compiling a single crate
| in debug mode will be much faster than building an optimized
| application and all its dependencies from scratch.
|
| TL;DR: Rust compilation times are less than ideal, but with a
| fast laptop and rust-analyzer, it's possible to spend very
| little time actually waiting for the compiler.
| remix2000 wrote:
| Just get a new machine for a couple zillion bucks, why
| haven't we thought of it earlier? :^)
| isaacimagine wrote:
| As a developer who uses Rust a bunch (work & personal),
| this is what I ended up having to do. I'm joking, but this
| one simple trick can cut compile times _in half_.
| edflsafoiewq wrote:
| Yeah, can really see that "empowering everyone" motto
| coming into play.
| ekidd wrote:
| Yes. If you are a professional developer who gets paid to
| work on large Rust projects all day long, then a US$2500
| high-end laptop will pay for itself many times over. (I'm
| assuming US or European salaries here.)
|
| I've gotten by with much less for personal projects. But
| Rust will make very efficient use of (say) an 8-core i9 if
| you have one.
| nicetryguy wrote:
| I'm pretty excited for the Linux kernel to take three weeks to
| build.
| stonemetal12 wrote:
| If I had to guess it is because all of the dependencies had to
| be compiled too.
|
| In a normal Rust application you break it up into crates.
| Crates are the unit of compilation, and are only recompiled
| when something in the crate changes. In a "normal" developer
| flow you only touch one or two crates at a time, so normal
| developer compile time would be a couple of minutes at most.
| Even for a new build on a machine most developers would never
| see the two hour compile time because they would have gotten
| precompiled dependencies.
| Nullabillity wrote:
| > In a "normal" developer flow you only touch 1 or two crates
| at a time so normal developer compile time would be a couple
| of minutes at most.
|
| Obviously this depends on your computer, but incremental
| Rust builds should typically be on the order of seconds,
| not minutes.
|
| > Even for a new build on a machine most developers would
| never see the two hour compile time because they would have
| gotten precompiled dependencies.
|
| Cargo doesn't do precompiled dependencies (even between
| projects on the same machine).
| tester34 wrote:
| jesus christ
|
| rust's compilation times are as terrible as c++'s?
| okamiueru wrote:
| I don't know about Him, but at least when working with C++ I
| feel the compile time is entirely up to deliberate tradeoffs.
| With better planning, more effort put into forward
| headers, better implementation boundaries with pimpl-like
| constructs or otherwise via linking, and avoiding
| metaprogramming entirely, I've not found it to be an issue.
|
| Incremental C++ builds can be lightning fast, or they can be
| ridiculously slow. I've worked on large projects where change
| + incremental compile + test was less than 5 seconds. And
| I've worked on projects where no effort was made, where the
| same could take 10 minutes.
| gpderetta wrote:
| Keeping your C++ compilation times reasonable requires
| eternal vigilance (or a distributed system with aggressive
| caching).
| kzrdude wrote:
| It's the same ballpark
| spoiler wrote:
| In my experience (C++ dev at work) C++ generally _feels_
| slower than Rust.
|
| Edit: I quickly realised this might only be because of Rust's
| (in my opinion) superior type system and tooling, so maybe
| it just requires less waiting for compilations than when
| working with C++.
|
| Edit 2: I've never actually measured Rust compilation
| times, but even the large-ish graphical codebases (wezterm,
| alacritty, bevy games) I've compiled from scratch felt
| noticeably faster.
| tester34 wrote:
| I think we must do better
|
| AAA games need less resources than those tree walkers
| IshKebab wrote:
| In my experience Rust is faster for full clean builds, even
| if you use slow dependencies like Serde.
|
| However C++'s compilation units are smaller than Rust's which
| helps it to have faster incremental compilation times for
| small changes.
|
| So, same order of magnitude basically.
| kibwen wrote:
| I suspect the machine you were using was swapping itself to
| death, because I've never experienced anything resembling that
| in Rust.
|
| I also presume you were compiling in release mode, which is
| good for producing extremely fast programs but not something I
| bother with in regular development (in contrast to Go, which
| has no distinction between debug and release optimization
| levels).
|
| _> How do you get anything done._
|
| The vast majority of my workflow just involves seeing if my
| code typechecks, for which I don't even need to build the
| program (so no codegen, no linking). This I do very frequently,
| as a sanity check on my work. The command for this is `cargo
| check`. This takes less than a second for small projects, one
| to five seconds for medium projects, and one to twenty seconds
| for large projects.
| adwn wrote:
| > _I suspect the machine you were using was swapping itself
| to death_
|
| Good point. I recently upgraded my machine to 64 GiB RAM,
| because 16 GiB filled up pretty quickly with parallel
| compilation, several VS Code/rust-analyzer instances and a
| couple of browser tabs open.
| tinco wrote:
| Go is geared towards productivity. If you have a problem you
| can solve adequately in Go, why not just do it in Go?
|
| Because they gained popularity roughly around the same time,
| many people think they are competing languages, but they're
| really not.
|
| Ruby is my main programming language; if I need something to be
| fast and/or light weight I switch to Go, sacrificing some
| comfort. And if I need absolute lowest possible overhead, I
| switch to Rust, sacrificing a whole lot more comfort. Rust is
| more expressive than Go, but if your Go project is so large
| that it needs a stronger type system, in my opinion you should
| switch to a language like C# or Scala.
| sesm wrote:
| Go is geared towards the productivity of specific people in a
| specific environment: fresh grads at Google.
| tinco wrote:
| The first time I used it, I had not written a line of Go prior;
| I rewrote our data ingestion gateway from Ruby to Go. It
| took me 1 week, and it had full feature parity. I don't
| think I had the GPA at university to land a fresh grad job
| at Google.
|
| Sure the tooling around it (besides the compiler itself)
| was the dumbest 21st century tooling I've seen (this was ~6
| years ago), but it was quick to learn, quick to write,
| quick to compile and quick to run so who am I to complain.
| sim_card_map wrote:
| How was the tooling dumb?
| tinco wrote:
| It had this weird system where all your projects were
| supposed to be located in one directory structure that
| mapped to their git namespaces; it made no sense at all
| and was a pain to work around.
| spoiler wrote:
| I'm really curious what you were compiling, and if you remember
| the machine specs (and was the machine under other load at the
| time). Two hours seems _beyond_ ridiculous.
|
| Also, I'm assuming it was a full (as opposed to an incremental)
| release build? But even then, I've never had to wait for that
| long.
| lmilcin wrote:
| Two hours is not beyond ridiculous. A lot of projects in the
| past compiled for way longer than that, days even.
|
| In the 90s and early 2000s, if you worked on a large desktop
| application (the size of Office or Photoshop) you could
| expect to spend a business day waiting for compilation
| results.
| spoiler wrote:
| Sorry, I should've clarified that I meant specifically
| in recent versions of Rust. I wasn't speaking in the
| general sense. I've experienced slow compilation times in
| C++ projects, although not of the same magnitude you
| describe (ie having to wait days)!
| gpderetta wrote:
| Around the turn of the millennium I remember recompiling,
| for some reason I don't remember, the whole of KDE. That
| took a bit...
| swiftcoder wrote:
| What on earth did you compile? I can't recall ever having
| compiled a rust project that took more than 10 minutes from
| scratch.
|
| I do have memories of the days when compiling Chromium or AOSP
| was a 4 hour battle, though :)
| [deleted]
| jynelson wrote:
| The rust compiler itself can take as long as 2 hours on a
| modern laptop if you do a full stage 2 build. I don't think
| that's what the top poster was talking about, though; it
| sounded like they had a rust toolchain already.
| steveklabnik wrote:
| (You know this, but for anyone else reading, don't forget
| that doing this compiles LLVM, which is in and of itself a
| massive C++ project, as well as compiling the compiler
| itself multiple times. That's part of why it would be such
| an outlier.)
| swiftcoder wrote:
| Right. Nested C/C++ dependencies are the only case I can
| think of where drastically-long compile times are common.
| jandrese wrote:
| Or maybe someone included Boost?
|
| Sometimes it is just that someone drank the Kool-aid and
| made everything a template. Then split their code across
| hundreds of tiny files and did the one big massive
| include at the top of every file that pulls in
| _everything_.
| eptcyka wrote:
| I've only had 2 hour build times when I was compiling a rust
| toolchain on an emulated ARM system running on x86.
| JD557 wrote:
| > How do you get anything done when the cost of testing an idea
| is several hours?!
|
| I haven't used rust in production, but usually you can just use
| `cargo check` to only run the typecheck. Using `cargo build` is
| also usually much faster than `cargo build --release`.
|
| Having said that, at least in my toy projects it was not
| uncommon to have to compile using `--release` to run the tests
| (since otherwise the tests would take forever).
| littlestymaar wrote:
| > Having said that, at least in my toy projects it was not
| uncommon to have to compile using `--release` to run the
| tests (since otherwise the tests would take forever).
|
| Maybe you're already aware of that, but if the reason why
| your tests are slow is not "your code not being optimized"
| but "your dependencies not being optimized" then Cargo
| profile overrides[1] can save your life. You can have one
| specific dependency (or all of them) built in release
| mode even when you're building your own code in debug mode.
| The first development build is going to be quite slow, but
| after that, you're going to have the best of both worlds.
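|
| As a minimal sketch (roughly the syntax documented at [1]),
| building every dependency with optimizations while your own
| code stays in debug mode looks like this in Cargo.toml:
|
|   # Optimize all dependencies, even in the dev profile.
|   [profile.dev.package."*"]
|   opt-level = 3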
|
| [1]: https://doc.rust-
| lang.org/cargo/reference/profiles.html#over...
| lower wrote:
| For languages with Hindley-Milner typing, like SML and OCaml, it
| has long been known that type-checking is even worse in the worst
| case (DEXPTIME hard) [1]. But the programs where this matters
| aren't what one would write by hand, typically.
|
| [1] https://dl.acm.org/doi/10.1145/96709.96748
| contravariant wrote:
| To some extent it wouldn't even be surprising for a
| sufficiently powerful language to be EXPSPACE. Pretty much
| anything with sufficiently powerful macros would be EXPSPACE,
| or worse.
| lower wrote:
| Maybe it's a good idea to add a concrete example (can't edit
| the reply anymore):
|
| OCaml:
|
|   let x1 = fun y -> (y, y) in
|   let x2 = fun y -> x1 (x1 y) in
|   let x3 = fun y -> x2 (x2 y) in
|   let x4 = fun y -> x3 (x3 y) in
|   let x5 = fun y -> x4 (x4 y) in
|   let x6 = fun y -> x5 (x5 y) in
|   x6 (fun z -> z)
|
| Haskell:
|
|   test = let x1 = \y -> (y, y)
|              x2 = \y -> x1 (x1 y)
|              x3 = \y -> x2 (x2 y)
|              x4 = \y -> x3 (x3 y)
|              x5 = \y -> x4 (x4 y)
|              x6 = \y -> x5 (x5 y)
|          in x6 (\z -> z)
|
| If you can compile these, add x7. The effort increases
| exponentially as one adds x7, x8 and so on.
| NieDzejkob wrote:
| The fact that x1 is \y -> (y, y) is superficial, though.
| Something like x1 = Just also increases exponentially, and
| makes clear that using a deduplicated DAG for representing
| type terms doesn't solve this.
| lower wrote:
| With x1 = Just, the program is accepted instantly by ghc. I
| think you need a type variable to appear twice.
| NieDzejkob wrote:
| Ah, it's still exponential, but in the size of the type,
| and the constants of the complexity are a bit different,
| so you need around x30.
| lower wrote:
| Ah, you're right. If one goes to x_{i+1}, then there will
| be 2^i Maybes in the type. The number of Maybe-
| occurrences in the type doubles in each step and sharing
| won't help there.
| Yajirobe wrote:
| could you provide a python example?
| Cu3PO42 wrote:
| Python does not perform static type checking; therefore
| this problem doesn't apply immediately. You can build a
| program that takes exponential time to execute, but that is
| probably not a revelation to you.
| Enginerrrd wrote:
| Building a program that takes exponential time to execute
| because of python run-time dynamic type checking might be
| interesting...
| shrimpx wrote:
| I don't think that's possible because I don't think
| there's any aspect of the runtime type checker that
| descends into complex type structures. It just checks if
| runtime type tags match when you invoke an operator, or
| that an attribute exists when you try to access it.
| daxfohl wrote:
| Formerly id id id id id id id id id... would do it, but looks
| like someone has put in a fix for that.
| lower wrote:
| Yes, types are usually represented by a DAG with sharing.
| OCaml does so, for example, and I assume ghc does something
| similar.
|
| So, while id id has type ('a -> 'a) -> ('a -> 'a), this is
| stored in memory by a pointer structure that amounts to let
| 'b = ('a -> 'a) in ('b -> 'b). The type of id id id would
| become let 'b = ('a -> 'a) in let 'c = ('b -> 'b) in ('c ->
| 'c). This grows linearly. If one were to write out the
| types without sharing then their size would grow
| exponentially.
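|
| A minimal sketch of that idea in Rust (purely illustrative,
| not how OCaml or GHC actually store types): arrow nodes hold
| shared pointers, so a subterm is built once and referenced
| many times instead of being copied.
|
|   use std::rc::Rc;
|
|   // Toy type representation as a DAG with sharing.
|   #[allow(dead_code)]
|   enum Ty {
|       Var(u32),
|       Arrow(Rc<Ty>, Rc<Ty>),
|   }
|
|   fn main() {
|       let a = Rc::new(Ty::Var(0));              // 'a
|       let b = Rc::new(Ty::Arrow(a.clone(), a)); // 'b = 'a -> 'a
|       let _t = Ty::Arrow(b.clone(), b);         // 'b -> 'b, shares 'b
|   }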
| chowells wrote:
| There was a bug in ghci (the GHC REPL) for a while which
| made it take exponential time to print the type of `id id
| id id id id id id id`. It used the linear representation
| for type-checking, but the pretty-printer was not
| implemented so cleverly.
|
| Apparently that's been fixed. I can't confirm because
| that's not something I ever check the type of.
| kmill wrote:
| It seems to me that id id id ... id always has type a ->
| a. If you're referring to the type of the first id in the
| sequence, then what you're saying makes sense.
| Ericson2314 wrote:
| Note that it is polynomial with respect to the size of the
| _output_.
|
| That is to say, pathological expressions don't stay small and
| induce horrendous searching; rather, they merely get a lot bigger
| (and that can cause more trouble for the rest of the program).
|
| That means a heuristic to bail out when something grows a lot
| would probably not violate the user's intent, and would get it
| back into polynomial time.
| chalst wrote:
| No, this is the time complexity of type checking before any
| computation on the terms is performed.
| Ericson2314 wrote:
| Yes, I am saying that while it is exponential with respect
| to the input, it is polynomial with respect to the
| output.
| shadowgovt wrote:
| The points made by Endsay here are key, and actually a good
| example of trade-offs in software engineering. It is definitely
| okay to incorporate an algorithm that is NP-hard into the core of
| a performant system as long as you can have some expectation that
| the input size on the NP-hard part will be small. Sometimes it
| even lets you simplify another piece of the problem via pre-
| computation, so that the piece of the system that accepts large
| inputs doesn't require a costly solution algorithm.
| estebank wrote:
| The actual code that handles exhaustiveness checks has extensive
| explanations on how it works and what it was inspired by.
|
| https://github.com/rust-lang/rust/blob/master/compiler/rustc...
| rocqua wrote:
| > The standard representation of such a formula is the conjunctive
| normal form, or simply: AND-of-ORs. Our previous example would
| look like this when transformed:
|
|   A and B and (A or B) and (!A or !B)
|
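| For concreteness, here is a minimal hypothetical sketch (in
| Rust, and not necessarily the article's exact encoding) of how
| such a clause set can be turned into a match whose
| exhaustiveness check answers the satisfiability question:
|
|   // Each arm covers exactly the assignments (A, B) that falsify
|   // one clause of "A and B and (A or B) and (!A or !B)", so the
|   // match is exhaustive iff every assignment falsifies some
|   // clause, i.e. iff the formula is unsatisfiable.
|   fn encode(assignment: (bool, bool)) {
|       match assignment {
|           (false, false) => {} // falsifies "A or B"
|           (false, _) => {}     // falsifies "A"
|           (_, false) => {}     // falsifies "B"
|           (true, true) => {}   // falsifies "!A or !B"
|       }
|   }
|
|   fn main() {
|       // Running it is beside the point; the compile-time
|       // exhaustiveness check is what does the work.
|       encode((true, true));
|   }
|
| rustc accepts this match as exhaustive, which is exactly the
| statement that the formula above is unsatisfiable; for a
| satisfiable formula, the missing patterns reported by the
| compiler would be its satisfying assignments.
|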
| There is another normal form, the disjunctive normal form. This
| is an OR of ANDs. Interestingly, every formula can
| be represented in this form. And SAT is rather easy to determine
| for this form.
|
| The catch is that taking a logical formula and placing it in
| disjunctive normal form is actually a rather arduous,
| non-polynomial process.
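|
| For example, (A or B) and (!A or !B) becomes (A and !B) or
| (!A and B) in disjunctive normal form, and deciding
| satisfiability of a DNF only requires checking whether some
| conjunct avoids containing a variable together with its
| negation.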
| maweki wrote:
| > And SAT is rather easy to determine for this form.
|
| Gave me a chuckle; that's quite an understatement, as the
| decision procedure is only: is there a single clause and is it
| just "False"?
|
| > The catch is that taking a logical formula, and placing it in
| Disjunctive normal form is actually a rather arduous, non
| polynomial process.
|
| Basically writing out all assignments that make the formula
| true. Potentially exponentially many, as there are 2^n possible
| assignments of n variables. So this transformation is basically
| just solving and writing down all models.
| mmarx wrote:
| The conjunctive normal form can be exponentially large in the
| worst case as well, though; see the footnote in the article.
| In practice that is not a problem, since it suffices to find
| a formula in CNF that is satisfiable precisely when the
| original formula is satisfiable, even though it might not be
| equivalent.
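|
| (One standard way to get such an equisatisfiable CNF is the
| Tseitin transformation: for (A and B) or C, introduce a fresh
| variable X standing for "A and B" and write (!X or A) and
| (!X or B) and (X or !A or !B) and (X or C). The result is
| satisfiable exactly when the original formula is, and it only
| grows linearly with the size of the input.)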
___________________________________________________________________
(page generated 2021-07-08 23:01 UTC)