[HN Gopher] Speeding up C++ build times
       ___________________________________________________________________
        
       Speeding up C++ build times
        
       Author : skilled
       Score  : 45 points
       Date   : 2024-04-27 09:34 UTC (1 days ago)
        
 (HTM) web link (www.figma.com)
 (TXT) w3m dump (www.figma.com)
        
       | chipdart wrote:
       | This blog entry is highly disappointing. The Figma blog post
       | reads as if they reinvented the wheel with basic information that
       | is not only widely known and understood but has also been
       | featured in books published decades ago.
       | 
       | The blog post authors would do well if they got up to speed on
       | the basics of working with C++ projects. Books such as
       | "Large-Scale C++, Volume I" by John Lakos already cover this and
       | much more.
        
         | SuperV1234 wrote:
         | Shameless plug from my talk:
         | https://youtube.com/watch?v=PfHD3BsVsAM
        
         | rileymat2 wrote:
         | It could be my bias, but it seems a lot of inexperienced
         | developers no longer read comprehensive books on topics but
         | survive on Google, Stack Overflow, some documentation with
         | examples/simple tutorials, blog posts and now GPTs.
         | 
         | All are useful tools, but they are very poor at eliminating
         | unknown unknowns the way a book would.
        
           | bdowling wrote:
           | You think only inexperienced developers have stopped reading
           | books?
        
             | rileymat2 wrote:
             | That's fair, I bet it is more widespread, but people
             | starting in the last 15-20 years did not even have the
             | initial introduction. I read a lot, it surprised me to find
             | out that it was atypical in an industry that is supposed to
             | be somewhat about leveraging brainpower. May as well stand
             | on the shoulders of giants.
        
       | LeSaucy wrote:
       | I use C++ on the daily and find that ccache and an M1/M2/M3 CPU
       | go a very long way toward reducing build times.
        
         | MathMonkeyMan wrote:
         | Not sure why this was downvoted. It's true that ccache and
         | build parallelization (e.g. icecream) can grease the wheels
         | enough that builds are no longer a dev cycle bottleneck.
         | 
         | What the article is about, though, is changing the source code
         | so that it is intrinsically faster to compile. At some point
         | you say "this program isn't complicated, why does it take so
         | long to compile?" Then you start looking at unnecessary
         | includes, transitive includes, forward declarations, excessive
         | inlining, etc.
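A minimal single-file sketch of the forward-declaration idea mentioned above. The names `Widget` and `Logger` and the file names in the comments are hypothetical, chosen only for illustration; in a real project the three sections would live in separate files.

```cpp
#include <cassert>
#include <string>

// --- widget.h (hypothetical) ---
// Widget only stores a pointer to Logger, so a forward declaration is
// enough here and the header does not need to #include "logger.h".
class Logger;

class Widget {
public:
    explicit Widget(Logger* logger) : logger_(logger) {
        assert(logger != nullptr);  // pointer comparison works on incomplete types
    }
    void poke();  // defined where Logger is a complete type

private:
    Logger* logger_;
};

// --- logger.h (hypothetical) ---
// The full definition, which only .cpp files would need to include.
class Logger {
public:
    void log(const std::string& msg) {
        last_ = msg;
        ++count_;
    }
    int count() const { return count_; }
    const std::string& last() const { return last_; }

private:
    std::string last_;
    int count_ = 0;
};

// --- widget.cpp (hypothetical) ---
// Logger is complete at this point, so its members can be called.
void Widget::poke() { logger_->log("poked"); }
```

With this layout, a change to the hypothetical logger.h would only force recompilation of files that include it directly, not of everything that includes widget.h.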
        
       | juunpp wrote:
       | Just another self-promotion blog post with near zero information
       | density. "Behold, the vastness of our vanity, regurgitating old
       | news like we just discovered something new." Do these posts
       | really help with hiring?
       | 
       | CLion also highlights unused includes, nothing new here. Use a
       | good IDE. A networked ccache also does wonders if your org allows
       | it.
       | 
       | Slow builds otherwise stem from a combination of: a) lack of
       | proper modules in C++ (until recently) and b) unidiomatic or just
       | terrible code bases. To help with the latter, hide physical
       | implementations (PIMPL for class state, forward declarations for
       | imports), avoid OOP-style C++ above all, minimize use of
       | templates, design sound and minimal modules. No rocket science.
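For readers unfamiliar with the PIMPL idiom mentioned above, here is a minimal single-file sketch. The `Account` class and the file names in the comments are hypothetical and not taken from the article or the thread; the point is only that the private state is declared in the header as an incomplete type and defined in one source file.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// --- account.h (hypothetical): all that client code would include ---
class Account {
public:
    explicit Account(std::string owner);
    ~Account();  // must be defined where Impl is complete
    Account(Account&&) noexcept;
    Account& operator=(Account&&) noexcept;

    void deposit(long cents);
    long balance() const;
    const std::string& owner() const;

private:
    struct Impl;                  // forward declaration only
    std::unique_ptr<Impl> impl_;  // class state lives behind this pointer
};

// --- account.cpp (hypothetical): the only file that sees the state ---
struct Account::Impl {
    std::string owner;
    long balance_cents = 0;
};

Account::Account(std::string owner) : impl_(std::make_unique<Impl>()) {
    impl_->owner = std::move(owner);
}
Account::~Account() = default;
Account::Account(Account&&) noexcept = default;
Account& Account::operator=(Account&&) noexcept = default;

void Account::deposit(long cents) {
    assert(cents >= 0);
    impl_->balance_cents += cents;
}
long Account::balance() const { return impl_->balance_cents; }
const std::string& Account::owner() const { return impl_->owner; }
```

The trade-off is one extra heap allocation and pointer hop per object, in exchange for clients not recompiling when members of `Impl` change.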
        
         | MathMonkeyMan wrote:
         | Three times I've joined a team that has a substantial C++
         | codebase, and three times I've been tempted to use libclang
         | based tooling to automate changes, or at least to identify
         | patterns that could be changed.
         | 
         | This article, while not the nerdy deep dive I'd like, does
         | touch on what happens when you try to do that. You realize that
         | the C++ standard library is really complicated, that your
         | existing code is really fucked up, and that libclang is too
         | limited a tool. You end up writing an XSLT engine in hacked up
         | Python, but by a different name.
         | 
         | [LibTooling][1] is probably The Right Thing ("in C++", as the
         | article says), but I never spent the time to get it working.
         | 
         | Somebody write a DSL for C++ inspection and transformations
         | that uses LibTooling as a backend. I bet there are many, but
         | none close at hand.
         | 
         | edit: [this][2] is close...
         | 
         | [1]: https://clang.llvm.org/docs/LibTooling.html
         | [2]: https://clang.llvm.org/docs/LibASTMatchersTutorial.html#inte...
        
         | stefan_ wrote:
         | Is there anyone else that gets unreasonably angry at stuff like
         | PIMPL? It is truly the most braindead, bereft of sense activity
         | in this world. There was a comment in one of the many Rust
         | threads that called C++ a respectable language now but then
         | things like PIMPL snap you right back into the wasteland it is.
        
           | wakawaka28 wrote:
           | PIMPL is an elegant solution to multiple problems. Idk what
           | you could possibly have against it besides the extra work
           | involved. I don't think any language has solved the
           | fundamental problem of hiding details better than PIMPL does.
        
       | Scubabear68 wrote:
       | I find it hard to believe that this post indicates that C++ build
       | times are proportional to included bytes, period.
       | 
       | I haven't used C++ in quite a while, but aren't templates a big
       | part of this issue?
        
         | flohofwoe wrote:
         | It becomes believable when you consider that your own code is
         | just a very tiny appendix dangling off the end of a massive
         | chunk of included data. For instance just including <vector>
         | results in a 24kloc compilation unit of gnarly template code in
         | Clang with C++23:
         | 
         | https://www.godbolt.org/z/G18WGdET5
         | 
         | ...add <string> and <algorithm> and you're at 45kloc:
         | 
         | https://www.godbolt.org/z/Whv73YPYh
         | 
         | ...and those numbers have been growing steadily by a couple
         | thousand lines in each new C++ version.
         | 
         | Multiply this by a few thousand source files (not atypical
         | with the old 'clean code' rule to prefer small source files,
         | e.g. one file per class), and that's already dozens to hundreds
         | of millions of lines of code the compiler needs to process on a
         | full rebuild, all spent on compiling <vector> over and over
         | again.
         | 
         | TL;DR: the most effective way to improve build times in C++ is
         | to split your project into few big source files instead of many
         | small files (either manually, e.g. one big source file per
         | 'system', or let the build system take care of it via 'unity'
         | or 'jumbo' builds).
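The arithmetic behind the estimate above can be sketched as follows. The 24 kloc and 45 kloc figures come from the comment; the file count of 2,000 and the chunk size of 50 files per unity source are assumptions picked purely for illustration, and the model counts only included header lines, not the project's own code.

```cpp
#include <cassert>

// Header lines the compiler processes in a full rebuild when every
// translation unit (TU) pulls in the same set of headers.
long long header_lines_compiled(long long lines_per_tu, long long num_tus) {
    assert(lines_per_tu >= 0 && num_tus >= 0);
    return lines_per_tu * num_tus;
}

// Number of TUs left after grouping source files into unity chunks
// (rounded up so a partial chunk still counts as one TU).
long long unity_tu_count(long long num_files, long long files_per_unity) {
    assert(num_files >= 0 && files_per_unity > 0);
    return (num_files + files_per_unity - 1) / files_per_unity;
}
```

Under those assumptions, `header_lines_compiled(24000, 2000)` gives 48 million lines for `<vector>` alone and `header_lines_compiled(45000, 2000)` gives 90 million with `<string>` and `<algorithm>` added. Grouping the same 2,000 files into 40 unity sources brings the header share down to 1.8 million lines.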
        
           | mgaunard wrote:
           | Except that doesn't improve the time of iterative builds,
           | which are the only ones that really matter to software
           | development.
        
             | flohofwoe wrote:
             | It does though for header changes, which then may trigger
             | fewer source file compilations. IME in incremental builds
             | the most time is spent in the linker anyway.
             | 
             | > which are the only ones that really matter to software
             | development.
             | 
             | Debatable in this age of cloud CI builds ;)
        
       | wifijammer wrote:
       | I really hate the repetition from separating out header and
       | definition files so I've been writing my whole codebase headers
       | only.
       | 
       | I feel like this kills my compile time but I'm not sure how to
       | fix it. Precompiled headers?
        
         | flohofwoe wrote:
         | If you have all your application code in headers you'll get the
         | fastest build times (for full rebuilds at least) by including
         | _all_ headers into a single main.cpp file and build just that,
         | since that way there is no redundant code for the compiler to
         | build at all.
         | 
         | Of course the downside is that every tiny code change triggers
         | a full rebuild then, but it's quite likely that the most time
         | is spent in the linker anyway, so maybe worth a try.
        
           | runevault wrote:
           | I think I've heard this called a Unity build where there's a
           | precompile step that just dumps everything into a single file
           | and compiles that so it doesn't have to re-include everything
           | at different compilation units (when I first heard the term I
           | got confused because it was in a game dev context but had
           | nothing to do with the Unity engine lol).
        
             | flohofwoe wrote:
             | Yes, it's typically called unity or jumbo build, and can
             | also be done with regular source files.
             | 
             | CMake has a feature to do that automatically during the
             | build:
             | https://cmake.org/cmake/help/latest/prop_tgt/UNITY_BUILD.htm...
             | 
             | ...haven't tinkered with it yet though.
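The "precompile step" described above can be sketched as a small generator that emits one `#include` line per source file. This is not how CMake implements its feature; `make_unity_source` is a hypothetical helper written for illustration only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds the text of a unity ("jumbo") source file: one #include line per
// original .cpp file, so the compiler sees them all as a single translation
// unit and parses shared headers once.
std::string make_unity_source(const std::vector<std::string>& sources) {
    std::string out = "// Generated unity source: do not edit by hand.\n";
    for (const std::string& path : sources) {
        assert(!path.empty());
        out += "#include \"" + path + "\"\n";
    }
    return out;
}
```

Writing the returned string to, say, `unity_0.cpp` and compiling only that file is the whole trick. One known caveat: file-local names (`static` functions, anonymous namespaces) from different .cpp files now share a translation unit and can collide.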
        
         | ranger_danger wrote:
         | ccache is one solution, or a script/IDE plugin that will create
         | both the header/definition from a signature you provide?
        
       | mgaunard wrote:
       | DIWYDU sounds like a better tool than IWYU.
        
       | snypehype46 wrote:
       | Coincidentally, in the project I'm currently working on I managed
       | to reduce our compile times significantly (~35% faster) using
       | ClangBuildAnalyzer [1]. The main two things that helped were
       | precompiled headers and explicit template instantiations.
       | 
       | Unfortunately, the project still remains heavy because of our use
       | of Eigen throughout the entire codebase. The analysis with
       | Clang's "-ftime-trace" shows that 75-80% of the compilation time
       | is spent in the optimisation stage, but I'm not really sure what
       | to do about that.
       | 
       | [1] https://github.com/aras-p/ClangBuildAnalyzer
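A minimal sketch of the explicit template instantiation technique mentioned above, shown in a single file. `RunningMean` and the file names in the comments are hypothetical and unrelated to the commenter's project or to Eigen.

```cpp
#include <cassert>
#include <cstddef>

// --- stats.h (hypothetical) ---
template <typename T>
class RunningMean {
public:
    void add(T value) {
        sum_ += value;
        ++count_;
    }
    T mean() const {
        assert(count_ > 0);
        return sum_ / static_cast<T>(count_);
    }

private:
    T sum_ = T{};
    std::size_t count_ = 0;
};

// Explicit instantiation declaration: asks every includer to rely on an
// instantiation provided elsewhere instead of generating its own copy.
extern template class RunningMean<double>;

// --- stats.cpp (hypothetical) ---
// Explicit instantiation definition: the one place where the code for
// RunningMean<double> is generated.
template class RunningMean<double>;
```

How much this saves depends on how expensive the template is to instantiate and how many translation units use it. Compilers may still instantiate inline members locally for optimisation purposes, so measuring with a tool like the one linked above is the safer guide.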
        
       | kjksf wrote:
       | I wrote about how I keep build times sane in SumatraPDF at
       | https://blog.kowalczyk.info/article/96a4706ec8e44bc4b0bafda2...
       | 
       | The idea is the same: reduce the duplicate parsing of .h files.
       | 
       | I don't use any tools, just a hard-core discipline of only
       | #include'ing .h in .cpp files.
       | 
       | The problem is that if you start #include'ing .h in .h, you
       | quickly start introducing duplication that is intractable, for a
       | human, to avoid.
       | 
       | On another note: C++ compiler should by default keep statistics
       | about the chain of #include's / parsing during compilation and
       | dump it to a file at the end and also summarize how badly you're
       | re-parsing the same .h files during build.
       | 
       | That info would help people remove redundant #include's.
       | 
       | But of course even if they do have such options, you have to turn
       | on some flags and they'll spam your build output instead of
       | writing to a file.
        
         | sfpotter wrote:
         | How much of a speedup did you get switching over to Rob Pike
         | style includes?
        
       | petermcneeley wrote:
       | The video game industry uses bulk builds (master files), which
       | group all .cc files into very large single .cc files. The
       | speedups here are like 5-10x at least. These bulk files are sent
       | to other developers' machines with possible caching. The result
       | is 12 min builds instead of 6 hours.
        
       | jupp0r wrote:
       | Bazel + remote workers yields a great user experience with a
       | small infrastructure footprint per developer, but requires quite
       | a bit of work to set up initially. You get reproducible builds,
       | caching of test results and blazingly fast CI as a side effect.
        
       ___________________________________________________________________
       (page generated 2024-04-28 23:00 UTC)