[HN Gopher] Include-what-you-use: A tool to analyze includes in ...
___________________________________________________________________
Include-what-you-use: A tool to analyze includes in C and C++
source files
Author : st_goliath
Score : 105 points
Date : 2021-04-20 14:43 UTC (8 hours ago)
(HTM) web link (include-what-you-use.org)
(TXT) w3m dump (include-what-you-use.org)
| qbonnard wrote:
| I haven't used C++ in a while (sadly), but in my days I had
| stumbled upon deheader[0] which I don't see mentioned here. From
| what I remember, it was very simple and easy to use, and yielded
| useful results.
|
| [0] http://www.catb.org/~esr/deheader/deheader.html
| [deleted]
| eliora wrote:
| i want to be a hacker
| MaxBarraclough wrote:
| Welcome to HN.
|
| Please check out the Guidelines and the FAQ, [0][1] regarding
| how best to participate. HN is friendly to curiosity, but not
| to off-topic comments, which is why you've been downvoted. If
| you'd like to discuss how to become a programmer, either find a
| thread where that's being discussed, or submit an _Ask HN_
| thread, following the style of this thread [2].
|
| [0] https://news.ycombinator.com/newsguidelines.html
|
| [1] https://news.ycombinator.com/newsfaq.html
|
| [2] https://news.ycombinator.com/item?id=24810399
| anarazel wrote:
| I found IWYU pretty annoying, due to its tendency to also include
| transitive includes. Some of those often turn out to be
| implementation details and are much more likely to be
| added/removed. But maybe the projects using it that I worked on
| were using it wrong?
| anand-bala wrote:
| If the issue is with including transitive dependencies that are
| in your own codebase, then you should annotate the public
| interface header that wraps the implementation details with IWYU
| pragmas [1] that export the implementation headers (for example
| [2]).
|
| If this is in third-party libraries, you can use IWYU Mappings
| [3] to map the "private" headers (usually the transitive
| include) to the public interface. See [4] for an example that I
| use for the PEGTL library.
|
| [1]: https://github.com/include-what-you-use/include-what-you-use...
|
| [2]: https://github.com/anand-bala/signal-temporal-logic/blob/800...
|
| [3]: https://github.com/include-what-you-use/include-what-you-use...
|
| [4]: https://github.com/anand-bala/signal-temporal-logic/blob/800...
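|
| A minimal sketch of the pragma approach described above, assuming a
| hypothetical library whose public header `geometry.h` wraps a
| private header `detail/point_impl.h` (both names are invented for
| illustration). The three files are shown in one listing, with the
| `#include` lines between them commented out so that it compiles as
| a single file; in a real tree each section would be its own file.

```cpp
#include <cstdlib>  // std::abs

// ---- detail/point_impl.h (hypothetical private header) ----
// Asks IWYU to suggest "geometry.h" instead of this file:
// IWYU pragma: private, include "geometry.h"
struct Point {
    int x;
    int y;
};

// ---- geometry.h (hypothetical public header) ----
// Re-exports the private header, so including geometry.h counts as
// providing Point:
// #include "detail/point_impl.h"  // IWYU pragma: export

// ---- user.cpp (hypothetical client code) ----
// #include "geometry.h"
int manhattan_length(const Point& p) {
    return std::abs(p.x) + std::abs(p.y);
}
```

| For a third-party library that cannot be annotated, the mapping
| file plays the same role. An entry would look roughly like
| `{ include: ["<lib/detail/impl.h>", "private", "<lib/lib.h>",
| "public"] }`, where both header names are placeholders.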
| johnnyapol wrote:
| I think it definitely can be a project thing. My experience
| with IWYU has been on very large codebases and I considered its
| ability to find transitive includes a blessing. The specific
| case where it shined for me was it made it much easier to
| identify the true impact of fileset changes on the larger
| codebase when it came to refactoring.
| Blikkentrekker wrote:
| I rather prefer _OCaml_'s way of doing things, which often
| releases one entirely from having to write module inclusion
| directives, since in general bindings are qualified by their module
| name which becomes part of their namespace. So one would use
| `Array.map` in code, which is the `map` binding exported by the
| `Array` module, and the `Array` module is then included
| automatically, of course. This would be `array_map` in many
| languages to avoid conflicts, but modules in OCaml deliberately
| export short names on the expectation that bindings will be
| namespace qualified with their module name.
|
| It is possible to explicitly open this module, so that one can
| use `map` instead, but that's generally not wise.
|
| I find having to write a long list of include directives at the
| top of a file quite annoying, and it also does not reveal in
| which module exactly the bindings one might encounter in the code
| below are defined. If I encounter, say, `Net.Tcp.open` in
| _OCaml_ code, I know that this function is defined in
| `./net/tcp.ml`.
| vbernat wrote:
| Is that robust? Depending on the system, libc, compiler, some
| includes may be unused while others may be needed.
| quantumofalpha wrote:
| iwyu is Google's project originally. It has worked for them for
| more than a decade on their ginormous monorepo.
|
| Sometimes it gets some things wrong, so you have these escape
| hatches to control it:
| https://github.com/include-what-you-use/include-what-you-use...
| cperciva wrote:
| Even with a monorepo this isn't necessarily safe -- if you
| have a mix of x86 and arm servers, you'll need different
| headers included for intrinsics for example.
| quantumofalpha wrote:
| Conditionally-off blocks of code under #ifdefs are
| challenging for it, yes - it runs a proper C++ compiler on
| your code and won't get to see code in those blocks without
| the right defines.
|
| Don't blindly apply its suggestions - test them, skim to
| see what it got wrong, sprinkle some "// IWYU pragma: keep"
| to help it out in corner cases. The tool is more like a
| linter, you don't follow everything that your linter tells
| you to, no?
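|
| A small sketch of the architecture-specific case discussed above.
| The helper `sum4` is hypothetical; the point is only that each
| `#include` sits in a branch that a single IWYU run may or may not
| see, and that `// IWYU pragma: keep` asks the tool to leave the
| line alone.

```cpp
#include <cstdint>

// Each branch is visible only when its target macro is defined, so an
// IWYU run on one architecture never analyzes the other branch.
#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>  // IWYU pragma: keep
#elif defined(__aarch64__)
#include <arm_neon.h>  // IWYU pragma: keep
#endif

// Sums four 32-bit lanes (hypothetical helper for illustration).
std::uint32_t sum4(const std::uint32_t (&v)[4]) {
#if defined(__x86_64__) || defined(_M_X64)
    __m128i x = _mm_loadu_si128(reinterpret_cast<const __m128i*>(v));
    // Add the upper half onto the lower half, then the odd lanes onto
    // the even ones; lane 0 ends up holding the total.
    x = _mm_add_epi32(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2)));
    x = _mm_add_epi32(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(2, 3, 0, 1)));
    return static_cast<std::uint32_t>(_mm_cvtsi128_si32(x));
#elif defined(__aarch64__)
    return vaddvq_u32(vld1q_u32(v));
#else
    return v[0] + v[1] + v[2] + v[3];  // portable fallback
#endif
}
```

| Running the tool once per target configuration and comparing the
| suggestions is one way to catch includes that only one platform
| needs.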
| ur-whale wrote:
| What I've always wanted is to write C++ code and have the minimal
| set of necessary includes needed to compile my code automatically
| added [edit: I should have said "managed"] by the IDE.
|
| How close can this tool get to that goal?
| burntoutfire wrote:
| I've just Googled the same question. The answers seem to
| glorify the suffering of writing C++ and suggest that the
| inquirer would perhaps be better off with switching to Java...
| Sounds like a case of Stockholm syndrome to me.
|
| Anyway, I'm a beginner in C/C++ world and the most convincing
| solution I've found to use in my personal project is the Single
| Compilation Unit approach
| (https://en.wikipedia.org/wiki/Single_Compilation_Unit). It is
| exemplified in the Handmade Hero github repository (which I'm
| afraid is available for paying users only). Essentially, the
| whole program is divided into modules, each within its own
| single cpp file. The modules are then all included in the SCU,
| which is the only file passed to the compiler. There can be no
| circular dependencies between modules (as then, there would be
| no order of including them in SCU which would work). In HH's
| case, there seems to be an absolutely minimal number and volume
| of headers and they only define data structures, never declare
| functions.
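|
| A rough sketch of the single compilation unit layout described
| above. The module names (`vec2.cpp`, `physics.cpp`) and functions
| are invented for illustration; in a real project each section
| would be its own file, and the last section would be the only file
| passed to the compiler.

```cpp
// ---- vec2.cpp (hypothetical module with no dependencies) ----
struct Vec2 {
    float x;
    float y;
};

static Vec2 add(Vec2 a, Vec2 b) {
    return {a.x + b.x, a.y + b.y};
}

// ---- physics.cpp (hypothetical module; uses vec2.cpp, so it must be
// included after it) ----
static Vec2 step(Vec2 pos, Vec2 vel, float dt) {
    return add(pos, {vel.x * dt, vel.y * dt});
}

// ---- scu.cpp (the one file handed to the compiler) ----
// #include "vec2.cpp"
// #include "physics.cpp"
// Include order is the dependency order, which is why circular
// dependencies between modules cannot work in this scheme.
```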
| aflag wrote:
| CLion is capable of adding missing includes, I'm not sure if it
| tells you about unused ones. They have a free trial, may be
| worth a try.
| inetknght wrote:
| My experience with IWYU has been mixed. In general it's a
| success. But it had trouble identifying that some headers were
| only conditionally needed (eg, debug build or macro conditional).
| Those cases are easy to work with if you own the code but can be
| annoying if it's in a third party lib.
|
| That said, I do highly recommend its use.
| anand-bala wrote:
| I've found that using IWYU Pragmas [1] for codebases you own
| and IWYU Mappings [2] for third-party libraries __almost__
| entirely eliminates weird IWYU suggestions (there are a few
| annoyingly stupid suggestions from the tool I just ignore).
|
| I've also recently been making libraries I write compatible
| with users that run IWYU by annotating all public headers with
| IWYU pragma comments that export symbols/transitive includes
| correctly, etc.
|
| [1]: https://github.com/include-what-you-use/include-what-you-use...
|
| [2]: https://github.com/include-what-you-use/include-what-you-use...
| marcodiego wrote:
| IWYU is responsible for many lines of code that have been removed
| from libreoffice:
| https://cgit.freedesktop.org/libreoffice/core/log/?qt=grep&q...
| dasloop wrote:
| C++ 20 Modules will save us (eventually)
| Kranar wrote:
| How so? It will switch the problem from include what you use to
| import what you use.
|
| Other languages with modules have a similar issue. Go is the
| only language I know of that makes it a hard compiler error to
| import an unused module.
| ot wrote:
| Modules won't allow relying on transitive includes, which is
| one half of the problem. It won't solve the other half
| (importing too much).
| wyldfire wrote:
| Kinda. I think a primary use case for modules is to help with
| out-of-control compile times.
|
| But the specific problem of include-what-you-use will still be
| encountered if you include directly from C libraries like
| system headers or library dependencies.
| Kranar wrote:
| Unfortunately modules do not have a significant impact on
| compile times and in some cases can increase compile times
| due to inhibiting parallelism.
| pjmlp wrote:
| VC++ already does multithreaded code generation across
| multiple compiler phases, using modules won't change that.
| Kranar wrote:
| No it doesn't, cl.exe's compiler is an inherently
| single-threaded application. Parallelism in VC++ is achieved by
| running multiple copies of cl.exe with one serving as the
| primary instance and the rest as followers. The primary
| instance forwards individual translation units to the
| followers and waits for the followers to complete
| compilation, then at the end the primary instance
| terminates and the linker is invoked.
| pjmlp wrote:
| Not up to date?
|
| https://docs.microsoft.com/en-us/cpp/build/reference/cgthrea...
| Kranar wrote:
| That is a linker option, not a compiler option. Modules
| have no effect on linking one way or another as linking
| is fairly independent of the compilation process.
| pjmlp wrote:
| I mentioned code generation, you don't execute .obj
| files.
| Kranar wrote:
| Then your comment is off-topic and creates confusion. My
| point was modules inhibit the parallelism of the
| compilation process, compile times, not that it has any
| effect on the link times.
|
| Modules do not have any effect on the linker one way or
| another. They are independent of it.
| pjmlp wrote:
| Modules will bring compiler and linker work closer together,
| just like other languages not tainted by UNIX toolchain
| model.
|
| Some C++ developers can keep using their pre-historic
| UNIX like tooling, whereas others will embrace the fusion
| of compiler, linker and build system.
| gsliepen wrote:
| This is not true. Compile times are usually much better
| with modules. They also don't inhibit parallelism, but
| perhaps you are referring to this paper
| (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p144...),
| which shows that, with compiler versions from 2019, it can indeed
| be slower to compile with modules if you have a large
| number of threads and the depth of the module dependency
| graph is large.
| Kranar wrote:
| Yes that's the correct benchmark.
|
| Do you have evidence that the situation has changed? Last
| I checked it still remains the case that modules inhibit
| parallelism and hence result in slower builds in most
| practical work loads. But of course if you have evidence to
| the contrary I'd be happy to see it.
| gsliepen wrote:
| I don't know of any newer benchmarks. However, I'm
| reading the results differently I guess, because the
| results show that with 128 threads, modules become slower
| only when the DAG depth is higher than 29, and that's
| quite a large depth! It also looks like each source file
| used in the benchmark only imports other modules and
| declares 300 variables, but nothing else. Practical
| workloads will have more interesting stuff in the source
| files, so I would expect the impact of module loading to
| be less, so more can be done in parallel.
| account4mypc wrote:
| > with 128 threads, modules become slower only when the
| DAG depth is higher than 29
|
| yeah, but the same graph _never_ shows modules being
| faster... it only ever shows them being the same or
| slower. If I'm going to put in all that work, the result
| should be *faster*
| volta83 wrote:
| > This is not true. Compile times are usually much better
| with modules.
|
| What significantly improves compile-times is Pre-Compiled
| Headers (PCH), which most compilers have supported for
| decades.
|
| The study you mention does not show data for them.
|
| Having ported one >1 million LOC C++ app to use modules
| in two compilers, the compile time improvement of modules
| over PCH was not distinguishable from noise.
|
| Modules have many advantages, like better encapsulation,
| etc.
|
| The main thing people want from them seems to be better
| compile times, which is the one thing they don't deliver,
| at least over the PCH solutions that have existed for
| decades, are already supported by all build systems, etc.
|
| Compared to modules, PCHs are "zero-effort" and deliver
| performance instantaneously.
| colomon wrote:
| Off-topic, but is there a guide to best practices for
| portable pre-compiled headers out there somewhere? I'm
| under considerable pressure to add pre-compiled headers
| for Windows to my code, and it won't have any significant
| benefit for me unless I can also make it work on MacOS
| and Linux. So far my Googling has turned up little
| information for any platform other than Windows, and
| nothing that would suggest how to do it well for all
| three platforms. (Well, more to the point, Visual C++,
| clang, and g++.)
| volta83 wrote:
| Does your project use CMake ?
|
| CMake supports these with all major compilers...
|
| I'd just google "<your build system> pre-compiled
| headers" and see if there is a flag or option that you
| can enable.
|
| You will definitely need quite a bit of fine tuning for
| apps over 500k LOC or so, but if your app is under that,
| and you are splitting code between .h and .cpp files
| appropriately, just flipping a flag might get you 80%
| there.
|
| The speed-ups you see people get from PCHs are like 20-30%
| faster compile-times. So they are more a "nice to have"
| feature than something that will solve your compile-time
| problems.
|
| If your app is structured in such a way that it takes 20
| min to compile, this can cut it to 15 min at most, but
| that would probably still suck. If you want more, then
| you'd need to consider other solutions like distributed
| build caches (sccache, etc.).
| jcelerier wrote:
| With cmake it's just target_precompile_headers:
| https://cmake.org/cmake/help/latest/command/target_precompil...
| dasloop wrote:
| My understanding is just the opposite, they will decrease
| compilation times as "included files" are processed just
| once. We can see them as a better version of precompiled
| headers (although they are more than that).
| Kranar wrote:
| Yes except that includes are usually not the performance
| bottleneck, it's the semantic analysis that consumes the
| bulk of the compile times.
|
| Modules inhibit parallelism because modules are ordered
| along a DAG and must be compiled from the root of the DAG
| down to the leaves in order. So consider a traditional
| setup as follows:
|
| A.cpp <- A.h <- B.h <- C.h <- D.h
|
| B.cpp <- B.h <- C.h <- D.h
|
| C.cpp <- C.h <- D.h
|
| D.cpp <- D.h
|
| All four of those cpp files can be built in parallel,
| even though you're right that all of the header files are
| being reparsed multiple times. My claim is that parsing
| header files is incredibly cheap, it's translating the
| .cpp files that's expensive because cpp files are where
| the bulk of the semantic analysis and type checking is
| performed.
|
| With modules, the same compilation model looks like this:
|
| A.mxx <- B.mxx <- C.mxx <- D.mxx
|
| There's no longer header/source and there's no longer
| redundancy, but I can't build this in parallel anymore. I
| have to first build D.mxx, then C.mxx, then B.mxx then
| A.mxx in serial.
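|
| The ordering argument above can be sketched as a longest-chain
| computation over the dependency graph. This is only a toy model
| under stated assumptions: unlimited parallel workers, every unit
| costing the same, and a hypothetical `serial_stages` helper that
| is not part of any build tool. It says nothing about how long each
| stage actually takes.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

namespace {

// Stage in which `unit` can be built: one more than the latest stage
// among the units it must wait for. `memo` caches results (0 = unset).
int stage_of(int unit, const std::vector<std::vector<int>>& deps,
             std::vector<int>& memo) {
    if (memo[unit] != 0) {
        return memo[unit];
    }
    int stage = 1;
    for (int d : deps[unit]) {
        stage = std::max(stage, stage_of(d, deps, memo) + 1);
    }
    memo[unit] = stage;
    return stage;
}

}  // namespace

// deps[i] lists the units that unit i must wait for. The input is
// assumed to be acyclic. Returns the number of build stages that have
// to run one after another even with unlimited parallel workers.
int serial_stages(const std::vector<std::vector<int>>& deps) {
    std::vector<int> memo(deps.size(), 0);
    int longest = 0;
    for (std::size_t u = 0; u < deps.size(); ++u) {
        longest = std::max(longest,
                           stage_of(static_cast<int>(u), deps, memo));
    }
    return longest;
}
```

| In this model the header layout has four .cpp files that wait for
| nothing, so it needs 1 stage. The module chain, with A waiting for
| B, B for C and C for D, needs 4 stages.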
| [deleted]
| dbaupp wrote:
| Parsing a single header file in isolation is cheap, but
| each header will include others, and templates mean many
| headers contain large amounts of code inline. For
| instance, just including <vector> results in the compiler
| having to look at almost 30kloc, on my system:
| $ clang -x c++ -E - <<<"#include <vector>" | wc -l
| 27378
|
| Other headers are similar:
|
| algorithm  23103
| array      23450
| memory     15909
| random     52107
| thread     31424
| tuple       9240
|
| (Of course, a bunch of this code is shared, e.g.
| including both thread and vector is "only" 35713 loc
| total, not 60kloc.)
|
| I believe C++ compilers have SIMD-accelerated
| lexers/parsers because of the sheer explosion of code due
| to headers and templates.
| bradford wrote:
| semi related, but I'm coming back to C++ after a long hiatus (15
| years). I realize this is probably a newb question...
|
| The code base I'm working in is very large and I have a recurring
| problem where I see a term (class/variable/etc) being used in a
| cpp file, and want to know which header file contains the
| definition.
|
| What's the quickest, easiest way to do this?
|
| I've been using grep, but the size of the code base, combined
| with the large number of #includes in each cpp file, makes this
| inefficient.
|
| I believe I can use ctags/vim, but I last used that circa 2000
| and I'm curious to know what other static analysis solutions have
| cropped up since then.
|
| Does IWYU address this scenario? I'm using clang as a compiler if
| that's at all relevant.
| blcArmadillo wrote:
| There are lots of options:
|
| - Yes, ctags/vim would work
|
| - You could use something like vscode
|
| - Consider checking out cscope. With cscope you can also build
| a reverse index which lets you find where things are called. It
| can be used with something like vim but also has a pretty nice
| TUI.
| drummer wrote:
| If you use Visual Studio, it is as easy as right clicking on
| the typename or variable and choosing to go to the declaration
| or definition.
| inetknght wrote:
| > _The code base I'm working in is very large and I have a
| recurring problem where I see a term (class/variable/etc) being
| used in a cpp file, and want to know which header file contains
| the definition._
|
| A good IDE will have a feature to let you locate the
| declaration and/or definition of any variable or type.
|
| I've found that a lot of IDEs have that feature completely
| broken. Qt Creator, for example, is easily confused and comes
| with all kinds of Qt garbage^H^H^H^H^H^H^H^H baggage. CLion is
| a resource hog and often just hangs. Visual Studio is usually
| pretty good -- assuming you're using Windows. VS _Code_ is
| "okay" but I've found it's more of a headache to set up. I
| don't have experience with XCode since I've never used OSX for
| development.
|
| I've found the most reliable way is to learn how to use `grep`
| and pair that with understanding _where_ to search; the project
| source directory of course but also system headers and any
| libraries installed to non-system locations. That knowledge
| translates to usefulness in other workflows too.
| anand-bala wrote:
| In most cases, what you are looking for is a language server
| like `clangd` (works for most compilers) [1].
|
| You can find a Language Server Protocol implementation for your
| editor at [2] (I don't think it lists __all__ clients, but it
| should include the most popular ones).
|
| EDIT: I realized that this is a vague answer, so let me
| clarify.
|
| An LSP implementation (especially clangd) provides actions like
| `go-to definition` or `find references` that you would find in
| full-featured IDEs like CLion (which is also amazing BTW).
| Since you mentioned vim, I am guessing you use it and don't
| necessarily want to let go of the hand-crafted vimrc you have
| created. Adding an LSP plugin to Vim is incredibly easy and
| gives you these "IDE" features with customizable mappings.
|
| [1]: https://clangd.llvm.org/
|
| [2]: https://langserver.org/#implementations-client
| bradford wrote:
| Thanks! I read about using LSP/Clangd with vim via
| [coc](https://github.com/clangd/coc-clangd) and I think
| that's the path I'll try going down.
|
| Other responses, thanks for your input. Just want to clarify
| that I have tried VS and VSCode with limited success
| (sometimes search works, sometimes it doesn't, and my biggest
| gripe is an occasional lack of transparency into what's going
| on under the cover). I think any solution is going to require
| some investment on my part and LSP sounds like a good
| investment.
| f00zz wrote:
| I use ctags+vim every day with rather large C++ codebases (but
| then again I'm a dinosaur).
| bradford wrote:
| Curious, I was under the impression that ctags offers a 'jump
| to definition' functionality, but little more. (i.e., 'find
| all references' isn't supported).
|
| Is that correct? Do you use it for functionality beyond the
| 'jump to definition/jump back to previous context'?
| MauranKilom wrote:
| Take note:
|
| > CAVEAT
|
| > This is alpha quality software -- at best (as of July 2018). It
| was originally written to work specifically in the Google source
| tree, and may make assumptions, or have gaps, that are
| immediately and embarrassingly evident in other types of code.
|
| > While we work to get IWYU quality up, we will be stinting new
| features, and will prioritize reported bugs along with the many
| existing, known bugs. The best chance of getting a problem fixed
| is to submit a patch that fixes it (along with a test case that
| verifies the fix)!
|
| https://github.com/include-what-you-use/include-what-you-use...
|
| Further useful docs:
|
| Why Include What You Use?
| https://github.com/include-what-you-use/include-what-you-use...
|
| What Is A Use?
| https://github.com/include-what-you-use/include-what-you-use...
|
| Why Include What You Use Is Difficult
| https://github.com/include-what-you-use/include-what-you-use...
| dang wrote:
| One past related discussion:
|
| _Include-what-you-use: Clang tool to analyze includes in C and
| C++ source files_ - https://news.ycombinator.com/item?id=10958186
| - Jan 2016 (40 comments)
___________________________________________________________________
(page generated 2021-04-20 23:01 UTC)