https://blogs.gentoo.org/mgorny/2021/02/19/the-modern-packagers-security-nightmare/

Michał Górny -- Retroactively fixing the world

The modern packager's security nightmare
Posted on 2021-02-19 (updated 2021-02-21) by Michał Górny

One of the most important tasks of a distribution packager is to ensure that the software shipped to our users is free of security vulnerabilities. While finding and fixing vulnerable code is usually considered upstream's responsibility, the packager needs to ensure that all these fixes reach end users as soon as possible. With the aid of central package management and dynamic linking, Linux distributions have pretty much perfected the deployment of security fixes. Ideally, fixing a vulnerable dependency is as simple as patching a single shared library via the distribution's automated update system.

Of course, this works only if the package in question actually follows good security practices. Over the years, many Linux distributions (at the very least Debian, Fedora and Gentoo) have been fighting these bad practices with some success. However, the times have changed. Today, for every ten packages fixed, a completely new ecosystem emerges with the bad security practices at its very heart. Go, Rust and, to some extent, Python are just a few examples of programming languages that have integrated bad security practices into the very fabric of their existence, and recreated the same old problems in entirely new ways.

The root issue of bundling dependencies has been discussed many times before. The Gentoo Wiki explains why you should not bundle dependencies, and links to more material about it. I would like to take a somewhat wider approach, and discuss not only bundling (or vendoring) dependencies but also two closely related problems: static linking and pinning dependencies.

Static linking

In the simplest words, static linking means embedding your program's dependencies directly into the program image.
The term is generally used in contrast to dynamic linking (or dynamic loading), which keeps the dependent libraries in separate files that are loaded at the program's startup (or at runtime).

Why is static linking bad? The primary problem is that, since the dependencies become an integral part of the program, they cannot easily be replaced by another version. If it turns out that one of the libraries is vulnerable, you have to relink the whole program against the new version. This also implies that you need a system that keeps track of which library versions are used in individual programs.

While you might think that rebuilding a lot of packages is only a problem for source distributions, you would be wrong. Indeed, the users of source distributions can be impacted a lot, as their systems remain vulnerable for the long time needed to rebuild many packages; but a similar problem affects binary distributions. After all, the distributions need to rebuild all affected programs in order to fully ship the fix to their end users, which also involves some delay. Comparatively, shipping a new version of a shared library takes much less time and fixes all affected programs almost instantly (modulo the necessity of restarting them).

The extreme case of static linking is proprietary software distributed statically linked to its dependencies. This is primarily done to ensure that the software can run easily on a variety of systems without requiring the user to install its dependencies manually. However, this scenario is really a form of bundling dependencies, so it will be discussed in the respective section. Static linking has also historically been used for system programs that were meant to keep working even if their dependent libraries became broken.

In modern packages, static linking is used for another reason entirely -- it frees modern programming languages from having to maintain a stable ABI.
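The practical difference between the two linking modes is easy to see on any Linux system. A minimal sketch, assuming a glibc-based system where the `ldd` utility is available:

```shell
# A dynamically linked program names its shared libraries in the ELF
# header; the loader resolves them from separate files at startup.
# Replacing e.g. libc.so.6 on disk therefore fixes every program that
# uses it, with no relinking.
ldd /bin/sh
# Typical output (paths and addresses vary per system):
#   linux-vdso.so.1 (0x...)
#   libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x...)

# For a statically linked binary, ldd instead reports
# "not a dynamic executable": the library code is baked into the image,
# and only relinking the program can update it.
```

Note that `ldd` only shows dynamic dependencies; statically linked code leaves no such trace, which is exactly the tracking problem described above.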
The Go compiler does not need to be concerned with emitting code that is binary compatible with the code produced by a previous version. It works around the problem by requiring you to rebuild everything every time the compiler is upgraded.

Following best practices, we strongly discourage static linking in C and its derivatives. However, we can't do much about languages such as Go and Rust that put static linking at the core of their design, and whose developers have stated publicly, time and again, that they will not switch to dynamic linking of dependencies.

Pinning dependencies

While static linking is bad, it at least provides a reasonably clear way for automatic updates (and therefore the propagation of vulnerability fixes) to happen. Pinning dependencies means requiring a specific version of your program's dependencies to be installed. While the exact results depend on the ecosystem and the exact way of pinning, it generally means that at least some users of your package will not be able to automatically update the dependencies to newer versions.

That might not seem so bad at first. However, it means that if a bug fix or -- even more importantly -- a vulnerability fix is released for the dependency, the users will not get it unless you update the pin and make a new release. And then, if somebody else pins your package, that pin will also need to be updated and released. And the chain goes on. Not to mention what happens if some package happens to indirectly pin two different versions of the same dependency!

Why do people pin dependencies? The primary reason is that they don't want dependency updates to suddenly break their packages for end users, or to have their CI results suddenly broken by third-party changes. However, all this has another underlying problem -- the combination of upstreams not being concerned with API stability, and downstreams not wishing to unnecessarily update working code (that uses a deprecated API).
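What pinning looks like varies per ecosystem. As an illustration, here is a Cargo manifest fragment (the crate names are hypothetical); an exact pin forbids any automatic update, while a semver range still lets fixes flow in:

```toml
# Cargo.toml (illustrative; crate names are hypothetical)

[dependencies]
# Exact pin: only 1.2.3 will ever be selected, even if 1.2.4 fixes a CVE.
libfoo = "=1.2.3"

# Caret requirement (Cargo's default semantics): any semver-compatible
# version >= 1.2.0 and < 2.0.0 may be selected, so bug and security
# fixes propagate on the next dependency resolution.
libbar = "1.2"
```

Even with ranges, the generated Cargo.lock records exact versions, so binaries effectively ship pinned dependencies until upstream runs `cargo update` and cuts a new release.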
The truth is, pinning makes this worse, because it sweeps the problem under the carpet and actively encourages people to develop their code against specific versions of their dependencies rather than against a stable public API -- Hyrum's Law in practice.

Dependency pinning can have truly extreme consequences. Unless you make sure to update your pins often, you may one day find yourself having to take a sudden leap -- because you have relied on a very old version of a dependency that is now known to be vulnerable, and in order to update it you suddenly have to rewrite a lot of code to follow the API changes. In the long term, this approach simply does not scale; the effort needed to keep things working grows exponentially.

We try hard to unpin dependencies and test packages with the newest versions of them. However, we often end up discovering that the newer versions simply are not compatible with the packages in question. Sadly, upstreams often either ignore reports of these incompatibilities or are even actively hostile to us for not following their pins.

Bundling/vendoring dependencies

Now for the worst problem of all -- one that combines all the aforementioned issues, and adds even more. Bundling (often called vendoring in newspeak) means including your program's dependencies along with it. The exact consequences vary depending on the method used.

In open source software, bundling usually means either including the sources of the dependencies along with your program, or making the build system fetch them automatically, and then building them along with the program. In closed source software, it usually means linking the program to its dependencies statically, or shipping the dependency libraries along with the program.

The baseline problem is the same as with pinned dependencies -- if one of the dependencies turns out to be buggy or vulnerable, the users need to wait for a new release to update the bundled copy.
With open source software, or closed source software that uses dynamic libraries, the packager at least has a reasonable chance of replacing the problematic dependency, or unbundling it entirely (i.e. forcing the system library). With statically linked closed source software, it is often impossible to even reliably determine which libraries were actually used, not to mention their exact versions. Your distribution can no longer reliably monitor security vulnerabilities; the trust is shifted to software vendors.

However, modern software sometimes takes it a step further -- and vendors modified dependencies. The horror of it! Now the packager not only needs to work to replace the library, but often has to actually figure out what was changed compared to the original version, and rebase the changes. In the worst cases, the code becomes disconnected from upstream to the point that the program's author is no longer capable of updating the vendored dependency properly.

Sadly, this kind of vendoring is becoming more common with the rapid development happening these days. The cause is twofold. On one hand, downstream consumers find it easier to fork and patch a dependency than to work with upstreams. On the other hand, many upstreams are not really concerned with fixing bugs and implementing feature requests that do not affect their own projects. Even if the fork is considered only a stop-gap measure, it often takes a lot of effort to push the changes upstream afterwards and re-synchronize the codebases.

We are strongly opposed to bundling dependencies. Whenever possible, we try to unbundle them -- sometimes having to actually patch the build systems to reuse system libraries. However, this is a lot of work, and often it is not even possible because of custom patching, including patching that has been explicitly rejected upstream. To list a few examples: Mozilla products rely on SQLite 3 patches that collide with regular usage of the library, and Rust bundles a fork of LLVM.
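In practice, unbundling boils down to deleting the bundled sources and pointing the build system at the system copies. A Gentoo-flavored sketch of what that patching amounts to -- the package, directory and switch names here are purely illustrative, not taken from any real ebuild, and such switches exist only when upstream provides them:

```shell
# Illustrative ebuild fragment for a hypothetical package "foo"
src_prepare() {
	default
	# Remove the bundled copies so the build cannot silently
	# fall back to them.
	rm -r third_party/zlib third_party/sqlite || die
}

src_configure() {
	# Ask the build system to link against the system libraries.
	econf \
		--with-system-zlib \
		--with-system-sqlite
}
```

When upstream offers no such switches, or has patched the bundled copy, this is where the real work starts.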
Summary

Static linking, dependency pinning and bundling are three bad practices that have a serious impact on the time and effort needed to eliminate vulnerabilities from production systems. They can make the difference between being able to replace a vulnerable library within a few minutes, and having to spend a lot of time and effort locating multiple copies of the vulnerable library, then patching and rebuilding all the software that includes them.

The major Linux distributions have had policies against these practices for a very long time, and have been putting a lot of effort into eliminating them. Nevertheless, it feels more and more like a Sisyphean task. While we have been able to successfully resolve these problems in many packages, whole new ecosystems have been built on top of these bad practices -- and it does not seem that upstreams care about fixing them at all. New programming languages such as Go and Rust rely entirely on static linking, and there's nothing we can do about it. Instead of packaging the dependencies and having programs use the newest versions, we just fetch the versions pinned by upstream and make big blobs out of them. And while upstreams brag about how they have magically resolved all the security issues you could ever think of (entirely ignoring classes of security issues other than the memory-related ones), we just hope that we won't suddenly be caught with our pants down when a common pinned dependency of many packages turns out to be vulnerable.

Categories: Gentoo, Security

22 Replies to "The modern packager's security nightmare"

1. Shiba says (2021-02-20 at 02:18):
The unstable ABI/static linking of modern languages is a real letdown, and the saddest part about it is that things could have been done differently if dynamic linking had been a top priority for their developers.
I'll leave you an article about "How Swift achieved dynamic linking where Rust couldn't" for when you are bored and want to get pissed off a bit more: https://gankra.github.io/blah/swift-abi/

2. Alex says (2021-02-20 at 11:44):
I think the dependency pinning issue should be at least partially solvable with a tool like this added in CI: https://github.com/RustSec/cargo-audit
But of course not all upstream developers are going to use it, so the global issue remains. Cargo should probably audit lock files by default.

    Michał Górny says (2021-02-20 at 14:02):
    There's also a bot on GitHub that reports outdated pinned dependencies for Python, but I don't know whether it works for Rust too. However, this doesn't really resolve the underlying issue. Tools make it possible to work around the problem, provided you make new releases fast enough. They don't make things work smoothly. It's like adding a device to push square wheels and then claiming they work as well as round ones.

        Randy Barlow says (2021-02-21 at 22:22):
        You may be referring to Dependabot, and it does support Rust: https://dependabot.com/rust/
        It sends pull requests to the project, and even includes some handy information like the upstream changelog. Here's an example pull request it made: https://github.com/bowlofeggs/rpick/pull/39
        This, of course, doesn't necessarily help at the distribution packaging level, but it does help an upstream project stay on top of updates and stay aware of security issues.

3. John Erickson says (2021-02-20 at 14:10):
What's your view on Node/JS apps?

    Michał Górny says (2021-02-20 at 16:35):
    I don't really know the Node ecosystem (and I don't really want to know it, I guess), so I can't really tell.
4. JCaesar says (2021-02-20 at 16:27):
> This also implies that you need to have a system that keeps track of what library versions are used in individual programs.
I assume such a system doesn't exist for Gentoo? How far would checking EGO_SUM and CRATES help? (Those don't list bundled dependencies, but would it be a start?) Also, is this really a new problem? What's the story with tracking C++ header-only libraries, npm dependency lock files, or Maven POMs?

    Michał Górny says (2021-02-20 at 16:47):
    There's no such system at the moment. It is possible that the approach used by the modern eclasses would help for Go/Rust. Still, somebody needs to do the work. As for header-only libraries (or pure static libraries), subslots provide a "usually good enough" solution to that -- as long as all consumers depend on them directly. I don't know much about npm, and I know we haven't managed to deal with Maven yet.

5. Axel Beckert says (2021-02-21 at 04:22):
Thanks for that very nice summary. The only thing I'm missing is the fact that containerized package formats like AppImage, Flatpak, Snap or Docker images have more or less the same issues as bundling/vendoring dependencies, and accordingly have a similar if not worse security impact. Unfortunately -- like many application developers -- at least one of the bigger Linux distributions has also fallen for that easy path to the dark side of packaging, and has unnecessarily started shipping dummy packages which pull in containerized applications. (Yes, I'm looking at you, Ubuntu.)

6. Carlo says (2021-02-21 at 12:49):
Too true. We're in the dark age of software security and data safety. Companies have almost no reason to invest in it. Apparently the number of opportunistic software engineers who don't care about this either seems to be growing, and a lot of younger ones learn this bad behavior.
In my opinion this isn't a problem which can be sorted out at the engineering level; it's a social and political one. The question is who can be held responsible: unless companies and their leaders can be sued out of (economic) existence for data breaches and the like, nothing will change. If the law forced long support time frames for software and computer systems, publication of the software components/libraries used, and prompt publication and correction of security issues and other severe bugs, the software development and packaging world would be a different one, simply because of the changed requirements. Unfortunately our societies do not care about, or even understand, the problem.

7. Carlo says (2021-02-21 at 13:01):
Ideas:
- USE-flagging: BLOB_UPSTREAM_UNSUPPORTIVE
- Installing in (para-)virtualized environments only
- In the worst case it may be better to package.mask this sort of software and not invest developer time

8. Oz says (2021-02-21 at 17:52):
I was interested in learning Rust and Go... Not so much anymore. Good info.

9. Others says (2021-02-21 at 19:47):
The bit about Rust and LLVM is (as far as I understand) completely false. Rust supports (and tests against) stock LLVM. Can you elaborate on where you got your information there?

    gyakovlev says (2021-02-21 at 21:58):
    Gentoo Rust maintainer here. I'm not sure about testing right now -- maybe they do now -- but some time ago the only tests were against the internal LLVM copy. I think there's even an option to download their compiled LLVM for builds/tests nowadays, which narrows the test/build scope even more. As for "supported" (quoted on purpose): sure, it kind of is, but in reality it is not. The internal LLVM copy is pinned and contains some specific codegen patches on top, so it differs from system-llvm. Also, the Rust ebuild has a test suite, which I do run on bumps,
    and I see codegen failures with system-llvm more often than with the internal one. Not really surprising. We even have a bug for that with more points: https://bugs.gentoo.org/735154
    lu_zero will have more info on specific examples, because they were able to trigger a specific breakage with system-llvm and not the bundled one. To be specific, I run tests on amd64, arm64 and ppc64le, with both system-llvm and the bundled one. system-llvm fails more tests in most cases, but not always, and the situation has been better lately.
    A major blocker: one can't really build Firefox (or other Mozilla Rust consumers) with system-llvm 11 and a Rust whose LLVM != 11. So one can't build firefox/thunderbird/spidermonkey if there's a mismatch between LLVM versions.
    And the cherry on top: Debian, Fedora and other distros do prohibit bundling of LLVM, but they carry a rather dated version of Rust compared to Gentoo, so they can move slower. In Gentoo we have to keep a bootstrap path (because Rust has specific bootstrap version requirements), and for stability I've been forcing the LLVM dependency of Rust to be the same version as the internal LLVM, sans the patches.
    As for FreeBSD, last time I checked they used the bundled one, despite LLVM being a base system component. Maybe that has changed -- they are making a lot of progress -- but maybe not. They've also switched to using bundled libssh2 and libgit2 (as we did as well), because updating the system copies of those packages broke cargo to the point that Rust was unable to update itself. There are a lot of problems not seen by consumers, but developers and packagers see and deal with them daily -- as the blog post title says.

10. Polynomial-C says (2021-02-21 at 21:37):
One upstream that drove bundling to an excess is MoonchildProductions with their Firefox fork called Pale Moon. They removed all configure switches to enable usage of system libs, so you can only use their bundled stuff. Security at its best.
I was even accused of spreading FUD when I tried to convince the developers of another Firefox fork called Waterfox not to merge with Moonchild's Basilisk browser (unbranded Pale Moon): https://github.com/MrAlex94/Waterfox/issues/1290
Unfortunately the original issue at MoonchildProductions regarding bundled libs has been removed. I'm just glad that MoonchildProductions made this stupid move before I started to waste my time adding Pale Moon to Gentoo. I also convinced some other Gentoo devs not to waste their time on this as long as this ignorant upstream continues with their bundling stupidity.

    gyakovlev says (2021-02-21 at 22:07):
    Don't forget the "niceness" of the Pale Moon upstream =) https://github.com/jasperla/openbsd-wip/issues/86
    They even have a specific clause for ebuilds in their license, forbidding modifications.

11. Nestor says (2021-02-21 at 21:41):
OK, this sucks. I just looked at sys-apps/ripgrep's ebuild. Why is dependency pinning so widespread in the Rust ecosystem? Are updates to its dependencies more or less always breaking a program, or what? With regard to static linking: wouldn't it be enough to generate a list of build-time dependencies from the ebuilds of statically linked programs, compare that to the output of 'emerge -uDN @world', and then trigger an einfo to run something like 'emerge @static-rebuild' when a match is found? Each time an update of one of the behemoths (libreoffice, firefox, etc.) is pushed to the tree, I have to wait a full day until the build finishes on my low-end system anyway.

12. Steve Klabnik says (2021-02-21 at 22:26):
Hi! Rust core team (though not compiler team) member here. There's a lot of opinion in this piece, which is totally fine of course, but I wanted to point out a factual inaccuracy:
> Rust bundles a huge fork of LLVM
It is not super huge, and we try to upstream patches regularly to minimize the fork.
We also regularly rebase all current patches on each release of LLVM, so it's less 'fork' and more 'maintaining some extra commits to fix bugs.'
> and explicitly refuses to support distributions using the genuine LLVM libraries.
We always support the latest LLVM release, and try to maintain compatibility with older ones as long as is reasonable and possible. IIRC, the last time we raised the base LLVM requirement was in Rust 1.49, at the end of last year. The minimum version it was raised to was LLVM 9, which was released in September 2019. The current release is 11. Again, because of the lack of the extra patches, you may see miscompilation bugs if you use stock LLVM. There are pros and cons to every choice here, of course. For more on this, please see https://rustc-dev-guide.rust-lang.org/backend/updating-llvm.html
The Rust project is and has been interested in collaborating on issues to make things easier for folks when there's demand. We've sought feedback from distros in the past and made changes to make things easier! I would hope taking advantage of the stuff above would solve at least some of your issues with packaging rustc itself. If there are other ways that we can bridge this gap, please get in touch.

    Michał Górny says (2021-02-21 at 22:31):
    >> Rust bundles a huge fork of LLVM
    > It is not super huge, and we try to upstream patches regularly to minimize the fork.
    I am sorry about that. I didn't mean to say that the divergence is huge, but that LLVM is huge. I tend to edit myself a few times, and I must've accidentally changed the meaning there.

13. Nathan says (2021-02-21 at 22:41):
Hi Michał, I enjoyed your post, as it provides some useful context for those unaware of common issues faced by distros. Do you know of any other blogs with similar posts?
The issues you describe seem very much like a "cursed problem" as found in game development; if you aren't familiar with that term, there's a great GDC YouTube video on it. What are your thoughts on allowing both behaviors in a way that effectively favors dynamic linking under the hood? I would think the first step would be getting visibility on the problem of similarly classified code between dynamic and static binaries. There really isn't much to be done about upstream maintainers not paying attention to issues, unfortunately; the only thing I can come up with is making it easier to collaborate on issues, but it's been my experience that these people have a vested interest in not changing their workflows (even if it would help them in the long run). Snap is a perfect example of what's wrong with the Linux ecosystem.

    Michał Górny says (2021-02-21 at 23:07):
    I can think of three that touch somewhat related topics and haven't been linked here already:
    Flameeyes' classic on bundling: https://flameeyes.blog/2009/01/02/bundling-libraries-for-despair-and-insecurity/
    Michael Orlitzky on programming language design: https://orlitzky.com/articles/greybeards_tomb%3A_the_lost_treasure_of_language_design.xhtml
    Drew DeVault directly on Rust: https://drewdevault.com/2021/02/09/Rust-move-fast-and-break-things.html

14. Steve Klabnik says (2021-02-21 at 22:56):
It's all good! :)