https://blogs.gentoo.org/mgorny/2021/02/19/the-modern-packagers-security-nightmare/

Michał Górny -- Retroactively fixing the world

The modern packager's security nightmare
Posted on 2021-02-19 (updated 2021-02-21) by Michał Górny

One of the most important tasks of a distribution packager is to ensure that the software shipped to our users is free of security vulnerabilities. While finding and fixing vulnerable code is usually considered upstream's responsibility, the packager needs to ensure that all these fixes reach end users as soon as possible. With the aid of central package management and dynamic linking, Linux distributions have pretty much perfected the deployment of security fixes. Ideally, fixing a vulnerable dependency is as simple as patching a single shared library via the distribution's automated update system.

Of course, this works only if the package in question actually follows good security practices. Over the years, many Linux distributions (at the very least Debian, Fedora and Gentoo) have been fighting these bad practices with some success. However, the times have changed. Today, for every ten packages fixed, a completely new ecosystem emerges with the bad security practices at its very heart. Go, Rust and, to some extent, Python are just a few examples of programming languages that have integrated bad security practices into the very fabric of their existence, and recreated the same old problems in entirely new ways.

The root issue of bundling dependencies has been discussed many times before. The Gentoo Wiki explains why you should not bundle dependencies, and links to more material about it. I would like to take a somewhat wider approach, and discuss not only bundling (or vendoring) dependencies but also two closely related problems: static linking and pinning dependencies.

Static linking

In the simplest words, static linking means embedding your program's dependencies directly into the program image.
The term is generally used in contrast to dynamic linking (or dynamic loading), which keeps the dependent libraries in separate files that are loaded at the program's startup (or at runtime).

Why is static linking bad? The primary problem is that, since the dependencies become an integral part of the program, they cannot easily be replaced by another version. If it turns out that one of the libraries is vulnerable, you have to relink the whole program against the new version. This also implies that you need a system that keeps track of which library versions are used in individual programs.

While you might think that rebuilding a lot of packages is only a problem for source distributions, you would be wrong. Indeed, the users of source distributions can be impacted a lot, as their systems remain vulnerable for the long time needed to rebuild many packages; but a similar problem affects binary distributions. After all, the distributions need to rebuild all affected programs in order to fully ship the fix to their end users, which also involves some delay. Comparatively, shipping a new version of a shared library takes much less time and fixes all affected programs almost instantly (modulo the necessity of restarting them).

The extreme case of static linking is proprietary software distributed statically linked to its dependencies. This is primarily done to ensure that the software can run easily on a variety of systems without requiring the user to install its dependencies manually. However, this scenario is really a form of bundling dependencies, so it will be discussed in the respective section. Static linking has also historically been used for system programs that were meant to keep working even if their dependent libraries became broken.

In modern packages, static linking is used for another reason entirely -- it frees modern programming languages from having to maintain a stable ABI.
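The practical difference between the two linking modes is easy to see on any Linux system. A minimal sketch, assuming a glibc-based system where the `ldd` utility is available:

```shell
# A dynamically linked program names its shared libraries in the ELF
# header; the loader resolves them from separate files at startup.
# Replacing e.g. libc.so.6 on disk therefore fixes every program that
# uses it, with no relinking.
ldd /bin/sh
# Typical output (paths and addresses vary per system):
#   linux-vdso.so.1 (0x...)
#   libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x...)

# For a statically linked binary, ldd instead reports
# "not a dynamic executable": the library code is baked into the image,
# and only relinking the program can update it.
```

Note that `ldd` only shows dynamic dependencies; statically linked code leaves no such trace, which is exactly the tracking problem described above.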
The Go compiler does not need to be concerned with emitting code that is binary compatible with the code produced by a previous version. It works around the problem by requiring you to rebuild everything every time the compiler is upgraded.

Following best practices, we strongly discourage static linking in C and its derivatives. However, we can't do much about languages such as Go and Rust that put static linking at the core of their design, and whose developers have stated publicly, time and again, that they will not switch to dynamic linking of dependencies.

Pinning dependencies

While static linking is bad, it at least provides a reasonably clear way for automatic updates (and therefore the propagation of vulnerability fixes) to happen. Pinning dependencies means requiring a specific version of your program's dependencies to be installed. While the exact results depend on the ecosystem and the exact way of pinning, it generally means that at least some users of your package will not be able to automatically update the dependencies to newer versions.

That might not seem so bad at first. However, it means that if a bug fix or -- even more importantly -- a vulnerability fix is released for the dependency, the users will not get it unless you update the pin and make a new release. And then, if somebody else pins your package, that pin will also need to be updated and released. And the chain goes on. Not to mention what happens if some package happens to indirectly pin two different versions of the same dependency!

Why do people pin dependencies? The primary reason is that they don't want dependency updates to suddenly break their packages for end users, or to have their CI results suddenly broken by third-party changes. However, all this has another underlying problem -- the combination of upstreams not being concerned with API stability, and downstreams not wishing to unnecessarily update working code (that uses a deprecated API).
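What pinning looks like varies per ecosystem. As an illustration, here is a Cargo manifest fragment (the crate names are hypothetical); an exact pin forbids any automatic update, while a semver range still lets fixes flow in:

```toml
# Cargo.toml (illustrative; crate names are hypothetical)

[dependencies]
# Exact pin: only 1.2.3 will ever be selected, even if 1.2.4 fixes a CVE.
libfoo = "=1.2.3"

# Caret requirement (Cargo's default semantics): any semver-compatible
# version >= 1.2.0 and < 2.0.0 may be selected, so bug and security
# fixes propagate on the next dependency resolution.
libbar = "1.2"
```

Even with ranges, the generated Cargo.lock records exact versions, so binaries effectively ship pinned dependencies until upstream runs `cargo update` and cuts a new release.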
The truth is, pinning makes this worse, because it sweeps the problem under the carpet and actively encourages people to develop their code against specific versions of their dependencies rather than against a stable public API -- Hyrum's Law in practice.

Dependency pinning can have truly extreme consequences. Unless you make sure to update your pins often, you may one day find yourself having to take a sudden leap -- because you have relied on a very old version of a dependency that is now known to be vulnerable, and in order to update it you suddenly have to rewrite a lot of code to follow the API changes. In the long term, this approach simply does not scale; the effort needed to keep things working grows exponentially.

We try hard to unpin dependencies and test packages with the newest versions of them. However, we often end up discovering that the newer versions simply are not compatible with the packages in question. Sadly, upstreams often either ignore reports of these incompatibilities or are even actively hostile to us for not following their pins.

Bundling/vendoring dependencies

Now for the worst problem of all -- one that combines all the aforementioned issues, and adds even more. Bundling (often called vendoring in newspeak) means including your program's dependencies along with it. The exact consequences vary depending on the method used.

In open source software, bundling usually means either including the sources of the dependencies along with your program, or making the build system fetch them automatically, and then building them along with the program. In closed source software, it usually means linking the program to its dependencies statically, or shipping the dependency libraries along with the program.

The baseline problem is the same as with pinned dependencies -- if one of the dependencies turns out to be buggy or vulnerable, the users need to wait for a new release to update the bundled copy.
With open source software, or closed source software that uses dynamic libraries, the packager at least has a reasonable chance of replacing the problematic dependency, or unbundling it entirely (i.e. forcing the system library). With statically linked closed source software, it is often impossible to even reliably determine which libraries were actually used, not to mention their exact versions. Your distribution can no longer reliably monitor security vulnerabilities; the trust is shifted to software vendors.

However, modern software sometimes takes it a step further -- and vendors modified dependencies. The horror of it! Now the packager not only needs to work to replace the library, but often has to actually figure out what was changed compared to the original version, and rebase the changes. In the worst cases, the code becomes disconnected from upstream to the point that the program's author is no longer capable of updating the vendored dependency properly.

Sadly, this kind of vendoring is becoming more common with the rapid development happening these days. The cause is twofold. On one hand, downstream consumers find it easier to fork and patch a dependency than to work with upstreams. On the other hand, many upstreams are not really concerned with fixing bugs and implementing feature requests that do not affect their own projects. Even if the fork is considered only a stop-gap measure, it often takes a lot of effort to push the changes upstream afterwards and re-synchronize the codebases.

We are strongly opposed to bundling dependencies. Whenever possible, we try to unbundle them -- sometimes having to actually patch the build systems to reuse system libraries. However, this is a lot of work, and often it is not even possible because of custom patching, including patching that has been explicitly rejected upstream. To list a few examples: Mozilla products rely on SQLite 3 patches that collide with regular usage of the library, and Rust bundles a fork of LLVM.
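In practice, unbundling boils down to deleting the bundled sources and pointing the build system at the system copies. A Gentoo-flavored sketch of what that patching amounts to -- the package, directory and switch names here are purely illustrative, not taken from any real ebuild, and such switches exist only when upstream provides them:

```shell
# Illustrative ebuild fragment for a hypothetical package "foo"
src_prepare() {
	default
	# Remove the bundled copies so the build cannot silently
	# fall back to them.
	rm -r third_party/zlib third_party/sqlite || die
}

src_configure() {
	# Ask the build system to link against the system libraries.
	econf \
		--with-system-zlib \
		--with-system-sqlite
}
```

When upstream offers no such switches, or has patched the bundled copy, this is where the real work starts.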
Summary

Static linking, dependency pinning and bundling are three bad practices that have a serious impact on the time and effort needed to eliminate vulnerabilities from production systems. They can make the difference between being able to replace a vulnerable library within a few minutes, and having to spend a lot of time and effort locating multiple copies of the vulnerable library, then patching and rebuilding all the software that includes them.

The major Linux distributions have had policies against these practices for a very long time, and have been putting a lot of effort into eliminating them. Nevertheless, it feels more and more like a Sisyphean task. While we have been able to successfully resolve these problems in many packages, whole new ecosystems have been built on top of these bad practices -- and it does not seem that upstreams care about fixing them at all. New programming languages such as Go and Rust rely entirely on static linking, and there's nothing we can do about it. Instead of packaging the dependencies and having programs use the newest versions, we just fetch the versions pinned by upstream and make big blobs out of them. And while upstreams brag about how they have magically resolved all the security issues you could ever think of (entirely ignoring classes of security issues other than the memory-related ones), we just hope that we won't suddenly be caught with our pants down when a common pinned dependency of many packages turns out to be vulnerable.

Categories: Gentoo, Security

22 Replies to "The modern packager's security nightmare"

1. Shiba says (2021-02-20 at 02:18):
The unstable ABI/static linking of modern languages is a real letdown, and the saddest part about it is that things could have been done differently if dynamic linking had been a top priority for their developers.
I'll leave you an article about "How Swift achieved dynamic linking where Rust couldn't" for when you are bored and want to get pissed off a bit more: https://gankra.github.io/blah/swift-abi/

2. Alex says (2021-02-20 at 11:44):
I think the dependency pinning issue should be at least partially solvable with a tool like this added in CI: https://github.com/RustSec/cargo-audit
But of course not all upstream developers are going to use it, so the global issue remains. Cargo should probably audit lock files by default.

    Michał Górny says (2021-02-20 at 14:02):
    There's also a bot on GitHub that reports outdated pinned dependencies for Python, but I don't know whether it works for Rust too. However, this doesn't really resolve the underlying issue. Tools make it possible to work around the problem, provided you make new releases fast enough. They don't make things work smoothly. It's like adding a device to push square wheels and then claiming they work as well as round ones.

        Randy Barlow says (2021-02-21 at 22:22):
        You may be referring to Dependabot, and it does support Rust: https://dependabot.com/rust/
        It sends pull requests to the project, and even includes some handy information like the upstream changelog. Here's an example pull request it made: https://github.com/bowlofeggs/rpick/pull/39
        This, of course, doesn't necessarily help at the distribution packaging level, but it does help an upstream project stay on top of updates and stay aware of security issues.

3. John Erickson says (2021-02-20 at 14:10):
What's your view on Node/JS apps?

    Michał Górny says (2021-02-20 at 16:35):
    I don't really know the Node ecosystem (and I don't really want to know it, I guess), so I can't really tell.
4. JCaesar says (2021-02-20 at 16:27):
> This also implies that you need to have a system that keeps track of what library versions are used in individual programs.
I assume such a system doesn't exist for Gentoo? How far would checking EGO_SUM and CRATES help? (Those don't list bundled dependencies, but would it be a start?) Also, is this really a new problem? What's the story with tracking C++ header-only libraries, npm dependency lock files, or Maven POMs?

    Michał Górny says (2021-02-20 at 16:47):
    There's no such system at the moment. It is possible that the approach used by the modern eclasses would help for Go/Rust. Still, somebody needs to do the work. As for header-only libraries (or pure static libraries), subslots provide a "usually good enough" solution to that -- as long as all consumers depend on them directly. I don't know much about npm, and I know we haven't managed to deal with Maven yet.

5. Axel Beckert says (2021-02-21 at 04:22):
Thanks for that very nice summary. The only thing I'm missing is the fact that containerized package formats like AppImage, Flatpak, Snap or Docker images have more or less the same issues as bundling/vendoring dependencies, and accordingly have a similar if not worse security impact. Unfortunately -- like many application developers -- at least one of the bigger Linux distributions has also fallen for that easy path to the dark side of packaging, and has unnecessarily started shipping dummy packages which pull in containerized applications. (Yes, I'm looking at you, Ubuntu.)

6. Carlo says (2021-02-21 at 12:49):
Too true. We're in the dark age of software security and data safety. Companies have almost no reason to invest in it. Apparently the number of opportunistic software engineers who don't care about this either seems to be growing, and a lot of younger ones learn this bad behavior.
In my opinion this isn't a problem which can be sorted out at the engineering level; it's a social and political one. The question is who can be held responsible: unless companies and their leaders can be sued out of (economic) existence for data breaches and the like, nothing will change. If the law forced long support time frames for software and computer systems, publication of the software components/libraries used, and prompt publication and correction of security issues and other severe bugs, the software development and packaging world would be a different one, simply because of the changed requirements. Unfortunately our societies do not care about, or even understand, the problem.

7. Carlo says (2021-02-21 at 13:01):
Ideas:
- USE-flagging: BLOB_UPSTREAM_UNSUPPORTIVE
- Installing in (para-)virtualized environments only
- In the worst case it may be better to package.mask this sort of software and not invest developer time

8. Oz says (2021-02-21 at 17:52):
I was interested in learning Rust and Go... Not so much anymore. Good info.

9. Others says (2021-02-21 at 19:47):
The bit about Rust and LLVM is (as far as I understand) completely false. Rust supports (and tests against) stock LLVM. Can you elaborate on where you got your information there?

    gyakovlev says (2021-02-21 at 21:58):
    Gentoo Rust maintainer here. I'm not sure about testing right now -- maybe they do now -- but some time ago the only tests were against the internal LLVM copy. I think there's even an option to download their compiled LLVM for builds/tests nowadays, which narrows the test/build scope even more. As for "supported" (quoted on purpose): sure, it kind of is, but in reality it is not. The internal LLVM copy is pinned and contains some specific codegen patches on top, so it differs from system-llvm. Also, the Rust ebuild has a test suite, which I do run on bumps,
    and I see codegen failures with system-llvm more often than with the internal one. Not really surprising. We even have a bug for that with more points: https://bugs.gentoo.org/735154
    lu_zero will have more info on specific examples, because they were able to trigger a specific breakage with system-llvm and not the bundled one. To be specific, I run tests on amd64, arm64 and ppc64le, with both system-llvm and the bundled one. system-llvm fails more tests in most cases, but not always, and the situation has been better lately.
    A major blocker: one can't really build Firefox (or other Mozilla Rust consumers) with system-llvm 11 and a Rust whose LLVM != 11. So one can't build firefox/thunderbird/spidermonkey if there's a mismatch between LLVM versions.
    And the cherry on top: Debian, Fedora and other distros do prohibit bundling of LLVM, but they carry a rather dated version of Rust compared to Gentoo, so they can move slower. In Gentoo we have to keep a bootstrap path (because Rust has specific bootstrap version requirements), and for stability I've been forcing the LLVM dependency of Rust to be the same version as the internal LLVM, sans the patches.
    As for FreeBSD, last time I checked they used the bundled one, despite LLVM being a base system component. Maybe that has changed -- they are making a lot of progress -- but maybe not. They've also switched to using bundled libssh2 and libgit2 (as we did as well), because updating the system copies of those packages broke cargo to the point that Rust was unable to update itself. There are a lot of problems not seen by consumers, but developers and packagers see and deal with them daily -- as the blog post title says.

10. Polynomial-C says (2021-02-21 at 21:37):
One upstream that drove bundling to an excess is MoonchildProductions with their Firefox fork called Pale Moon. They removed all configure switches to enable usage of system libs, so you can only use their bundled stuff. Security at its best.
I was even accused of spreading FUD when I tried to convince the developers of another Firefox fork called Waterfox not to merge with Moonchild's Basilisk browser (unbranded Pale Moon): https://github.com/MrAlex94/Waterfox/issues/1290
Unfortunately the original issue at MoonchildProductions regarding bundled libs has been removed. I'm just glad that MoonchildProductions made this stupid move before I started to waste my time adding Pale Moon to Gentoo. I also convinced some other Gentoo devs not to waste their time on this as long as this ignorant upstream continues with their bundling stupidity.

    gyakovlev says (2021-02-21 at 22:07):
    Don't forget the "niceness" of the Pale Moon upstream =) https://github.com/jasperla/openbsd-wip/issues/86
    They even have a specific clause for ebuilds in their license, forbidding modifications.

11. Nestor says (2021-02-21 at 21:41):
OK, this sucks. I just looked at sys-apps/ripgrep's ebuild. Why is dependency pinning so widespread in the Rust ecosystem? Are updates to its dependencies more or less always breaking a program, or what? With regard to static linking: wouldn't it be enough to generate a list of build-time dependencies from the ebuilds of statically linked programs, compare that to the output of 'emerge -uDN @world', and then trigger an einfo to run something like 'emerge @static-rebuild' when a match is found? Each time an update of one of the behemoths (libreoffice, firefox, etc.) is pushed to the tree, I have to wait a full day until the build finishes on my low-end system anyway.

12. Steve Klabnik says (2021-02-21 at 22:26):
Hi! Rust core team (though not compiler team) member here. There's a lot of opinion in this piece, which is totally fine of course, but I wanted to point out a factual inaccuracy:
> Rust bundles a huge fork of LLVM
It is not super huge, and we try to upstream patches regularly to minimize the fork.
We also regularly rebase all current patches on each release of LLVM, so it's less 'fork' and more 'maintaining some extra commits to fix bugs.'
> and explicitly refuses to support distributions using the genuine LLVM libraries.
We always support the latest LLVM release, and try to maintain compatibility with older ones as long as is reasonable and possible. IIRC, the last time we raised the base LLVM requirement was in Rust 1.49, at the end of last year. The minimum version it was raised to was LLVM 9, which was released in September 2019. The current release is 11. Again, because of the lack of the extra patches, you may see miscompilation bugs if you use stock LLVM. There are pros and cons to every choice here, of course. For more on this, please see https://rustc-dev-guide.rust-lang.org/backend/updating-llvm.html
The Rust project is and has been interested in collaborating on issues to make things easier for folks when there's demand. We've sought feedback from distros in the past and made changes to make things easier! I would hope taking advantage of the stuff above would solve at least some of your issues with packaging rustc itself. If there are other ways that we can bridge this gap, please get in touch.

    Michał Górny says (2021-02-21 at 22:31):
    >> Rust bundles a huge fork of LLVM
    > It is not super huge, and we try to upstream patches regularly to minimize the fork.
    I am sorry about that. I didn't mean to say that the divergence is huge, but that LLVM is huge. I tend to edit myself a few times, and I must've accidentally changed the meaning there.

13. Nathan says (2021-02-21 at 22:41):
Hi Michał, I enjoyed your post, as it provides some useful context for those unaware of common issues faced by distros. Do you know of any other blogs with similar posts?
The issues you describe seem very much like a "cursed problem" as found in game development; if you aren't familiar with that term, there's a great GDC YouTube video on it. What are your thoughts on allowing both behaviors in a way that effectively favors dynamic linking under the hood? I would think the first step would be getting visibility on the problem of similarly classified code between dynamic and static binaries. There really isn't much to be done about upstream maintainers not paying attention to issues, unfortunately; the only thing I can come up with is making it easier to collaborate on issues, but it's been my experience that these people have a vested interest in not changing their workflows (even if it would help them in the long run). Snap is a perfect example of what's wrong with the Linux ecosystem.

    Michał Górny says (2021-02-21 at 23:07):
    I can think of three that touch somewhat related topics and haven't been linked here already:
    Flameeyes' classic on bundling: https://flameeyes.blog/2009/01/02/bundling-libraries-for-despair-and-insecurity/
    Michael Orlitzky on programming language design: https://orlitzky.com/articles/greybeards_tomb%3A_the_lost_treasure_of_language_design.xhtml
    Drew DeVault directly on Rust: https://drewdevault.com/2021/02/09/Rust-move-fast-and-break-things.html

14. Steve Klabnik says (2021-02-21 at 22:56):
It's all good! :)