[HN Gopher] Fedora change aims for 99% package reproducibility
       ___________________________________________________________________
        
       Fedora change aims for 99% package reproducibility
        
       Author : voxadam
       Score  : 292 points
       Date   : 2025-04-11 13:40 UTC (9 hours ago)
        
 (HTM) web link (lwn.net)
 (TXT) w3m dump (lwn.net)
        
       | ajross wrote:
        | Linux folks continue running away with package security
        | paradigms while NPM, PyPI, cargo, et al. (like that VSCode
       | extension registry that was on the front page last week) think
       | they can still get away with just shipping what some rando
       | pushes.
        
         | hedora wrote:
         | Shipping what randos push works great for iOS and Android too.
         | 
         | System perl is actually good. It's too bad the Linux vendors
         | don't bother with system versions of newer languages.
        
           | heinrich5991 wrote:
           | System rustc is also good on Arch Linux. I think system
           | rustc-web is also fine on Debian.
        
             | WD-42 wrote:
             | I recently re-installed and instead of installing rustup I
             | thought I'd give the Arch rust package a shot. Supposedly
             | it's not the "correct" way to do it but so far it's working
             | great, none of my projects require nightly. Updates have
             | come in maybe a day after upstream. One less thing to think
             | about, I like it.
        
           | g-b-r wrote:
           | Sure, there's never malware on the stores...
        
           | ajross wrote:
           | > Shipping what randos push works great for iOS and Android
           | too.
           | 
           | App store software is _excruciatingly_ vetted, though. Apple
           | and Google spend far, far, _FAR_ more on validating the
           | software they ship to customers than Fedora or Canonical, and
            | it's not remotely close.
           | 
           | It only looks like "randos" because the armies of auditors
           | and datacenters of validation software are hidden behind the
           | paywall.
        
             | IshKebab wrote:
             | It's really not. At least on Android they have some
             | automated vetting but that's about it. Any human vetting is
             | mostly for business reasons.
             | 
             | Also Windows and Mac have existed for decades and there's
              | zero vetting there. Yeah malware exists but it's easy to
             | avoid and _easily_ worth the benefit of actually being able
             | to get up-to-date software from anywhere.
        
               | skydhash wrote:
               | > _Also Windows and Mac have existed for decades and
                | there's zero vetting there._
               | 
               | Isn't that only for applications? All the system software
               | are provided and vetted by the OS developer.
        
               | IshKebab wrote:
               | Yeah but distro package repositories mostly consist of
               | applications and non-system libraries.
        
               | skydhash wrote:
                | Most users only have a handful of applications
                | installed. They may not be the same ones, but flatpak is
               | easy to setup if you want the latest version. And you're
               | not tied to the system package manager if you want the
               | latest for CLI software (nix, brew, toolbox,...)
               | 
               | The nice thing about Debian is that you can have 2 full
                | years of routine maintenance while getting ready for
                | the next big update. The main issue is upstream
                | developers shipping bug fixes and feature updates in
                | the same patch.
        
               | ndiddy wrote:
               | The vetting on Windows is that basically any software
               | that isn't signed by an EV certificate will show a scary
               | SmartScreen warning on users' computers. Even if your
               | users aren't deterred by the warning, you also have a
               | significant chance of your executable getting mistakenly
               | flagged by Windows Defender, and then you have to file a
               | request for Microsoft to whitelist you.
               | 
               | The vetting on Mac is that any unsigned software will
               | show a scary warning and make your users have to dig into
               | the security options in Settings to get the software to
               | open.
               | 
               | This isn't really proactive, but it means that if you
               | ship malware, Microsoft/Apple can revoke your
               | certificate.
               | 
               | If you're interested in something similar to this
               | distribution model on Linux, I would check out Flatpak.
               | It's similar to how distribution works on Windows/Mac
               | with the added benefit that updates are handled centrally
               | (so you don't need to write auto-update functionality
               | into each program) and that all programs are manually
               | vetted both before they go up on Flathub and when they
               | change any permissions. It also doesn't cost any money to
               | list software, unlike the "no scary warnings"
               | distribution options for both Windows and Mac.
        
             | ykonstant wrote:
             | >App store software is excruciatingly vetted, though. Apple
             | and Google spend far, far, FAR more on validating the
             | software they ship to customers than Fedora or Canonical,
             | and it's not remotely close.
             | 
              | Hahah! ...they don't. They really don't, man. They _do_
              | have procedures in place that make them look like they
              | do, though; I'll give you that.
        
         | anotherhue wrote:
         | I have observed a sharp disconnect in the philosophies of
         | 'improving developer experience' and 'running a tight ship'.
         | 
         | I think the last twenty years of quasi-
         | marketing/sales/recruiting DevRel roles have pushed a narrative
         | of frictionless development, while on the flip side security
         | and correctness have mostly taken a back seat (special
         | industries aside).
         | 
         | I think it's a result of the massive market growth, but I so
          | welcome the pendulum swinging back a little bit. Typosquatted
          | packages being a concern at the same time as speculative
          | execution exploits shows mind-bending immaturity.
        
           | psadauskas wrote:
           | "Security" and "Convenience" is always a tradeoff, you can
           | never have both.
        
             | bluGill wrote:
              | True. Then the Convenience folks don't understand why the
              | rest of us don't want the things they think are so great.
             | 
             | There are good middle grounds, but most package managers
             | don't even acknowledge other concerns as valid.
        
             | benterix wrote:
              | This is obvious; the question here is why everybody traded
             | security for convenience and what else has to happen for
             | people to start taking security seriously.
        
               | ziddoap wrote:
               | > _the question here is why everybody traded security for
               | convenience_
               | 
               | I don't think security was traded away for convenience.
               | Everything started with convenience, and security has
               | been trying to gain ground ever since.
               | 
               | > _happen for people to start taking security seriously_
               | 
                | Laws with _enforced_ and _non-trivial_ consequences are
                | the only thing that will force people to take security
                | seriously. And even then, most probably still won't.
        
               | vladms wrote:
               | Regarding "what else has to happen": I would say
               | something catastrophic. Nothing comes to mind recently.
               | 
                | Security is good, but occasionally I wonder if technical
                | people don't imagine fantastic scenarios of evil
                | masterminds doing something with the data and managing
                | to rule the world.
                | 
                | While in reality, at least over the last 5 years, there
                | are so many leaders (and people) doing and saying such
                | plainly stupid things that I feel we should be more
                | afraid of stupid people than of hackers.
        
               | bluGill wrote:
                | In the last 5 years, several major medical providers
                | have had sensitive personal data of nearly everyone
                | compromised. The political leaders are the biggest
                | problem today, but that could change again.
        
             | yjftsjthsd-h wrote:
             | It's not quite a straight trade; IIRC the OpenBSD folks
             | really push on good docs and maybe good defaults precisely
             | because making it easier to hold the tool right makes it
             | safer.
        
             | cyrnel wrote:
             | I've seen this more formalized as a triangle, with
             | "functionality" being the third point:
             | https://blog.c3l-security.com/2019/06/balancing-
             | functionalit...
             | 
             | You can get secure and easy-to-use tools, but they
             | typically have to be really simple things.
        
           | phkahler wrote:
           | I think that's a consequence of programmers making tools for
           | programmers. It's something I've come to really dislike.
           | Programmers are used to doing things like editing
           | configurations, setting environment variables, or using
           | custom code to solve a problem. As a result you get programs
           | that can be configured, customized through code (scripting or
           | extensions), and little tools to do whatever. This is not
           | IMHO how good software should be designed to be used. On the
           | positive side, we have some really good tooling - revision
           | control in the software world is way beyond the equivalent in
           | any other field. But then git could be used in other fields
            | if not for it being a programmer's tool designed by
           | programmers... A lot of developers even have trouble doing
           | things with git that are outside their daily use cases.
           | 
           | Dependency management tools are tools that come about because
           | it's easier and more natural for a programmer to write some
           | code than solve a bigger problem. Easier to write a tool than
           | write your own version of something or clean up a complex set
           | of dependencies.
        
         | esafak wrote:
         | The future is not evenly distributed.
        
         | Palomides wrote:
         | distros get unbelievable amounts of hate for not immediately
         | integrating upstream changes, there's really no winning
        
           | IshKebab wrote:
           | Rightly so. The idea that all software should be packaged for
           | all distros, and that you shouldn't want to use the latest
           | version of software is clearly ludicrous. It only seems
           | vaguely reasonable because it's what's always been done.
           | 
           | If Linux had evolved a more sensible system and someone came
           | along and suggested "no actually I think each distro should
           | have its own package format and they should all be
           | responsible for packaging all software in the world, and they
           | should use old versions too for stability" they would rightly
           | be laughed out of the room.
        
             | bluGill wrote:
              | There are too many distributions / formats. However,
              | distribution packages are much better than
              | snap/flatpak/docker for most uses; the only hard part is
              | that there are so many that no program can put "and is
              | integrated into the package manager" in its release
              | steps. You can ship a docker container with your program
              | release - it is often done but rarely what should be done.
        
               | oivey wrote:
               | For some cases maybe that makes sense, but in a very
                | large percentage it does not. As an example, what if I
                | want to build and use/deploy a Python app that needs the
               | latest NumPy, and the system package manager doesn't have
               | it. It would be hard to justify for me to figure out and
               | build a distro specific package for this rather than just
               | using the Python package on PyPI.
        
               | bluGill wrote:
               | The point is the distro should provide the numpy you
               | need.
        
               | aseipp wrote:
               | And what happens when they can't do that because you need
               | the latest major version with specific features?
        
             | michaelt wrote:
             | _> The idea that [...] you shouldn 't want to use the
             | latest version of software is clearly ludicrous._
             | 
             | To get to that world, we developers would have to give up
             | making breaking changes.
             | 
             | We can't have any "your python 2 code doesn't work on
             | python 3" nonsense.
             | 
             | Should we stop making breaking changes? Maybe. Will we? No.
        
               | dagw wrote:
               | _We can't have any "your python 2 code doesn't work on
               | python 3" nonsense_
               | 
                | This only happens because distros insist on shipping
                | python and then everyone insists on using that python
                | to run their software.
                | 
                | In an alternate world, everybody would just ship their
                | own python with their own app and not have that problem.
                | That's basically how Windows solves this.
        
               | bluGill wrote:
               | Which sounds good until you run out of disk space.
               | 
                | Of course, I grew up when hard drives were not
                | affordable by normal people - my parents had to save
                | for months to get me a floppy drive.
        
               | charcircuit wrote:
               | You can have both python 2 and python 3 installed. Apps
               | should get the dependencies they request. Distros
               | swapping out dependencies out from under them has caused
               | numerous issues for developers.
        
             | marcusb wrote:
             | What is a distribution but a collection of software
             | packaged in a particular way?
        
             | danieldk wrote:
             | nixpkgs packages pretty much everything I need. It's a very
             | large package set and very fresh. It's mostly about culture
             | and tooling. I tried to contribute to Debian once and gave
             | up after months. I was contributing to nixpkgs days after I
             | started using Nix.
             | 
             | Having every package as part of a distribution is immensely
             | useful. You can declaratively define your whole system with
             | all software. I can roll out a desktop, development VM or
             | server within 5 minutes and it's fully configured.
        
           | jorvi wrote:
           | They do not. Even the vast majority of Arch users thinks the
           | policy of "integrate breakage anyway and post the warning on
           | the web changelog" is pants-on-head insane, especially
           | compared to something like SUSE Tumbleweed (also a rolling
           | distro) where things get tested and will stay staged if
           | broken.
        
             | loeg wrote:
             | > They do not.
             | 
             | I present to you sibling comment posted slightly before
             | yours: https://news.ycombinator.com/item?id=43655093
             | 
             | They do.
        
           | zer00eyz wrote:
           | Distros get real hate for being so out of date that upstream
           | gets a stream of bug reports on old and solved issues.
           | 
            | A prime example of this is what the Bottles dev team has
            | done.
            | 
            | It isn't an easy problem to solve.
        
           | 6SixTy wrote:
           | That's, in my experience, mostly Debian Stable.
        
         | tsimionescu wrote:
         | I think the opposite is mostly true. Linux packaging folks are
         | carefully sculpting their toys, while everyone else is mostly
         | using upstream packages and docker containers to work around
         | the beautiful systems. For half the software I care about on my
         | Debian system, I have a version installed either directly from
         | the web (curl | bash style), from the developer's own APT repo,
         | or most likely from a separate package manager (be it MELPA,
         | pypi, Go cache, Maven, etc).
        
           | justinrubek wrote:
           | That sounds like an incredibly annoying way to manage
           | software. I don't have a single thing installed that way.
        
           | kombine wrote:
            | I use the nix package manager on 3 of the systems I work
            | on daily (one of them an HPC cluster), and none of them
            | run NixOS.
           | It's possible to carefully sculpt one's tools and use latest
           | and greatest.
        
       | sheepscreek wrote:
       | YES! I want more tools to be deterministic. My wish-list has
       | Proxmox config at the very top.
        
         | TheDong wrote:
         | Want to give this a try and see if it works?
         | https://github.com/SaumonNet/proxmox-nixos?tab=readme-ov-fil...
        
       | knowitnone wrote:
       | 99%? Debbie Downer says it only takes 1 package to screw the
       | pooch
        
         | ethersteeds wrote:
         | I would still much prefer playing 100:1 Russian roulette than
         | 1:1, if those are my options.
        
         | nwah1 wrote:
         | There's a long tail of obscure packages that are rarely used,
         | and almost certainly a power law in terms of which packages are
         | common. Reproducibility often requires coordination between
         | both the packagers and the developers, and achieving that for
         | each and every package is optimistic.
         | 
         | If they just started quarantining the long tail of obscure
         | packages, then people would get upset. And failing to be 100%
         | reproducible will make a subset of users upset. Lose-lose
         | proposition there, given that intelligent users could just
         | consciously avoid packages that aren't passing reproducibility
         | tests.
         | 
         | 100% reproducibility is a good goal, but as long as the
         | ubiquitous packages are reproducible then that is probably
         | going to cover most. Would be interesting to provide an easy
         | way to disallow non-reproducible packages.
         | 
         | I'm sure one day they will be able to make it a requirement for
         | inclusion into the official repos.
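          [Editor's note: the "consciously avoid non-reproducible packages" idea above can be sketched in a few lines. This is a hypothetical illustration, not a real tool; the package names and the `status` map are made up and would in practice come from a distro's rebuilder dashboard.]

          ```python
          # Hypothetical sketch: filter a package list by reproducibility
          # status. All package names here are invented for illustration.
          status = {"bash": True, "coreutils": True, "examplepkg": False}

          def reproducible_only(packages, status):
              """Keep only packages known to be bit-for-bit reproducible."""
              return [p for p in packages if status.get(p, False)]

          print(reproducible_only(["bash", "examplepkg", "coreutils"], status))
          # → ['bash', 'coreutils']
          ```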
        
           | yjftsjthsd-h wrote:
           | There's an interesting thought - in addition to aiming for
           | 99% of _all_ packages, perhaps it would be a good idea to
           | target 100% of the packages that, say, land in the official
           | install media? (I wouldn 't even be surprised if they already
           | meet that goal TBH, but making it explicit and documenting it
           | has value)
        
         | EasyMark wrote:
         | "All I see is 1% of complete failure" --Bad Dads everywhere
        
       | patrakov wrote:
       | This goal feels like a marketing OKR to me. A proper technical
       | goal would be "all packages, except the ones that have a valid
       | reason, such as signatures, not to be reproducible".
        
         | RegnisGnaw wrote:
          | As someone who dabbles a bit in the RHEL world, IIRC all
          | packages in Fedora are signed. In addition, the DNF/Yum
          | metadata is also signed.
          | 
          | IIRC Debian packages themselves are not signed, but the apt
          | metadata is signed.
        
           | westurner wrote:
           | I learned this from an ansible molecule test env setup script
           | for use in containers and VMs years ago; because `which`
           | isn't necessarily installed in containers for example:
            | 
            |     type -p apt && (set -x; apt install -y debsums; debsums | grep -v 'OK$') || \
            |       type -p rpm && rpm -Va  # --verify --all
           | 
           | dnf reads .repo files from /etc/yum.repos.d/ [1] which have
           | various gpg options; here's an /etc/yum.repos.d/fedora-
            | updates.repo:
            | 
            |     [updates]
            |     name=Fedora $releasever - $basearch - Updates
            |     #baseurl=http://download.example/pub/fedora/linux/updates/$releasever/Everything/$basearch/
            |     metalink=https://mirrors.fedoraproject.org/metalink?repo=updates-released-f$releasever&arch=$basearch
            |     enabled=1
            |     countme=1
            |     repo_gpgcheck=0
            |     type=rpm
            |     gpgcheck=1
            |     metadata_expire=6h
            |     gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch
            |     skip_if_unavailable=False
           | 
           | From the dnf conf docs [1], there are actually even more per-
            | repo gpg options:
            | 
            |     gpgkey
            |     gpgkey_dns_verification
            |     repo_gpgcheck
            |     localpkg_gpgcheck
            |     gpgcheck
           | 
           | 1. https://dnf.readthedocs.io/en/latest/conf_ref.html#repo-
           | opti...
           | 
           | 2. https://docs.ansible.com/ansible/latest/collections/ansibl
           | e/... lists a gpgcakey parameter for the
           | ansible.builtin.yum_repository module
           | 
           | For Debian, Ubuntu, Raspberry Pi OS and other dpkg .deb and
            | apt distros:
            | 
            |     man sources.list
            |     man sources.list | grep -i keyring -C 10
            |     # trusted:
            |     # signed-by:
            |     # /etc/apt/trusted.gpg.d/
            |     man apt-secure
            |     man apt-key
            |     apt-key help
            |     less "$(type -p apt-key)"
           | 
           | signing-apt-repo-faq:
           | https://github.com/crystall1nedev/signing-apt-repo-faq
           | 
           | From "New requirements for APT repository signing in 24.04"
           | (2024) https://discourse.ubuntu.com/t/new-requirements-for-
           | apt-repo... :
           | 
           | > _In Ubuntu 24.04, APT will require repositories to be
           | signed using one of the following public key algorithms: [
           | RSA with at least 2048-bit keys, Ed25519, Ed448 ]_
           | 
           | > _This has been made possible thanks to recent work in GnuPG
           | 2.4 82 by Werner Koch to allow us to specify a "public key
           | algorithm assertion" in APT when calling the gpgv tool for
           | verifying repositories._
        
         | 0zymandiass wrote:
         | If you'd bothered to read:
         | 
         | ```This definition excludes signatures and some metadata and
         | focuses solely on the payload of packaged files in a given RPM:
         | A build is reproducible if given the same source code, build
         | environment and build instructions, and metadata from the build
         | artifacts, any party can recreate copies of the artifacts that
         | are identical except for the signatures and parts of
         | metadata.```
        
         | eru wrote:
         | At Google SRE we often had very technical OKRs that were
         | formulated with some 'number of 9s'. Like 99.9999% uptime or
         | something like that. So getting two 9s of reproducibility seems
         | like a reasonable first goal. I hope they will be adding more
         | nines later.
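          [Editor's note: a rough illustration of what "adding nines" buys in this context. The 70,000 package count is an assumed round number for the sketch, not Fedora's actual repository size.]

          ```python
          def allowed_failures(nines: int, total: int) -> float:
              """Items that may still fail while meeting N nines
              (2 nines = 99%, 3 nines = 99.9%, ...)."""
              return total * 10 ** (-nines)

          # With an assumed 70,000 packages, two nines still tolerates
          # ~700 irreproducible packages; three nines only ~70.
          print(round(allowed_failures(2, 70_000)))  # → 700
          print(round(allowed_failures(3, 70_000)))  # → 70
          ```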
        
       | binarymax wrote:
       | I often see initiatives and articles like this but no mention of
       | Nix. Is it just not well known enough for comparison? Because to
       | me that's the standard.
        
         | esseph wrote:
         | It's an article about Fedora, specifically.
        
           | binarymax wrote:
           | Yes, I know that. But when talking about reproducible
           | packages, we can and should learn from existing techniques.
           | Is Fedoras system capable of this goal? Is it built the right
           | way? Should it adopt an alternate package manager to achieve
           | this with less headache?
        
             | dogleash wrote:
             | "What about Nix tho?" is the "Rewrite it in Rust" of
             | reproducible builds.
        
             | johnny22 wrote:
              | No other current distro, whether Debian, Fedora, Arch,
              | or openSUSE, seems to want to switch up its design to
              | match nix's approach.
             | 
             | They have too many people familiar with the current
             | approaches.
        
         | djha-skin wrote:
         | It's very, very complicated. It's so far past the maximum
         | effort line of most linux users as to be in its own class of
         | tools. Reproducibility in the imperative package space is worth
         | a lot. Lots of other tools are built on RPM/DEB packages that
         | offer similar advantages of Nix -- Ansible, for one. This is
         | more of a "rising tide raises all boats" situation.
        
         | steeleduncan wrote:
         | I use Nix extensively, but the Nix daemon doesn't do much of
         | use that can't be achieved by building your code from a fixed
         | OCI container with internet turned off. The latter is certainly
         | more standard across the industry, and sadly a lot easier too.
         | Nix is not a revolutionary containerisation technology, nor
         | honestly a very good one.
         | 
         | The value in Nix comes from the package set, nixpkgs. What is
          | revolutionary is how nixpkgs builds a Linux distribution
         | declaratively, and reproducibly, from source through purely
         | functional expressions. However, nixpkgs is almost an entire
         | universe unto itself, and it is generally incompatible with the
         | way any other distribution would handle things, so it would be
          | no use to Fedora, Debian, and others.
        
           | fpoling wrote:
           | At work we went back to a Docker build to make reproducible
           | images. The primary reason is poor cross-compilation support
            | in Nix on Arm: developers needed to compile for an amd64
            | service and derive image checksums that are put into tooling
            | that is run locally for service version verification and
            | reproducibility.
           | 
           | With Docker it turned out relatively straightforward. With
           | Nix even when it runs in Linux Arm VM we tried but just gave
           | up.
        
             | binarymax wrote:
             | Funny, I had that experience with Docker - mostly due to
             | c++ dependencies - that were fine in Nix
        
         | lima wrote:
         | Contrary to popular opinion, Nix builds aren't reproducible:
         | https://luj.fr/blog/is-nixos-truly-reproducible.html
        
           | pxc wrote:
           | That's such a weird characterization of this article, which
           | (in contrast to other writing on this subject) clearly
           | concludes (a) that Nix achieves a very high degree of
           | reproducibility and is continuously improving in this
           | respect, and (b) Nix is moreover reproducible in a way that
           | most other distros (even distros that do well in some
           | measures of bitwise reproducibility) are not (namely, time
           | traveling-- being able to reproduce builds in different
           | environments, even months or years later, because the build
           | environment itself is more reproducible).
           | 
            | The article you linked is very clear that both qualitatively
            | and quantitatively, NixOS has achieved a high degree of
            | reproducibility, and even explicitly rejects the possibility
            | of assessing absolute reproducibility.
           | 
           | NixOS may not be the absolute leader here (that's probably
           | stagex, or GuixSD if you limit yourself to more practical
           | distros with large package collections), but it is indeed
           | very good.
           | 
           | Did you mean to link to a different article?
        
             | looofooo0 wrote:
                | What does "high degree" mean? Either you achieve
                | bit-for-bit reproducibility of your builds or not.
        
               | kfajdsl wrote:
               | If 99% of the packages in your repos are bit-for-bit
               | reproducible, you can consider your distro to have a high
               | degree of reproducibility.
        
               | pxc wrote:
                | And with even a little bit of imagination, it's easy to
                | think of other possible measures of degrees of
                | reproducibility, e.g.:
                | 
                |     * % of deployed systems which consist only of
                |       reproducibly built packages
                |     * % of commonly downloaded disk images (install
                |       media, live media, VM images, etc.) consist only
                |       of reproducibly built packages
                |     * total # of reproducibly built packages available
                |     * comparative measures of what NixOS is doing
                |       right, like: of packages that are reproducibly
                |       built in some distros but not others, how many
                |       are built reproducibly in NixOS
                |     * binary bootstrap size (smaller is better,
                |       obviously)
               | 
               | It's really not difficult to think of meaningful ways
               | that reproducibility of different distros might be
               | compared, even quantitatively.
        
           | vlovich123 wrote:
           | Sure, but in terms of absolute number of packages that are
           | truly reproducible, they outnumber Debian because Debian only
           | targets reproducibility for a smaller fraction of total
           | packages & even there they're not 100%. I haven't been able
           | to find reliable numbers for Fedora on how many packages they
           | have & in particular how many this 99% is targeting.
           | 
           | By any conceivable metric Nix really is ahead of the pack.
           | 
           | Disclaimer: I have no affiliation with Nix, Fedora, Debian
           | etc. I just recognize that Nix has done a lot of hard work in
           | this space & Fedora + Debian jumping onto this is in no small
           | part thanks to the path shown by Nix.
        
             | Foxboron wrote:
             | They are not.
             | 
             | Arch hovers around 87%-90% depending on regressions.
             | https://reproducible.archlinux.org/
             | 
             | Debian reproduces 91%-95% of their packages (architecture
             | dependent) https://reproduce.debian.net/
             | 
             | > Disclaimer: I have no affiliation with Nix, Fedora,
             | Debian etc. I just recognize that Nix has done a lot of
             | hard work in this space & Fedora + Debian jumping onto this
             | is in no small part thanks to the path shown by Nix
             | 
             | This is _completely_ the wrong way around.
             | 
             | Debian spearheaded the Reproducible Builds efforts in 2016
             | with contributions from SUSE, Fedora and Arch. NixOS got
             | onto this as well but has seen less progress until the past
             | 4-5 years.
             | 
             | The NixOS efforts owes the Debian project _all_ their
             | thanks.
        
               | vlovich123 wrote:
               | From your own link:
               | 
               | > Arch Linux is 87.7% reproducible with 1794 bad 0
               | unknown and 12762 good packages.
               | 
               | That's < 15k packages. Nix by comparison has ~100k total
               | packages they are trying to make reproducible and has
               | about 85% of them reproducible. Same goes for Debian -
               | ~37k packages tracked for reproducible builds. One way to
               | lie with percentages is when the absolute numbers are so
               | disparate.
               | 
               | > This is completely the wrong way around. Debian
               | spearheaded the Reproducible Builds efforts in 2016 with
               | contributions from SUSE, Fedora and Arch. NixOS got onto
               | this as well but has seen less progress until the past
               | 4-5 years. The NixOS efforts owes the Debian project all
               | their thanks.
               | 
               | Debian organized the broader effort across Linux distros.
               | However the Nix project was designed from the ground up
               | around reproducibility. It also pioneered architectural
               | approaches that other systems have tried to emulate
               | since. I think you're grossly misunderstanding the role
               | Nix played in this effort.
        
               | Foxboron wrote:
               | > That's < 15k packages. Nix by comparison has ~100k
               | total packages they are trying to make reproducible and
               | has about 85% of them reproducible. Same goes for Debian
               | - ~37k packages tracked for reproducible builds. One way
               | to lie with percentages is when the absolute numbers are
               | so disparate.
               | 
| That's not a lie. That is the package target. The
                | `nixpkgs` repository in the same vein packages a huge
                | number of source archives and repackages entire
                | ecosystems into its own repository. This greatly
                | inflates the number of packages. You can't look at the
                | flat numbers.
               | 
               | > However the Nix project was designed from the ground up
               | around reproducibility.
               | 
               | It wasn't.
               | 
               | > It also pioneered architectural approaches that other
               | systems have tried to emulate since.
               | 
| This has had no bearing, and you are greatly
                | overestimating the technical details of nix here. It
                | was fundamentally invented in 2002, and things have
                | progressed since then. `rpath` hacking _really_ is not
                | magic.
               | 
               | > I think you're grossly misunderstanding the role Nix
               | played in this effort.
               | 
               | I've been contributing to the Reproducible Builds effort
               | since 2018.
        
         | 12345hn6789 wrote:
         | Nix is to Linux users what Linux is to normies.
        
         | __MatrixMan__ wrote:
         | In the near term it makes more sense to position nix as a
         | common interface between app developers and distro maintainers
         | and not as a direct-to-user way to cut their distro maintainers
         | out of the loop entirely (although it is quite useful for
         | that).
         | 
         | Ideally, a distro maintainer would come across a project
         | packaged with nix and think:
         | 
         | > Oh good, the app dev has taken extra steps to make life easy
         | for me.
         | 
         | As-is, I don't think that's the case. You can add a flake
         | output to your project which builds an .rpm or a .deb file, but
         | it's not commonly done.
         | 
| I'm guessing that most of the time, distro maintainers would
          | instead hook directly into a language-specific build tool
          | like cmake or cargo and ignore the nix stuff. They benefit
          | from nix only indirectly in cases where it has prevented the
          | app dev from doing crazy things in their build (or at least
          | has made that craziness explicit, versus some kind of works-
          | on-my-machine accident or some kind of nothing-to-see-here
          | skulduggery).
         | 
         | If we want to nixify the world I think we should focus less on
         | talking people out of using package managers which they like
         | and more on making the underlying packages more uniform.
        
         | skrtskrt wrote:
| Because Nix is a huge pain to ramp up on and to use for
          | anyone who is not an enthusiast about the state of their
          | computer.
         | 
         | What will happen is concepts from Nix will slowly get absorbed
         | into other, more user-friendly tooling while Nix circles the
         | complexity drain
        
         | diffeomorphism wrote:
         | Different notions of reproducible. This project cares
         | specifically about bit-for-bit identical builds (e.g. no time
         | stamps, parallel compile artifacts etc). Nix is more about
         | being declarative and "repeatable" or whatever a good name for
         | that would be.
         | 
         | Both notions are useful for different purposes and nix is not
         | particularly good at the first one.
         | 
         | https://reproducible-builds.org/citests/
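          | 
          | A tiny illustration of the bit-for-bit side (a hedged
          | sketch using the SOURCE_DATE_EPOCH convention from
          | reproducible-builds.org; the epoch value is an arbitrary
          | example and `date -d` is GNU-specific):

```shell
# Builds that stamp artifacts with "now" differ on every run; builds that
# honor a pinned SOURCE_DATE_EPOCH produce the same bytes each time.
export SOURCE_DATE_EPOCH=1577836800   # 2020-01-01 00:00:00 UTC, arbitrary

stamp_now=$(date -u +%s)                                             # varies per run
stamp_fixed=$(date -u -d "@$SOURCE_DATE_EPOCH" +%Y-%m-%dT%H:%M:%SZ)  # stable
echo "$stamp_fixed"   # always 2020-01-01T00:00:00Z, whenever you build
```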
        
         | jzb wrote:
         | Oh, I assure you, it's hard to escape knowing about Nix if you
         | write about this sort of thing. Someone will be along almost
         | immediately to inform you about it.
         | 
         | Nix wasn't mentioned (I'm the author) because it really isn't
         | relevant here -- the comparable distributions, when discussing
         | what Fedora is doing, are Debian and other distributions that
         | use similar packaging schemes and such.
        
           | __MatrixMan__ wrote:
           | I agree that NixOS/nixpkgs would not be a good a basis for
           | comparison. Do you have an opinion about the use of nix by
           | app devs to specify their builds, i.e. as a make alternative,
           | not as a Fedora alternative?
           | 
           | Quoting the article:
           | 
           | > Irreproducible bits in packages are quite often "caused by
           | an error or sloppiness in the code". For example, dependence
           | on hardware architecture in architecture-independent (noarch)
           | packages is "almost always unwanted and/or a bug", and
           | reproducibility tests can uncover those bugs.
           | 
           | This is the sort of thing that nix is good at guarding
           | against, and it's convenient that it doesn't require users to
           | engage with the underlying toolchain if they're unfamiliar
           | with it.
           | 
           | For instance I can use the command below to build helix at a
           | certain commit without even knowing that it's a rust package.
           | Although it doesn't guarantee all aspects of repeatability,
           | it will fail if the build depends on any bits for which a
           | hash is not known ahead of time, which gets you half way
           | there I think.                    nix build github:helix-
           | editor/helix/340934db92aea902a61b9f79b9e6f4bd15111044
           | 
           | Used in this way, can nix help Fedora's reproducibility
           | efforts? Or does it appear to Fedora as a superfluous layer
           | to be stripped away so that they can plug into cargo more
           | directly?
        
       | froh wrote:
       | nice to see they're in this too.
       | 
       | https://news.opensuse.org/2025/02/18/rbos-project-hits-miles...
        
       | charcircuit wrote:
| This is a waste of time compared to investing in sandboxing,
        | which will actually protect users, as opposed to stopping
        | theoretical attacks. Fedora's sandboxing capabilities for apps
        | are so far behind other operating systems like Android that
        | sandboxing is a much more important area to address.
        
         | johnny22 wrote:
         | I think you have to do both sandboxing and this.
        
           | charcircuit wrote:
           | Both are good for security, but prioritization is important.
           | The areas that are weakest in terms of security should get
           | the most attention.
        
         | AshamedCaptain wrote:
| I have yet to see a form of sandboxing for the desktop that
          | is not:
         | 
         | a) effectively useless
         | 
         | or b) makes me want to throw my computer through the window and
         | replace it with a 1990's device (still more useful than your
         | average Android).
        
         | fsflover wrote:
         | If you want security through compartmentalization, you should
         | consider Qubes OS, my daily driver, https://qubes-os.org.
        
           | charcircuit wrote:
| This only secures between VMs. It sidesteps the problem,
            | and people can still easily run multiple applications in
            | the same qube.
        
             | fsflover wrote:
             | It's impossible to isolate applications inside one VM as
             | securely as with Qubes virtualization. You should not rely
             | on intra-VM hardening if you really care about security.
             | Having said that, Qubes does provide ways to harden the
             | VMs: https://forum.qubes-os.org/t/hardening-qubes-
             | os/4935/3, https://forum.qubes-os.org/t/replacing-
             | passwordless-root-wit....
        
               | charcircuit wrote:
| People may want to have multiple apps work together. It
                | makes more sense to have security within a qube itself
                | than to just declare it a free-for-all.
        
               | fsflover wrote:
               | If the apps work together, they typically belong to the
               | same security domain / trust level. Do you have examples
               | when you still have to isolate them from each other?
        
         | PhilippGille wrote:
         | > Fedora's sandbox capabilities for apps
         | 
         | Do you mean Flatpaks or something else?
        
           | charcircuit wrote:
           | Sure, that is one solution, but it still needs a lot of work
           | both to patch up holes in it and to fix apps to be better
           | designed in regards to security.
        
         | colonial wrote:
         | Defaulting to Android-style nanny sandboxing ("you can't grant
         | access to your Downloads folder because we say so" etc.) is
         | unlikely to go over well with the average Linux distro
         | userbase.
         | 
         | Also, maximally opt-in sandboxes for graphical applications
         | have been possible for a while. Just use Podman and only mount
         | your Wayland socket + any working files.
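          | 
          | A hedged sketch of what that Podman approach might look
          | like (the image name and paths are assumptions; the flags
          | are standard `podman run` options; here we only assemble
          | and print the command rather than running it):

```shell
# Minimal opt-in sandbox: drop all capabilities, forbid privilege
# escalation, and mount only the Wayland socket plus one working directory.
: "${WAYLAND_DISPLAY:=wayland-0}"          # fall back to common defaults
: "${XDG_RUNTIME_DIR:=/run/user/1000}"
app_image=docker.io/example/someapp:latest  # hypothetical image

cmd="podman run --rm --cap-drop=ALL --security-opt=no-new-privileges \
  -e WAYLAND_DISPLAY=$WAYLAND_DISPLAY \
  -v $XDG_RUNTIME_DIR/$WAYLAND_DISPLAY:$XDG_RUNTIME_DIR/$WAYLAND_DISPLAY \
  -v $HOME/work/project:/work \
  $app_image"
echo "$cmd"
```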
        
           | charcircuit wrote:
           | >Defaulting to Android-style nanny sandboxing ("you can't
           | grant access to your Downloads folder because we say so"
           | etc.) is unlikely to go over well with the average Linux
           | distro userbase.
           | 
| If you market it that way. Plenty of Linux users say they
            | care about security, don't want malware, etc. This is a
            | step towards those desires. Users have been conditioned for
            | decades to use tools badly designed for security, so there
            | will be some growing pains, but it will get worse the
            | longer people wait.
           | 
           | >Just use Podman and only mount your Wayland socket + any
           | working files.
           | 
           | This won't work for the average user. Security needs to be
           | accessible.
        
       | nimish wrote:
       | As a user of fedora what does this actually get me? I mean I
       | understand it for hermetic builds but why?
        
         | jacobgkau wrote:
         | My impression is that reproducible builds improve your security
         | by helping make it more obvious that packages haven't been
         | tampered with in late stages of the build system.
         | 
         | * Edit, it's quoted in the linked article:
         | 
         | > Jedrzejewski-Szmek said that one of the benefits of
         | reproducible builds was to help detect and mitigate any kind of
         | supply-chain attack on Fedora's builders and allow others to
         | perform independent verification that the package sources match
         | the binaries that are delivered by Fedora.
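          | 
          | The independent-verification loop described there reduces
          | to: rebuild from the same sources, then compare digests. A
          | toy sketch, where a trivial deterministic "build" function
          | stands in for a real rpmbuild run:

```shell
# Hedged toy of independent verification: if the build is reproducible,
# any rebuilder gets a byte-identical artifact, so digests must match.
build() {
  printf 'payload-v1\n' > "$1"     # stand-in for compiling the package
}
build official.bin                 # what the distro shipped
build rebuilt.bin                  # what an independent rebuilder produced

a=$(sha256sum official.bin | cut -d' ' -f1)
b=$(sha256sum rebuilt.bin  | cut -d' ' -f1)
if [ "$a" = "$b" ]; then
  echo "REPRODUCIBLE: binaries match the sources"
else
  echo "MISMATCH: possible build-infrastructure tampering"
fi
```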
        
           | Zamicol wrote:
           | Bingo.
        
           | kazinator wrote:
           | The supply chain attacks you have to most worry about are not
           | someone breaking into Fedora build machines.
           | 
           | It's the attacks on the upstream packages themselves.
           | 
           | Reproducible builds would absolutely not catch a situation
           | like the XZ package being compromised a year ago, due to the
           | project merging a contribution from a malicious actor.
           | 
           | A downstream package system or OS distro will just take that
           | malicious update and spin it into a beautifully reproducing
           | build.
        
             | yjftsjthsd-h wrote:
             | Don't let the perfect be the enemy of the good; this
             | doesn't prevent upstream problems but it removes one place
             | for compromises to happen.
        
               | kazinator wrote:
| I'm not saying don't have reproducible builds; it's
                | just that this is an unimportant justification for
                | them, almost unnecessary.
                | 
                | Reproducible builds are such an overwhelmingly good and
                | obvious thing that build farm security is just a
                | footnote.
        
             | phkahler wrote:
             | And anything designed to catch upstream problems like the
             | XZ compromise will not detect a compromise in the Fedora
             | package build environment. Kinda need both.
        
             | bluGill wrote:
| Reproducible builds COULD fix the xz issue. The current
          | level would not, but GitHub could make creating the
          | downloadable packages scriptable and thus reproducible.
          | Fedora could check out the git hash instead of downloading
          | the provided tarball and again get reproducible builds that
          | bypass this.
          | 
          | The above are things worth looking at doing.
          | 
          | However, I'm not sure what you can do about code that tries
          | to obscure the issues while looking good.
        
         | bagels wrote:
         | It's one tool of many that can be used to prevent malicious
         | software from sneaking in to the supply chain.
        
         | russfink wrote:
         | Keep in mind that compilers can be backdoored to install
         | malicious code. Bitwise/signature equivalency does not imply
         | malware-free software.
        
           | bluGill wrote:
           | True, but every step we add makes the others harder too. It
           | is unlikely Ken Thompson's "trusting trust" compiler would
           | detect modern gcc, much less successfully introduce the
           | backdoor. Even if you start with a compromised gcc of that
           | type there is a good chance that after a few years it would
           | be caught when the latest gcc fails to build anymore for
           | someone with the compromised compiler. (now add clang and
           | people using that...)
           | 
           | We may never reach perfection, but the more steps we make in
           | that direction the more likely it is we reach a point where
           | we are impossible to compromise in the real world.
        
         | kazinator wrote:
         | Reproducible builds can improve software quality.
         | 
| If we believe we have a reproducible build, that constitutes
          | a big test case which gives us confidence in the
          | _determinism_ of the whole software stack.
         | 
         | To validate that test case, we actually have to repeat the
         | build a number of times.
         | 
         | If we spot a difference, something is wrong.
         | 
| For instance, suppose that a compiler being used has a bug
          | whereby it relies on the value of an uninitialized variable
          | somewhere. That could show up as a difference in the code it
          | generates.
         | 
         | Without reproducible builds, of course there are always
         | differences in the results of a build: we cannot use repeated
         | builds to discover that something is wrong.
         | 
| (People do diffs between irreproducible builds anyway. For
          | instance, disassemble the old and new binaries, and do a
          | textual diff, validating that only some expected changes are
          | present, like string literals that have embedded build dates.
          | If you have reproducible builds, you don't have to do that
          | kind of thing to detect a change.)
         | 
         | Reproducible builds will strengthen the toolchains and
         | surrounding utilities. They will flush out instabilities in
         | build systems, like parallel Makefiles with race conditions, or
         | indeterminate orders of object files going into a link job,
         | etc.
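          | 
          | The mask-expected-noise-then-diff workflow can be sketched
          | like this (a toy with text files standing in for
          | disassembled binaries; the "built:" marker is an assumed
          | example of an embedded build date):

```shell
# Two "builds" that differ only in an embedded build date. Mask the
# expected difference, then diff what remains: anything left over is an
# unexpected (and suspicious) change.
printf 'code A\nbuilt: 2025-04-10\ncode B\n' > build1.txt
printf 'code A\nbuilt: 2025-04-11\ncode B\n' > build2.txt

sed 's/built: [0-9-]*/built: (masked)/' build1.txt > b1.norm
sed 's/built: [0-9-]*/built: (masked)/' build2.txt > b2.norm

diff b1.norm b2.norm && echo "only expected changes present"
```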
        
           | uecker wrote:
           | I don't think it is that unlikely that build hosts or some
           | related part of the infrastructure gets compromised.
        
           | tomcam wrote:
           | I don't know this area, but it seems to me it might be a boon
           | to security? So that you can tell if components have been
           | tampered with?
        
             | bobmcnamara wrote:
             | Bingo. We caught a virus tampering with one of our code
             | gens this way.
        
             | dwheeler wrote:
             | Yes! The attack on SolarWinds Orion was an attack on its
             | build process. A verified reproducible build would have
             | detected the subversion, because the builds would not have
             | matched (unless the attackers managed to detect and break
             | into all the build processes).
        
       | Dwedit wrote:
       | Reproducibility is at odds with Profile-Guided-Optimization.
       | Especially on anything that involves networking and other IO that
       | isn't consistent.
        
         | michaelt wrote:
         | Why should it be?
         | 
         | Does the profiler not output a hprof file or whatever, which is
         | the input to the compiler making the release binary? Why not
         | just store that?
        
         | gnulinux wrote:
| It's not at odds at all, but it'll be "monadic" in the sense
          | that the output of system A will be part of the input to
          | system A+1, which is complicated to organize in a systems
          | setting, especially if you don't have access to a tool that
          | can verify it. But it's absolutely achievable if you do have
          | such a tool; e.g., you can do this in nix.
        
         | zbobet2012 wrote:
         | That's only the case if you did PGO with "live" data instead of
         | replays from captured runs, which is best practice afaik.
        
         | nrvn wrote:
         | from Go documentation[0]:
         | 
         | > Committing profiles directly in the source repository is
         | recommended as profiles are an input to the build important for
         | reproducible (and performant!) builds. Storing alongside the
         | source simplifies the build experience as there are no
         | additional steps to get the profile beyond fetching the source.
         | 
         | I very much hope other languages/frameworks can do the same.
         | 
         | [0]: https://go.dev/doc/pgo#building
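          | 
          | A hedged sketch of that committed-profile workflow (Go >=
          | 1.21 automatically uses a default.pgo file in the main
          | package directory; the profiling command shown in comments
          | is one common way to collect a profile):

```shell
# The profile becomes a committed build input, so the build stays
# deterministic: everyone compiles against the same default.pgo bytes.
profile=default.pgo
# go test -cpuprofile=cpu.pprof ./...   # 1. profile a representative run
# cp cpu.pprof "$profile"               # 2. commit default.pgo with the source
# go build ./...                        # 3. the build picks up default.pgo itself
echo "build would use profile: $profile"
```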
        
           | nyrikki wrote:
| The _performant_ claim there runs counter to research I
            | have seen. Also, since PGO profile data is non-
            | deterministic in most cases, even when compiled on the same
            | hardware as the end machine, "committing profiles directly
            | in the source repository" is the reason why they are
            | deleted or at least excluded from the comparison.
           | 
           | A quote from the paper that I remember on the subject[1] as
           | these profiles are just about as machine dependent as you can
           | get.
           | 
           | > Unfortunately, most code improvements are not machine
           | independent, and the few that truly are machine independent
           | interact with those that are machine dependent causing phase-
           | ordering problems. Hence, effectively there are no machine-
           | independent code improvements.
           | 
           | There were some differences between various Xeon chip's
           | implementations of the same or neighboring generations that I
           | personally ran into when we tried to copy profiles to avoid
           | the cost of the profile runs that may make me a bit more
           | sensitive to this, but I personally saw huge drops in
           | performance well into the double digits that threw off our
           | regression testing.
           | 
           | IMHO this is exactly why your link suggested the following:
           | 
           | > Your production environment is the best source of
           | representative profiles for your application, as described in
           | Collecting profiles.
           | 
           | That is very different from Fedora using some random or
           | generic profile for x86_64, which may or may not match the
           | end users specific profile.
           | 
           | [1] https://dl.acm.org/doi/10.5555/184716.184723
        
             | clhodapp wrote:
| If those differences matter so much for your workloads,
              | treat your different machine types as different
              | architectures, commit profiling data for _all_ of them,
              | and (deterministically) compile individual builds for
              | all of them.
              | 
              | Fedora upstream was never going to do that for you anyway
              | (way too many possible hardware configurations), so you
              | were already going to be in the business of setting that
              | up for yourself.
        
         | nyrikki wrote:
| This is one of the "costs" of reproducible builds, just like
          | the requirement to use pre-configured seeds for pseudo-random
          | number generators, etc.
          | 
          | It does hit real projects, and it may be part of the reason
          | that "99%" is called out; Fedora also mentions that they
          | can't match the _official_ reproducible-builds.org meaning
          | just due to how RPMs work, so we will see what other
          | constraints they have to loosen.
         | 
         | Here is one example of where suse had to re-enable it for gzip.
         | 
         | https://build.opensuse.org/request/show/499887
         | 
         | Here is a thread on PGO from the reproducible-builds mail list.
         | 
         | https://lists.reproducible-builds.org/pipermail/rb-general/2...
         | 
         | There are other _costs_ like needing to get rid of parallel
         | builds for some projects that make many people loosen the
         | official constraints. The value of PGO+LTO being one.
         | 
         | gcda profiles are unreproducible, but the code they produce is
         | typically the same. If you look into the pipeline of some
         | projects, they just delete the gcda output and then often try a
         | rebuild if the code is different or other methods.
         | 
         | While there are no ideal solutions, one that seems to work
         | fairly well, assuming the upstream is doing reproducible
         | builds, is to vendor the code, build a reproducible build to
         | validate that vendored code, then enable optimizations.
         | 
          | But I get that not everyone agrees that the value of
          | reproducibility is primarily avoiding attacks on build
          | infrastructure.
          | 
          | However, reproducible builds have nothing to do with MSO
          | model checking etc., as some have claimed. Much of it is just
          | deleting non-deterministic data, as you can see here with
          | debian's tooling, which fedora copied.
          | 
          | https://salsa.debian.org/reproducible-builds/strip-nondeterm...
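          | 
          | The core normalization trick, in miniature (a hedged sketch
          | using GNU tar's determinism flags rather than strip-
          | nondeterminism itself; paths and the clamp date are
          | arbitrary examples):

```shell
# Clamp timestamps, fix member ordering, and zero ownership so the same
# inputs always produce byte-identical archives.
mkdir -p pkg && printf 'hello\n' > pkg/file

tar --format=ustar --mtime='2020-01-01 00:00:00 UTC' --sort=name \
    --owner=0 --group=0 --numeric-owner -cf one.tar pkg
touch pkg/file                     # perturb the mtime, as a rebuild would
tar --format=ustar --mtime='2020-01-01 00:00:00 UTC' --sort=name \
    --owner=0 --group=0 --numeric-owner -cf two.tar pkg

sha256sum one.tar two.tar          # identical digests despite the touch
```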
         | 
          | Since increasing the granularity of address-space
          | randomization at compile and link time is easier than at the
          | start of program execution, there will obviously be a cost
          | (more than paid for by reducing supply-chain risks, IMHO) of
          | reduced entropy for address randomization, which does
          | increase the risk of ROP-style attacks.
          | 
          | Regaining that entropy at compile and link time, if it is
          | practical to recompile packages or vendor, may be worth the
          | effort in some situations; probably best to do real PGO at
          | that time too, IMHO.
        
           | goodpoint wrote:
           | Yo, the attacker has access to the same binaries, so only
           | runtime address randomization is useful.
        
       | barotalomey wrote:
       | The real treasure was the friend I found along the way
       | 
       | https://github.com/keszybz/add-determinism
        
         | m463 wrote:
         | I kind of wonder if this or something similar could somehow
         | nullify timestamps so you could compare two logfiles...
         | 
         | further would be the ability to compare logfiles with pointer
         | addresses or something
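          | 
          | Normalizing timestamps before diffing does exactly that. A
          | hedged sketch (the regex assumes ISO-8601-style prefixes;
          | real logs, and pointer addresses, would need their own
          | patterns):

```shell
# Two runs of the same service, differing only in when they started.
cat > run1.log <<'EOF'
2025-04-11T13:40:01Z starting service
2025-04-11T13:40:02Z listening on :8080
EOF
cat > run2.log <<'EOF'
2025-04-11T22:15:30Z starting service
2025-04-11T22:15:31Z listening on :8080
EOF

# Replace the leading timestamp with a fixed token, then diff.
norm() { sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9:]{8}Z/<ts>/' "$1"; }
norm run1.log > n1
norm run2.log > n2
diff n1 n2 && echo "logs match modulo timestamps"
```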
        
           | didericis wrote:
           | A different but more powerful method of ensuring
           | reproducibility is more rigorous compilation using formally
           | verifiable proofs.
           | 
           | That's what https://pi2.network/ does. It uses K-Framework,
           | which is imo very underrated/deserves more attention as a
           | long term way of solving this kind of problem.
        
       ___________________________________________________________________
       (page generated 2025-04-11 23:00 UTC)