[HN Gopher] Fedora change aims for 99% package reproducibility
___________________________________________________________________
Fedora change aims for 99% package reproducibility
Author : voxadam
Score : 292 points
Date : 2025-04-11 13:40 UTC (9 hours ago)
(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)
| ajross wrote:
| Linux folks continue running away with package security
| paradigms while NPM, PyPI, Cargo, et al. (like that VSCode
| extension registry that was on the front page last week) think
| they can still get away with just shipping what some rando
| pushes.
| hedora wrote:
| Shipping what randos push works great for iOS and Android too.
|
| System perl is actually good. It's too bad the Linux vendors
| don't bother with system versions of newer languages.
| heinrich5991 wrote:
| System rustc is also good on Arch Linux. I think system
| rustc-web is also fine on Debian.
| WD-42 wrote:
| I recently re-installed and instead of installing rustup I
| thought I'd give the Arch rust package a shot. Supposedly
| it's not the "correct" way to do it but so far it's working
| great, none of my projects require nightly. Updates have
| come in maybe a day after upstream. One less thing to think
| about, I like it.
| g-b-r wrote:
| Sure, there's never malware on the stores...
| ajross wrote:
| > Shipping what randos push works great for iOS and Android
| too.
|
| App store software is _excruciatingly_ vetted, though. Apple
| and Google spend far, far, _FAR_ more on validating the
| software they ship to customers than Fedora or Canonical, and
| it's not remotely close.
|
| It only looks like "randos" because the armies of auditors
| and datacenters of validation software are hidden behind the
| paywall.
| IshKebab wrote:
| It's really not. At least on Android they have some
| automated vetting but that's about it. Any human vetting is
| mostly for business reasons.
|
| Also Windows and Mac have existed for decades and there's
| zero vetting there. Yeah, malware exists, but it's easy to
| avoid and _easily_ worth the benefit of actually being able
| to get up-to-date software from anywhere.
| skydhash wrote:
| > _Also Windows and Mac have existed for decades and
| there's zero vetting there._
|
| Isn't that only for applications? All the system software
| is provided and vetted by the OS developer.
| IshKebab wrote:
| Yeah but distro package repositories mostly consist of
| applications and non-system libraries.
| skydhash wrote:
| Most users only have a handful of applications
| installed. They may not be the same ones, but flatpak is
| easy to set up if you want the latest version. And you're
| not tied to the system package manager if you want the
| latest for CLI software (nix, brew, toolbox, ...)
|
| The nice thing about Debian is that you can have 2 full
| years of routine maintenance while getting ready for
| the next big update. The main issue is upstream
| developers shipping bug fixes and feature updates in the
| same patch.
| ndiddy wrote:
| The vetting on Windows is that basically any software
| that isn't signed by an EV certificate will show a scary
| SmartScreen warning on users' computers. Even if your
| users aren't deterred by the warning, you also have a
| significant chance of your executable getting mistakenly
| flagged by Windows Defender, and then you have to file a
| request for Microsoft to whitelist you.
|
| The vetting on Mac is that any unsigned software will
| show a scary warning and make your users have to dig into
| the security options in Settings to get the software to
| open.
|
| This isn't really proactive, but it means that if you
| ship malware, Microsoft/Apple can revoke your
| certificate.
|
| If you're interested in something similar to this
| distribution model on Linux, I would check out Flatpak.
| It's similar to how distribution works on Windows/Mac
| with the added benefit that updates are handled centrally
| (so you don't need to write auto-update functionality
| into each program) and that all programs are manually
| vetted both before they go up on Flathub and when they
| change any permissions. It also doesn't cost any money to
| list software, unlike the "no scary warnings"
| distribution options for both Windows and Mac.
| ykonstant wrote:
| >App store software is excruciatingly vetted, though. Apple
| and Google spend far, far, FAR more on validating the
| software they ship to customers than Fedora or Canonical,
| and it's not remotely close.
|
| Hahah! ...they don't. They really don't, man. They _do_
| have procedures in place that make them look like they do,
| though; I'll give you that.
| anotherhue wrote:
| I have observed a sharp disconnect in the philosophies of
| 'improving developer experience' and 'running a tight ship'.
|
| I think the last twenty years of quasi-
| marketing/sales/recruiting DevRel roles have pushed a narrative
| of frictionless development, while on the flip side security
| and correctness have mostly taken a back seat (special
| industries aside).
|
| I think it's a result of the massive market growth, but I so
| welcome the pendulum swinging back a little bit. Typo-squatting
| packages being a concern at the same time as speculative
| execution exploits shows mind-bending immaturity.
| psadauskas wrote:
| "Security" and "Convenience" is always a tradeoff, you can
| never have both.
| bluGill wrote:
| True. Then the Convenience folks don't understand why the
| rest of us don't want the things they think are so great.
|
| There are good middle grounds, but most package managers
| don't even acknowledge other concerns as valid.
| benterix wrote:
| This is obvious; the question here is why everybody traded
| security for convenience and what else has to happen for
| people to start taking security seriously.
| ziddoap wrote:
| > _the question here is why everybody traded security for
| convenience_
|
| I don't think security was traded away for convenience.
| Everything started with convenience, and security has
| been trying to gain ground ever since.
|
| > _happen for people to start taking security seriously_
|
| Laws with _enforced_ and _non-trivial_ consequences are
| the only thing that will force people to take security
| seriously. And even then, most probably still won't.
| vladms wrote:
| Regarding "what else has to happen": I would say
| something catastrophic. Nothing comes to mind recently.
|
| Security is good, but occasionally I wonder if technical
| people aren't imagining fantastical scenarios of evil
| masterminds doing something with the data and managing to
| rule the world.
|
| While in reality, at least over the last 5 years, there are
| so many leaders (and people) doing and saying such plainly
| stupid things that I feel we should be more afraid of stupid
| people than of hackers.
| bluGill wrote:
| In the last 5 years several major medical providers have
| had the sensitive personal data of nearly everyone compromised.
| The political leaders are the biggest problem today, but that
| could change again.
| yjftsjthsd-h wrote:
| It's not quite a straight trade; IIRC the OpenBSD folks
| really push on good docs and maybe good defaults precisely
| because making it easier to hold the tool right makes it
| safer.
| cyrnel wrote:
| I've seen this more formalized as a triangle, with
| "functionality" being the third point:
| https://blog.c3l-security.com/2019/06/balancing-functionalit...
|
| You can get secure and easy-to-use tools, but they
| typically have to be really simple things.
| phkahler wrote:
| I think that's a consequence of programmers making tools for
| programmers. It's something I've come to really dislike.
| Programmers are used to doing things like editing
| configurations, setting environment variables, or using
| custom code to solve a problem. As a result you get programs
| that can be configured, customized through code (scripting or
| extensions), and little tools to do whatever. This is not
| IMHO how good software should be designed to be used. On the
| positive side, we have some really good tooling - revision
| control in the software world is way beyond the equivalent in
| any other field. But then git could be used in other fields
| if not for it being a programmer's tool designed by
| programmers... A lot of developers even have trouble doing
| things with git that are outside their daily use cases.
|
| Dependency management tools are tools that come about because
| it's easier and more natural for a programmer to write some
| code than solve a bigger problem. Easier to write a tool than
| write your own version of something or clean up a complex set
| of dependencies.
| esafak wrote:
| The future is not evenly distributed.
| Palomides wrote:
| distros get unbelievable amounts of hate for not immediately
| integrating upstream changes, there's really no winning
| IshKebab wrote:
| Rightly so. The idea that all software should be packaged for
| all distros, and that you shouldn't want to use the latest
| version of software is clearly ludicrous. It only seems
| vaguely reasonable because it's what's always been done.
|
| If Linux had evolved a more sensible system and someone came
| along and suggested "no actually I think each distro should
| have its own package format and they should all be
| responsible for packaging all software in the world, and they
| should use old versions too for stability" they would rightly
| be laughed out of the room.
| bluGill wrote:
| There are too many distributions / formats. However,
| distribution packages are much better than
| snap/flatpak/docker for most uses; the only hard part is
| that there are so many that no program can put "and is
| integrated into the package manager" in its release
| steps. You can ship a docker container with your program
| release - it is often done, but rarely what should be done.
| oivey wrote:
| For some cases maybe that makes sense, but in a very
| large percentage it does not. As an example, what if I want
| to build and use/deploy a Python app that needs the
| latest NumPy, and the system package manager doesn't have
| it? It would be hard to justify figuring out and
| building a distro-specific package for this rather than
| just using the Python package on PyPI.
| bluGill wrote:
| The point is the distro should provide the numpy you
| need.
| aseipp wrote:
| And what happens when they can't do that because you need
| the latest major version with specific features?
| michaelt wrote:
| _> The idea that [...] you shouldn't want to use the
| latest version of software is clearly ludicrous._
|
| To get to that world, we developers would have to give up
| making breaking changes.
|
| We can't have any "your python 2 code doesn't work on
| python 3" nonsense.
|
| Should we stop making breaking changes? Maybe. Will we? No.
| dagw wrote:
| _We can't have any "your python 2 code doesn't work on
| python 3" nonsense_
|
| This only happens because distros insist on shipping
| Python and then everyone insists on using that Python to
| run their software.
|
| In an alternate world everybody would just ship their own
| Python with their own app and not have that problem.
| That's basically how Windows solves this.
| bluGill wrote:
| Which sounds good until you run out of disk space.
|
| Of course I grew up when hard drives were not affordable
| by normal people - my parents had to save for months to
| get me a floppy drive.
| charcircuit wrote:
| You can have both python 2 and python 3 installed. Apps
| should get the dependencies they request. Distros
| swapping dependencies out from under them has caused
| numerous issues for developers.
| marcusb wrote:
| What is a distribution but a collection of software
| packaged in a particular way?
| danieldk wrote:
| nixpkgs packages pretty much everything I need. It's a very
| large package set and very fresh. It's mostly about culture
| and tooling. I tried to contribute to Debian once and gave
| up after months. I was contributing to nixpkgs days after I
| started using Nix.
|
| Having every package as part of a distribution is immensely
| useful. You can declaratively define your whole system with
| all software. I can roll out a desktop, development VM or
| server within 5 minutes and it's fully configured.
| jorvi wrote:
| They do not. Even the vast majority of Arch users think the
| policy of "integrate breakage anyway and post the warning on
| the web changelog" is pants-on-head insane, especially
| compared to something like SUSE Tumbleweed (also a rolling
| distro), where things get tested and will stay staged if
| broken.
| loeg wrote:
| > They do not.
|
| I present to you sibling comment posted slightly before
| yours: https://news.ycombinator.com/item?id=43655093
|
| They do.
| zer00eyz wrote:
| Distros get real hate for being so out of date that upstream
| gets a stream of bug reports on old and solved issues.
|
| A prime example of this is what the Bottles dev team has done.
|
| It isn't an easy problem to solve.
| 6SixTy wrote:
| That's, in my experience, mostly Debian Stable.
| tsimionescu wrote:
| I think the opposite is mostly true. Linux packaging folks are
| carefully sculpting their toys, while everyone else is mostly
| using upstream packages and docker containers to work around
| the beautiful systems. For half the software I care about on my
| Debian system, I have a version installed either directly from
| the web (curl | bash style), from the developer's own APT repo,
| or most likely from a separate package manager (be it MELPA,
| pypi, Go cache, Maven, etc).
| justinrubek wrote:
| That sounds like an incredibly annoying way to manage
| software. I don't have a single thing installed that way.
| kombine wrote:
| I use the Nix package manager on 3 of the systems I work on
| daily (one of them an HPC cluster) and none of them run
| NixOS. It's possible to carefully sculpt one's tools and use
| the latest and greatest.
| sheepscreek wrote:
| YES! I want more tools to be deterministic. My wish-list has
| Proxmox config at the very top.
| TheDong wrote:
| Want to give this a try and see if it works?
| https://github.com/SaumonNet/proxmox-nixos?tab=readme-ov-fil...
| knowitnone wrote:
| 99%? Debbie Downer says it only takes 1 package to screw the
| pooch
| ethersteeds wrote:
| I would still much prefer playing 100:1 Russian roulette than
| 1:1, if those are my options.
| nwah1 wrote:
| There's a long tail of obscure packages that are rarely used,
| and almost certainly a power law in terms of which packages are
| common. Reproducibility often requires coordination between
| both the packagers and the developers, and achieving that for
| each and every package is optimistic.
|
| If they just started quarantining the long tail of obscure
| packages, then people would get upset. And failing to be 100%
| reproducible will make a subset of users upset. Lose-lose
| proposition there, given that intelligent users could just
| consciously avoid packages that aren't passing reproducibility
| tests.
|
| 100% reproducibility is a good goal, but as long as the
| ubiquitous packages are reproducible then that is probably
| going to cover most. Would be interesting to provide an easy
| way to disallow non-reproducible packages.
|
| I'm sure one day they will be able to make it a requirement for
| inclusion into the official repos.
| yjftsjthsd-h wrote:
| There's an interesting thought - in addition to aiming for
| 99% of _all_ packages, perhaps it would be a good idea to
| target 100% of the packages that, say, land in the official
| install media? (I wouldn't even be surprised if they already
| meet that goal TBH, but making it explicit and documenting it
| has value)
| EasyMark wrote:
| "All I see is 1% of complete failure" --Bad Dads everywhere
| patrakov wrote:
| This goal feels like a marketing OKR to me. A proper technical
| goal would be "all packages, except the ones that have a valid
| reason, such as signatures, not to be reproducible".
| RegnisGnaw wrote:
| As someone who dabbles a bit in the RHEL world, IIRC all
| packages in Fedora are signed. In addition, the DNF/Yum
| metadata is also signed.
|
| IIRC Debian packages themselves are not signed, but the apt
| metadata is signed.
| westurner wrote:
| I learned this from an ansible molecule test env setup script
| for use in containers and VMs years ago; because `which`
| isn't necessarily installed in containers for example:
| type -p apt && (set -x; apt install -y debsums; debsums | grep -v 'OK$') || \
|   type -p rpm && rpm -Va  # --verify --all
|
| dnf reads .repo files from /etc/yum.repos.d/ [1] which have
| various gpg options; here's an /etc/yum.repos.d/fedora-updates.repo:
|
|   [updates]
|   name=Fedora $releasever - $basearch - Updates
|   #baseurl=http://download.example/pub/fedora/linux/updates/$releasever/Everything/$basearch/
|   metalink=https://mirrors.fedoraproject.org/metalink?repo=updates-released-f$releasever&arch=$basearch
|   enabled=1
|   countme=1
|   repo_gpgcheck=0
|   type=rpm
|   gpgcheck=1
|   metadata_expire=6h
|   gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch
|   skip_if_unavailable=False
|
| From the dnf conf docs [1], there are actually even more
| per-repo gpg options:
|
|   gpgkey
|   gpgkey_dns_verification
|   repo_gpgcheck
|   localpkg_gpgcheck
|   gpgcheck
|
| 1. https://dnf.readthedocs.io/en/latest/conf_ref.html#repo-opti...
|
| 2. https://docs.ansible.com/ansible/latest/collections/ansible/...
|    lists a gpgcakey parameter for the
|    ansible.builtin.yum_repository module
|
| For Debian, Ubuntu, Raspberry Pi OS and other dpkg .deb and
| apt distros:
|
|   man sources.list
|   man sources.list | grep -i keyring -C 10   # trusted:, signed-by:, /etc/apt/trusted.gpg.d/
|   man apt-secure
|   man apt-key
|   apt-key help
|   less "$(type -p apt-key)"
|
| signing-apt-repo-faq:
| https://github.com/crystall1nedev/signing-apt-repo-faq
|
| From "New requirements for APT repository signing in 24.04"
| (2024) https://discourse.ubuntu.com/t/new-requirements-for-
| apt-repo... :
|
| > _In Ubuntu 24.04, APT will require repositories to be
| signed using one of the following public key algorithms: [
| RSA with at least 2048-bit keys, Ed25519, Ed448 ]_
|
| > _This has been made possible thanks to recent work in GnuPG
| 2.4 by Werner Koch to allow us to specify a "public key
| algorithm assertion" in APT when calling the gpgv tool for
| verifying repositories._
| 0zymandiass wrote:
| If you'd bothered to read:
|
| ```This definition excludes signatures and some metadata and
| focuses solely on the payload of packaged files in a given RPM:
| A build is reproducible if given the same source code, build
| environment and build instructions, and metadata from the build
| artifacts, any party can recreate copies of the artifacts that
| are identical except for the signatures and parts of
| metadata.```
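The quoted definition boils down to comparing independently produced artifacts byte for byte. A minimal sketch of that check (my own illustration, not Fedora's tooling; real RPM comparisons first strip signatures and the excluded metadata, e.g. by unpacking with rpm2cpio, before hashing):

```shell
#!/bin/sh
# Two independently produced artifacts "reproduce" if they are
# bit-for-bit identical; here we simply hash whole files.
payloads_match() {
    [ "$(sha256sum "$1" | cut -d' ' -f1)" = "$(sha256sum "$2" | cut -d' ' -f1)" ]
}

# Stand-in files for two independent rebuilds of the same package
# (the names are illustrative):
printf 'same payload' > build-a.bin
printf 'same payload' > build-b.bin
payloads_match build-a.bin build-b.bin && echo reproducible
# prints: reproducible
```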
| eru wrote:
| At Google SRE we often had very technical OKRs that were
| formulated with some 'number of 9s'. Like 99.9999% uptime or
| something like that. So getting two 9s of reproducibility seems
| like a reasonable first goal. I hope they will be adding more
| nines later.
| binarymax wrote:
| I often see initiatives and articles like this but no mention of
| Nix. Is it just not well known enough for comparison? Because to
| me that's the standard.
| esseph wrote:
| It's an article about Fedora, specifically.
| binarymax wrote:
| Yes, I know that. But when talking about reproducible
| packages, we can and should learn from existing techniques.
| Is Fedora's system capable of this goal? Is it built the right
| way? Should it adopt an alternate package manager to achieve
| this with less headache?
| dogleash wrote:
| "What about Nix tho?" is the "Rewrite it in Rust" of
| reproducible builds.
| johnny22 wrote:
| No other current distro, whether it be Debian, Fedora,
| Arch, or openSUSE, seems to want to switch up its design
| to match Nix's approach.
|
| They have too many people familiar with the current
| approaches.
| djha-skin wrote:
| It's very, very complicated. It's so far past the maximum
| effort line of most linux users as to be in its own class of
| tools. Reproducibility in the imperative package space is worth
| a lot. Lots of other tools are built on RPM/DEB packages that
| offer advantages similar to Nix's -- Ansible, for one. This is
| more of a "rising tide lifts all boats" situation.
| steeleduncan wrote:
| I use Nix extensively, but the Nix daemon doesn't do much of
| use that can't be achieved by building your code from a fixed
| OCI container with internet turned off. The latter is certainly
| more standard across the industry, and sadly a lot easier too.
| Nix is not a revolutionary containerisation technology, nor
| honestly a very good one.
|
| The value in Nix comes from the package set, nixpkgs. What is
| revolutionary is how nixpkgs builds a Linux distribution
| declaratively, and reproducibly, from source through purely
| functional expressions. However, nixpkgs is almost an entire
| universe unto itself, and it is generally incompatible with the
| way any other distribution would handle things, so it would be
| no use to Fedora, Debian, and others
| fpoling wrote:
| At work we went back to a Docker build to make reproducible
| images. The primary reason is poor cross-compilation support
| in Nix on Arm when developers needed to compile for an amd64
| service and derive image checksums that are put into tooling
| that is run locally for service version verification and
| reproducibility.
|
| With Docker it turned out relatively straightforward. With
| Nix even when it runs in Linux Arm VM we tried but just gave
| up.
| binarymax wrote:
| Funny, I had that experience with Docker - mostly due to
| c++ dependencies - that were fine in Nix
| lima wrote:
| Contrary to popular opinion, Nix builds aren't reproducible:
| https://luj.fr/blog/is-nixos-truly-reproducible.html
| pxc wrote:
| That's such a weird characterization of this article, which
| (in contrast to other writing on this subject) clearly
| concludes (a) that Nix achieves a very high degree of
| reproducibility and is continuously improving in this
| respect, and (b) Nix is moreover reproducible in a way that
| most other distros (even distros that do well in some
| measures of bitwise reproducibility) are not (namely, time
| traveling-- being able to reproduce builds in different
| environments, even months or years later, because the build
| environment itself is more reproducible).
|
| The article you linked is very clear that both qualitatively
| and quantitatively, NixOS has achieved a high degree of
| reproducibility, and even explicitly rejects the possibility
| of assessing absolute reproducibility.
|
| NixOS may not be the absolute leader here (that's probably
| stagex, or GuixSD if you limit yourself to more practical
| distros with large package collections), but it is indeed
| very good.
|
| Did you mean to link to a different article?
| looofooo0 wrote:
| What does high degree mean? Either you achieve bit-for-bit
| reproducibility of your builds or you don't.
| kfajdsl wrote:
| If 99% of the packages in your repos are bit-for-bit
| reproducible, you can consider your distro to have a high
| degree of reproducibility.
| pxc wrote:
| And with even a little bit of imagination, it's easy to
| think of other possible measures of degrees of
| reproducibility, e.g.:
|
|   * % of deployed systems which consist only of reproducibly
|     built packages
|   * % of commonly downloaded disk images (install media, live
|     media, VM images, etc.) that consist only of reproducibly
|     built packages
|   * total # of reproducibly built packages available
|   * comparative measures of what NixOS is doing right, like:
|     of packages that are reproducibly built in some distros
|     but not others, how many are built reproducibly in NixOS
|   * binary bootstrap size (smaller is better, obviously)
|
| It's really not difficult to think of meaningful ways
| that reproducibility of different distros might be
| compared, even quantitatively.
| vlovich123 wrote:
| Sure, but in terms of absolute number of packages that are
| truly reproducible, they outnumber Debian because Debian only
| targets reproducibility for a smaller fraction of total
| packages & even there they're not 100%. I haven't been able
| to find reliable numbers for Fedora on how many packages they
| have & in particular how many this 99% is targeting.
|
| By any conceivable metric Nix really is ahead of the pack.
|
| Disclaimer: I have no affiliation with Nix, Fedora, Debian
| etc. I just recognize that Nix has done a lot of hard work in
| this space & Fedora + Debian jumping onto this is in no small
| part thanks to the path shown by Nix.
| Foxboron wrote:
| They are not.
|
| Arch hovers around 87%-90% depending on regressions.
| https://reproducible.archlinux.org/
|
| Debian reproduces 91%-95% of their packages (architecture
| dependent) https://reproduce.debian.net/
|
| > Disclaimer: I have no affiliation with Nix, Fedora,
| Debian etc. I just recognize that Nix has done a lot of
| hard work in this space & Fedora + Debian jumping onto this
| is in no small part thanks to the path shown by Nix
|
| This is _completely_ the wrong way around.
|
| Debian spearheaded the Reproducible Builds efforts in 2016
| with contributions from SUSE, Fedora and Arch. NixOS got
| onto this as well but has seen less progress until the past
| 4-5 years.
|
| The NixOS efforts owe the Debian project _all_ their
| thanks.
| vlovich123 wrote:
| From your own link:
|
| > Arch Linux is 87.7% reproducible with 1794 bad 0
| unknown and 12762 good packages.
|
| That's < 15k packages. Nix by comparison has ~100k total
| packages they are trying to make reproducible and has
| about 85% of them reproducible. Same goes for Debian -
| ~37k packages tracked for reproducible builds. One way to
| lie with percentages is when the absolute numbers are so
| disparate.
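For what it's worth, the quoted Arch figure is consistent with its raw counts; a quick recomputation from the numbers cited above:

```shell
# Recompute the Arch percentage from the quoted counts:
# 12762 good, 1794 bad, 0 unknown.
awk 'BEGIN { good = 12762; bad = 1794; unknown = 0
             printf "%.1f%%\n", 100 * good / (good + bad + unknown) }'
# prints: 87.7%
```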
|
| > This is completely the wrong way around. Debian
| spearheaded the Reproducible Builds efforts in 2016 with
| contributions from SUSE, Fedora and Arch. NixOS got onto
| this as well but has seen less progress until the past
| 4-5 years. The NixOS efforts owes the Debian project all
| their thanks.
|
| Debian organized the broader effort across Linux distros.
| However the Nix project was designed from the ground up
| around reproducibility. It also pioneered architectural
| approaches that other systems have tried to emulate
| since. I think you're grossly misunderstanding the role
| Nix played in this effort.
| Foxboron wrote:
| > That's < 15k packages. Nix by comparison has ~100k
| total packages they are trying to make reproducible and
| has about 85% of them reproducible. Same goes for Debian
| - ~37k packages tracked for reproducible builds. One way
| to lie with percentages is when the absolute numbers are
| so disparate.
|
| That's not a lie. That is the package target. The
| `nixpkgs` repository in the same vein packages a huge
| number of source archives and repackages entire
| ecosystems into its own repository. This greatly
| inflates the number of packages. You can't look at the
| flat numbers.
|
| > However the Nix project was designed from the ground up
| around reproducibility.
|
| It wasn't.
|
| > It also pioneered architectural approaches that other
| systems have tried to emulate since.
|
| This has had no bearing, and you are greatly
| overestimating the technical details of Nix here. It was
| fundamentally invented in 2002, and things have progressed
| since then. `rpath` hacking _really_ is not magic.
|
| > I think you're grossly misunderstanding the role Nix
| played in this effort.
|
| I've been contributing to the Reproducible Builds effort
| since 2018.
| 12345hn6789 wrote:
| Nix is to Linux users what Linux is to normies.
| __MatrixMan__ wrote:
| In the near term it makes more sense to position nix as a
| common interface between app developers and distro maintainers
| and not as a direct-to-user way to cut their distro maintainers
| out of the loop entirely (although it is quite useful for
| that).
|
| Ideally, a distro maintainer would come across a project
| packaged with nix and think:
|
| > Oh good, the app dev has taken extra steps to make life easy
| for me.
|
| As-is, I don't think that's the case. You can add a flake
| output to your project which builds an .rpm or a .deb file, but
| it's not commonly done.
|
| I'm guessing that most of the time, distro maintainers would
| instead hook directly into a language-specific build tool like
| cmake or cargo and ignore the nix stuff. They benefit from nix
| only indirectly in cases where it has prevented the app dev
| from doing crazy things in their build (or at least has made
| that craziness explicit, versus some kind of
| works-on-my-machine accident or some kind of
| nothing-to-see-here skulduggery).
|
| If we want to nixify the world I think we should focus less on
| talking people out of using package managers which they like
| and more on making the underlying packages more uniform.
| skrtskrt wrote:
| Because Nix is a huge pain to ramp up on and to use for
| anyone who is not an enthusiast about the state of their
| computer.
|
| What will happen is concepts from Nix will slowly get absorbed
| into other, more user-friendly tooling while Nix circles the
| complexity drain
| diffeomorphism wrote:
| Different notions of reproducible. This project cares
| specifically about bit-for-bit identical builds (e.g. no time
| stamps, parallel compile artifacts etc). Nix is more about
| being declarative and "repeatable" or whatever a good name for
| that would be.
|
| Both notions are useful for different purposes and nix is not
| particularly good at the first one.
|
| https://reproducible-builds.org/citests/
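The bit-for-bit notion above can be seen in miniature with gzip, whose header embeds the input file's name and timestamp. A small illustration (assumes GNU gzip and touch; unrelated to Fedora's actual tooling):

```shell
#!/bin/sh
# Identical content, different file metadata:
printf 'hello\n' > p1
cp p1 p2
touch -d '2020-01-01' p1
touch -d '2021-01-01' p2

# gzip embeds the input name and mtime in its header, so the two
# archives differ even though the payload is identical...
gzip -c p1 > a.gz
gzip -c p2 > b.gz
cmp -s a.gz b.gz || echo "differ (embedded name/mtime)"

# ...while -n omits that metadata, restoring bit-for-bit identity.
gzip -n -c p1 > a.gz
gzip -n -c p2 > b.gz
cmp -s a.gz b.gz && echo "identical with -n"
```

This is the class of problem the SOURCE_DATE_EPOCH convention from reproducible-builds.org addresses: build tools consume a fixed timestamp instead of the current time.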
| jzb wrote:
| Oh, I assure you, it's hard to escape knowing about Nix if you
| write about this sort of thing. Someone will be along almost
| immediately to inform you about it.
|
| Nix wasn't mentioned (I'm the author) because it really isn't
| relevant here -- the comparable distributions, when discussing
| what Fedora is doing, are Debian and other distributions that
| use similar packaging schemes and such.
| __MatrixMan__ wrote:
| I agree that NixOS/nixpkgs would not be a good basis for
| comparison. Do you have an opinion about the use of nix by
| app devs to specify their builds, i.e. as a make alternative,
| not as a Fedora alternative?
|
| Quoting the article:
|
| > Irreproducible bits in packages are quite often "caused by
| an error or sloppiness in the code". For example, dependence
| on hardware architecture in architecture-independent (noarch)
| packages is "almost always unwanted and/or a bug", and
| reproducibility tests can uncover those bugs.
|
| This is the sort of thing that nix is good at guarding
| against, and it's convenient that it doesn't require users to
| engage with the underlying toolchain if they're unfamiliar
| with it.
|
| For instance I can use the command below to build helix at a
| certain commit without even knowing that it's a rust package.
| Although it doesn't guarantee all aspects of repeatability,
| it will fail if the build depends on any bits for which a
| hash is not known ahead of time, which gets you half way
| there I think.
|
|   nix build github:helix-editor/helix/340934db92aea902a61b9f79b9e6f4bd15111044
|
| Used in this way, can nix help Fedora's reproducibility
| efforts? Or does it appear to Fedora as a superfluous layer
| to be stripped away so that they can plug into cargo more
| directly?
| froh wrote:
| nice to see they're in this too.
|
| https://news.opensuse.org/2025/02/18/rbos-project-hits-miles...
| charcircuit wrote:
| This is a waste of time compared to investing in sandboxing,
| which will actually protect users, as opposed to stopping
| theoretical attacks. Fedora's sandbox capabilities for apps
| are so far behind other operating systems like Android that
| this is a much more important area to address.
| johnny22 wrote:
| I think you have to do both sandboxing and this.
| charcircuit wrote:
| Both are good for security, but prioritization is important.
| The areas that are weakest in terms of security should get
| the most attention.
| AshamedCaptain wrote:
| I have yet to see a form of sandboxing for the desktop that
| is not:
|
| a) effectively useless
|
| or b) makes me want to throw my computer through the window and
| replace it with a 1990's device (still more useful than your
| average Android).
| fsflover wrote:
| If you want security through compartmentalization, you should
| consider Qubes OS, my daily driver, https://qubes-os.org.
| charcircuit wrote:
| This only secures between VMs. It sidesteps the problem, and
| people can still easily run multiple applications in the
| same qube.
| fsflover wrote:
| It's impossible to isolate applications inside one VM as
| securely as with Qubes virtualization. You should not rely
| on intra-VM hardening if you really care about security.
| Having said that, Qubes does provide ways to harden the
| VMs: https://forum.qubes-os.org/t/hardening-qubes-
| os/4935/3, https://forum.qubes-os.org/t/replacing-
| passwordless-root-wit....
| charcircuit wrote:
| People may want to have multiple apps work together. It
| makes more sense to have security within a qube itself than
| to just declare it a free-for-all.
| fsflover wrote:
| If the apps work together, they typically belong to the
| same security domain / trust level. Do you have examples
| when you still have to isolate them from each other?
| PhilippGille wrote:
| > Fedora's sandbox capabilities for apps
|
| Do you mean Flatpaks or something else?
| charcircuit wrote:
| Sure, that is one solution, but it still needs a lot of
| work, both to patch up holes in it and to get apps better
| designed with regard to security.
| colonial wrote:
| Defaulting to Android-style nanny sandboxing ("you can't grant
| access to your Downloads folder because we say so" etc.) is
| unlikely to go over well with the average Linux distro
| userbase.
|
| Also, maximally opt-in sandboxes for graphical applications
| have been possible for a while. Just use Podman and only mount
| your Wayland socket + any working files.
| charcircuit wrote:
| >Defaulting to Android-style nanny sandboxing ("you can't
| grant access to your Downloads folder because we say so"
| etc.) is unlikely to go over well with the average Linux
| distro userbase.
|
| Only if you market it that way. Plenty of Linux users say
| they care about security, don't want malware, etc. This is a
| step towards those desires. Users have been conditioned to
| use tools badly designed for security for decades, so there
| will be some growing pains, but it will get worse the longer
| people wait.
|
| >Just use Podman and only mount your Wayland socket + any
| working files.
|
| This won't work for the average user. Security needs to be
| accessible.
| nimish wrote:
| As a user of fedora what does this actually get me? I mean I
| understand it for hermetic builds but why?
| jacobgkau wrote:
| My impression is that reproducible builds improve your security
| by helping make it more obvious that packages haven't been
| tampered with in late stages of the build system.
|
| * Edit, it's quoted in the linked article:
|
| > Jedrzejewski-Szmek said that one of the benefits of
| reproducible builds was to help detect and mitigate any kind of
| supply-chain attack on Fedora's builders and allow others to
| perform independent verification that the package sources match
| the binaries that are delivered by Fedora.
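The independent verification described in the quote above reduces to rebuilding the package from source and comparing the result against what the distribution shipped. A minimal sketch, with illustrative names:

```python
import hashlib

# Sketch of independent verification: a third party rebuilds the
# package from source and checks the result against the binary the
# distribution shipped. Function names here are illustrative.
def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def independently_verify(shipped: bytes, rebuilt: bytes) -> bool:
    # With a reproducible build, a faithful rebuild is bit-identical,
    # so any hash mismatch signals tampering (or irreproducibility).
    return sha256(shipped) == sha256(rebuilt)
```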
| Zamicol wrote:
| Bingo.
| kazinator wrote:
| The supply chain attacks you most have to worry about are
| not someone breaking into Fedora build machines.
|
| It's the attacks on the upstream packages themselves.
|
| Reproducible builds would absolutely not catch a situation
| like the XZ package being compromised a year ago, due to the
| project merging a contribution from a malicious actor.
|
| A downstream package system or OS distro will just take that
| malicious update and spin it into a beautifully reproducing
| build.
| yjftsjthsd-h wrote:
| Don't let the perfect be the enemy of the good; this
| doesn't prevent upstream problems but it removes one place
| for compromises to happen.
| kazinator wrote:
| I'm not saying don't have reproducible builds; it's just
| that this is an unimportant justification for them,
| almost unnecessary.
|
| Reproducible builds are such an overwhelmingly good and
| obvious thing that build farm security is just a
| footnote.
| phkahler wrote:
| And anything designed to catch upstream problems like the
| XZ compromise will not detect a compromise in the Fedora
| package build environment. Kinda need both.
| bluGill wrote:
| Reproducible builds COULD fix the xz issue. The current
| level would not, but GitHub could do things to make creating
| the downloadable packages scriptable and thus reproducible.
| Fedora could check out the git hash instead of downloading
| the provided tarball and again get reproducible builds that
| bypass this.
|
| The above are things worth looking at doing.
|
| However, I'm not sure what you can do about code that tries
| to obscure the issues while looking good.
| bagels wrote:
| It's one tool of many that can be used to prevent malicious
| software from sneaking into the supply chain.
| russfink wrote:
| Keep in mind that compilers can be backdoored to install
| malicious code. Bitwise/signature equivalency does not imply
| malware-free software.
| bluGill wrote:
| True, but every step we add makes the others harder too. It
| is unlikely Ken Thompson's "trusting trust" compiler would
| detect modern gcc, much less successfully introduce the
| backdoor. Even if you start with a compromised gcc of that
| type there is a good chance that after a few years it would
| be caught when the latest gcc fails to build anymore for
| someone with the compromised compiler. (now add clang and
| people using that...)
|
| We may never reach perfection, but the more steps we take in
| that direction, the more likely it is we reach a point where
| compromise is impossible in the real world.
| kazinator wrote:
| Reproducible builds can improve software quality.
|
| If we believe we have a reproducible build, that constitutes
| a big test case which gives us confidence in the
| _determinism_ of the whole software stack.
|
| To validate that test case, we actually have to repeat the
| build a number of times.
|
| If we spot a difference, something is wrong.
|
| For instance, suppose that a compiler being used has a bug
| whereby it relies on the value of an uninitialized variable
| somewhere. That could show up as a difference in the code it
| generates.
|
| Without reproducible builds, of course there are always
| differences in the results of a build: we cannot use repeated
| builds to discover that something is wrong.
|
| (People do diffs between irreproducible builds anyway. For
| instance, disassemble the old and new binaries, and do a
| textual diff, validating that only some expected changes are
| present, like string literals that have embedded build dates.
| If you have reproducible builds, you don't have to do that
| kind of thing to detect a change.)
|
| Reproducible builds will strengthen the toolchains and
| surrounding utilities. They will flush out instabilities in
| build systems, like parallel Makefiles with race conditions, or
| indeterminate orders of object files going into a link job,
| etc.
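The repeat-the-build test described above can be sketched as follows; the toy build function is a stand-in for a real compile step:

```python
import hashlib

# Sketch of the "repeat the build" test: run the build twice and
# flag any output that differs between runs. The build function
# here is a deterministic stand-in for a real compiler invocation.
def build(source: str) -> dict[str, bytes]:
    # Toy "compiler": output depends only on the input source.
    return {"a.out": source.encode()[::-1]}

def check_determinism(source: str) -> list[str]:
    first, second = build(source), build(source)
    return [
        name for name in first
        if hashlib.sha256(first[name]).digest()
           != hashlib.sha256(second.get(name, b"")).digest()
    ]

# An empty list means the two builds matched bit-for-bit; any
# name in the list points at nondeterminism worth investigating.
assert check_determinism("int main(){}") == []
```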
| uecker wrote:
| I don't think it is that unlikely that build hosts or some
| related part of the infrastructure gets compromised.
| tomcam wrote:
| I don't know this area, but it seems to me it might be a boon
| to security? So that you can tell if components have been
| tampered with?
| bobmcnamara wrote:
| Bingo. We caught a virus tampering with one of our code
| gens this way.
| dwheeler wrote:
| Yes! The attack on SolarWinds Orion was an attack on its
| build process. A verified reproducible build would have
| detected the subversion, because the builds would not have
| matched (unless the attackers managed to detect and break
| into all the build processes).
| Dwedit wrote:
| Reproducibility is at odds with Profile-Guided-Optimization.
| Especially on anything that involves networking and other IO that
| isn't consistent.
| michaelt wrote:
| Why should it be?
|
| Does the profiler not output a hprof file or whatever, which is
| the input to the compiler making the release binary? Why not
| just store that?
| gnulinux wrote:
| It's not at odds at all, but it'll be "monadic" in the sense
| that the output of system A will be part of the input to
| system A+1, which is complicated to organize in a systems
| setting, especially if you don't have access to a tool that
| can verify it. But it's absolutely achievable if you do have
| such a tool; e.g., you can do this in nix.
| zbobet2012 wrote:
| That's only the case if you did PGO with "live" data instead of
| replays from captured runs, which is best practice afaik.
| nrvn wrote:
| from Go documentation[0]:
|
| > Committing profiles directly in the source repository is
| recommended as profiles are an input to the build important for
| reproducible (and performant!) builds. Storing alongside the
| source simplifies the build experience as there are no
| additional steps to get the profile beyond fetching the source.
|
| I very much hope other languages/frameworks can do the same.
|
| [0]: https://go.dev/doc/pgo#building
| nyrikki wrote:
| The _performant_ claim there runs counter to research I have
| heard. Also, since PGO profile data is non-deterministic in
| most cases, even when compiled on the same hardware as the
| end machine, profiles "committed directly in the source
| repository" end up deleted or at least excluded from the
| reproducibility comparison.
|
| A quote from the paper I remember on the subject[1], as
| these profiles are just about as machine-dependent as you
| can get:
|
| > Unfortunately, most code improvements are not machine
| independent, and the few that truly are machine independent
| interact with those that are machine dependent causing phase-
| ordering problems. Hence, effectively there are no machine-
| independent code improvements.
|
| There were some differences between various Xeon chips'
| implementations of the same or neighboring generations that
| I personally ran into when we tried to copy profiles to
| avoid the cost of the profile runs, which may make me a bit
| more sensitive to this; I personally saw huge drops in
| performance, well into the double digits, that threw off our
| regression testing.
|
| IMHO this is exactly why your link suggested the following:
|
| > Your production environment is the best source of
| representative profiles for your application, as described in
| Collecting profiles.
|
| That is very different from Fedora using some random or
| generic profile for x86_64, which may or may not match the
| end users specific profile.
|
| [1] https://dl.acm.org/doi/10.5555/184716.184723
| clhodapp wrote:
| If those differences matter so much for your workloads,
| treat your different machine types as different
| architectures, commit profiling data for _all_ of them, and
| (deterministically) compile individual builds for all of
| them.
|
| Fedora upstream was never going to do that for you anyway
| (way too many possible hardware configurations), so you
| were already going be in the business of setting that up
| for yourself.
| nyrikki wrote:
| This is one of the "costs" of reproducible builds, just like
| the requirement to use pre-configured seeds for
| pseudo-random number generators, etc.
|
| It does hit real projects and may be part of the reason that
| "99%" is called out, but Fedora also mentions that they
| can't match the _official_ reproducible-builds.org meaning
| just due to how RPMs work, so we will see what other
| constraints they have to loosen.
|
| Here is one example of where suse had to re-enable it for gzip.
|
| https://build.opensuse.org/request/show/499887
|
| Here is a thread on PGO from the reproducible-builds mail list.
|
| https://lists.reproducible-builds.org/pipermail/rb-general/2...
|
| There are other _costs_, like needing to get rid of parallel
| builds for some projects, that make many people loosen the
| official constraints. The value of PGO+LTO is one example.
|
| gcda profiles are unreproducible, but the code they produce
| is typically the same. If you look into the pipeline of some
| projects, they just delete the gcda output and then, if the
| resulting code differs, retry the build or use other
| methods.
|
| While there are no ideal solutions, one that seems to work
| fairly well, assuming the upstream is doing reproducible
| builds, is to vendor the code, build a reproducible build to
| validate that vendored code, then enable optimizations.
|
| But I get that not everyone agrees that the value of
| reproducibility is primarily avoiding attacks on build
| infrastructure.
|
| However, reproducible builds have nothing to do with MSO
| model checking etc., like some have claimed. Much of it is
| just deleting non-deterministic data, as you can see here
| with debian, which fedora copied.
|
| https://salsa.debian.org/reproducible-builds/strip-nondeterm...
|
| As increasing the granularity of address-space randomization
| at compile and link time is easier than at the start of
| program execution, there will obviously be a cost (more than
| paid for by reducing supply-chain risks, IMHO) of reduced
| entropy for address randomization, which does increase the
| risk of ROP-style attacks.
|
| Regaining that entropy at compile and link time, if it is
| practical to recompile packages or vendor, may be worth the
| effort in some situations, probably best to do real PGO at that
| time too IMHO.
| goodpoint wrote:
| Yo, the attacker has access to the same binaries, so only
| runtime address randomization is useful.
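The delete-the-nondeterminism approach mentioned above, which tools like Debian's strip-nondeterminism automate, can be sketched for tar archives. This simplified Python version (clamping entry mtimes to a fixed epoch, an assumption modeled on SOURCE_DATE_EPOCH, not the real tool) makes two otherwise-identical archives hash the same:

```python
import io
import tarfile

# Sketch of archive normalization: rewrite per-entry metadata
# (here, mtimes) to a fixed value so repeated packing of the same
# content produces bit-identical output.
SOURCE_DATE_EPOCH = 0  # fixed timestamp stand-in

def normalized_tar(files: dict[str, bytes]) -> bytes:
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):          # stable entry order
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = SOURCE_DATE_EPOCH  # clamp the timestamp
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()
```

Two calls with the same inputs return identical bytes regardless of the wall clock, which is exactly the property the reproducibility comparison needs.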
| barotalomey wrote:
| The real treasure was the friend I found along the way
|
| https://github.com/keszybz/add-determinism
| m463 wrote:
| I kind of wonder if this or something similar could somehow
| nullify timestamps so you could compare two logfiles...
|
| further would be the ability to compare logfiles with pointer
| addresses or something
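A minimal sketch of the logfile idea above, assuming ISO-8601-style timestamps; the pattern is illustrative only, and real logs would need patterns matched to their formats:

```python
import re

# Sketch: blank out timestamps so two log files can be diffed
# line by line. The regex covers ISO-8601-ish stamps only.
TS = re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:\.\d+)?")

def nullify_timestamps(log: str) -> str:
    return TS.sub("<TS>", log)

# Two runs of the same service now compare equal.
a = nullify_timestamps("2025-04-11 13:40:00 server started")
b = nullify_timestamps("2025-04-12 09:01:33 server started")
assert a == b == "<TS> server started"
```

The same substitution idea extends to pointer addresses (e.g. a pattern for `0x`-prefixed hex values), as the comment above suggests.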
| didericis wrote:
| A different but more powerful method of ensuring
| reproducibility is more rigorous compilation using formally
| verifiable proofs.
|
| That's what https://pi2.network/ does. It uses K-Framework,
| which is imo very underrated/deserves more attention as a
| long term way of solving this kind of problem.
___________________________________________________________________
(page generated 2025-04-11 23:00 UTC)