[HN Gopher] Nixos-unstable's ISO_minimal.x86_64-Linux is 100% re...
       ___________________________________________________________________
        
       Nixos-unstable's ISO_minimal.x86_64-Linux is 100% reproducible
        
       Author : todsacerdoti
       Score  : 735 points
       Date   : 2021-06-20 20:01 UTC (1 day ago)
        
 (HTM) web link (discourse.nixos.org)
 (TXT) w3m dump (discourse.nixos.org)
        
       | avalys wrote:
       | Can anyone comment on the significance of this accomplishment,
       | and why it was hard to achieve before?
       | 
       | I (naively, apparently) assumed this had been possible with open-
       | source toolchains for a long time.
        
         | peterkelly wrote:
         | For some reason, many compilers and build scripts have
         | traditionally been written in a way that's not referentially
         | transparent (a pure function from input to output). Unnecessary
         | information like the time of the build, absolute path names of
         | sources and intermediate files, usernames and hostnames often
         | would find their way into build outputs. Compiling the same
         | source on different machines or at different times would yield
         | different results.
         | 
         | Reproducible builds avoid all this and always produce the same
         | outputs given the same inputs. There's no good reason (that I
         | can think of) why this shouldn't have been the case all along,
         | but for a long time I guess it just wasn't seen as a priority.
         | 
         | The benefit of reproducible builds is that it's possible to
         | verify that a distributed binary was definitely compiled from
         | known source files and hasn't been tampered with, because you
         | can recompile the program yourself and check that the result
         | matches the binary distribution.
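          | 
          | (For what it's worth, the Reproducible Builds project
          | standardized an environment variable for the timestamp case;
          | a minimal sketch, with hello.c as a stand-in for any source
          | file:)
          | 
          |     export SOURCE_DATE_EPOCH=0  # fixed epoch instead of "now"
          |     gcc -c hello.c  # GCC >= 7 uses it for __DATE__/__TIME__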
        
           | otabdeveloper4 wrote:
           | > The benefit of reproducible builds is that it's possible to
           | verify that a distributed binary was definitely compiled from
           | known source files and hasn't been tampered with, because you
           | can recompile the program yourself and check that the result
           | matches the binary distribution.
           | 
            | It's not just security. If a hash of the input sources maps
            | directly to a hash of the output binaries, then you can
            | automatically cache build artefacts by hash and get huge
            | speedups when compiling stuff from scratch.
           | 
           | This was the primary motivation for Nix, since Nix does a
           | whole lot of building from scratch and caching.
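            | 
            | (A rough sketch of how that looks in practice, with 'hello'
            | as a placeholder package; the hash of the build recipe is
            | the cache key:)
            | 
            |     drv=$(nix-instantiate '<nixpkgs>' -A hello)
            |     # realise by hash: fetched from the cache if present,
            |     # built locally otherwise
            |     nix-store --realise "$drv" \
            |       --option substituters https://cache.nixos.org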
        
             | kohlerm wrote:
             | I agree being able to support distributed caching of
             | results is one of the major benefits.
        
           | dane-pgp wrote:
           | > There's no good reason (that I can think of) why this
           | shouldn't have been the case all along
           | 
           | Well, it's not like developers consciously thought "How can I
           | make my build process as non-deterministic as possible?",
           | it's just that by the time people started to become aware of
           | the benefits of reproducibility, various forms of non-
           | determinism had already crept in.
           | 
           | For example, someone writing an archiving tool would be
           | completely right to think it is a useful feature to store the
           | creation date of the archive in the archive's metadata. The
           | idea that a user might want to force this value to instead be
           | some fixed constant would only occur to someone later when
           | they noticed that their packages were non-reproducible
           | because of this.
           | 
           | But you're right; if the goal had been thought of from the
           | start, there's no reason why every build tool wouldn't have
           | supported this.
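            | 
            | (GNU tar eventually grew knobs for exactly this; a sketch
            | of a deterministic invocation, pinning file order, mtimes
            | and ownership:)
            | 
            |     tar --sort=name --mtime='@0' --owner=0 --group=0 \
            |         --numeric-owner -cf archive.tar ./src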
        
             | russfink wrote:
             | Thank you both. I was wondering the same thing.
        
         | xyzzy_plugh wrote:
          | There are a lot of problems to solve for reproducible builds:
          | filesystem paths, timestamps, and build order, to name a few.
         | This is a pretty great achievement and I'm looking forward to a
         | non-minimal stable ISO.
        
           | bombcar wrote:
            | Yeah, even the "gcc compiled Jan 23, 2021 at 11:23AM"
            | messages you often see break deterministic builds.
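            | 
            | (GCC can at least warn about that class of problem; a quick
            | sketch:)
            | 
            |     # -Wdate-time flags any expansion of __DATE__, __TIME__
            |     # or __TIMESTAMP__ into the build output
            |     echo 'const char *built = __DATE__;' |
            |       gcc -Wdate-time -xc -c - -o /dev/null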
        
         | twisrkrr wrote:
          | The code has to be changed so that things like system-specific
          | paths, time of compilation, hardware, etc. don't cause the
          | compiled program to be unique to that computer (meaning
          | compiling the same code on a different computer will give you a
          | file that still works but has a different md5 hash).
         | 
         | By being able to reproduce the file completely, down to
         | identical md5 hashes, you know you have the same file the
         | creator has, and know with certainty that the file has not been
         | tampered with
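          | 
          | (The check itself is then trivial; the file names here are
          | illustrative:)
          | 
          |     md5sum vendor-release/app my-rebuild/app
          |     # identical hashes => byte-identical files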
        
           | secondcoming wrote:
            | Does this mean that the code cannot be built with CPU-
            | specific optimisations (the -march option with gcc)?
        
             | Avamander wrote:
             | Pretty much. But hopefully x86_64 feature levels will
             | provide the benefits of native builds to a reasonable
             | extent.
        
             | Denvercoder9 wrote:
             | The software doesn't suddenly become incompatible with CPU-
             | specific optimisations (or many other compiler flags that
             | change its output), but if you do so, you won't be able to
             | reproduce the distribution binaries. Distributions don't
             | enable CPU-specific optimisations anyway, since they want
             | to be usable on more than one CPU model.
        
             | pas wrote:
              | Likely it means that with the same input arguments the end
              | result is bit-for-bit identical. (As I understand it, the
              | problems were hard-to-control output elements. It was not
              | enough to set the same args, set the same time, and use
              | the same path and filesystem, because some things happened
              | at different speeds, so they ended up finishing at
              | different relative elapsed times, and the outputs
              | contained different timestamps, etc.)
        
             | clhodapp wrote:
             | No, just that you need to avoid naively conflating the
             | machine that is doing the compilation with the one that
             | optimization is being performed for.
             | 
             | Concretely, you would need to keep track of and reproduce
             | e.g. the march flag value as a part of your build input. If
             | you wanted to optimize for multiple architectures, that
             | would mean separate builds or a larger binary with function
             | multi-versioning.
        
             | maartenh wrote:
              | Nixpkgs contains the build / patch instructions for all
              | packages in NixOS.
              | 
              | If you want to compile any piece of software available in
              | Nixpkgs, you can override its attributes (the inputs used
              | to build it).
              | 
              | One can trivially have an almost identical operating
              | system to a colleague's install, but override just one
              | package to enable optimisations for a certain CPU. This
              | would however imply that you'd lose the transparent binary
              | cache that you could otherwise use.
              | 
              | Exactly this method is used to configure the entire OS
              | install! Your OS install is just another package that has
              | some custom inputs set.
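              | 
              | (A sketch of such an override, with coreutils as an
              | arbitrary example; NIX_CFLAGS_COMPILE is the usual
              | Nixpkgs hook for extra compiler flags:)
              | 
              |     nix-build -E 'with import <nixpkgs> {};
              |       coreutils.overrideAttrs (old: {
              |         NIX_CFLAGS_COMPILE = "-march=native";
              |       })'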
        
         | danbst wrote:
          | Until recently, there were still large non-reproducible
          | projects: Python, GCC. I'm not sure where the history of
          | non-r13y is kept.
         | 
         | ---
         | 
          | There is a Debian initiative to create bit-for-bit
          | reproducible builds for all their software (well, all the
          | critical parts).
         | 
         | https://reproducible-builds.org/
         | 
         | R13y is akin to "computer proofs" in math -- if you don't have
         | it, that's fine, but if you have it, that's awesome.
         | 
         | There are practical reasons to favor reproducibility too, but
         | those are more for distro maintainers.
         | 
          | The fact that NixOS (not Debian) got this to 100% is mostly
          | because:
          | 
          | - the minimal image has a small subset of packages
          | (https://hydra.nixos.org/build/146009592#tabs-build-deps)
          | 
          | - Nix tooling was created 15 years ago *exactly* for this; Nix
          | is made to rebuild packages bit-for-bit from scratch.
          | 
          | - Nix/Nixpkgs is growing in number of maintainers and has
          | received more funding
          | 
          | - Nix has less of the Docker/Snap-style pragmatism
        
           | dataflow wrote:
           | > There's no good reason (that I can think of) why this
           | shouldn't have been the case all along
           | 
           | Determinism can decrease performance dramatically. Like
           | concatenating items (say, object files into a library) in
           | order is clearly more expensive in both time & space than
           | processing them out of order. One requires you to store
           | everything in memory and then sort them before you start
           | doing any work, whereas the other one lets you do your work
           | in a streaming fashion. Enforcing determinism can turn an
           | O(1)-space/O(n)-time algorithm into an O(n)-space/O(n log
           | n)-time one, increasing latency _and_ decreasing throughput.
            | You wouldn't take a performance hit like that without a
            | good reason to justify it.
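            | 
            | (Concretely, the deterministic version of the object-file
            | example looks like this; the sort is the extra cost:)
            | 
            |     find build/ -name '*.o' | sort | xargs ar rcs libfoo.a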
        
           | Foxboron wrote:
           | >- Nix tooling was created 15 years ago _exactly_ for this,
           | Nix is mad to make packages bit-to-bit rebuildable from
           | scratch.
           | 
            | I don't think this is accurate?
            | 
            | Nix is about reproducing system behaviour, largely by
            | capturing the dependency graph and replaying the build. But
            | this doesn't entail bit-for-bit identical binaries. It very
            | much sits in the same group as Docker and similar
            | technologies. This is also how I read the original thesis
            | from Eelco[0].
            | 
            | And well, claims like this always rub me the wrong way,
            | since NixOS only really started using the words
            | "reproducible builds" after Debian began their efforts in
            | 2015-2016[1], and only started its own reproducible-builds
            | effort later. It also muddies the language, since people are
            | now talking about "reproducible builds" in terms of system
            | behaviour as well as bit-for-bit identical builds. The
            | result has been that people talk about "verifiable builds"
            | instead.
           | 
           | [0]: https://edolstra.github.io/pubs/phd-thesis.pdf
           | 
           | [1]: https://github.com/NixOS/nixpkgs/issues/9731
        
           | infogulch wrote:
            | Being bit-for-bit reproducible means you could do fun things
            | like distribute packages as just sources and a big blob of
            | signatures, and still run only signed binaries.
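            | 
            | (A sketch of that flow; manifest.txt and its signature are
            | hypothetical names. Build from source, then check the
            | result against the signed hash list before running it:)
            | 
            |     gpg --verify manifest.txt.sig manifest.txt &&
            |       sha256sum --check manifest.txt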
        
         | mananaysiempre wrote:
         | The GCC developers in particular were hostile to such efforts
         | for a long time, IIRC. (This is a non-trivial issue because
         | randomized data structures exist and can be a good idea to use:
         | treaps, universal hashes, etc. I'd guess it also pays for
         | compiler heuristics to be randomized sometimes. Incremental
         | compilation is much harder to achieve when you require bit-for-
         | bit identical output. Even just stripping your compile paths
         | from debug info is not entirely straightforward.)
        
           | pas wrote:
            | How/why was the randomness part not "solvable" via using
            | fixed seeds?
        
             | bruce343434 wrote:
              | The security benefit of things like stack canaries rests
              | on them being random and not known beforehand, I guess.
              | Otherwise stack-smashing malware could know to avoid them.
        
               | mananaysiempre wrote:
               | Wait, how is that relevant? Nothing says stack canaries
               | have to use the same RNG as the main program, let alone
               | the same seed, and there are cases such as this one where
               | they probably shouldn't, so it makes sense to separate
               | them.
        
           | adonovan wrote:
           | GCC used to attempt certain optimizations (or more generally,
           | choose different code-generation strategies) only if there
           | was plenty of memory available. We discovered this in the
           | course of designing Google's internal build system, which
           | prizes reproducibility.
        
           | moonchild wrote:
           | > Incremental compilation is much harder to achieve when you
           | require bit-for-bit identical output
           | 
           | Presumably, incremental compilation is only for development.
           | For release, you would do a clean build, which would be
           | reproducible.
           | 
           | > Even just stripping your compile paths from debug info is
           | not entirely straightforward
           | 
           | Just use the same paths.
        
             | mananaysiempre wrote:
             | > Presumably, incremental compilation is only for
             | development. For release, you would do a clean build, which
             | would be reproducible.
             | 
             | I'd say that's exactly the wrong approach: given how hard
             | incremental anything is, it would make sense to insist on
             | bit-exact output and then fuzz the everliving crap out of
             | it until bit-exactness was reached. (The GCC maintainers do
             | not agree.) But yes, you could do that. It's not impossible
             | to do reproducible builds with GCC 4.7 or whatever, it's
             | just intensely unpleasant, especially as a distro
             | maintainer faced with yet another bespoke build system.
             | (Saying that with all the self-awareness of a person making
             | their own build system.)
             | 
             | > Just use the same paths.
             | 
             | I mean, sure, but then you have to build _and_ debug in a
             | chroot and waste half a day of your life figuring out how
             | to do that and just generally feel stupid. And your debug
             | info is still useless to anybody not using the exact same
             | setup. Can't we just embed _relative_ paths instead, or
              | even arbitrary prefixes if the code is coming from more
             | than one place? In recent GCC versions we can, just chuck
             | the right incantation into CPPFLAGS and you're golden.
             | 
             | All of this is not really difficult except insofar as
             | getting a large and complicated program to do anything is
             | difficult. (Stares in the direction of the 17-year-old
             | Firefox bug for XDG basedir support.) That's why I said it
             | wasn't a GCC problem so much as a maintainer attitude
             | problem.
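              | 
              | (The incantation in question, for the curious - GCC 8 or
              | newer; the /build prefix is an arbitrary choice:)
              | 
              |     export CPPFLAGS="-ffile-prefix-map=$PWD=/build"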
        
         | [deleted]
        
       | [deleted]
        
         | [deleted]
        
       | Arnavion wrote:
       | Is there a list of the 1486 packages in the minimal ISO?
        
         | danbst wrote:
         | https://hydra.nixos.org/build/146009592#tabs-build-deps
        
       | taviso wrote:
       | I don't see a single comment doubting the value of
       | reproducibility, so I'll be the resident skeptic :)
       | 
       | I think build reproducibility is a cargo cult. The website says
       | reproducibility can reduce the risk of developers being
       | threatened or bribed to backdoor their software, but that is just
       | ridiculous. Developers have a perfect method for making their own
       | software malicious: bugdoors. A bugdoor (bug + backdoor) is a
       | deliberately introduced "vulnerability" that the vendor can
       | "exploit" when they want backdoor access. If the bug is ever
       | discovered you simply issue a patch and say it was a mistake,
        | it's perfectly deniable. It's not unusual for major vendors to
        | patch critical vulnerabilities every month; there is zero
        | penalty for doing this.
       | 
       | The existence of bugdoors means you _have_ to trust the vendor
       | who provided the source code, there is no way around this.
       | 
       | You have to trust the developer, but in theory, reproducible
       | builds could be used to convince yourself their build server
       | hasn't been hacked. This isn't really necessary or useful, you
       | can already produce a trustworthy binary by just building the
       | source code yourself. You still have to trust the vendor to keep
       | hackers off everything else though!
       | 
       | Okay, but building software is tedious, and for some reason you
       | are particularly concerned about build servers being hacked.
        | Perhaps you will nominate a dozen different organizations that
       | will all build the code, and make this a consensus system. If
       | they all agree, then you can be sure enough the binaries were
       | built with a trustworthy toolchain. A modest improvement in
       | theory, but that introduces a whole bunch of new crazy problems.
       | 
       | You can't just pick one or two consensus servers, because then an
       | attacker can stop you getting updates by compromising any one of
       | them. You will have to do something like choose a lot of servers,
       | and only require 51% to agree.
       | 
        | Now, imagine a contentious update like adopting a
       | cryptocurrency fork, or switching to systemd (haha). If the
       | server operators rebel, they can effectively veto a change the
       | vendor wants to make. Perhaps vendors will implement a killswitch
       | that allows them to have the final say, or perhaps they operate
       | all the consensus build servers themselves.
       | 
       | The problem is now you've either just replaced build servers with
       | killswitches, or just replicated the same potentially-compromised
       | buildserver.
       | 
       | I wrote a blog post about this a while ago, although I should
       | update it at some point.
       | 
       | https://blog.cmpxchg8b.com/2020/07/you-dont-need-reproducibl...
        
         | pabs3 wrote:
         | There are other solutions to the problem of trusting
         | maintainers; namely incremental distributed code review. The
         | Rust folks are working on that:
         | 
         | https://github.com/crev-dev/
         | 
         | You still need Reproducible Builds and Bootstrappable Builds
         | even if you have a fully reviewed codebase though.
        
           | [deleted]
        
         | nixpulvis wrote:
         | You'll never beat the source.
         | 
         | IIUC Reproducible Builds guarantees that source is turned into
         | an artifact in a consistent and unchanging way. So as long as
         | the source doesn't change neither will the build.
        
           | taviso wrote:
           | I don't really understand what you're saying.
           | 
           | If you're saying "reproducible builds are reproducible", then
           | that is obviously true, but the question is what is the
           | benefit?
           | 
           | Some people claim that the benefit is that there will be less
           | incentive to threaten developers with violence, and I'm
           | saying that's nonsense. If you cut through the nonsense,
           | there are some modest claims that are true, but doing
           | reproducible builds properly is very complicated and the
           | benefit is negligible.
        
             | nixpulvis wrote:
             | > ... the question is what is the benefit?
             | 
             | I don't think I should have to explain this. It has nothing
             | directly to do with violence against developers, that's
             | taking _many_ leaps.
             | 
             | It simply gives you what you expect, which is kinda the
             | basis of safety and security.
        
               | taviso wrote:
               | > It has nothing directly to do with violence against
               | developers, that's taking many leaps.
               | 
               | It is literally the very first claim on the front page of
               | https://reproducible-builds.org.
        
               | nixpulvis wrote:
               | That's just a bunch of marketing hype. I'm trying to stay
               | focused closer to the matters at hand.
               | 
               | Perhaps my rambling on Development vs Distribution is
               | relevant to the discussion?
               | https://nixpulvis.com/ramblings/2021-02-02-signing-and-
               | notar...
        
         | EE84M3i wrote:
         | Honestly I think the biggest benefit of reproducibility is just
          | debuggability. If we both check out the same git repo and
          | build it, we can later hash the binaries and compare the
          | hashes to know we're running the exact same code.
         | 
         | On security, if you really care about compromised build servers
         | you might as well just build from source yourself. I think
          | reproducibility might matter most in systems where sideloading
          | is hard/impossible, like app stores, but I'm not familiar with
          | the current state of the art in terms of iOS reproducible
          | builds and checking them.
        
         | theon144 wrote:
         | >Developers have a perfect method for making their own software
         | malicious: bugdoors.
         | 
         | I think rather than malicious developers the focus is on
         | malicious build machines. How many things are built solely via
         | CI these days, on machines that nobody has ever seen, using
         | docker images that nobody has validated?
         | 
         | It's much easier to imagine a malicious provider (as in
         | Sourceforge bundling in adware) than malicious developers, I
         | think.
         | 
         | But yes, you're right that reproducible builds don't remove the
         | need to trust the source.
         | 
         | >You have to trust the developer, but in theory, reproducible
         | builds could be used to convince yourself their build server
         | hasn't been hacked. This isn't really necessary or useful, you
         | can already produce a trustworthy binary by just building the
         | source code yourself.
         | 
          | This is pretty much all false though - not just the "just"
          | part: setting up a proper build environment is pretty non-
          | trivial for many projects, and building _everything_ from
          | source is a task only the most dedicated Gentoomen would take
          | up. You can also think of reproducible builds as a "litmus
          | test": if you can, with reasonable accuracy, check whether a
          | build machine is compromised at any time, you have a much
          | greater base on which to trust it and its outputs. The
          | benefits of having build machines probably shouldn't need
          | explaining.
         | 
         | >You can't just pick one or two consensus servers, because then
         | an attacker can stop you getting updates by compromising any
         | one of them. You will have to do something like choose a lot of
         | servers, and only require 51% to agree.
         | 
         | >...
         | 
         | >The problem is now you've either just replaced build servers
         | with killswitches, or just replicated the same potentially-
         | compromised buildserver.
         | 
         | I really don't understand this argument; compromised
         | infrastructure probably shouldn't be a regular occurrence, and
         | even if so, automated killswitches seem like the vastly more
         | preferable option, no?
        
           | taviso wrote:
           | > I really don't understand this argument; compromised
           | infrastructure probably shouldn't be a regular occurrence,
           | and even if so, automated killswitches seem like the vastly
           | more preferable option, no?
           | 
           | I'm pointing out how complex implementing reproducible builds
           | is. It introduces a bunch of really hard unsolved problems
           | that people are very handwavy about.
           | 
           | Who will do the reproducing? You say that users won't be able
           | to do it. That makes sense, because if they could, then
           | reproducible builds would be useless! However, you also say
           | they will be able to check if a build server is compromised
           | at any time. In order for both of those claims to be true we
           | will have to design and build a complex consensus system
           | operated by mutually untrusted volunteers. That's really
           | hard, and seems like it provides a pretty negligible benefit.
        
         | danieldk wrote:
         | _I think build reproducibility is a cargo cult._
         | 
         | Most people here are debating you on the security angle, but in
         | the case of Nix (and Guix) there is another important angle -
         | reproducible builds make a content-addressed store possible.
         | 
         | In Nix, the store is traditionally addressed by the hash of the
         | derivation (the recipe that builds the package). For example,
         | _lr96h..._ in the path
         | /nix/store/lr96h3dlny8aiba9p3rmxcxfda0ijj08-coreutils-8.32
         | 
         | is the hash of the (normalized) derivation that was used to
         | build coreutils. Since the derivation includes build inputs,
         | either changing the derivation for coreutils itself or one of
         | its inputs (dependencies) results in a different hash and a
         | rebuild of coreutils.
         | 
         | This also means that if somebody changes the derivation of
         | coreutils _every_ package that depends on coreutils will be
         | rebuilt, even if this change does not result in a different
         | output path (compiled package).
         | 
          | This is being addressed by the new work on the content-
          | addressed Nix store (although content-addressing was already
          | discussed in Eelco Dolstra's PhD thesis about Nix). In the
          | content-addressed store, the hash in the path, such as the
          | one above, is a hash of the output path (the built package),
          | rather than a hash of the normalized derivation. This means
          | that if the derivation of coreutils is changed in such a way
          | that it does not change the output path, none of the packages
          | that depend on coreutils are rebuilt.
          | 
          | However, this only works reliably with reproducible builds,
          | because if there is non-determinism in the build, how do you
          | know whether a change in the output path is the result of
          | changing a derivation or of uninteresting non-determinism?
          | (The output hash would change in both cases.)
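          | 
          | (You can poke at the input-addressed scheme today; reusing
          | the coreutils path from above:)
          | 
          |     # which derivation (recipe) produced this store path?
          |     nix-store --query --deriver \
          |       /nix/store/lr96h3dlny8aiba9p3rmxcxfda0ijj08-coreutils-8.32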
        
           | taviso wrote:
           | I don't really have any complaints about using deterministic
           | builds for non-security reasons, but the number one claim
           | most proponents make is that it somehow prevents backdoors.
           | Literally the first claim on reproducible-builds.org is that
           | build determinism will prevent threats of violence and
           | blackmail.
        
           | londons_explore wrote:
           | Where the dependency chain is long, this substantially
           | reduces build work during development too.
           | 
            | I'd guess that more than half of the invocations of gcc
            | done by Make, for example, end up producing the exact same
            | bit-for-bit output as some previous invocation.
        
             | taviso wrote:
              | I would point out that this is literally what ccache (and
              | Google's goma) does, but it doesn't require deterministic
              | builds.
             | Instead, it records hashes of preprocessed input and
             | compiler commandlines.
             | 
             | They don't make any security claims about this, it's just
             | for speeding up builds.
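              | 
              | (For reference, the ccache workflow; the file name is
              | illustrative:)
              | 
              |     ccache gcc -O2 -c foo.c -o foo.o  # miss: compiles
              |     ccache gcc -O2 -c foo.c -o foo.o  # hit: from cache
              |     ccache -s                         # hit/miss stats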
        
               | Ericson2314 wrote:
                | What we currently do --- hashing inputs --- is the same
                | as the ccache way. We just don't sandbox with that
                | granularity yet.
                | 
                | What we want is to hash _outputs_. Say I replace 1 + 2
                | with 0 + 3. That will cause ccache to rebuild. We don't
                | want downstream stuff to also be rebuilt. C linking
                | within a package is nicely parallelizable, but in the
                | general case there are longer dependency chains, and
                | then that sort of thing starts to matter.
        
         | cesarb wrote:
         | > What isn't clear is what benefit the reproducibility
         | provides. The only way to verify that the untrusted binary is
         | bit-for-bit identical to the binary that would be produced by
         | building the source code, is to produce your own trusted binary
         | first and then compare it. At that point you already have a
         | trusted binary you can use, so what value did reproducible
         | builds provide?
         | 
         | That's not the interesting case. The interesting case is when
          | the untrusted binary _doesn't_ match the binary produced by
         | building the source code. Assuming that the untrusted binary
         | has been signed by its build system, you now have proof that
         | the build system is misbehaving. And that proof can be
         | distributed and reproduced by everyone else.
         | 
         | Once Debian is fully reproducible, I expect several
         | organizations (universities, other Linux distribution vendors,
         | governments, etc) to silently rebuild every single Debian
         | package, and compare the result with the Debian binaries; if
         | they find any mismatch, they can announce it publicly (with
         | proof), so that the whole world (starting with the Debian
         | project itself) will know that there's something wrong. This
         | does not need any complicated consensus mechanism.
         | 
         | > More often, attackers want signing keys so they can sign
         | their own binaries, steal proprietary source code, inject
         | malicious code into source code tarballs, or malicious patches
         | into source repositories.
         | 
         | In Debian, compromising the build server is not enough to
         | inject malicious code into source code tarballs or patches,
         | since the source code is also signed by the package maintainer.
         | Unexpected changes on which maintainer signed the source code
         | for a given package could be flagged as suspicious.
         | 
         | The only attack left from that list, at least for Debian, would
         | be for the attacker to sign their own top-level Release file
         | (on Debian, individual packages are not signed, instead a file
         | containing the hash of a file containing the hash of the
         | package is what is signed). But the attacker cannot distribute
         | the resulting compromised packages to everyone, since those who
         | rebuild and compare every package would notice it not matching
         | the corresponding source code, and warn everyone else.
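          | 
          | (The rebuild-and-compare step itself is not exotic; a sketch
          | with 'hello' as a placeholder package, using diffoscope from
          | the Reproducible Builds project for the comparison:)
          | 
          |     apt-get source --compile hello  # rebuild from signed source
          |     diffoscope hello_*.deb \
          |       /var/cache/apt/archives/hello_*.deb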
        
           | goodpoint wrote:
           | > I expect several organizations (universities, other Linux
           | distribution vendors, governments, etc) to silently rebuild
           | every single Debian package, and compare the result with the
           | Debian binaries
           | 
            | This has been happening for many years. A lot of large
            | companies that care about security and maintainability sign
            | big contracts with tech companies, and those contracts often
            | include indemnification.
        
         | dane-pgp wrote:
         | > If the server operators rebel, they can effectively veto a
         | change the vendor wants to make.
         | 
         | How often do you think there will be a change so controversial
         | that teams who have volunteered to secure the update system
         | will start effectively carrying out a Denial of Service attack
         | against all the users of that distro?
         | 
         | We also have to imagine that these malicious attestation nodes
         | can easily be ignored by users just updating a config file, so
         | the only thing the node operators could achieve by boycotting
         | the attestation process is temporarily inconveniencing people
         | who used to rely on them (which is not a great return on
         | investment for the reputation they burn in doing this).
        
           | taviso wrote:
           | I don't know what reputation damage will happen, they're just
           | third parties compiling code. There is no reputational damage
           | for operating a malicious tor exit relay, why would this be
           | different?
        
             | dane-pgp wrote:
             | As I understand it, Tor does have a way of detecting
             | whether an exit node is failing to connect users to their
             | intended destination. (With TLS enforced, the only thing a
             | malicious exit node could do is prevent valid connections).
             | 
             | In any case, I don't think anyone is proposing that the
             | attestation nodes be run by random anonymous people on the
             | internet. It would make more sense to have half a dozen or
             | so teams running these nodes, with each team being known
             | and trusted by the distro in question.
             | 
             | I'm not sure what the costs/requirements would be for
             | running one of these nodes, but it might be possible for
             | distros to each run a node dedicated to building each
             | other's distros (or at least the packages that are pushed
             | as security updates to stable releases).
             | 
             | Alternatively, individual developers that already work on a
             | distro can offer to build packages on their own machines
             | and contribute signed hashes to a log maintained by the
             | distro itself.
        
               | taviso wrote:
                | The point was that a reproducible build _doesn't_ mean
               | you don't have to trust the developer.
               | 
               | Build servers rebelling was just an example of the
               | additional complexities and attacks that it introduces,
               | for very negligible benefit.
        
         | smoldesu wrote:
          | Reproducibility is an option to mitigate backdoors and
          | incentivize developers to operate openly. It's no panacea, but it
         | makes a lot of sense in open-source projects where individual
         | actors are going to represent your largest threat vector. That
         | way, it becomes a lot harder to push an infected blob to main,
         | even if it still is _technically_ possible. Hashes are also
         | "technically pointless", but we still implement them liberally
         | to quickly account for data integrity.
        
           | taviso wrote:
           | Signatures are not technically pointless, they mean you only
           | have to trust the developer - not the mirror operators.
           | 
           | Reproducibility _is_ technically pointless, because you still
           | have to trust the developer, and they can still add
           | backdoors.
        
             | dane-pgp wrote:
             | Reproducibility means you don't have to worry that the
             | developer might have a backdoored toolchain (which also
             | means that they can't _pretend_ that a malicious toolchain
             | added the malicious code without their knowledge).
             | 
             | A talented developer might still be able to create a
             | bugdoor which gets past code review, but that takes more
             | effort and skill than just putting the malicious code into
             | a local checkout and then saying "How did that get there?".
        
               | taviso wrote:
               | Every major vendor has vulnerabilities introduced all the
               | time, by accident! No talent is necessary to introduce a
               | bugdoor, just malice.
               | 
               | You can already verify that a toolchain wasn't backdoored
               | today, reproducible builds aren't necessary for that.
        
               | wildfire wrote:
                | > You can already verify that a toolchain wasn't
                | backdoored today
               | 
               | How, exactly?
               | 
               | If we both compiled hello.c (a prototypical hello world
               | program), and exchanged binaries; how would you verify my
               | build wasn't malicious?
        
               | taviso wrote:
               | I think the workflow you're proposing is to take some
               | trusted source code, then compile it to make a trusted
               | binary. Now compare the trusted binary to the untrusted
               | binary provided by the vendor - If they're the same -
               | then it must have been made by an uncompromised
               | toolchain.
               | 
               | That does require reproducible builds, but here is how to
               | do it without reproducible builds:
               | 
               | Take the trusted source code, then compile it to make a
               | trusted binary. Now put the untrusted binary in the
               | trash, cause you already have a trusted binary :)
        
               | squiggleblaz wrote:
               | How about if the system will only run signed builds?
               | Couldn't you use it to verify the signed build by
               | stripping the signature and comparing them?
        
             | sayrer wrote:
             | Is it technically pointless if you view it as a check on
             | your own build, rather than a check on the work of others?
             | 
             | You are obviously familiar with Bazel/Blaze etc. Wouldn't
             | reproducibility be necessary for those systems to work well
             | most of the time? I can think of exceptions (like PGO), but
             | it seems useful to produce at least some binaries this way.
             | Also covered in this:
             | https://security.googleblog.com/2021/06/introducing-slsa-
             | end...
        
               | taviso wrote:
               | > Is it technically pointless if you view it as a check
               | on your own build, rather than a check on the work of
               | others?
               | 
               | That depends, I think it's difficult and mostly still
               | pointless. I wrote about this a bit in the blog post I
               | linked to. It's a big trade off, for questionable
               | benefit.
               | 
               | > Wouldn't reproducibility be necessary for those systems
               | to work well most of the time?
               | 
               | Yes, there are definitely some good non-security reasons
               | to want deterministic builds. My gripe is only with the
               | security arguments, like claims it can reduce threats of
               | violence against developers (!?!).
        
             | franga2000 wrote:
             | > Reproducibility is technically pointless, because you
             | still have to trust the developer, and they can still add
             | backdoors.
             | 
              | Builder != developer - and with reproducible builds, you
              | no longer need to trust the builder. CI is commonly used
              | for the final distributable builds, and you can't always
              | trust the CI server. Even if you do, many rely on third-
              | party things like docker images - if the base build image
              | gets compromised, code could trivially be injected into
              | builds running on it, and without reproducible builds
              | that would not be detectable.
             | 
             | As a developer, it would be quite reassuring to build my
             | binary (which I already do for testing) and compare the
             | hash with the one from the CI server to confirm nothing has
             | been tampered with. As a bonus, distro maintainers who have
             | their own CI can also check against my hashes to verify
             | their build systems aren't doing something fishy (malicious
             | or otherwise).
        
               | taviso wrote:
               | > As a developer, it would be quite reassuring to build
               | my binary (which I already do for testing) and compare
               | the hash with the one from the CI server to confirm
               | nothing has been tampered with.
               | 
               | That makes sense! However, this is not a good argument
               | for reproducible builds, because you can already do that
               | today.
               | 
               | You already have to build a trusted binary locally for
               | testing right? You're dreaming of being able to compare
               | that against the untrusted binary so that you can make
               | sure it's a trusted binary too - but you already have a
               | trusted binary!
               | 
               | Okay - but it's a hassle, you don't want to have to do
               | that, right? Too bad - reproducible builds only work if
               | someone reproduces them. You're still going to have to
               | replicate it somewhere you trust, so you gained
               | practically nothing.
        
               | eru wrote:
               | With the reproducible build, you can start using the
               | untrusted binary while you are still building your
               | trusted one.
               | 
               | You can also have ten people on the internet verify the
               | untrusted binary. With signatures, adding more people
               | doesn't help.
        
               | taviso wrote:
               | > With the reproducible build, you can start using the
               | untrusted binary while you are still building your
               | trusted one.
               | 
               | That's not how it works, you _have_ to reproduce it
               | before it becomes trusted.
               | 
               | > You can also have ten people on the internet verify the
               | untrusted binary.
               | 
               | Sure, then we have to build a complex consensus system
               | that introduces a bunch of unsolved problems. My opinion
               | is that this just isn't worth it, there is practically
               | nothing to gain and it's really really hard.
        
               | eru wrote:
               | > That's not how it works, you have to reproduce it
               | before it becomes trusted.
               | 
               | Eh, there's stuff you can do with software before you
               | trust it. Eg you can start pressing the CDs or
               | distributing the data to your servers. Just don't execute
               | it, yet.
               | 
               | > Sure, then we have to build a complex consensus system
               | that introduces a bunch of unsolved problems. My opinion
               | is that this just isn't worth it, there is practically
               | nothing to gain and it's really really hard.
               | 
               | It's the same informal system that keeps eg debian or the
               | Linux kernel secure currently:
               | 
               | People don't do kernel reviews themselves. They just use
               | the official kernel, and when someone finds a bug (or
               | spots otherwise bad code), they notify the community.
               | 
               | Similar with reproducible builds: most normal people will
               | just use the builds from their distro's server, but
               | independent people can do 'reviews' by running builds.
               | 
               | If ever a build doesn't reproduce, that'll be a loud
               | failure. People will complain and investigate.
               | 
               | Reproducible builds in this scenario don't protect you
               | from untrusted code upfront, but they make sure you'll
               | know when you have been attacked.
        
               | taviso wrote:
               | > People don't do kernel reviews themselves. They just
               | use the official kernel, and when someone finds a bug (or
               | spots otherwise bad code), they notify the community.
               | 
               | There's a big difference here. When a vulnerability is
               | found in the Linux kernel, that doesn't mean that you
               | were compromised.
               | 
               | If a build was found to be malicious, then you definitely
               | were compromised and it's little solace that it was
               | discovered after the fact. This is why package managers
               | check the deb/rpm signature _before_ installing the
               | software, not after.
        
       | brianzelip wrote:
       | Here's an informative podcast episode on Nix with one of its
       | maintainers, https://changelog.com/podcast/437.
        
       | jnxx wrote:
       | A good sign that the friendly competition by Guix has a positive
       | influence :)
       | 
       | https://guix.gnu.org/manual/en/html_node/Bootstrapping.html
       | 
       | https://guix.gnu.org/en/blog/2020/guix-further-reduces-boots...
        
         | delroth wrote:
         | This smaller bootstrap seed thing is a different problem from
         | reproducible builds. nixpkgs does still have a pretty big
         | initial TCB (aka. stage0) compared to Guix. But as far as I can
         | tell NixOS has the upper hand in terms of how much can be built
         | reproducibly (aka. the output hash matches across separate
         | builds).
        
           | siraben wrote:
           | There's an issue for this[0]. Currently Nixpkgs relies on a
           | 130 MB (!) uncompressed tarball, which is pretty big compared
           | to Guix. It would be amazing to get it down to something like
           | less than 1 KB with live-bootstrap.
           | 
            | Also, the way Nixpkgs is architected lets us experiment
            | with more unusual ideas, like a uutils-based stdenv[1]
            | instead of GNU coreutils.
           | 
           | [0] https://github.com/NixOS/nixpkgs/issues/123095
           | 
           | [1] https://github.com/NixOS/nixpkgs/pull/116274
        
           | jnxx wrote:
            | Bootstrapping from a very small binary core (I think 512
            | bytes) with an initial C compiler written in Scheme also
            | has the advantage that the system can easily be ported to
            | different hardware, which is one major strength of the GNU
            | projects and tools.
        
             | delroth wrote:
              | Not necessarily. Usually these very small cores end up
              | being more architecture-specific than a stage0 consisting
              | of gcc + some other core packages. A good illustration of
              | this is that Guix's work on bootstrap seed reduction has
              | so far been mostly applied to i686/amd64 and not to the
              | other architectures they support (at least, not
             | fully).
        
       | rejectedandsad wrote:
       | I really want to adopt Nix and NixOS for my systems but the cost
       | of wrapping packages is just a little too high for me right now
       | (or perhaps I'm out of date and a new cool tool that does it
       | automatically is out). IMHO, a dependency graph-based build
       | system that builds a hermetically sealed transitive closure of an
       | app's dependencies that can be plopped into a rootfs via Nix [0]
       | is far superior security wise to the traditional practice of
       | writing docker files.
       | 
       | [0] https://yann.hodique.info/blog/using-nix-to-build-docker-
       | ima...
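        | 
        | (A sketch of what that looks like with nixpkgs' dockerTools;
        | the image name and contents are illustrative:)
        | 
        |     nix-build -E 'with import <nixpkgs> {};
        |       dockerTools.buildImage {
        |         name = "redis-minimal";
        |         contents = [ redis ];
        |         config.Cmd = [ "${redis}/bin/redis-server" ];
        |       }'
        |     # the result is a tarball for `docker load < result`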
        
         | SuperSandro2000 wrote:
         | Did you try https://github.com/Mic92/nix-ld or
         | https://nixos.org/manual/nixpkgs/unstable/#setup-hook-autopa...
         | ?
        
           | rejectedandsad wrote:
           | Hm, this seems like a lower level set of tools that can be
           | composed into something a bit more user-friendly (one of my
           | personal complaints with Nix as well, despite being a big fan
           | of the concept and overall execution. Nothing too steep that
           | can't be learned eventually, but the curve exists). I'm
           | wondering if there would be an audience for a higher level
           | abstraction on top of Nix, or if one already exists.
        
       | rswail wrote:
       | Recently, President Biden put out an executive order that
       | mandates that NIST et al work out, over the next year, an
       | SBOM/supply chain mandate for software used by Federal
       | departments.
       | 
       | That's going to require the equivalent of "chain of custody"
       | attestations along the entire build chain.
       | 
       | Along with SOC and PCI/DSS and other standards, this is going to
       | require companies and developers to adopt NixOS type immutable
       | environments.
        
         | ffk wrote:
         | Unfortunately, I don't think this is going to be the outcome.
         | We're more likely to end up with "Here is the list of
         | filenames, subcomponents, and associated hashes" as opposed to
         | requiring NixOS style environments. Vendors to the
         | subcontractors will likely be required to provide the same list
         | of filename/subcomponent/hashes, a far cry from repeatable
         | builds.
        
         | 1MachineElf wrote:
         | I doubt it will happen until NixOS or similar tool has a
         | corresponding DISA STIG.
        
       | pabs3 wrote:
       | I wonder what else other than disorderfs they could throw in to
       | flush out additional currently hidden non-determinism.
        
       | koolba wrote:
       | There's something very poetic about "unstable" being
       | "reproducible". It's like controlled chaos.
        
         | dane-pgp wrote:
         | I believe that's known as the Chaotic Good alignment.
        
       | toastal wrote:
       | I really liked how easy it was to create a custom ISO when I
       | installed Nix. For once I had Dvorak as the default keyboard from
       | the outset, neovim for editing, and the proprietary WiFi drivers
       | I needed all from a minimal config file and `nix build`.
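        | 
        | (For reference, the build itself boils down to one command,
        | with iso.nix being your config that imports the
        | installation-cd-minimal module:)
        | 
        |     nix-build '<nixpkgs/nixos>' -A config.system.build.isoImage \
        |       -I nixos-config=iso.nix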
        
       | fouronnes3 wrote:
       | Are there synergies with the Debian reproducible build project
       | that this can benefit from?
        
         | Denvercoder9 wrote:
         | In general, Debian aims to upstream the changes they make to
         | software. That allows all other distributions, including Nix,
         | to profit from their work making software reproducible.
        
         | aseipp wrote:
         | Debian has been a major driver in making many pieces of
         | software reproducible across _every_ distribution; that Debian
         | maintainers so often submit patches upstream and work directly
         | to solve these issues is a big reason for this.
         | 
         | In other words: the work Debian has done absolutely set the
         | stage for this to happen, and it would have taken much longer
         | without them.
        
       | amelius wrote:
       | Hopefully this will one day also work with NVidia's software
       | packages.
        
         | jeroenhd wrote:
         | The trick with nvidia on Linux is to not expect that they will
         | ever work on anything. If you want to be sure that stuff works,
         | either don't buy Nvidia or use Windows.
        
           | amelius wrote:
           | What would you recommend instead of NVidia's Jetson embedded
           | platform?
        
             | jeroenhd wrote:
             | I'm not familiar with the market the Jetson is in and what
             | purposes it serves. From a quick Google, it seems to build
             | boards for machine learning? If that's true, I'm pretty
             | sure Google and Intel have products in that space, and I'm
             | sure there's other brands I don't know of.
             | 
              | If Nvidia has its own distribution, it might well work
              | for as long as Nvidia is willing to maintain the software,
              | because then they can tune their open-source stuff to
              | work with their proprietary drivers, the same way Apple
              | is hiding their tensorflow code. I would still be
              | hesitant to rely on Nvidia in that case, given their
              | history.
        
               | aseipp wrote:
                | Google's and Intel's solutions are just as proprietary,
                | with the downside that almost nobody uses them, so
                | bugs, performance, supported tooling, community, and
                | support windows are often much worse. It's not even
                | clear their solutions actually offer better performance
                | in general, given this. (And if you think proprietary
                | Nvidia software packages are infuriating messes, wait
                | until you try Intel proprietary software.) All that
                | said, how you feel about their history of Linux support
                | is basically irrelevant; they'll continue to dominate
                | because of it.
        
       | solarkraft wrote:
       | That is a pretty big deal.
       | 
       | This means everyone building NixOS will get the _exact_ same
       | binary, meaning you can now trust _any_ source for it because you
       | can verify the hash.
       | 
        | It's a huge win compared to the current default distribution
        | model of "just trust these 30 American entities that the
        | software does what they say it does".
       | 
       | Big congratulations to the team.
        
       | groodt wrote:
       | This is a big deal. Congratulations to all involved.
       | 
        | In software, complexity naturally increases over time, and
        | dependencies and interactions between components become
        | impossible to reason about. Eventually this complexity causes
        | the software to collapse under its own weight.
        | 
        | Truly reproducible builds (such as NixOS and Nixpkgs) provide
        | us with islands of "determinism" which can be taken as true
        | invariants. This enables us to build more systems and software
        | on top of deterministic foundations that can be reproduced by
        | others.
       | 
       | This reproducibility also enables powerful things like
       | decentralized / distributed trust. Different third-parties can
       | build the same software and compare the results. If they differ,
       | it could indicate one of the sources has been compromised. See
       | Trustix https://github.com/tweag/trustix
        
       | dcposch wrote:
       | This really deserves more love.
       | 
       | Who remembers Ken Thompson's "Reflections on Trusting Trust"?
       | 
       | The norm today is auto-updating, pre-built software.
       | 
       | This places a ton of trust in the publisher. Even for open-
       | source, well-vetted software, we all collectively cross our
        | fingers and hope that whoever is building these binaries and
        | running the servers that disseminate them is honest and good
        | at security.
       | 
       | So far this has mostly worked out due to altruism (for open
       | source maintainers) and self interest (companies do not want to
       | attack their own users). But the failure modes are very serious.
       | 
       | I predict that everyone's imagination on this topic will expand
       | once there's a big enough incident in the news. Say some package
       | manager gets compromised, nobody finds out, and 6mo later every
       | computer on earth running `postgres:latest` from docker hub gets
       | ransomwared.
       | 
       | There are only two ways around this:
       | 
       | - Build from source. This will always be a deeply niche thing to
       | do. It's slow, inconvenient, and inaccessible except to nerds.
       | 
       | - Reproducible builds.
       | 
       | Reproducible builds are way more important than is currently
       | widely appreciated.
       | 
        | I'm grateful to the NixOS team for beating a trail through
        | the jungle here. Retrofitting reproducibility onto a big
        | software project that grew without it is hard work.
        
         | swiley wrote:
         | >self interest (companies do not want to attack their own
         | users).
         | 
         | Anyone who has bought an Android phone in the past 5 years
         | knows that's not true.
        
         | staticassertion wrote:
         | > Reproducible builds are way more important than is currently
         | widely appreciated.
         | 
         | Why? How will this help with the problems you're talking about?
         | 
         | I can't come up with a single benefit to security from
         | reproducible builds. It seems nice for operational reasons and
         | performance reasons though.
        
           | pilif wrote:
           | _> I can 't come up with a single benefit to security from
           | reproducible builds._
           | 
            | It is a means of detecting a compromised supply chain. If
            | people rebuilding a distro cannot get the same hash as the
            | binaries shipped by the distributor, then the
            | distributor's infrastructure has likely been compromised.
        
             | staticassertion wrote:
             | How does this work in practice? The distro is owned, so
             | where are you getting the hash from? I mean, specifically,
              | what does the attacker have control of, and how does a
              | repeatable build help me stop them?
        
               | pilif wrote:
               | The idea is that multiple independent builders build the
               | same distro. You expect all of them to have the same
               | final hash.
               | 
                | This doesn't help against the sources being owned, but
                | it does help against build machines being owned.
               | 
               | Accountability for source integrity is in theory provided
               | by the source control system. Accountability for the
               | build machine integrity can be provided by reproducible
               | builds.
               | 
               | To answer your specific questions: The attacker has
               | access to the distro's build servers and is packaging and
               | shipping altered binaries that do not correspond to the
               | sources but instead contain added malware.
               | 
               | Reproducible builds allow third parties to also build
               | binaries from the same sources and once multiple third
               | parties achieve consensus about the build output, it
               | becomes apparent that the distro's build infrastructure
               | could be compromised.
        
               | staticassertion wrote:
               | OK so a build machine is owned and we have a sort of
               | consensus for trusted builders, and if there's a
               | consensus mismatch we know something's up.
               | 
                | I suppose that's reasonable. Sounds like reproducible
                | builds are a big step towards that, though clearly it
                | requires quite a lot of infrastructure support beyond
                | just that.
        
         | tester756 wrote:
         | >There are only two ways around this:
         | 
         | >- Build from source. This will always be a deeply niche thing
         | to do. It's slow, inconvenient, and inaccessible except to
         | nerds.
         | 
         | if you trust the compiler :)
        
         | User23 wrote:
         | > Who remembers Ken Thompson's "Reflections on Trusting Trust"?
         | 
         | > The norm today is auto-updating, pre-built software.
         | 
          | This is a little bit misleading. The actual paper[1] explains
          | that you can't even trust source-available code.
         | 
         | [1]
         | https://users.ece.cmu.edu/~ganger/712.fall02/papers/p761-tho...
        
         | Accujack wrote:
         | >The norm today is auto-updating, pre-built software.
         | 
         | Only if you define "norm" as what's prevalent in consumer
         | electronics and phones. Certainly, if you go by numbers, it's
         | more common than anything else.
         | 
         | That's not due to choice, though, it's because of the desires
         | of corporations for ever more extensive control of their
         | revenue streams.
        
         | cookiengineer wrote:
          | Actually, being able to build projects from GitHub much more
          | easily is the sole reason why I'm currently using Arch as my
          | main OS.
         | 
         | Building a project is just a shell script with a couple of
         | defined functions. Quite literally.
         | 
         | I really admire NixOS's philosophy of pushing the boundaries as
         | a distro where everything, including configurations and
         | modifications, can be done in a reproducible manner. They're
         | basically trying to automate the review process down the line,
         | which is absurdly complex as a challenge.
         | 
          | And as stability and desktop integration improve over time,
         | I really think that Nix has the potential to be the base for
         | easily forkable distributions. Building a live/bootable distro
         | will be so much easier, as everything is just a set of
         | configuration files anyways.
        
           | takeda wrote:
            | This is a slightly different thing. Nix and NixOS are
            | trying to solve multiple problems at once, which is why it
            | can be a bit confusing.
            | 
            | Many people don't realize it, but if you grab, say, the
            | project mentioned above from GitHub and I do too, and we
            | each compile it on our own machines, we get different
            | files (they'll work the same, but they won't be exactly
            | the same bytes).
            | 
            | Even if we use the same dependencies, we will still get
            | different files: maybe you used a slightly different
            | version of the compiler, or maybe those dependencies were
            | compiled with different dependencies or compilers. Maybe
            | the project inserts a date while building, or pulls in
            | some file. There are a million ways we could end up with
            | different files.
            | 
            | The goal here is to get bit-for-bit identical files, which
            | is something of a Holy Grail in this area. NixOS appears
            | to have just achieved that: every package that goes into
            | the minimal image is now fully reproducible.
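            | 
            | A tiny illustration with gzip (just one offender; the same
            | applies to timestamps and paths embedded by compilers):
            | 
            |     $ echo hello > msg
            |     $ gzip -c msg > a.gz; sleep 1; touch msg; gzip -c msg > b.gz
            |     $ cmp a.gz b.gz    # differ: gzip embeds the input mtime
            |     $ gzip -cn msg > a.gz; sleep 1; touch msg; gzip -cn msg > b.gz
            |     $ cmp a.gz b.gz    # identical: -n omits name and timestamp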
        
             | eru wrote:
             | A rich source of non-reproducibility is non-determinism
             | introduced by parallel building.
             | 
             | Preserving parallel execution, but arriving at
             | deterministic outputs, is an interesting and ongoing
             | challenge. With a rich mathematical structure, too.
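              | 
              | One classic fix is to sort inputs before combining them,
              | so the output doesn't depend on completion order (a
              | sketch; paths illustrative):
              | 
              |     # `find` output order varies between runs and machines;
              |     # sort before archiving to get a deterministic library
              |     $ find obj/ -name '*.o' | sort | xargs ar rcs libfoo.a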
        
         | marcosdumay wrote:
         | > I predict that everyone's imagination on this topic will
         | expand once there's a big enough incident in the news.
         | 
          | How does the SolarWinds incident, with just about every
          | large software vendor silently compromised for years, not
          | qualify?
          | 
          | Because it doesn't, people's imagination is as closed as it
          | always was.
        
           | yeowMeng wrote:
              | SolarWinds is closed source, so building from source is
              | not really an option.
        
             | pabs3 wrote:
             | They could have distributed the code to a few select
             | parties for the purposes of doing a build and nothing more.
        
               | marcosdumay wrote:
               | Specifically Microsoft did distribute the code to several
               | parties for the purposes of auditing. But they didn't
               | allow building it.
        
         | zamadatix wrote:
          | Unless you are going to be the equivalent of a full-time
          | maintainer doing code review for every piece of software you
          | use, you need to trust other software maintainers,
          | reproducible builds or not. Considering this is Linux, and
          | not even Linus can deeply review every change in just the
          | kernel anymore, that philosophy can't meaningfully apply to
          | software as large as NixOS.
        
           | Taek wrote:
            | You can't solve this problem without having a full history
            | of code to inspect (unless you are decompiling).
            | Reproducibility is the first step, and bootstrappability
            | is the second. Then we refine the toolchains and review
            | processes to ensure high-impact code is properly
            | scrutinized.
           | 
           | What we can't do is throw our hands up and say anyone who
           | compromises the toolchain deep enough is just allowed to win.
           | It will happen at some point if we don't put the right
           | barriers in place.
           | 
           | It's the first step of a long journey, but it is a step we
           | should be taking.
        
             | donio wrote:
             | https://github.com/fosslinux/live-bootstrap is another
             | approach, bootstrapping from a tiny binary seed that you
              | could hand-assemble and type in as hex. But it doesn't
              | address the dependency on the underlying OS being
              | trustworthy.
        
           | jnxx wrote:
            | That's too black-and-white. Being able to reproduce stuff
            | makes some kinds of attacks entirely uninteresting,
            | because malicious changes can be traced back, which is
            | exactly what many types of attackers do not want. Debian
            | and the Linux kernel, for example, are not fool-proof, but
            | both are in practice quite safe to work with.
        
             | zamadatix wrote:
                | Who are you going to trace it back to, if not the
                | maintainer anyway? If you trace it to the delivery
                | method, why is delivery of the source from the
                | maintainer inherently any safer?
        
               | jnxx wrote:
               | No, it is not always the maintainer. Imagine you download
               | a binary software package via HTTPS. In theory, the
               | integrity of the download is protected by the server
               | certificate. However, it is possible that certificates
               | get hacked, get stolen, or that nation states force CAs
               | to give out back doors. In that case, your download could
               | have been changed on the fly with arbitrary alterations.
               | Reproducible builds make it possible to detect such
               | changes.
        
               | zamadatix wrote:
               | Same as when you download the source instead of the
               | binary and see it reproducibly builds the backdoored
               | binary. And at this point we're back to "Build from
               | source. This will always be a deeply niche thing to do.
               | It's slow, inconvenient, and inaccessible except to
               | nerds." anyways.
               | 
                | It's not that reproducible builds provide zero value;
                | it's that they don't truly solve the trust problem as
                | initially stated. They also have non-security value to
                | boot, which is often understated compared to the
                | security value IMO.
        
               | eru wrote:
               | Reproducible builds still help a lot with security. For
               | example, they let you shift build latency around.
               | 
               | Eg suppose you have a software package X, available both
               | as a binary and in source.
               | 
               | With reproducible builds, you can start distributing the
               | binary to your fleet of computers, while at the same time
               | you are kicking off the build process yourself.
               | 
               | If the result of your own build is the same as the binary
               | you got, you can give the command to start using it.
               | (Otherwise, you quarantine the downloaded binary, and
               | ring some alarm bells.)
               | 
               | Similarly, you can re-build some random sample of
               | packages locally, just to double-check, and report.
               | 
                | If most debian users were to do something like that,
                | any tampering with the debian repositories would be
                | quickly detected.
                | 
                | (Having a few malicious users wouldn't hurt this
                | strategy much; they can only insert noise into the
                | system, not give you false answers that you trust.)
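                | 
                | That flow, sketched in shell (the rebuild, deploy and
                | quarantine commands are hypothetical placeholders):
                | 
                |     $ curl -O https://example.org/pkg.tar.gz  # start shipping this
                |     $ rebuild-from-source pkg -o local.tar.gz # hypothetical local rebuild
                |     $ if cmp -s pkg.tar.gz local.tar.gz; then
                |     >   deploy pkg.tar.gz      # match: give the command to use it
                |     > else
                |     >   quarantine pkg.tar.gz  # mismatch: ring some alarm bells
                |     > fi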
        
               | robocat wrote:
               | > and inaccessible except to nerds.
               | 
               | So was most every part of computer hardware and software
               | initially - this is just another milestone in that
               | journey.
        
               | bigiain wrote:
               | I guess reproducible builds solve some of the problems in
               | the same way TLS/SSL solves some of the problems.
               | 
               | Most of the world is happy enough with the soft guarantee
               | of: "This is _probably_ your bank's real website. Unless
               | a nation state is misusing their control over state owned
               | certificate authorities, or GlobalSign or LetsEncrypt or
               | whoever has been p0wned."
               | 
               | Expecting binary black and white solutions to trust
               | problems isn't all that useful, in my opinion. Often
               | providing 50% more "value" in trust compared to the
               | status quo is extremely valuable in the bigger picture.
        
               | zamadatix wrote:
                | Reproducible builds solve many security problems for
                | sure, but the problems they solve in no way help you
                | if the maintainer is not altruistic, or is bad at
                | security, as originally stated. They help tell you
                | whether the maintainer's toolchain was compromised,
                | and they do so AFTER the payload is delivered, using a
                | payload you built yourself rather than one made by the
                | maintainer anyway. They don't even tell you the
                | transport/hosting wasn't compromised, unless you can
                | somehow get a copy of the source used to compile that
                | didn't come from the maintainer directly, since the
                | transport/hosting for the source they maintain could
                | be compromised as well.
                | 
                | Solving that singular attack vector in the delivery
                | chain does nothing to remove the need to trust the
                | altruism and self-interest of maintainers. A good
                | thing(tm)? Absolutely, along with the other
                | non-security benefits, but it has nothing to do with
                | needing to trust maintainers, or with being in the
                | niche that reviews source code when automatic updates
                | come along, as originally sold.
        
               | pabs3 wrote:
               | There are other solutions to the problem of trusting
               | maintainers; namely incremental distributed code review.
               | The Rust folks are working on that:
               | 
               | https://github.com/crev-dev/
        
               | squiggleblaz wrote:
                | The question isn't whether they're perfect, nor is it
                | whether they prevent anything. But they do help a
                | person who suspects something is up rule certain
                | things in and out, which increases the chances that
                | the weak link can be found and eliminated.
               | 
               | If you have a fair suspicion that something is up and you
               | discover that when you compile reproduceable-package you
               | get a different output than when you download a prebuilt
               | reproduceable-package, you've now got something to work
               | with.
               | 
               | Your observation that they don't truly solve the trust
               | problem is true. But it's somehow not relevant. It is
               | better to be better off.
        
               | eptcyka wrote:
               | Even if the original attack happened upstream, if the
               | upstreamed piece of software was pinned via git, then
               | it'd be trivial to bisect the upstream project to find
               | the culprit.
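                | 
                | Roughly (tag names illustrative):
                | 
                |     $ git bisect start
                |     $ git bisect bad HEAD       # first build that misbehaves
                |     $ git bisect good v1.2.0    # last known-good release
                |     # git now checks out midpoints; mark each one good
                |     # or bad until it prints the first bad commit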
        
               | dragonsky67 wrote:
                | This is great if you are looking at attributing blame.
                | Not so great if you are trying to prevent all the
                | world's computers from getting owned...
                | 
                | I'd imagine that if I were looking at causing
                | world-wide chaos, I'd love nothing better than getting
                | into the toolchain in a way that I could later utilise
                | on a widespread basis.
               | 
               | At that point I would have achieved my aims and if that
               | means I've burnt a few people along the way, so be it,
               | I'm a bad guy, the damage has been done, the objective
               | met.
        
           | IgorPartola wrote:
           | No. I can review 0.1% of the code and verify that it compiles
           | correctly and then let another 999 people review their own
           | portion. It only takes one person to find a bit of malicious
            | code; we don't all need to review every single line.
        
             | remram wrote:
             | That only works if you coordinate. With even more people,
             | you can pick randomly and be relatively sure you've read it
             | all, but I posit that 1) you don't pick randomly, you pick
             | a part that is accessible or interesting to you (and
             | therefore probably others) and 2) reading code locally is
             | not sufficient to find bugs or backdoors in the whole.
        
               | pabs3 wrote:
               | The crev folks are working on a co-ordination system for
               | incremental distributed code review:
               | 
               | https://github.com/crev-dev/
        
               | remram wrote:
               | Crev is a great idea, unfortunately it is only really
               | available for Rust right now.
        
               | pabs3 wrote:
               | I noticed there is a git-crev project, might that be
               | useful for other languages? Also there is pip-crev for
               | Python.
        
               | IgorPartola wrote:
               | I actually wonder if it's possible to write code at such
               | a macro level as to obfuscate, say, a keylogger in a huge
               | codebase such that reviewing just a single module/unit
               | would not reveal that something bad is going on.
        
               | eru wrote:
               | Depends on how complicated the project itself is. A
               | simple structure with the bare minimum of side-effects
               | (think, functional programming) would make this effort
               | harder.
               | 
               | For something like C, all bets are off:
               | http://www.underhanded-c.org/ or
               | https://en.wikipedia.org/wiki/Underhanded_C_Contest
        
             | xvector wrote:
              | > It only takes one person to find a bit of malicious
              | code; we don't all need to review every single line.
             | 
             | This is just objectively wrong. I have worked on projects
             | at FAANG where entire teams did not spot critical security
             | issues during review.
             | 
             | You are very unlikely to spot an issue with just one pair
             | of eyes. You need many if you want _any_ hope of catching
             | bugdoors.
        
               | IgorPartola wrote:
               | You are misunderstanding what I am saying. I am saying
               | that it only takes one person who finds a vulnerability
               | to disclose it, to a first approximation. Realistically
               | it's probably closer to 2-3 since the first might be
               | working for the NSA, the CCP, etc. I am making no
               | arguments about what amount of effort it takes to find a
               | vulnerability, just talking about how not every single
               | user of a piece of code needs to verify it.
        
           | radicalcentrist wrote:
           | Reproducibility is what allows you to rely on other
           | maintainers' reviews. Without reproducibility, you can't be
           | certain that what you're running has been audited at all.
           | 
           | It's true that no single person can audit their entire
           | dependency tree. But many eyes make all bugs shallow.
        
         | jnxx wrote:
         | This is great! The one fly in the ointment, pardon, is that Nix
         | is a bit lax about trusting proprietary and binary-only stuff.
         | It would be great if there were a FLOSS-only core system for
         | NixOS which would be fully transparent.
        
           | rejectedandsad wrote:
           | > It would be great if there were a FLOSS-only core system
           | for NixOS
           | 
           | Might be wrong but isn't this part of the premise for
           | Guix/GuixSD?
        
             | Filligree wrote:
             | And it's good that it exists, I _guess?_
             | 
             | But it can't do any of the things I bought my computer to
             | do, so it's of limited value to me.
        
           | quarantine wrote:
            | Nix/Nixpkgs blocks unfree packages by default (the
            | `allowUnfree` option is off unless you enable it), so I
            | presume it would be relatively easy to keep packages
            | marked with an unfree license out entirely.
        
             | jnxx wrote:
              | I totally believe it is possible; it is perhaps more of
              | a cultural thing.
        
               | eptcyka wrote:
                | It's the pragmatic thing. I wouldn't use NixOS if I
                | weren't able to use it on a 16-core modern desktop. I
                | don't think there's a performant, 100%
                | FLOSS-compatible computer that wouldn't make me want
                | to gouge my eyes out with a rusty spoon when building
                | stuff for ARM.
        
               | zamadatix wrote:
                | Talos has 44-core/176-thread server options that can
                | take 2 TB of DDR4 and are FSF-certified. The board
                | firmware is also open and has reproducible builds.
        
               | eptcyka wrote:
               | Thanks, I was legitimately unaware of this option. That
               | does smash my argument, but I'm not likely to be using a
               | system like that anytime soon due to cost concerns
               | mostly.
        
               | tadfisher wrote:
               | That is way more expensive than a 16-core desktop,
               | though. Workstations are a class above consumer-grade
               | desktops and that's reflected in the price.
        
               | zamadatix wrote:
                | Talos has desktop options with as few as 8 cores as
                | well; this is just an example of how far you can take
                | FLOSS hardware. Not that I consider a 16-core x86
                | desktop "consumer-grade" in the first place (speaking
                | as a 5950X owner).
               | 
               | Probably not fit for replacing Grandma's budget PC but
               | then again grandma probably isn't worried about the ARM
               | cross compile performance of their machine running NixOS
               | either.
        
               | kaba0 wrote:
                | And it's not just hardware; there is a useful limit to
                | the purity of licenses. In many cases only proprietary
                | programs can do the work at all, or do it orders of
                | magnitude better.
        
         | londons_explore wrote:
         | > and 6mo later every computer ... gets ransomwared.
         | 
          | I'm really surprised such an attack hasn't happened already.
          | It seems so trivial for a determined attacker to take over
          | an open-source project (plenty of very popular projects have
          | just a single jaded maintainer).
         | 
          | The malicious compiler could inject an extra timed event
          | into the main loop, scheduled for the time the attack is due
          | to begin (but only if that's more than 3 hours away), which
          | simply retrieves a URL and executes whatever is received.
         | 
         | Detecting this by chance is highly unlikely - because to find
         | it, someone would have to have their clock set months ahead, be
         | running the software for many days, _and_ be monitoring the
         | network.
         | 
         | That code is probably only a few hundred bytes, so it probably
         | won't be noticed in any disassembly, and is only executed once,
         | so probably won't show up in debugging sessions or cpu
         | profiling.
         | 
         | It just baffles me that this hasn't been done already!
        
           | Gravyness wrote:
           | > I'm really surprised such an attack hasn't happened
           | already.
           | 
            | If you count npm packages, this has happened quite a few
            | times already. People (who don't understand security very
            | well) seem to be migrating to Python now.
        
           | schelling42 wrote:
           | How do you know it hasn't been done already? (with a more
           | silent payload than ransomware) /s
        
             | Tabular-Iceberg wrote:
             | What does the /s mean in this context?
        
               | Zetaphor wrote:
               | /s is internet parlance to show that the message should
               | be read in a sarcastic tone.
        
               | Tabular-Iceberg wrote:
               | Yes, but what confused me is that as far as I can tell we
               | really don't know that it hasn't been done before.
        
               | ghoward wrote:
               | Not GP, but I think it indicates sarcasm?
        
         | esjeon wrote:
          | > I'm grateful to the NixOS team for beating a trail through
          | the jungle here. Retrofitting reproducibility onto a big
          | software project that grew without it is hard work.
          | 
          | Actually, it's the Debian folks who pushed reproducible
          | builds hard in the early days. They upstreamed the necessary
          | changes and also spread the concept itself. This is a
          | two-decade-long community effort.
          | 
          | In turn, NixOS is mostly just wrapping those projects with
          | its own tooling, literally a cherry on top. NixOS is
          | disproportionately credited here.
        
           | catern wrote:
           | That's somewhat uncharitable. patchelf, for example, is one
           | tool developed by NixOS which is widely used for reproducible
           | build efforts. (although I don't know concretely if Debian
           | uses it today)
        
             | Foxboron wrote:
              | patchelf is not really widely used for solving
              | reproducible-builds issues. It's made for rewriting
              | RPATHs, which is essential for NixOS, but not something
              | you would see in other distributions except when someone
              | needs to work around poor upstream decisions.
        
           | zucker42 wrote:
           | I don't think NixOS is getting too much credit. This is an
           | accomplishment, even if it was built on the shoulders of
           | giants.
        
           | theon144 wrote:
           | By the way, here's the stats on Debian's herculean share of
           | the efforts: https://wiki.debian.org/ReproducibleBuilds
        
             | raziel2p wrote:
             | The ratio of reproducible to non-reproducible packages
             | doesn't seem to have changed that much in the last 5 years.
        
               | kzrdude wrote:
                | They have new challenges with new packages. In the
                | last 5 years a lot of Rust packages have entered the
                | archive, for example: a new compiler to tackle
                | reproducibility with (and not a trivial one, even if
                | upstream has worked on it a lot).
        
               | stavros wrote:
               | In my experience, rustc builds are reproducible if you
               | build on the same path. They come out byte for byte
               | identical.
        
               | kungito wrote:
               | Yeah I remember there was some drama regarding build
               | machine path leaking into the release binaries
        
               | kzrdude wrote:
               | Aha.. don't all compilers behave the same way, with debug
               | info?
               | 
               | I mean it's worthwhile to fix, but that behaviour seems
               | so standard.
        
               | KirillPanov wrote:
                | No, Rust leaks the path to the source code on the
                | _build_ machine. This path likely does not even exist
                | on the execution machine, so there's absolutely no
                | good reason for this leakage. It is very nonstandard.
                | 
                | It is really, really annoying that the Rust team is
                | not taking this problem seriously.
        
               | shawnz wrote:
               | I don't think this is correct. Most compilers include the
               | path to the source code on the build machine in the debug
               | info, and it's a common problem for reproducible builds.
               | This is not a rust-specific issue.
               | 
               | Obviously the binary can't contain paths from the
               | execution machine because it doesn't know what the
               | execution machine will be at compile time, and the source
               | code isn't stored on the execution machine anyway. The
               | point of including the source path in the debug info is
               | for the developer to locate the code responsible if
               | there's a crash.
               | 
               | See: https://reproducible-builds.org/docs/build-path/
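                | 
                | The usual mitigation is to remap the build path at
                | compile time (a sketch; the gcc flag exists since GCC
                | 8, and rustc has an equivalent, but paths here are
                | illustrative):
                | 
                |     $ gcc -g -ffile-prefix-map=/home/builder/src=/src -o app app.c
                |     $ rustc --remap-path-prefix /home/builder/src=/src main.rs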
        
               | colejohnson66 wrote:
               | But is it only on debug builds? Or are release builds
               | affected? Because if it's the latter, that's a big issue.
               | But for the former, does it really matter?
        
           | mikepurvis wrote:
           | I think both efforts have been important and have benefitted
          | each other. Nix has _always_ had purity/reproducibility as
           | tenets, but indeed it was Debian that got serious about it on
           | a bit-for-bit basis, with changes to the compilers, tools
           | like diffoscope, etc. The broader awareness and feasibility
           | of reproducible builds then made it possible for Nix to
           | finally realise the original design goal of a content-
           | addressed rather than input-addressed store, where you don't
           | need to actually sign your binary cache, but rather just sign
           | a mapping between input hashes and content hashes.
        
             | Ericson2314 wrote:
             | > where you don't need to actually sign your binary cache,
             | but rather just sign a mapping between input hashes and
             | content hashes.
             | 
             | Though you can and should sign the mapping!
        
               | mikepurvis wrote:
               | Of course, yes-- that was what I was saying. But the
               | theory with content-addressability is that unlike a
               | conventional distro where the binaries must all be built
               | and then archived and distributed centrally, Nix could do
               | things like age-out the cache and only archive the
               | hashes, and a third party could later offer a rebuild-on-
               | demand service where the binaries that come out of it are
               | known to be identical to those which were originally
               | signed. A similar guarantee is super useful when it comes
               | to things like debug symbols.
        
           | dcposch wrote:
           | Has a full linux image--something you can actually boot--
           | existed as a reproducible build before today?
        
             | 0xEFF wrote:
             | Forgive my ignorance but isn't that Slackware?
        
               | heisenzombie wrote:
                | No. If I build Slackware on my computer and you build
                | Slackware on yours, the binaries we end up with will
                | not be bit-for-bit identical.
        
           | chriswarbo wrote:
            | > This is a two-decade-long community effort.
           | 
           | So is Nix/NixOS, which has reproducibility in mind from the
           | start.
           | 
            | The earliest example I can find is "Nix: A Safe and
            | Policy-Free System for Software Deployment" from 2004 (
            | https://www.usenix.org/legacy/event/lisa04/tech/full_papers/... ):
           | 
           | > Build farms are also important for release management - the
           | production of software releases - which must be an automatic
           | process to ensure reproducibility of releases, which is in
           | turn important for software maintenance and support.
           | 
           | Eelco's thesis (from 2006) also has this as the first bullet-
           | point in its conclusion:
           | 
           | > The purely functional deployment model implemented in Nix
           | and the cryptographic hashing scheme of the Nix store in
           | particular give us important features that are lacking in
           | most deployment systems, such as complete dependencies,
           | complete deployment, side-by-side deployment, atomic upgrades
           | and rollbacks, transparent source/binary deployment and
           | reproducibility (see Section 1.5).
        
         | 0xbadcafebee wrote:
         | Supply chain attacks are definitely important to deal with, but
         | defense-in-depth saves us in the end. Even if a postgres
         | container is backdoored, if the admins put postgres by itself
         | in a network with no ingress or egress except the webserver
         | querying it, an attack on the database itself would be very
         | difficult. If on the other hand, the database is run on
         | untrusted networks, and sensitive data kept on it... yeah,
         | they're boned.
        
           | dcposch wrote:
           | In the case of a supply chain attack, you don't even need
           | ingress or egress.
           | 
            | Say the postgres binary or image is set to encrypt the
            | data on a certain date. Then it asks you to pay X ZEC to a
            | shielded address to get your decryption key. This would
            | work even if the actual database was airgapped.
        
         | radicalcentrist wrote:
         | Reproducibility is necessary, but unfortunately not sufficient,
         | to stop a "Trusting Trust" attack. Nixpkgs still relies on a
         | bootstrap tarball containing e.g. gcc and binutils, so
         | theoretically such an attack could trace its lineage back to
         | the original bootstrap tarball, if it was built with a
         | compromised toolchain.
        
           | beermonster wrote:
           | And also shipped firmware or binary blobs.
        
           | mjg59 wrote:
           | Diverse double compilation should allow a demonstration that
           | the toolchain is trustworthy.
        
             | smitty1e wrote:
             | And how _about_ that hardware and firmware microcode?
        
             | Foxboron wrote:
             | Indeed, and with the work done by Guix and the Reproducible
             | Builds project we do have a real-world example of diverse
             | double compilation which is not just a toy example
             | utilizing the GNU Mes C compiler.
             | 
             | https://dwheeler.com/trusting-trust/#real-world
        
               | dane-pgp wrote:
               | Projects like GNU Mes are part of the Bootstrappable
               | Builds effort[0]. Another great achievement in that area
               | is the live-bootstrap project, which has automated a
               | build pipeline that goes from a minimal binary seed up to
               | tinycc then gcc 4 and beyond.[1]
               | 
               | [0] https://www.bootstrappable.org/
               | 
                | [1] https://github.com/fosslinux/live-bootstrap/blob/master/part...
        
               | Foxboron wrote:
                | I feel the need to point out that the "Bootstrappable
                | Builds" project is a working group out of the
                | Reproducible Builds project, made up of people
                | interested in the next step beyond reproducing
                | binaries. Obviously this project has seen the most
                | effort from Guix :)
                | 
                | The GNU Mes C experiment mentioned above was also
                | conducted during the 2019 Reproducible Builds summit
                | in Marrakesh.
               | 
               | https://reproducible-builds.org/events/Marrakesh2019/
        
             | naniwaduni wrote:
                | In principle, diverse double-compiling merely
                | increases the number of compilers the adversary needs
                | to subvert. There are obvious practical concerns, of
                | course, but frankly this raises the bar _less_ than
                | maintaining the backdoor across future versions of the
                | same compiler did in the first place, since at least
                | backdooring multiple contemporary compilers doesn't
                | rely on guessing, well ahead of time, what changes
                | future people are going to make.
                | 
                | Critically, it shouldn't be taken as a _demonstration_
                | that the toolchain is trustworthy unless you trust
                | whoever's picking the compilers! This kind of ruins
                | approaches based on having any particular outside
                | organization certify certain compilers as "trusted".
        
               | XorNot wrote:
                | There is an uphill effort required to actually pull
                | this off. A very well-informed adversary might
                | theoretically get it right the first time, but human
                | adversaries are unlikely to, and their resources are
                | large but far from infinite.
                | 
                | Your entire effort is potentially brought down by
                | someone making a change in a way you didn't expect,
                | and someone going "huh, that's funny..."
        
               | GauntletWizard wrote:
                | Quite frankly, I'm surprised this hasn't come up
                | multiple times already in the course of getting NixOS
                | and others to this point. The attacks are easy to hide
                | and hard to attribute.
        
             | User23 wrote:
             | Really? How does that accomplish more than proving the
             | build is a fixed point? An attacker may well be aware of
             | the fixed point combinator after all.
             | 
             | Edit: I think that tone may have come off as snarky, but I
             | meant it as an honest question. If any expert can answer
             | I'd really appreciate it.
        
               | eru wrote:
               | Fixed points don't come in here at all, unless you
               | specifically want to talk about compiling compilers.
               | 
               | Diverse double compilation is useful for run-of-the mill
               | programs, too.
        
               | chriswarbo wrote:
               | Programs built by different compilers aren't generally
               | binary comparable, e.g. we shouldn't expect empty output
               | from `diff <(gcc run-of-the-mill.c) <(clang run-of-the-
               | mill.c)`
               | 
               | However, the _behaviour_ of programs built by different
               | compilers should be the same. Run-of-the-mill programs
               | could use this as part of a test suite, for example; but
               | diverse double compilation goes a step further:
               | 
               | We build compiler A using several different compilers X,
               | Y, Z; then use those binaries A-built-with-X, A-built-
               | with-Y, A-built-with-Z to compile A. The binaries
               | A-built-with-(A-built-with-X), A-built-with-(A-built-
               | with-Y), A-built-with-(A-built-with-Z) should all be
               | identical. Hence for 'fully countering trusting trust
               | through diverse double-compiling', we must compile
               | compilers https://dwheeler.com/trusting-trust/
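                | 
                | A sketch of that in shell (pretending compiler A's
                | source is `cc_A.c` and that A-binaries take gcc-style
                | flags; purely illustrative):
                | 
                |     # stage 1: build A's source with two unrelated compilers
                |     $ gcc   -o A_via_gcc   cc_A.c
                |     $ clang -o A_via_clang cc_A.c    # these binaries differ
                | 
                |     # stage 2: each stage-1 binary compiles A's source again
                |     $ ./A_via_gcc   -o A_final_1 cc_A.c
                |     $ ./A_via_clang -o A_final_2 cc_A.c
                | 
                |     $ cmp A_final_1 A_final_2    # should be bit-identical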
        
         | tbrock wrote:
         | Why does building from source help? It's not like people are
         | reading every line of the source before building it anyway
         | 99.99% of the time.
        
           | xvector wrote:
           | If the package maintainer's build pipeline is compromised
           | (eg. Solarwinds), you are unlikely to be affected if you
           | build from reviewed source yourself.
        
             | pjmlp wrote:
             | Except hardly anyone reviews a single line of code.
        
               | squiggleblaz wrote:
               | So? We are trying to protect against a malicious
               | interloper damaging the machine of a trusted and
               | trustworthy partner.
               | 
               | You are bringing up red herrings about trusted partners
               | being malicious and untrustworthy.
               | 
               | Do you genuinely believe we should only solve a problem
               | if it leads to a perfect outcome?
        
               | pjmlp wrote:
                | I genuinely believe in spending resources on issues
                | where the ROI is positive.
                | 
                | So far, exploits on FOSS kind of prove the point: not
                | everyone is using Gentoo and reading every line of
                | code in their emerged packages, let alone following
                | similar computing models.
                | 
                | Now, if we were speaking about driving the whole
                | industry to where security bugs, caused by using
                | languages like C that cannot save us from code reviews
                | unless done by ISO C language lawyers and compiler
                | experts in UB optimizations, are heavily punished the
                | way construction companies are for a fallen bridge,
                | then that would be interesting.
        
               | therealjumbo wrote:
                | > I genuinely believe in spending resources on issues
                | where the ROI is positive.
               | 
               | How are you measuring the ROI of security efforts inside
               | an OSS distro like debian or nixos? The effort in such
               | orgs is freely given, so nobody knows how much it costs.
               | And how would you calculate the return on attacks that
               | have been prevented? Even if an attack wasn't prevented
               | you don't know how much it cost, and you might not even
               | know if it happened (or if it happened due to a lapse in
               | debian.)
               | 
                | >So far, exploits on FOSS kind of prove the point: not
                | everyone is using Gentoo and reading every line of
                | code in their emerged packages, let alone following
                | similar computing models.
               | 
               | Reproducible builds is attempting to mitigate a very
               | specific type of attack, not all attacks in general. That
               | is, it focuses on a specific threat model and countering
               | that, nothing else. It's not a cure for cancer either.
               | 
                | >Now, if we were speaking about driving the whole
                | industry to where security bugs, caused by using
                | languages like C that cannot save us from code reviews
                | unless done by ISO C language lawyers and compiler
                | experts in UB optimizations, are heavily punished the
                | way construction companies are for a fallen bridge,
                | then that would be interesting.
               | 
               | This is just a word salad of red herrings. Different
               | people can work on different stuff at the same time.
        
         | 1vuio0pswjnm7 wrote:
         | "- Build from source. This will always be a deeply niche thing
         | to do. It's slow, inconvenient, and inaccessible except to
         | nerds."
         | 
          | I prefer compiling from source to binary packages. For me it
          | is neither slow, inconvenient, nor inaccessible.
         | 
         | Only with larger, more complex programs does compiling from
         | source become a PITA.
         | 
         | The "solution" I take is to prefer smaller, less complex
         | programs over larger, more complex ones.
         | 
         | If I cannot compile a program from source relatively quickly
         | and easily, I do not voluntarily choose it as a program to use
         | daily and depend on.
         | 
          | As for compiling the OS, I use NetBSD, so perhaps I am
          | spoiled because it is relatively easy to compile.
         | 
         | That said, I understand the value of reproducible builds and
         | appreciate the work being done on such projects.
        
           | kixiQu wrote:
           | "except to nerds" was conversationally phrased shorthand for
           | "except to people with rarefied technical skills".
        
           | [deleted]
        
           | kaba0 wrote:
           | You don't use a browser or an office suite? Because those are
           | a pain in the ass to compile (in terms of time).
        
             | 1vuio0pswjnm7 wrote:
             | Not just time, IME. Also 1. highly resource intensive,
             | e.g., cannot compile on small form factor computers (easier
             | for me to compile a kernel than a "modern" browser) and 2.
             | brittle.
        
           | zucker42 wrote:
           | Don't take this the wrong way, but I think you qualify as a
           | nerd. :)
        
           | brigandish wrote:
           | Unfortunately, it's easy to break a lot of builds by things
           | such as deciding not to install to /usr/local, or by building
           | on a Mac. Pushing publishers to practices that aid
           | reproducible builds would help both sides.
           | 
           | I'd love to try building NetBSD, btw, I must try that!
        
           | vore wrote:
           | I think using NetBSD might put you in the nerd camp ;-)
        
         | hsbauauvhabzb wrote:
          | I don't have the resources to audit every component of my
          | system. I favour enterprise distros that audit the code that
          | ends up in their repos, and I avoid pip, npm, etc., but
          | there are some glaring trade-offs in both productivity and
          | scalability.
          | 
          | The problem is unmaintainability; I can't imagine it'd be
          | easier for medium-sized teams where security isn't a
          | priority, either.
        
         | initplus wrote:
          | Building from source doesn't have to be inaccessible if the
          | build tooling around it is strong. Modern compiled languages
          | like Go (or modern toolchains for legacy languages, like
          | vcpkg) have a convention of building everything possible
          | from source.
          | 
          | So at least for software libraries, building from source is
          | definitely viable. For end-user applications it's another
          | story, though; I doubt we will ever be at a point where
          | building your own browser from source makes sense...
        
           | garmaine wrote:
           | Binary reproducible builds are still pretty inaccessible
           | though.
        
           | bigiain wrote:
           | Building from source also doesn't buy you very much, if you
           | haven't inspected/audited the source.
           | 
           | The upthread hypothetical of a compromised package manager
           | equally applies to a compromised source repo.
           | 
          | _Maybe_ you always check the hashes? _Maybe_ you always get
          | the hashes from a different place to the code? _Maybe_ the
          | hypothetical attacker couldn't replace both the code you
          | download and the hash you use to check it?
          | 
          | (And as Ken pointed out decades ago, maybe the attacker
          | fucked with your compiler, and you had lost before you even
          | started.)
        
       | goodpoint wrote:
        | Reminder: https://reproducible-builds.org/ was born in Debian
        | and pioneered reproducible builds.
        | 
        | It took very significant effort, and the work largely benefits
        | build tools (compilers, linkers, libraries) that are not
        | Debian-specific.
        
       | georgyo wrote:
        | Mandatory link to the Debian single-purpose site:
       | https://isdebianreproducibleyet.com/
       | 
        | However, that is for everything in Debian, not just the ISO.
        | It is truly remarkable to see all the Linux distributions move
        | the needle forward.
        
         | Foxboron wrote:
         | And Arch Linux :)
         | 
         | https://reproducible.archlinux.org/
        
           | [deleted]
        
       | pabs3 wrote:
       | I wonder when PyPI and similar ecosystems will get deterministic
       | reproducible builds.
        
       | mraza007 wrote:
        | This might be a dumb question, but what's a reproducible build?
        
         | egberts1 wrote:
          | The ability to recreate a binary image from the same set of
          | source files, with that binary coming out identical to the
          | package-provided binary.
          | 
          | This is a useful way of ensuring that nothing is amiss
          | during compile/link time.
          | 
          | Today's GNU toolchain clutters the interior of binary files
          | with random hash values, full file paths (that you couldn't
          | easily recreate), and random tmpfile directories.
          | 
          | The idea is to make it easier to verify a binary, compare it
          | with an earlier-built-but-same-source binary, or reverse
          | engineer it (and catch unexpected changes in code).
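          | 
          | In practice, verifying looks something like this (paths
          | illustrative; diffoscope is the Reproducible Builds
          | project's comparison tool):
          | 
          |     $ sha256sum my-build/app their-build/app   # hashes should match
          |     $ diffoscope my-build/app their-build/app  # explains any mismatch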
        
           | mraza007 wrote:
            | Oh okay, thank you for the explanation. You explained it
            | really well.
        
       | aseipp wrote:
        | I think r13y has said the minimal ISO was less than 10 packages
        | away from 100% for over _2 years_ now. The long tail has
        | finally been overcome! Huge news.
        
         | jonringer117 wrote:
          | Some of the issues were really difficult to tackle, like the
          | Linux kernel generating random hashes.
          | 
          | The last mile was done by removing the use of Ruby (which
          | uses some random tmp directories) from the final image:
          | Asciidoctor (Ruby) was replaced with asciidoc (Python).
        
       ___________________________________________________________________
       (page generated 2021-06-21 23:02 UTC)