LWN.net

Whatever happened to SHA-256 support in Git?

By Jonathan Corbet
June 23, 2022

The news has been proclaimed loudly and often: the SHA-1 hash algorithm is terminally broken and should not be used in any situation where security matters. Among other things, this news gave some impetus to the longstanding effort to support a more robust hash algorithm in the Git source-code management system. As time has passed, though, that work seems to have slowed to a stop, leaving some users wondering when, if ever, Git will support a hash algorithm other than SHA-1.

Hash functions are, of course, at the core of how Git works. Every object in its data store -- every version of every file, among other things -- is hashed, with the resulting value serving as the key under which that object is stored. Commits, too, are represented by a hash of the current state of the tree, the commit message, and the hash(es) of the parent commit(s).

The security of the hash function is a key part of the integrity of a repository as a whole. If an attacker could replace a commit with another having the same hash value, they could perhaps inject malicious code into a repository without risking detection.
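As a rough illustration of how an object's hash serves as its storage key, the sketch below computes the ID that Git would assign to the contents of a file (a "blob" object). It assumes the usual object layout, in which Git hashes a short header (the object type, the content length, and a NUL byte) followed by the content itself. The function name git_blob_id is made up for this example, and the code does not call Git at all.

```python
import hashlib


def git_blob_id(data: bytes, algo: str = "sha1") -> str:
    """Return the hex object ID of a blob holding `data`.

    Hypothetical helper for illustration only: it rebuilds the
    "blob <size><NUL><content>" byte string and hashes it with the
    chosen algorithm ("sha1" or "sha256").
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.new(algo, header + data).hexdigest()


empty_sha1 = git_blob_id(b"")
empty_sha256 = git_blob_id(b"", "sha256")

print(empty_sha1)         # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
print(len(empty_sha1))    # 40 hex characters with SHA-1
print(len(empty_sha256))  # 64 hex characters with SHA-256
```

Only the hash constructor changes between the two calls, which is why the same content ends up under a different, longer key in a SHA-256 repository.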
The prospect of such an undetected substitution is worrisome to anybody who depends on the security of code stored in Git repositories -- everybody, in other words.

The Git project has long since chosen SHA-256 as the replacement for SHA-1. Git was originally written with SHA-1 deeply wired into the code, but all of that code has since been refactored and can handle multiple hash types, with SHA-256 being the second supported type. It is now possible to create a Git repository using SHA-256 (just pass the --object-format=sha256 flag to git init) and most local operations will work just fine. The foundation for support of alternative hash algorithms in Git was part of the 2.29 release in 2020 and appears to be solid.

That 2.29 release, though, is the last one that features alternative-hash work in any significant way; there has been no mention of this work in the project's release notes since a fix showed up in 2.31, released in March 2021. The 2.29 work marked SHA-256 as experimental and warned that "there is no interoperability between SHA-1 and SHA-256 repositories yet". There was some work toward interoperability posted in 2020, but those patches do not appear to have ever been merged into the Git mainline.

In other words, work on supporting the use of a hash algorithm other than SHA-1 in Git appears to have ground to a halt. That recently led Stephen Smith to post a query about its status to the development list. This response from Ævar Arnfjörð Bjarmason is illuminating and, for those looking forward to full SHA-256 support, potentially discouraging:

"I wouldn't recommend that anyone use it for anything serious at the moment, as far as I can tell the only users (if any) are currently (some) people work on git itself."

Bjarmason pointed out that there is still no interoperability between SHA-1 and SHA-256 repositories, and that none of the Git hosting providers appear to be supporting SHA-256.
That support (or the lack thereof) matters; a repository that cannot be pushed to a Git forge will be essentially useless to many people. There is also the risk (which cannot really be made to go away) that the longer hashes used with SHA-256 may break tools developed outside of the Git project. The overall picture is one of a feature that is not yet ready for real-world use.

That said, it is worth noting that brian m. carlson, who has done the bulk of the hash-transition work so far, disagrees with Bjarmason's assessment. In his view, the only "defensible" reason to use SHA-1 at this point is interoperability with the Git forge providers. Otherwise, he said, SHA-1 is obsolete, and performance with SHA-256 can be "substantially faster". But he agrees that the needed interoperability does not exist, and nobody has said that it is coming anytime soon.

What has happened here looks, to an extent at least, like a story that has played out numerous times over the course of free-software history. A problem has been identified, and a great deal of core foundational work has been done to solve it. That solution appears to be well considered and solidly implemented. In a sense, the job is 90% done. All that is left is the hard work of making the transition to a new hash easy for users -- what could be thought of as "the other 90%" of the job.

This sort of interface and compatibility development is hard and developers often do not find it particularly rewarding, so it tends to be neglected by our community. The Git project, one might argue, is especially prone to user-interface challenges, but the problem is wider than that. There are certain sorts of tasks that volunteers are often uninclined to pick up, and that companies may not feel the need to fund.

Given the threat that the SHA-1 hash poses, one might think that there would be a stronger incentive for somebody to support this work. But, as Bjarmason continued, that incentive is not actually all that strong.
The project adopted the SHA-1DC variant of SHA-1 for the 2.13 release in 2017, which makes the project more robust against the known SHA-1 collision attacks, so there does not appear to be any sort of imminent threat of this type of attack against Git. Even if creating a collision were feasible for an attacker, Bjarmason pointed out, that is only the first step in the development of a successful attack. Finding a collision of any type is hard; finding one that is still working code, that has the functionality the attacker is after, and that looks reasonable to both humans and compilers is quite a bit harder -- if it is possible at all.

So, few people are losing sleep over the possibility that a Git repository could be deliberately corrupted by way of an SHA-1 hash collision anytime soon. The combination of a lack of urgency and little apparent interest in doing the work has seemingly brought the SHA-256 transition to a halt. Perhaps that is how the situation will remain until another SHA-1 weakness turns up and brings attention back to the situation. But, as Randall Becker pointed out, there is a cost to this inaction:

"Adding my own 0.02, what some of us are facing is resistance to adopting git in our or client organizations because of the presence of SHA-1. There are organizations where SHA-1 is blanket banned across the board - regardless of its use. [...] Getting around this blanket ban is a serious amount of work and I have very recently seen customers move to older much less functional (or useful) VCS platforms just because of SHA-1."

It is a bit of a stretch to imagine that remaining with SHA-1 will threaten Git's dominance in the near future. But it could, perhaps, give a toehold to a competitor that would lead to trouble in the longer term, especially if the security of SHA-1 crumbles further. Given that, one might think that companies that are dependent on Git would see some value in solving this particular problem.
Many companies use Git, but some have based their entire business model around it. The latter companies have benefited greatly from the community's investment in Git, and they have a lot to lose if Git loses its prominence. It would seem to make sense for one or more of these companies to make the relatively small investment needed to push this transition to completion; that would be good for the community -- and for their own future as well.

-----------------------------------------

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 14:19 UTC (Thu) by LtWorf (subscriber, #124958)

Wouldn't the people that care about security just sign the commits?

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 14:35 UTC (Thu) by dtlin (subscriber, #36537)

Git signs the commit or tag object, not the whole file tree. So if you don't trust SHA-1, GPG doesn't add any security - the file content under a signed commit or tag could still be replaced.

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 19:10 UTC (Thu) by NYKevin (subscriber, #129325)

Furthermore, all currently feasible SHA-1 attacks are collision attacks - i.e. attacks in which the same person creates both the "good" commit and the "bad" commit. Signatures are primarily designed to deal with the case where the "good" and "bad" commits are created by different people (i.e. they are used to prove that a given commit was authored by the person identified in its metadata, and not an imposter). You can also use signatures to prove that some third party has reviewed the commit and believes it to be non-malicious, but to my understanding, that is not the typical use case (and, as you say, it is defeated by the collision attack anyway).

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 20:04 UTC (Thu) by walters (subscriber, #7396)

I started https://github.com/cgwalters/git-evtag before the sha1 breakage, I think it still makes sense. May try at some point to get it into git again.

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 14:42 UTC (Thu) by smoogen (subscriber, #97)

Then you need to build into your checkout tooling to check that the signatures are actually valid. That means knowing what this 3rd party person's key is, how valid it is to that project, etc. I expect that like most of the usage of 'signatures'... it would be dictated but turned off in any build system because it is so hard to keep working. Most developers would rather have someone giving them hacked code than deal with GPG signature problems on a checkout.

Which is why saying 'you can't use SHA-1' is an easier dictum from a security group's compliance method. You know that the signatures etc. would be better, but you know within 2 minutes of saying using it would be ok that it would be turned off in the name of 'get that build out the door'.

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 14:43 UTC (Thu) by martin.langhoff (subscriber, #61417)

The git ecosystem is vast. This is both needed, and something that'll break all sorts of stuff. Which reminds me... how is that IPv6 transition going? :-)

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 19:03 UTC (Thu) by Sesse (subscriber, #53779)

Around 40% of end users (desktop or mobile) support IPv6. https://www.google.com/intl/en/ipv6/statistics.html If you are an ISP and enable IPv6 in your network, you can expect to see more IPv6 traffic than IPv4 traffic on average.
IPv6
Posted Jun 23, 2022 19:18 UTC (Thu) by corbet (editor, #1)

As a highly precise and rigorous experiment that surely generalizes to the net as a whole, I did a couple of greps out of the LWN server log and found that just under 20% of our hits come from IPv6 addresses.

IPv6
Posted Jun 23, 2022 19:21 UTC (Thu) by Sesse (subscriber, #53779)

Note that if your IPv6 connectivity is significantly slower than your IPv4 connectivity (on average), clients will generally prefer IPv4 (they send SYN packets for both, and let them race, with a slight preference for IPv6).

IPv6
Posted Jun 23, 2022 20:32 UTC (Thu) by jem (subscriber, #24231)

I suspect IPv6 net traffic is skewed towards connections over the mobile network, with the traditional DSL connections still being IPv4-only. Maybe lwn.net readers are predominantly using computers in a traditional (home) office setting? Company networks also typically don't support IPv6, which can be seen in the graphs published by Google as spikes during the weekends.

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 14:55 UTC (Thu) by mathstuf (subscriber, #69389)

Was there discussion of prefixing sha256 hashes with some constant prefix to know that they're not sha1? For example, all sha256 hashes are prefixed with `h`, so commit `h000000` is known to not be a sha1. I skip over `g` because `gdeadbeef` is already common to demarcate hashes in snapshot tarballs in various places.

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 18:21 UTC (Thu) by klossner (subscriber, #30046)

Isn't the length of the hash all you need to distinguish? SHA1 hashes are 40 characters long while SHA256 hashes are 64. (Which breaks any home-brew software that operates on git trees and hard-codes the 40-character width.)
Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 18:25 UTC (Thu) by engla (subscriber, #47454)

A lot of the tools (for example git log --graph) use abbreviated hashes.

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 20:26 UTC (Thu) by wtarreau (subscriber, #51152)

There's no problem with that at all, not more than there is any with abbreviated commits nowadays. Git would just need to try to resolve an abbreviated commit to both SHA1 and SHA2 and complain in case of multiple matches. Then for the 40-char ones (SHA1) it would just have to do the same. In practice you won't design SHA2 hashes that purposely collide with SHA1, and the probability that it happens by accident is as low as having two identical SHA1 commits by accident, i.e. so close to zero that it practically is for our entire civilization.

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 15:28 UTC (Thu) by zblaxell (subscriber, #26385)

> seen customers move to older much less functional (or useful) VCS platforms

Which VCS platforms both 1) don't use SHA1 and 2) don't introduce a ton of additional vulnerabilities compared to git? Do these customers prefer to do the relentless auditing tasks to ensure the integrity of a centralized VCS? If the customers are doing it anyway, wouldn't that auditing also detect a successful SHA1 collision attack against a git repo?

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 16:02 UTC (Thu) by dullfire (subscriber, #111432)

While I don't know the details of any such corporation, it wouldn't surprise me if some organizations don't care about the security aspect, just that whatever solution they adopt isn't banned by a policy (presumably set by people who know better... but more likely people just blacklisting technical terms that they see bad reputations for).
Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 17:13 UTC (Thu) by tlater (subscriber, #116684)

It's a regulation thing: https://csrc.nist.gov/Projects/Hash-Functions/NIST-Policy... Various industries in the US require that you comply with those standards, and given how many companies at least work with US companies that means it tears through a lot of the world.

While it's a bit ridiculously broad to state you simply are not allowed to generate sha-1 hashes ever (one of the addendum documents makes this explicit), it does make sense from a policy perspective. Otherwise there's just no incentive to ever change, and some deeply rooted uses of sha-1 will be passed over and eventually found to be problematic. If git can't adapt, in theory competitors should step up and through all that newfound industrial funding eventually become less of a mess. The policy makes sense, even if it in practice results in some pretty silly trade-offs in the short term.

Of course, companies should just spend the money to get that 10% of the work done, but well, not everybody lives in the open source world, and I imagine a lot of the people who decide where budget goes just understand git as yet another product, not a community project that they have the power to modify. I imagine they also look at the competitors and don't see the problems, especially given they likely migrated to git at some point in the past, so it's just regressing back to the state of 10 years ago, which isn't that long in the kind of industry that cares about regulation like this.

Whatever happened to SHA-256 support in Git?
Posted Jun 23, 2022 19:48 UTC (Thu) by Cyberax (supporter, #52523)

The last time SHA-256 in git came up, I almost vomited from the clumsy format for hashes that they'd chosen. Hashes in SHA-1 are in binhex, why not instead just use a different alphabet to encode SHA-256 hashes? And it's not hard to do.
For example, instead of 0-9a-f use g-v. Or abuse the first letter of the hash as the version.

Copyright (c) 2022, Eklektix, Inc. Comments and public postings are copyrighted by their creators. Linux is a registered trademark of Linus Torvalds.