[HN Gopher] A Git story: Not so fun this time
       ___________________________________________________________________
        
       A Git story: Not so fun this time
        
       Author : thunderbong
       Score  : 272 points
       Date   : 2024-07-01 19:10 UTC (2 days ago)
        
 (HTM) web link (blog.brachiosoft.com)
 (TXT) w3m dump (blog.brachiosoft.com)
        
       | hoistbypetard wrote:
       | Thanks for sharing a fun read.
       | 
       | Bitkeeper was neat, and my overall take on it mirrors Larry
       | McVoy's: I wish he had open sourced it, made his nut running
       | something just like github but for Bitkeeper, and that it had
       | survived.
       | 
       | I only had one interaction with him. In the early '00s, I had
       | contributed a minor amount of code to TortoiseCVS. (Stuff like
       | improving the installer and adding a way to call a tool that
       | could provide a reasonable display for diffs of `.doc` and `.rtf`
       | files.) I had a new, very niche, piece of hardware that I was
       | excited about and wanted to add support for in the Linux kernel.
       | Having read the terms of his license agreement for Bitkeeper, and
       | intending to maintain my patches for TortoiseCVS, I sent him an
       | email asking if it was OK for me to use Bitkeeper anyway. He told
       | me that it did not look like I was in the business of version
       | control software (I wasn't!) and said to go ahead, but let him
       | know if that changed.
       | 
       | I use git all the time now, because thankfully, it's good enough
       | that I shouldn't spend any of my "innovation tokens" in this
       | domain. But I'd still rather have bitkeeper or mercurial or
       | fossil. I just can't justify the hit that being different would
       | impose on collaboration.
        
         | nmz wrote:
         | I wouldn't put fossil in that list of collaboration, since it's
         | not really a collaborative tool, or more like, there are
         | barriers to that collaboration, like creating a username for
         | each fossil repository. That's a huge barrier in my view. It
         | would be nice if there was something like a general auth
         | identity that can be used everywhere but that's still not
         | implemented.
         | 
         | FWIW, mercurial seems to have an advantage over git, and that's
         | support for BIG repositories, which seems to be provided by
         | facebook of all people, so until facebook moves to git,
         | mercurial lives on.
        
           | thunderbong wrote:
           | You can have one repository and link all the others to it via
           | "Login Groups"
           | 
           | https://www.fossil-scm.org/home/doc/trunk/www/caps/login-
           | gro...
        
         | sunshowers wrote:
         | Like I tell lots of people, check out Jujutsu. It's a very
         | Mercurial-inspired-but-better-than-it UI (the lead dev and I
         | worked on Mercurial together for many years) with Git as one of
         | the main supported backends. I've been using it full time for
         | almost a year now.
        
           | JoshTriplett wrote:
           | I would _love_ to use jujutsu, and it seems like a great
           | model. I think it'd be a bad outcome if the world starts
           | building on top of a piece of software with a single company
           | owner and a CLA, though.
           | 
           | I hope that the CLA goes away one day.
        
             | sunshowers wrote:
             | Note that the CLA does not transfer copyright, so "single
             | company owner" is not accurate from a copyright
             | perspective.
        
               | JoshTriplett wrote:
               | It's accurate from the perspective of "there's a single
               | company with the right to change the licensing
               | arbitrarily".
        
               | ilyagr wrote:
               | No, it is not accurate. That is not what Google's CLA
               | says.
               | 
               | OTOH, IANAL, but AFAIK anyone can fork `jj` and sell a
               | proprietary product based on jj (and distribute it under
               | pretty much whatever license they like, with very few
               | restrictions) because it is currently Apache licensed,
               | but that is unrelated to the Google CLA.
        
           | codetrotter wrote:
           | https://martinvonz.github.io/jj/latest/tutorial/
           | 
           | https://github.com/martinvonz/jj
           | 
           | Seems an interesting take indeed :)
        
         | yencabulator wrote:
         | I was a heavy user of BitKeeper.
         | 
         | To me, Git is almost exactly like a ground-up cleaner rewrite
         | of BitKeeper. Gitk and git-gui are essentially clones of the
         | BitKeeper GUI.
         | 
         | I don't understand why you'd want to keep using BitKeeper.
        
           | hoistbypetard wrote:
           | I think my memory is probably colored by BitKeeper being my
           | first DVCS. I was never a heavy user of it.
           | 
           | I was exposed to BitKeeper when I was managing my team's CVS
           | server. On my next team, we moved to svn, which always felt
           | like cvs with better porcelain from a developer perspective,
           | but when administering that server fell onto my plate, I
           | liked it a lot better than CVS. And I thought BitKeeper would
           | be nicer from a developer perspective.
           | 
           | Then on my next team, we used mercurial. I really, really,
           | really liked mercurial, both as a developer and as a dev
           | infrastructure administrator. It also sucked a lot less on
           | Windows than git or BitKeeper.
           | 
           | The last time I had to decide for a new team, mercurial and
           | git were the obvious options. I went with git because that
           | was clearly what the world liked best, and because bringing
           | new team members up to speed would require less from me that
           | way.
           | 
           | All that goes to say... my direct comparison of git and
           | bitkeeper came from when bitkeeper was mature and git
           | decidedly was not. Then I lumped it in with mercurial (which
           | I really would still prefer, right now) and fossil (ditto).
           | You're probably exactly right about BK.
        
           | cmrdporcupine wrote:
           | Conceptually git is more powerful. But I recall the bitkeeper
           | CLI being far more sensible in its interface.
        
             | yencabulator wrote:
             | It had its own weird quirks, and sometimes revealed that it
             | was a front for a single file with a lot of funnily-
             | formatted lines. We're just separated from it in time, and
             | you can only truly hate what is familiar.
        
       | superfish wrote:
       | Great read!
       | 
       | I'm sure I'm not the first to point out that Junio (the appointed
       | git "shepherd") works at Google where mercurial is the "recommend
       | local vcs" internally instead of git.
        
         | ilyagr wrote:
         | Large parts of Google rely on Git, most notably Chrome and
         | Android.
         | 
         | Also, it is a good thing if Junio can do his job independently
         | of Google's immediate needs.
        
       | mulmboy wrote:
       | > Additionally, Petr set up the first project homepage for Git,
       | git.or.cz, and a code hosting service, repo.or.cz. These websites
       | were the "official" Git sites until GitHub took over.
       | 
       | Is this true? I thought GitHub had no official affiliation with
       | the git project
        
         | jimbobthrowawy wrote:
         | I think some github employees have written code that went into
         | git, but it's not an _official_ affiliation.
         | 
         | The quotes on "official" imply non-official to me. i.e.
         | official seeming to people who don't know any better.
        
         | arp242 wrote:
         | That's why "official" is in quotes. As in: "de-facto standard".
        
           | cxr wrote:
           | Not really. git-scm.org is the de facto "official" site for
           | the Git project in about the same way that French is the de
           | facto "official" language of France.
           | 
           | They meant exactly what they wrote: GitHub took over hosting
           | duties for the official Git site (because they did).
        
         | roywashere wrote:
         | The git repo is on kernel.org nowadays with mirrors on
         | repo.or.cz and GitHub.
         | 
         | But I think they mean here what the official git project 'site'
         | is with docs and so on. And that is now https://git-scm.com/
         | and indeed as the article describes that was initially set up
         | by GitHub people, to promote git
        
       | xiwenc wrote:
       | It's been a while since I actually finished reading an article
       | this long. Very well written!
       | 
       | I tried to find out who the author is or how come he/she knows so
       | much. No luck. Does anyone else know, or would OP care to chip in?
        
       | cryptonector wrote:
       | > In a 2022 survey by Stack Overflow, Git had a market share of
       | 94%, ...
       | 
       | > Never in history has a version control system dominated the
       | market like Git. What will be the next to replace Git? Many say
       | it might be related to AI, but no one can say for sure.
       | 
       | I doubt it's getting replaced. It's not just that it's got so
       | much of the market, but also that the market is so much larger
       | than back in the days of CVS.
       | 
       | It's hard to imagine everyone switching from Git. Switching from
       | GitHub, feasible. From Git? That's much harder.
        
         | jbaber wrote:
         | It does feel like asking "What will replace ASCII?" Extensions,
         | sure, but 0x41 is going to mean 'A' in 5050 AD.
        
           | eliangcs wrote:
           | Author here. I don't think ASCII is the right comparison.
           | True, it would be really hard for anything to compete with
           | Git because a lot of infrastructures we have are already
           | deeply integrated with Git. But think about x86 vs. ARM and
           | how AI might change our ways of producing code.
        
         | fragmede wrote:
         | Git's shortcomings are well known by this point, so "all" a
         | successor project has to do is solve those problems. Git scales
         | to Linux kernel sized projects, but it turns out there are
         | bigger, even more complex projects out there, so it doesn't
         | scale to Google-sized organizations. You would want to support
         | centralized and decentralized operation, but be aware of both,
         | so it would support multiple remotes, while making it easier to
         | keep them straight. Is the copy on Github up to date with
         | gitlab, the CI system, and my laptop and my desktop? It would
         | have to handle binaries well, and natively, so I can check-in
         | my 100 MiB jpeg and not stuff things up. You'd want to use it
         | both as a monorepo and as multirepos, by allowing you to
         | checkout just a subtree of the monorepo. Locally, the workflow
         | would need to both support git's complexity, while also being
         | easier to use than git.
         | 
         | Anyway, those are the four things you'd have to hit in order to
         | replace git, as I see them.
         | 
         | If you had such a system, getting people off git wouldn't be
         | the issue - offer git compatibility and if they don't want to
         | use the advanced features, they can just keep using their
         | existing workflow with git. The problem with that, though, is:
         | then why use your new system?
         | 
         | Which gets to the point of, how do you make this exist as a
         | global worldwide product? FAANG-sized companies have their own
         | internal tools team to manage source code. Anywhere smaller
         | doesn't have the budget to create such a thing from scratch but
         | 
         | You can't go off and make this product and then sell it to
         | someone because how many companies are gonna go with an
         | unproven new workflow tool that their engineers want? What's
         | the TAM of companies for whom "git's not good enough", and have
         | large enough pocketbooks?
        
           | Borg3 wrote:
           | You are right. GIT is not a DVFS, it's a DVCS. It was made to
           | track source code, not binary data. If you are putting binaries
           | in a DVCS, you are doing something wrong.
           | 
           | But there are industries that need it, like the game industry.
           | So they should use a tool that allows that. I heard that
           | Plastic-SCM is pretty decent at it. I've never used it, so I
           | can't tell personally.
           | 
           | Replacing GIT is such a stupid idea. There is no ONE tool to
           | handle all cases. Just use the right one for your workflows. I,
           | for example, have a need to version binary files. I know GIT
           | handles them badly, but I really like the tool. Solution? I
           | wrote my own simple DVFS tool for that use case: dot.exe
           | (138KB)
           | 
           | It's a very simple DVFS for personal use, with peer-to-peer
           | syncing (local, TCP, SSH). Data and metadata are SHA-1
           | checksummed. It's pretty speedy for my needs :) After weeks of
           | use I liked it so much that I added pack storage to handle text
           | files and moved all my notes from SVN to DOT :)
        
             | ozim wrote:
             | Second that way of thinking, for me GIT is as good as it
             | gets for versioning text files.
             | 
             | Not handling binary files is not a downside for me because
             | GIT should not be a tool to handle binary files versioning
             | and we should use something else for that.
        
           | cryptonector wrote:
           | You say this, but Git has made great strides in scaling to
           | huge repositories in recent years. You can currently do the
           | "checkout just a subtree of the monorepo" just fine, and you
           | can use shallow clones to approximate a centralized system
           | (and most importantly to use less local storage).
           | 
           | > If you had such a system, getting people off git wouldn't
           | be the issue - offer git compatibility and [...]
           | 
           | Git is already doing exactly that.
        
             | vlovich123 wrote:
             | Git itself isn't though, not in any real way that matters.
             | Having to know all the sub trees to clone in a mono repo is
             | a usability nonstarter. You need a pseudo filesystem that
             | knows how to pull files on access. And one ideally
             | integrated with the build system to offset the cost of
             | doing remote operations on demand and improve parallelism.
             | Facebook is open sourcing a lot of their work but it's
             | based on mercurial. Microsoft is bought into git but afaik
             | hasn't open sourced their supporting git tooling that makes
             | this feasible.
             | 
             | TLDR: the problem is more complex and pretending like "you
             | can checkout a subtree" solves the problem is missing the
             | proverbial forest for the (sub)tree
        
               | neerajsi wrote:
               | Microsoft's vfs for git is open source. So is scalar.
               | These are the two main approaches used at Microsoft for
               | large repos. Unfortunately the technically superior vfs
               | approach was a nonstarter on macOS.
        
             | fragmede wrote:
             | > "checkout just a subtree of the monorepo"
             | 
             | How do I check out, eg
             | https://github.com/neovim/neovim/tree/master/scripts into a
             | directory and work with it as if it was a repo unto itself?
        
               | nolist_policy wrote:
               | You can't (since commits are snapshots of the repo root).
               | You can have this approximation however:
               | 
               |     git clone --filter=blob:none --sparse \
               |         https://github.com/neovim/neovim
               |     cd neovim
               |     git sparse-checkout add scripts
               | 
               | Unfortunately, GitHub does not support
               | --filter=sparse:oid=master:scripts, so blobs will be
               | fetched on demand as you use the repo.
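               | 
               | The "snapshots of the repo root" point can be sketched in
               | Python. This is a minimal illustration, not Git's own code:
               | it hashes blobs and trees following Git's object format
               | (SHA-1 over a "<kind> <size>" header, a NUL byte, and the
               | body). The two-directory repository, its file names, and
               | the helper names are hypothetical.

```python
import hashlib


def object_id(kind: str, body: bytes) -> bytes:
    """Hash an object Git-style: SHA-1 over '<kind> <size>', a NUL byte, then the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).digest()


def tree_id(tree: dict) -> bytes:
    """Return the id of a tree given as {name: bytes (file) or dict (subdirectory)}."""
    def sort_key(item):
        name, value = item
        # Git orders tree entries as if directory names ended in '/'.
        return (name + "/" if isinstance(value, dict) else name).encode()

    body = b""
    for name, value in sorted(tree.items(), key=sort_key):
        if isinstance(value, dict):
            mode, oid = b"40000", tree_id(value)
        else:
            mode, oid = b"100644", object_id("blob", value)
        body += mode + b" " + name.encode() + b"\0" + oid
    return object_id("tree", body)


# Hypothetical repository before and after a change under src/ only.
before = {"scripts": {"lint.sh": b"echo lint\n"}, "src": {"main.c": b"int main;\n"}}
after = {"scripts": {"lint.sh": b"echo lint\n"}, "src": {"main.c": b"int main(void);\n"}}

print(tree_id(before["scripts"]) == tree_id(after["scripts"]))  # subtree id is unchanged
print(tree_id(before) == tree_id(after))                        # root tree id differs
```

               | The scripts subtree keeps its id while the root tree id
               | changes. A commit points at the root tree, so every commit
               | describes the whole repository even when only one
               | directory was touched.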
        
       | cxr wrote:
       | There's a screenshot purporting to be of GitHub from May 2008.
       | There are tell-tale signs, though, that some or all of the CSS
       | has failed to load, and that that's not really what the site
       | would have looked like if you visited it at the time. Indeed, if
       | you check github.com in the Wayback Machine, you can see that its
       | earliest crawl was May 2008, and it failed to capture the
       | external style sheet, which results in a 404 when you try to load
       | that copy today. Probably best to just not include a screenshot
       | when that happens.
       | 
       | (Although it's especially silly in this case, since accessing
       | that copy[1] in the Wayback Machine reveals that the GitHub
       | website included screenshots of itself that look nothing like the
       | screenshot in this article.)
       | 
       | 1.
       | <https://web.archive.org/web/20080514210148/http://github.com...>
        
         | philipwhiuk wrote:
         | Thanks - I was struggling to believe GitHub would have launched
         | with something as bad looking - 2008 was not CERN era looking
         | webpages!
        
         | eliangcs wrote:
         | Author here. That's a good catch, thanks! I've replaced it with
         | a newer screenshot from August 2008.
        
           | cxr wrote:
           | Larry wants to call you and discuss two corrections to this
           | piece ("one minor, one major"). I've already passed on your
           | email address for good measure, but you should reach out to
           | him.
        
             | eliangcs wrote:
             | I've emailed him to follow up. Thanks for letting me know!
        
       | dudus wrote:
       | I never heard the term porcelain before, but I liked this tidbit.
       | 
       | "In software development terminology, comparing low-level
       | infrastructure to plumbing is hard to trace, but the use of
       | "porcelain" to describe high-level packaging originated in the
       | Git mailing list. To this day, Git uses the terms "plumbing" and
       | "porcelain" to refer to low-level and high-level commands,
       | respectively. "
       | 
       | Also, unrelated, the "Ruby people, strange people" video gave me
       | a good chuckle.
       | 
       | https://www.youtube.com/watch?v=0m4hlWx7oRk&t=1080s
        
       | nyanpasu64 wrote:
       | FYI Mercurial's developer is now known as Olivia Mackall; sadly
       | the Google infobox has failed to pick up the updated information.
        
         | eliangcs wrote:
         | Updated, thanks.
        
       | globular-toast wrote:
       | I've heard the story before but this was still fun to read. I
       | didn't realise quite how rudimentary the first versions of git
       | were. It really makes you wonder: was git the last opportunity to
       | establish a ubiquitous version control system? Will there ever be
       | another opportunity? Regardless of git's technical merits, one
       | thing I'm extremely happy about is that it's free software. It
       | seemed to come just before an avalanche of free software and
       | really changed the way things are done (hopefully for good).
        
         | ozim wrote:
         | It created the avalanche. I don't think scale of free software
         | we have now would be possible without git and GitHub.
        
       | sergius wrote:
       | This story is missing the impact that Tom Lord's TLA had on the
       | git design.
        
         | cxr wrote:
         | Previously:
         | <https://news.ycombinator.com/item?id=32155067#32157109>
        
       | JoshTriplett wrote:
       | > Tridge did the following.
       | 
       | > "Here's a BitKeeper address, bk://thunk.org:5000. Let's try
       | connecting with telnet."
       | 
       | Famously, Tridge gave a talk about this, and got the audience of
       | the talk to recreate the "reverse engineering". See
       | https://lwn.net/Articles/133016/ for a source.
       | 
       | > I attended Tridge's talk today. The best part of the
       | demonstration was that he asked the audience for each command he
       | should type in. And the audience instantly called out each
       | command in unison, ("telnet", "help", "echo clone | nc").
        
       | JoshTriplett wrote:
       | > In January 2006, the X Window team switched from CVS to Git,
       | which wowed Junio. He didn't expect such a big project like X
       | Window to go through the trouble of changing version control
       | systems.
       | 
       | It's the "X Window System" or just "X".
        
       | blacklion wrote:
       | Interesting, this story quotes (without attribution!) this comment
       | by Larry McVoy himself on HN
       | 
       | https://news.ycombinator.com/item?id=11671777
        
         | zerocrates wrote:
         | That part really should just be a straight quotation; the very
         | light rewording of the original post feels in poor form.
        
         | aidenn0 wrote:
         | The entire comment section on that post is a goldmine, thanks!
        
       | janvdberg wrote:
       | Exceptional read! I love it.
       | 
       | It's the most complete history of git that I know now.
       | Exceptional!
       | 
       | I'd love to read more historical articles like this one, of
       | pieces of software that have helped shape our world.
        
         | eliasson wrote:
         | Ditto. This was a really nice read!
        
       | ajkjk wrote:
       | Dang this is such a good read.
        
       | metadat wrote:
       | _> My biggest regret is not money, it is that Git is such an
       | awful excuse for an SCM. It drives me nuts that the model is a
       | tarball server. Even Linus has admitted to me that it's a crappy
       | design. It does what he wants, but what he wants is not what the
       | world should want._
       | 
       | Why is this crappy? What would be better?
       | 
       | Edit: @jasoneckert It can be pleasant to use and a crappy design
       | on the backend, the two aren't mutually exclusive. I'm much more
       | curious what Larry meant about WHY it's a "crappy design" and
       | exactly WHAT _should_ the world want? As you've noted, the
       | porcelain works well enough.
        
         | jasoneckert wrote:
         | As someone who has lived in Git for the past decade, I also
         | fail to see why Git is a crappy design. It's easy to
         | distribute, works well, and there's nothing wrong with a
         | tarball server.
        
           | trhway wrote:
           | Exactly. While the article is good about events history, it
           | doesn't go deep enough into the feature evolution. Which is:
           | 
           | TeamWare - somewhat easy branching (by copying whole
           | workspace from the parent and the bringover/putback of the
           | changes, good merge tool), the history is local, partial
           | commits.
           | 
           | BitKeeper added distributed mode, changesets.
           | 
           | Git added very easy branching, stash, etc.
           | 
           | Any other currently available source control usually is
           | missing at least one of those features. Very illustrative is
           | the case of Mercurial which emerged at about the same time
           | responding to the same need for the modern source control at
           | the time, yet was missing partial commits for example and had
           | much more cumbersome branching (like no local history or something
           | like this - I looked at it last more than a decade ago) -
           | that really allowed it to be used only in very strict/stuffy
           | settings, for everybody else it was a non starter.
        
         | luckydude wrote:
         | My issues with Git
         | 
         | - No rename support, it guesses
         | 
         | - no weave. Without going into a lot of detail, suppose someone
         | adds N bytes on a branch and then that branch is merged. The N
         | bytes are copied into the merge node (yeah, I know, git looks
         | for that and dedups it but that is a slow bandaid on the
         | problem).
         | 
         | - annotations are wrong, if I added the N bytes on the branch
         | and you merged it, it will (unless this is somehow fixed now)
         | show you as the author of the N bytes in the merge node.
         | 
         | - only one graph for the whole repository. This causes multiple
         | problems: A) the GCA is the repository GCA, it can be miles
         | away from the file GCA if there was a graph per file like
         | BitKeeper has. B) Debugging is upside down, you start at the
         | changeset and drill down. In BitKeeper, because there is a
         | graph per file, let's say I had an assert() pop. You run bk
         | revtool on that file, find the assert and look around to see
         | what has changed before that assert. Hover over a line, it will
         | show you the commit comments to the file and then the
         | changeset. You find the likely line, double click on it, now
         | you are looking at the changeset. We were a tiny company, we
         | never hit the claimed 25 people, and we supported tons of
         | users. This form of debugging was a huge, HUGE, part of why we
         | could support so many people. C) commit comments are per
         | changeset, not per file. We had a graphic check in tool that
         | walked you through the list of files, showed you the diffs for
         | that file and asked you to comment. When you got to the
         | ChangeSet file, now it is asking you for what Git asks for
         | comments but the diffs are all the file names followed by what
         | you just wrote. It made people sort of uplevel their commit
         | comments. We had big customers that insisted the engineers use
         | that tool rather than a command line that checked in everything with
         | the same comment.
         | 
         | - submodules turned Git into CVS. Maybe that's been redone but
         | the last time I looked at it, you couldn't do sideways pulls if
         | you had submodules. BK got this MUCH closer to correct, the
         | repository produced identical results to a mono repository if
         | all the modules were present (and identical less whatever isn't
         | populated in the sparse case). All with exactly the same
         | semantics, same functionality mono or many repos.
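         | 
         | The "repository GCA" in the list above is the merge base of two
         | commits in the single repository-wide graph. A minimal Python
         | sketch of that computation follows. The six-commit history and
         | the helper names are hypothetical, and this is neither Git's nor
         | BitKeeper's implementation.

```python
def ancestors(parents: dict, node: str) -> set:
    """All commits reachable from node through parent links, including node itself."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge_bases(parents: dict, a: str, b: str) -> set:
    """Common ancestors of a and b that are not ancestors of another common ancestor."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != other and c in ancestors(parents, other) for other in common)
    }


# Hypothetical history: each commit maps to its list of parents.
repo = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["C"], "F": ["D"]}

print(merge_bases(repo, "E", "F"))  # the branches E and F forked at B
```

         | With one graph for the whole repository, this base is the same
         | for every file in a merge, whichever files the two branches
         | actually touched.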
         | 
         | In summary, Git isn't really a version control system and Linus
         | has admitted it to me years ago. A version control system needs
         | to faithfully record everything that happened, no more or less.
         | Git doesn't record renames, it passes content across branches
         | by value, not by reference. To me, it feels like a giant step
         | backwards.
         | 
         | Here's another thing. We made a bk fast-export and a bk fast-
         | import that are compatible with Git. You can have a tree in BK,
         | have it updated constantly, and no matter where in the history
         | you run bk fast-export, you will get the same repository. Our
         | fast-export is idempotent. Git can't do that, it doesn't send
         | the rename info because it doesn't record that. That means we
         | have to make it up when doing a bk fast-import which means Git
         | -> BK is not idempotent.
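         | 
         | Making up rename information after the fact amounts to matching
         | deleted paths against added paths by content similarity. A
         | minimal Python sketch of that kind of guess follows. It uses
         | difflib from the standard library as a stand-in similarity
         | measure; the function name, the 0.5 threshold, and the file
         | contents are assumptions for illustration, not what Git or
         | bk fast-import actually do.

```python
from difflib import SequenceMatcher


def guess_renames(deleted: dict, added: dict, threshold: float = 0.5) -> dict:
    """Pair each deleted path with the most similar added path at or above threshold."""
    renames = {}
    unclaimed = dict(added)
    for old_path, old_text in deleted.items():
        best_path, best_score = None, threshold
        for new_path, new_text in unclaimed.items():
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            renames[old_path] = best_path
            del unclaimed[best_path]  # each added file is claimed at most once
    return renames


# Hypothetical change: util.c disappears, math.c and README appear.
deleted = {"util.c": "int add(int a, int b) { return a + b; }\n"}
added = {
    "math.c": "int add(int a, int b) { return a + b; }\n/* moved */\n",
    "README": "Build with make.\n",
}
print(guess_renames(deleted, added))
```

         | Because the pairing depends on a similarity score and a
         | threshold, a file that is renamed and heavily edited in the same
         | change can fall below the cutoff and show up as an unrelated
         | delete plus add.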
         | 
         | I don't expect to convince anyone of anything at this point,
         | someone nudged, I tried. I don't read hackernews any more so
         | don't expect me to defend what I said, I really don't care at
         | this point. I'm happier away from tech, I just go fish on the
         | ocean and don't think about this stuff.
        
       ___________________________________________________________________
       (page generated 2024-07-03 23:01 UTC)