[HN Gopher] A Git story: Not so fun this time
___________________________________________________________________
A Git story: Not so fun this time
Author : thunderbong
Score : 272 points
Date : 2024-07-01 19:10 UTC (2 days ago)
(HTM) web link (blog.brachiosoft.com)
(TXT) w3m dump (blog.brachiosoft.com)
| hoistbypetard wrote:
| Thanks for sharing a fun read.
|
| Bitkeeper was neat, and my overall take on it mirrors Larry
| McVoy's: I wish he had open sourced it, made his nut running
| something just like github but for Bitkeeper, and that it had
| survived.
|
| I only had one interaction with him. In the early '00s, I had
| contributed a minor amount of code to TortoiseCVS. (Stuff like
| improving the installer and adding a way to call a tool that
| could provide a reasonable display for diffs of `.doc` and `.rtf`
| files.) I had a new, very niche, piece of hardware that I was
| excited about and wanted to add support for in the Linux kernel.
| Having read the terms of his license agreement for Bitkeeper, and
| intending to maintain my patches for TortoiseCVS, I sent him an
| email asking if it was OK for me to use Bitkeeper anyway. He told
| me that it did not look like I was in the business of version
| control software (I wasn't!) and said to go ahead, but let him
| know if that changed.
|
| I use git all the time now, because thankfully, it's good enough
| that I shouldn't spend any of my "innovation tokens" in this
| domain. But I'd still rather have bitkeeper or mercurial or
| fossil. I just can't justify the hit that being different would
| impose on collaboration.
| nmz wrote:
| I wouldn't put fossil in that list of collaboration, since it's
| not really a collaborative tool, or more like, there are
| barriers to that collaboration, like creating a username for
| each fossil repository. That's a huge barrier in my view. It
| would be nice if there was something like a general auth
| identity that can be used everywhere but that's still not
| implemented.
|
| FWIW, mercurial seems to have an advantage over git, and that
| is support for BIG repositories, which seems to be provided by
| facebook of all people, so until facebook moves to git,
| mercurial lives on.
| thunderbong wrote:
| You can have one repository and link all the others to it via
| "Login Groups"
|
| https://www.fossil-scm.org/home/doc/trunk/www/caps/login-
| gro...
| sunshowers wrote:
| Like I tell lots of people, check out Jujutsu. It's a very
| Mercurial-inspired-but-better-than-it UI (the lead dev and I
| worked on Mercurial together for many years) with Git as one of
| the main supported backends. I've been using it full time for
| almost a year now.
| JoshTriplett wrote:
| I would _love_ to use jujutsu, and it seems like a great
| model. I think it'd be a bad outcome if the world starts
| building on top of a piece of software with a single company owner
| and a CLA, though.
|
| I hope that the CLA goes away one day.
| sunshowers wrote:
| Note that the CLA does not transfer copyright, so "single
| company owner" is not accurate from a copyright
| perspective.
| JoshTriplett wrote:
| It's accurate from the perspective of "there's a single
| company with the right to change the licensing
| arbitrarily".
| ilyagr wrote:
| No, it is not accurate. That is not what Google's CLA
| says.
|
| OTOH, IANAL, but AFAIK anyone can fork `jj` and sell a
| proprietary product based on jj (and distribute it under
| pretty much whatever license they like, with very few
| restrictions) because it is currently Apache licensed,
| but that is unrelated to the Google CLA.
| codetrotter wrote:
| https://martinvonz.github.io/jj/latest/tutorial/
|
| https://github.com/martinvonz/jj
|
| Seems an interesting take indeed :)
| yencabulator wrote:
| I was a heavy user of BitKeeper.
|
| To me, Git is almost exactly like a ground-up cleaner rewrite
| of BitKeeper. Gitk and git-gui are essentially clones of the
| BitKeeper GUI.
|
| I don't understand why you'd want to keep using BitKeeper.
| hoistbypetard wrote:
| I think my memory is probably colored by BitKeeper being my
| first DVCS. I was never a heavy user of it.
|
| I was exposed to BitKeeper when I was managing my team's CVS
| server. On my next team, we moved to svn, which always felt
| like cvs with better porcelain from a developer perspective,
| but when administering that server fell onto my plate, I
| liked it a lot better than CVS. And I thought BitKeeper would
| be nicer from a developer perspective.
|
| Then on my next team, we used mercurial. I really, really,
| really liked mercurial, both as a developer and as a dev
| infrastructure administrator. It also sucked a lot less on
| Windows than git or BitKeeper.
|
| The last time I had to decide for a new team, mercurial and
| git were the obvious options. I went with git because that
| was clearly what the world liked best, and because bringing
| new team members up to speed would require less from me that
| way.
|
| All that goes to say... my direct comparison of git and
| bitkeeper came from when bitkeeper was mature and git
| decidedly was not. Then I lumped it in with mercurial (which
| I really would still prefer, right now) and fossil (ditto).
| You're probably exactly right about BK.
| cmrdporcupine wrote:
| Conceptually git is more powerful. But I recall the bitkeeper
| CLI being far more sensible in its interface.
| yencabulator wrote:
| It had its own weird quirks, and sometimes revealed that it
| was a front for a single file with a lot of funnily-
| formatted lines. We're just separated from it in time, and
| you can only truly hate what is familiar.
| superfish wrote:
| Great read!
|
| I'm sure I'm not the first to point out that Junio (the appointed
| git "shepherd") works at Google where mercurial is the "recommended
| local vcs" internally instead of git.
| ilyagr wrote:
| Large parts of Google rely on Git, most notably Chrome and
| Android.
|
| Also, it is a good thing if Junio can do his job independently
| of Google's immediate needs.
| mulmboy wrote:
| > Additionally, Petr set up the first project homepage for Git,
| git.or.cz, and a code hosting service, repo.or.cz. These websites
| were the "official" Git sites until GitHub took over.
|
| Is this true? I thought GitHub had no official affiliation with
| the git project
| jimbobthrowawy wrote:
| I think some github employees have written code that went into
| git, but it's not an _official_ affiliation.
|
| The quotes on "official" imply non-official to me. i.e.
| official seeming to people who don't know any better.
| arp242 wrote:
| That's why "official" is in quotes. As in: "de-facto standard".
| cxr wrote:
| Not really. git-scm.org is the de facto "official" site for
| the Git project in about the same way that French is the de
| facto "official" language of France.
|
| They meant exactly what they wrote: GitHub took over hosting
| duties for the official Git site (because they did).
| roywashere wrote:
| The git repo is on kernel.org nowadays with mirrors on
| repo.or.cz and GitHub.
|
| But I think they mean here what the official git project 'site'
| is with docs and so on. And that is now https://git-scm.com/
| and indeed as the article describes that was initially set up
| by GitHub people, to promote git
| xiwenc wrote:
| It's been a while since I actually finished reading an article
| this long. Very well written!
|
| I tried to find out who the author is or how come he/she knows so
| much. No luck. Does anyone else know, or would OP care to chip in?
| cryptonector wrote:
| > In a 2022 survey by Stack Overflow, Git had a market share of
| 94%, ...
|
| > Never in history has a version control system dominated the
| market like Git. What will be the next to replace Git? Many say
| it might be related to AI, but no one can say for sure.
|
| I doubt it's getting replaced. It's not just that it's got so
| much of the market, but also that the market is so much larger
| than back in the days of CVS.
|
| It's hard to imagine everyone switching from Git. Switching from
| GitHub, feasible. From Git? That's much harder.
| jbaber wrote:
| It does feel like asking "What will replace ASCII?" Extensions,
| sure, but 0x41 is going to mean 'A' in 5050 AD.
| eliangcs wrote:
| Author here. I don't think ASCII is the right comparison.
| True, it would be really hard for anything to compete with
| Git because a lot of the infrastructure we have is already
| deeply integrated with Git. But think about x86 vs. ARM and
| how AI might change our ways of producing code.
| fragmede wrote:
| Git's shortcomings are well known by this point, so "all" a
| successor project has to do is solve those problems. Git scales
| to Linux kernel sized projects, but it turns out there are
| bigger, even more complex projects out there, so it doesn't
| scale to Google-sized organizations. You would want to support
| centralized and decentralized operation, but be aware of both,
| so it would support multiple remotes, while making it easier to
| keep them straight. Is the copy on Github up to date with
| gitlab, the CI system, and my laptop and my desktop? It would
| have to handle binaries well, and natively, so I can check-in
| my 100 MiB jpeg and not stuff things up. You'd want to use it
| both as a monorepo and as multirepos, by allowing you to
| checkout just a subtree of the monorepo. Locally, the workflow
| would need to both support git's complexity, while also being
| easier to use than git.
|
| Anyway, those are the four things you'd have to hit in order to
| replace git, as I see them.
|
| If you had such a system, getting people off git wouldn't be
| the issue - offer git compatibility and if they don't want to
| use the advanced features, they can just keep using their
| existing workflow with git. The problem with that, though, is:
| then why use your new system?
|
| Which gets to the point of, how do you make this exist as a
| global worldwide product? FAANG-sized companies have their own
| internal tools team to manage source code. Anywhere smaller
| doesn't have the budget to create such a thing from scratch but
|
| You can't go off and make this product and then sell it to
| someone because how many companies are gonna go with an
| unproven new workflow tool that their engineers want? What's
| the TAM of companies for whom "git's not good enough", and have
| large enough pocketbooks?
| Borg3 wrote:
| You are right. GIT is not a DVFS, it's a DVCS. It was made to
| track source code, not binary data. If you are putting binaries
| into a DVCS, you are doing something wrong.
|
| But there are industries that need it, like the game industry.
| So they should use a tool that allows that. I heard that
| Plastic-SCM is pretty decent at it. Never used it, so I can't
| tell personally.
|
| Replacing GIT is such a stupid idea. There is no ONE tool to
| handle all cases. Just use the right one for your workflows. I,
| for example, have a need to version binary files. I know GIT
| handles them badly, but I really like the tool. Solution? I
| wrote my own simple DVFS tool for that use case: dot.exe
| (138KB)
|
| It's a very simple DVFS for personal use, with peer-to-peer syncing
| (local, TCP, SSH). Data and metadata are SHA-1 checksummed.
| It's pretty speedy for my needs :) After weeks of use I
| liked it so much, I added pack storage to handle text files
| and moved all my notes from SVN to DOT :)
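
For readers unfamiliar with content checksumming, here is a minimal sketch. The comment above does not describe dot's on-disk format, so this does not reproduce it. As a known point of comparison, it shows how Git (in a default SHA-1 repository) names a file's contents: it hashes a short header followed by the raw bytes. The function name `git_blob_id` is made up for illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object id Git assigns to a blob holding these bytes."""
    # Git hashes a header ("blob <size>\0") followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The empty blob has the same well-known id in every SHA-1 repository.
    print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    # Changing the bytes changes the id, which is what lets a checksum
    # detect corruption and spot identical content.
    print(git_blob_id(b"notes v1\n") != git_blob_id(b"notes v2\n"))  # True
```

Because the id depends only on the bytes, two copies of the same file always get the same name, whichever machine they were added on.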
| ozim wrote:
| Second that way of thinking, for me GIT is as good as it
| gets for versioning text files.
|
| Not handling binary files is not a downside for me because
| GIT should not be a tool to handle binary files versioning
| and we should use something else for that.
| cryptonector wrote:
| You say this, but Git has made great strides in scaling to
| huge repositories in recent years. You can currently do the
| "checkout just a subtree of the monorepo" just fine, and you
| can use shallow clones to approximate a centralized system
| (and most importantly to use less local storage).
|
| > If you had such a system, getting people off git wouldn't
| be the issue - offer git compatibility and [...]
|
| Git is already doing exactly that.
| vlovich123 wrote:
| Git itself isn't though, not in any real way that matters.
| Having to know all the sub trees to clone in a mono repo is
| a usability nonstarter. You need a pseudo filesystem that
| knows how to pull files on access. And one ideally
| integrated with the build system to offset the cost of
| doing remote operations on demand and improve parallelism.
| Facebook is open sourcing a lot of their work but it's
| based on mercurial. Microsoft is bought into git but afaik
| hasn't open sourced their supporting git tooling that makes
| this feasible.
|
| TLDR: the problem is more complex and pretending like "you
| can checkout a subtree" solves the problem is missing the
| proverbial forest for the (sub)tree
| neerajsi wrote:
| Microsoft's vfs for git is open source. So is scalar.
| These are the two main approaches used at Microsoft for
| large repos. Unfortunately the technically superior vfs
| approach was a nonstarter on macOS.
| fragmede wrote:
| > "checkout just a subtree of the monorepo"
|
| How do I check out, eg
| https://github.com/neovim/neovim/tree/master/scripts into a
| directory and work with it as if it was a repo unto itself?
| nolist_policy wrote:
| You can't (since commits are snapshots of the repo root).
| You can have this approximation however:
|     git clone --filter=blob:none --sparse https://github.com/neovim/neovim
|     cd neovim
|     git sparse-checkout add scripts
|
| Unfortunately, GitHub does not support
| --filter=sparse:oid=master:scripts, so blobs will be
| fetched on demand as you use the repo.
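
The approximation above can be wrapped in a small helper when the same partial, sparse clone is needed for several repositories. This is only a sketch: `sparse_clone_commands` and `render` are hypothetical names, and the code builds the command lines without running them. It assumes a Git version recent enough to support `clone --sparse` and `sparse-checkout add`.

```python
import shlex


def sparse_clone_commands(url, directory, subdirs):
    """Build the git invocations for a blobless, sparse clone of `url`."""
    return [
        # Fetch commits and trees now and file contents on demand,
        # starting with only the top-level files checked out.
        ["git", "clone", "--filter=blob:none", "--sparse", url, directory],
        # Widen the checkout to the requested subdirectories.
        ["git", "-C", directory, "sparse-checkout", "add", *subdirs],
    ]


def render(commands):
    """Format the commands as shell lines for review before running them."""
    return "\n".join(shlex.join(command) for command in commands)


if __name__ == "__main__":
    commands = sparse_clone_commands(
        "https://github.com/neovim/neovim", "neovim", ["scripts"]
    )
    print(render(commands))
    # To actually run them: subprocess.run(command, check=True) for each.
```

Using `git -C <directory>` in place of `cd` keeps each command independent of the caller's working directory.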
| cxr wrote:
| There's a screenshot purporting to be of GitHub from May 2008.
| There are tell-tale signs, though, that some or all of the CSS
| has failed to load, and that that's not really what the site
| would have looked like if you visited it at the time. Indeed, if
| you check github.com in the Wayback Machine, you can see that its
| earliest crawl was May 2008, and it failed to capture the
| external style sheet, which results in a 404 when you try to load
| that copy today. Probably best to just not include a screenshot
| when that happens.
|
| (Although it's especially silly in this case, since accessing
| that copy[1] in the Wayback Machine reveals that the GitHub
| website included screenshots of itself that look nothing like the
| screenshot in this article.)
|
| 1.
| <https://web.archive.org/web/20080514210148/http://github.com...>
| philipwhiuk wrote:
| Thanks - I was struggling to believe GitHub would have launched
| with something as bad looking - 2008 was not CERN era looking
| webpages!
| eliangcs wrote:
| Author here. That's a good catch, thanks! I've replaced it with
| a newer screenshot from August 2008.
| cxr wrote:
| Larry wants to call you and discuss two corrections to this
| piece ("one minor, one major"). I've already passed on your
| email address for good measure, but you should reach out to
| him.
| eliangcs wrote:
| I've emailed him to follow up. Thanks for letting me know!
| dudus wrote:
| I never heard the term porcelain before, but I liked this tidbit.
|
| "In software development terminology, comparing low-level
| infrastructure to plumbing is hard to trace, but the use of
| "porcelain" to describe high-level packaging originated in the
| Git mailing list. To this day, Git uses the terms "plumbing" and
| "porcelain" to refer to low-level and high-level commands,
| respectively."
|
| Also, unrelated, the "Ruby people, strange people" video gave me
| a good chuckle.
|
| https://www.youtube.com/watch?v=0m4hlWx7oRk&t=1080s
| nyanpasu64 wrote:
| FYI Mercurial's developer is now known as Olivia Mackall; sadly
| the Google infobox has failed to pick up the updated information.
| eliangcs wrote:
| Updated, thanks.
| globular-toast wrote:
| I've heard the story before but this was still fun to read. I
| didn't realise quite how rudimentary the first versions of git
| were. It really makes you wonder: was git the last opportunity to
| establish a ubiquitous version control system? Will there ever be
| another opportunity? Regardless of git's technical merits, one
| thing I'm extremely happy about is that it's free software. It
| seemed to come just before an avalanche of free software and
| really changed the way things are done (hopefully for good).
| ozim wrote:
| It created the avalanche. I don't think scale of free software
| we have now would be possible without git and GitHub.
| sergius wrote:
| This story is missing the impact that Tom Lord's TLA had on the
| git design.
| cxr wrote:
| Previously:
| <https://news.ycombinator.com/item?id=32155067#32157109>
| JoshTriplett wrote:
| > Tridge did the following.
|
| > "Here's a BitKeeper address, bk://thunk.org:5000. Let's try
| connecting with telnet."
|
| Famously, Tridge gave a talk about this, and got the audience of
| the talk to recreate the "reverse engineering". See
| https://lwn.net/Articles/133016/ for a source.
|
| > I attended Tridge's talk today. The best part of the
| demonstration was that he asked the audience for each command he
| should type in. And the audience instantly called out each
| command in unison, ("telnet", "help", "echo clone | nc").
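
The demonstration quoted above amounts to sending plain text lines to a TCP port and reading whatever comes back. A minimal sketch of that idea follows, using a made-up line protocol: `ToyHandler`, its `help` reply, `send_line` and `demo` are all hypothetical and do not reproduce the real BitKeeper server or its commands.

```python
import socket
import socketserver
import threading


class ToyHandler(socketserver.StreamRequestHandler):
    """Hypothetical line-based server; NOT the real BitKeeper protocol."""

    def handle(self):
        command = self.rfile.readline().decode("ascii", "replace").strip()
        if command == "help":
            reply = "commands: help clone\n"
        else:
            reply = f"unknown command: {command}\n"
        self.wfile.write(reply.encode("ascii", "replace"))


def send_line(host, port, line, timeout=5.0):
    """Send one text line and return the full reply, roughly what
    `echo <line> | nc <host> <port>` does."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall((line + "\n").encode("ascii"))
        conn.shutdown(socket.SHUT_WR)  # end of input, like a closed pipe
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", "replace")


def demo():
    """Start the toy server on a free localhost port and ask it for help."""
    with socketserver.TCPServer(("127.0.0.1", 0), ToyHandler) as server:
        port = server.server_address[1]
        worker = threading.Thread(target=server.serve_forever, daemon=True)
        worker.start()
        try:
            return send_line("127.0.0.1", port, "help")
        finally:
            server.shutdown()


if __name__ == "__main__":
    print(demo(), end="")
```

A protocol that answers `help` in readable text can be explored with nothing more specialised than a generic TCP client.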
| JoshTriplett wrote:
| > In January 2006, the X Window team switched from CVS to Git,
| which wowed Junio. He didn't expect such a big project like X
| Window to go through the trouble of changing version control
| systems.
|
| It's the "X Window System" or just "X".
| blacklion wrote:
| Interesting, this story quotes (without attribution!) this comment
| by Larry McVoy himself on HN
|
| https://news.ycombinator.com/item?id=11671777
| zerocrates wrote:
| That part really should just be a straight quotation; the very
| light rewording of the original post feels in poor form.
| aidenn0 wrote:
| The entire comment section on that post is a goldmine, thanks!
| janvdberg wrote:
| Exceptional read! I love it.
|
| It's the most complete history of git that I know now.
| Exceptional!
|
| I'd love to read more historical articles like this one, of
| pieces of software that have helped shape our world.
| eliasson wrote:
| Ditto. This was a really nice read!
| ajkjk wrote:
| Dang this is such a good read.
| metadat wrote:
| _> My biggest regret is not money, it is that Git is such an
| awful excuse for an SCM. It drives me nuts that the model is a
| tarball server. Even Linus has admitted to me that it's a crappy
| design. It does what he wants, but what he wants is not what the
| world should want._
|
| Why is this crappy? What would be better?
|
| Edit: @jasoneckert It can be pleasant to use and a crappy design
| on the backend, the two aren't mutually exclusive. I'm much more
| curious what Larry meant about WHY it's a "crappy design" and
| exactly WHAT _should_ the world want? As you've noted, the
| porcelain works well enough.
| jasoneckert wrote:
| As someone who has lived in Git for the past decade, I also
| fail to see why Git is a crappy design. It's easy to
| distribute, works well, and there's nothing wrong with a
| tarball server.
| trhway wrote:
| Exactly. While the article is good on the history of events, it
| doesn't go deep enough into the feature evolution, which is:
|
| TeamWare - somewhat easy branching (by copying whole
| workspace from the parent and the bringover/putback of the
| changes, good merge tool), the history is local, partial
| commits.
|
| BitKeeper added distributed mode, changesets.
|
| Git added very easy branching, stash, etc.
|
| Any other currently available source control usually is
| missing at least one of those features. Very illustrative is
| the case of Mercurial which emerged at about the same time
| responding to the same need for the modern source control at
| the time, yet was missing partial commits, for example, and had
| much more cumbersome branching (like no local history or something
| like this - I last looked at it more than a decade ago) -
| that really allowed it to be used only in very strict/stuffy
| settings, for everybody else it was a non starter.
| luckydude wrote:
| My issues with Git
|
| - No rename support, it guesses
|
| - no weave. Without going into a lot of detail, suppose someone
| adds N bytes on a branch and then that branch is merged. The N
| bytes are copied into the merge node (yeah, I know, git looks
| for that and dedups it but that is a slow bandaid on the
| problem).
|
| - annotations are wrong, if I added the N bytes on the branch
| and you merged it, it will (unless this is somehow fixed now)
| show you as the author of the N bytes in the merge node.
|
| - only one graph for the whole repository. This causes multiple
| problems: A) the GCA is the repository GCA, it can be miles
| away from the file GCA if there was a graph per file like
| BitKeeper has. B) Debugging is upside down, you start at the
| changeset and drill down. In BitKeeper, because there is a
| graph per file, let's say I had an assert() pop. You run bk
| revtool on that file, find the assert and look around to see
| what has changed before that assert. Hover over a line, it will
| show you the commit comments to the file and then the
| changeset. You find the likely line, double click on it, now
| you are looking at the changeset. We were a tiny company, we
| never hit the claimed 25 people, and we supported tons of
| users. This form of debugging was a huge, HUGE, part of why we
| could support so many people. C) commit comments are per
| changeset, not per file. We had a graphic check in tool that
| walked you through the list of files, showed you the diffs for
| that file and asked you to comment. When you got to the
| ChangeSet file, now it is asking you for what Git asks for
| comments but the diffs are all the file names followed by what
| you just wrote. It made people sort of uplevel their commit
| comments. We had big customers that insisted the engineers use
| that tool rather than a command line that checked in everything with
| the same comment.
|
| - submodules turned Git into CVS. Maybe that's been redone but
| the last time I looked at it, you couldn't do sideways pulls if
| you had submodules. BK got this MUCH closer to correct, the
| repository produced identical results to a mono repository if
| all the modules were present (and identical less whatever isn't
| populated in the sparse case). All with exactly the same
| semantics, same functionality mono or many repos.
|
| In summary, Git isn't really a version control system and Linus
| has admitted it to me years ago. A version control system needs
| to faithfully record everything that happened, no more or less.
| Git doesn't record renames, it passes content across branches
| by value, not by reference. To me, it feels like a giant step
| backwards.
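
To make "it guesses" concrete: Git stores snapshots, and a rename is inferred afterwards by pairing a deleted path with an added path whose content is similar enough (by default at least 50% similar when rename detection is enabled for a diff). The sketch below illustrates that kind of guess with Python's `difflib`. It is not Git's actual similarity algorithm, and `guess_renames` is a hypothetical name.

```python
from difflib import SequenceMatcher


def guess_renames(deleted, added, threshold=0.5):
    """Pair deleted and added files by content similarity.

    `deleted` and `added` map path -> file text. Returns a list of
    (old_path, new_path, score) tuples, best matches first.
    """
    candidates = []
    for old_path, old_text in deleted.items():
        for new_path, new_text in added.items():
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score >= threshold:
                candidates.append((score, old_path, new_path))

    # Greedily accept the most similar pairs; each path is used only once.
    candidates.sort(reverse=True)
    renames, used_old, used_new = [], set(), set()
    for score, old_path, new_path in candidates:
        if old_path in used_old or new_path in used_new:
            continue
        used_old.add(old_path)
        used_new.add(new_path)
        renames.append((old_path, new_path, round(score, 2)))
    return renames


if __name__ == "__main__":
    deleted = {"util.py": "def add(a, b):\n    return a + b\n"}
    added = {
        "helpers.py": "def add(a, b):\n    return a + b\n",
        "README": "Project notes\n",
    }
    print(guess_renames(deleted, added))
```

A heuristic like this misses a rename whenever the file is also edited heavily in the same commit, which is the loss of information the comment objects to: a system that records the rename explicitly never has to guess.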
|
| Here's another thing. We made a bk fast-export and a bk fast-
| import that are compatible with Git. You can have a tree in BK,
| have it updated constantly, and no matter where in the history
| you run bk fast-export, you will get the same repository. Our
| fast-export is idempotent. Git can't do that, it doesn't send
| the rename info because it doesn't record that. That means we
| have to make it up when doing a bk fast-import which means Git
| -> BK is not idempotent.
|
| I don't expect to convince anyone of anything at this point,
| someone nudged, I tried. I don't read hackernews any more so
| don't expect me to defend what I said, I really don't care at
| this point. I'm happier away from tech, I just go fish on the
| ocean and don't think about this stuff.
___________________________________________________________________
(page generated 2024-07-03 23:01 UTC)