[HN Gopher] Git Branches: Intuition and Reality
       ___________________________________________________________________
        
       Git Branches: Intuition and Reality
        
       Author : tambourine_man
       Score  : 318 points
       Date   : 2023-11-23 14:43 UTC (8 hours ago)
        
 (HTM) web link (jvns.ca)
 (TXT) w3m dump (jvns.ca)
        
       | xanderlewis wrote:
       | > in general, even if people's intuition about a topic is
       | technically incorrect in some ways, people usually have the
       | intuition they do for very legitimate reasons!
       | 
       | This is worth an essay of its own.
        
         | webstrand wrote:
         | I'm still missing what part of the intuition is incorrect? It
         | seems like the only "incorrectness" is that there's no explicit
         | hierarchy of branches. Except that's wrong: the HEAD ref points
         | to the default branch. Any other branches are of equal
         | significance, though.
        
           | mgerdts wrote:
           | Intuition would be that the branch starts at the point that
           | it diverges from main, labeled "base" in the first diagram.
           | In reality, the first commit in "main" and "branch" are the
           | same commit.
           | 
           | Intuition likely comes from how a tree (fir or oak, not
           | binary) is structured. Generally a branch starts at the trunk
           | or some other branch, not at the ground where the trunk gives
           | way to roots.
        
             | jancsika wrote:
             | I don't agree with the author here.
             | 
             | Intuition is _travelling_ down a main path that has branches
             | which diverge and re-merge into the main path.
             | 
             | That's why people seem to intuitively get "merging" back
             | into main, whereas that doesn't generally make sense for
             | physical trees.
        
           | Sharlin wrote:
           | No, the HEAD ref points to whatever branch is "active",
           | that's how the active branch is defined. Indeed `git checkout
           | branchname` does nothing except make HEAD point to the commit
           | that `refs/heads/branchname` points to.
           | 
           | The intuition jvns meant is the idea that a branch only
           | constitutes the commits since the point of divergence, but
           | every branch actually contains the full history up to the
           | root of its tree, and `git log` of course shows that. (If you
           | want to only show the commits specific to a branch, you can
           | do `git log parent..branch`. Note also that two branches need
           | not have _any_ common history; it's perfectly possible for a
           | git graph to be disconnected.)
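
A rough way to check the two claims above in a throwaway repository. The branch names `main` and `feature`, the commit messages, and the demo identity are assumptions made for this sketch; nothing here comes from the linked article.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" on any git version
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git commit -q --allow-empty -m "base 1"
git commit -q --allow-empty -m "base 2"
git checkout -q -b feature
git commit -q --allow-empty -m "feature work"

# HEAD is a symbolic ref that names whichever branch is active.
git symbolic-ref HEAD                        # refs/heads/feature

# Everything reachable from the branch, back to the root commit.
all=$(git rev-list --count feature)          # 3
# Only what is reachable from feature but not from main.
own=$(git rev-list --count main..feature)    # 1
echo "$all $own"
```

`git log main..feature` uses the same range notation, so it would list just the one commit made after the point of divergence.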
        
             | xorcist wrote:
             | > `git checkout branchname` does nothing except make HEAD
             | point to the commit
             | 
             | You probably know this, but since we are being pedantic we
             | might as well get it right: That describes "git reset".
             | "git checkout" does that _and_ records that we are tracking
             | branchname. So any commits will move both HEAD and the
             | branchname reference.
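
One way to observe the "tracking" part is to compare a normal checkout with a detached one. This is only a sketch in a scratch repository; the branch name `feature` and the commit messages are made up for the illustration.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git commit -q --allow-empty -m "one"
git branch feature
start=$(git rev-parse feature)

# Attached: HEAD names the branch, so a commit moves HEAD and the branch.
git checkout -q feature
git symbolic-ref -q HEAD                 # refs/heads/feature
git commit -q --allow-empty -m "two"
attached_tip=$(git rev-parse feature)

# Detached: HEAD holds a bare commit id, so the branch is left behind.
git checkout -q --detach feature
git symbolic-ref -q HEAD || echo "HEAD is detached"
git commit -q --allow-empty -m "three"
detached_tip=$(git rev-parse feature)

echo "$start $attached_tip $detached_tip"
```

The last two ids printed should match each other and differ from the first: the branch moved while HEAD was attached to it and stayed put while HEAD was detached.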
        
         | mwexler wrote:
         | I guess.
         | 
         | To me, the opposite is a more worthy essay: why, with all the
         | power to customize our tech, do we create things that
         | consistently work differently than people's intuition?
         | 
         | The fact that it "mostly jibes" feels like a footgun, not a
         | feature.
         | 
         | I get that for some, "git just works! It made sense from day
         | one" but in my limited experience, 0% of people I've worked
         | with have said that.
         | 
         | Sure, we can all learn the tech. And expert techniques in any
         | field often don't jibe with naive expectations. But for me and
         | the folks I work with, the tech industry feels like it's
         | gliding more towards inscrutable tools vs ease of use.
         | 
         | We've hit a stage where many rely on code completion bots and
         | answer-supplying bots instead of being able to directly embrace
         | our tech. I wish the tech was more approachable on its own, but
         | perhaps this is the natural evolution of things.
        
           | xanderlewis wrote:
           | That doesn't seem like the opposite to me. It seems like the
           | same thing. Rather than rejecting people's intuition as
           | 'understandable but wrong' why don't we use it as the basis
           | for a better solution?
        
           | atq2119 wrote:
           | You do have a point, but it's not a slam dunk. Intuition
           | isn't some fixed thing but arises from personal experience. A
           | lot of that is common to a culture, but there are different
           | cultures and in any case, some truly personal aspects remain.
           | 
           | There needs to be a balance between creating new, more
           | powerful intuitions, and meeting people at the intuitions
           | they already have.
           | 
           | Case in point, Git's branching model is pretty intuitive when
           | you understand how Linux kernel development works. Perhaps 0%
           | of the people you've worked with have looked into that.
           | That's fine. Different cultures...
           | 
           | Another example that may be worth studying is mathematics and
           | the hard sciences. Learning those is a lot about learning
           | powerful intuitions.
        
             | BlueTemplar wrote:
             | Yeah, few things are actually "intuitive". "Shared
             | familiarity" is probably a better term.
        
           | eviks wrote:
           | Partially because that's a much harder design challenge,
           | especially for people with an unrelated skill set
        
           | chthonicdaemon wrote:
           | People writing software fall into several categories based on
           | the problem they're solving, the reason they're solving it
           | and the audience of the solution.
           | 
           | I solve my own problems for my own reasons all the time and
           | therefore other people's intuitions are immaterial in the
           | process. It would just slow me down to think "how would other
           | people use this" when I'm focused on some technical personal
           | problem.
           | 
           | Commercial software developers solve problems with the clear
           | purpose of selling the solution to others and where they know
           | ahead of time roughly what their audience's intuitions are.
           | This is why intuitive GUI applications exist - there are
           | whole industries devoted to finding out what people expect,
           | what lowers cognitive load etc. iOS and Android apps give you
           | a good idea of what is possible with modern tech when the
           | purposes are properly aligned.
           | 
           | The problem here is that git was expressly developed by Linus
           | to solve his own problem in a way that made sense to him with
           | no thought as to how other people would use it. There were no
           | focus groups, early betas, feedback from users and so on. At
           | best there have been slow fixes to the porcelain to fix the
           | stuff that bothers the people who could make a PR to git. On
           | the other hand there are also many front-end projects that
           | attempt to align some other person's idea of how version
           | control is supposed to work with the Git model.
           | 
           | Anyway - I am in the camp where I very seldom get confused
           | about a git thing because the actual expressed model is
           | really simple (in the way that x86 assembly is a "simpler"
           | language than Java). I find most front-ends much more
           | confusing because they don't seem to work the way I expect.
           | But I am never surprised when someone's pet project is
           | understandable only by themselves. Or indeed when a consumer
           | product is consumer-friendly. The real surprise is when a
           | lone programmer makes something for themselves that then goes
           | on to have wide appeal.
        
           | ndriscoll wrote:
           | As one of those people that thinks it's extremely intuitive,
           | I have to wonder where the confused people are learning about
           | git. The documentation on the site[0] is quite clear:
           | 
           | > A branch in Git is simply a lightweight movable pointer to
           | one of these commits. The default branch name in Git is
           | master. As you start making commits, you're given a master
           | branch that points to the last commit you made. Every time
           | you commit, the master branch pointer moves forward
           | automatically.
           | 
           | It has multiple diagrams explaining how commits point to
           | their content and their parents, and branches point to
           | commits. The Pro Git content has been there for at least 10
           | years (it's what I learned from 10 years ago).
           | 
           | Maybe the problem is just that the Internet is full of blogs
           | that have incorrect diagrams (like those in the OP) and bad
           | explanations, despite the main website having great
           | documentation!
           | 
           | [0] https://git-scm.com/book/en/v2/Git-Branching-Branches-in-
           | a-N...
        
             | chihuahua wrote:
             | If Git was "extremely intuitive", and the documentation was
             | "great", why would so many otherwise smart people keep
             | writing blogs about it with incorrect diagrams?
             | 
             | What is your theory about why so many people are having
             | difficulty creating a correct mental model about Git, and
             | why so many people are writing incorrect blogs about it?
        
               | ndriscoll wrote:
               | Like I sort of implied, my theory is people haven't read
               | the docs on the official site (or the book that's on the
               | site), and keep regurgitating bad information that they
               | read on some blog or howto site. I don't know why they do
               | this. I don't make these sites, so I don't know what
               | motivates people who do, especially people who don't
               | understand what they're writing about.
               | 
               | If you understand the basic design premise (commits are
               | content-addressed immutable snapshots), the pointer stuff
               | is kind of obvious. It _has_ to work something like that
               | for it to be able to be immutable if you want to be able
               | to make branches/tags after the commit is created.
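
A small sketch of that premise, run in a scratch repository. The names `later-branch` and `later-tag` are hypothetical labels chosen for the example.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Object ids are derived from content: same bytes in, same id out.
id_a=$(printf 'hello\n' | git hash-object --stdin)
id_b=$(printf 'hello\n' | git hash-object --stdin)
id_c=$(printf 'hello!\n' | git hash-object --stdin)
echo "$id_a $id_b $id_c"

git commit -q --allow-empty -m "snapshot"
before=$(git rev-parse HEAD)

# Labels are attached after the fact; the commit itself is not rewritten.
git branch later-branch
git tag later-tag
after=$(git rev-parse HEAD)

git rev-parse later-branch later-tag     # both print the same id as $before
```

Since the commit id cannot change without changing the content, the only place left to record "this is a branch" is outside the commit, which is where refs live.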
        
               | afiori wrote:
               | In part it is because git is hard to use, in part it is
               | because mostly people learn git by oral tradition and
               | often treat it like sorcery.
        
           | marcosdumay wrote:
           | Often, people's intuition is wrong in very important ways,
           | and something that works like they expect is sure to create
           | footguns or just blow up by itself.
           | 
           | But I'm not sure git is a case of this. The DVCS that were
           | created following people's intuitions were known to be slow
           | and internally complex, but I have never heard about them
           | failing. (And the slowness is obviously of a kind that can be
           | optimized away.)
           | 
           | We're just stuck with the worst UI ever devised in public for a
           | VCS because of network effects.
        
             | chihuahua wrote:
             | I totally agree with it being "the worst UI ever devised".
             | It's fine to use commits with parent pointers and branches
             | as pointers to commits and all the other stuff internally.
             | But there should be a UI wrapped around that that maps to
             | operations that make sense for the purpose of working on a
             | software project.
             | 
             | Not this:
             | 
             | git merge [-n] [--stat] [--no-commit] [--squash]
             |   [--[no-]edit] [--no-verify] [-s <strategy>]
             |   [-X <strategy-option>] [-S[<keyid>]]
             |   [--[no-]allow-unrelated-histories]
             |   [--[no-]rerere-autoupdate] [-m <msg>] [-F <file>]
             |   [--into-name <branch>] [<commit>...]
        
           | codesnik wrote:
           | because a) everyone's intuition is different, b) sometimes
           | uneducated intuition is just wrong. On a surface level things
           | look good, but in some specific situations the intuitive way
           | of doing things can be inconsistent or have no solution at
           | all. In those cases you're just stuck with a magic box of
           | software which did _something_, you have no idea what, and
           | you reach for a backup.
           | 
           | Git is not like that. It is very, very simple. If you learn
           | the basics of it, your intuition will align with git's
           | "intuition" too, and you can do crazy things with total peace
           | of mind, without googling or looking into source code of git
           | to see how they had to make something "intuitive" in some
           | definition of the word.
        
           | timacles wrote:
           | This is the same reasoning that SQL gets criticized with. But
           | the answer is simple.
           | 
           | Git (and SQL) range from simple tasks to very complicated ones.
           | Everyone likes to fantasize about making it easier but
           | they're only thinking about the fraction of functionality
           | they use, rather than everything it currently does.
           | 
           | If someone could come up with a simpler solution they would,
           | but they can't because git can do extremely complicated
           | things and is internally consistent. Most people
           | underestimate that part
        
         | informalo wrote:
         | Yup. If it works, it ain't stupid.
        
       | adaboese wrote:
       | I cannot be the only one that gets away with only knowing:
       |     git pull
       |     git merge x
       |     git checkout [-b] foo
       |     git commit
       |     git push
        
         | mixedmath wrote:
         | Possibly `git branch NEWBRANCHNAME` instead of `git checkout -b
         | NEWBRANCHNAME`. When I need to show git to someone in order for
         | them to contribute to something, I give them only these
         | incantations --- and instructions to ask me if weird git things
         | happen.
        
           | rkangel wrote:
           | You then need to do both `git branch xxx` and `git checkout
           | xxx` though.
           | 
           | If you teach "checkout to move around and add -b when moving
           | to a new branch the first time" that works pretty well
        
           | still_grokking wrote:
           | It's `git switch [-c]` nowadays.
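
For comparison, the three spellings mentioned in this subthread side by side. This sketch assumes git 2.23 or newer for `git switch`; the branch names `one`, `two` and `three` are placeholders.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
git commit -q --allow-empty -m "start"

# Two steps: create the branch, then move to it.
git branch one
git checkout -q one
git checkout -q main

# One step, older spelling.
git checkout -q -b two
git checkout -q main

# One step, newer spelling.
git switch -q -c three

git rev-parse one two three        # three identical commit ids
git symbolic-ref --short HEAD      # three
```

All three branches end up pointing at the same commit; the only difference is how many commands it took and which branch is active afterwards.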
        
             | someone7x wrote:
             | I've given up trying to configure git to push the branch I'm
             | on, so I type this little dance each time:
             |     git switch -c foo
             |     git push
             |     > did you mean git push --args-with-branch-name?
             |     sigh, copy, paste, enter
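
One possible way out of that dance is the `push.default` setting. The sketch below sets it per repository and uses a local bare repository as a stand-in remote; the paths and the branch name `foo` are only for the demonstration. Newer versions (git 2.37 and later) also have `push.autoSetupRemote`, which additionally records the upstream on the first push.

```shell
set -e
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q --bare "$tmp/remote.git"     # stand-in for a hosted remote
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$tmp/remote.git"

# A bare `git push` now pushes the current branch to a branch of the
# same name on the remote, without an upstream being configured first.
git config push.default current

git commit -q --allow-empty -m "start"
git checkout -q -b foo
git push -q

git ls-remote --heads origin             # lists refs/heads/foo
```

Using `git config --global` instead of the per-repository form would apply the setting everywhere; that is left out here so the sketch does not touch anyone's real configuration.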
        
         | bravetraveler wrote:
         | I manage with even less, any merging under my watch happens as
         | a strategy with pulling
        
         | _ZeD_ wrote:
         | FWIW that's 99% of my usage of git
         | 
         | (well... it's about 1% because I do everything using the
         | Eclipse git UI, but that's the same behavior you get from those
         | commands)
        
         | have_faith wrote:
         | I _know_ a decent amount of git, but day to day I use GUIs
         | (Sublime Merge).
        
         | ryanjshaw wrote:
         | The most important command:
         |     git reset --hard
        
           | kreeben wrote:
           | This is my favorite. It allows one to easily create "service"
           | branches based on tags where you apply to that tag a select
           | set of commits from the development branch that you can then
           | easily deploy to PROD without including the rest of the
           | (perhaps not sufficiently tested) commits and without having
           | a convoluted branching strategy.
        
           | jwestbury wrote:
           | Or:
           |     git reset --soft HEAD~1
        
         | sbergot wrote:
         | protip: If you want to switch branch you can now use "git
         | switch [-c] foo". If you want to restore files you can do "git
         | restore .".
         | 
         | Basically you can stop using checkout.
         | 
         | *edit*: fixed switch branch creation parameter.
        
           | drdec wrote:
           | I think perhaps you meant "git switch [-c] foo"
        
             | sbergot wrote:
             | Correct thank you.
        
           | Am4TIfIsER0ppos wrote:
           | "THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE."
           | 
           | I will keep using them so I can keep using old software. How
           | new is it? Does ubuntu or debian have it?
        
             | sbergot wrote:
             | They were introduced in Git 2.23, released in August
             | 2019.
        
               | Macha wrote:
               | And it's in Ubuntu 22.04 and Debian 11.
        
         | world2vec wrote:
         | `git add` sometimes
        
         | gear54rus wrote:
         | Because your IDE shows line-based blame and lets you check out
         | old file versions via right-click? :) Me too
        
         | baggy_trough wrote:
         | Yes, this is very good. Also, rebasing is evil.
        
         | Sharlin wrote:
         | I definitely can't work without `git add`. Other commands I use
         | daily or almost-daily (frequently enough that I have aliases
         | for them) are `git add -p`, `git commit --amend`, `git rebase`,
         | `git rebase -i`, `git stash`...
         | 
         | Then there's of course `git log`, `git diff`, and `git status`,
         | but I presume you know about those as well.
        
         | Patrol8394 wrote:
         | To clean up the history before asking for a review.
         | 
         | git rebase HEAD~2 -i
         | 
         | git commit --amend
        
         | mrkeen wrote:
         | For me, git becomes unintelligible when there's a crazy train-
         | track map of branching and merging.
         | 
         | So I usually do whatever I can to keep a single straight-line
         | master history.
         | 
         | I branch off for a task, then after a while my branch isn't
         | joined at the tip of master. So I rebase locally until it is.
         | Then when the PR happens, master gets my changes added to the
         | top, with no extra noise from merge commits.
         | 
         | Even if the local-rebase workflow is slightly more complicated,
         | the payoff is a really clean history, making future reasoning
         | about branches much easier. Not to mention merge conflicts are
         | easier to solve when you rebase early & often.
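
A minimal sketch of that workflow in a scratch repository, with the pull request step replaced by a local fast-forward merge. The branch name `task` and the file names are invented for the example.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

echo base > base.txt && git add base.txt && git commit -q -m "base"

# Branch off for a task.
git checkout -q -b task
echo task > task.txt && git add task.txt && git commit -q -m "task work"

# Meanwhile, main moves on.
git checkout -q main
echo other > other.txt && git add other.txt && git commit -q -m "other work"

# Replay the task branch on top of the current tip of main...
git checkout -q task
git rebase -q main

# ...so that main can take it as a fast-forward, with no merge commit.
git checkout -q main
git merge -q --ff-only task

git log --oneline --graph
commits=$(git rev-list --count main)            # 3
merges=$(git rev-list --count --merges main)    # 0
```

The graph printed at the end should be a single straight line, which is the "clean history" payoff described above.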
        
           | yCombLinks wrote:
           | That's because you've apparently used some command to look at
           | that history, none of his commands ever show that
        
           | remus wrote:
           | I think this is a pretty sensible approach. Git feels like
           | one of those tools where you are given a lot of power but
           | it's your responsibility to use it in a sensible way. A bit
           | like with Excel or spreadsheets. You can do a lot but you can
           | also make a big mess pretty quickly.
        
           | silon42 wrote:
           | In my observation, the problem comes when there is a "merge
           | main to the feature branch before merging"
           | 
           | This step is when newbies get confused because the diffs are
           | the wrong way, and I've seen people often losing merge hunks
           | from master when there are conflicts which can be disastrous
           | (not a git issue, I've seen people do the same in SVN).
           | 
           | The proper solution IMO would be for git to have a "merge
           | --reintegrate" which would do the opposite merge: take the
           | main branch and merge the current feature branch to it...
           | after success, you have a new feature branch.
           | 
           | This is why I also prefer rebase and cleaner history (but a
           | common mistake here is to squash after PR approval... that
           | should be done before, not after).
        
           | gumby wrote:
           | > For me, git becomes unintelligible when there's a crazy
           | train-track map of branching and merging.
           | 
           | Consider finding peace by inverting your perspective: if your
           | organization's development process (for whatever reason, many
           | of them legitimate) involves a crazy train-track of features
           | being developed in parallel, isn't it great that you're all
           | at least using something that can keep track of it?
           | 
           | For a small team that should be unnecessary.
           | 
           | Frequent rebasing when you're on a side branch of development
           | is smart, but doesn't conflict with my point.
           | 
           | Also, really, who looks back into the depths of history?
           | There's a reason a lot of backup schemes rotate a set of
           | tapes over 30 or even 14 days. For that reason I am not a fan
           | of rewriting history for "clarity": I consider it wasted
           | effort.
           | 
           | For the same reason I don't care about branches for
           | explorations that turned out to go nowhere -- just mark the
           | head abandoned and stop worrying about it.
        
         | saltserv wrote:
         | you can't make do with only that if you're in a team of 2 even
        
         | js2 wrote:
         | Git is conceptually simple but has a baroque UI. I really
         | recommend spending 15 minutes to understand it conceptually:
         | 
         | objects: blobs, trees, commits, annotated tags
         | 
         | refs: branches (local, remote-tracking), tags, HEAD
         | 
         | other: working tree, index (aka cache aka staging area),
         | remotes
         | 
         | There's lots of good guides out there: git from the bottom up,
         | git for computer scientists, and the git parable are some that
         | spring to mind. The Git Book is also excellent but it's more
         | than 15 minutes of your time.
         | 
         | All the commands become way less mystifying when you understand
         | what they are manipulating. You'll never get into a state where
         | you want to `rm -rf` the entire repo and start with a new
         | clone. There will hopefully be no more teeth grinding or
         | keyboard mashing.
         | 
         | I've used a lot of VCSs over the years (rcs, sccs, cvs,
         | subversion, clearcase, mercurial, git) and I swear git is the
         | one I find least frustrating. The others may have had simpler
         | interfaces, but they were either conceptually more complex,
         | overly rigid in their design and behavior, or both (looking at
         | you clearcase).
         | 
         | Look, I get it: a lot of folks see git as a necessary evil
         | that's part of their day job. I disagree. I think it's really
         | worth spending the time to learn well, probably like you
         | invested some time in your editor and other tooling.
         | 
         | I mean, it's easier than C++. :-)
        
         | GuB-42 wrote:
         | You are probably missing "git add" but otherwise, it is fine...
         | unless you fuck up.
         | 
         | Unfucking things is what most of my knowledge of git goes to.
         | Committing in the wrong branch, the wrong files, starting from
         | the wrong commit, etc... Before you push, almost all mistakes
         | are fixable, but it requires knowing a few more commands.
         | 
         | Then there are the project specific things. For instance, I
         | worked on a project where we didn't have a central server we
         | could push to and pull from (airgap). So we had to work with
         | bundles. Git does that really well (it really is
         | decentralized), but it is uncommon. Some people prefer a
         | rebase-based workflow, some use cherry-picking extensively,
         | some projects are more prone to conflicts than others, etc...
        
         | 2-718-281-828 wrote:
         | you don't need git add? no git status? i take my hat off to
         | you, sir!
        
         | airstrike wrote:
         | that's my list too, pretty much (+ the obvious `git add` and
         | `git status` others have mentioned)
         | 
         | I throw in a `git reflog` once in a while, and the other day I
         | patted myself on the back for my first use of `git tag -a mytag
         | -m "This is my first tag!"` followed by `git push origin
         | mytag`. I felt like god
         | 
         | now as much as I hate Xcode, its UI for looking at all my
         | current changes and staging them by blocks of line for commit
         | is like a superpower. as a solo developer working on brand new
         | code, not all of my lines of thinking follow very atomic git
         | commits, so it's nice to separate, say, some refactoring code
         | from some actually new functionality when committing changes
        
         | dahart wrote:
         | You're definitely not the only one, but you will level up if
         | you learn branching. After that you can become the guru amongst
         | your peers and a god among men if you learn how to use 'git
         | reflog'. ;) That will help you learn how to fix almost any git
         | accident.
        
         | chrisweekly wrote:
         | you're def not the only one. but learning more pays dividends
         | in confidence and resilience. fyi, pull is just shorthand for
         | fetch && merge
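
Spelled out as two steps, assuming git's default configuration (where pull merges rather than rebases). The `upstream` and `clone` directories are scratch repositories created just for this sketch.

```shell
set -e
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q "$tmp/upstream"
git -C "$tmp/upstream" symbolic-ref HEAD refs/heads/main
git -C "$tmp/upstream" commit -q --allow-empty -m "first"
git clone -q "$tmp/upstream" "$tmp/clone"
git -C "$tmp/upstream" commit -q --allow-empty -m "second"

cd "$tmp/clone"

# Step 1: download new commits and update origin/main.
# The local main branch and the working tree are not touched.
git fetch -q origin
behind=$(git rev-list --count main..origin/main)   # 1

# Step 2: bring the local branch up to date.
git merge -q --ff-only origin/main

echo "was behind by $behind, now at $(git rev-parse --short main)"
```

Doing the two steps separately leaves room to inspect what arrived (for example with `git log main..origin/main`) before merging it.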
        
         | paulddraper wrote:
         | Really?? I would have a very hard time without:
         |     git log
         |     git show
         |     git status
         |     git blame
         |     git diff
         |     git add
         |     git reset
        
           | alex_smart wrote:
           | My "minimalist" list of git commands:
           |     git add
           |     git blame
           |     git branch
           |     git checkout
           |     git cherry-pick
           |     git clone
           |     git commit
           |     git diff
           |     git fetch
           |     git log
           |     git merge
           |     git pull
           |     git push
           |     git reset
           |     git rm
           |     git stash
           |     git status
           | 
           | 17 commands in total. I don't think it is possible to be a
           | professional software engineer without being familiar with
           | them. Granted, some of these you may need more frequently
           | than others.
           | 
           | My guess is, the OC relies on their IDE/editor for the
           | functionality provided by some of these commands. But then,
           | why not just go all the way. Just use all VC features
           | provided by your IDE, and claim that you need zero git
           | commands.
        
         | xorcist wrote:
         | Unless you are superhuman and never make mistaeks, you should
         | put "git rebase" (with and without -i) on top of your list of
         | things to learn.
        
           | alex_smart wrote:
           | Let us also hope they do a git diff before pushing changes.
        
         | leonheld wrote:
         | I'm nothing without git rebase -i.
        
         | bmacho wrote:
         | If you get errors, save your work elsewhere, delete the
         | project, and download a fresh copy.
        
         | seba_dos1 wrote:
         | You don't _need to_ know many more commands than those - maybe
         | aside from "reset", "rebase" and "fetch", which definitely come
         | in handy regularly, and maybe a few more for showing status or
         | browsing the commit graph (unless you use some GUI for that) - as
         | whenever you need anything else you'll usually just look it up
         | anyway, either in the man pages or on the Web. However, if your
         | _mental model_ of git is limited to these commands only, you're
         | doing yourself a disservice that leads to
         | https://xkcd.com/1597/
        
       | zdw wrote:
       | A lot of things in git are just pointers to commits, and then the
       | git implementation handles them under the covers in some way that
       | usually makes sense but not always.
       | 
       | One example that also bites people: moving files isn't stored in
       | git - if you move files (even with `git mv`) and create a new
       | commit, the moves aren't stored, but this is reconstructed later
       | by the client based on similarity, which comes from the diff
       | algorithm.
       | 
       | And git has _multiple diff algorithms_ to pick from:
       | https://git-scm.com/docs/git-config#Documentation/git-config...
       | 
       | And optionally to _not detect renames_ in diff output with
       | `diff.renames`:
       | https://git-scm.com/docs/git-config#Documentation/git-config...
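
A quick way to see this for yourself, sketched in a scratch repository with made-up file names. The same pair of commits is shown as a rename or as a delete plus an add, depending only on how the diff is asked to present it.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

printf 'one\ntwo\nthree\n' > old.txt
git add old.txt && git commit -q -m "add file"
git mv old.txt new.txt
git commit -q -m "move file"

# Both paths refer to the very same blob; no "move" is recorded anywhere.
old_blob=$(git rev-parse HEAD~1:old.txt)
new_blob=$(git rev-parse HEAD:new.txt)

# Rename detection switched on: the pair is reported as a rename.
with=$(git diff --name-status -M HEAD~1 HEAD)
# Rename detection switched off: one file deleted, one file added.
without=$(git diff --name-status --no-renames HEAD~1 HEAD)

printf '%s\n---\n%s\n' "$with" "$without"
```

With identical content the rename shows up with a 100% similarity score; once the file is also edited in the same commit, the score drops and detection depends on the similarity threshold.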
        
         | keybored wrote:
         | Yup. "Storing moves" is the kind of thing that might sound
         | intuitively obvious but then gets gnarly and non-obvious when
         | you think about it for five minutes. And so something that
         | might be "obvious" to do then turns out to be so non-obvious--
         | how to catch all file moves (intent) outside of simple
         | identical content cases, and how do you represent them
         | internally?--that you realize that just using snapshots is
         | really the best thing to do.
        
           | hyperthesis wrote:
           | BitKeeper already did it.
        
             | lubutu wrote:
             | I think this is the one thing I feel BitKeeper does better
             | than Git. Git can get confused about where a file came
             | from, for moves but especially for copies, and so the
             | version history ends, even if you ask it to try and follow
             | along. BitKeeper, on the other hand, keeps the moves and
             | copies as part of the history, so you can always trace it
             | through to the origin of the file, no matter how
             | circuitous.
        
               | account42 wrote:
               | git log has --follow but unfortunately it only works when
               | specifying a single file and not e.g. a whole directory.
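
A small sketch of what `--follow` changes for a single renamed file. The repository and file names are placeholders.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

printf 'one\ntwo\nthree\n' > old.txt
git add old.txt && git commit -q -m "add file"
git mv old.txt new.txt
git commit -q -m "move file"

# Without --follow, history for the path stops at the rename.
plain=$(git log --format=%s -- new.txt | wc -l | tr -d ' ')
# With --follow, git keeps going under the file's previous name.
followed=$(git log --follow --format=%s -- new.txt | wc -l | tr -d ' ')

echo "$plain $followed"    # 1 2
```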
        
           | chrismorgan wrote:
           | It's completely trivial. The obvious and correct place is in
           | the commit object just like author and date and such, since
           | renaming is semantically part of the commit, not the tree:
           | 
           |     commit 0123456789abcdef0123456789abcdef01234567
           |     parent fedcba9876543210fedcba9876543210fedcba98
           |     author Nemo <nemo@example.invalid> 1234567890 +0000
           |     committer Nemo <nemo@example.invalid> 1234567890 +0000
           |     rename-from path1.old
           |     rename-to path1.new
           |     rename-from path2.old
           |     rename-to path2.new
           | 
           |     Commit message
           | 
           | And you don't _detect_ moves (because that's madness), but
           | require that people record them deliberately, just like every
           | other VCS has done. There's even git-mv already, it just
           | skips a step that every other VCS's equivalent command would
           | do. (And technically this all works out because the index is
           | a commit, so you can record the rename normally.)
           | 
           | Of course, all of this assumes that moving a file is a
           | meaningful operation. Perhaps ideally (for most languages and
           | systems) you'd track this in far smaller chunks, so that you
           | can track changes to a function even when it alone was moved
           | to a different file. But things like Git aren't interested in
           | those kinds of semantics, and work technically at the file
           | level, more or less, so I think it should track renames
           | because _in practice_ straightforward renames are _super_
           | common, but often also involve other changes that thwart
           | rename detection. Years ago Linus explained why he didn't
           | like storing moves (someone else has linked it), but I'm
           | largely not sold on his reasoning--the theory of the
           | perfect has hindered the useful, and file renames _are_
           | commonly meaningful in ways more than he said.
        
         | 2-718-281-828 wrote:
         | > moving files isn't stored in git
         | 
         | is there an intuitive and enlightening explanation as to why it
         | is this way?
        
           | keybored wrote:
           | Git stores snapshots and that's it. The whole tree, not per-
           | file.
           | 
           | As to why Linus doesn't like storing file moves:
           | https://public-
           | inbox.org/git/Pine.LNX.4.58.0504150753440.721...
        
             | bqmjjx0kac wrote:
             | Man, he communicates like a dick all the time I guess.
        
             | xorcist wrote:
             | I'd be happy to argue why Linus is wrong here. Many things
             | would be much easier if git recorded some more metadata in
             | every commit: file moves, and branch moves, to start with.
             | 
             | Having some sort of notion of "parent branch" would be very
             | useful for a number of common operations, and a "renamed
             | file" without having to rely on client dependent heuristics
             | too. Empty files trip people up all the time so a "create
             | file" would fit in perfectly.
             | 
             | These concepts would also be a good basis for more user
             | friendly clients. Other version control systems do this, so
             | the surprise factor should be low.
        
               | erik_seaberg wrote:
               | People would get lazy and rename a file without telling
               | Subversion they had done it, so it would write a "old
               | file deleted, new file created from nothing" revision.
               | Most of the merge conflict resolution machinery just
               | couldn't run without the missing guidance. Git infers
               | someone _probably_ renamed a file you edited or vice
               | versa, which seems risky but works better in practice.
        
             | ziyao_w wrote:
             | It's kind of funny to see Linus browbeat other people
             | into submission regardless of him being right or not, while
             | claiming "I am always right".
             | 
             | A few counter points:
             | 
             | - `hg` has `cp`, and I believe both Meta and Google's
             | internal systems have that;
             | 
             | - git has `mv`, which was added later, but it is really
             | janky and git would forget files are moved, which I think
             | is because git doesn't try to track that, likely because
             | of the philosophy here;
             | 
             | - as for storing file moves - nobody said you *have* to use
             | this information, but you can certainly use this
             | information to help with things.
             | 
             | The whole thread is an interesting read though and I will
             | try going through it someday - maybe doing that would
             | change my mind.
        
           | paulddraper wrote:
           | Git doesn't store _any_ individual changes: files moved,
           | lines added, line deleted, etc.
           | 
           | It stores a commit graph, and a tree at each of those
           | commits. (A lossless compression algorithm deduplicates
           | information.)
           | 
           | There's no need for the author to be concerned with what
           | diffing information gets incorporated into the commit. Diffs
           | are up to the viewer of the commit history.
           | git show --diff-algorithm=...
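           | 
           | To make the "diffs are up to the viewer" point concrete,
           | here is a toy Python sketch. It is not Git's actual
           | implementation: `blob_id`, `snapshot` and `exact_renames`
           | are made-up names, and only exact-content renames are
           | handled, whereas real rename detection also scores similar
           | but edited files.

```python
import hashlib


def blob_id(content):
    # Toy content address; real Git hashes a header plus the content.
    return hashlib.sha1(content).hexdigest()


def snapshot(files):
    # A "tree" in this sketch is just a mapping from path to blob id.
    return {path: blob_id(data) for path, data in files.items()}


def exact_renames(old, new):
    # Pair up paths that vanished with paths that appeared when their
    # blob ids match. Nothing about the rename was stored anywhere; it
    # is inferred by whoever compares the two snapshots.
    removed = {p: b for p, b in old.items() if p not in new}
    added = {p: b for p, b in new.items() if p not in old}
    by_blob = {b: p for p, b in removed.items()}
    return sorted((by_blob[b], p) for p, b in added.items() if b in by_blob)


before = snapshot({"README": b"hello\n", "util.py": b"def f(): pass\n"})
after = snapshot({"README": b"hello\n", "helpers.py": b"def f(): pass\n"})
renames = exact_renames(before, after)
print(renames)
```

           | If the moved file is also edited in the same commit, this
           | exact-match version finds nothing, which is the case the
           | similarity heuristics exist for.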
        
           | layer8 wrote:
           | For the historical rationale see here: https://gist.github.co
           | m/borekb/3a548596ffd27ad6d948854751756...
           | 
           | In short, Linus's stance is that file renaming doesn't matter,
           | only the _contents_ of files matter, and the moving of
           | contents between files. Moved/renamed files then fall out as
           | a special case of moving content.
           | 
           | Personally, I think this is a case of the better being the
           | enemy of the good, and his "clearly superior algorithm"
           | doesn't work as well as claimed in practice. Or maybe tooling
           | merely still isn't up to snuff after 18 years.
        
             | seba_dos1 wrote:
             | I don't think it's about having a stance, it's about git's
             | architecture. From the commit graph point of view, there's
             | no such things as moving anything at all, neither files nor
             | content. Commits represent a whole new state of the
             | repository, not a diff from the previous state. The only
             | way a commit is linked to the previous state is via parent
             | pointer, it can otherwise be completely unrelated (and you
             | can simply change the parent pointer without changing
             | anything else in the commit). Any diffs are calculated at
             | runtime. The issue with renames is just a consequence of
             | assuming such data model - you could try to plaster it over
             | with some metadata, but ultimately you would still be
             | fighting against the model rather than working with it.
             | 
             | Many people develop a bad mental model with commits as
             | diffs, because that's what the UI makes them think commits
             | are. It can work for a while, but inevitably leads to
             | confusion later on.
        
               | layer8 wrote:
               | As you say, commits link to their parent(s), and those
               | links effectively represent the edges of the commit
               | graph. It makes perfect sense to record moves on those
               | edges. That's how other VCSs do it. There is no conflict
               | with the commit model.
               | 
               | Viewing the commit graph in terms of nodes (commits) or
               | edges (diffs) is equivalent, these are dual views you can
               | easily convert between. The internal representation is
               | independent from that. Some VCSs use a mix of diffs and
               | full revisions internally. Even Git uses delta
               | compression when packing objects.
        
               | seba_dos1 wrote:
               | What I meant is that git doesn't have any structure to
               | represent an edge other than a simple pointer.
               | Conceptually it wouldn't be a big change to add some, but
               | the consequence of that is that everything in git
               | revolves around nodes rather than edges, and whenever the
               | concept of an edge is needed (such as in "cherry-pick")
               | it's being calculated on the fly.
        
               | layer8 wrote:
               | I don't see where this would be causing any issues. There
               | is a canonical place where to put edge metadata, namely
               | in the child commit. And whenever you're interested in
               | move information, you have to process the respective
               | child commit anyway.
        
               | ptx wrote:
               | If you think of it not as a "rename" (which would belong
               | in the edge object if it existed) but rather as a "note:
               | the file A in this tree was known as B in the parent
               | tree" it would make perfect sense to store it in the
               | child commit.
        
         | smusamashah wrote:
         | My TL;DR; for git commits is that these are connected like a
         | linked list but in reverse and have more pointers than just
         | head/tail. I recommend having a look at Merkle trees. I don't
         | understand git cli, but I can manipulate git commits, branches,
         | tags etc well based on basic understanding using a good git UI.
        
       | nunez wrote:
       | Great explanation. Thanks, Julia!
        
       | PeterWhittaker wrote:
       | I just reread my take on branches and relearned some stuff I'd
       | forgotten: https://peter-whittaker.com/obligatory-grokking-git-
       | post
       | 
       | Warning, all text, no diagrams....
        
       | rkangel wrote:
       | Git doesn't have the concept of "main is special", but at least
       | tools like Gitlab have protected branches to stop you screwing up
       | too much.
       | 
       | Some concept of "parent" and "child" branches would actually be
       | pretty interesting. You do have to support multiple "parent"
       | branches though for long term support branches.
        
         | samus wrote:
         | Protecting branches is indeed very important. I make errors all
         | the time when screwing around. It helps enormously being
         | restricted to just messing up one's feature branches. Many
         | other changes can be done via the GUI with PRs and the various
         | kind of controlled merge and rebase strategies they support,
         | like Merge, Rebase + Merge, FF-only Merge, Squash merge, etc.
        
           | cedws wrote:
           | It's also a security feature. If you have a repo with a lot
           | of developers working on it, you need to be sure they
           | absolutely cannot slip in code with nobody noticing, or
           | trigger CI/CD and compromise build secrets or even
           | production.
        
         | andrybak wrote:
         | > Git doesn't have the concept of "main is special"
         | 
         | Technically, there is special handling for both "master" and
         | "main" in Git in a fairly obvious, but I'd argue not very
         | important, way. When you merge two regular branches, the commit
         | message is `Merge branch 'source' into destination`. But not if
         | destination is `master` or `main` - the `into ...` part is
         | omitted for those merge commits.
         | 
         | But this is just for backward compatibility. Git is very
         | conservative in changing such user facing behavior as generated
         | merge commit messages. To get Git to treat `master` and `main`
         | truly without special handling, set an empty value for the
         | config option `merge.suppressDest` [1]:
         | 
         |     $ git config merge.suppressDest ""
         | 
         | `master` is also used as the default name for the default
         | branch in newly created repositories. See option `--initial-
         | branch` of `git init` and config variable `init.defaultBranch`
         | [2] to override. Git for Windows, for example, allows setting
         | the config option in its installer.
         | 
         | Source code:
         | 
         | For merge commit formatting:
         | https://github.com/git/git/blob/2108fe4a1976f95821e13503fd33...
         | 
         | For default branch naming:
         | https://github.com/git/git/blob/91e2ab1587d8ee18e3d2978f2b7b...
         | 
         | Git for Windows installer suggesting setting
         | `init.defaultBranch`:
         | 
         | - https://github.com/git-for-windows/build-
         | extra/blob/586c46ec...
         | 
         | - https://github.com/git-for-windows/build-
         | extra/blob/586c46ec...
         | 
         | Footnotes:
         | 
         | [1] https://git-scm.com/docs/git-merge#Documentation/git-
         | merge.t...
         | 
         | [2] https://git-scm.com/docs/git-init#Documentation/git-
         | init.txt...
        
           | chrnola wrote:
           | There's some special handling for FETCH_HEAD too (i.e. which
           | branch on a remote is considered the default).
        
         | jacoblambda wrote:
         | It actually does but it's very much in alpha/active development
         | (under the umbrella of OpenSSF with the intent of being
         | integrated into mainline git eventually).
         | 
         | https://github.com/gittuf/gittuf
        
           | ndriscoll wrote:
           | Git itself doesn't run a persistent process and I don't see
           | how it'd make sense to prevent a user from making arbitrary
           | changes to their local repo, so this sounds like just another
           | server like GitHub, Gerrit, Gitlab, etc. that already have
           | those features.
        
       | Sharlin wrote:
       | But... If you have a rebase workflow, then `git checkout trunk;
       | git rebase branch` is exactly how you "merge" an offshoot branch
       | into a trunk branch! That's what Github does when you rebase-
       | merge a PR, for example.
        
         | karatinversion wrote:
         | No, that's not right. If you did that, you would need to force
         | push to get the result pushed to the remote.
        
           | Sharlin wrote:
           | Oh, right. So what actually happens is that the offshoot must
           | first be rebased on top of the trunk, and then trunk can be
           | fast-forward merged/rebased (same thing, really) to the
           | offshoot's head.
        
       | ChrisMarshallNY wrote:
       | That's an excellent explanation.
       | 
       |  _> "Wrong" models can be super useful._
       | 
       | This is used in usability and UX design a lot. Affording mental
       | models that don't reflect the actual code happens all the time.
        
         | samus wrote:
         | This is perfectly fine and the added value of a great
         | application if it can hide the underlying reality completely.
         | With Git, the abstractions are paper-thin at best though. Good
         | UIs can indeed cover up many aspects, but they only work as
         | long as there are no merge or rebase conflicts. To correctly
         | resolve these, the user has to have a precise picture of what
         | is actually going on.
        
         | pvg wrote:
         | _This is used in usability and UX design a lot._
         | 
         | It's the fundamental thing that makes UI work. I've always
         | liked the title of Brenda Laurel's book - _Computers as
         | Theatre_
        
       | informalo wrote:
       | > You do need to explicitly specify the other branch when merging
       | or rebasing or making a pull request (like git rebase main),
       | because git doesn't know what branch you think your offshoot is
       | based on.
       | 
       | I think a big issue with the presented intuition is that it's
       | limited to wanting to merge the base/trunk/main branch into your
       | feature branch. However, sometimes you want to merge a feature
       | branch into another feature branch. With this in mind, you can
       | form a better intuition, imo, where it's absolutely clear that
       | you have to specify what branch you want to merge into another
       | one.
        
       | mtnygard wrote:
       | I have found that git makes a lot more sense if you reverse the
       | mental model of lineage. People think about a lineage going
       | _forward_. But a more useful way to think is in terms of
       | _backward_ pointers.
       | 
       | A commit points to its parent(s). Since a branch is just a
       | commit ID, you can follow the parent links backwards to find the
       | whole history of that branch.
       | 
       | So a "branch point" is just where two chains of parent links
       | converge.
       | 
       | The special case is merge commits. Those have multiple parents,
       | indicating that two histories fused into one.
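       | 
       | A minimal Python sketch of this backward-pointer model. It is a
       | toy, not how Git stores anything: the commit ids `a`..`e` and
       | the helpers `history` and `branch_points` are invented for
       | illustration.

```python
# Toy commit graph: each commit id maps to the list of its parent ids.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["b"],       # one line of work continues from b
    "d": ["b"],       # another line of work diverges from b
    "e": ["c", "d"],  # merge commit: two parents
}

# A branch is nothing more than a name for one commit id.
branches = {"main": "e", "topic": "d"}


def history(tip):
    # Every commit reachable from tip by following parent links backwards.
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def branch_points(x, y):
    # Common ancestors of x and y that are not themselves an ancestor
    # of another common ancestor, i.e. where the two chains converge.
    common = history(x) & history(y)
    return {c for c in common
            if not any(c != other and c in history(other) for other in common)}


print(sorted(history(branches["topic"])))
print(branch_points("c", "d"))
```

       | Nothing here walks forward: a commit never knows its children,
       | so "the history of a branch" is always computed from the tip.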
        
         | layer8 wrote:
         | The issue is that if you consider a branch to be what is really
         | the history of the branch tip, then a branch is not just the
         | part starting from the last join with another branch. Instead
         | it is some directed path through the commit DAG, a path that in
         | general can't be reconstructed from the information Git keeps.
         | 
         | If, for example, you have a structure like
         | 
         |              |
         |              o
         |             / \
         |            o   o
         |         A  |   |  B
         |            o   o
         |             \ /
         |              o
         |             / \
         |            o   o
         |         C  |   |  D
         |            o   o
         |             \ /
         |              o
         |              |
         | 
         | then conceptually the path CA might be one branch and DB the
         | other branch (or alternatively, CB and DA). But this is not
         | something that is represented in Git's model.
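         | 
         | As a rough illustration in Python (a toy model; the commit
         | names and the `paths` helper are hypothetical), enumerating
         | the paths through that structure gives four candidates, and
         | the parent links alone do not say which of them is "the"
         | branch:

```python
# Toy version of the double-diamond graph: child -> list of parents.
parents = {
    "top": ["a1", "b1"],
    "a1": ["a2"], "a2": ["mid"],        # segment A
    "b1": ["b2"], "b2": ["mid"],        # segment B
    "mid": ["c1", "d1"],
    "c1": ["c2"], "c2": ["bottom"],     # segment C
    "d1": ["d2"], "d2": ["bottom"],     # segment D
    "bottom": [],
}


def paths(start, end):
    # All parent-link paths from start back to end.
    if start == end:
        return [[end]]
    return [[start] + rest
            for parent in parents[start]
            for rest in paths(parent, end)]


all_paths = paths("top", "bottom")
for path in all_paths:
    print(" -> ".join(path))
```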
        
           | mountainboy wrote:
           | Interesting, so then which path(s) does git display when
           | running git-log on this?
        
             | seba_dos1 wrote:
             | Define "this". If you git-log from the commit on the top of
             | that ASCII graph, you get all the drawn commits listed
             | (unless adjusted with arguments such as `--no-merges` or
             | `--first-parent`).
        
             | Izkata wrote:
             | You can get ASCII art of that structure with:
             | git log --graph --oneline
             | 
             | Older versions you'll also want --decorate to show branches
             | and tags, but I think that's on by default now.
        
           | vifon wrote:
           | This missing piece of information would be essentially `git
           | reflog`, except it's not something Git sends between the
           | clones.
        
           | ajross wrote:
           | > a path that in general can't be reconstructed from the
           | information Git keeps.
           | 
           | Uh... yes it can. Commits have a list of 0 or more parents.
           | That creates a DAG. There are literal hordes of tools out
           | there that reliably interpret this, from visualizer tools to
           | practical mutators like git bisect.
           | 
           | Maybe you're trying to say that no single commit order exists
           | that traverses the whole tree. That's true, because branches
           | can merge together. But it remains a completely interpretable
           | graph nonetheless.
        
             | layer8 wrote:
             | That's not what I was saying. I was referring to the
             | history of branch tips.
        
               | ajross wrote:
               | But that's not related to the DAG at all. The branch can
               | be changed at any moment for any reason to point to any
               | commit with any content.
               | 
               | But it's true that conventionally, a new branch tip
               | should always have the previous branch tip as an
               | ancestor. But not always as a direct parent, and even if
               | so it might be a merge commit that joins two different
               | branches. There is indeed no single spanning path through
               | a DAG.
               | 
               | But trying to explain it as "git doesn't store enough
               | information" to construct that spanning path seems
               | confused to me. It's not about what git stores, it's just
               | math: there is no such path in the general case, period.
        
               | layer8 wrote:
               | The fact that the branch tip can be moved to unrelated
               | commits is another issue with Git's model, and a mismatch
               | to the intuitive "a named lineage in the DAG" conception
               | of branches. In other VCSs, that would be a new/different
               | branch, and you could still rename branches so that the
               | same name will later refer to a different branch, but the
               | branch history as such (including renames) would be
               | preserved.
        
               | ajross wrote:
               | > mismatch to the intuitive "a named lineage in the DAG"
               | conception of branches
               | 
               | Once more, that conception may be intuitive _but it is
               | wrong_. A branch is emphatically _NOT_ a line through the
               | DAG, it's the whole DAG. There simply is no single list
               | of patches to apply to get from one commit to another,
               | even if both were at some point heads of the same branch,
               | and even if one is an ancestor of the other.
               | 
               | And the reason it's wrong is that branches can merge
               | together. You can have commit A descended from both the
               | "main" branch and the "topic_a" branch, despite the fact
               | that those two had diverged. This isn't a bug, it's a
               | feature. You don't have to use it if you don't want to
               | (lots of projects require linear commit histories in
               | their main branch), but it's part of the tool nonetheless
               | because some projects (Linux especially) use it heavily
               | and to great effect.
        
           | lifeisstillgood wrote:
           | Just to go off on a tangent - that's a pretty neat diagram
           | for a throw away comment. was that just careful spacing in
           | the HN textbox or did you use a tool - which one ? :-)
        
             | grodriguez100 wrote:
             | "Text after a blank line that is indented by two or more
             | spaces is reproduced verbatim. (This is intended for
             | code.)"
             | 
             | Looks like this also switches to a monospaced font, which
             | makes it easier to draw ASCII art.
             | 
             |     This should be rendered using a monospaced font.
             |     _____
             |     \   /
             |      \ /
             |       O
        
           | Izkata wrote:
           | You can reconstruct it manually with a combination of the
           | parent commit order and the automatic merge commit message,
           | if you didn't change the commit message. But yeah, that
           | second part isn't recorded in the structure itself.
        
         | trealira wrote:
         | That's how I learned it, not having known anything about git or
         | version control beforehand. I used this site:
         | 
         | learngitbranching.js.org/
         | 
         | Which represents commits as circles with arrows pointing to
         | their parents.
        
       | keybored wrote:
       | Lately I've wanted branches (heads) to have a corresponding tail
       | which points to the base commit that the branch sits on top of
       | (like the commit on `main` when you created the branch).[1]
       | Because branches get rebased all the time and eventually you have
       | six commits out in the AEther somewhere and you have to think
       | twice about where it even starts. And yeah you can probably think
       | for a few seconds and recall that you have worked with John and
       | not Jimmy on this branch so the seventh commit backwards that
       | belongs to Jimmy must be the commit base. Or Git can tell you
       | that the seventh commit belongs to `main` already. But why should
       | you have to expend any effort?
       | 
       | You can optionally include the base commit when you send out
       | "patches" to a mailing list.[2] Because it might not have been
       | obvious that you based your changes on:
       | 
       | - The latest release
       | 
       | - The main development branch
       | 
       | - Some integration branch (probably an error)
       | 
       | You also need to keep the "base" in mind when you use
       | `git range-diff` because that tool takes two ranges like
       | `main..previous` and `main..current`. And sometimes you can rely
       | on just using `main..` and letting Git figure it out but in my
       | experience passing an explicit value sometimes works better.
       | 
       | `git range-diff` is a super-cool but perhaps niche tool. But you
       | basically have to use it on review round number 2 and higher when
       | you are sending changes to the Git project.
       | 
       | [1] This has been discussed before and there was a patch series
       | that implemented it. But that was basically a POC and done in the
       | spirit of "this is useless IMO but here's how you could do it"...
       | and the implementation didn't factor in all the shenanigans that
       | you can do with `reset` and `rebase` so it couldn't have been
       | merged as-is. (Although to be fair: the bar was not set to work
       | perfectly with any kind of branch reset etc., which I suspect is
       | impossible in any case.)
       | 
       | [2] Patches after all are just commit messages plus the patches
       | themselves and don't tell you what they are based on.
        
         | Snarwin wrote:
         | It looks like this is what git merge-base --fork-point is
         | supposed to do, although according to the docs it is not 100%
         | reliable.
        
       | chatmasta wrote:
       | We're teaching Git wrong. Most of the common confusion is due to
       | people learning from the porcelain down to the plumbing, when it
       | should be the other way around. If you limit your mental model to
       | the plumbing, there's generally only one outcome that you want,
       | but there are a dozen ways to get there from the porcelain. You
       | can choose whichever one you prefer. But if you start from one of
       | those dozen ways, they could each lead to a different outcome
       | than you expected.
       | 
       | I'm forever grateful for one of my early internships, where a guy
       | from GitHub visited the office and gave us a one day workshop on
       | Git. He started from the internals and explained how Git models
       | your codebase. (He's also the one who introduced me to the idea
       | of plumbing vs. porcelain.) Then once we had a common language,
       | teaching the porcelain was a matter of starting from the plumbing
       | and working upwards, rather than the other way around.
       | 
       | Another invaluable resource in learning Git is this interactive
       | tutorial [0], which renders a tree diagram of start state and
       | desired end state and makes you write the commands (for which
       | there are often many options!) to get to that end state. This
       | reinforces the idea that the best way of planning Git commands is
       | to first visualize the end state you want, and then reason about
       | how to get there.
       | 
       | Also: RTFM! Not just once. Go back to it. You'll learn something
       | new every time. The docs [1] are really good.
       | 
       | [0] https://learngitbranching.js.org/
       | 
       | [1] https://git-scm.com/docs
        
       | spenrose wrote:
       | If you learned from this (excellent) piece, I recommend that you
       | buy and work through https://leanpub.com/learngitthehardway . It
       | will take less than a day, and you'll have a much stronger
       | foundation for a core tool.
        
       | riperoni wrote:
       | While the explanation is right in some sense, it misses a few
       | points.
       | 
       | Branches are pointers to a commit and that pointer is refreshed
       | when a new commit is created. One could say they are a wandering
       | tag (without explaining a tag for now).
       | 
       | The actual chain of commits that represent what we see as branch
       | comes from the commits themselves. Those commits point back to
       | their parent commit.
       | 
       | And then one can see why no branch has any special meaning: It is
       | a chain of related commits with a named entrypoint. Once you
       | delete a branch (i.e. the named wandering pointer to a commit),
       | you cannot identify a branch as such anymore. It is just a chain
       | of related commits without a named label now. And nothing besides
       | the name distinguished the branch from other commit chains
       | before.
       | 
       | The master/dev/release branches are then a convention to keep an
       | updated commit pointer on the chain of commits containing changes
       | of interest.
        
         | jansan wrote:
         | This was the most useful piece of information that I have ever
         | read about Git.
         | 
         | But what happens if you merge branch A into branch B? A and B
         | will both contain the commits of A, but in B there may be
         | commits of B between the commits that were merged. Do the same
         | commits of A then have different parents depending on which
         | branch they are on?
        
           | paulddraper wrote:
           | Merging branch A into branch B does two things:
           | 
           | 1. Create a new merge commit with _two_ parents: the commit
           | pointed to by A and the commit pointed to by B.
           | 
           | 2. Set branch B to point at the new merge commit.
           | 
           | This is a non-linear history; when comparing some commits
           | there isn't a "before" or "after."
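           | 
           | The two steps can be sketched in a few lines of Python.
           | This is a toy model with invented ids (`base`, `a1`, `b1`,
           | `m1`) and a hypothetical `merge` helper; it ignores
           | conflict resolution and file contents entirely.

```python
import itertools

# Toy repository: commit id -> parent ids, and branch name -> commit id.
parents = {"base": [], "a1": ["base"], "b1": ["base"]}
branches = {"A": "a1", "B": "b1"}
_ids = itertools.count(1)


def merge(source, dest):
    # Step 1: create a new commit whose parents are the two branch tips.
    # The destination tip goes first, as Git does with the current branch.
    new_id = "m%d" % next(_ids)
    parents[new_id] = [branches[dest], branches[source]]
    # Step 2: move only the destination branch to the new commit.
    branches[dest] = new_id
    return new_id


merge_commit = merge("A", "B")
print(merge_commit, parents[merge_commit], branches)
```

           | Note that `a1` and `b1` are untouched: existing commits
           | keep the parents they always had, and branch A still
           | points where it did before.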
        
           | seba_dos1 wrote:
           | > Do the same commits of A then have different parents
           | depending on which branch they are on?
           | 
           | Absolutely not. Commits are immutable (representing whole
           | repo state, _not_ a diff), and branches are just (mutable)
           | pointers to them.
           | 
           | As the sibling already noted, a merge commit is just a
           | regular commit. It simply points to multiple parents,
           | "merging" them. Aside of the whole machinery to resolve
           | conflicts etc. that's pretty much all there is to it.
           | 
           | When your graph topology allows it, you can also merge
           | branches without generating a new commit (so called "fast
           | forward" merges) - such a merge does nothing but rewrite the
           | branch pointer. You can also create merge commits that point
           | to more parents than two ("octopus" merges). Reconciling the
           | commits' content can get quite complicated in such cases, but
           | from the repo graph perspective it's nothing special.
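
A rough sketch of the fast-forward decision, assuming a toy graph stored as a dict from commit id to parent ids. The ids, the `ancestors` and `merge` helpers and the returned strings are all made up for illustration; real git does far more (conflict resolution, merge strategies).

```python
def ancestors(commit, parents_of):
    """Return every commit reachable from `commit`, itself included."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents_of[current])
    return seen


def merge(branches, parents_of, source, target):
    """Merge `source` into `target`, fast-forwarding when the graph allows it."""
    src, dst = branches[source], branches[target]
    if src in ancestors(dst, parents_of):
        return "already up to date"     # target already contains source
    if dst in ancestors(src, parents_of):
        branches[target] = src          # only the pointer is rewritten
        return "fast-forward"
    new_commit = f"merge({dst}+{src})"  # hypothetical id for the new commit
    parents_of[new_commit] = [dst, src]
    branches[target] = new_commit
    return "merge commit"


parents_of = {"c1": [], "c2": ["c1"], "c3": ["c2"], "d1": ["c1"]}
branches = {"main": "c1", "feature": "c3", "other": "d1"}

first = merge(branches, parents_of, "feature", "main")
second = merge(branches, parents_of, "other", "main")
print(first)                         # fast-forward
print(second)                        # merge commit
print(parents_of[branches["main"]])  # ['c3', 'd1']
```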
        
             | xorcist wrote:
             | > Commits are immutable (representing whole repo state, not
             | a diff)
             | 
             | To make things more clear: Repo state here is the contents
             | of all files, and some metadata including a pointer to the
             | previous commit.
             | 
             | So a commit hash uniquely identifies not only a set of
             | files but the unique history leading up to it! That's why
             | some people like to call git the original block chain
             | (there's no proof of work involved of course so it can
             | never be used for payments or anything like that, but the
             | merkle tree bit is similar enough).
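
To see why a hash pins down the whole history, here is a simplified sketch. The payload layout below is an assumption made for illustration and is not git's real commit object format (which also carries a header, author and committer fields); `commit_id` and the `files-v1`/`files-v2` labels are hypothetical.

```python
import hashlib


def commit_id(tree, parent_ids, message):
    """Hash a toy commit: content label, parent ids and message together."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parent_ids]
    lines += ["", message]
    return hashlib.sha1("\n".join(lines).encode()).hexdigest()


first = commit_id("files-v1", [], "initial")
second = commit_id("files-v2", [first], "change")

# Reword only the first commit: its id changes, and so does every descendant's,
# even though the second commit's files and message are identical.
altered_first = commit_id("files-v1", [], "initial (reworded)")
second_on_altered = commit_id("files-v2", [altered_first], "change")

print(first != altered_first)       # True
print(second != second_on_altered)  # True
```

Because the parent id is part of what gets hashed, two commits with the same id must also agree on everything leading up to them.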
        
           | tharkun__ wrote:
           | I keep repeating this every time someone talks about git and
           | finds something weird or doesn't get branches, so I'm really
           | glad your parent mentioned it as well and I know there's
           | someone else out there that "gets" that:
           | 
           |     In git it's all just labels/pointers
           | 
           | It's not useful at all to think about branches as the user
           | sees them as "things" of their own. Branches don't "have"
           | anything. Branches in that sense are just convenient labels.
           | 
           | Of course actual "branches" in the commit tree exist whether
           | you label them or not. Until `git` does a garbage collection
           | and gets rid of anything that doesn't have a pointer
           | ultimately leading to it - something that a human would
           | understand aka branch/tag. And that's why we call these
           | labels "branches" as well but it's actually one word for two
           | things here. The actual tree branch and the label that's
           | called branch.
           | 
           | And a branch and a tag are basically the same exact thing
           | underneath, just a file in the `.git` directory somewhere
           | that contains a commit hash. All the meaning and
           | differentiation of branch or tag is just in the human brain
           | and how we and our tools treat them. Such as if you look at a
           | particular commit in your tool of choice, it will tell you
           | which branch it's part of. To create a branch you can
           | literally just create a thousand randomly named files in the
           | right part of the `.git` directory containing the same commit
           | hash and suddenly this commit "is on all those branches".
           | That's what git does and why creating a branch in git is so
           | super fast.
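
A small sketch of the "just a file containing a commit hash" idea. It writes into a throwaway temporary directory that mimics the `refs/heads` layout rather than touching a real repository; the hash and the `create_branch` / `branches_pointing_at` helpers are hypothetical, and real git may also keep refs in a `packed-refs` file.

```python
import tempfile
from pathlib import Path


def create_branch(git_dir, name, commit_hash):
    """Create a toy branch: one small file holding a commit hash."""
    ref = Path(git_dir) / "refs" / "heads" / name
    ref.parent.mkdir(parents=True, exist_ok=True)
    ref.write_text(commit_hash + "\n")
    return ref


def branches_pointing_at(git_dir, commit_hash):
    """List the toy branches whose file contains `commit_hash`."""
    heads = Path(git_dir) / "refs" / "heads"
    return sorted(
        p.name for p in heads.iterdir() if p.read_text().strip() == commit_hash
    )


with tempfile.TemporaryDirectory() as fake_git_dir:
    commit = "d44aedc" + "0" * 33  # made-up 40-character hash
    for i in range(3):
        create_branch(fake_git_dir, f"label-{i}", commit)
    names = branches_pointing_at(fake_git_dir, commit)

print(names)  # ['label-0', 'label-1', 'label-2']
```

No commit data is copied anywhere, which is why this kind of branch creation is cheap.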
        
             | seba_dos1 wrote:
             | To make things more complicated, the word "tag" is also
             | overloaded. It can either be just a reference (in git
             | lingo, a "ref") to a commit - just like a branch, only
             | differing from it in how the tools treat it; but they can
             | also be "annotated tags" which are pointing to a special
             | tag object which contains some metadata and only then
             | points to a specific commit (or other kind of object...) :)
        
               | evntdrvn wrote:
               | You'll also see the first type referred to as
               | "lightweight tags", if that helps anyone :)
        
           | loeg wrote:
           | In short: merge commits have multiple parent commits. So your
           | tree tracing logic bifurcates at that point. The commits in
           | the merged history are not altered by the merge commit; they
           | each have a single parent commit (unless they are also merge
           | commits).
        
         | skrebbel wrote:
         | For years I was deeply annoyed by the terrible name "branch"
         | for something that acts more like a bookmark (or "wandering
         | tag" indeed!).
         | 
         | And then I learned that git branches are branches in exactly
         | the same way that the first element of a linked list in C "is"
         | the linked list. Git was made by C people and they're used to
         | referring to entire data structures by way of some root
         | element.
         | 
         | I mean that doesn't make me dislike the name any less but at
         | least now I see where they were coming from.
        
           | neuromanser wrote:
           | That's (most probably) where the "head" terminology comes
           | from, too.
        
             | alfredpawney wrote:
             | Yes you are correct. It traces back to Allen Newell
        
           | mr_mitm wrote:
           | When the entire structure of commits is called a tree, I find
           | the name "branch" fitting. The branch is identified by its
           | head commit, so the path from head to root is uniquely
           | defined and that's the branch. (Disregarding merges for now.)
        
             | TeMPOraL wrote:
             | > _Disregarding merges for now._
             | 
             | Without disregarding them, it's not a tree, but a DAG.
        
               | dayjaby wrote:
               | It is a tree. What makes you think it's just a DAG? Are
               | there commits with multiple parent commits or what?
        
               | imron wrote:
               | Yes. Merge commits have two parents.
        
               | Izkata wrote:
               | Two or more. I'm not sure there's a limit.
               | 
               | Try not to do this (imagine 5-way merge conflict).
        
               | zaphar wrote:
               | There absolutely can be. Merge commits have multiple
               | parent commits for example. It's definitely a graph not
               | just a tree.
        
               | dayjaby wrote:
               | Parent comment was about disregarding merge commits.
        
             | cpeterso wrote:
             | Leaning into the tree metaphor (and following the precedent
             | of other version control systems), git should have used the
             | term _trunk_ instead of _master_ or _main_.
        
               | hnarn wrote:
               | Why? That would heavily imply that master/main is somehow
               | technically different from all other branches (since a
               | trunk is certainly not a branch), which to my knowledge
               | is not true.
        
             | skrebbel wrote:
             | FWIW "tree" has a specific, different meaning in Git. It's
             | a file tracking the contents of a directory.
        
           | ajross wrote:
           | > Git was made by C people and they're used to referring to
           | entire data structures by way of some root element.
           | 
           | FWIW this is actually backwards. The word "branch" was
           | already in common use (to refer to the same basic idea) in
           | SCM systems going back decades, and in almost all of those a
           | "branch" was indeed a first class object with its own data
           | that acted as a "container" for commits, both semantically
           | and physically.
           | 
           | The fact that a "branch" is just a pointer is in fact a git
           | innovation on top of the former idea.
        
           | cpeterso wrote:
           | > acts more like a bookmark
           | 
           | In fact, Mercurial uses the term "bookmark" for its
           | lightweight, git-like branching. Mercurial's branches have
           | slightly different semantics and can't be deleted like
           | bookmarks or git branches
        
           | imron wrote:
           | > Git was made by C people
           | 
           | This is why I think of branches as pointers. The file
           | contents are literally just a pointer to a commit on the DAG.
        
         | loeg wrote:
         | I think this is covered adequately (if less completely) in the
         | "technically correct" definition section.
        
         | jillesvangurp wrote:
         | A key point with git is that every clone is effectively its own
         | set of branches; even if they have the same name. The
         | mechanisms you use for synchronizing your local branches with
         | some remote branches are exactly the same as the mechanisms you
         | use between your local named branches.
         | 
         | Git was actually designed initially for email based workflows
         | where there was no central remote at all. Basically, that works
         | by exporting patches and then applying them to your local
         | branch. The branch name isn't even part of the patch.
         | 
         | A git patch is just a textualized form of the list of commits
         | you created locally. You can apply them to any branch you like.
         | As long as you and whoever applies the patch have a common
         | ancestor commit, the patch may merge cleanly. It's
         | good hygiene to ensure it does by for example
         | rebasing/merging/squashing before you email somebody your
         | patches. If that somebody is called Linus Torvalds, he's going
         | to be pretty strict about things like commit messages and
         | things not being a spaghetti ball of merges, reverts, forks, etc.
         | Your mess, your problem. Linux development still works via
         | mailing list. And forget about emailing him directly with a
         | patch; you need to use the mailing lists like everybody else.
         | And he works with a network of senior contributors that screen
         | everything that comes in and that aggregate all the patches
         | coming from upstream. So, he only gets involved at the end of
         | the process.
         | 
         | Of course the rest of us use network protocols to sync our
         | repositories. But the important distinction here is that this
         | is a two step process. First you fetch content from remote.
         | This is simply ensuring you have all the commit objects you
         | need in your local git database. Any branches you have are
         | simply text files with the commit content hash they point to as
         | the content in .git/refs/heads. Remote branches are the same
         | but live in your local .git/refs/remotes/<remotename>. Those
         | branches might be named something like origin/main to make it
         | clear that that is a local branch from the origin remote. And
         | then you rebase/merge between your local and "remote" (i.e.
         | also local) branch as needed. Pull is just short hand for doing
         | both steps in one go. All merges are local. Same with rebases.
         | 
         | Most of the conventions people project on git are kind of
         | cultural and vary between people and companies. It's helpful to
         | read up on the git internals in the Git book. Github is sort of
         | an opinionated take on this that back in the day made people
         | coming from centralized version systems like subversion feel at
         | home by providing a central repository and allowing them to
         | push their changes there or "share" branches there. Not
         | necessarily a great idea for bigger projects and limiting write
         | access is common on Github.
        
       | gtirloni wrote:
       | Some of Julia's tweets started to get suffixed with "I don't want
       | advice about this". It must have reached unacceptable levels.
        
       | cube2222 wrote:
       | There is a very good article by GitHub:
       | https://github.blog/2020-12-17-commits-are-snapshots-not-dif...
       | 
       | TLDR: Think of commits as snapshots, not diffs, and you'll be
       | fine.
        
       | MauranKilom wrote:
       | I don't use git at work, but in my private hobby projects my
       | friends usually get mad when they watch me juggle changes and
       | branch pointers with git reset --hard and git stash...
       | 
       | How do you undo a merge that you didn't mean to do/did wrongly?
       | 
       |     git reset --hard <last commit before merge>
       | 
       | Have some cosmetic fixups on your local branch that really should
       | go into main (or a separate branch) first before merging a bigger
       | feature?
       | 
       |     git stash
       |     git checkout main
       |     git stash apply
       | 
       | By thinking about branches as pointers, the commit graph existing
       | independently, and stashes just being temporary commits, I feel
       | I'm working much more directly with the underlying abstraction.
       | Yes, git has commands for specific combinations of actions, but
       | for an occasional user it's harder to remember every such command
       | and which arguments and flags to pass in which order. It's either
       | "look through documentation until you find graph diagrams
       | illustrating what will happen for this order of arguments and
       | flags" or "use the primitives 'move branch pointer', 'commit to
       | branch', 'hold these changes for a second' for obtaining the
       | commit tree you actually want". Knowing that the reflog exists
       | also makes this insane-sounding working mode pretty non-scary.
       | And yes, some operations (e.g. cherry-pick) you just need to do
       | the "real" way.
       | 
       | (My git stash obsession is most likely just damage from years of
       | using Perforce, which doesn't have a modified/staged distinction.
       | The only way to commit only part of a changed file is via the
       | equivalent of stash -> [restore half the file] -> commit -> stash
       | pop.)
       | 
       |  _Prepares to be crucified..._
        
         | globular-toast wrote:
         | "Undo" is usually more like `git reset --hard HEAD@{1}`, ie.
         | using the reflog.
         | 
         | Nothing wrong with this at all. Only people who don't
         | understand and/or are scared of git don't like it.
         | 
         | You could also use cherry-pick to "donate" commits to other
         | branches, instead of stash, of course. Magit has some great
         | extra abstractions for this.
        
         | hotnfresh wrote:
         | You can check out multiple branches in different directories
         | from a single git repo. This saves me a lot of what used to be
         | stashing.
        
         | zaptheimpaler wrote:
         | Both of those sound totally reasonable to me! I don't know of
         | any better ways to do that stuff and there's nothing risky
         | about it.
        
           | int0x80 wrote:
           | One thing that is risky about git reset --hard is that any
           | non-committed changes are lost. That has bitten me a few
           | times.
        
             | afiori wrote:
             | My controversial opinion is that git needs some kind of gui
             | that helps you keep track of the state of the repo
        
         | specialist wrote:
         | > _I 'm working much more directly with the underlying
         | abstraction_
         | 
         | Your strategy of seeing things as they are is a useful general
         | purpose life skill.
        
         | codesnik wrote:
         | why crucified? you're doing exactly what I do. All the people
         | who have any trouble with git whatsoever try to use it as a
         | black box for some high-level whatever ideas of what their
         | workflow should be. And git is not that, git is a thin wrapper
         | around a simple and elegant data structure. If you understand it,
         | then everything clicks and git doesn't EVER give any trouble.
         | 
         | Your friends are unreasonable, unless you collaborate with them
         | on the same branches and rewrite them after you shared them.
        
         | codesnik wrote:
         | also, using stash is only a first level. git cherry-pick, git
         | rebase --interactive, git reset --hard HEAD^ and friends allow
         | you to do such moves and cosmetic extractions after the commit itself.
         | I also prefer to split cosmetic changes and feature changes, so
         | I extract cosmetic stuff to the main all the time.
        
         | tremon wrote:
         | git reset --hard is actually dangerous, because it throws away
         | local modifications that were not yet committed. To undo just
         | the commit and not the work, you should use git reset --soft
         | (to undo just the git commit) or git reset --mixed (to undo
         | both the git commit and the "git add"s leading up to the
         | commit).
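
The difference between the three modes can be sketched with a toy model of the three places git keeps state. The `reset` function, the `state` dict and the snapshot strings are hypothetical stand-ins for illustration, not git's implementation.

```python
def reset(state, target, snapshots, mode="mixed"):
    """Toy `git reset`: always move HEAD, optionally the index and worktree."""
    new_state = dict(state, head=target)
    if mode in ("mixed", "hard"):
        new_state["index"] = snapshots[target]     # undo the "git add"s
    if mode == "hard":
        new_state["worktree"] = snapshots[target]  # local edits are discarded
    return new_state


snapshots = {"c1": "v1", "c2": "v2"}
state = {"head": "c2", "index": "v2", "worktree": "v2 + uncommitted edits"}

soft = reset(state, "c1", snapshots, mode="soft")
mixed = reset(state, "c1", snapshots, mode="mixed")
hard = reset(state, "c1", snapshots, mode="hard")

print(soft)   # {'head': 'c1', 'index': 'v2', 'worktree': 'v2 + uncommitted edits'}
print(mixed)  # {'head': 'c1', 'index': 'v1', 'worktree': 'v2 + uncommitted edits'}
print(hard)   # {'head': 'c1', 'index': 'v1', 'worktree': 'v1'}
```

Only the `hard` result has lost the uncommitted edits, which is the danger described above.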
        
           | alex_smart wrote:
           | git checkout will also happily throw away local unstaged
           | modifications, and I would argue that it is even more
           | dangerous because I did not have to type "--hard" to shoot
           | myself in the foot.
        
             | hnarn wrote:
             | That does not sound right at all, I'm pretty sure there's a
             | warning when you try to checkout a branch that would
             | override local unstaged changes. I might be wrong but I'd
             | like some proof.
        
               | Izkata wrote:
               | Checkout is also used for reverting changes to unstaged
               | files.
        
             | astrobe_ wrote:
             | There must be something you do terribly wrong if you
             | believe it happened to you in normal use.
        
               | mb7733 wrote:
               | The poster is talking about when you do something like
               | this: `git checkout -- .` That wipes out all unstaged
               | changes on tracked files in the current directory.
        
         | afiori wrote:
         | I agree that git is almost asking you to juggle commits.
         | 
         | My preference is to use temporary branches and cherry-picking
         | instead of stashing; I mostly use a gui* to work with git so it
         | is easy to select the two or three commits to cherry-pick or
         | see visually if an interactive rebase would work.
         | 
         | * https://gitextensions.github.io/
        
         | pitaj wrote:
         | > How do you undo a merge that you didn't mean to do/did
         | wrongly?
         | 
         | I usually used `switch` for this:
         | 
         |     # Check out the previous commit
         |     git switch -d HEAD~
         |     # Overwrite the branch
         |     git switch -C <current branch>
        
         | erik_seaberg wrote:
         | Not a fan of the staging area, because it hasn't been tested
         | before commit. I would rather stash some changes to postpone
         | them, then test and commit the workspace.
        
       | bloopernova wrote:
       | tl;dr Please ignore, just me working through a Python+pygit2
       | problem. I solved it in a grandchild comment.
       | 
       | I had so much trouble trying to map my intuited/mental model of
       | git onto pygit2 that I gave up and just used the git module.
       | 
       | I wanted to automate a fairly simple thing in Python as opposed
       | to bash+commands. My reasoning being that I wanted to do it
       | "right" and be a Big Programmer Real Boy(tm). I just wanted to
       | create a branch remotely in Github, pull the repo, and checkout
       | the new branch. I got stuck going in circles trying to figure out
       | why I was always left in detached HEAD state because I didn't
       | understand _exactly_ what git was doing during a checkout.
       | 
       |     # repo has already been pulled
       |     if os.path.exists(repo_path):
       |         local_repo = git.Repo(path=repo_path)
       |         self.log.debug(f"current branch: {local_repo.active_branch.name}")
       |         local_repo.git.checkout(branch_name)
       | 
       | That's super easy and is much the same as running the commands in
       | the shell or in a bash script.
       | 
       | Of course, I've lost my poor implementation using pygit2, so I'll
       | add that later if I find it. Thankfully there's a good discussion
       | surrounding the issue I encountered in this excellent "roll your
       | own git in Python", which doesn't use pygit2, but the concepts
       | are the same:
       | https://www.leshenko.net/p/ugit/#checkout-switch-branches
        
         | bloopernova wrote:
         | This isn't asking someone else to make this work, it's more of
         | a caution to convince folks like me to just use "import git"
         | rather than pygit2:
         | 
         | So something like this was what I expected to work, but leaves
         | the repo in detached head state:
         | 
         |     import pygit2
         | 
         |     def checkout_branch(path, branch_name):
         |         repo = pygit2.Repository(path)
         | 
         |         branch_ref = repo.lookup_reference(f"refs/remotes/origin/{branch_name}")
         |         print(f"{branch_ref.name}")
         |         repo.checkout(branch_ref)
         | 
         | The branch_ref.name prints "refs/remotes/origin/test" but git
         | status says "HEAD detached at origin/test"
         | 
         | So I'm probably feeding the wrong thing into repo.checkout, but
         | I'm honestly not sure what else it should be.
         | 
         | Funnily enough, git itself tries to do the right thing if
         | pulled in a detached head state:
         | 
         |     From https://github.com/testorg/example
         |     * [new branch]      test       -> origin/test
         |     You are not currently on a branch.
         |     Please specify which branch you want to merge with.
         |     See git-pull(1) for details.
         | 
         |         git pull <remote> <branch>
        
           | bloopernova wrote:
           | Ha, and of course just messing around gets me something that
           | actually works.
           | 
           | There always seems to be just one more stackoverflow thread
           | to read that has the real answer:
           | https://stackoverflow.com/questions/68435607/how-to-clone-
           | ma... (found via Kagi which I wasn't using before, and the
           | search "pygit2 detached head")
           | 
           |     def checkout_branch(path, branch_name):
           |         repo = pygit2.Repository(path)
           | 
           |         main_branch = repo.lookup_branch("main")
           |         print(f"Main branch upstream: {main_branch.upstream_name}")
           | 
           |         if branch_name not in repo.branches.local:
           |             print(f"Branch {branch_name} not found in local branches")
           |             remote_branch = "origin/" + branch_name
           |             if remote_branch not in repo.branches.remote:
           |                 raise SystemExit(f"Branch {remote_branch} not found in remote branches")
           |             (commit, remote_ref) = repo.resolve_refish(remote_branch)
           |             repo.create_reference("refs/heads/" + branch_name, commit.hex)
           | 
           |         branch = repo.lookup_branch(branch_name)
           |         print(f"Branch name: {branch.name}")
           | 
           |         repo.checkout(branch)
           |         print(f"Is branch head? {branch.is_head()}")
           |         (commit, branch_remote) = repo.resolve_refish("origin/" + branch_name)
           |         print(f"Remote branch: {branch_remote.name}")
           |         branch.upstream = branch_remote
           | 
           | With git reflog telling me the right thing:
           | 
           |     d44aedc (HEAD -> test, origin/test) HEAD@{0}: checkout: moving from main to test
           | 
           | And git push has the remote branch already set.
           | 
           | I wish there was a pair programmer AI that you had to explain
           | stuff to. That would enable the "by explaining it, I solved
           | it" phenomenon.
        
       | parasti wrote:
       | The article goes in the right direction, but from a weird
       | starting point. Saying things like "a branch contains the entire
       | history" just adds to the general confusion about Git. Git does
       | not have branches. Sure, Git emulates branches to appear familiar
       | and intuitive, but it is actually counterproductive to use that
       | as a starting point to explain how Git works. Git manages a graph
       | of commits and some of those commits need human readable labels.
       | That's it. The only thing that contains the entire history is the
       | commit graph itself.
        
       | BlueTemplar wrote:
       | Another way to think about merging and patches :
       | https://jneem.github.io/merging/
        
       | intrepidsoldier wrote:
       | Anything about Git reminds me of this:
       | 
       | https://youtu.be/EReooAZoMO0?si=sHqcYsf8v6LyWLAx
        
         | chihuahua wrote:
         | Given how many smart people are confused by Git, and how many
         | times Git's behavior needs to be explained in a way that often
         | raises as many questions as it answers, it seems to indicate
         | that Git's model is not at all intuitive and doesn't map well
         | to how people generally use it to get work done.
         | 
         | These are all people who have no problem understanding all
         | kinds of other technologies and building complex systems from
         | them.
         | 
         | It's not quite in "a monad is just a monoid in the category of
         | endofunctors" territory, but when this many smart people have
         | difficulty understanding something, I think Git is to blame,
         | not the people.
        
           | jacoblambda wrote:
           | I have to seriously ask. Of the people who have issues with
           | using or understanding git, how many of them have actually
           | read the docs?
           | 
           | Git is by no means perfect but the development community is
           | great and there is a massive focus on improving the project
           | and making things more approachable and intuitive.
           | 
           | And because of that, git actually has really solid, coherent
           | documentation with easily digestible tutorials and guides for
           | all the things you need to do.
           | 
           | So it always hurts me when I see people ranting and raving
           | about how awful git is, how it can't do x, or how it doesn't
           | make sense how it works but then you send them the guide or
           | tutorial hosted on the git-scm website and suddenly it makes
           | sense.
           | 
           | Not to be beating the RTFM horse but RTFM guys.
        
             | chihuahua wrote:
             | Whenever I read the Git docs, after a while I start
             | thinking "this is all very well and good, but it doesn't
             | seem to be related to what I'm trying to do to get my work
             | done (usually fairly basic things)"
             | 
             | Or I have read a bunch of pages on the git-scm site, and
             | I'm thinking "oh yes it all makes sense now." Then I'm
             | trying to do something in the real world, and I get bizarre
             | messages and conflicts that don't make any sense. Or I made
             | a mistake and want to undo it, and end up in some crazy
             | situation. The documentation doesn't seem to help in
             | anything but an ideal textbook scenario with no mistakes
             | and complications.
        
             | forrestthewoods wrote:
             | Git makes version control roughly 10x more complicated than
             | it needs to be.
             | 
             | I can teach an artist or designer who has never heard of
             | version control how to use Perforce in roughly 5 minutes.
             | They will never blow off their leg and will likely never
             | lose work. It will probably be a few months before they hit
             | some edge case where they need help.
             | 
             | Git requires building a non-trivial mental model. Then it
             | requires memorizing a whole bunch of unintuitive commands
             | with unintuitive flags.
             | 
             | > Not to be beating the RTFM horse but RTFM guys.
             | 
             | Good tools are intuitive and can be incrementally learned
             | without resorting to dense documentation.
             | 
             | RTFM is definitely a solution. But when a very large number
             | of users have consistently similar issues at some point you
             | have to stop blaming the users and admit the tool isn't
             | easy to learn.
        
             | Feathercrown wrote:
             | I've read the docs. Too much explanation of command line
             | flags, not enough practical examples. It is thorough
             | though.
        
       | gitanovic wrote:
       | I think that one way to "easily" understand the syntax of git is
       | to remember that when you perform a command you "always" modify
       | the current branch
       | 
       | for example: git merge my-branch will merge my-branch into the
       | current one
       | 
       | while git rebase my-branch will rebase current one on top of my-
       | branch
        
       | Vinnl wrote:
       | Years ago I wrote this dynamic tutorial that visualises branches
       | as you read: https://agripongit.vincenttunru.com
       | 
       | It's aimed at folks who know how to use `git add` and `git
       | commit`, and would like to spend 15 minutes to form a mental
       | model to help them _understand_ what 's going on.
       | 
       | In case it's useful to someone.
        
       | k__ wrote:
       | That article would have been a lot better if it showed
       | illustrations for the "right" mental model too.
        
         | taberiand wrote:
         | The right mental model is to realise the 'main' branch is only
         | special by convention - git doesn't actually treat it
         | differently from any other branch.
         | 
         | All of the confusion expressed in the article stems from a
         | misunderstanding that main should work in some special way.
         | 
         | Of course every branch's history goes all the way back to root
         | and not to some arbitrary common commit of another branch like
         | 'main'. Of course rebase and merge can work "backwards" from
         | main onto some branch (because it's not "backwards" because
         | main is not special - it just isn't done much in practice
         | because keeping main straight helps with collaboration)
         | 
         | Furthermore, by realising that main isn't inherently special,
         | it becomes obvious that the actions can be done between any two
         | branches as needed.
         | 
         | The right mental model is - it's just commits, all the way
         | down.
        
           | k__ wrote:
           | _" All of the confusion expressed in the article stems from a
           | misunderstanding that main should work in some special way."_
           | 
           | I didn't have that impression when reading that article.
           | 
           | To me it seems that the confusion comes from thinking in
           | actual branches, and not from thinking anything special about
           | main.
        
       | why-el wrote:
       | I've learned only one constant with git in my years as a
       | programmer: master your own employer's git use cases, and pray to
       | god for three things:
       | 
       | 1. you don't change places often and thus git patterns.
       | 
       | 2. you don't accidentally ship and commit a multi-GB file to your
       | remote.
       | 
       | 3. you don't change the git process on yourself and your
       | colleagues without an extremely solid reason.
       | 
       | Document your chosen git patterns, even in 2023.
        
       | tsbx wrote:
       | I always get back to this page when trying to understand/show how
       | git works under the hood: https://eagain.net/articles/git-for-
       | computer-scientists/
       | 
       | It summarizes fundamentals clearly.
        
       ___________________________________________________________________
       (page generated 2023-11-23 23:00 UTC)