[HN Gopher] Git Branches: Intuition and Reality
___________________________________________________________________
Git Branches: Intuition and Reality
Author : tambourine_man
Score : 318 points
Date : 2023-11-23 14:43 UTC (8 hours ago)
(HTM) web link (jvns.ca)
(TXT) w3m dump (jvns.ca)
| xanderlewis wrote:
| > in general, even if people's intuition about a topic is
| technically incorrect in some ways, people usually have the
| intuition they do for very legitimate reasons!
|
| This is worth an essay of its own.
| webstrand wrote:
| I'm still missing what part of the intuition is incorrect? It
| seems like the only "incorrectness" is that there's no explicit
| hierarchy of branches. Except that's wrong: the HEAD ref points
| to the default branch. Any other branches are of equal
| significance, though.
| mgerdts wrote:
| Intuition would be that the branch starts at the point that
| it diverges from main, labeled "base" in the first diagram.
| In reality, the first commit in "main" and "branch" are the
| same commit.
|
| Intuition likely comes from how a tree (fir or oak, not
| binary) is structured. Generally a branch starts at the trunk
| or some other branch, not at the ground where the trunk gives
| way to roots.
| jancsika wrote:
| I don't agree with the author here.
|
| Intuition is _travelling_ down main path that has branches
| which diverge and re-merge into the main path.
|
| That's why people seem to intuitively get "merging" back
| into main, whereas that doesn't generally make sense for
| physical trees.
| Sharlin wrote:
| No, the HEAD ref points to whatever branch is "active",
| that's how the active branch is defined. Indeed `git checkout
| branchname` does nothing except make HEAD point to the commit
| that `refs/heads/branchname` points to.
|
| The intuition jvns meant is the idea that a branch only
| constitutes the commits since the point of divergence, but
| every branch actually contains the full history up to the
| root of its tree, and `git log` of course shows that. (If you
| want to only show the commits specific to a branch, you can
| do `git log parent..branch`. Note also that two branches need
| not have _any_ common history; it's perfectly possible for a
| git graph to be disconnected.)
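|
| A minimal sketch of that point, assuming a toy commit graph
| (the commit names A..E and the two branch names are made up for
| illustration). A branch "contains" everything reachable from its
| tip, and a `parent..branch` range is roughly a set difference:

```python
# Toy commit graph: each commit maps to the list of its parents.
# Commit names are hypothetical, chosen only for this illustration.
PARENTS = {
    "A": [],       # root commit
    "B": ["A"],
    "C": ["B"],    # tip of main
    "D": ["B"],
    "E": ["D"],    # tip of feature
}

# A branch is just a name pointing at one commit.
BRANCHES = {"main": "C", "feature": "E"}


def reachable(tip):
    """Return every commit reachable from tip by following parent links."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


def log_range(exclude_branch, include_branch):
    """Rough analogue of `git log exclude..include`: a set difference."""
    return reachable(BRANCHES[include_branch]) - reachable(BRANCHES[exclude_branch])


print(sorted(reachable(BRANCHES["feature"])))  # ['A', 'B', 'D', 'E']
print(sorted(log_range("main", "feature")))    # ['D', 'E']
```

| The first line is what a plain log of the branch walks over; the
| second is the "commits since divergence" view people tend to
| picture.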
| xorcist wrote:
| > `git checkout branchname` does nothing except make HEAD
| point to the commit
|
| You probably know this, but since we are being pedantic we
| might as well get it right: That describes "git reset".
| "git checkout" does that _and_ records that we are tracking
| branchname. So any commits will move both HEAD and the
| branchname reference.
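|
| A toy model of the "HEAD tracks a branch" part, not Git's actual
| implementation. The `Repo` class and its method names are invented
| for this sketch; the one detail taken from real Git is that HEAD
| normally holds `ref: refs/heads/<name>` and holds a bare commit id
| only when detached:

```python
class Repo:
    """Hypothetical in-memory model of refs and HEAD."""

    def __init__(self):
        self.refs = {"refs/heads/main": "c1"}  # branch ref -> commit id
        self.head = "ref: refs/heads/main"     # symbolic: HEAD names a branch
        self._counter = 1

    def resolve_head(self):
        """Return the commit id HEAD ultimately points to."""
        if self.head.startswith("ref: "):
            return self.refs[self.head[5:]]
        return self.head  # detached: HEAD holds a commit id directly

    def checkout_branch(self, name):
        self.head = "ref: refs/heads/" + name

    def checkout_detached(self, commit):
        self.head = commit

    def commit(self):
        """Create a new commit id and advance whatever HEAD refers to."""
        self._counter += 1
        new = f"c{self._counter}"
        if self.head.startswith("ref: "):
            self.refs[self.head[5:]] = new  # the branch moves, HEAD follows
        else:
            self.head = new                 # only HEAD moves
        return new


repo = Repo()
repo.refs["refs/heads/feature"] = "c1"  # new branch at the same commit
repo.checkout_branch("feature")
repo.commit()                           # advances feature, main stays put
repo.checkout_detached("c1")
repo.commit()                           # advances HEAD only
print(repo.head, repo.refs)
```

| After the last commit no branch points at the new commit, which
| is why a detached HEAD is easy to lose work on.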
| mwexler wrote:
| I guess.
|
| To me, the opposite is a more worthy essay: why, with all the
| power to customize our tech, do we create things that
| consistently work differently than people's intuition?
|
| The fact that it "mostly jibes" feels like a footgun, not a
| feature.
|
| I get that for some, "git just works! It made sense from day
| one" but in my limited experience, 0% of people I've worked
| with have said that.
|
| Sure, we can all learn the tech. And expert techniques in any
| field often don't jibe with naive expectations. But for me and
| the folks I work with, the tech industry feels like it's
| gliding more towards inscrutable tools vs ease of use.
|
| We've hit a stage where many rely on code completion bots and
| answer-supplying bots instead of being able to directly embrace
| our tech. I wish the tech was more approachable on its own, but
| perhaps this is the natural evolution of things.
| xanderlewis wrote:
| That doesn't seem like the opposite to me. It seems like the
| same thing. Rather than rejecting people's intuition as
| 'understandable but wrong' why don't we use it as the basis
| for a better solution?
| atq2119 wrote:
| You do have a point, but it's not a slam dunk. Intuition
| isn't some fixed thing but arises from personal experience. A
| lot of that is common to a culture, but there are different
| cultures and in any case, some truly personal aspects remain.
|
| There needs to be a balance between creating new, more
| powerful intuitions, and meeting people at the intuitions
| they already have.
|
| Case in point, Git's branching model is pretty intuitive when
| you understand how Linux kernel development works. Perhaps 0%
| of the people you've worked with have looked into that.
| That's fine. Different cultures...
|
| Another example that may be worth studying is mathematics and
| the hard sciences. Learning those is a lot about learning
| powerful intuitions.
| BlueTemplar wrote:
| Yeah, few things are actually "intuitive". "Shared
| familiarity" is probably a better term.
| eviks wrote:
| Partially because that's a much harder design challenge,
| especially for people with an unrelated skill set
| chthonicdaemon wrote:
| People writing software fall into several categories based on
| the problem they're solving, the reason they're solving it
| and the audience of the solution.
|
| I solve my own problems for my own reasons all the time and
| therefore other people's intuitions are immaterial in the
| process. It would just slow me down to think "how would other
| people use this" when I'm focused on some technical personal
| problem.
|
| Commercial software developers solve problems with the clear
| purpose of selling the solution to others and where they know
| ahead of time roughly what their audience's intuitions are.
| This is why intuitive GUI applications exist - there are
| whole industries devoted to finding out what people expect,
| what lowers cognitive load etc. iOS and Android apps give you
| a good idea of what is possible with modern tech when the
| purposes are properly aligned.
|
| The problem here is that git was expressly developed by Linus
| to solve his own problem in a way that made sense to him with
| no thought as to how other people would use it. There were no
| focus groups, early betas, feedback from users and so on. At
| best there have been slow fixes to the porcelain to fix the
| stuff that bothers the people who could make a PR to git. On
| the other hand there are also many front-end projects that
| attempt to align some other person's idea of how version
| control is supposed to work with the Git model.
|
| Anyway - I am in the camp where I very seldom get confused
| about a git thing because the actual expressed model is
| really simple (in the way that x86 assembly is a "simpler"
| language than Java). I find most front-ends much more
| confusing because they don't seem to work the way I expect.
| But I am never surprised when someone's pet project is
| understandable only by themselves. Or indeed when a consumer
| product is consumer-friendly. The real surprise is when a
| lone programmer makes something for themselves that then goes
| on to have wide appeal.
| ndriscoll wrote:
| As one of those people that thinks it's extremely intuitive,
| I have to wonder where the confused people are learning about
| git. The documentation on the site[0] is quite clear:
|
| > A branch in Git is simply a lightweight movable pointer to
| one of these commits. The default branch name in Git is
| master. As you start making commits, you're given a master
| branch that points to the last commit you made. Every time
| you commit, the master branch pointer moves forward
| automatically.
|
| It has multiple diagrams explaining how commits point to
| their content and their parents, and branches point to
| commits. The Pro Git content has been there for at least 10
| years (it's what I learned from 10 years ago).
|
| Maybe the problem is just that the Internet is full of blogs
| that have incorrect diagrams (like those in the OP) and bad
| explanations, despite the main website having great
| documentation!
|
| [0] https://git-scm.com/book/en/v2/Git-Branching-Branches-in-a-N...
| chihuahua wrote:
| If Git was "extremely intuitive", and the documentation was
| "great", why would so many otherwise smart people keep
| writing blogs about it with incorrect diagrams?
|
| What is your theory about why so many people are having
| difficulty creating a correct mental model about Git, and
| why so many people are writing incorrect blogs about it?
| ndriscoll wrote:
| Like I sort of implied, my theory is people haven't read
| the docs on the official site (or the book that's on the
| site), and keep regurgitating bad information that they
| read on some blog or howto site. I don't know why they do
| this. I don't make these sites, so I don't know what
| motivates people who do, especially people who don't
| understand what they're writing about.
|
| If you understand the basic design premise (commits are
| content-addressed immutable snapshots), the pointer stuff
| is kind of obvious. It _has_ to work something like that
| for it to be able to be immutable if you want to be able
| to make branches/tags after the commit is created.
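|
| A rough sketch of why the pointers have to live outside the
| commit, assuming a simplified payload (this is not Git's real
| object format, and `commit_id` is a made-up helper). If the id is
| a hash of the commit's contents, nothing can be written into the
| commit later without changing its id:

```python
import hashlib


def commit_id(tree, parent, message):
    """Toy content address: hash of everything the commit contains."""
    payload = f"tree {tree}\nparent {parent}\n\n{message}".encode()
    return hashlib.sha1(payload).hexdigest()


first = commit_id("tree-1", "", "initial")
second = commit_id("tree-2", first, "add feature")

# Branches and tags are separate, mutable name -> id entries.
refs = {"main": second}
refs["v1.0"] = second       # tag added later; the commit is untouched
refs["experiment"] = first  # so is this one

# Editing anything inside the commit would produce a different id.
assert commit_id("tree-2", first, "add feature (edited)") != second
```

| Because `second` embeds the id of `first`, changing an old commit
| also changes the id of every commit built on top of it.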
| afiori wrote:
| In part it is because git is hard to use, in part it is
| because mostly people learn git by oral tradition and
| often treat it like sorcery.
| marcosdumay wrote:
| Often, people's intuition is wrong in very important ways,
| and something that works like they expect is sure to create
| footguns or just blow up by itself.
|
| But I'm not sure git is a case of this. The DVCS that were
| created following people's intuitions were known to be slow
| and internally complex, but I have never heard about them
| failing. (And the slowness is obviously of a kind that can be
| optimized away.)
|
| We just stuck with the worst UI ever devised in public for a
| VCS because of network effects.
| chihuahua wrote:
| I totally agree with it being "the worst UI ever devised".
| It's fine to use commits with parent pointers and branches
| as pointers to commits and all the other stuff internally.
| But there should be a UI wrapped around that that maps to
| operations that make sense for the purpose of working on a
| software project.
|
| Not this:
|
|     git merge [-n] [--stat] [--no-commit] [--squash]
|         [--[no-]edit] [--no-verify] [-s <strategy>]
|         [-X <strategy-option>] [-S[<keyid>]]
|         [--[no-]allow-unrelated-histories]
|         [--[no-]rerere-autoupdate] [-m <msg>] [-F <file>]
|         [--into-name <branch>] [<commit>...]
| codesnik wrote:
| because a) everyone's intuition is different, b) sometimes
| uneducated intuition is just wrong. On a surface level things
| look good, but in some specific situations the intuitive way of
| doing things can be inconsistent or have no solution at all.
| In those cases you are just stuck with a magic box of software
| which did _something_, you have no idea what, and you reach
| for a backup.
|
| Git is not like that. It is very, very simple. If you learn
| the basics of it, your intuition will align with git's
| "intuition" too, and you can do crazy things with total peace
| of mind, without googling or looking into the source code of git
| to see how they had to make something "intuitive" in some
| definition of the word.
| timacles wrote:
| This is the same reasoning that SQL gets criticized with. But
| the answer is simple.
|
| Git (and SQL) tasks range from simple to very complicated.
| Everyone likes to fantasize about making it easier but
| they're only thinking about the fraction of functionality
| they use, rather than everything it currently does.
|
| If someone could come up with a simpler solution they would,
| but they can't because git can do extremely complicated
| things and is internally consistent. Most people
| underestimate that part
| informalo wrote:
| Yup. If it works, it ain't stupid.
| adaboese wrote:
| I cannot be the only one that gets away with only knowing:
|     git pull
|     git merge x
|     git checkout [-b] foo
|     git commit
|     git push
| mixedmath wrote:
| Possibly `git branch NEWBRANCHNAME` instead of `git checkout -b
| NEWBRANCHNAME`. When I need to show git to someone in order for
| them to contribute to something, I give them only these
| incantations --- and instructions to ask me if weird git things
| happen.
| rkangel wrote:
| You then need to do both `git branch xxx` and `git checkout
| xxx` though.
|
| If you teach "checkout to move around and add -b when moving
| to a new branch the first time" that works pretty well
| still_grokking wrote:
| It's `git switch [-c]` nowadays.
| someone7x wrote:
| I've given up trying to configure git to push the branch I'm
| on, so I type this little dance each time:
|     git switch -c foo
|     git push
|     > did you mean git push --args-with-branch-name?
|     sigh, copy, paste, enter
| bravetraveler wrote:
| I manage with even less, any merging under my watch happens as
| a strategy with pulling
| _ZeD_ wrote:
| FWIW that's 99% of my usage of git
|
| (well... it's about 1% because I do everything using the
| eclipse git UI, but that's the same behavior you get from those
| commands)
| have_faith wrote:
| I _know_ a decent amount of git, but day to day I use GUIs
| (Sublime Merge).
| ryanjshaw wrote:
| The most important command: git reset --hard
| kreeben wrote:
| This is my favorite. It allows one to easily create "service"
| branches based on tags where you apply to that tag a select
| set of commits from the development branch that you can then
| easily deploy to PROD without including the rest of the
| (perhaps not sufficiently tested) commits and without having
| a convoluted branching strategy.
| jwestbury wrote:
| Or: git reset --soft HEAD~1
| sbergot wrote:
| protip: If you want to switch branches you can now use "git
| switch [-c] foo". If you want to restore files you can do "git
| restore .".
|
| Basically you can stop using checkout.
|
| *edit*: fixed switch branch creation parameter.
| drdec wrote:
| I think perhaps you meant "git switch [-c] foo"
| sbergot wrote:
| Correct thank you.
| Am4TIfIsER0ppos wrote:
| "THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE."
|
| I will keep using them so I can keep using old software. How
| new is it? Does ubuntu or debian have it?
| sbergot wrote:
| They were introduced in git 2.23, released in August
| 2019.
| Macha wrote:
| And is in Ubuntu 22.04 and Debian 11.
| world2vec wrote:
| `git add` sometimes
| gear54rus wrote:
| Because your IDE shows line-based blame and lets you check out
| old file versions via right-click? :) Me too
| baggy_trough wrote:
| Yes, this is very good. Also, rebasing is evil.
| Sharlin wrote:
| I definitely can't work without `git add`. Other commands I use
| daily or almost-daily (frequently enough that I have aliases
| for them) are `git add -p`, `git commit --amend`, `git rebase`,
| `git rebase -i`, `git stash`...
|
| Then there's of course `git log`, `git diff`, and `git status`,
| but I presume you know about those as well.
| Patrol8394 wrote:
| To clean up the history before asking for a review.
|
| git rebase HEAD~2 -i
|
| git commit --amend
| mrkeen wrote:
| For me, git becomes unintelligible when there's a crazy train-
| track map of branching and merging.
|
| So I usually do whatever I can to keep a single straight-line
| master history.
|
| I branch off for a task, then after a while my branch isn't
| joined at the tip of master. So I rebase locally until it is.
| Then when the PR happens, master gets my changes added to the
| top, with no extra noise from merge commits.
|
| Even if the local-rebase workflow is slightly more complicated,
| the payoff is a really clean history, making future reasoning
| about branches much easier. Not to mention merge conflicts are
| easier to solve when you rebase early & often.
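|
| A toy illustration of what "rebase until it is at the tip" does
| to the history, ignoring conflicts and real patch application.
| The helpers `make_commit` and `build` and all change descriptions
| are invented for this sketch. The same changes get replayed on a
| new parent, so they come out as new commits in a straight line:

```python
import hashlib


def make_commit(parent, change):
    """Toy commit id: a hash of the parent id plus the change description."""
    return hashlib.sha1(f"{parent}\n{change}".encode()).hexdigest()[:8]


def build(base, changes):
    """Stack changes on top of base; return the resulting commit ids in order."""
    ids, tip = [], base
    for change in changes:
        tip = make_commit(tip, change)
        ids.append(tip)
    return ids


root = make_commit("", "initial")

# My task branch, started from the old tip of master.
my_changes = ["fix typo", "add test"]
my_branch = build(root, my_changes)

# Meanwhile master moved on.
new_master = build(root, ["colleague's change"])[-1]

# "Rebase": replay the same changes on top of the new tip.
rebased = build(new_master, my_changes)

# Same changes, different ids, because every parent link changed.
assert set(rebased).isdisjoint(my_branch)
print(new_master, "->", " -> ".join(rebased))
```

| The old commits in `my_branch` still exist in this model; nothing
| points at them any more, which is also roughly what happens
| locally after a rebase.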
| yCombLinks wrote:
| That's because you've apparently used some command to look at
| that history, none of his commands ever show that
| remus wrote:
| I think this is a pretty sensible approach. Git feels like
| one of those tools where you are given a lot of power but
| it's your responsibility to use it in a sensible way. A bit
| like with Excel or spreadsheets. You can do a lot but you can
| also make a big mess pretty quickly.
| silon42 wrote:
| In my observation, the problem comes when there is a "merge
| main to the feature branch before merging"
|
| This step is when newbies get confused because the diffs are
| the wrong way, and I've seen people often losing merge hunks
| from master when there are conflicts which can be disastrous
| (not a git issue, I've seen people do the same in SVN).
|
| The proper solution IMO would be for git to have a "merge
| --reintegrate" which would do the opposite merge: take the
| main branch and merge the current feature branch to it...
| after success, you have a new feature branch.
|
| This is why I also prefer rebase and cleaner history (but a
| common mistake here is to squash after PR approval... that
| should be done before, not after).
| gumby wrote:
| > For me, git becomes unintelligible when there's a crazy
| train-track map of branching and merging.
|
| Consider finding peace by inverting your perspective: if your
| organization's development process (for whatever reason, many
| of them legitimate) involves a crazy train-track of features
| being developed in parallel, isn't it great that you're all
| at least using something that can keep track of it?
|
| For a small team that should be unnecessary.
|
| Frequent rebasing when you're on a side branch of development
| is smart, but doesn't conflict with my point.
|
| Also, really, who looks back into the depths of history?
| There's a reason a lot of backup schemes rotate a set of
| tapes over 30 or even 14 days. For that reason I am not a fan
| of rewriting history for "clarity": I consider it wasted
| effort.
|
| For the same reason I don't care about branches for
| explorations that turned out to go nowhere -- just mark the
| head abandoned and stop worrying about it.
| saltserv wrote:
| you can't make do with only that if you're in a team of 2 even
| js2 wrote:
| Git is conceptually simple but has a baroque UI. I really
| recommend spending 15 minutes to understand it conceptually:
|
| objects: blobs, trees, commits, annotated tags
|
| refs: branches (local, remote-tracking), tags, HEAD
|
| other: working tree, index (aka cache aka staging area),
| remotes
|
| There's lots of good guides out there: git from the bottom up,
| git for computer scientists, and the git parable are some that
| spring to mind. The Git Book is also excellent but it's more
| than 15 minutes of your time.
|
| All the commands become way less mystifying when you understand
| what they are manipulating. You'll never get into a state where
| you want to `rm -rf` the entire repo and start with a new
| clone. There will hopefully be no more teeth grinding or
| keyboard mashing.
|
| I've used a lot of VCSs over the years (rcs, sccs, cvs,
| subversion, clearcase, mercurial, git) and I swear git is the
| one I find least frustrating. The others may have had simpler
| interfaces, but they were either conceptually more complex,
| overly rigid in their design and behavior, or both (looking at
| you clearcase).
|
| Look, I get it: a lot of folks see git as a necessary evil
| that's part of their day job. I disagree. I think it's really
| worth spending the time to learn well, probably like you
| invested some time in your editor and other tooling.
|
| I mean, it's easier than C++. :-)
| GuB-42 wrote:
| You are probably missing "git add" but otherwise, it is fine...
| unless you fuck up.
|
| Unfucking things is what most of my knowledge of git goes to.
| Committing in the wrong branch, the wrong files, starting from
| the wrong commit, etc... Before you push, almost all mistakes
| are fixable, but it requires knowing a few more commands.
|
| Then there are the project specific things. For instance, I
| worked on a project where we didn't have a central server we
| could push to and pull from (airgap). So we had to work with
| bundles. Git does that really well (it really is
| decentralized), but it is uncommon. Some people prefer a
| rebase-based workflow, some use cherry-picking extensively,
| some projects are more prone to conflicts than others, etc...
| 2-718-281-828 wrote:
| you don't need git add? no git status? i take my hat off to
| you, sir!
| airstrike wrote:
| that's my list too, pretty much (+ the obvious `git add` and
| `git status` others have mentioned)
|
| I throw in a `git reflog` once in a while, and the other day I
| patted myself on the back for my first use of `git tag -a mytag
| -m "This is my first tag!"` followed by `git push origin
| mytag`. I felt like god
|
| now as much as I hate Xcode, its UI for looking at all my
| current changes and staging them by blocks of line for commit
| is like a superpower. as a solo developer working on brand new
| code, not all of my lines of thinking follow very atomic git
| commits, so it's nice to separate, say, some refactoring code
| from some actually new functionality when committing changes
| dahart wrote:
| You're definitely not the only one, but you will level up if
| you learn branching. After that you can become the guru amongst
| your peers and a god among men if you learn how to use 'git
| reflog'. ;) That will help you learn how fix almost any git
| accident.
| chrisweekly wrote:
| you're def not the only one. but learning more pays dividends
| in confidence and resilience. fyi, pull is just shorthand for
| fetch && merge
| paulddraper wrote:
| Really?? I would have a very hard time without:
|     git log
|     git show
|     git status
|     git blame
|     git diff
|     git add
|     git reset
| alex_smart wrote:
| My "minimalist" list of git commands:
|     git add
|     git blame
|     git branch
|     git checkout
|     git cherry-pick
|     git clone
|     git commit
|     git diff
|     git fetch
|     git log
|     git merge
|     git pull
|     git push
|     git reset
|     git rm
|     git stash
|     git status
|
| 17 commands in total. I don't think it is possible to be a
| professional software engineer without being familiar with
| them. Granted, some of these you may need more frequently
| than others.
|
| My guess is, the OC relies on their IDE/editor for the
| functionality provided by some of these commands. But then,
| why not just go all the way. Just use all VC features
| provided by your IDE, and claim that you need zero git
| commands.
| xorcist wrote:
| Unless you are superhuman and never make mistaeks, you should
| put "git rebase" (with and without -i) on top of your list of
| things to learn.
| alex_smart wrote:
| Let us also hope they do a git diff before pushing changes.
| leonheld wrote:
| I'm nothing without git rebase -i.
| bmacho wrote:
| If you get errors, save your work elsewhere, delete the
| project, and download a fresh copy.
| seba_dos1 wrote:
| You don't _need to_ know many more commands than those - maybe
| aside from "reset", "rebase" and "fetch", which definitely come
| in handy regularly, and maybe a few more for showing status or
| browsing the commit graph (unless you use some GUI for that) - as
| whenever you need anything else you'll usually just look it up
| anyway, either in the man pages or on the Web. However, if your
| _mental model_ of git is limited to these commands only, you're
| doing yourself a disservice that leads to
| https://xkcd.com/1597/
| zdw wrote:
| A lot of things in git are just pointers to commits, and then the
| git implementation handles them under the covers in some way that
| usually makes sense but not always.
|
| One example that also bites people: moving files isn't stored in
| git - if you move files (even with `git mv`) and create a new
| commit, the moves aren't stored, but this is reconstructed later
| by the client based on similarity, which comes from the diff
| algorithm.
|
| And git has _multiple diff algorithms_ to pick from:
| https://git-scm.com/docs/git-config#Documentation/git-config...
|
| And optionally to _not detect renames_ in diff output with
| `diff.renames`:
| https://git-scm.com/docs/git-config#Documentation/git-config...
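|
| A hedged sketch of similarity-based rename detection between two
| snapshots. This is not Git's actual algorithm: the function name,
| the 0.5 threshold, and the use of Python's difflib are all
| assumptions made for illustration. It only shows the idea that a
| "rename" is inferred after the fact from a deleted path and an
| added path with similar contents:

```python
from difflib import SequenceMatcher


def detect_renames(old, new, threshold=0.5):
    """Guess renames between two snapshots (dicts of path -> text).

    Returns a list of (old_path, new_path, similarity) tuples.
    """
    deleted = [p for p in old if p not in new]
    added = [p for p in new if p not in old]
    renames = []
    for src in deleted:
        best = None
        for dst in added:
            score = SequenceMatcher(None, old[src], new[dst]).ratio()
            if score >= threshold and (best is None or score > best[1]):
                best = (dst, score)
        if best is not None:
            renames.append((src, best[0], best[1]))
            added.remove(best[0])  # each added file is matched at most once
    return renames


before = {
    "util.py": "def add(a, b):\n    return a + b\n",
    "README": "hello\n",
}
after = {
    "helpers.py": "def add(a, b):\n    return a + b  # sum\n",
    "README": "hello\n",
}

for src, dst, score in detect_renames(before, after):
    print(f"{src} -> {dst} ({score:.0%} similar)")
```

| Nothing in either snapshot says "util.py was renamed"; change the
| file enough in the same commit and the guess silently stops
| working, which is the failure mode people complain about.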
| keybored wrote:
| Yup. "Storing moves" is the kind of thing that might sound
| intuitively obvious but then gets gnarly and non-obvious when
| you think about it for five minutes. And so something that
| might be "obvious" to do then turns out to be so non-obvious--
| how to catch all file moves (intent) outside of simple
| identical content cases, and how do you represent them
| internally?--that you realize that just using snapshots is
| really the best thing to do.
| hyperthesis wrote:
| BitKeeper already did it.
| lubutu wrote:
| I think this is the one thing I feel BitKeeper does better
| than Git. Git can get confused about where a file came
| from, for moves but especially for copies, and so the
| version history ends, even if you ask it to try and follow
| along. BitKeeper, on the other hand, keeps the moves and
| copies as part of the history, so you can always trace it
| through to the origin of the file, no matter how
| circuitous.
| account42 wrote:
| git log has --follow but unfortunately it only works when
| specifying a single file and not e.g. a whole directory.
| chrismorgan wrote:
| It's completely trivial. The obvious and correct place is in
| the commit object just like author and date and such, since
| renaming is semantically part of the commit, not the tree:
|     commit 0123456789abcdef0123456789abcdef01234567
|     parent fedcba9876543210fedcba9876543210fedcba98
|     author Nemo <nemo@example.invalid> 1234567890 +0000
|     committer Nemo <nemo@example.invalid> 1234567890 +0000
|     rename-from path1.old
|     rename-to path1.new
|     rename-from path2.old
|     rename-to path2.new
|
|     Commit message
|
| And you don't _detect_ moves (because that's madness), but
| require that people record them deliberately, just like every
| other VCS has done. There's even git-mv already, it just
| skips a step that every other VCS's equivalent command would
| do. (And technically this all works out because the index is
| a commit, so you can record the rename normally.)
|
| Of course, all of this assumes that moving a file is a
| meaningful operation. Perhaps ideally (for most languages and
| systems) you'd track this in far smaller chunks, so that you
| can track changes to a function even when it alone was moved
| to a different file. But things like Git aren't interested in
| those kinds of semantics, and work technically at the file
| level, more or less, so I think it should track renames
| because _in practice_ straightforward renames are _super_
| common, but often also involve other changes that thwart
| rename detection. Years ago Linus explained why he didn't
| like storing moves (someone else has linked it), but I'm
| largely not sold on his reasoning--the theory of the
| perfect has hindered the useful, and file renames _are_
| commonly meaningful in ways more than he said.
| 2-718-281-828 wrote:
| > moving files isn't stored in git
|
| is there an intuitive and enlightening explanation as to why it
| is this way?
| keybored wrote:
| Git stores snapshots and that's it. The whole tree, not per-
| file.
|
| As to why Linus doesn't like storing file moves:
| https://public-inbox.org/git/Pine.LNX.4.58.0504150753440.721...
| bqmjjx0kac wrote:
| Man, he communicates like a dick all the time I guess.
| xorcist wrote:
| I'd be happy to argue why Linus is wrong here. Many things
| would be much easier if git recorded some more metadata in
| every commit: file moves, and branch moves, to start with.
|
| Having some sort of notion of "parent branch" would be very
| useful for a number of common operations, and a "renamed
| file" without having to rely on client dependent heuristics
| too. Empty files trip people up all the time so a "create
| file" would fit in perfectly.
|
| These concepts would also be a good basis for more user
| friendly clients. Other version control systems do this, so
| the surprise factor should be low.
| erik_seaberg wrote:
| People would get lazy and rename a file without telling
| Subversion they had done it, so it would write an "old
| file deleted, new file created from nothing" revision.
| Most of the merge conflict resolution machinery just
| couldn't run without the missing guidance. Git infers
| someone _probably_ renamed a file you edited or vice
| versa, which seems risky but works better in practice.
| ziyao_w wrote:
| It's kind of funny to see Linus browbeat other people
| into submission regardless of him being right or not, while
| claiming "I am always right".
|
| A few counter points:
|
| - `hg` has `cp`, and I believe both Meta and Google's internal
|   systems have that;
| - git has `mv`, which was added later, but it is really janky and
|   git would forget files are moved, which I think is because git
|   doesn't try to track that, likely because of the philosophy
|   here;
| - as for storing file moves - nobody said you *have* to use this
|   information, but you can certainly use this information to
|   help with things.
|
| The whole thread is an interesting read though and I will
| try going through it someday - maybe doing that would
| change my mind.
| paulddraper wrote:
| Git doesn't store _any_ individual changes: files moved,
| lines added, lines deleted, etc.
|
| It stores a commit graph, and a tree at each of those
| commits. (A lossless compression algorithm deduplicates
| information.)
|
| There's no need for the author to be concerned with what
| diffing information gets incorporated into the commit. Diffs
| are up to the viewer of the commit history.
| git show --diff-algorithm=...
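|
| A small sketch of "diffs are up to the viewer", assuming each
| commit is modelled as a full snapshot (a dict of path -> text).
| `diff_snapshots` is a made-up helper that uses Python's difflib,
| not anything Git ships. The diff is computed on demand from two
| snapshots and is never stored:

```python
import difflib


def diff_snapshots(parent, child):
    """Compute a unified diff between two full snapshots, on demand."""
    out = []
    for path in sorted(set(parent) | set(child)):
        before = parent.get(path, "").splitlines(keepends=True)
        after = child.get(path, "").splitlines(keepends=True)
        if before != after:
            out.extend(
                difflib.unified_diff(before, after, "a/" + path, "b/" + path)
            )
    return "".join(out)


snapshot_1 = {"f.txt": "one\ntwo\n", "g.txt": "same\n"}
snapshot_2 = {"f.txt": "one\n2\n", "g.txt": "same\n"}

print(diff_snapshots(snapshot_1, snapshot_2))
```

| Swapping in a different diffing routine changes what the viewer
| sees without touching either snapshot, which is the point of
| being able to pick a diff algorithm at display time.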
| layer8 wrote:
| For the historical rationale see here:
| https://gist.github.com/borekb/3a548596ffd27ad6d948854751756...
|
| In short, Linus's stance is that file renaming doesn't matter,
| only the _contents_ of files matter, and the moving of
| contents between files. Moved/renamed files then fall out as
| a special case of moving content.
|
| Personally, I think this is a case of the better being the
| enemy of the good, and his "clearly superior algorithm"
| doesn't work as well as claimed in practice. Or maybe tooling
| merely still isn't up to snuff after 18 years.
| seba_dos1 wrote:
| I don't think it's about having a stance, it's about git's
| architecture. From the commit graph point of view, there's
| no such thing as moving anything at all, neither files nor
| content. Commits represent a whole new state of the
| repository, not a diff from the previous state. The only
| way a commit is linked to the previous state is via parent
| pointer, it can otherwise be completely unrelated (and you
| can simply change the parent pointer without changing
| anything else in the commit). Any diffs are calculated at
| runtime. The issue with renames is just a consequence of
| assuming such a data model - you could try to plaster it over
| with some metadata, but ultimately you would still be
| fighting against the model rather than working with it.
|
| Many people develop a bad mental model with commits as
| diffs, because that's what the UI makes them think commits
| are. It can work for a while, but inevitably leads to
| confusion later on.
| layer8 wrote:
| As you say, commits link to their parent(s), and those
| links effectively represent the edges of the commit
| graph. It makes perfect sense to record moves on those
| edges. That's how other VCSs do it. There is no conflict
| with the commit model.
|
| Viewing the commit graph in terms of nodes (commits) or
| edges (diffs) is equivalent, these are dual views you can
| easily convert between. The internal representation is
| independent from that. Some VCSs use a mix of diffs and
| full revisions internally. Even Git uses delta
| compression when packing objects.
| seba_dos1 wrote:
| What I meant is that git doesn't have any structure to
| represent an edge other than a simple pointer.
| Conceptually it wouldn't be a big change to add some, but
| the consequence of that is that everything in git
| revolves around nodes rather than edges, and whenever the
| concept of an edge is needed (such as in "cherry-pick")
| it's being calculated on the fly.
| layer8 wrote:
| I don't see where this would be causing any issues. There
| is a canonical place where to put edge metadata, namely
| in the child commit. And whenever you're interested in
| move information, you have to process the respective
| child commit anyway.
| ptx wrote:
| If you think of it not as a "rename" (which would belong
| in the edge object if it existed) but rather as a "note:
| the file A in this tree was known as B in the parent
| tree" it would make perfect sense to store it in the
| child commit.
| smusamashah wrote:
| My TL;DR for git commits is that they are connected like a
| linked list, but in reverse, and have more pointers than just
| head/tail. I recommend having a look at Merkle trees. I don't
| understand git cli, but I can manipulate git commits, branches,
| tags etc well based on basic understanding using a good git UI.
| nunez wrote:
| Great explanation. Thanks, Julia!
| PeterWhittaker wrote:
| I just reread my take on branches and relearned some stuff I'd
| forgotten:
| https://peter-whittaker.com/obligatory-grokking-git-post
|
| Warning, all text, no diagrams....
| rkangel wrote:
| Git doesn't have the concept of "main is special", but at least
| tools like Gitlab have protected branches to stop you screwing up
| too much.
|
| Some concept of "parent" and "child" branches would actually be
| pretty interesting. You do have to support multiple "parent"
| branches though for long term support branches.
| samus wrote:
| Protecting branches is indeed very important. I make errors all
| the time when screwing around. It helps enormously being
| restricted to just messing up one's feature branches. Many
| other changes can be done via the GUI with PRs and the various
| kinds of controlled merge and rebase strategies they support,
| like Merge, Rebase + Merge, FF-only Merge, Squash merge, etc.
| cedws wrote:
| It's also a security feature. If you have a repo with a lot
| of developers working on it, you need to be sure they
| absolutely cannot slip in code with nobody noticing, or
| trigger CI/CD and compromise build secrets or even
| production.
| andrybak wrote:
| > Git doesn't have the concept of "main is special"
|
| Technically, there is special handling for both "master" and
| "main" in Git, in a fairly obvious but, I'd argue, not very
| important way. When you merge two regular branches, the commit
| message is `Merge branch 'source' into destination`. But not if
| destination is `master` or `main` - the `into ...` part is
| omitted for those merge commits.
|
| But this is just for backward compatibility. Git is very
| conservative in changing such user facing behavior as generated
| merge commit messages. To get Git to treat `master` and `main`
| truly without special handling, set config option
| `merge.suppressDest` [1] to an empty value:
|
|   $ git config merge.suppressDest ""
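|
| A rough Python sketch of the message rule described above. This is
| not Git's source: `merge_message` is a hypothetical helper, the
| `suppress_dest` parameter stands in for the `merge.suppressDest`
| patterns, and its ("main", "master") default simply mirrors the
| behaviour described in this comment.

```python
from fnmatch import fnmatchcase


def merge_message(source, destination, suppress_dest=("main", "master")):
    """Build a default-style merge subject line.

    The " into <destination>" suffix is dropped when the destination
    matches one of the glob patterns in `suppress_dest`.
    """
    message = f"Merge branch '{source}'"
    if not any(fnmatchcase(destination, pattern) for pattern in suppress_dest):
        message += f" into {destination}"
    return message


print(merge_message("feature", "develop"))
print(merge_message("feature", "main"))
# An empty pattern list plays the role of the empty config value.
print(merge_message("feature", "main", suppress_dest=()))
```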
|
| `master` is also used as the default name for the default
| branch in newly created repositories. See option
| `--initial-branch` of `git init` and config variable
| `init.defaultBranch` [2] to override. Git for Windows, for
| example, allows setting the config option in its installer.
|
| Source code:
|
| For merge commit formatting:
| https://github.com/git/git/blob/2108fe4a1976f95821e13503fd33...
|
| For default branch naming:
| https://github.com/git/git/blob/91e2ab1587d8ee18e3d2978f2b7b...
|
| Git for Windows installer suggesting setting
| `init.defaultBranch`:
|
| - https://github.com/git-for-windows/build-extra/blob/586c46ec...
|
| - https://github.com/git-for-windows/build-extra/blob/586c46ec...
|
| Footnotes:
|
| [1] https://git-scm.com/docs/git-merge#Documentation/git-merge.t...
|
| [2] https://git-scm.com/docs/git-init#Documentation/git-init.txt...
| chrnola wrote:
| There's some special handling for FETCH_HEAD too (i.e. which
| branch on a remote is considered the default).
| jacoblambda wrote:
| It actually does but it's very much in alpha/active development
| (under the umbrella of OpenSSF with the intent of being
| integrated into mainline git eventually).
|
| https://github.com/gittuf/gittuf
| ndriscoll wrote:
| Git itself doesn't run a persistent process and I don't see
| how it'd make sense to prevent a user from making arbitrary
| changes to their local repo, so this sounds like just another
| server like GitHub, Gerrit, Gitlab, etc. that already have
| those features.
| Sharlin wrote:
| But... If you have a rebase workflow, then `git checkout trunk;
| git rebase branch` is exactly how you "merge" an offshoot branch
| into a trunk branch! That's what Github does when you rebase-
| merge a PR, for example.
| karatinversion wrote:
| No, that's not right. If you did that, you would need to force
| push to get the result pushed to the remote.
| Sharlin wrote:
| Oh, right. So what actually happens is that the offshoot must
| first be rebased on top of the trunk, and then trunk can be
| fast-forward merged/rebased (same thing, really) to the
| offshoot's head.
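|
| The condition behind that can be sketched in a few lines of Python:
| a branch pointer can fast-forward to another commit only if its
| current commit is an ancestor of the target. The commit ids and
| helper names below are invented for the example.

```python
def ancestors(parents, tip):
    """Return every commit reachable from `tip` (inclusive) via parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def can_fast_forward(parents, trunk_tip, branch_tip):
    """A ref can fast-forward only to a descendant of its current commit."""
    return trunk_tip in ancestors(parents, branch_tip)


# Commit id -> list of parent ids.
parents = {
    "base": [],
    "t1": ["base"],     # trunk moved on after the offshoot was created
    "f1": ["base"],     # offshoot, still based on the old trunk tip
    "f1'": ["t1"],      # the same change after rebasing onto trunk
}

print(can_fast_forward(parents, "t1", "f1"))    # offshoot not rebased yet
print(can_fast_forward(parents, "t1", "f1'"))   # rebased offshoot
```

| Before the rebase the check fails, which is why pushing the
| result would need a force push; after it, trunk can simply move
| to the offshoot's head.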
| ChrisMarshallNY wrote:
| That's an excellent explanation.
|
| _> "Wrong" models can be super useful._
|
| This is used in usability and UX design a lot. Affording mental
| models that don't reflect the actual code happens all the time.
| samus wrote:
| This is perfectly fine and the added value of a great
| application if it can hide the underlying reality completely.
| With Git, the abstractions are paper-thin at best though. Good
| UIs can indeed cover up many aspects, but they only work as
| long as there are no merge or rebase conflicts. To correctly
| resolve these, the user has to have a precise picture of what
| is actually going on.
| pvg wrote:
| _This is used in usability and UX design a lot._
|
| It's the fundamental thing that makes UI work. I've always
| liked the title of Brenda Laurel's book - _Computers as
| Theatre_
| informalo wrote:
| > You do need to explicitly specify the other branch when merging
| or rebasing or making a pull request (like git rebase main),
| because git doesn't know what branch you think your offshoot is
| based on.
|
| I think a big issue with the presented intuition is that it's
| limited to wanting to merge the base/trunk/main branch into your
| feature branch. However, sometimes you want to merge a feature
| branch into another feature branch. With this in mind, you can
| form a better intuition, imo, where it's absolutely clear that
| you have to specify what branch you want to merge into another
| one.
| mtnygard wrote:
| I have found that git makes a lot more sense if you reverse the
| mental model of lineage. People think about a lineage going
| _forward_. But a more useful way to think is in terms of
| _backward_ pointers.
|
| A commit points to its parent(s). Since a branch is just a
| commit ID, you can follow the parent links backwards to find the
| whole history of that branch.
|
| So a "branch point" is just where two chains of parent links
| converge.
|
| The special part is merge commits. Those have multiple parents,
| indicating that two histories fused into one.
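|
| A small Python sketch of the backward-pointer view, with made-up
| commit ids. `history` follows parent links from a branch tip, and
| `branch_point` returns the first commit that both histories
| share. It is a simplification of what `git merge-base` computes,
| not a reimplementation of it.

```python
from collections import deque


def history(parents, tip):
    """Commits reachable from `tip`, nearest first (breadth-first walk)."""
    order, seen, queue = [], {tip}, deque([tip])
    while queue:
        commit = queue.popleft()
        order.append(commit)
        for parent in parents[commit]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return order


def branch_point(parents, tip_a, tip_b):
    """First commit in tip_a's history that tip_b can also reach."""
    reachable_from_b = set(history(parents, tip_b))
    for commit in history(parents, tip_a):
        if commit in reachable_from_b:
            return commit
    return None


# Commit id -> list of parent ids; a branch is only a name for a tip.
parents = {"root": [], "base": ["root"], "m1": ["base"], "m2": ["m1"],
           "f1": ["base"], "f2": ["f1"]}
branches = {"main": "m2", "feature": "f2"}

print(history(parents, branches["feature"]))
print(branch_point(parents, branches["main"], branches["feature"]))
```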
| layer8 wrote:
| The issue is that if you consider a branch to be what is really
| the history of the branch tip, then a branch is not just the
| part starting from the last join with another branch. Instead
| it is some directed path through the commit DAG, a path that in
| general can't be reconstructed from the information Git keeps.
|
| If, for example, you have a structure like
|
|           |
|           o
|          / \
|         o   o A
|         |   |
|       B o   o
|          \ /
|           o
|          / \
|         o   o C
|         |   |
|       D o   o
|          \ /
|           o
|           |
|
| then conceptually the path CA might be one branch and DB the
| other branch (or alternatively, CB and DA). But this is not
| something that is represented in Git's model.
| mountainboy wrote:
| Interesting, so then which path(s) does git display when
| running git-log on this?
| seba_dos1 wrote:
| Define "this". If you git-log from the commit on the top of
| that ASCII graph, you get all the drawn commits listed
| (unless adjusted with arguments such as `--no-merges` or
| `--first-parent`).
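|
| To make the difference concrete, here is a Python sketch on a
| graph with two merges, similar in shape to the one discussed
| above. The commit ids are invented, and the traversal is a plain
| depth-first walk rather than Git's own ordering; `first_parent`
| only imitates the idea behind `--first-parent`.

```python
def log(parents, tip, first_parent=False):
    """List commits reachable from `tip`; optionally follow only first parents."""
    seen, order, stack = set(), [], [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        order.append(commit)
        links = parents[commit][:1] if first_parent else parents[commit]
        stack.extend(reversed(links))
    return order


# Commit id -> list of parent ids; "top" and "mid" are merge commits.
parents = {
    "top": ["a1", "b1"],
    "a1": ["a2"], "a2": ["mid"],
    "b1": ["b2"], "b2": ["mid"],
    "mid": ["c1", "d1"],
    "c1": ["c2"], "c2": ["bottom"],
    "d1": ["d2"], "d2": ["bottom"],
    "bottom": [],
}

print(len(log(parents, "top")))
print(log(parents, "top", first_parent=True))
```

| The full walk reaches all eleven commits. The first-parent walk
| picks one path through the graph, but only because of the order
| in which each merge happened to list its parents.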
| Izkata wrote:
| You can get ASCII art of that structure with:
| git log --graph --oneline
|
| On older versions you'll also want --decorate to show branches
| and tags, but I think that's on by default now.
| vifon wrote:
| This missing piece of information would be essentially `git
| reflog`, except it's not something Git sends between the
| clones.
| ajross wrote:
| > a path that in general can't be reconstructed from the
| information Git keeps.
|
| Uh... yes it can. Commits have a list of 0 or more parents.
| That creates a DAG. There are literal hordes of tools out
| there that reliably interpret this, from visualizer tools to
| practical mutators like git bisect.
|
| Maybe you're trying to say that no single commit order exists
| that traverses the whole tree. That's true, because branches
| can merge together. But it remains a completely interpretable
| graph nonetheless.
| layer8 wrote:
| That's not what I was saying. I was referring to the
| history of branch tips.
| ajross wrote:
| But that's not related to the DAG at all. The branch can
| be changed at any moment for any reason to point to any
| commit with any content.
|
| But it's true that conventionally, a new branch tip
| should always have the previous branch tip as an
| ancestor. But not always as a direct parent, and even if
| so it might be a merge commit that joins two different
| branches. There is indeed no single spanning path through
| a DAG.
|
| But trying to explain it as "git doesn't store enough
| information" to construct that spanning path seems
| confused to me. It's not about what git stores, it's just
| math: there is no such path in the general case, period.
| layer8 wrote:
| The fact that the branch tip can be moved to unrelated
| commits is another issue with Git's model, and a mismatch
| to the intuitive "a named lineage in the DAG" conception
| of branches. In other VCSs, that would be a new/different
| branch, and you could still rename branches so that the
| same name will later refer to a different branch, but the
| branch history as such (including renames) would be
| preserved.
| ajross wrote:
| > mismatch to the intuitive "a named lineage in the DAG"
| conception of branches
|
| Once more, that conception may be intuitive _but it is
| wrong_. A branch is emphatically _NOT_ a line through the
| DAG, it's the whole DAG. There simply is no single list
| of patches to apply to get from one commit to another,
| even if both were at some point heads of the same branch,
| and even if one is an ancestor of the other.
|
| And the reason it's wrong is that branches can merge
| together. You can have commit A descended from both the
| "main" branch and the "topic_a" branch, despite the fact
| that those two had diverged. This isn't a bug, it's a
| feature. You don't have to use it if you don't want to
| (lots of projects require linear commit histories in
| their main branch), but it's part of the tool nonetheless
| because some projects (Linux especially) use it heavily
| and to great effect.
| lifeisstillgood wrote:
| Just to go off on a tangent - that's a pretty neat diagram
| for a throwaway comment. Was that just careful spacing in
| the HN textbox or did you use a tool - which one ? :-)
| grodriguez100 wrote:
| "Text after a blank line that is indented by two or more
| spaces is reproduced verbatim. (This is intended for
| code.)"
|
| Looks like this also switches to a monospaced font, which
| makes it easier to draw ASCII art.
|
|   This should be rendered using a monospaced font.
|   _____
|   \   /
|    \ /
|     O
| Izkata wrote:
| You can reconstruct it manually with a combination of the
| parent commit order and the automatic merge commit message,
| if you didn't change the commit message. But yeah, that
| second part isn't recorded in the structure itself.
| trealira wrote:
| That's how I learned it, not having known anything about git or
| version control beforehand. I used this site:
|
| learngitbranching.js.org/
|
| Which represents commits as circles with arrows pointing to
| their parents.
| keybored wrote:
| Lately I've wanted branches (heads) to have a corresponding tail
| which points to the base commit that the branch sits on top of
| (like the commit on `main` when you created the branch).[1]
| Because branches get rebased all the time and eventually you have
| six commits out in the AEther somewhere and you have to think
| twice about where it even starts. And yeah you can probably think
| for a few seconds and recall that you have worked with John and
| not Jimmy on this branch so the seventh commit backwards that
| belongs to Jimmy must be the commit base. Or Git can tell you
| that the seventh commit belongs to `main` already. But why should
| you have to expend any effort?
|
| You can optionally include the base commit when you send out
| "patches" to a mailing list.[2] Because it might not have been
| obvious that you based your changes on:
|
| - The latest release
|
| - The main development branch
|
| - Some integration branch (probably an error)
|
| You also need to keep the "base" in mind when you use
| `git range-diff` because that tool takes two ranges like
| `main..previous` and
| `main..current`. And sometimes you can rely on just using
| `main..` and letting Git figure it out but in my experience
| passing an explicit value sometimes works better.
|
| `git range-diff` is a super-cool but perhaps niche tool. But you
| basically have to use it on review round number 2 and higher when
| you are sending changes to the Git project.
|
| [1] This has been discussed before and there was a patch series
| that implemented it. But that was basically a POC and done in the
| spirit of "this is useless IMO but here's how you could do it"...
| and the implementation didn't factor in all the shenanigans that
| you can do with `reset` and `rebase` so it couldn't have been
| merged as-is. (Although to be fair: the bar was not set to work
| perfectly with any kind of branch reset etc., which I suspect is
| impossible in any case.)
|
| [2] Patches after all are just commit messages plus the patches
| themselves and don't tell you what they are based on.
| Snarwin wrote:
| It looks like this is what git merge-base --fork-point is
| supposed to do, although according to the docs it is not 100%
| reliable.
| chatmasta wrote:
| We're teaching Git wrong. Most of the common confusion is due to
| people learning from the porcelain down to the plumbing, when it
| should be the other way around. If you limit your mental model to
| the plumbing, there's generally only one outcome that you want,
| but there are a dozen ways to get there from the porcelain. You
| can choose whichever one you prefer. But if you start from one of
| those dozen ways, they could each lead to a different outcome
| than you expected.
|
| I'm forever grateful for one of my early internships, where a guy
| from GitHub visited the office and gave us a one day workshop on
| Git. He started from the internals and explained how Git models
| your codebase. (He's also the one who introduced me to the idea
| of plumbing vs. porcelain.) Then once we had a common language,
| teaching the porcelain was a matter of starting from the plumbing
| and working upwards, rather than the other way around.
|
| Another invaluable resource in learning Git is this interactive
| tutorial [0], which renders a tree diagram of start state and
| desired end state and makes you write the commands (for which
| there are often many options!) to get to that end state. This
| reinforces the idea that the best way of planning Git commands is
| to first visualize the end state you want, and then reason about
| how to get there.
|
| Also: RTFM! Not just once. Go back to it. You'll learn something
| new every time. The docs [1] are really good.
|
| [0] https://learngitbranching.js.org/
|
| [1] https://git-scm.com/docs
| spenrose wrote:
| If you learned from this (excellent) piece, I recommend that you
| buy and work through https://leanpub.com/learngitthehardway . It
| will take less than a day, and you'll have a much stronger
| foundation for a core tool.
| riperoni wrote:
| While the explanation is right in some sense, it misses a few
| points.
|
| Branches are pointers to a commit and that pointer is refreshed
| when a new commit is created. One could say they are a wandering
| tag (without explaining a tag for now).
|
| The actual chain of commits that represent what we see as branch
| comes from the commits themselves. Those commits point back to
| their parent commit.
|
| And then one can see why no branch has any special meaning: It is
| a chain of related commits with a named entrypoint. Once you
| delete a branch (i.e. the named wandering pointer to a commit),
| you cannot identify a branch as such anymore. It is just a chain
| of related commits without a named label now. And nothing besides
| the name distinguished the branch from other commit chains
| before.
|
| The master/dev/release branches are then a convention to keep an
| updated commit pointer on the chain of commits containing changes
| of interest.
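|
| A toy Python model of that description, with invented names
| (`Repo`, `commit`, `chain`): commits only know their parent, and
| a branch is an entry in a name -> commit table that moves when you
| commit. It is a sketch of the idea, not of Git's storage.

```python
import itertools


class Repo:
    """Toy repository: commits know their parent; branches are just names."""

    def __init__(self):
        self.parents = {}                 # commit id -> parent id (or None)
        self.branches = {}                # branch name -> commit id
        self._ids = itertools.count(1)

    def commit(self, branch):
        """Create a commit on top of `branch` and move the branch pointer."""
        new_id = f"c{next(self._ids)}"
        self.parents[new_id] = self.branches.get(branch)
        self.branches[branch] = new_id
        return new_id

    def chain(self, commit):
        """Follow parent pointers back to the root."""
        out = []
        while commit is not None:
            out.append(commit)
            commit = self.parents[commit]
        return out


repo = Repo()
repo.commit("main")
repo.commit("main")
repo.branches["topic"] = repo.branches["main"]   # a new branch copies a pointer
repo.commit("topic")

tip = repo.branches["topic"]
del repo.branches["topic"]                       # deleting removes only the name
print(repo.chain(tip))
print(repo.branches)
```

| After the delete, the chain of commits is still there if you know
| the id of its tip; what is gone is the label.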
| jansan wrote:
| This was the most useful piece of information that I have ever
| read about Git.
|
| But what happens if you merge branch A into branch B? A and B
| will both contain the commits of A, but in B there may be
| commits of B between the commits that were merged. Do the same
| commits of A then have different parents depending on which
| branch they are on?
| paulddraper wrote:
| Merging branch A into branch B does two things:
|
| 1. Create a new merge commit with _two_ parents: the commit
| pointed to by A and the commit pointed to by B.
|
| 2. Set branch B to point at the new merge commit.
|
| This is a non-linear history; when comparing some commits
| there isn't a "before" or "after."
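|
| The two steps can be sketched in Python as follows. The `merge`
| helper and the commit ids are hypothetical, and content
| reconciliation is left out entirely; the point is only what
| happens to the graph and to the branch pointers.

```python
def merge(parents, branches, source, destination):
    """Merge `source` into `destination` (no fast-forward, no content handling)."""
    merge_id = f"merge-{source}-into-{destination}"
    # Step 1: a new commit whose parents are the two branch tips.
    # The destination tip comes first, as it is the branch being merged into.
    parents[merge_id] = [branches[destination], branches[source]]
    # Step 2: only the destination branch moves.
    branches[destination] = merge_id
    return merge_id


parents = {"base": [], "a1": ["base"], "b1": ["base"]}
branches = {"A": "a1", "B": "b1"}

merge_id = merge(parents, branches, "A", "B")
print(parents[merge_id])
print(branches)
print(parents["a1"])   # existing commits are untouched
```

| The commits of A keep exactly the parents they had before; the
| merge only adds one new commit and moves B.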
| seba_dos1 wrote:
| > Do the same commits of A then have different parents
| depending on which branch they are on?
|
| Absolutely not. Commits are immutable (representing whole
| repo state, _not_ a diff), and branches are just (mutable)
| pointers to them.
|
| As the sibling already noted, a merge commit is just a
| regular commit. It simply points to multiple parents,
| "merging" them. Aside from the whole machinery to resolve
| conflicts etc. that's pretty much all there is to it.
|
| When your graph topology allows it, you can also merge
| branches without generating a new commit (so-called
| "fast-forward" merges) - such a merge does nothing but rewrite the
| branch pointer. You can also create merge commits that point
| to more parents than two ("octopus" merges). Reconciling the
| commits' content can get quite complicated in such cases, but
| from the repo graph perspective it's nothing special.
| xorcist wrote:
| > Commits are immutable (representing whole repo state, not
| a diff)
|
| To make things more clear: Repo state here is the contents
| of all files, and some metadata including a pointer to the
| previous commit.
|
| So a commit hash uniquely identifies not only a set of
| files but the unique history leading up to it! That's why
| some people like to call git the original block chain
| (there's no proof of work involved of course so it can
| never be used for payments or anything like that, but the
| merkle tree bit is similar enough).
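|
| A short Python illustration of why the hash pins down the
| history. `blob_id` follows Git's blob hashing for SHA-1
| repositories (a "blob <size>" header, a NUL byte, then the
| content). `toy_commit_id` is a deliberately simplified stand-in:
| real commit objects also record a tree id, author, committer,
| dates and the message.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Object id of a blob in a SHA-1 repository: header + NUL + content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def toy_commit_id(files: dict, parent_id: str) -> str:
    """Simplified commit id: a hash over the file ids and the parent's id."""
    payload = repr(sorted(files.items())) + parent_id
    return hashlib.sha1(payload.encode()).hexdigest()


print(blob_id(b""))  # the well-known id of the empty blob

first = toy_commit_id({"a.txt": blob_id(b"one\n")}, parent_id="")
second = toy_commit_id({"a.txt": blob_id(b"two\n")}, parent_id=first)

# Same final files, different history: the parent id differs, so the
# child's id differs too.
rewritten_first = toy_commit_id({"a.txt": blob_id(b"ONE\n")}, parent_id="")
rewritten_second = toy_commit_id({"a.txt": blob_id(b"two\n")},
                                 parent_id=rewritten_first)
print(second != rewritten_second)
```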
| tharkun__ wrote:
| I keep repeating this every time someone talks about git and
| finds something weird or doesn't get branches, so I'm really
| glad your parent mentioned it as well and I know there's
| someone else out there that "gets" that:
|
|   In git it's all just labels/pointers
|
| It's not useful at all to think about branches as the user
| sees them as "things" of their own. Branches don't "have"
| anything. Branches in that sense are just convenient labels.
|
| Of course actual "branches" in the commit tree exist whether
| you label them or not. Until `git` does a garbage collection
| and gets rid of anything that doesn't have a pointer
| ultimately leading to it - something that a human would
| understand aka branch/tag. And that's why we call these
| labels "branches" as well but it's actually one word for two
| things here. The actual tree branch and the label that's
| called branch.
|
| And a branch and a tag are basically the same exact thing
| underneath, just a file in the `.git` directory somewhere
| that contains a commit hash. All the meaning and
| differentiation of branch or tag is just in the human brain
| and how we and our tools treat them. Such as if you look at a
| particular commit in your tool of choice, it will tell you
| which branch it's part of. To create a branch you can
| literally just create a thousand randomly named files in the
| right part of the `.git` directory containing the same commit
| hash and suddenly this commit "is on all those branches".
| That's what git does and why creating a branch in git is so
| super fast.
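|
| A Python sketch of that experiment, run against a throwaway
| directory that only imitates the `.git/refs/heads` layout (it is
| not a real repository, and the commit id is a placeholder). Real
| repositories may also keep refs in a `packed-refs` file, and it is
| safer to let Git's own commands update refs.

```python
import tempfile
from pathlib import Path


def create_branch(git_dir: Path, name: str, commit_id: str) -> None:
    """Write a loose ref: one small file whose content is a commit id."""
    ref = git_dir / "refs" / "heads" / name
    ref.parent.mkdir(parents=True, exist_ok=True)
    ref.write_text(commit_id + "\n")


def branches_at(git_dir: Path, commit_id: str) -> list:
    """Names of all loose branch refs that point at `commit_id`."""
    heads = git_dir / "refs" / "heads"
    return sorted(p.name for p in heads.iterdir()
                  if p.read_text().strip() == commit_id)


with tempfile.TemporaryDirectory() as tmp:
    fake_git_dir = Path(tmp) / ".git"     # stand-in layout for illustration
    commit_id = "1" * 40                  # placeholder, not a real commit
    for i in range(1000):
        create_branch(fake_git_dir, f"branch-{i:04d}", commit_id)
    names = branches_at(fake_git_dir, commit_id)
    print(len(names), names[:2])
```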
| seba_dos1 wrote:
| To make things more complicated, the word "tag" is also
| overloaded. It can either be just a reference (in git
| lingo, a "ref") to a commit - just like a branch, only
| differing from it in how the tools treat it; or it can be
| an "annotated tag", which points to a special tag object
| that contains some metadata and only then points to a
| specific commit (or other kind of object...) :)
| evntdrvn wrote:
| You'll also see the first type referred to as
| "lightweight tags", if that helps anyone :)
| loeg wrote:
| In short: merge commits have multiple parent commits. So your
| tree tracing logic bifurcates at that point. The commits in
| the merged history are not altered by the merge commit; they
| each have a single parent commit (unless they are also merge
| commits).
| skrebbel wrote:
| For years I was deeply annoyed by the terrible name "branch"
| for something that acts more like a bookmark (or "wandering
| tag" indeed!).
|
| And then I learned that git branches are branches in exactly
| the same way that the first element of a linked list in C "is"
| the linked list. Git was made by C people and they're used to
| referring to entire data structures by way of some root
| element.
|
| I mean that doesn't make me dislike the name any less but at
| least now I see where they were coming from.
| neuromanser wrote:
| That's (most probably) where the "head" terminology comes
| from, too.
| alfredpawney wrote:
| Yes you are correct. It traces back to Allen Newell
| mr_mitm wrote:
| When the entire structure of commits is called a tree, I find
| the name "branch" fitting. The branch is identified by its
| head commit, so the path from head to root is uniquely
| defined and that's the branch. (Disregarding merges for now.)
| TeMPOraL wrote:
| > _Disregarding merges for now._
|
| Without disregarding them, it's not a tree, but a DAG.
| dayjaby wrote:
| It is a tree. What makes you think it's just a DAG? Are
| there commits with multiple parent commits or what?
| imron wrote:
| Yes. Merge commits have two parents.
| Izkata wrote:
| Two or more. I'm not sure there's a limit.
|
| Try not to do this (imagine 5-way merge conflict).
| zaphar wrote:
| There absolutely can be. Merge commits have multiple
| parent commits for example. It's definitely a graph not
| just a tree.
| dayjaby wrote:
| Parent comment was about disregarding merge commits.
| cpeterso wrote:
| Leaning into the tree metaphor (and following the precedent
| of other version control systems), git should have used the
| term _trunk_ instead of _master_ or _main_.
| hnarn wrote:
| Why? That would heavily imply that master/main is somehow
| technically different from all other branches (since a
| trunk is certainly not a branch), which to my knowledge
| is not true.
| skrebbel wrote:
| FWIW "tree" has a specific, different meaning in Git. It's
| an object recording the contents of a directory.
| ajross wrote:
| > Git was made by C people and they're used to referring to
| entire data structures by way of some root element.
|
| FWIW this is actually backwards. The word "branch" was
| already in common use (to refer to the same basic idea) in
| SCM systems going back decades, and in almost all of those a
| "branch" was indeed a first class object with its own data
| that acted as a "container" for commits, both semantically
| and physically.
|
| The fact that a "branch" is just a pointer is in fact a git
| innovation on top of the former idea.
| cpeterso wrote:
| > acts more like a bookmark
|
| In fact, Mercurial uses the term "bookmark" for its
| lightweight, git-like branching. Mercurial's branches have
| slightly different semantics and can't be deleted like
| bookmarks or git branches.
| imron wrote:
| > Git was made by C people
|
| This is why I think of branches as pointers. The file
| contents are literally just a pointer to a commit on the DAG.
| loeg wrote:
| I think this is covered adequately (if less completely) in the
| "technically correct" definition section.
| jillesvangurp wrote:
| A key point with git is that every clone is effectively its own
| set of branches; even if they have the same name. The
| mechanisms you use for synchronizing your local branches with
| some remote branches are exactly the same as the mechanisms you
| use between your local named branches.
|
| Git was actually designed initially for email based workflows
| where there was no central remote at all. Basically, that works
| by exporting patches and then applying them to your local
| branch. The branch name isn't even part of the patch.
|
| A git patch is just a textualized form of the list of commits
| you created locally. You can apply them to any branch you like.
| As long as you and whoever applies the patch have a common
| ancestor commit, the patch may merge cleanly. It's
| good hygiene to ensure it does by for example
| rebasing/merging/squashing before you email somebody your
| patches. If that somebody is called Linus Torvalds, he's going
| to be pretty strict about things like commit messages and
| things not being a spaghetti ball of merges, reverts, forks, etc.
| Your mess, your problem. Linux development still works via
| mailing list. And forget about emailing him directly with a
| patch; you need to use the mailing lists like everybody else.
| And he works with a network of senior contributors that screen
| everything that comes in and that aggregate all the patches
| coming from upstream. So, he only gets involved at the end of
| the process.
|
| Of course the rest of us use network protocols to sync our
| repositories. But the important distinction here is that this
| is a two step process. First you fetch content from remote.
| This is simply ensuring you have all the commit objects you
| need in your local git database. Any branches you have are
| simply text files with the commit content hash they point to as
| the content in .git/refs/heads. Remote branches are the same
| but live in your local .git/refs/remotes/<remotename>. Those
| branches might be named something like origin/main to make it
| clear that that is a local branch from the origin remote. And
| then you rebase/merge between your local and "remote" (i.e.
| also local) branch as needed. Pull is just shorthand for doing
| both steps in one go. All merges are local. Same with rebases.
|
| Most of the conventions people project on git are kind of
| cultural and vary between people and companies. It's helpful to
| read up on the git internals in the Git book. Github is sort of
| an opinionated take on this that back in the day made people
| coming from centralized version systems like subversion feel at
| home by providing a central repository and allowing them to
| push their changes there or "share" branches there. Not
| necessarily a great idea for bigger projects and limiting write
| access is common on Github.
| gtirloni wrote:
| Some of Julia's tweets started to get suffixed with "I don't want
| advice about this". It must have reached unacceptable levels.
| cube2222 wrote:
| There is a very good article by GitHub:
| https://github.blog/2020-12-17-commits-are-snapshots-not-dif...
|
| TLDR: Think of commits as snapshots, not diffs, and you'll be
| fine.
| MauranKilom wrote:
| I don't use git at work, but in my private hobby projects my
| friends usually get mad when they watch me juggle changes and
| branch pointers with git reset --hard and git stash...
|
| How do you undo a merge that you didn't mean to do/did wrongly?
|
|   git reset --hard <last commit before merge>
|
| Have some cosmetic fixups on your local branch that really should
| go into main (or a separate branch) first before merging a bigger
| feature?
|
|   git stash
|   git checkout main
|   git stash apply
|
| By thinking about branches as pointers, the commit graph existing
| independently, and stashes just being temporary commits, I feel
| I'm working much more directly with the underlying abstraction.
| Yes, git has commands for specific combinations of actions, but
| for an occasional user it's harder to remember every such command
| and which arguments and flags to pass in which order. It's either
| "look through documentation until you find graph diagrams
| illustrating what will happen for this order of arguments and
| flags" or "use the primitives 'move branch pointer', 'commit to
| branch', 'hold these changes for a second' for obtaining the
| commit tree you actually want". Knowing that the reflog exists
| also makes this insane-sounding working mode pretty non-scary.
| And yes, some operations (e.g. cherry-pick) you just need to do
| the "real" way.
|
| (My git stash obsession is most likely just damage from years of
| using Perforce, which doesn't have a modified/staged distinction.
| The only way to commit only part of a changed file is via the
| equivalent of stash -> [restore half the file] -> commit -> stash
| pop.)
|
| _Prepares to be crucified..._
| globular-toast wrote:
| "Undo" is usually more like `git reset --hard HEAD@{1}`, ie.
| using the reflog.
|
| Nothing wrong with this at all. Only people who don't
| understand and/or are scared of git don't like it.
|
| You could also use cherry-pick to "donate" commits to other
| branches, instead of stash, of course. Magit has some great
| extra abstractions for this.
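|
| The reflog-based undo can be pictured with a toy Python class: a
| pointer that remembers its earlier values, newest first. `Ref` and
| its methods are invented for this sketch and only imitate the
| `@{n}` notation.

```python
class Ref:
    """Toy branch pointer that remembers its previous values, like a reflog."""

    def __init__(self, commit):
        self.log = [commit]          # newest entry first, so log[n] is ref@{n}

    @property
    def value(self):
        return self.log[0]

    def move(self, commit):
        """Point the ref somewhere else and record the move."""
        self.log.insert(0, commit)

    def at(self, n):
        """Value the ref had n moves ago."""
        return self.log[n]


branch = Ref("c1")
branch.move("c2")              # a normal commit
branch.move("bad-merge")       # the merge we regret
branch.move(branch.at(1))      # in the spirit of `git reset --hard HEAD@{1}`
print(branch.value)
print(branch.log)
```

| Note that the undo is itself recorded as a move, so the regretted
| merge stays reachable through the log for a while.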
| hotnfresh wrote:
| You can check out multiple branches in different directories
| from a single git repo. This saves me a lot of what used to be
| stashing.
| zaptheimpaler wrote:
| Both of those sound totally reasonable to me! I don't know of
| any better ways to do that stuff and there's nothing risky
| about it.
| int0x80 wrote:
| One thing that is risky about git reset --hard is that any
| non-committed changes are lost. That has bitten me a few
| times.
| afiori wrote:
| My controversial opinion is that git needs some kind of GUI
| that helps you keep track of the state of the repo.
| specialist wrote:
| > _I'm working much more directly with the underlying
| abstraction_
|
| Your strategy of seeing things as they are is a useful general
| purpose life skill.
| codesnik wrote:
| why crucified? you're doing exactly what I do. All the people
| who have any trouble with git whatsoever try to use it as a
| black box for some high-level whatever ideas of what their
| workflow should be. And git is not that, git is a thin wrapper
| around a simple and elegant data structure. If you understand it,
| then everything clicks and git doesn't EVER give any trouble.
|
| Your friends are unreasonable, unless you collaborate with them
| on the same branches and rewrite them after you shared them.
| codesnik wrote:
| also, using stash is only a first level. git cherry-pick, git
| rebase --interactive, git reset --hard HEAD^ and friends allow you
| to do such moves and cosmetic extractions after the commit itself.
| I also prefer to split cosmetic changes and feature changes, so
| I extract cosmetic stuff to the main all the time.
| tremon wrote:
| git reset --hard is actually dangerous, because it throws away
| local modifications that were not yet committed. To undo just
| the commit and not the work, you should use git reset --soft
| (to undo just the git commit) or git reset --mixed (to undo
| both the git commit and the "git add"s leading up to the
| commit).
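|
| The three modes can be summarised with a toy Python model in
| which HEAD, the index and the working tree are each a plain dict
| of path -> content. The `reset` function is a made-up helper, and
| it ignores details such as untracked files.

```python
def reset(state, target, mode="mixed"):
    """Return the new (head, index, worktree) after a toy `git reset`."""
    head, index, worktree = state
    if mode == "soft":
        return target, index, worktree               # move the branch only
    if mode == "mixed":
        return target, dict(target), worktree        # also reset the index
    if mode == "hard":
        return target, dict(target), dict(target)    # also overwrite the files
    raise ValueError(f"unknown mode: {mode}")


previous = {"app.py": "v1"}
committed = {"app.py": "v2"}
# The working tree carries an extra edit that was never committed.
state = (committed, dict(committed), {"app.py": "v2 + local edit"})

for mode in ("soft", "mixed", "hard"):
    head, index, worktree = reset(state, previous, mode)
    print(mode, "| index:", index["app.py"], "| worktree:", worktree["app.py"])
```

| Only the hard mode touches the working tree, which is where the
| uncommitted edit is lost.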
| alex_smart wrote:
| git checkout will also happily throw away local unstaged
| modifications, and I would argue that it is even more
| dangerous because I did not have to type "--hard" to shoot
| myself in the foot.
| hnarn wrote:
| That does not sound right at all. I'm pretty sure there's a
| warning when you try to checkout a branch that would
| overwrite local unstaged changes. I might be wrong but I'd
| like some proof.
| Izkata wrote:
| Checkout is also used for reverting changes to unstaged
| files.
| astrobe_ wrote:
| There must be something you do terribly wrong if you
| believe it happened to you in normal use.
| mb7733 wrote:
| The poster is talking about when you do something like
| this: `git checkout -- .` That wipes out all unstaged
| changes to tracked files in the current directory.
| afiori wrote:
| I agree that git is almost asking you to juggle commits.
|
| My preference is to use temporary branches and cherry-picking
| instead of stashing; I mostly use a gui* to work with git so it
| is easy to select the two or three commits to cherry-pick or to
| see visually if an interactive rebase would work.
|
| * https://gitextensions.github.io/
| pitaj wrote:
| > How do you undo a merge that you didn't mean to do/did
| wrongly?
|
| I usually used `switch` for this:
|
|     # Check out the previous commit
|     git switch -d HEAD~
|     # Overwrite the branch
|     git switch -C <current branch>
| erik_seaberg wrote:
| Not a fan of the staging area, because it hasn't been tested
| before commit. I would rather stash some changes to postpone
| them, then test and commit the workspace.
| bloopernova wrote:
| tl;dr Please ignore, just me working through a Python+pygit2
| problem. I solved it in a grandchild comment.
|
| I had so much trouble trying to map my intuited/mental model of
| git onto pygit2 that I gave up and just used the git module.
|
| I wanted to automate a fairly simple thing in Python as opposed
| to bash+commands. My reasoning being that I wanted to do it
| "right" and be a Big Programmer Real Boy(tm). I just wanted to
| create a branch remotely in Github, pull the repo, and checkout
| the new branch. I got stuck going in circles trying to figure out
| why I was always left in detached HEAD state because I didn't
| understand _exactly_ what git was doing during a checkout.
|
|     # repo has already been pulled
|     if os.path.exists(repo_path):
|         local_repo = git.Repo(path=repo_path)
|         self.log.debug(f"current branch: {local_repo.active_branch.name}")
|         local_repo.git.checkout(branch_name)
|
| That's super easy and is much the same as running the commands in
| the shell or in a bash script.
|
| Of course, I've lost my poor implementation using pygit2, so I'll
| add that later if I find it. Thankfully there's a good discussion
| surrounding the issue I encountered in this excellent "roll your
| own git in Python", which doesn't use pygit2, but the concepts
| are the same:
| https://www.leshenko.net/p/ugit/#checkout-switch-branches
| bloopernova wrote:
| This isn't asking someone else to make this work, it's more of
| a caution to convince folks like me to just use "import git"
| rather than pygit2:
|
| So something like this was what I expected to work, but leaves
| the repo in detached head state:
|
|     import pygit2
|
|     def checkout_branch(path, branch_name):
|         repo = pygit2.Repository(path)
|         branch_ref = repo.lookup_reference(f"refs/remotes/origin/{branch_name}")
|         print(f"{branch_ref.name}")
|         repo.checkout(branch_ref)
|
| The branch_ref.name prints "refs/remotes/origin/test" but git
| status says "HEAD detached at origin/test"
|
| So I'm probably feeding the wrong thing into repo.checkout, but
| I'm honestly not sure what else it should be.
|
| Funnily enough, git itself tries to do the right thing if
| pulled in a detached head state:
|
|     From https://github.com/testorg/example
|      * [new branch]      test       -> origin/test
|     You are not currently on a branch.
|     Please specify which branch you want to merge with.
|     See git-pull(1) for details.
|
|         git pull <remote> <branch>
| bloopernova wrote:
| Ha, and of course just messing around gets me something that
| actually works.
|
| There always seems to be just one more stackoverflow thread
| to read that has the real answer:
| https://stackoverflow.com/questions/68435607/how-to-clone-ma...
| (found via Kagi which I wasn't using before, and the search
| "pygit2 detached head")
|
|     def checkout_branch(path, branch_name):
|         repo = pygit2.Repository(path)
|         main_branch = repo.lookup_branch("main")
|         print(f"Main branch upstream: {main_branch.upstream_name}")
|         if branch_name not in repo.branches.local:
|             print(f"Branch {branch_name} not found in local branches")
|             remote_branch = "origin/" + branch_name
|             if remote_branch not in repo.branches.remote:
|                 raise SystemExit(f"Branch {remote_branch} not found in remote branches")
|             (commit, remote_ref) = repo.resolve_refish(remote_branch)
|             repo.create_reference("refs/heads/" + branch_name, commit.hex)
|         branch = repo.lookup_branch(branch_name)
|         print(f"Branch name: {branch.name}")
|         repo.checkout(branch)
|         print(f"Is branch head? {branch.is_head()}")
|         (commit, branch_remote) = repo.resolve_refish("origin/" + branch_name)
|         print(f"Remote branch: {branch_remote.name}")
|         branch.upstream = branch_remote
|
| With git reflog telling me the right thing:
|
|     d44aedc (HEAD -> test, origin/test) HEAD@{0}: checkout: moving from main to test
|
| And git push has the remote branch already set.
|
| I wish there was a pair programmer AI that you had to explain
| stuff to. That would enable the "by explaining it, I solved
| it" phenomenon.
| parasti wrote:
| The article goes in the right direction, but from a weird
| starting point. Saying things like "a branch contains the entire
| history" just adds to the general confusion about Git. Git does
| not have branches. Sure, Git emulates branches to appear familiar
| and intuitive, but it is actually counterproductive to use that
| as a starting point to explain how Git works. Git manages a graph
| of commits and some of those commits need human readable labels.
| That's it. The only thing that contains the entire history is the
| commit graph itself.
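|
| As a rough illustration of that view, here is a toy Python model,
| not how git stores anything: the commit ids A to D and the helper
| names history and only_on are made up for the example. Commits
| only know their parents, and a branch is just a name for one
| commit.

```python
# Each commit id maps to the ids of its parent commits (hypothetical ids).
commits = {
    "A": [],     # root commit
    "B": ["A"],
    "C": ["B"],  # tip of main
    "D": ["B"],  # tip of feature, which diverged from main at B
}

# A "branch" is only a human-readable label pointing at one commit.
branches = {"main": "C", "feature": "D"}


def history(tip):
    """Every commit reachable from tip, tip first (a very simple `git log`)."""
    seen, order, stack = set(), [], [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        order.append(commit)
        stack.extend(commits[commit])  # keep walking towards the root
    return order


def only_on(branch, base):
    """Commits reachable from branch but not from base (like `git log base..branch`)."""
    base_commits = set(history(branches[base]))
    return [c for c in history(branches[branch]) if c not in base_commits]
```

| history(branches["feature"]) walks all the way back to the root
| commit A, while only_on("feature", "main") gives the commits most
| people picture as "the branch".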
| BlueTemplar wrote:
| Another way to think about merging and patches:
| https://jneem.github.io/merging/
| intrepidsoldier wrote:
| Anything about Git reminds me of this:
|
| https://youtu.be/EReooAZoMO0?si=sHqcYsf8v6LyWLAx
| chihuahua wrote:
| Given how many smart people are confused by Git, and how many
| times Git's behavior needs to be explained in a way that often
| raises as many questions as it answers, it seems to indicate
| that Git's model is not at all intuitive and doesn't map well
| to how people generally use it to get work done.
|
| These are all people who have no problem understanding all
| kinds of other technologies and building complex systems from
| them.
|
| It's not quite in "a monad is just a monoid in the category of
| endofunctors" territory, but when this many smart people have
| difficulty understanding something, I think Git is to blame,
| not the people.
| jacoblambda wrote:
| I have to seriously ask. Of the people who have issues with
| using or understanding git, how many of them have actually
| read the docs?
|
| Git is by no means perfect but the development community is
| great and there is a massive focus on improving the project
| and making things more approachable and intuitive.
|
| And because of that, git actually has really solid, coherent
| documentation with easily digestible tutorials and guides for
| all the things you need to do.
|
| So it always hurts me when I see people ranting and raving
| about how awful git is, how it can't do x, or how it doesn't
| make sense how it works but then you send them the guide or
| tutorial hosted on the git-scm website and suddenly it makes
| sense.
|
| Not to be beating the RTFM horse but RTFM guys.
| chihuahua wrote:
| Whenever I read the Git docs, after a while I start
| thinking "this is all very well and good, but it doesn't
| seem to be related to what I'm trying to do to get my work
| done (usually fairly basic things)"
|
| Or I have read a bunch of pages on the git-scm site, and
| I'm thinking "oh yes it all makes sense now." Then I'm
| trying to do something in the real world, and I get bizarre
| messages and conflicts that don't make any sense. Or I made
| a mistake and want to undo it, and end up in some crazy
| situation. The documentation doesn't seem to help in
| anything but an ideal textbook scenario with no mistakes
| and complications.
| forrestthewoods wrote:
| Git makes version control roughly 10x more complicated than
| it needs to be.
|
| I can teach an artist or designer who has never heard of
| version control how to use Perforce in roughly 5 minutes.
| They will never blow off their leg and will likely never
| lose work. It will probably be a few months before they hit
| some edge case where they need help.
|
| Git requires building a non-trivial mental model. Then it
| requires memorizing a whole bunch of unintuitive commands
| with unintuitive flags.
|
| > Not to be beating the RTFM horse but RTFM guys.
|
| Good tools are intuitive and can be incrementally learned
| without resorting to dense documentation.
|
| RTFM is definitely a solution. But when a very large number
| of users have consistently similar issues at some point you
| have to stop blaming the users and admit the tool isn't
| easy to learn.
| Feathercrown wrote:
| I've read the docs. Too much explanation of command line
| flags, not enough practical examples. It is thorough
| though.
| gitanovic wrote:
| I think that one way to "easily" understand the syntax of git is
| to remember that when you perform a command you "always" modify
| the current branch
|
| for example: git merge my-branch will merge my-branch into the
| current one
|
| while git rebase my-branch will rebase the current one on top of
| my-branch
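|
| A toy Python sketch of that rule, under simplifying assumptions:
| linear history on each side, a shared ancestor, no conflicts, and
| a merge that always creates a merge commit (real git may
| fast-forward instead). The commit ids and helper names are
| hypothetical.

```python
def new_repo():
    """Hypothetical history: main is A-B-C, topic is A-B-D."""
    commits = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}  # id -> parent ids
    branches = {"main": "C", "topic": "D"}                   # label -> tip id
    return commits, branches


def ancestors(commits, tip):
    """All commit ids reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(commits[c])
    return seen


def merge(commits, branches, current, other):
    """Model of `git merge other` run while `current` is checked out."""
    ours, theirs = branches[current], branches[other]
    new = f"M({ours},{theirs})"
    commits[new] = [ours, theirs]  # merge commit with two parents
    branches[current] = new        # only the current label moves


def rebase(commits, branches, current, onto):
    """Model of `git rebase onto` run while `current` is checked out."""
    shared = ancestors(commits, branches[onto])
    to_replay, c = [], branches[current]
    while c not in shared:           # collect current's own commits, newest first
        to_replay.append(c)
        c = commits[c][0]
    parent = branches[onto]
    for old in reversed(to_replay):  # re-create them, oldest first, on top of onto
        copy = old + "'"
        commits[copy] = [parent]
        parent = copy
    branches[current] = parent       # again, only the current label moves
```

| In both cases branches["topic"] is left alone: with main checked
| out, merge gives main a new commit with parents C and D, and
| rebase gives main a copy of C whose parent is D.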
| Vinnl wrote:
| Years ago I wrote this dynamic tutorial that visualises branches
| as you read: https://agripongit.vincenttunru.com
|
| It's aimed at folks who know how to use `git add` and `git
| commit`, and would like to spend 15 minutes to form a mental
| model to help them _understand_ what's going on.
|
| In case it's useful to someone.
| k__ wrote:
| That article would have been a lot better if it showed
| illustrations for the "right" mental model too.
| taberiand wrote:
| The right mental model is to realise the 'main' branch is only
| special by convention - git doesn't actually treat it
| differently from any other branch.
|
| All of the confusion expressed in the article stems from a
| misunderstanding that main should work in some special way.
|
| Of course every branch's history goes all the way back to root
| and not to some arbitrary common commit of another branch like
| 'main'. Of course rebase and merge can work "backwards" from
| main onto some branch: it isn't really "backwards", because main
| is not special. It just isn't done much in practice because
| keeping main straight helps with collaboration.
|
| Furthermore, by realising that main isn't inherently special,
| it becomes obvious that the actions can be done between any two
| branches as needed.
|
| The right mental model is - it's just commits, all the way
| down.
| k__ wrote:
| _"All of the confusion expressed in the article stems from a
| misunderstanding that main should work in some special way."_
|
| I didn't have that impression when reading that article.
|
| To me it seems that the confusion comes from thinking in
| actual branches, and not from thinking anything special about
| main.
| why-el wrote:
| I've learned only one constant with git in my years as a
| programmer: master your own employer's git use cases, and pray to
| god for three things:
|
| 1. you don't change places often and thus git patterns.
|
| 2. you don't accidentally ship and commit a multi-GB file to your
| remote.
|
| 3. you don't change the git process on yourself and your
| colleagues without an extremely solid reason.
|
| Document your chosen git patterns, even in 2023.
| tsbx wrote:
| I always get back to this page when trying to understand/show how
| git works under the hood:
| https://eagain.net/articles/git-for-computer-scientists/
|
| It summarizes fundamentals clearly.
___________________________________________________________________
(page generated 2023-11-23 23:00 UTC)