hngopher.com

       [HN Gopher] Two Years of Squash Merge (2019)
       ___________________________________________________________________
        
       Author : m_a_g
       Score  : 45 points
       Date   : 2021-04-30 18:34 UTC (4 hours ago)
        
 (HTM) web link (blog.dnsimple.com)
 (TXT) w3m dump (blog.dnsimple.com)
        
       | chmaynard wrote:
       | Here's an alternative that appears to accomplish the same thing:
       | 
       |     # checkout feature branch
       |     git switch feature
       | 
       |     # reset HEAD while preserving changes to working tree
       |     # commits on feature branch will become orphans
       |     git reset --soft main
       | 
       |     # commit all changes on feature branch
       |     git add -A; git commit -m "<feature description>"
       | 
       |     # checkout the main branch and merge
       |     git switch main
       |     git merge feature
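       | 
       | A minimal sketch for checking that claim, assuming a throwaway
       | repository in which main has not moved since feature branched
       | off. The file name, the branch names main-a and feature-b, and
       | the commit messages are made up for illustration. It compares
       | the tree produced by `git merge --squash` with the tree
       | produced by the soft-reset recipe:
       | 
```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "base"

# Feature branch with two small work-in-progress commits.
git switch -q -c feature
echo one >> file.txt
git commit -q -am "wip 1"
echo two >> file.txt
git commit -q -am "wip 2"

# Variant A: built-in squash merge, done on a copy of main.
git switch -q -c main-a main
git merge -q --squash feature
git commit -q -m "feature (squash merge)"

# Variant B: the soft-reset recipe, done on a copy of feature.
git switch -q -c feature-b feature
git reset -q --soft main
git commit -q -m "feature (soft reset)"

tree_a=$(git rev-parse 'main-a^{tree}')
tree_b=$(git rev-parse 'feature-b^{tree}')
echo "squash merge tree: $tree_a"
echo "soft reset tree:   $tree_b"
```
       | 
       | If main has gained commits since the branch point, the two can
       | differ: the soft-reset commit records the feature branch's tree
       | as-is, so it would drop whatever main added in the meantime.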
        
       | gouggoug wrote:
       | I will always fight tooth and nail against squash merge.
       | 
       | Squash merge has the major disadvantage of getting rid of
       | valuable meaningful git history.
       | 
       | Squash merge is not the proper solution for keeping your git
       | history clean, it is a hack using the side effect of squash.
       | 
       | Keeping your git history clean is a matter of policy, best-
       | practices and education:
       | 
       | Developers should be required to submit _clean_ PRs, that is,
       | PRs whose git history has been organized and refactored in such
       | a way that it removes "clean up commits", "typo fix", etc.
       | 
       | When you squash merge a feature branch that has thousands of
       | lines of code, and 6 months later you have a bug introduced by
       | this feature branch, it becomes extremely hard to find which line
       | introduced the bug.
       | 
       | On the other hand, if you kept the history, and if this history
       | was clean from the get go, it becomes easy to read the commits
       | one-by-one and understand the issue.
       | 
       | Don't use squash+merge.
       | 
       | edit: I see numerous comments saying, in essence, "squash+merge"
       | is what gets rid of the dirty history. No, developers must learn
       | the existence of `git rebase --interactive`, which is the command
       | to use to clean your git history[0]. "squash" is one possible
       | action, among others, that helps clean the history.
       | 
       | [0]: https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History
        
         | hinkley wrote:
         | Squash merge is for people who are so wedded to the idea that
         | you should guess why the code is the way that it is instead of
         | looking it up, that they want to make sure nobody else can do
         | it either.
         | 
         | You know, assholes.
         | 
         | It's fine if you don't want to use something, or use it, as
         | long as it doesn't drag your whole team into your decision.
         | Which squash merge does.
        
         | [deleted]
        
         | bennysomething wrote:
         | Disagree, having a bunch of commits made up of undo, redo,
         | dunno what I'm doing, oh no now I know, commits is pretty
         | annoying. Squash solves that (for me, when dealing with other
         | people's merges). Obviously just my opinion etc.
        
           | marton78 wrote:
           | The problem is real and the parent commenter addresses it:
           | learn interactive rebasing to craft a beautiful commit
           | history.
        
         | streblo wrote:
         | > Developers should be required to submit _clean_ PRs, that is,
         | PRs whose git history has been organized and refactored in
         | such a way that it removes "clean up commits", "typo fix", etc.
         | 
         | The only way I've ever seen this accomplished is through GitHub
         | squash and merge.
        
         | drtz wrote:
         | > When you squash merge a feature branch that has thousands of
         | lines of code, and 6 months later you have a bug introduced by
         | this feature branch, it becomes extremely hard to find which
         | line introduced the bug.
         | 
         | The article makes an argument that each commit should be "a
         | single logical change." Squashing thousands of lines of code
         | from a long-running feature branch into a single commit is
         | obviously not what the author is suggesting.
        
         | FunnyLookinHat wrote:
         | > Squash merge has the major disadvantage of getting rid of
         | valuable meaningful git history.
         | 
         | I think the main point of the article is that the Git history
         | is often *not* valuable (because it's hard to enforce good
         | commit messages). Your later points are very valid, however,
         | and that's why I argue for smaller pull requests (and never
         | merging a PR into main that leaves it in a broken state).
        
         | Cymen wrote:
         | I am in agreement; however, I suspect many developers have never
         | seen the usage of "git bisect" to track down a bug and fix it.
         | Once you see the power of that, I think one can come to
         | appreciate more the granular git history that is present when
         | not using squash commits.
         | 
         | Of course, git bisect still works with squash commits; it just
         | makes your job as the bug fixer much harder because typically
         | squashed commits are quite large, so you have to figure out what
         | in the N lines of code introduced the defect.
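         | 
         | A minimal sketch of that workflow, assuming a throwaway
         | repository with five one-line commits where the third adds a
         | marker line standing in for the defect (the file name and the
         | grep check are made up for illustration). `git bisect run`
         | treats exit status 0 as good and 1 as bad:
         | 
```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

# Five granular commits; the third one introduces the "bug".
for i in 1 2 3 4 5; do
  if [ "$i" -eq 3 ]; then
    echo "BUG" >> app.txt
  else
    echo "line $i" >> app.txt
  fi
  git add app.txt
  git commit -q -m "change $i"
done

good=$(git rev-list --max-parents=0 HEAD)  # root commit, known good
expected=$(git rev-parse HEAD~2)           # "change 3", kept for comparison

# HEAD is bad, the root commit is good; git checks out midpoints and
# runs the test command on each until it isolates the first bad commit.
git bisect start HEAD "$good" >/dev/null
git bisect run sh -c '! grep -q BUG app.txt' >/dev/null

found=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1
git log -1 --format='first bad commit: %s' "$found"
```
         | 
         | With a squashed history the same search still ends on a single
         | commit, but that commit's diff is the whole feature rather
         | than one small step.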
        
         | calpaterson wrote:
         | > Developers should be required to submit _clean_ PRs, that is,
         | PRs whose git history has been organized and refactored in
         | such a way that it removes "clean up commits", "typo fix", etc.
         | 
         | A complete and utter waste of time. You spend more time messing
         | about with rebase than solving problems.
         | 
         | When you're digging through VCS history due to a bug you often
         | ignore the commit message anyway - if the code did what it
         | seemed to do you wouldn't be there.
        
           | sebastialonso wrote:
           | I think I'm missing something here. How valuable is it to
           | have 20 commits of "fix this error", "fix the fix of the
           | error", "revert all fixes", "real fix", etc.? I'd argue a PR
           | with many commits such as these conveys little to no useful
           | information when the actual change is 1-3 LOC.
           | 
           | What about cleaning and filtering out useless commits by
           | soft resetting the branch and committing just the actual
           | changes to merge? Sure, it can take a lot of time, but it's
           | done once, by you; after all, you're supplying the changes.
           | And if it takes a really long time, you're doing it wrong in
           | the first place by submitting PRs with many LOC.
           | 
           | Have you tried rebasing branches with dozens of short useless
           | commits? _That's_ a real waste of time, and everyone needs
           | to do this if you want to update your main branch. So you've
           | now multiplied the amount of wasted time for everyone.
        
           | preordained wrote:
           | Could not agree more...I wish I had more of substance to add,
           | but an upvote didn't feel sufficient
        
           | qudat wrote:
           | Agreed but I also think this depends heavily on context. If
           | the organization cares a lot about clean commit messages, has
           | a nice git history, and wants PRs to be submitted in a clean
           | state, great!
           | 
           | This is not always the case. I've worked on codebases where
           | we rarely scan the git history messages for anything useful,
           | it's just as easy for us to dig into the code to figure out
           | what's wrong.
        
           | thom wrote:
           | Do people here have examples of some bugs for which they had
           | to resort to VCS history to find the cause? I'm struggling to
           | picture a single bug in my whole career where this would have
           | been quicker than just following the logic of the code. If
           | there's information in commit messages that isn't evident in
           | the code itself, that seems a terrible way to live.
        
             | detaro wrote:
             | As parent said, history is not just commit messages.
             | 
             | Trivial example from yesterday: "this worked last week...
             | what did they even touch in between?", look at the most
             | relevant commit diff, context makes it obvious someone just
             | accidentally deleted a line too many while replacing a code
             | block.
             | 
             | For more complex stuff, it's helpful for "what were they
             | trying to do with this line", "what's the requirements
             | document they worked off when they wrote this", "which
             | other versions are likely to have this bug too", "Is this
             | new and I can go ask whoever wrote this about it or is it
             | years old", ...
        
             | pizza234 wrote:
             | As I wrote in another comment, bisecting (which is for me a
             | significant tool) relies on history (specifically, a
             | granular one). However, it also must be a disciplined
             | history.
        
               | thom wrote:
               | What's an example of a bug you've had to bisect for
               | recently? Forgive me, it just seems like such a last
               | resort thing.
        
               | LocalPCGuy wrote:
               | I don't bisect regularly, but when I do, it's usually
               | trying to figure out what the original reason for
               | introducing the code that is problematic is. The entire
               | point is on projects big enough, you may not be able to
               | just "follow the logic" enough to know that your fix to
               | the apparent bug isn't re-introducing some regression that
               | was fixed previously.
               | 
               | So bisect to me, is a way to figure out where and why the
               | code was changed the way that it was, so I have a better
               | understanding of the changes I can make going forward.
               | 
               | That doesn't mean the change made in the past was correct
               | and must be maintained (obviously something is broken),
               | and it might be that the obvious solution based on
               | following the code logic is correct. But that doesn't
               | mean I wasted time making sure I best understand the
               | reasoning behind the changes made. But this is also why I
               | don't resort to it very often, because it isn't necessary
               | in all cases (I'd even say it isn't necessary in most
               | cases).
               | 
               | Bisect also allows you to see other changes made in the
               | commit in question, and around that commit, so you get a
               | better overall picture of the logic.
        
           | Rapzid wrote:
           | I agree with you in part, depending on the project, if that
           | means just squashing it down first. This at least can keep
           | the integration branch somewhat sane and without a bunch of
           | intermediate commits that are junk and won't build.
           | 
           | For this reason I prefer people to rebase on the integration
           | branch instead of merging it in to stay current; unless there
           | is a reason not to. And a lot of open source projects require
           | PRs to be pre-squashed or be able to be squashed before
           | merging!
           | 
           | However, there is a time and place still for deliberately
           | crafting the PR commits. For instance if you are wanting the
           | reviewer to be able to review chunks of the PR in
           | isolation/stages, or if you want each of the commits to be
           | buildable.. I suppose also for review/testing.
           | 
           | This all depends on the project and other factors like the
           | wider org.
        
           | dyeje wrote:
           | It's a prime example of engineers mistaking aesthetics for
           | utility.
        
           | tablespoon wrote:
           | > A complete and utter waste of time. You spend more time
           | messing about with rebase than solving problems.
           | 
           | Eh. Not really. A clean history is a good resource for
           | figuring out what was changed and _why_. Cleaning up history
           | isn't even that much work anyway.
           | 
           | > When you're digging through VCS history due to a bug you
           | often ignore the commit message anyway - if the code did what
           | it seemed to do you wouldn't be there.
           | 
           | Only if the commit messages are shitty, which unfortunately
           | they often are. I try to summarize what I'm changing and what
           | I intend those changes to accomplish, because if there's one
           | thing I hate doing, it's spending more time than necessary to
           | puzzle out what I (or a coworker) was thinking at the time.
           | 
           | VCS history can be a pretty powerful tool, but a messy
           | history discourages the use of it.
        
           | throwaway1777 wrote:
           | Strongly disagree. Mainline branch commit history should be
           | clean atomic commits that you can roll back. Otherwise you
           | are just asking for trouble on any merge conflict or hotfix
           | scenario.
        
           | gregmac wrote:
           | > A complete and utter waste of time. You spend more time
           | messing about with rebase than solving problems.
           | 
           | Unless you're using a garbage client (eg, the git CLI) a
           | rebase to get rid of the "typo" "oops" type commits (when you
           | forgot something) takes I'd say 10-20 seconds.
           | 
           | Most commonly I do this when I have several changes on the go
           | at once and forget to commit a fixed unit test, new import,
           | or something like that. I usually realize it right away, so
           | my commit will usually be "add test" or "amend 2 commits ago"
           | or something to remind myself which one it goes to (and it
           | sticks out, because my other commits are usually in the
           | format "PROJ-1234: Add new --host command-line option").
           | 
           | > When you're digging through VCS history due to a bug you
           | often ignore the commit message anyway - if the code did what
           | it seemed to do you wouldn't be there.
           | 
           | I presume this is doing a bisect or something to narrow down
           | where a bug was introduced?
           | 
           | My experience comes from the opposite side. Usually I
           | identify the line of code causing a bug, it makes me go "wtf,
           | what is this even supposed to be doing?" so I do a git blame.
           | Hopefully the commit message is useful and ideally leads me
           | back to the original bug/ticket, so I can figure out what the
           | original intent/fix was, rather than inadvertently breaking
           | something (or re-causing a different bug it fixed). I'll also
           | note that in code that is well-commented and well-tested,
           | this step is rarely necessary.
           | 
           | As an example, say I come across something like this:
           | 
           |     if (port > 443) ignoreErrors = true;
           | 
           | Clearly, this was added to solve some problem, but even
           | without any context it's pretty obviously a _bad_ solution to
           | whatever that problem is. Git blame lets me go back to see
           | what the original reason/bug was, evaluate if it's still
           | relevant and then either fix it properly or safely remove
           | this line.
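           | 
           | A minimal sketch of that lookup, assuming a throwaway
           | repository; the file name client.js and the PROJ-1/PROJ-2
           | ticket prefixes are hypothetical. It asks `git blame` which
           | commit last touched the suspicious line and prints that
           | commit's subject:
           | 
```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical history: the odd line arrives in its own commit.
printf 'connect(host, port);\n' > client.js
git add client.js
git commit -q -m "PROJ-1: Add client"
printf 'if (port > 443) ignoreErrors = true;\n' >> client.js
git commit -q -am "PROJ-2: Ignore errors on high ports"

# Locate the line, then ask blame which commit last changed it.
# The first field of --porcelain output is the full commit hash.
line=$(grep -n 'ignoreErrors' client.js | cut -d: -f1)
sha=$(git blame -L "$line,$line" --porcelain -- client.js | head -n 1 | cut -d' ' -f1)

# The subject is where a ticket reference would normally be found.
subject=$(git log -1 --format=%s "$sha")
echo "$subject"
```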
        
             | sethherr wrote:
             | > I'll also note that in code that is well-commented and
             | well-tested, this step is rarely necessary.
             | 
             | This is the most important line from this comment.
             | 
             | If I'm using blame or bisect, it means I did something
             | wrong.
             | 
             | I merge 10 times a day - and if it takes me a minute,
             | because I'm slower than you - that means it's ten minutes a
             | day. Add the cognitive load of the whole process and the
             | added time of explaining it to other team members - 10
             | minutes is an underestimate.
             | 
             | I'm going to spend that time writing tests and comments,
             | since even you note that they're the better option.
             | 
             | Squash is good enough. I spend my energy improving the
             | right things.
        
             | codesnik wrote:
             | git CLI isn't so bad for that, why? I do --fixup commits
             | all the time, git rebase -i gives me all the power to
             | reorder, merge and even split commits.
        
             | FalconSensei wrote:
             | > My experience comes from the opposite side. Usually I
             | identify the line of code causing a bug, it makes me go
             | "wtf, what is this even supposed to be doing?" so I do a
             | git blame.
             | 
             | If the commit message is something like
             | "PROJECT-TICKETNUMBER Description", you can, you know... see
             | the ticket/issue along with full details and discussions?
        
           | pizza234 wrote:
           | > A complete and utter waste of time. You spend more time
           | messing about with rebase than solving problems.
           | 
           | I'm also one of those "micro committers".
           | 
           | Because of this attitude, many hundreds of pedantic commit
           | rebases later, I'm now a better programmer (and I also spend
           | next to no time with that type of rebase).
           | 
           | This is because I have now a much higher capacity (and speed)
           | of breaking down problems into smaller, self-contained,
           | steps.
           | 
           | There is a common understanding that VCSs are just storage.
           | VCSs actually map the mental model of the developer. Having a
           | precise, clear, granular VCS history has a bidirectional
           | relationship with being a precise, granular, clear developer.
        
           | mrkeen wrote:
           | > You spend more time messing about with rebase than solving
           | problems
           | 
           | Rebase should be quick and straightforward. If it's not,
           | you've got bigger problems.
           | 
           | By rebasing, you're taking responsibility for changing the
           | codebase as it exists now, rather than how it used to be.
        
         | lifeisstillgood wrote:
         | I think you are using the git commit log as a form of
         | documentation.
         | 
         | I half-agree. I don't agree devs should spend (much) time
         | cleaning up their logs as opposed to actually writing docs
         | (inline) that help the next person.
         | 
         | I do agree that there is rarely a good (even half good)
         | _history of decisions_. This is never the Jira / tickets kept
         | outside of the system.
         | 
         | But it is also no good abusing the commit log as a form of ...
         | journal. (I know that sounds silly, but...)
         | 
         | I think an actual journal / blog kept by the dev lead will be
         | much more useful.
        
           | oftenwrong wrote:
           | There's definitely value in putting documentation where it
           | will be seen. For developers, writing in a code comment or
           | commit message is a good bet. If there's an issue, they'll be
           | in the code, and they'll be in the commit log, and they'll
           | see what you've written.
           | 
           | At times I've been asked to document things in a company
           | wiki. Typically nobody reads it and it never gets updated
           | when others make changes to the code.
        
         | garmaine wrote:
         | > On the other hand, if you kept the history, and if this
         | history was clean from the get go, it becomes easy to read the
         | commits one-by-one and understand the issue.
         | 
         | Or git-bisect.
        
         | edoceo wrote:
         | maybe it's not 100% one way or the other?
         | 
         | I love squash for some things, but not others
         | 
         | today I brought in one big merge, 20+ commits, so don't squash
         | it.
         | 
         | but I also have 20 "little fix" branches, with 1 or 2 commits
         | each, which I merge all together and squash in as one merge to
         | main.
        
           | sevencolors wrote:
           | This is exactly how i run my project.
           | 
           | Amazing that folks get into these "two sides" debates. The
           | reality is always a grey area. And you should be flexible
           | enough to see the benefits of the different variations, with
           | some guidelines on how to make a decision.
        
           | PaulDavisThe1st wrote:
           | hah, a big merge. in the next month or so, i will need to
           | merge a branch with nearly 500 commits that change more than
           | 15k lines of code, under development for more than a year.
           | 
           | but yeah, like your "one big merge", i do not intend to
           | squash it (though there are actually some arguments in favor
           | that only kick in with a merge of this size).
        
         | codesnik wrote:
         | Mandatory squash merge would be very bad, right. But it sounds
         | like the success of this policy is highly dependent on the code
         | size of the average PR. I really liked working at a place where
         | commits were usually squash-rebased, which got rid of most
         | "typo"s, but long-lived huge feature branches lived a mostly
         | usual life. And if possible, some logically atomic and finished
         | groundwork parts of feature branches were extracted and
         | squash-rebased into master ahead of time, slimming the feature
         | branch, sometimes to the point that the feature branch could be
         | squash-rebased too. Git blame was VERY pleasant to work with,
         | and git-bisect would actually work if the need arose.
        
         | jtchang wrote:
         | > Squash merge has the major disadvantage of getting rid of
         | valuable meaningful git history.
         | 
         | There are reasons against it but the real world is messy.
         | Developers tend to commit things, recommit, undo, redo, move
         | things around. Is this valuable history? It can be but I would
         | say 99% of the time it's more valuable to have a good commit
         | message about what was intended rather than what actually
         | happened.
         | 
         | The idea of squash merge is to make the history clean from the
         | get-go as you said.
        
         | eximius wrote:
         | If only there was a way to have our cake and eat it too.
         | 
         | I'm assuming there is because I'm assuming those squashed
         | commits are floating somewhere in the reflog and somehow
         | connected to the squashed commits in a way that should still
         | let you bisect if only we had a tool to make it not suck.
         | 
         | Alas, maybe someone with greater git-fu can inform us.
        
           | waffletower wrote:
           | There is, as I answered separately:
           | 
           | `git log -p --first-parent`
        
           | rubyist5eva wrote:
           | In the article he says he keeps the original branch with all
           | of the history around after they squash them with a reference
           | to the original PR in the squashed merge commit message. So
           | you can always just checkout the original branch and go
           | digging in the full history.
        
             | waffletower wrote:
             | Unwieldy to keep branches around even on moderately sized
             | teams/projects. You don't need them to have a `squashed`
             | view of history when needed. I am surprised that developers
             | still cling to this outdated squashing regimen when pull
             | request tools found on GitHub, GitLab, Bitbucket etc.
             | already provide synthetic squash views derived from atomic
             | commits by default.
        
               | rubyist5eva wrote:
               | How is it unwieldy? They don't even show up locally when
               | you `git branch` unless you've checked them out.
               | Otherwise they just sit there doing nothing, there is
               | literally no maintenance required on them once they've
               | been squashed into mainline. Most central repositories
               | end up having hundreds (or even thousands) of "finished"
               | branches that nobody ever looks at anyway.
        
               | waffletower wrote:
               | Depends on configuration if you wind up having that many.
               | I use `git branch -v` often to figure out what branch a
               | colleague is working on. I guess I could apply my same
               | logic and use filtering tools here as well. But I don't
               | see the benefit of squashing the merge commits when git
               | CLI, github etc. can give you the same view without the
               | gratuitous data loss.
        
         | scott0129 wrote:
         | I would agree with your points if not for the fact that "clean
         | up commits" or "typo fix" is a necessary result of PRs.
         | 
         | Your teammate will request changes in your code, and the only
         | way to cleanly communicate "yes I made that change, and ONLY
         | that change" is through these clean-up commits.
         | 
         | Otherwise, if you amend/force-push or open an entirely new PR,
         | 99% of the diff are things that your team has already seen and
         | reviewed.
         | 
         | Squash merges let you clearly communicate how you addressed PR
         | comments, while also keeping the master history clean.
        
           | mrkeen wrote:
           | I don't mind this. I want my reviewers to look at the state
           | of the new code - not the diff, and not the diff of the diff.
           | 
           | Address PR comments using the PR comments.
        
           | jcelerier wrote:
           | > Otherwise, if you amend/force-push or open an entirely new
           | PR, 99% of the diff are things that your team has already
           | seen and reviewed.
           | 
           | Gerrit has solved this issue for years by showing the diffs
           | between each successive revision of a patch.
           | 
           | e.g. look here at the files at different origin patchsets:
           | https://codereview.qt-project.org/c/qt/qtwayland/+/321246/3....
        
             | jonhohle wrote:
             | This was a feature of ReviewBoard as well. The history of
             | code review changes was maintained by the tool separately
             | from commit history.
        
             | pjc50 wrote:
             | It took me a while to get a workflow that actually works
             | with Gerrit, and I occasionally think of trying to do some
             | kind of autosquash so I can have successive commits
             | locally. In practice I just use --amend all the time.
             | 
             | Careful use of setting upstream and of course pull=rebase
             | makes keeping up with trunk manageable.
        
           | leafmeal wrote:
           | Fixup commits are a good way to work around this.
           | https://git-scm.com/docs/git-commit#Documentation/git-commit...
           | 
           | You still create your "clean up" commits, but it's done in a
           | way where git can automatically squash everything back
           | together in the end.
           | 
           | See also
           | https://git-scm.com/docs/git-rebase#Documentation/git-rebase...
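           | 
           | A minimal sketch of that flow, assuming a throwaway
           | repository and made-up file names and messages. Review
           | feedback on the first commit is recorded with
           | `git commit --fixup`, and `git rebase -i --autosquash`
           | folds it back in; setting GIT_SEQUENCE_EDITOR to `true`
           | here just accepts the generated todo list so the example
           | runs without opening an editor:
           | 
```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "base"

# Two clean commits on the feature branch.
git switch -q -c feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
echo "docs" > docs.txt
git add docs.txt
git commit -q -m "Add docs"

# Review feedback touches the first commit: record it as a fixup of it.
target=$(git rev-parse HEAD~1)
echo "feature, reviewed" > feature.txt
git commit -q -a --fixup "$target"

before=$(git rev-list --count main..feature)  # 3: two commits plus the fixup

# Fold the fixup into its target without editing the todo list by hand.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash main

after=$(git rev-list --count main..feature)   # 2: the fixup is gone
git log --oneline main..feature
```
           | 
           | Reviewers see the fixup as its own commit while the PR is
           | open, and the branch ends up with the same two commits it
           | started with once the rebase runs.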
        
             | maxioatic wrote:
             | Huh, never knew about this. Seems quite useful. Thanks!
        
         | ginja wrote:
         | I do agree that PRs should have a clean history. However:
         | 
         | - Services like GitHub allow you to restore the original
         | branch, so you never actually lose history. So I don't see any
         | major drawbacks of squashing on merge.
         | 
         | - If your PRs are several thousand lines long, they probably
         | should've been broken up into multiple PRs (your reviewer will
         | appreciate it)
        
           | pizza234 wrote:
           | > - Services like GitHub allow you to restore the original
           | branch, so you never actually lose history. So I don't see
           | any major drawbacks of squashing on merge.
           | 
           | There's definitely a drawback, and it's bisecting, which is
           | actually a big deal.
           | 
           | Bisecting lets you search, in a semi-automated way (depending
           | on the issue), through what otherwise can be a large diff.
           | 
           | But of course, it requires a disciplined history - otherwise,
           | bisecting just won't work.
        
             | marcinzm wrote:
             | Can't you just bisect twice? Once to find the PR and a
             | second time within the PR's branch?
        
           | finnh wrote:
           | Spot-on for both counts. Preserving a PR's individual commits
           | in the master branch is insane - many of those commits won't
           | represent a fully working system anyway, so why keep them in
           | master?
        
             | sjoruk wrote:
             | > many of those commits won't represent a fully working
             | system anyway
             | 
             | This is entirely within the control of the developer(s)
             | working on the project. Whether it's worth it is up to the
             | people working on it. It's certainly not insane though -
             | it's much easier to fix merge/rebase conflicts when each
             | commit is small and easy to reason about.
        
               | finnh wrote:
               | > it's much easier to fix merge/rebase conflicts when
               | each commit is small and easy to reason about.
               | 
               | I agree, but I'd say it in the context of PRs. It's much
               | easier to fix issues (and avoid them) when _PRs_ are
               | small & easy to reason about.
        
         | asimpletune wrote:
         | Hey, so this POV comes up a lot, and I have to say that I think
         | it mistakes how git commits _should_ work, but I'll add that
         | you can sort of have both.
         | 
         | First, in a production branch, git commits should be thought of
         | as functions. Like "Apply commit X, get feature Y, unapply it
         | and you get the reverse". So the problem with preserving full
         | git history in master is that it breaks that invariant. You
         | have to sort of do like a range of commits, but then that
         | doesn't really always work because oftentimes other commits
         | can be interleaved into yours.
         | 
         | To understand how I mean, just look at Linux or open source
         | software projects, where the technical experts are more or less
         | gatekeepers and aren't accountable to any other influence.
         | You'll find that commits work this way, and their history is
         | squashed. (Maintainers will also force you to rebase before
         | merging and basically put all the work on you to get the PR in
         | ship shape, which is a lot different from a corporate
         | environment)
         | 
         | Ok, but then to your point about preserving valuable, more
         | granular commits, well, the solution is you just leave that
         | remote branch up, either in your fork or elsewhere (but
         | probably in your fork). These branches should have formal names
         | identifying a ticket (at work) or an RFC or whatever, so it
         | should be easy for people to discover what happened. They can
         | see this remote issued a PR to this remote to implement RFC-123
         | or whatever, and they can go to that remote and see the
         | granular commits if you preserved them.
         | 
         | Sorry, this is like a never ending debate and there are very
         | strong opinions, but I sincerely think that in this situation
         | there is actually a right answer. People who already know how
         | to do this or are super familiar with open source, maintainers
         | or team leads or whatever, I think don't see the point in
         | fighting over it, since they can just enforce whatever they
         | want in their gatekeeper role. However, the truth is that
         | there's sort of a fundamental misunderstanding that the
         | majority of the population has around git and I think it's
         | better to engage and explain when you get the chance.
        
         | cortesoft wrote:
         | > When you squash merge a feature branch that has thousands of
         | lines of code, and 6 months later you have a bug introduced by
         | this feature branch, it becomes extremely hard to find which
         | line introduced the bug.
         | 
         | So, the only way that I can think that having the squash commit
         | broken into individual commits would help to find the one
         | broken line is because it would enable git bisect to find the
         | failing commit.
         | 
         | However, that implies that each individual commit inside that
         | feature branch worked on its own. If that is the case, a better
         | suggestion would be to break up the giant feature branch into
         | smaller sub features that can be merged as they are completed.
        
           | [deleted]
        
         | Ozzie_osman wrote:
         | > When you squash merge a feature branch that has thousands of
         | lines of code, and 6 months later you have a bug introduced by
         | this feature branch, it becomes extremely hard to find which
         | line introduced the bug.
         | 
         | I try to avoid feature branches with "thousands of lines of
         | code" on most of my teams, and have been pretty successful.
         | Those types of feature branches create a lot of other problems.
         | On the other hand, small, incremental pull requests that get
         | merged back to master and have really short lifespans, along
         | with things like feature flags to decouple delivering code from
         | delivering functionality have worked really well.
         | 
         | In this world, squash merges are awesome, because any squash
         | merge is basically a "commit" in the other world, and
         | developers can feel free to commit however they want within the
         | branch.
        
       | FalconSensei wrote:
       | I love squash-merge, and never had to see which specific commit
       | inside a PR/MR changed a line. In my current project we use
       | `JIRA-### Title` as the MR title (Gitlab). All information is in
       | the ticket. Also, IF I needed to see the detailed commit history
       | for that merge, Gitlab still has it. We try to keep our tickets
       | small, so any merge shouldn't have a long commit history
       | anyway.
        
       | deathanatos wrote:
       | Squash & merge is objectively worse than rebase && merge
       | --no-ff.[1] (Roughly what the article calls "no fast-forward".) "No
       | fast-forward" meets all of the criteria the article's author
       | proposes:
       | 
       | > 1. Combines all the code changes related to a single logical
       | change
       | 
       | Yes: the merge commit is that.
       | 
       | > 2. Provides an explanatory commit message that helps people
       | understand the intent of the change
       | 
       | This is no more or less true than squash & merge. (Although, I
       | don't off the top of my head remember how good the automatic
       | message in Github is.) But that's more a problem of an automatic
       | message than it is the merge strategy, & squash and merge also
       | has this. (I've seen numerous squash commits with "Fix CI, Fix
       | CI, Code formatting, Fix Lint warning" in them...)
       | 
       | Good commit messages boil down to the discipline of the coder.
       | (And a reviewer being able to say, "Can you write a better commit
       | message?".)
       | 
       | > 3. If you pick this commit independently from the history, it
       | makes sense on its own
       | 
       | The merge commit.
       | 
       | What you lose with squash & merge is the history. The author does
       | sort of address this:
       | 
       | > _In case you are wondering if we are losing the individual
       | changes, the answer is no. Each squash merge references back to a
       | PR where the whole changes are tracked:_
       | 
       | And while this is technically true, it's a reference only in the
       | textual message (which Github will nicely turn into a link, but
       | you must be in Github for that). "No fast-forward" will maintain
       | those references in the git commit parent information, which
       | means that tooling like git bisect should be able to see into it.
       | But with a squashed commit, the closest you get is "some commit
       | in this PR", essentially. Same with reverts: if just one commit
       | on a feature branch is bad, you can simply revert that one
       | commit. (Or, if most of the branch is bad, you can revert the
       | merge commit & cherry-pick the good bits.)
       | 
       | If you don't want to see all the feature branch commits in the
       | history, you can just follow first parents.
       | 
       | [1] I prefer a quick rebase prior to merge but after code-review,
       | as it is a good balance between the resulting history being
       | readable, and not rewriting history while your reviewer is
       | looking at it. But the argument here should hold regardless; the
       | definition the article uses is sufficient, too.
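       | 
       | A minimal sketch of the workflow described above (rebase, then
       | merge --no-ff, then read history by first parents and revert a
       | single feature commit). The repo layout, file names and commit
       | messages are invented for the example:

```shell
#!/bin/sh
set -eu
# Throwaway identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
echo base > base.txt; git add base.txt; git commit -q -m "base"

# Hypothetical feature branch with two small commits.
git checkout -q -b feature
echo one > f1.txt; git add f1.txt; git commit -q -m "feature: part one"
echo two > f2.txt; git add f2.txt; git commit -q -m "feature: part two"

# Meanwhile main moves on.
git checkout -q main
echo hotfix > m1.txt; git add m1.txt; git commit -q -m "main: unrelated change"

# Quick rebase after review, then merge with an explicit merge commit.
git checkout -q feature
git rebase -q main
git checkout -q main
git merge -q --no-ff -m "Merge feature: parts one and two" feature

# Following first parents shows one entry per merged branch...
first_parent_count=$(git rev-list --first-parent --count main)
# ...while the full history still contains every feature commit.
all_count=$(git rev-list --count main)
git log --first-parent --oneline main

# Revert just one commit from the feature branch, keeping the rest.
git revert --no-edit feature~1 >/dev/null
```

       | Here the first-parent view lists three commits (base, the
       | unrelated change, the merge) while the full history has five.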
        
       | hmsimha wrote:
       | This is great! The takeaway for me is that with squash merge you
       | get one commit with all the changes which (optimally) has the
       | full context in the commit message.
       | 
       | My typical workflow is basically the same, but with putting all
       | that context in the merge commit. This allows you to find it with
       | a bit of work (blame to find the commit line, then figure out
       | where that was merged in, then find the merge commit). Squash
       | merge puts all that context in one commit, and keeps a linear
       | history. I had assumed squash merge squashes the commits, then
       | creates a merge commit still, which would mean you'd probably do
       | something like combining messages for each commit in the squash
       | commit message, then capturing the overview in merge commit.
       | 
       | The article says you can still find the individual commits via
       | PR, which is a minor disadvantage as it means you can only do
       | exploration of these via github. If you've deleted the topic
       | branch on github are they still accessible? If it's been garbage
       | collected by git (or never existed locally if you're looking at
       | someone else's changes), is there a way to check them out?
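       | 
       | As far as I know, GitHub keeps a read-only ref for each pull
       | request at refs/pull/<N>/head, which stays around after the topic
       | branch is deleted, so the commits can be fetched by PR number.
       | The sketch below only simulates that with a local bare repo: the
       | update-ref call stands in for what GitHub does on its side, and
       | the repo names and PR number 1 are made up:

```shell
#!/bin/sh
set -eu
# Throwaway identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"

# Local stand-in for the hosted repository (hypothetical).
git init -q --bare hosted.git
git -C hosted.git symbolic-ref HEAD refs/heads/main

# The PR author's clone, with a two-commit topic branch.
git init -q author
cd author
git symbolic-ref HEAD refs/heads/main
echo base > app.txt; git add app.txt; git commit -q -m "base"
git checkout -q -b topic
echo one >> app.txt; git commit -q -am "topic: one"
echo two >> app.txt; git commit -q -am "topic: two"
topic_tip=$(git rev-parse HEAD)
git push -q ../hosted.git main topic

# Simulate the host recording PR #1, then the topic branch being deleted.
git -C ../hosted.git update-ref refs/pull/1/head "$topic_tip"
git -C ../hosted.git update-ref -d refs/heads/topic
cd ..

# Someone else clones later and fetches the PR's commits by number.
git clone -q hosted.git reader
cd reader
git fetch -q origin pull/1/head:pr-1
fetched_tip=$(git rev-parse pr-1)
git log --oneline main..pr-1
```

       | Against a real GitHub remote only the fetch line would be
       | needed, with the actual PR number in place of 1.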
        
       | decebalus1 wrote:
       | The older I get the more I find these discussions as
       | counterproductive as figuring out where to put the bike shed.
       | 
       | I worked in teams that did squash merge and in teams that didn't.
       | And in teams where some did and some didn't, on the same repo. In
       | the grand scheme of things it didn't matter, except for
       | hardliners who had nothing better to talk about.
        
         | allenu wrote:
         | Yep, I agree. In my ideal world, everybody (including me) would
         | have pristine commits that are broken up logically. There
         | wouldn't be small commits created as part of code review to fix
         | things up based on feedback. You'd just have a single "Add
         | feature X" or "Fix bug Y".
         | 
         | However, it's an imperfect world and there are trade-offs. It's
         | hard to train people to do things the right way, and then
         | policing when they do it the wrong way or putting roadblocks in
         | place to make them do it the right way takes energy, and often
         | the benefit is minor. It really depends on how frequently you
         | pore through your history to find an offending bug and a whole
         | bunch of other team policy that isn't directly related to git
         | commits.
         | 
         | I used to spend a lot of energy rebasing so that I could have a
         | beautiful history, but honestly it hasn't benefited me that
         | much. I still rebase, but I'm not so strict about it on my
         | commits, nor others'. I'd rather spend my focus and energy
         | elsewhere.
         | 
         | Just do what the team is comfortable with and be flexible.
        
         | jakeva wrote:
         | > figuring out where to put the bike shed.
         | 
         | Anyway, the _real_ question is what color to paint the bike
         | shed?
        
         | scubbo wrote:
         | I agree with your point, but, tangentially, I do find it
         | amusing that "where to put the bike shed" is actually _more_
         | impactful than what I thought "bike-shedding" referred to
         | (arguing about what _colour_ to paint the bike shed for a
         | nuclear reactor). At least the location of the bike shed
         | actually has some (small) effect on people's commute!
        
       | bob1029 wrote:
       | I file this one under: It Depends(tm)
       | 
       | If your software development process requires that multiple
       | people have a hand in each pull request, or these pull requests
       | are part of a more complex merge graph, then I can clearly see
       | the argument for NOT doing squash merge. This is plainly obvious
       | to me and I would be on your side for not going down the squash
       | path. Knowing who was responsible for each part is a very
       | important thing.
       | 
       | If your software development process only ever has a single
       | author per pull request, and these are only ever directed from
       | work branch->master branch, then I would strongly argue for doing
       | the squash merge option. This is what we do today, because we
       | find it to be the ideal blend of hiding subjective commit styles
       | while still preserving essential knowledge about who did what.
       | 
       | Occasionally, we will break our own rules (oops i didn't squash
       | that one), but we don't make a big deal out of it. There are way
       | more important things to worry about most of the time. There is
       | only ever 1 specific commit hash you build your software at, so
       | it doesn't really matter if the branch has 1 or 10,000 commits in
       | it.
        
       | waffletower wrote:
       | Many developers naively sell `git squash` using a clarity
       | argument. By squashing you lose historical information: there are
       | times when the content of a merge requires a paper trail, times
       | when individual commits can aid to separate the portions of a
       | merge you would like to keep versus those you would like to
       | rollback. Perhaps in a 10 times a day release regimen you decide
       | never to look for such history. One size does not fit all,
       | however.
       | 
       | I prefer a "have your cake and eat it too" approach. Keep the
       | commit history. Use readily available tools to squash when
       | performing analysis should you choose:
       | 
       | `git log -p --first-parent` (git 2.31+; the ability is available
       | in earlier versions of git with a different syntax)
        
       | Pxtl wrote:
       | Squash was the thing that convinced me that the emperor has no
       | clothes. Realizing that I was going to either have to train every
       | junior, every four-month community-college student brought in on
       | co-op to modify their history in an awful UI with tons of
       | gotchas, _or_ I would have to accept the downsides of squash?
       | 
       | It's so stupid.
       | 
       | Git desperately needs a layer above the commit that groups
       | related commits together into a semantically commit-like object
       | that you can show in history and jump to its HEAD and cherry-
       | pick. Because the squash is a dumb hack, and meticulously editing
       | your history is not productive work.
       | 
       | I want squash. But I want squash without all the boneheaded
       | implementation-detail downsides of squash. I want squash where I
       | can put up a PR and keep working and then not have to deal with
       | the cherry-pick pain if I want to build off that work after it
       | merges. I want squash where I can leave the branch up after the
       | merge and still see that it's behind the main branch.
       | 
       | But git's simple "everything is a commit" model makes that
       | impossible.
        
         | eMGm4D0zgUAVXc7 wrote:
         | > Git desperately needs a layer above the commit that groups
         | related commits together into a semantically commit-like object
         | that you can show in history and jump to its HEAD and cherry-
         | pick.
         | 
         | It's called a "branch" :) The commit-like object to represent
         | it is the merge commit.
         | 
         | If you need to group related things in a branch in a more fine-
         | grained fashion then do sub-branches and merge them into the
         | branch with "--no-ff" so you get a merge commit for each to
         | describe them.
        
         | operator-name wrote:
         | Can't you achieve something similar with `--no-ff` and tags?
        
       | Qerub wrote:
       | It would be nice if GitHub allowed comments on commit messages
       | and not only the changes so that they could be discussed for the
       | benefit of learning and improvement.
       | 
       | Squash merge commit messages are currently not reviewable at all
       | since they are not entered until just before the merge.
       | 
       | I'd prefer for nothing at all to be able to enter the main branch
       | without review, if not for anything else to protect myself
       | against my own mistakes.
       | 
       | Guess I should give Gerrit a shot.
        
       | f154hfds wrote:
       | My org has started to strongly recommend squashing before merging
       | (in my opinion one step less extreme than the forced squash-merge
       | mentioned in the article). I tend to consider this a decent
       | principle in general but rules are made to be broken.
       | 
       | My main concern is if anyone ever commits based off of a pre-
       | squashed branch. They won't be able to simply merge upstream
       | after their parent has been squashed; they will now have to
       | cherry-pick or incur strange redundant merge conflicts as they no
       | longer share history.
       | 
       | For a small team whose features tend to be short-lived before
       | going upstream this won't likely be a problem but believe me, if
       | you ever need long-lived feature branches on a larger team,
       | squashing them can cause more trouble than your nice history will
       | gain.
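       | 
       | One way to do that cherry-picking in a single step is
       | `git rebase --onto`, which replays only the commits the child
       | branch added on top of its pre-squash parent. A sketch, with
       | invented branch names "parent" and "child" and files chosen so
       | that nothing conflicts:

```shell
#!/bin/sh
set -eu
# Throwaway identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
echo base > base.txt; git add base.txt; git commit -q -m "base"

# Parent feature branch with two commits.
git checkout -q -b parent
echo p1 > p1.txt; git add p1.txt; git commit -q -m "parent: p1"
echo p2 > p2.txt; git add p2.txt; git commit -q -m "parent: p2"

# Someone branches off the not-yet-squashed parent and adds their own work.
git checkout -q -b child
echo c1 > c1.txt; git add c1.txt; git commit -q -m "child: c1"

# The parent is squash-merged into main; child no longer shares that history.
git checkout -q main
git merge -q --squash parent
git commit -q -m "Parent feature (squashed)"

# Replay only the commits in parent..child on top of the new main.
git rebase -q --onto main parent child

git log --oneline child
```

       | After this, child sits directly on top of main with only its own
       | commit. It still requires the old parent branch (or its commit
       | hash) to be known, and real conflicts are of course possible.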
        
       | tediousdemise wrote:
       | My old team's git workflow required us to rebase our feature
       | branches to develop before merging, and our branch could only
       | contain a single commit. The commit needed a special formatting
       | (short title, description, and story number) so it would get
       | picked up by Jira scripts.
       | 
       | Although the history was super clean, the extra upkeep was a
       | small annoyance. I felt like tearing my hair out when there were
       | 10 other merge requests pending and I never knew which would be the
       | next to merge with develop. An auto-rebase feature (assuming no
       | merge conflicts) would have saved me countless pointless minutes.
        
       | nickbauman wrote:
       | Generally, time spent twiddling with the repo is time not spent
       | delivering code. It's a distraction. Yes, git has all these
       | features that let you do that, and those features matter when
       | you're committing to the Linux repo which has thousands of eyes
       | and your commit history has to help you communicate to a very
       | wide audience. But the vast majority of us are not using git like
       | this. I've used git bisect so rarely that all this is overkill.
        
         | gouggoug wrote:
         | - Generally time spent writing documentation is time not spent
         | delivering code.
         | 
         | - Generally time spent commenting code is time not spent
         | delivering code.
         | 
         | - Generally time spent diagramming on a white board is time not
         | spent delivering code.
         | 
         | - Generally time spent writing specs is time not spent
         | delivering code.
         | 
         | Yet, doing all of these is actually extremely important. How
         | much importance you give each one of them is up to you, that's
         | for sure.
         | 
         | So, while "Generally, time spent twiddling with the repo is time
         | not spent delivering code" is true, it's nonetheless important,
         | and this statement disregards the fact that "twiddling"
         | usually only takes a few minutes.
        
         | bonzini wrote:
         | Time spent twiddling with the repo is time saved in the future
         | debugging or writing documentation.
         | 
         | You can have large refactoring PRs for which splitting them
         | further really makes little sense(*), but that have a huge risk
         | of introducing regressions. Being able to bisect them is much
         | easier with a properly maintained repository.
         | 
         | (*) And then you spend time twiddling with GitHub, which is the
         | same as twiddling with the repo except with worse tools.
        
       | wbronchart wrote:
       | I love clean linear history, but I don't like squash&merge, and I
       | don't like the other options that the github interface gives you
       | either.
       | 
       | Plug: I wrote a script recently that merges github pull requests
       | but preserves linear git history (basically, rebase + merge)
       | 
       | https://pypi.org/project/git-pr-linear-merge/
        
       ___________________________________________________________________
       (page generated 2021-04-30 23:01 UTC)