[HN Gopher] Some of us like "interdiff" code review
       ___________________________________________________________________
        
       Some of us like "interdiff" code review
        
       Author : todsacerdoti
       Score  : 231 points
       Date   : 2024-09-10 20:39 UTC (1 days ago)
        
 (HTM) web link (gist.github.com)
 (TXT) w3m dump (gist.github.com)
        
       | zdw wrote:
       | I've generally found that code review first, and rebase-centric
       | systems like Gerrit tend to be much easier to review code in. One
       | of the best parts of this is native support for stacking multiple
       | patches, so people make smaller patches that are easier to
       | review.
       | 
       | Code review in Github feels like a bad afterthought - the space-
       | wasting interface that looks more like a forum thread, the
       | inability to track over rebases, etc.
        
         | zeotroph wrote:
         | There is one thing I miss on Gerrit when you push a stack of
         | commits: A central place to talk about the whole of the stack,
         | not just individual commits. This "big picture", but still
         | technical stuff, too often happens in the issue tracker. But
         | where to place it, I have no idea. This stack is just too
         | ephemeral and and can be completely different on the next push.
        
           | lima wrote:
           | Many teams use topics for this.
        
             | aseipp wrote:
             | Yeah, but you can't really discuss the topic _itself_ ,
             | right?
             | 
             | I do think this is a weakness of Gerrit. It doesn't really
             | capture "big picture" stuff nearly so well. At least on GH
             | you can read the top-level comment, which is independent of
             | the commits inside it. Most of the time I was deep in
             | Gerrit doing review or writing patches, it was because the
             | architectural decisions had already been made elsewhere.
             | 
             | I guess it's one of the tradeoffs to Gerrit _only_ being a
             | code review tool. Phabricator also didn 't suffer from this
             | so much because you could just create a ticket to discuss
             | things in the exact same space. Gerrit is amazingly
             | extensible though so plugging this in is definitely
             | possible, at least.
        
               | another-acct wrote:
               | On a mailing list, you used to be able to write up the
               | big picture in the "cover letter" (patch#0). Design-level
               | discussions would generally occur in a subthread of
               | patch#0. Also, once the patch set was fully reviewed, the
               | maintainer could choose to apply the branches on a side
               | branch at first (assuming the series was originally
               | posted with proper "--base" information), then merge said
               | side branch into master at once. This would preserve
               | proper development history, plus the merge commit
               | provides space for capturing the big picture language
               | from the cover letter in the git commit log.
        
       | jeffbee wrote:
       | Someone should do a deep dive into developer productivity after
       | LLVM switched from Phabricator to GitHub. How many other major
       | projects have done a switch like that?
        
         | ajkjk wrote:
         | Better it be a deep dive into developer sanity, a more
         | important but comparatively under-valued metric.
        
           | jeffbee wrote:
           | For me, GitHub PR review drives me crazy. It's good for
           | exactly one round of exchange. After that nobody can tell
           | what the heck is going on. So my self-reported mental health
           | would be worse.
           | 
           | But on non-subjective metrics it seems like LLVM PRs on
           | GitHub are gathering noticeably less discussion than they
           | used to enjoy as Phabricator diffs.
        
             | zeotroph wrote:
             | And just visually, GitHub wastes so much vertical space, so
             | even trying to place what belong to which patchset becomes
             | hard.
        
       | ams92 wrote:
       | Most of the complaints here could be solved by having smaller
       | pull requests and then squashing commits when it's time to merge.
        
         | eximius wrote:
         | Well, yes. Or by having stacked branches and rebasing your
         | branches, etc.
         | 
         | The point is that GitHub/git's default experience makes this
         | harder to do than a system that bakes it in.
        
         | gloryjulio wrote:
         | We use stacked commits + rebase only in our company. The commit
         | history is linear and it's very easy to revert changes. I don't
         | see any advantage of using merging instead of rebase
         | 
         | I am not sure why we need to squash commits. We encourage the
         | opposite where you should commit small and often. So if we need
         | to revert any commit, it's less painful to do so.
        
           | majormajor wrote:
           | Without squashing it's hard for me to commit as small and
           | often as I would like.
           | 
           | Some things I want out of the final series of commits:
           | 
           | 1) everything builds. If I need to revert something or roll
           | back a commit, the resulting point of the codebase is valid
           | and functional and has all passing tests.
           | 
           | 2) features are logically grouped and consistent - kinda
           | similar to the first, but it's not just that I want the build
           | to pass, I don't want, say, module A to be not yet ready for
           | feature flag X but module B to expect the feature flag to
           | work. In the original article, this is to say that I want the
           | three commits listed, but not one halfway through the
           | "migrate API users" step.
           | 
           | But when I'm developing I _do_ want to commit halfway through
           | steps. I might commit 50 lines of changes that I 'm confident
           | in and then try the next 50 lines and decide I want to throw
           | them away and try a different way. I might just want to push
           | a central copy of what I've got at the end of the day in case
           | my laptop breaks overnight (it's rare, but happens!). I might
           | want to push something WIP for a coworker to take an initial
           | look at with no intent of it being ready to land.
           | 
           | But I don't want any of those inconsistent/not-buildable/not-
           | runnable states to be in the permanent history. It fucks with
           | things like git bisect and git blame.
        
             | gloryjulio wrote:
             | When we publish a stack of commits, our ci ensures that
             | every commit is build and tested individually. There is no
             | consistency issue
             | 
             | Squash and merge actually makes the above goal harder. With
             | rebase + small commits, all we need to make sure is that
             | every commit pass all the build signals and tests during ci
        
               | john_the_writer wrote:
               | This only works if your commit in a green state.
               | Sometimes we have to change when things are still
               | "Yellow"..
               | 
               | I tend to add all my tests in one go and commit the RED.
               | "tests are written" Then as I pass each test, I commit
               | that.
               | 
               | This pattern works really well for me because if I mess
               | up, then rolling back to the last yellow is easy. I can
               | also WIP commit if I have to fix an urgent bug, and then
               | get back to the WIP later.
        
               | gloryjulio wrote:
               | Not sure what you mean... When we ship a stack of
               | commits, every commit has to pass everything in CI. You
               | are not suppose to ship a commit that's not passing the
               | ci bar. There is a escape hatch that you can bypass but
               | it's rarely used.
               | 
               | You can make changes before you ship however you wanted
               | as long as they pass ci. If you already shipped the code
               | and want to make changes later, that means making new
               | commit or reverting a bad commit. It's simple as that
        
             | tome wrote:
             | I think there's an ambiguity here between squashing every
             | commit in the PR into a single one, and squashing fixup
             | commits made as responses to review into the commits that
             | originated them.
             | 
             | For example, if the original commit series was
             | Do a small refactor before I can start adding the test
             | Add the test for the feature         Do a small refactor
             | before I can start adding the feature         Work in
             | progress         Complete sub-feature 1         Work in
             | progress         lint         lint         Complete sub-
             | feature 2         Respond to reviewer 1 comments
             | Respond to reviewer 2 comments
             | 
             | Then you can either squash the entire PR down to
             | Implement feature
             | 
             | or you can, using interactive rebase, squash (or more
             | precisely fixup) individual WIP, lint, and response commits
             | into where they belong to obtain                   Do a
             | small refactor before I can start adding the test
             | Do a small refactor before I can start adding the feature
             | Complete sub-feature 1         Complete sub-feature 2
             | 
             | where each commit individually builds and passes tests. I
             | _far_ prefecr the latter!
        
           | PoignardAzur wrote:
           | My experience is that systemically squashing PRs enables a
           | "fire and forget" style where you can add a bunch of small
           | commits to your PR to address reviews and CI failures without
           | worrying about making them fit a narrative of "these are the
           | commits my PR is made of".
           | 
           | On a more concrete level, squashing PRs means every single
           | commit is guaranteed to pass CI (assuming you also use merge
           | queues) which is helpful when bisecting.
        
             | gloryjulio wrote:
             | With stacked commits, every commit is already passing CI
             | though.
             | 
             | To us the mental model is minimum. All you need to do is to
             | make sure each commit pass CI. You can ship any number of
             | stacked commits together
             | 
             | -----------------------------------------------------------
             | -----------------------------------------
             | 
             | Not sure why I can't reply in a technical discussion. I
             | have to edit to answer your question @danparsonson
             | 
             | > if I'm working on a long series of changes across
             | multiple days, and halfway through it the code doesn't
             | build yet?
             | 
             | That's why you break them down into small commits. The
             | early you push it to CI, the earlier you will know whether
             | each commit builds. For example, push commit 1 2 3 to the
             | CI when they are ready. When the CI is running, you are
             | working on commit 4 5 6
             | 
             | > The code won't pass CI because I'm not finished, but I
             | want to commit my progress
             | 
             | If your commit 1,2,3 are ready, just ship them. It doesn't
             | stop you have a few commits in reviews and a few WIP
             | commits. There is no down time
        
               | danparsonson wrote:
               | Perhaps I misunderstand you but what if I'm working on a
               | long series of changes across multiple days, and halfway
               | through it the code doesn't build yet? The code won't
               | pass CI because I'm not finished, but I want to commit my
               | progress so I don't lose it if something goes wrong, and
               | I can roll back if make mistakes.
        
               | tome wrote:
               | Then fix up the commit history at the end, for example
               | like this: https://news.ycombinator.com/item?id=41509051
        
         | lozenge wrote:
         | Not really. The idea is to split work into separate stages
         | which are reviewed separately, but as a whole.
         | 
         | In the example: "small refactor 25LOC -> new API 500LOC ->
         | migrate API users 50LOC"
         | 
         | Making a PR of the small refactor will probably garner comments
         | about "why is this necessary".
         | 
         | Opening two PRs at the same time is clutter as GitHub presents
         | them as separate.
         | 
         | As well, sometimes CI won't pass on one of the stages meaning
         | it can't be a separate PR, but it would still be useful in the
         | code review to see it as a separate stage.
        
           | john_the_writer wrote:
           | the comments about "why is this necessary" can be handled
           | with a decent PR template, and a comment.
           | 
           | What I tend to do is make the changes locally with different
           | commits and then cherry pick the refactor into a PR branch
           | and wait for that to be accepted. Then I rebase the FULL
           | branch with "master" after the merge and create the PR.
        
           | danparsonson wrote:
           | I'd be quite happy with seeing the three jobs in the article
           | as three separate PRs. Fixing a bug and adding a feature are
           | two jobs that, as I think we all agree, need to be tracked
           | individually - so work on them individually.
           | 
           | > As well, sometimes CI won't pass on one of the stages
           | meaning it can't be a separate PR
           | 
           | Could you give an example of this? Not sure what you mean.
        
             | Forge36 wrote:
             | Commits aren't always perfect.
             | 
             | Sometimes I'll make the unit test first, which fails CI and
             | the next set of commits implements the behavior.
        
               | trashburger wrote:
               | By doing this, you break commit atomicity and make
               | bisects hell. Please don't do this. Commits aren't
               | perfect at first for sure, but they should be by the time
               | you make them reviewable.
        
               | snatchpiesinger wrote:
               | It's fine to break commit atomicity on feature branches.
               | You can use git bisect --first-parent on you
               | development/master branch.
        
               | trashburger wrote:
               | I completely disagree. In doing so you lose all
               | visibility into the components and gradual evolution of
               | the code that atomic commits provide. Same thing with
               | squashing (which is just the worst).
        
         | fHr wrote:
         | Yeah I don't see the point, why not just use mergetrains?
        
         | globular-toast wrote:
         | That's like a caveman approach to the problem. Imagine the
         | extra overhead required to submit the "refactor" commit. The
         | result world be either nobody refactors or refactors are just
         | bundled into the feature commit so it's never clear what you're
         | actually reviewing.
        
         | keybored wrote:
         | Can someone explain to me why everything has to be done with
         | PRs? Like you just have three commits for a PR. But the correct
         | way is to split that up into three single-commit PRs? Why?
         | 
         | Not to mention that it doesn't give you an interdiff. Because
         | now you need to diff across three pull request.
         | 
         | It looks more like you are punting on the problem. Not solving
         | anything.
        
       | OJFord wrote:
       | I create my GitHub PRs like a reviewer's going to look at the
       | commits if it's too large overall. I'm also fairly sure they
       | don't, because it's basically never worth doing so in their PRs
       | ('fix the test', 'merge origin/master', 'address review comment',
       | etc. commits).
       | 
       | I suppose I agree GitHub doesn't help with this / implicitly
       | opposes it (I don't see the 'explicitly and' claim justified
       | though?) but there's nothing about a PR that isn't a 'series' of
       | 'patches'. Maybe GitHub's just responding to the way most people
       | use it, and making that easier/better instead of being
       | principled? Doesn't mean we can't be.
        
       | enasterosophes wrote:
       | Nice, this taught me about `git range-diff` which wasn't on my
       | radar before.
       | 
       | Is the conclusion likely to be that the author thinks Gerrit is
       | good, or is there some nuance I didn't pick up? I've used Gerrit
       | before and in hindsight I much prefer it to other ways of doing
       | code review.
        
         | aseipp wrote:
         | Yes, Gerrit is fucking great. If you actually want to do _code
         | review_ and not just rubber stamp shit on GitHub because your
         | eyes are going to bleed after reading the same thing for the
         | 15th time, just use Gerrit.
         | 
         | The thing is, I just never got around to finishing this article
         | because what's there right now is "good enough" to get the
         | ideas across.
        
       | ajkjk wrote:
       | 100% agree that this is ideal, the way Github does it is
       | completely godawful and it's a tragedy that so many people have
       | it normalized for them.
       | 
       | We did this with Phabricator, although it was a somewhat-manual
       | process, helped along by having some command line macros for
       | updating all the reviews at once. But better still would be an
       | explicit UI for it.
        
         | aseipp wrote:
         | I am the author and used the phrase "Code review is a pretty
         | good idea, in general" in the opening very specifically,
         | because it used be one of the selling points listed on the
         | Phabricator homepage. :) I miss it.
        
           | taspeotis wrote:
           | > Grab ahold of tasks, literally. Place them in confusing,
           | new orders. Make a column just for interns! Ignore the
           | backlog forever.
           | 
           | I swear the Herald selling points also had keeping tabs on
           | the pesky interns, or something to that effect.
        
             | aseipp wrote:
             | One of my favorites was something like:
             | 
             | > Phabricator is well received, and has raging reviews from
             | users inside Facebook, such as "Mandatory" and "OK"
             | 
             | Also, Evan Priestley still uses the
             | https://secure.phabricator.com instance to occasionally
             | post life updates, so I check in on it.
             | 
             | Last I checked he was learning about PCB design and
             | printing. I think he wanted to learn how to actually
             | physically manufacture his own boards (e.g. etching your
             | own conductive copper layers) and then he posted some
             | update like "I'm not sure I'll go down this route, because
             | it turns out you can send your design to PCBWay, and it
             | will come back within 5 days, because there is a magical
             | PCB Faucet somewhere in Shenzen apparently."
             | 
             | What a guy.
        
           | Groxx wrote:
           | Every day I have to use Jira and GitHub instead of Phab is a
           | rather painful day. Sadly those days are increasing.
           | 
           | Maybe one day I'll be able to use Gerrit (it sounds great),
           | and then I can be only annoyed at Jira.
        
       | fosterfriends wrote:
       | I agree with the argument laid out here. Series of small diffs
       | with versions is a fantastic clean model.
       | 
       | When creating Graphite on top of GitHub, we chose to only support
       | rebase model (despite the chaos that creates in GitHub timeline
       | events). We also added "versions" support, which wasn't too hard
       | because GitHub holds on to old commits even if you force push
       | over them.
       | 
       | A lot of what we try to build is the exact ideas this author is
       | championing, in a way that's compatible on top of GitHub. My
       | dream is us and others help usher Eng back towards the patterns
       | Phabricator and Gerrit helped start :)
        
         | aseipp wrote:
         | We strongly considered Graphite as an alternative to Gerrit at
         | my job that I mentioned at the start of this post (which I am
         | no longer at, actually) because it does look like an absolutely
         | excellent product, I will admit. You should all be proud of a
         | smart design and smart set of tools.
         | 
         | But there's a really really really really really really big
         | problem. Me and the other main engineer on our team used a
         | custom frontend to Git called Jujutsu[1] for all development.
         | Jujutsu is about 1000x better than Git. So that's nice. (I'm
         | also one of the developers, so I'm not going to abandon it
         | anytime soon.)
         | 
         | But gt, the graphite client, is not open source. I have no idea
         | how to make them work together. I have no idea how to extend
         | Jujutsu to handle Graphite stacks, because I don't even think
         | there's an API to handle any of this.
         | 
         | I even wrote a Gerrit integration for Jujutsu because JJ works
         | so well at stacking, and Gerrit + Jujutsu is absolutely a force
         | to be reckoned with IMO, even if the UX isn't as nice as
         | Graphite's. (I'm happy to show the people at Graphite why that
         | is, too, if anyone has time or is interested, but I suspect you
         | could all grasp Jujutsu quite easily :)
         | 
         | Please! Make gt open source and make it possible for third
         | parties to make and update stacks. This isn't just useful for
         | jj but all kinds of automation that wants to contribute patches
         | -- imagine tools like Google's internal "Code Review ML models"
         | for Critique that might recommend you rename a variable based
         | on context. They will suggest the fix for you or even apply it!
         | You can get around _some_ of those workflows with  "Incorporate
         | suggested edits" which is great on Gerrit (and Graphite?), but
         | not all of them.
         | 
         | [1] https://github.com/martinvonz/jj
        
           | jacobegold wrote:
           | We (Graphite) love Jujutsu - comes up in conversation all the
           | time here.
           | 
           | A prior version of the CLI is open source, the core data
           | model (using git refs to store some extra data about what a
           | branch's parent is) is still the same.
           | https://github.com/withgraphite/graphite-cli
           | 
           | We've talked about supporting other clients, but don't
           | currently have the bandwidth to build something like that -
           | definitely something I am personally passionate about making
           | sure happens at some point.
        
             | tombl wrote:
             | I think the cli repo went private a while ago.
        
           | ankitdce wrote:
           | If you are looking for an open source stacked PRs CLI, you
           | can look into av CLI (https://github.com/aviator-co/av).
           | Unfortunately this also only works with GitHub, but it should
           | be possible to add support for any Git-based platform.
           | 
           | Disclaimer: I'm the founder of Aviator who supports the av
           | CLI. It's a free tool to manage stacked PRs.
        
       | aseipp wrote:
       | I am the person who wrote this. AMA
       | 
       | EDIT: Also, I'm not sure if this is against the rules, but I also
       | need a new job as of recently. I like working on dev tools and
       | other hard problems. If you liked reading this, want me to make
       | your dev team more productive, or just want to experience and
       | enjoy my excellent (and occasionally _eclectic_ ) taste, the
       | email is in my profile.
        
         | rosmax_1337 wrote:
         | What is the meaning of life?
        
           | aseipp wrote:
           | I know this is in jest, but I'll just take the opportunity to
           | respond by posting my favorite poem. The relationship between
           | it and your question -- well, that's for you to decide.
           | The birds have vanished down the sky.         Now the last
           | cloud drains away.         We sit together, the mountain and
           | me,         until only the mountain remains.
           | 
           | -- Zazen on Ching-t'ing Mountain
        
             | xelxebar wrote:
             | Beautiful. I wasn't aware of Li Bai (Li Bai ). Took a bit
             | of searching, but the original looks to be called Du Zuo
             | Jing Ting Shan :                   Du Zuo Jing Ting Shan
             | Zhong Niao Gao Fei Jin          Gu Yun Du Qu Xian
             | Xiang Kan Liang Bu Yan          Zhi You Jing Ting Shan
             | 
             | This is gorgeous. Thank you for sharing!
        
             | rosmax_1337 wrote:
             | The question was sincere yet playful. I appreciate your
             | answer. Here's a poem I like.                 O'er all the
             | hilltops       Is quiet now,       In all the treetops
             | Hearest thou       Hardly a breath;       The birds are
             | asleep in the trees:       Wait, soon like these       Thou
             | too shalt rest.
             | 
             | -- Wanderer's Nightsong II, Goethe
        
         | zeotroph wrote:
         | From my interaction with the free part of GitHub, "diff soup"
         | describes it very well. Does the paid version do anything
         | better? What about GitLab, can this get near Gerrit? And then
         | there are the external services which try to make GitHub less
         | painful (and quite pricey, especially compared to a selfhosted
         | Gerrit), by providing stacked diff support, did you look at
         | these?
        
           | aseipp wrote:
           | No, paying for GH doesn't make the code review experience any
           | better. It's identical across public/cloud/enterprise GH.
           | 
           | I do not know if GitLab does anything different; I've never
           | used it in anger. I'd bet $10 the answer is "no, it's
           | basically just the same as GitHub", though.
           | 
           | If you want a service that adds stacking on top of GitHub, my
           | conclusion after some research is that https://graphite.dev/
           | is the best option. We did consider it, but it couldn't work
           | in my last job for "reasons" (see my other replies in this
           | thread) but if you've got a shop of Git users and just want
           | to throw money at the problem, I think it's the best choice.
           | 
           | GerritHub is also a possibility, they employ many of the
           | Gerrit devs and know what they're doing, but holy shit the
           | corporate options are expensive out of the gate. It's like
           | $20k/yr minimum regardless of size or number of users.
           | 
           | Honestly, Graphite is cheap as hell considering how much more
           | productive your engineers can be with a good review tool.
           | Gerrit was basically night and day for us. It's not "oh, it
           | pays for itself really quickly in a few days!" You'd probably
           | pay off the monthly cost in less than an _hour_ of actual
           | code review. And you don 't even have to opt most of your
           | engineers in; you can trial it where 90% of them use GH and
           | only a subset use graphite and pay.
        
             | zeotroph wrote:
             | Right, for Graphite $20/dev/month is nothing (I wonder if
             | Enterprise is less or more more than that...), considering
             | an ounce of review (prevention) is worth a pound of
             | bugfixes (cure).
             | 
             | And when you can not get corporate to switch away from GH,
             | then that is it. In hindsight an obvious way to (almost)
             | print money, congratulations, but also a sad state of
             | affairs.
             | 
             | But I imagine the $20k/yr is something you can easily spend
             | on a 1/5 of a dev doing Gerrit maintenance.
        
             | fHr wrote:
             | Never going to understand those finance teams who think
             | like 20k/yr for any enterprise deal is a good deal to
             | onboard more customers and increase reach.
        
               | sureglymop wrote:
               | Because they don't need _that much reach_ if even a
               | single customer nets them 20k /yr.
        
             | amstan wrote:
             | Does Graphite have a gerrit instance or something? I'm
             | prepared to say "shut up and take my money" compared to the
             | other 20k/year offer.
        
             | kspacewalk2 wrote:
             | >I do not know if GitLab does anything different; I've
             | never used it in anger. I'd bet $10 the answer is "no, it's
             | basically just the same as GitHub", though.
             | 
             | You would bet incorrectly then. GitLab does essentially
             | what you're describing, the only difference being that it
             | compares different iterations of the force-push "naively",
             | so if your force-push includes for example a rebase onto
             | master because another MR has been merged ahead of yours,
             | the diff will include the changes that have been rebased
             | onto.
             | 
             | If you decide to register an account on GitLab, simulate
             | the MR and prove to yourself that ~90% of your interdiff
             | post has been implemented by GitLab for about a decade,
             | kindly donate the $10 to your nearest homeless shelter.
        
               | aseipp wrote:
               | > the only difference being that it compares different
               | iterations of the force-push "naively", so if your force-
               | push includes for example a rebase onto master because
               | another MR has been merged ahead of yours, the diff will
               | include the changes that have been rebased onto.
               | 
               | > ~90% of your post
               | 
               | This is literally what GitHub does, down to the very
               | word, and it it is inferior to Gerrit, and it is not
               | sufficient to get 90% of the way, the last 10% matters.
               | As I have explained a dozen times in this thread. Lol.
        
               | another-acct wrote:
               | > GitLab [...] compares different iterations of the
               | force-push "naively", so if your force-push includes for
               | example a rebase onto master because another MR has been
               | merged ahead of yours, the diff will include the changes
               | that have been rebased onto
               | 
               | That's quite the deal breaker IMO; for example it
               | couldn't be used to compare a backport series (targeting
               | an older stable branch, for example) against the original
               | commit range on the master branch.
        
             | atombender wrote:
             | Graphite is OK. It's not great.
             | 
             | It's very buggy. My stacks frequently just hang and become
             | unmergeable.
             | 
             | It frequently gets confused about its state and "gt sync"
             | thinks that the upstream has changed even when it hasn't.
             | 
             | Because of the three states involved (local, GitHub, and
             | Graphite), there is a lot of "syncing". "gt submit", "gt
             | track", "gt untrack", "gt sync" are all needed constantly,
             | which adds mental overhead over the usual git pull/push.
             | 
             | You end up with a _lot_ of force-push churn in the GitHub
             | PR activity when re-stacking. In fact, I dread syncing
             | anything because in 6-PR stack it could cause an avalanche
             | of force-pushes. This is of course a side effect of the
             | GitHub PRs model and their rebase approach.
             | 
             | The web UI is pretty terrible. It's full of AI crap, and
             | there's a "meme library". It kind of seems like they want
             | people to live in their app, taking over from GitHub. But
             | they don't really offer a reason whatsoever to go to
             | graphite.dev instead of GitHub. Graphite doesn't really
             | help the review process.
             | 
             | They _really_ want you to sell you their merge queue
             | product.
             | 
             | In short, when it works it's fine. It's nice to do a series
             | of PRs and then watch as it merges them. But it's not a
             | tool I enjoy using, for the above reasons.
             | 
             | I'd prefer a lightweight tool that just managed PRs and
             | then did the merge magic. Graphite just wants to be too
             | much.
        
           | WorldMaker wrote:
           | GitHub does have some stacked diff/"merge train" tools on the
           | _deeply_ paid side (the  "Call Us" sorts of Pricing tiers)
           | that I've only seen screenshots and demos of.
           | 
           | On the other side: If you get into the habit of "Reviews" on
           | GitHub, which are in the free part, too, GitHub gives you a
           | quick button for "Review commits since your last Review"
           | under the Commits dropdown in the Files view. That mostly
           | only works if you add commits rather than rebase, hence the
           | complaints about contributing to "diff soup", but it's a
           | reasonably useful workflow and there are workarounds on the
           | "other side" to help deal with "diff soup".
           | 
           | This is why some encourage Squash Merging as the GitHub
           | preferred merge button. Review as a bunch of small commits
           | over time, merge an entire PR to a single final commit.
           | 
           | That said, as an alternative to squash merging, git itself
           | provides some useful tools for dealing with "diff soup" style
           | repositories using real merge commits: `--first-parent`. `git
           | log --first-parent`/`git blame --first-parent`/`git bisect
           | --first-parent` and more give you a "PR top-level view" in
           | your integration branches (such as main branch) without you
           | needing to rebase/squash.
           | 
           | I wish more UIs took a `--first-parent` by default approach,
           | including/especially GitHub's weak commit views (though it is
           | understandable why GitHub pushes you to wanting to use its
           | PRs list instead of commit views by keeping them weak).
        
         | torarnv wrote:
         | Amazing summary of the mental model of the interdiff review
         | style, and the problems with GitHub's approach to code review.
         | Thank you!
         | 
         | One thing you don't touch upon (yet) is that the diff soup may
         | lead people to prefer the squash merge strategy, to get rid of
         | the "noise" of the fixup commits, which throws away the "good"
         | initial 3 atomic commits as well.
         | 
         | With interdiff review style you're left with the initial 3
         | commits, and the choice of whether to land them individually or
         | squash them is based entirely on the commits themselves and how
         | atomic they really are.
        
         | dijksterhuis wrote:
         | What do you think of GitLab set up with Merge Request
         | Dependencies + Squash+Merge merge strategy?
         | 
         | I remember it being pretty easy to have that series of MRs
         | pattern set up. On merging 3x MRs it would be 3x merge commits
         | w/ a single squash commit for each. Regardless of the MR's
         | commit history.
         | 
         | Tradeoffs are having to merge earlier series branches into
         | later series branches after changes are made during review.
         | 
         | But people can do what they want to the commit history during
         | review. Don't matter as it just gets squashed.
         | 
         | Been a while since I've done this mind, been slumming it with
         | GitHub so I might be looking at it with rose tinted sunglasses.
        
       | chipx86 wrote:
       | We first built interdiffs in Review Board
       | [https://www.reviewboard.org] way back in 2006 (in fact I think I
       | may have coined the term, or arrived at it independently). It's
       | still my favorite part of the product and my process when doing
       | code reviews. And it's one of the things we hear the most
       | nostalgia for when people move to something like GitHub.
       | 
       | I've never felt that fix-it commits are really a proper
       | alternative, since:
       | 
       | 1) They don't tell you what upstream changes have been
       | incorporated into a series of commits.
       | 
       | 2) They tend to mess up the commit graph, even if temporarily,
       | and make it more difficult to review. If you've been following
       | along with a review, you may have already read the code being
       | fixed in fix-it commits, but if you're coming in fresh, you may
       | start off with a bad sense of what that code's trying to do or
       | how it's structured.
       | 
       | 3) Not everyone uses Git or other multi-commit-capable SCMs.
       | Plenty of people are on Perforce (such as in gamedev) or on
       | specialized SCMs like Keysight SOS (such as chip manufacturers).
       | So fix-it commits aren't even an option there.
       | 
       | A proper interdiff-capable code review systems means one reviewer
       | can start off from the first published review request and follow
       | along with every update, seeing only what's changed, while
       | another can jump in to the latest full change and not have to
       | worry about the series of fix-its that led up to it. And it can
       | do this regardless of the SCM.
       | 
       | If done right, it can also exist alongside multi-commit changes.
       | 
       | Let's say I have a small project I've broken up into multiple
       | commits to help with the review process (say, an API handler,
       | front-end UI, and documentation), and have decided this is
       | suitable for posting as one review request (since the commits are
       | largely interrelated and having these as one change helps lend
       | context to the reviewers -- if they aren't, multiple review
       | requests in a dependency chain are probably ideal).
       | 
       | Based on review feedback, I may end up making a series of changes
       | to one or all of those commits. When people go to review my
       | updates, it's nice to be able to see how each piece evolved,
       | without trying to do the mental arithmetic of mapping fix-it
       | commits and their changes to their corresponding changes.
       | 
       | So yes, interdiffs are fantastic! More people should use them,
       | whether they're working with lots of small commits or large
       | commits, single-commit review requests or multi-commit.
        
         | aseipp wrote:
         | That's sick as hell, friend. Actually, I have a second part to
         | this article discussing some of the history and politics of
         | what brought me to these tools.
         | 
         | In about 2013, I migrated the Glasgow Haskell Compiler from
         | "read .patch files on bug reporter" that I joked about, to
         | using Phabricator. For a couple reasons, but at the time one of
         | them was _not_ stacked diffs. It was because GitHub was so bad
         | for review it didn 't even have _side-by-side diffs_! It was a
         | total non-starter for me for those reasons among others.
         | 
         | But that actually wasn't the first time I migrated a team to a
         | code review tool. My first job in 2009 was a very small tight
         | knit team of engineers in a single room in Houston, and I
         | remember thinking it would be really good to get reviews of my
         | code from other people, and to read the things they wrote so I
         | could better understand the codebase. So, the first thing I did
         | in the first few months was pester my manager to set up...
         | ReviewBoard! And we all really liked it.
         | 
         | So I guess this is a way of saying thanks for RB! I still think
         | of it fondly from time to time. And because of it, Code Review
         | has always just been a huge part of my career, almost since day
         | one (and I could still do more of it.)
        
           | chipx86 wrote:
           | I love that! Thanks for sharing that with me :) It made my
           | day. Looking forward to the second part! If you're on really
           | any of the current social platforms, I'd love to connect. I'm
           | chipx86 everywhere.
           | 
           | I still develop Review Board full-time with a small team :)
           | The development world has changed a lot, and much of the
           | world has converged on GitHub (it's a hard market to be in
           | right now), so we still look for opportunities to build
           | review capabilities that target problems people have that
           | aren't being addressed elsewhere.
           | 
           | Just as an example, we launched PDF Review and diffing a
           | while back, which we see companies use for things like
           | industrial designs and schematics. It's neat, we actually
           | diff two rendered PDFs without just converting them into text
           | files first, like you usually see. Following that up with
           | full Office Document Review soon.
           | 
           | Dark mode _finally_ shipped earlier this year (I should write
           | about that endeavor sometime). And there 's a couple of
           | super-neat code review capabilities we've come up with that
           | nobody's doing, which I'm keeping under wraps for next year.
           | I think they're going to be pretty awesome.
        
       | sesuximo wrote:
       | If someone brought back phabricator i think they'd make easy $
        
         | roryokane wrote:
         | Phorge (https://phorge.it/) already exists. It's a still-
         | maintained fork of Phabricator.
        
       | deathanatos wrote:
       | Yes! This is what I imagine in my head as a real code review
       | style, not the stuff Github does. Glad to have a name for it.
       | 
       | I'd add I'd also like my review system to be able to kick patches
       | "out" of the review once they're ready. E.g., the small bugfixes
       | that you make while working on that bigger feature should
       | hopefully be small, isolated patches, ones that are going to find
       | consensus with a reviewer quite quickly. Once that happens ... I
       | want to just eject that patch from the entire series & cherry-
       | pick it to main, and rebase the review on the new HEAD. (Or,
       | differently, rebase the latest version of the patch series on
       | main, reordered such that the agreed patch is fist, and then FF
       | main onto that agreed upon patch.)
       | 
       | Essentially, narrow the scope of the review to "the part we're
       | still talking about", but let bug fixes see merging as soon as
       | they're ready.
       | 
       | The argument I'd have against this is "just make that a separate
       | review/PR". But then you get into the hairiness of patchset A
       | depends on patchset B, until B is merged, then it just depends on
       | main.
        
         | zeotroph wrote:
         | That's exactly what Gerrit can do. When you push an x-b-c-d-e
         | chain, these show up stacked in the UI, but you can easily
         | cherry-pick b onto main (see that the CI passes, and the usual
         | review), and rebase everything on top of that. If it is x, the
         | bottom one, you can directly submit it and continue with the
         | others.
        
           | lolc wrote:
           | All this rebasing sounds like constant pain to pull from
           | Gerrit. Does it actually create new branches v2 v3 after a
           | rebase? Or how do I switch my local checkout to the rebased
           | branch from remote?
        
         | aseipp wrote:
         | I keep saying this over and over but, Gerrit basically does
         | that. :) You can see the relationships between any two patches
         | on Gerrit, and more importantly, Gerrit shows you _each patch
         | individually_. So you can see in a series A - > B -> C that
         | yeah, B is small, let's go ahead and get that in.
         | 
         | Part of this is that UX has some really smart ideas like the
         | "Attention Set". The attention set is basically "Which people
         | need to take the next step?" Like a turn-based game. So, if you
         | just did a review, you're not in the attention set for that
         | patch anymore -- the _author_ is.
         | 
         | That means Gerrit puts it down at the bottom of your queue in
         | the UX. And what's at the top of the queue? Things where _you_
         | are in the attention set! So it naturally groups things this
         | way.
         | 
         | I didn't get into all the other really annoying papercuts with
         | GitHub's UX, but even the pull request listing is worse than
         | the alternatives. How do you know what state _anything_ is in?
         | You don 't, you have to go read the whole thing.
        
           | zeotroph wrote:
           | Attention Set <3 - that alone is worth another post. Gerrit
           | really is one of the best kept dev secrets, and if you never
           | had the luck of seeing it in person at a company where you
           | worked at, well...
           | 
           | Makes me wonder what other git or dev-in-general blindspots I
           | have.
        
             | ankitdce wrote:
             | Yeh, Attention Set is a game changer. We (Aviator) also
             | took inspiration (ahem.. copied) attention set from Gerrit:
             | https://docs.aviator.co/attentionset
        
           | deathanatos wrote:
           | I guess I missed it in your article, and I've never had the
           | opportunity otherwise to use Gerrit. (Since Github is
           | essentially so pervasive. I've only used that, Gitlab, and an
           | internal review system that didn't do interdiff.)
        
           | SlySherZ wrote:
           | Do you know how to create a local branch that tracks a gerrit
           | commit? Usually I just do git commit --amend to update
           | gerrit, but then I lose access to my patch's history (it's
           | still on gerrit, but I want it locally).
        
       | pronoiac wrote:
       | I'm midway through, but a nitpick:
       | 
       | > You're on your own figuring out the Commit IDs and punching
       | them into the URL bar if you want something more granular.
       | 
       | There's a "commits" tab at the top, like: Conversation |
       | [Commits] | Checks | Files changed
       | 
       | Example: https://github.com/raspberrypi/linux/pull/6330/commits
        
         | aseipp wrote:
         | I more meant that it's hard to diff between _arbitrary_ commits
         | without using the URL bar. The results are also... Weird. For
         | example, let 's say you have the base B, and commit X:
         | B  ---> X
         | 
         | Now someone pushed to main, so you rebase on B'
         | B' ---> X
         | 
         | Now, you modify X to address something (maybe just a spelling
         | error)                   B' ---> X'
         | 
         | Now you push the new rebased branch. Question: how do you view
         | the difference between X' and X?
         | 
         | Well, you have to use that little "Compare" button, but there's
         | a really big problem with it: it shows you the diff from X to
         | X' _and_ the diff from B to B ' at the EXACT same time. Which
         | is really bad! Imagine if the difference between B and B' is
         | 500 lines; it will completely dwarf the 1 line typofix from X
         | to X', making it impossible to read.
         | 
         | Now, this is kind of a problem in Gerrit too. But they use a UX
         | technique to make it manageable, which is very smart: they
         | color-code the lines of the diff, depending on if the diff
         | comes from B..B' or from X..X' -- so you can see _at a glance_
         | if the hunk is relevant.
         | 
         | More broadly, interdiffing between commit X and commit Y can be
         | tricky, because what you _really_ want to do is something like
         | "Rebase Y onto the parent of X, then diff X and Y", because
         | otherwise you get the included differences between their
         | baselines. (We do "Rebase Y on X's parent" for Jujutsu's
         | "interdiff" command IIRC?) But sometimes you DO want to include
         | the base diff, because the changes from B..B' can be VERY
         | relevant to your patch.
         | 
         | So, you need all these options, really. But once you go off
         | this beaten path where you want to compare X to Y... yeah, you
         | have to start typing into the URL bar, I think.
         | 
         | But I will say, the commits tab and the little n/p keyboard
         | shortcuts to "flip through the commits" like book pages is at
         | least a HUGE improvement over the basic UX though. I use that
         | all the time on GH projects these days, even if I have tons of
         | other problems.
        
           | skydhash wrote:
           | At my last job, the workflow was mostly merge based instead
           | of rebasing. After you create a branch, you merge from
           | develop of there's anything new you want or to resolve
           | conflict. And the whole PR will be squashed and merged
           | (there's an ID referring to the ticket). Force pushing to an
           | open PR was frowned upon.
           | 
           | I think it reflected the email based approach a bit better as
           | you can't alter the first patch you sent, only send new ones.
           | So even if the history of a PR is messy, we preserve its
           | chronological aspect alongside the discussion.
        
       | strken wrote:
       | This is interesting. At work we use PRs like the author uses
       | commits, and in fact we squash-and-merge them at the end, but our
       | approach requires rebasing the later PRs whenever we make a
       | change to the earlier PRs. This can be quite laborious, falls
       | afoul of the "don't force-push" rule, takes a long time for
       | engineers to learn, and tends to break existing code review
       | comments in the GitHub interface, but works out okay for two or
       | three PRs. In our workflow a commit is less a unit of work and
       | more a savepoint.
       | 
       | I also use heavily use stashes, as well as undo-tree-mode in
       | Emacs. This means we have four different ways of tracking the
       | history of source code, which sounds redundant but works out okay
       | in practice.
       | 
       | The ergonomics of doing this in Git are pretty bad. I found
       | Phabicator better but still unnecessarily difficult. Perhaps a
       | new source control management tool could have first class support
       | for higher level concepts than just commits and branches, or
       | perhaps that would be even worse to use.
        
         | steveklabnik wrote:
         | > Perhaps a new source control management tool could have first
         | class support for higher level concepts than just commits and
         | branches, or perhaps that would be even worse to use.
         | 
         | You should check out jj, sapling, or mercurial.
        
           | eddd-ddde wrote:
           | +1 for jj
           | 
           | It simplifies my workflow a lot when compared to plain git. I
           | love not having to track and stage manually, plus undo is
           | godsend.
        
           | rfoo wrote:
           | If you are stuck with git, try git-branchless. At least it
           | makes rebasing your patch stack easier (git sync && git
           | submit).
        
             | keybored wrote:
             | Git-branchless gave me thousands of refs that a simple
             | deinit wouldn't delete. It's so heavyweight for just
             | experimenting with a few new workflows here and there.
             | 
             | I can't recommend Git-branchless until it either tells me
             | what it is about overall or figures it out for itself.
        
             | steveklabnik wrote:
             | jj (and sapling, in my understanding) uses git as a
             | backend, so you can use it even if you're stuck with git.
             | Not that I think git-branchless is bad, mind you, but just
             | to be clear about it.
        
         | keybored wrote:
         | You use commits as PRs. This complicates things because of PR
         | dependencies.
         | 
         | Why not a better tool than Git? Sure, I would like a more user-
         | friendly tool as well. Less warts. But what about the stopgap
         | solution of simply treating commits like commits instead of
         | making things harder for yourself?
         | 
         | (I wonder how much GitHub is to blame for giving people these
         | PR-colored glasses)
        
           | steveklabnik wrote:
           | Given that you can use git with Gerrit, I am confident in
           | placing the blame on GitHub here.
        
       | codeapprove wrote:
       | Love the blog post, it's great to see people actually thinking
       | about how code review should work!
       | 
       | I've used four different code review systems extensively, all
       | with different strengths and weaknesses: Critique (Google
       | internal), Gerrit (at Google, but same as external), GitHub
       | (duh), and CodeApprove (the one I built).
       | 
       | Critique was far and away the best, but it only works because
       | it's perfectly fit to Google's monorepo and the custom VCS
       | they've built as well as all of their custom lint/test tooling. I
       | designed CodeApprove to bring as much of that as I could to
       | GitHub, but it will never really be close.
       | 
       | Gerrit was the second best in terms of the reviewer experience
       | ... but as an author I always hated it. It just seemed to be so
       | author-hostile. There were more wrong ways to do something than
       | right ways. And the UI is not exactly beautiful.
       | 
       | GitHub is extremely author friendly, it works how we think. You
       | write code, you get feedback, you write more code, etc. If you
       | squash and merge at the end of a PR you don't have the history
       | problems the author mentioned. It's not very reviewer or team
       | friendly though. Incremental diffs are not highlighted. Diffs and
       | conversation are in different tabs. Force pushes and rebases
       | destroy history. Comments are lost as "outdated". You can't
       | comment on files outside the diff window. Large files are hidden
       | by default, etc etc. They clearly don't care about this too much
       | and maybe they know something I don't.
       | 
       | In the end, the thing I find most frustrating is how many teams
       | just accept whatever code review tool is built in to their VCS
       | platform. That would be like using whatever IDE shipped with your
       | laptop! There are so many better options out there today. My
       | favorites (besides CodeApprove) are GitContext, Reviewable, and
       | Graphite but I can name half a dozen other excellent choices.
       | Don't accept the defaults!
        
         | another-acct wrote:
         | > GitHub is [...] not very reviewer or team friendly though
         | 
         | The problem is that reviewers / maintainers are much scarcer
         | than contributors. Workflows and UIs should optimize for
         | reviewer throughput, IMO.
         | 
         | (I've never used the three other tools you mention, so my
         | argument is general.)
        
       | mstachowiak wrote:
       | It's always exciting to see new approaches to code reviews -
       | GitHub has its strengths, but it's far from perfect.
       | 
       | For the scenario you've outlined, have you thought about
       | splitting the 3 patches into separate, dependent pull requests?
       | While GitHub doesn't natively support this, the right code review
       | tool (shameless plug - I'm part of a team building one called
       | GitContext) should allow you to keep pull requests small while
       | maintaining dependencies between them. For example, patch 3 can
       | depend on patch 2, which in turn depends on patch 1. The
       | dependency tracking between them - provided by the code review
       | tool - can ensure everything is released in unison if that's
       | required.
       | 
       | Each patch can then be reviewed on its own, making feedback more
       | targeted and easier to respond to. You can even squash commits
       | within a pull request, ensuring a clean commit history with
       | messages that accurately reflect the individual changes. Better
       | still, with the right tool, you can use AI to summarize your pull
       | request and review, streamlining the creation of accurate commit
       | messages without all the manual effort.
       | 
       | A good code review tool also won't get bogged down by git
       | operations like rebases, merges, or force pushes. Reviewers
       | should always see only the changes since their last review, no
       | matter how many crazy git operations happen behind the scenes.
       | That way, you avoid having to re-review large diffs and can focus
       | on what's new. The review history stays clean, separate from the
       | commit history.
       | 
       | I'd be curious if this approach to splitting up pull requests and
       | tracking their inter-dependencies would address your needs?
        
         | fHr wrote:
         | Why not just do good old mergetrains with pullrequest A points
         | to branch B amd then B points to master, merge B into master
         | and thereafter point A back to master or am I missing the
         | point?
        
           | steveklabnik wrote:
           | This is called "stacked diffs" and it's a good workflow; the
           | issue is that it's annoying to use on GitHub without tooling.
           | The "point A back to master" bit isn't easy/obvious with pull
           | requests.
        
             | keybored wrote:
             | From the peanut gallery of HN I've never understood Stacked
             | Diffs. It looks like they reinvented commits as dependent
             | PRs. Which are stacked on top of each other like commits
             | are.
        
               | steveklabnik wrote:
               | Fun fact: part of the reason that this article is on HN,
               | I believe, is because I linked it to someone on another
               | site as a means of explaining stacked diffs.
               | 
               | > It looks like they reinvented commits as dependent PRs.
               | 
               | Sort of kind of. It really depends on what you mean by
               | PR: if we're talking about "review this branch please,"
               | which is what I would argue most mean by PRs, then yes,
               | in the context of "stacked PRs for GitHub" it's largely
               | about tooling that makes dependent PRs easier.
               | 
               | But there are other, non-GitHub tools. With those tools,
               | you don't say "here's a branch, review it please," you
               | say "here is a stack of commits, review them please."
               | There's no branch going on inside. It's just a sequence
               | of commits. This matters because it centers the commit as
               | the unit of code review, not a branch. This also means
               | that you can merge parts of the stack in at different
               | times: to use the example from the article, once "small
               | refactor" is good to go, it can be landed while "new API"
               | is awaiting review. etc.
               | 
               | I think it takes actually using some of these tools to
               | really "get it." I never understood them either, until I
               | actually messed around with them. I am currently on my
               | project solo, so I don't really stack at the moment, I
               | think it really helps more the larger of a team you're
               | working with is.
        
               | keybored wrote:
               | Oh hey, thanks for the explanation! I've been wondering
               | about this for a while. The linked articles on HN tended
               | to be heavy on arguing how unlike the workflow is to what
               | "you are used to" that the description of what it was
               | about got obfuscated.
               | 
               | I've used Git with email a little bit which also lets you
               | review commits in isolation. It's too bad that so many
               | review tools bury the commits (ask me how many times
               | someone at work has asked "what this is about" on the PR
               | diff when the relevant commit message explains exactly
               | that).
               | 
               | But what I like about email is that the whole series/PR
               | also gets reviewed as a unit. Both worlds.
        
               | steveklabnik wrote:
               | You're welcome. Yeah, it as a flow is much closer to the
               | email workflow than the PR workflow, when you commit to
               | it.
        
         | codethief wrote:
         | As mentioned elsewhere in this thread, this is also the
         | approach that Sapling follows.
         | 
         | As for GitContext, how do you keep track of commits across
         | fixups, rebases, reordering, etc.?
        
           | crabbone wrote:
           | What's the point of keeping track of commits? I honestly
           | never understood people wanting that. Is this for some kind
           | of weird accounting / social-score system where the number of
           | commits decides your yearly bonus?
           | 
           | It's useful to see how the system evolved (because you might
           | want to go back a bit and redo the newer stuff), but it's
           | pointless to see the mistakes made along the way, for
           | example, unless you have some administrative use for that.
           | 
           | Similarly, if a sequence of commits doesn't make sense as
           | committed, but would make better sense if split into a
           | different sequence: then I see no problem doing that. What's
           | the point of keeping history in a bad shape? It's just harder
           | to work with, if it's in a bad shape, but gives no practical
           | advantages.
        
         | torarnv wrote:
         | As far as I know, splitting the series into individual PRs only
         | works if you have commit rights to the repository, so you can
         | base one PR on a different branch (in the main repository) than
         | main.
         | 
         | As an outside contributor, with a fork of the repository, your
         | three PRs will incrementally contain change A, A+B, and A+B+C,
         | making the review of the last two PRs harder, because you need
         | to review diffs for code you're already reviewed in another PR.
        
           | meling wrote:
           | Not sure about the fork workflow but otherwise it is possible
           | to change the base branch (manually on GH's web interface) so
           | that you don't have to see the original branches and review
           | the changes from A to B and from B to C. Maybe this is not
           | possible with fork workflows?
        
         | dogleash wrote:
         | > It's always exciting to see new approaches to code reviews -
         | GitHub has its strengths, but it's far from perfect.
         | 
         | This is nice sentiment, it's positive reception to an idea and
         | polite to the incumbent.
         | 
         | But it's so thoroughly not a new idea. It's literally the
         | workflow git was designed to support, and is core to many long-
         | standing criticisms about GitHub's approach for as long as
         | GitHub has had pull requests.
         | 
         | And I'm over here wondering why this idea took _*checks
         | calendar*_ over 15 years to graduate from the denigrated
         | mailing list degens and into hip trendy development circles.
         | 
         | I thought we were knowingly choosing shit workflows because we
         | had to support the long-standing refusal by so many software
         | devs to properly learn one of their most-used tools. That's why
         | I chose the tools I chose, and built the workflows I built,
         | when I migrated a company to git. Nobody gets fired for buying
         | IBM after all.
        
           | aseipp wrote:
           | I mean, the answer is simple. Even if email-based flows use
           | range-diff, which is the correct _conceptual_ model, all the
           | actual details of using email are, I would estimate, about
           | 1,000x shittier in 2024 than using GitHub in 2008 when I
           | signed up for the beta as user #3000-something.
           | 
           | Email flows fucking suck ass. Yes I have used them. No, I
           | won't budge on this, and no, I'm not going to go proselytize
           | on LKML or Sourcehut or whatever about it, in Rome I'll just
           | do as the Romans even if I think it sucks. But I've used
           | every strategy you can think of to submit patches, and I
           | can't really blame anyone for not wading through 500 gallons
           | of horrendous bullshit known as the mailing list experience
           | in order to glean the important things from it (like range-
           | diff), even if I'm willing to do it because I have high pain
           | tolerance or am a hired gun for someone's project.
           | 
           | Also, to be fair, Gerrit was released in 2009, and as the
           | creator of ReviewBoard (in this thread!) also noted it
           | supports interdiffs, and supported them for multiple version
           | control backends, was released in 2006! This was not a
           | totally foreign concept, it's just that for reasons, GitHub
           | won, and the defaults chosen by the most popular software
           | forge in history tend to have downstream cultural
           | consequences, both good and bad, as you note.
        
             | dogleash wrote:
             | Dude, I'm not making a defense of mailing list workflows
             | here. I'm just pondering the nature of the world where
             | despite all the yapping about git I've seen floating around
             | on the internet for as long as I've been lurking social
             | media, the yappers are just recently keying in on
             | something.
        
               | aseipp wrote:
               | If you're asking "Why did this take 15 years for people
               | to understand" and my reply is "Because it was under 1000
               | layers of other bullshit", then that's the answer to your
               | pontification. It has nothing to do with whether you
               | think email is good or not. You pondered, I answered.
               | That simple.
        
               | another-acct wrote:
               | > Because it was under 1000 layers of other bullshit
               | 
               | Not only because of that.
               | 
               | git-range-diff, while absolutely a killer feature, is a
               | relatively new feature of git as well (a bit similarly to
               | "git rebase --update-refs" -- which I've just learned of
               | from you <https://news.ycombinator.com/item?id=41511241>,
               | so thanks for that :)).
               | 
               | Namely, git-range-diff existed out-of-tree as "git
               | tbdiff" <https://github.com/trast/tbdiff> originally. It
               | was ported to git proper in August 2018
               | <https://github.com/git/git/commit/d9c66f0b5bfd>; so it's
               | not a feature people could have used "15 years ago".
               | 
               | (FWIW, before git-range-diff was a thing, and also before
               | I had learned about git-tbdiff, I had developed a silly
               | little script for myself, for doing nearly the same.
               | Several other people did the same for themselves, too.
               | Incremental review was vital for most serious
               | maintainers, so it was a no-brainer to run "git format-
               | patch" on two versions of a series, and colordiff those.
               | The same workflow is essential for comparing a backport
               | to the original (upstream) version of the series. Of
               | course my stupid little script couldn't recognize
               | reorderings of patches, or a subject line rewrite while
               | the patch body stayed mostly the same.)
        
             | another-acct wrote:
             | > all the actual details of using email are, I would
             | estimate, about 1,000x shittier in 2024 than using GitHub
             | in 2008
             | 
             | Disagree about "all". Tracking patches in need of review is
             | better done in a good MUA than on github. I can suspend a
             | review mid-series, and continue with the next patch two
             | days later. Writing comments as manually numbered,
             | plaintext paragraphs, inserted at the right locations of
             | the original patch is also lightyears better than the
             | clunky github interface. For one, github doesn't even let
             | you attach comments to commit message lines. For another,
             | github's data model ties comments to lines of the
             | _cumulative diff_ , not to lines of specific patches. This
             | is incredibly annoying, it can cause your comment for patch
             | X to show up under patch Y, just because patch Y includes
             | _context_ from patch X.
             | 
             | Edited to add: github also has no support for git-notes.
             | git-notes is essential for maintaining patch-level
             | changelogs between rebases. Those patch-level changelogs
             | are super helpful to reviewers. The command line git
             | utilities, such as git-format-patch, git-rebase, git-range-
             | diff, all support git-notes.
        
           | juped wrote:
           | Nope, none of it was knowingly done, and plenty of teams are
           | almost trivially convertible to the normal workflow, even
           | without inventing a buzzword like TFA did!
           | 
           | Though plenty aren't. I get it. (But one of the magic phrases
           | that really works well is "this is what git, itself, does,
           | and there's a man page installed on your system at this very
           | moment explaining it")
        
         | keybored wrote:
         | The author already has a branch and its commits. What then
         | would be the purpose of splitting the branch into _three pull
         | requests_ with one commit each?
         | 
         | - Each commit can be reviewed on its own
         | 
         | - Dependency tracking between commits? Yes...
         | 
         | - They don't have AI built-in (a good thing)
         | 
         | - You _can_ have an interdiff between commits unlike PRs
         | 
         | Commits are the bread and butter of Git. Just use commits.
        
       | EPWN3D wrote:
       | Is it just me or is the author just arguing for fixup commits and
       | squashing them, something GitHub and Stash/Bitbucket handle just
       | fine? I don't see how the idea of "publish a new version of the
       | three commits" is new with or exclusive to Gerritt.
        
       | wodenokoto wrote:
       | When doing code reviews, I think it is annoying that every time I
       | comment on a line, PR author gets a notification.
       | 
       | This is not a simultaneous, real-time thing.
       | 
       | I'm in the middle of doing my review, and my comments are not
       | ready to be read. Maybe I'll change my mind on my comment on line
       | 8 when I reach line 80.
        
         | globular-toast wrote:
         | GitLab lets you compose the entire review before submitting it
         | so the author won't see anything until you're done.
        
           | djur wrote:
           | Github does, too. You just click "Start a review" when
           | filling out your first comment.
        
           | senko wrote:
           | GitHub as well.
        
         | aseipp wrote:
         | GitHub lets you do this, but _only_ when writing a review for
         | someone else. Not when _addressing_ someone else 's review, in
         | which case your complaint 100% stands.
         | 
         | And yes, it absolutely drives me nuts, honestly. Why can I
         | batch review comments, but not resolutions! Another thing
         | Gerrit gets right! GitHub... Please...
        
           | another-acct wrote:
           | ... Another thing that mailing list-based development gets
           | right ;) Most MUAs should know about a concept called
           | "Drafts". I can have as many draft messages concurrently as I
           | want, I can work on them over several days, and I can send
           | them out (in practice) as a batch.
        
       | globular-toast wrote:
       | GitLab supports this. Every time someone pushes _or force pushes_
       | it tags that as a version which you can diff. If your developers
       | know how to generate new commits then you can do it right away
       | with GitLab.
       | 
       | The problem is generating the new commits. Developers just aren't
       | very good at doing this. They can modify a single commit just
       | fine, but modify a commit that isn't the latest commit involves a
       | rebase.
       | 
       | Magit has the "instant fixup" option which is basically like
       | amending an arbitrary commit instead of just the latest. What is
       | actually doing is doing a commit with `--fixup` then `rebase
       | --autosquash`. This technique can be used manually. Fixup/squash
       | commits should be part of all developers' toolkits.
        
         | zeotroph wrote:
         | > Developers just aren't very good at doing this.
         | 
         | GitHub provided a way to contribute, but also to avoid learning
         | to rebase, thus making it more welcoming to devs who only know
         | about commit and pull - that is what made it so popular. The
         | squash then rebase or merge step is done on server side. Plus
         | it has a very "harmless" UI, but that hides a lot of details
         | (patchsets) and the layout wastes so much space imo.
         | 
         | This also means devs could avoid learning more about git, and
         | this lowest common denominator git workflow makes it so
         | frustrating for those of us who learned git all the way. I
         | can't even mark a PR as "do not squash" to prevent it being
         | merged in the default way which throws out all history.
        
           | globular-toast wrote:
           | Yeah, this annoys me too. I actually find the "forced merge"
           | (`--no-ff`) style even worse because you can never tell if
           | you're looking at "real" commits or just crap because they
           | couldn't be bothered to rebase.
           | 
           | Must say, though, I think "history" is completely the wrong
           | way to think about version control. It's not about tracking
           | history, it's about tracking versions. History is the crap,
           | unrebased commits. Rebasing turns the history (throwaway,
           | works in progress) into versions.
        
         | kspacewalk2 wrote:
         | Rewriting history and breaking N sloppy commits into M well-
         | thought-out, logical commits is an essential git-based version
         | control skill for developers. Thus, interactive rebase should
         | be considered essential for anyone using git for anything non-
         | trivial. It's TUI-like interface is a bit quirky for some
         | people, but it is rock-solid once you figure it out and
         | therefore worth investing a bit of time into learning. (That
         | also describes git in its entirety quite well).
        
       | cryptonector wrote:
       | > Interlude: Can you please just tell me if git rebase is evil or
       | not so that we can derail the entire discussion over it?
       | 
       | Ha, that's funny. But yes, please we need interdiffs in GitHub
       | and GitLab. I want to have my PRs/MRs always rebased and I don't
       | want "fix code review comments" commits.
        
         | crabbone wrote:
         | Why would you want that in GitHub or Gitlab? :| The Web
         | interface is atrocious no matter what you are using it for.
         | Just the fact that you need to edit text in a dysfunctional Web
         | editor negates all the benefits of any workflow you can
         | imagine...
         | 
         | The most natural way to deal with PRs is to use your editor you
         | use to write your code. Here you have all the tools necessary
         | to navigate the code, to look up the history of the change, to
         | run tests and so on.
         | 
         | This is just not the part that needs to be done by a Git
         | hosting service. Having to do it through a Web interface just
         | feels like random punishment...
        
       | lynguist wrote:
       | I did code reviews on Azure devops and on Bitbucket. I noticed
       | that Azure devops shows exactly this diff soup, and Bitbucket
       | shows interdiffs, and comments do point to previous points in
       | time etc. It is much better from both points of view.
        
       | senko wrote:
       | I usually deal with "isolated patch series followed by a bunch of
       | fixups" by letting them pile on top, then rebasing just before
       | merging (the setting is usually feature branches all branched
       | from main so that's fine).
       | 
       | It is extra work and not everyone appreciates the benefits, so
       | it's hard to convince coworkers to do the same.
        
       | codethief wrote:
       | Nice article, it captures the issues I've had with code reviews
       | very well. I'm just not sure the "pairwise diff" would work well
       | in practice. Sometimes you do forget a change which should be
       | separate commit (in between the existing commits), etc.
       | 
       | I recently got introduced to Sapling, specifically to the
       | approach of never merging more than a single commit and also
       | doing code reviews commit by commit. I like that idea even
       | better!
       | 
       | Of course with existing tools like git & GitHub this seems rather
       | difficult to implement, but Sapling automatically keeps track of
       | how commits change across fixups & rebases, and it also handles
       | entire "stacks" of merge requests / commits.
        
       | snatchpiesinger wrote:
       | Another way to look at this that this is 3 linearly dependent PRs
       | masquerading as one. Make each a distinct PR and the problem goes
       | away, especially if you can mark the PR to depend on another one
       | (on Gitlab you can, not sure about Github). If you want to see
       | each change as a single logical unit, then they will each be a
       | distinct merge commit on your master branch, use `git log
       | --first-parent-only master` to only see these kind of changes.
        
         | zeotroph wrote:
         | > [GitLab] you can mark the PR to depend on another
         | 
         | How much user interaction does that require, and how is this
         | visualized in the review UI? Gerrit creates this dependency
         | with a single `git push`.
        
           | snatchpiesinger wrote:
           | It's a bit cumbersome, and I think only recently you can make
           | longer dependency chains. It's certainly not automated away
           | with just git commands, but maybe there there is a Gitlab API
           | way. The only way I know is to "edit" the PR (or MR in Gitlab
           | speak) and paste the URL into some "depends on" field, then
           | save.
           | 
           | There are certainly other problems as well, like you might
           | have an MR 1 from feature1 to master, and MR 2 from feature2
           | to master which in turn depends on MR 1. Most likely your
           | feature2 branch is off your feature1 branch, so it contains
           | feature1's changes when compared to master, and that's what
           | is shown in the Gitlab review UI. This makes reviewing MR 2's
           | changes in parallel to MR 1 frankly impossible.
           | 
           | Having said that, I still think that this would be the right
           | way to organize this kind of work, however Gitlab's execution
           | is not great, unfortunately. Any of this is probably
           | impossible in Github too. I wonder if Gerrit gets this right,
           | I have no experience with it.
           | 
           | edit:
           | 
           | One interesting point of MR dependencies in Gitlab is that I
           | think you can depend on MRs from other projects. This is
           | sometimes useful if you have dependent changes across
           | projects.
        
             | rfoo wrote:
             | I usually just point the target branch of MR 2 to MR 1.
             | After merging MR 1 GitLab automatically change it to the
             | default branch so it's more or less okay.
             | 
             | However this makes updating these MRs very rebase heavy and
             | as said in OP it is hostile to reviewers.
        
         | aseipp wrote:
         | > Another way to look at this that this is 3 linearly dependent
         | PRs masquerading as one.
         | 
         | The first commit in the example is not dependent on anything
         | else and may not exist at all. Rather its existence is illusory
         | to show that sometimes when you write commits, a few things
         | happen at once. You might just churn out a doc fix, a bug fix,
         | and a small other thing all at once. It's just the nature of
         | the work.
         | 
         | > (on Gitlab you can, not sure about Github)
         | 
         | You cannot do this on GitHub without write access to the
         | repository, so it's effectively a non-option for anyone who is
         | not a committer in an open-source context. Don't ask me why
         | this limitation exists.
         | 
         | If you do have write access, you can kind of do some similar
         | things like open N pull requests where each PR 1...N has
         | commits 1...N. Then you do something like "Read every PR, then
         | merge only the last one which contains all N commits, and close
         | all the others." Weird but OK, I guess.
         | 
         | Still, organizing this is a pain, and I don't think GitHub
         | really emphasizes it -- and also rebasing the dependent
         | branches really requires you to use something like --update-
         | refs to make it sane at all. So, you can also use tools like
         | <https://github.com/spacedentist/spr> in order to organize it
         | for you. Not the end of the world considering everyone uses 50
         | different Git wrappers, but not ideal.
        
       | Vinnl wrote:
       | I'm using mostly this workflow with GitHub, with the main
       | disadvantages being that it's more work on my side, and not
       | obvious to my collaborators. But it does carry the same
       | advantages of allowing reviewers to view diffs with just their
       | feedback incorporated, without breaking `git blame` and `git
       | bisect`.
       | 
       | When I incorporate a reviewer's feedback, I'll commit that with
       | `git commit --fixup <hash of commit that I want to update>`. I'll
       | then push that up and leave a comment reply to the review
       | feedback sharing the fixup commit hash.
       | 
       | Then when the PR is approved and I'm about to merge, I'll do
       | git rebase --interactive origin/main --autosquash
       | 
       | This will then combine the fixup commits with the correct
       | original commits. I then do a final `git push --force-with-lease`
       | and merge it. (Make sure to note force push before the review is
       | done, because then reviewers lose the ability to see what you
       | added since their last review.)
       | 
       | This relies heavily on autocomplete in my terminal, so that I
       | only have to type `git re<right arrow>` to get to that long
       | command above, for example. And it's a bit clunky, so using a
       | tool that supports and encourages this workflow would be nice.
       | 
       | But given that I'm stuck with GitHub, it's OK.
        
         | Piraty wrote:
         | https://news.ycombinator.com/item?id=37086022
        
           | tome wrote:
           | Wow, my jaw is on the floor. This is such a good idea! I
           | already had the idea of tracking issues in the same repo as
           | the code -- not that I actually use that idea -- but I didn't
           | have the idea of doing PRs in the repo itself. Love it!
        
         | mizzao wrote:
         | I was going to say... interactive rebase addresses a lot of the
         | "diff soup" comments that the writer complains about. It's
         | really only done by disciplined engineering teams though (who
         | bother to learn some more advanced features of git)
        
           | aseipp wrote:
           | Interactive rebase is fine, I used it probably 50x a day for
           | many years. The problem isn't really with Git here, it's more
           | a bunch of interrelated problems with GitHub's UX, and the
           | fact that many people culturally have never approached the
           | problem in any other way. A lot of places basically just
           | outright copy GitHub's UX, which I think is a good example of
           | this.
           | 
           | For example, using 'git absorb' or autosquash doesn't solve
           | the problem of the review UX just doing badly when you say
           | "What is the difference between the last version of commit Y,
           | and the new version." If those changes include a rebase, it
           | shows you the entire diff including the rebase. That's often
           | not good, but sometimes extremely necessary. (See my other
           | posts in the thread about how Gerrit does this better.) And
           | if you have to do something like rebase, then solve a
           | conflict as a result of the rebase, it is basically
           | unavoidable that you have to do all of them at once if you
           | want the CI to stay clean (which I find important, at least.)
           | 
           | Also, a lot of teams just culturally hate shit like
           | autosquash, or any squash/rebasing at all. Do I think they're
           | misinformed? Yes. Do I think they could use tools that make
           | those flows 1000x better and faster? Yes. But at the end of
           | the day default user interface and design philosophy of one
           | of the most popular software forges in the world _does_
           | actually matter and have downstream cultural impact. So you
           | have to interrogate that and break it down for people. GitHub
           | does not really encourage, explain, or smooth out workflows
           | like this. People learn it, then hate anything else. (I mean,
           | fucking hell, Git rebase didn 't even have --update-refs, a
           | huge QOL improvement for stacked diffs, until like 2022!) So,
           | you often don't have a choice. I've had many places where
           | engineers despised GH force pushes, but only because...
           | GitHub's UX is bad at tracking addressed/reviewed comments on
           | a PR after a force push! They get hung by their own noose, so
           | to speak, and don't even know it.
           | 
           | I do have other problems with Git, but this post is mostly
           | about GitHub/Gitlab style PR review flows. The same problems
           | still happen even with autosquash or tools that accomplish
           | the same thing.
        
           | jrochkind1 wrote:
           | I realized reading the first part of this article that I
           | often want to set up a _sequence_ of PR 's, for very similar
           | reasons as in the begininng of OP. Say, a prefatory refactor,
           | then the main work, then some data cleanup.
           | 
           | When I do this, the problem with rebase is that it kind of
           | breaks the additional "next in sequence" PRs "on top", or at
           | least requires (confusing to me) cleanup in all of them when
           | I rebase the base.
        
             | shambulatron wrote:
             | Have you come across `git rebase --update-refs`? This
             | automatically moves your "intermediate" branches during a
             | rebase and sounds like it could be useful in your
             | situation.
        
               | jrochkind1 wrote:
               | I have not! I'll try to check it out, thanks!
               | 
               | The git cli still scares me when I get off the familiar
               | path. :(
        
         | aseipp wrote:
         | Yes, fixup commits are a good way to approach this, though I
         | don't personally like them, though. I think Sapling's "absorb"
         | command which works on an underlying SCCS weave to
         | automatically absorb changes into the relevant diff is much
         | more elegant. It prompts you with an interactive UI. (For me,
         | it sorta falls into the same space as rebasing a series that
         | has multiple branches on it. You need --update-refs for that.
         | Because otherwise it's like, why am I, the human, doing the
         | work of tracking the graph relationships and manually punching
         | in the commits, and moving the branches, and doing all this
         | bullshit? Computers are good at graphs! Let them handle it!)
         | 
         | There is also a `git absorb`, but it isn't as robust as
         | Sapling's implementation[1].
         | 
         | Really the problem isn't interactive rebase or not. It's mostly
         | a problem of the UX of the review tool itself more than
         | anything, and the kind of "cycle" it promotes. I mentioned it
         | elsewhere here, but fixup commits for example still won't solve
         | the problem of GitHub showing you diffs between baselines, for
         | example, which can absolutely ruin a review if the baseline is
         | large (e.g. you rebased on top of 10 new commits.)
         | 
         | I do have problems with Git's UX beyond this, but the original
         | post is mostly a gripe about GitHub.
         | 
         | [1] There is an example in this GitHub issue that captures the
         | difference between the two underlying algorithms:
         | https://github.com/martinvonz/jj/issues/170
        
         | faangguyindia wrote:
         | Biggest problem working with Git is remembering all these flags
         | and commands.
         | 
         | If you do it everyday for years, sure you remember most of
         | them.
         | 
         | But for weekend hackers or those who specialise in some other
         | areas, it creates a lot of frustration.
         | 
         | I often forget all these flags etc....
         | 
         | My brother made this https://github.com/zerocorebeta/Option-K
         | 
         | This enables me to simply write in thermal, "interactive rebase
         | main, auto squash.
         | 
         | And then I hit option+k and it replaces it with the above
         | command.
        
           | ipaddr wrote:
           | If you are a weekend warrior you are not working with a team.
           | Just commit and push.
        
           | keybored wrote:
           | You've only mentioned your bros GitHub projects four times in
           | the last two days. Time to step it up maybe?
        
         | riquito wrote:
         | You can add to your ~/.gitconfig                  [rebase]
         | autosquash = true
         | 
         | from then on `git rebase -i origin/main` automatically reorder
         | fixup and squash commits. It's a small thing but greatly
         | improved my workflow
        
           | dietr1ch wrote:
           | Squashing is nice because it keeps the amount of commits
           | minimal, but as mentioned in the article, it has the problem
           | that now reviewers have to figure out what changed since
           | their last review, which can get hard to do if the tree of
           | changes originally proposed needed updates in dependencies of
           | the commits they reviewed.
           | 
           | I really like the idea that I'm working on a tree (often a
           | stack/list) of changes, and as the review progress I get this
           | cross product of iterations over time like [a-better-way-
           | interdiff-review-aka-git-range-diff](https://gist.github.com/
           | thoughtpolice/9c45287550a56b2047c631...) showcases. In the
           | end, the time dimension is useless after things get
           | submitted, so the repository only gets the latest state of
           | the change tree committed, which is simple and like the auto-
           | squashed version you mention, but during review reviewers get
           | to see the change tree evolve in an easy way.
        
             | pinkorchid wrote:
             | > the problem that now reviewers have to figure out what
             | changed
             | 
             | That's something that the review tool can solve, and I
             | agree that github doesn't handle it well. But other code
             | review tools can show diffs between revisions independently
             | of how the commits have been rebased or squashed over time.
        
         | Guvante wrote:
         | Git aliases are magical BTW
        
       | outsomnia wrote:
       | Guys... look into stgit if you like the sound of this iconoclasm
       | 
       | https://stacked-git.github.io/
        
       | Piraty wrote:
       | using github's UI as preferred way to interact with code in
       | review is a bad idea, as it encourages lazy from-the-couch-just-
       | yolo-approve-it-looks-alright style of review. this is where
       | gerrit+mail based workflows shine as reviewer is more encouraged
       | to apply the series, compile/run it in their env (which might
       | differ from your's); here is an example [0].
       | 
       | here are some useful notes on how to have a purely branch centric
       | review process regardless of a webUI: [1]
       | 
       | [0]: https://drewdevault.com/2022/07/25/Code-review-with-
       | aerc.htm... [1]: https://news.ycombinator.com/item?id=37086022
        
         | another-acct wrote:
         | > mail based workflows shine as reviewer is more encouraged to
         | apply the series, compile/run it in their env
         | 
         | not to mention: if the reviewer does this for every version of
         | the posted series, on appropriately named (versioned) local
         | branches, then they can trivially run git-range-diff between
         | adjacent versions!
        
       | kgeist wrote:
       | After JetBrains discontinued Upsource, we tried several code
       | review tools and it shocked us that many of the tools (both
       | commercial and open source) don't have built-in tools to review
       | incrementally. It's just just on big soup of code, or you have to
       | create new PR's for each small change. Now we have to use
       | JetBrains SpaceCode, which still lacks many of the niceties of
       | Upsource.
        
       | dusted wrote:
       | We're working with gitlab and not a day goes by where I don't
       | miss gerrit with it's fast and functional ui, good and practical
       | diff viewer, patch-sets, topics, review workflow, and
       | manifests...
       | 
       | The thing that gitlab does really well, is making it clear how
       | good gerrit is.
        
       | Izkata wrote:
       | As someone who has been on a maintenance team for years and
       | regularly has to dig through the history to figure things out, I
       | strongly prefer the original "bad" version with 7 individual
       | commits. Yes "git blame" takes a little bit of extra work to get
       | through all the commits, but knowing what initial mistakes were
       | made and refactors done makes it much easier to tell what the
       | original intent was.
       | 
       | For example, if "fix bob review", "fix alice review", or "minor"
       | introduced something that wasn't noticed until later, by having
       | them separate we can tell whether it was intended functionality
       | or a bug. This has happened to us a whole bunch of times with
       | rarely seen edge cases, so the bug wasn't found until years
       | later, or some other part of the code was masking the issue so it
       | didn't manifest as a bug until years later. At least one of these
       | was even caused in a "linting" commit, and all of these were much
       | more easily fixable because we could tell the bug was introduced
       | in one of these code-review-update commits, rather than the core
       | feature commit.
        
         | travisb wrote:
         | I concur.
         | 
         | I don't know why people are so happy to throw out the history.
         | I find it's not uncommon that bugs are introduced during
         | rebasing, and merging in from master. Sometimes this is due to
         | incorrectly resolved conflicts, sometimes due to concurrent
         | changes elsewhere in the codebase.
         | 
         | With --first-parent it isn't hard to have your cake (full
         | history) and eat it too (linear mainline history).
        
         | crabbone wrote:
         | This is solving the problem with the wrong tool. What you need
         | is documentation. But people working on the project you have to
         | maintain didn't write one. So, you are trying to use git blame,
         | astrology and ouija board to guess what the project authors
         | wanted.
         | 
         | And, while doing so, you are arguing for never cleaning the
         | house, keeping all the garbage where it falls. Which will only
         | make your situation worse, because the history will bloat with
         | a lot of contradictory or false information.
        
           | Izkata wrote:
           | > What you need is documentation. But people working on the
           | project you have to maintain didn't write one.
           | 
           | Oh there is documentation, but it's old and contradictory.
           | Commits are self-organizing by history.
           | 
           | > So, you are trying to use git blame, astrology and ouija
           | board to guess what the project authors wanted.
           | 
           | There's no guessing involved. Keep the original commits and
           | it's right there in the commit message and order of changes.
           | 
           | Guessing is what happens when you destroy the history by
           | using rebases like this.
           | 
           | > And, while doing so, you are arguing for never cleaning the
           | house, keeping all the garbage where it falls. Which will
           | only make your situation worse, because the history will
           | bloat with a lot of contradictory or false information.
           | 
           | I'm not saying "don't refactor" at all. Clean the code up as
           | much as you want. The history however is completely hidden
           | and not causing any clutter, it's there when you need it and
           | invisible the rest of the time, so erasing useful history is
           | just busywork for negative benefit.
           | 
           | Any apparent contradictions are also easily resolved because
           | commits have timestamps and ordering and are part of the
           | larger completely automatically generated history.
        
       | planetpluta wrote:
       | I'm confused by how pushing to new branches would work on GitHub
       | (or is the point that it doesn't...)? Are you able to change the
       | branch of a PR from `v1` to `v2` without making a new PR?
        
         | aseipp wrote:
         | Yes, the point is that it basically doesn't support that.
         | 
         | Well, OK. You can push two branches, v1 and v2, each with the
         | commits. Then to do pairwise diffs, you type in the commit
         | object hashes directly into the URL bar to diff the two objects
         | in the repository using the 'blobs' API but like... I don't
         | think that qualifies so much as "supporting" it as much as an
         | absurd hack, right?
         | 
         | > Are you able to change the branch of a PR from `v1` to `v2`
         | without making a new PR?
         | 
         | No, you have to open a whole different PR.
        
           | planetpluta wrote:
           | Okay that makes sense. Agreed, wouldn't quite consider that
           | "supporting" as far as I'm concerned.
        
       | typeofhuman wrote:
       | I just wish I could group related files in the my PR so the
       | reviewer has a better way of reading the changes.
       | 
       | It would be like a "PR Journey".
        
       | tuckerpo wrote:
       | Looks like the author really wants `rebase -i`
        
       | hoten wrote:
       | Some of the problem stated in the post is a little forced, namely
       | the issues with bisect and blame. That's only an issue if the
       | review doesn't end with a squash.
       | 
       | Also as a user of gerrit via the Chromium project, I'm not aware
       | of a way to structure the patch sets uploaded as individual
       | changes without infinite foresight. It's always appended into the
       | last. Whereas with github, at least you can rebase. I fully admit
       | I could be missing a gerrit feature.
       | 
       | That said... I would really love an interdiff feature for the
       | GitHub rebase-to-fold-in-feedback-to-meaningful-commits workflow.
        
       | jrochkind1 wrote:
       | I have found myself arranging PR's on github in sequence, for the
       | exact sort of use case OP talks about. (Each PR might have more
       | than one commit, so it's not exactly the same practice). And been
       | frustrated that Github's interface does not make it very easy to
       | make the sequence apparent or have good DX for the reviewers.
       | 
       | One could definitely imagine a UX/DX that would. (I do apprecaite
       | github's UI generlaly).
        
       | loeg wrote:
       | This workflow is exactly what Phabricator[1] facilitates, for
       | what it's worth. Also, if I remember correctly, ReviewBoard.
       | (Though I have not used that in some time and might misremember.
       | Also, it has its own flaws.)
       | 
       | Sadly the open source version is unmaintained now. It is still
       | used by FreeBSD. Facebook uses it internally for every single
       | diff, of millions.
       | 
       | [1]:
       | https://secure.phabricator.com/book/phabricator/article/diff...
        
       | swolchok wrote:
       | https://github.com/ezyang/ghstack
        
       | clktmr wrote:
       | I agree in general, but running git bisect on individual PR
       | commits is just doing it wrong. There will always be commits that
       | break stuff temporarily. Run git bisect only on the merge commits
       | instead, which are typically already tested by CI.
        
         | another-acct wrote:
         | > I agree in general, but running git bisect on individual PR
         | commits is just doing it wrong. There will always be commits
         | that break stuff temporarily.
         | 
         | That's unacceptable in my book. Before submitting any patch set
         | for review, the contributor is responsible for ensuring that
         | the series builds (compiles) at every stage -- at every patch
         | boundary. Specifically so that a later git bisect never fail to
         | build any patch across the series.
         | 
         | This requries the contributor to construct the series from the
         | bottom up, as a graph of dependencies, serialized into a patch
         | set (kind of a "topological sort"). It usually means an
         | entirely separate "second pass" during development, where the
         | already working and tested (test-case-covered) code is
         | reorganized (rebased / reconstructed), just for determining the
         | proper patch boundaries, the patch order, and cleaning up the
         | commit messages. The series of commit messages should read a
         | bit like a math textbook -- start with the basics, then build
         | upon them.
         | 
         | Furthermore, the patch set should preferably also pass the test
         | suite at every stage (i.e., not just build at every stage).
         | Existent code / features should never be functionally
         | regressed, even temporarily. It's possible to write code like
         | this; it just takes a lot more work -- in a way you need to see
         | the future, see the end goal at the beginning. That's why it's
         | usually done with a separate, second pass of development.
        
       | keybored wrote:
       | As the author is aware of[1][2] the Git project uses this
       | interdiff approach with email.
       | 
       | - Patch series (PR equivalent) go through a round of reviews
       | 
       | - Each version has a cover letter (like PR desription) unless
       | it's only a one-patch series
       | 
       | - Each version has that optional cover letter with each patch as
       | a reply email to it
       | 
       | - The next version is a reply to the previous cover letter
       | 
       | - And each version 2 and above cover letter has a git-range-diff
       | in it (courtesy of git-format-patch)
       | 
       | - And also a human/manual summary of changes between versions
       | 
       | - Optionally you can have little comments of each patch that are
       | _not_ part of the commit /patch message: just put it between the
       | three dashes and the diff. Or use Git Notes and let it handle it
       | for you (it will put it in the same space).
       | 
       | In turn there are no "address feedback" commits in the final
       | (merged) series. Only the changes themselves.
       | 
       | Just look at any `[PATCH v2 0/...` or higher (v3...) email on the
       | mailing list: https://lore.kernel.org/git/
       | 
       | Of course this isn't the easiest workflow:
       | 
       | - Email
       | 
       | - You ought to keep track of the base commit between versions
       | (you could have rebased on the main branch)
       | 
       | - You need to store versions of your branches
       | 
       | - You need to keep track of who to CC on the emails. Well,
       | perhaps not if they are the same people throughout, but it is
       | good courtesy to add people who reply to these versions to the CC
       | list
       | 
       | - You need to harvest the email message id on the cover letters
       | and use that `In-Reply-To`
       | 
       | Phew!
       | 
       | But this is quite sublime for reviewers and people who come back
       | to the series years later:[3]
       | 
       | - All of the review in the same thread overview
       | 
       | - Each version moves the thread to the right
       | 
       | - Each patch (to be commit) is commented on individually
       | 
       | - You can reply to the commit message and the diff by quoting
       | them directly
       | 
       | - The contributor will both give you the range diff (which will
       | highlight diff changes _and_ metadata changes like edited commit
       | messages) _and_ a manual summary of the changes
       | 
       | [1] https://news.ycombinator.com/item?id=41511649
       | 
       | [2] And it is with _some_ trepidation that I bring this up
       | because of the aversion some people have to email workflows.
       | Because the "interdiff style" of PRs is useful!
       | 
       | [3] Git Notes `amlog` records the message id of the patch email
       | where the commit came from
        
       | samatman wrote:
       | Posts like this add fuel to the fire of my conviction that the
       | future of revision control is Pijul, or something much like it.
       | 
       | I can't completely justify the intuition here, but this seems
       | like an artificial distinction forced by a snapshot model of a
       | codebase. It's literally about the same patches, but arranged
       | differently to give clarity to the history of the project.
       | 
       | So it seems obvious on that level that a system where patches are
       | the fundamental stratum has an advantage there, although I
       | confess that I have yet to use Pijul for a substantial project,
       | so I can't describe in detail how it might make this better. I'm
       | just going off things like how a cherry-pick in Pijul applies the
       | _actual_ change from one branch to the other, rather than
       | duplicating it over.
       | 
       | I get frustrated a bit when I think about the gap between the
       | kind of difference a leap forward in revision control could make
       | for our profession, and the amount of resources the Pijul project
       | has to work with. So I'm just putting this out there: MSFT bought
       | GitHub for about 9 billion dollars in 2024 money. Anything which
       | displaces it would be worth more than that. That's a lot of
       | leverage for anyone forward-looking enough to apply it.
        
         | travisb wrote:
         | While patches and snapshots are duals of each other, patches
         | are generally less easy to reason about and work with, in no
         | small part because compilers and humans work on the snapshot,
         | not the patch.
         | 
         | Having worked extensively with patches as the semantic unit via
         | patch(1) and quilt and stgit, I'm very skeptical that a VCS
         | based on patches is actually superior in most circumstances.
         | 
         | In this particular case, perhaps it would be helpful to view
         | code reviews as a snapshot where the differences (possibly
         | intra-review revision differences) are highlighted instead of
         | as patches.
        
       | zoogeny wrote:
       | I've heard people argue for this strategy a few times but I am
       | not convinced. Most projects I work on have feature branches that
       | get squashed into a single commit (erasing the history of the
       | branch). If we even have the case described by the author we
       | would just do the three steps (refactor, new api, update) as 3
       | commits.
       | 
       | One thing that has been a solid practice for me is to avoid long-
       | lived branches. That seems to be what leads to this kind of
       | multi-stage commit scenario. Someone wants to work for several
       | days (or worse, weeks) on a feature and then drop the whole thing
       | all at once. I strongly prefer to move things into main every day
       | (or multiple times a day). One way to do this is to use feature
       | flags to allow work-in-progress commits to land without causing
       | problems. This also has the advantage (if systems are set up
       | correctly) to enable on dev and staging environments for testing.
       | It is also a hop-and-skip jump away from blue/green rollouts.
       | 
       | I don't want ways to make big commits easier to review. I want to
       | force the team to commit small changes early and often. I
       | understand not everyone agrees.
        
         | steveklabnik wrote:
         | The people I know who prefer stacked diffs argue that it makes
         | it easier to integrate changes faster, no matter the size. Part
         | of this is because unlike a PR on GitHub, you can land parts of
         | a stack: to use the example from the article, if the "small
         | refactor" diff is good to go, it can be landed without landing
         | the "new api" and "migrate API users" diffs. Centering commits
         | rather than branches has the effect of making for smaller
         | overall commits rather than long-lived branches.
        
           | zoogeny wrote:
           | I'm not sure I understand your exact logic, so let me talk
           | through a simplified scenario for arguments sake. Imagine
           | each change takes 1 day. Imagine the team releases code to
           | production every day. On Monday a developer picks up the
           | task, does the refactor, puts up a PR and the refactor is
           | shipped Tuesday morning. Tuesday he works on the api and it
           | ships Wednesday. Wednesday he finishes the migration which
           | ships on Thursday.
           | 
           | In the alternate scenario the entire stacked commits go out
           | together on Thursday after the entire 3 days of work is
           | completed. So I do not see how it would make it easier to
           | integrate changes faster. This problem becomes worse, as you
           | can imagine, if the second task in the stack takes a week vs.
           | a day. I believe this kind of logic can be extended to any
           | timescale.
        
         | another-acct wrote:
         | > have feature branches that get squashed into a single commit
         | (erasing the history of the branch)
         | 
         | Terrible.
         | 
         | It makes git-blame and git-bisect essentially unusable.
         | 
         | If you have a regression, git-bisect can help you narrow it
         | down to a single patch. Because of that, you want to have fifty
         | 160-line patches in the git history, for a particular feature,
         | rather than one 8000 line patch.
         | 
         | If a given line of code looks fishy, you want git-blame (or a
         | series of git-blame commands) to lead you to a 160-line commit,
         | with its own detailed commit message, rather than to a
         | 8000-line commit.
         | 
         | You also want to preserve the _order_ of the original commits.
         | Just reading through the individual commit messages, in order,
         | a few years later, can be super helpful for understanding the
         | original design. (Of course the original patch set has to be
         | constructed in dependency order; it needs to compile at every
         | stage, and so on. That 's a separate development step that
         | comes on top of just implementing the functionality. The code
         | must be _presented_ in logical stages as well.)
        
       | knallfrosch wrote:
       | In the future, we'll just have AI group the differences, write
       | the summaries and understand the changes.
        
       | quodlibetor wrote:
       | I have been chasing the gerrit code review high since I left a
       | company that used it almost 5 years ago.
       | 
       | Stacked pull requests are usually what people point to to get
       | this back, but this article points out that _just_ stacked pull
       | requests don't handle it correctly. Specifically with github, you
       | can't really see the differences in response to code review
       | comments, you just get a new commit. Additionally, often github
       | loses conversations on lines that have disappeared due to force
       | pushes.
       | 
       | That said, I have a couple scripts that make it easier to to work
       | with stacks of PRs (the git-*stack scripts in[1]) and a program
       | git-instafix[2] that makes amending old commits less painful. I
       | recently found ejoffe/spr[3] which seems like a tool that is
       | similar to my scripts but much more pleasant for working with
       | stacked PRs.
       | 
       | There's also spacedentist/spr[4] which gets _much_ closer to
       | gerrit-style "treat each commit like a change and make it easier
       | for people to review responses" with careful branch and commit
       | management. Changes don't create new commits locally, they only
       | create new commits in the PR that you're working on. It's,
       | unfortunately, got many more rough edges than ejoffe/spr and is
       | less maintained.
       | 
       | [1]:
       | https://github.com/quodlibetor/dotfiles/tree/main/dot_local/...
       | [2]: https://github.com/quodlibetor/git-instafix/ [3]:
       | https://github.com/ejoffe/spr [4]:
       | https://github.com/spacedentist/spr
        
       | crabbone wrote:
       | I really don't like either idea, and don't understand why is this
       | such a big issue...
       | 
       | Working with Web interface for anything code-related is a huge
       | downside. Workflow doesn't matter at this point, the usability of
       | this approach, no matter what it does is so bad, it's not worth
       | discussing further.
       | 
       | The best way it ever worked for me is that:
       | 
       | 1. PR author creates a PR and gets assigned a reviewer.
       | 
       | 2. The reviewer leaves a commit with                   #
       | REVIEW(reviewer): Comments
       | 
       | 3. Then the PR author changes something or argues back.
       | 
       | 4. If reviewer is happy, the author gets to organize the commits
       | in whatever sequence they want (this would typically involve
       | something like squashing everything, removing review comments if
       | they are no longer necessary, and then splitting the PR into
       | logical parts). Otherwise we go back to (2).
       | 
       | No need for complicated workflows, no need for any kind of
       | external system...
        
       ___________________________________________________________________
       (page generated 2024-09-11 23:01 UTC)