[HN Gopher] Some of us like "interdiff" code review
___________________________________________________________________
Some of us like "interdiff" code review
Author : todsacerdoti
Score : 231 points
Date : 2024-09-10 20:39 UTC (1 days ago)
(HTM) web link (gist.github.com)
(TXT) w3m dump (gist.github.com)
| zdw wrote:
| I've generally found that code review first, and rebase-centric
| systems like Gerrit tend to be much easier to review code in. One
| of the best parts of this is native support for stacking multiple
| patches, so people make smaller patches that are easier to
| review.
|
| Code review in Github feels like a bad afterthought - the space-
| wasting interface that looks more like a forum thread, the
| inability to track over rebases, etc.
| zeotroph wrote:
| There is one thing I miss on Gerrit when you push a stack of
| commits: A central place to talk about the whole of the stack,
| not just individual commits. This "big picture", but still
| technical stuff, too often happens in the issue tracker. But
| where to place it, I have no idea. This stack is just too
| ephemeral and and can be completely different on the next push.
| lima wrote:
| Many teams use topics for this.
| aseipp wrote:
| Yeah, but you can't really discuss the topic _itself_ ,
| right?
|
| I do think this is a weakness of Gerrit. It doesn't really
| capture "big picture" stuff nearly so well. At least on GH
| you can read the top-level comment, which is independent of
| the commits inside it. Most of the time I was deep in
| Gerrit doing review or writing patches, it was because the
| architectural decisions had already been made elsewhere.
|
| I guess it's one of the tradeoffs to Gerrit _only_ being a
| code review tool. Phabricator also didn 't suffer from this
| so much because you could just create a ticket to discuss
| things in the exact same space. Gerrit is amazingly
| extensible though so plugging this in is definitely
| possible, at least.
| another-acct wrote:
| On a mailing list, you used to be able to write up the
| big picture in the "cover letter" (patch#0). Design-level
| discussions would generally occur in a subthread of
| patch#0. Also, once the patch set was fully reviewed, the
| maintainer could choose to apply the branches on a side
| branch at first (assuming the series was originally
| posted with proper "--base" information), then merge said
| side branch into master at once. This would preserve
| proper development history, plus the merge commit
| provides space for capturing the big picture language
| from the cover letter in the git commit log.
| jeffbee wrote:
| Someone should do a deep dive into developer productivity after
| LLVM switched from Phabricator to GitHub. How many other major
| projects have done a switch like that?
| ajkjk wrote:
| Better it be a deep dive into developer sanity, a more
| important but comparatively under-valued metric.
| jeffbee wrote:
| For me, GitHub PR review drives me crazy. It's good for
| exactly one round of exchange. After that nobody can tell
| what the heck is going on. So my self-reported mental health
| would be worse.
|
| But on non-subjective metrics it seems like LLVM PRs on
| GitHub are gathering noticeably less discussion than they
| used to enjoy as Phabricator diffs.
| zeotroph wrote:
| And just visually, GitHub wastes so much vertical space, so
| even trying to place what belong to which patchset becomes
| hard.
| ams92 wrote:
| Most of the complaints here could be solved by having smaller
| pull requests and then squashing commits when it's time to merge.
| eximius wrote:
| Well, yes. Or by having stacked branches and rebasing your
| branches, etc.
|
| The point is that GitHub/git's default experience makes this
| harder to do than a system that bakes it in.
| gloryjulio wrote:
| We use stacked commits + rebase only in our company. The commit
| history is linear and it's very easy to revert changes. I don't
| see any advantage of using merging instead of rebase
|
| I am not sure why we need to squash commits. We encourage the
| opposite where you should commit small and often. So if we need
| to revert any commit, it's less painful to do so.
| majormajor wrote:
| Without squashing it's hard for me to commit as small and
| often as I would like.
|
| Some things I want out of the final series of commits:
|
| 1) everything builds. If I need to revert something or roll
| back a commit, the resulting point of the codebase is valid
| and functional and has all passing tests.
|
| 2) features are logically grouped and consistent - kinda
| similar to the first, but it's not just that I want the build
| to pass, I don't want, say, module A to be not yet ready for
| feature flag X but module B to expect the feature flag to
| work. In the original article, this is to say that I want the
| three commits listed, but not one halfway through the
| "migrate API users" step.
|
| But when I'm developing I _do_ want to commit halfway through
| steps. I might commit 50 lines of changes that I 'm confident
| in and then try the next 50 lines and decide I want to throw
| them away and try a different way. I might just want to push
| a central copy of what I've got at the end of the day in case
| my laptop breaks overnight (it's rare, but happens!). I might
| want to push something WIP for a coworker to take an initial
| look at with no intent of it being ready to land.
|
| But I don't want any of those inconsistent/not-buildable/not-
| runnable states to be in the permanent history. It fucks with
| things like git bisect and git blame.
| gloryjulio wrote:
| When we publish a stack of commits, our ci ensures that
| every commit is build and tested individually. There is no
| consistency issue
|
| Squash and merge actually makes the above goal harder. With
| rebase + small commits, all we need to make sure is that
| every commit pass all the build signals and tests during ci
| john_the_writer wrote:
| This only works if your commit in a green state.
| Sometimes we have to change when things are still
| "Yellow"..
|
| I tend to add all my tests in one go and commit the RED.
| "tests are written" Then as I pass each test, I commit
| that.
|
| This pattern works really well for me because if I mess
| up, then rolling back to the last yellow is easy. I can
| also WIP commit if I have to fix an urgent bug, and then
| get back to the WIP later.
| gloryjulio wrote:
| Not sure what you mean... When we ship a stack of
| commits, every commit has to pass everything in CI. You
| are not suppose to ship a commit that's not passing the
| ci bar. There is a escape hatch that you can bypass but
| it's rarely used.
|
| You can make changes before you ship however you wanted
| as long as they pass ci. If you already shipped the code
| and want to make changes later, that means making new
| commit or reverting a bad commit. It's simple as that
| tome wrote:
| I think there's an ambiguity here between squashing every
| commit in the PR into a single one, and squashing fixup
| commits made as responses to review into the commits that
| originated them.
|
| For example, if the original commit series was
| Do a small refactor before I can start adding the test
| Add the test for the feature Do a small refactor
| before I can start adding the feature Work in
| progress Complete sub-feature 1 Work in
| progress lint lint Complete sub-
| feature 2 Respond to reviewer 1 comments
| Respond to reviewer 2 comments
|
| Then you can either squash the entire PR down to
| Implement feature
|
| or you can, using interactive rebase, squash (or more
| precisely fixup) individual WIP, lint, and response commits
| into where they belong to obtain Do a
| small refactor before I can start adding the test
| Do a small refactor before I can start adding the feature
| Complete sub-feature 1 Complete sub-feature 2
|
| where each commit individually builds and passes tests. I
| _far_ prefecr the latter!
| PoignardAzur wrote:
| My experience is that systemically squashing PRs enables a
| "fire and forget" style where you can add a bunch of small
| commits to your PR to address reviews and CI failures without
| worrying about making them fit a narrative of "these are the
| commits my PR is made of".
|
| On a more concrete level, squashing PRs means every single
| commit is guaranteed to pass CI (assuming you also use merge
| queues) which is helpful when bisecting.
| gloryjulio wrote:
| With stacked commits, every commit is already passing CI
| though.
|
| To us the mental model is minimum. All you need to do is to
| make sure each commit pass CI. You can ship any number of
| stacked commits together
|
| -----------------------------------------------------------
| -----------------------------------------
|
| Not sure why I can't reply in a technical discussion. I
| have to edit to answer your question @danparsonson
|
| > if I'm working on a long series of changes across
| multiple days, and halfway through it the code doesn't
| build yet?
|
| That's why you break them down into small commits. The
| early you push it to CI, the earlier you will know whether
| each commit builds. For example, push commit 1 2 3 to the
| CI when they are ready. When the CI is running, you are
| working on commit 4 5 6
|
| > The code won't pass CI because I'm not finished, but I
| want to commit my progress
|
| If your commit 1,2,3 are ready, just ship them. It doesn't
| stop you have a few commits in reviews and a few WIP
| commits. There is no down time
| danparsonson wrote:
| Perhaps I misunderstand you but what if I'm working on a
| long series of changes across multiple days, and halfway
| through it the code doesn't build yet? The code won't
| pass CI because I'm not finished, but I want to commit my
| progress so I don't lose it if something goes wrong, and
| I can roll back if make mistakes.
| tome wrote:
| Then fix up the commit history at the end, for example
| like this: https://news.ycombinator.com/item?id=41509051
| lozenge wrote:
| Not really. The idea is to split work into separate stages
| which are reviewed separately, but as a whole.
|
| In the example: "small refactor 25LOC -> new API 500LOC ->
| migrate API users 50LOC"
|
| Making a PR of the small refactor will probably garner comments
| about "why is this necessary".
|
| Opening two PRs at the same time is clutter as GitHub presents
| them as separate.
|
| As well, sometimes CI won't pass on one of the stages meaning
| it can't be a separate PR, but it would still be useful in the
| code review to see it as a separate stage.
| john_the_writer wrote:
| the comments about "why is this necessary" can be handled
| with a decent PR template, and a comment.
|
| What I tend to do is make the changes locally with different
| commits and then cherry pick the refactor into a PR branch
| and wait for that to be accepted. Then I rebase the FULL
| branch with "master" after the merge and create the PR.
| danparsonson wrote:
| I'd be quite happy with seeing the three jobs in the article
| as three separate PRs. Fixing a bug and adding a feature are
| two jobs that, as I think we all agree, need to be tracked
| individually - so work on them individually.
|
| > As well, sometimes CI won't pass on one of the stages
| meaning it can't be a separate PR
|
| Could you give an example of this? Not sure what you mean.
| Forge36 wrote:
| Commits aren't always perfect.
|
| Sometimes I'll make the unit test first, which fails CI and
| the next set of commits implements the behavior.
| trashburger wrote:
| By doing this, you break commit atomicity and make
| bisects hell. Please don't do this. Commits aren't
| perfect at first for sure, but they should be by the time
| you make them reviewable.
| snatchpiesinger wrote:
| It's fine to break commit atomicity on feature branches.
| You can use git bisect --first-parent on you
| development/master branch.
| trashburger wrote:
| I completely disagree. In doing so you lose all
| visibility into the components and gradual evolution of
| the code that atomic commits provide. Same thing with
| squashing (which is just the worst).
| fHr wrote:
| Yeah I don't see the point, why not just use mergetrains?
| globular-toast wrote:
| That's like a caveman approach to the problem. Imagine the
| extra overhead required to submit the "refactor" commit. The
| result world be either nobody refactors or refactors are just
| bundled into the feature commit so it's never clear what you're
| actually reviewing.
| keybored wrote:
| Can someone explain to me why everything has to be done with
| PRs? Like you just have three commits for a PR. But the correct
| way is to split that up into three single-commit PRs? Why?
|
| Not to mention that it doesn't give you an interdiff. Because
| now you need to diff across three pull request.
|
| It looks more like you are punting on the problem. Not solving
| anything.
| OJFord wrote:
| I create my GitHub PRs like a reviewer's going to look at the
| commits if it's too large overall. I'm also fairly sure they
| don't, because it's basically never worth doing so in their PRs
| ('fix the test', 'merge origin/master', 'address review comment',
| etc. commits).
|
| I suppose I agree GitHub doesn't help with this / implicitly
| opposes it (I don't see the 'explicitly and' claim justified
| though?) but there's nothing about a PR that isn't a 'series' of
| 'patches'. Maybe GitHub's just responding to the way most people
| use it, and making that easier/better instead of being
| principled? Doesn't mean we can't be.
| enasterosophes wrote:
| Nice, this taught me about `git range-diff` which wasn't on my
| radar before.
|
| Is the conclusion likely to be that the author thinks Gerrit is
| good, or is there some nuance I didn't pick up? I've used Gerrit
| before and in hindsight I much prefer it to other ways of doing
| code review.
| aseipp wrote:
| Yes, Gerrit is fucking great. If you actually want to do _code
| review_ and not just rubber stamp shit on GitHub because your
| eyes are going to bleed after reading the same thing for the
| 15th time, just use Gerrit.
|
| The thing is, I just never got around to finishing this article
| because what's there right now is "good enough" to get the
| ideas across.
| ajkjk wrote:
| 100% agree that this is ideal, the way Github does it is
| completely godawful and it's a tragedy that so many people have
| it normalized for them.
|
| We did this with Phabricator, although it was a somewhat-manual
| process, helped along by having some command line macros for
| updating all the reviews at once. But better still would be an
| explicit UI for it.
| aseipp wrote:
| I am the author and used the phrase "Code review is a pretty
| good idea, in general" in the opening very specifically,
| because it used be one of the selling points listed on the
| Phabricator homepage. :) I miss it.
| taspeotis wrote:
| > Grab ahold of tasks, literally. Place them in confusing,
| new orders. Make a column just for interns! Ignore the
| backlog forever.
|
| I swear the Herald selling points also had keeping tabs on
| the pesky interns, or something to that effect.
| aseipp wrote:
| One of my favorites was something like:
|
| > Phabricator is well received, and has raging reviews from
| users inside Facebook, such as "Mandatory" and "OK"
|
| Also, Evan Priestley still uses the
| https://secure.phabricator.com instance to occasionally
| post life updates, so I check in on it.
|
| Last I checked he was learning about PCB design and
| printing. I think he wanted to learn how to actually
| physically manufacture his own boards (e.g. etching your
| own conductive copper layers) and then he posted some
| update like "I'm not sure I'll go down this route, because
| it turns out you can send your design to PCBWay, and it
| will come back within 5 days, because there is a magical
| PCB Faucet somewhere in Shenzen apparently."
|
| What a guy.
| Groxx wrote:
| Every day I have to use Jira and GitHub instead of Phab is a
| rather painful day. Sadly those days are increasing.
|
| Maybe one day I'll be able to use Gerrit (it sounds great),
| and then I can be only annoyed at Jira.
| fosterfriends wrote:
| I agree with the argument laid out here. Series of small diffs
| with versions is a fantastic clean model.
|
| When creating Graphite on top of GitHub, we chose to only support
| rebase model (despite the chaos that creates in GitHub timeline
| events). We also added "versions" support, which wasn't too hard
| because GitHub holds on to old commits even if you force push
| over them.
|
| A lot of what we try to build is the exact ideas this author is
| championing, in a way that's compatible on top of GitHub. My
| dream is us and others help usher Eng back towards the patterns
| Phabricator and Gerrit helped start :)
| aseipp wrote:
| We strongly considered Graphite as an alternative to Gerrit at
| my job that I mentioned at the start of this post (which I am
| no longer at, actually) because it does look like an absolutely
| excellent product, I will admit. You should all be proud of a
| smart design and smart set of tools.
|
| But there's a really really really really really really big
| problem. Me and the other main engineer on our team used a
| custom frontend to Git called Jujutsu[1] for all development.
| Jujutsu is about 1000x better than Git. So that's nice. (I'm
| also one of the developers, so I'm not going to abandon it
| anytime soon.)
|
| But gt, the graphite client, is not open source. I have no idea
| how to make them work together. I have no idea how to extend
| Jujutsu to handle Graphite stacks, because I don't even think
| there's an API to handle any of this.
|
| I even wrote a Gerrit integration for Jujutsu because JJ works
| so well at stacking, and Gerrit + Jujutsu is absolutely a force
| to be reckoned with IMO, even if the UX isn't as nice as
| Graphite's. (I'm happy to show the people at Graphite why that
| is, too, if anyone has time or is interested, but I suspect you
| could all grasp Jujutsu quite easily :)
|
| Please! Make gt open source and make it possible for third
| parties to make and update stacks. This isn't just useful for
| jj but all kinds of automation that wants to contribute patches
| -- imagine tools like Google's internal "Code Review ML models"
| for Critique that might recommend you rename a variable based
| on context. They will suggest the fix for you or even apply it!
| You can get around _some_ of those workflows with "Incorporate
| suggested edits" which is great on Gerrit (and Graphite?), but
| not all of them.
|
| [1] https://github.com/martinvonz/jj
| jacobegold wrote:
| We (Graphite) love Jujutsu - comes up in conversation all the
| time here.
|
| A prior version of the CLI is open source, the core data
| model (using git refs to store some extra data about what a
| branch's parent is) is still the same.
| https://github.com/withgraphite/graphite-cli
|
| We've talked about supporting other clients, but don't
| currently have the bandwidth to build something like that -
| definitely something I am personally passionate about making
| sure happens at some point.
| tombl wrote:
| I think the cli repo went private a while ago.
| ankitdce wrote:
| If you are looking for an open source stacked PRs CLI, you
| can look into av CLI (https://github.com/aviator-co/av).
| Unfortunately this also only works with GitHub, but it should
| be possible to add support for any Git-based platform.
|
| Disclaimer: I'm the founder of Aviator who supports the av
| CLI. It's a free tool to manage stacked PRs.
| aseipp wrote:
| I am the person who wrote this. AMA
|
| EDIT: Also, I'm not sure if this is against the rules, but I also
| need a new job as of recently. I like working on dev tools and
| other hard problems. If you liked reading this, want me to make
| your dev team more productive, or just want to experience and
| enjoy my excellent (and occasionally _eclectic_ ) taste, the
| email is in my profile.
| rosmax_1337 wrote:
| What is the meaning of life?
| aseipp wrote:
| I know this is in jest, but I'll just take the opportunity to
| respond by posting my favorite poem. The relationship between
| it and your question -- well, that's for you to decide.
| The birds have vanished down the sky. Now the last
| cloud drains away. We sit together, the mountain and
| me, until only the mountain remains.
|
| -- Zazen on Ching-t'ing Mountain
| xelxebar wrote:
| Beautiful. I wasn't aware of Li Bai (Li Bai ). Took a bit
| of searching, but the original looks to be called Du Zuo
| Jing Ting Shan : Du Zuo Jing Ting Shan
| Zhong Niao Gao Fei Jin Gu Yun Du Qu Xian
| Xiang Kan Liang Bu Yan Zhi You Jing Ting Shan
|
| This is gorgeous. Thank you for sharing!
| rosmax_1337 wrote:
| The question was sincere yet playful. I appreciate your
| answer. Here's a poem I like. O'er all the
| hilltops Is quiet now, In all the treetops
| Hearest thou Hardly a breath; The birds are
| asleep in the trees: Wait, soon like these Thou
| too shalt rest.
|
| -- Wanderer's Nightsong II, Goethe
| zeotroph wrote:
| From my interaction with the free part of GitHub, "diff soup"
| describes it very well. Does the paid version do anything
| better? What about GitLab, can this get near Gerrit? And then
| there are the external services which try to make GitHub less
| painful (and quite pricey, especially compared to a selfhosted
| Gerrit), by providing stacked diff support, did you look at
| these?
| aseipp wrote:
| No, paying for GH doesn't make the code review experience any
| better. It's identical across public/cloud/enterprise GH.
|
| I do not know if GitLab does anything different; I've never
| used it in anger. I'd bet $10 the answer is "no, it's
| basically just the same as GitHub", though.
|
| If you want a service that adds stacking on top of GitHub, my
| conclusion after some research is that https://graphite.dev/
| is the best option. We did consider it, but it couldn't work
| in my last job for "reasons" (see my other replies in this
| thread) but if you've got a shop of Git users and just want
| to throw money at the problem, I think it's the best choice.
|
| GerritHub is also a possibility, they employ many of the
| Gerrit devs and know what they're doing, but holy shit the
| corporate options are expensive out of the gate. It's like
| $20k/yr minimum regardless of size or number of users.
|
| Honestly, Graphite is cheap as hell considering how much more
| productive your engineers can be with a good review tool.
| Gerrit was basically night and day for us. It's not "oh, it
| pays for itself really quickly in a few days!" You'd probably
| pay off the monthly cost in less than an _hour_ of actual
| code review. And you don 't even have to opt most of your
| engineers in; you can trial it where 90% of them use GH and
| only a subset use graphite and pay.
| zeotroph wrote:
| Right, for Graphite $20/dev/month is nothing (I wonder if
| Enterprise is less or more more than that...), considering
| an ounce of review (prevention) is worth a pound of
| bugfixes (cure).
|
| And when you can not get corporate to switch away from GH,
| then that is it. In hindsight an obvious way to (almost)
| print money, congratulations, but also a sad state of
| affairs.
|
| But I imagine the $20k/yr is something you can easily spend
| on a 1/5 of a dev doing Gerrit maintenance.
| fHr wrote:
| Never going to understand those finance teams who think
| like 20k/yr for any enterprise deal is a good deal to
| onboard more customers and increase reach.
| sureglymop wrote:
| Because they don't need _that much reach_ if even a
| single customer nets them 20k /yr.
| amstan wrote:
| Does Graphite have a gerrit instance or something? I'm
| prepared to say "shut up and take my money" compared to the
| other 20k/year offer.
| kspacewalk2 wrote:
| >I do not know if GitLab does anything different; I've
| never used it in anger. I'd bet $10 the answer is "no, it's
| basically just the same as GitHub", though.
|
| You would bet incorrectly then. GitLab does essentially
| what you're describing, the only difference being that it
| compares different iterations of the force-push "naively",
| so if your force-push includes for example a rebase onto
| master because another MR has been merged ahead of yours,
| the diff will include the changes that have been rebased
| onto.
|
| If you decide to register an account on GitLab, simulate
| the MR and prove to yourself that ~90% of your interdiff
| post has been implemented by GitLab for about a decade,
| kindly donate the $10 to your nearest homeless shelter.
| aseipp wrote:
| > the only difference being that it compares different
| iterations of the force-push "naively", so if your force-
| push includes for example a rebase onto master because
| another MR has been merged ahead of yours, the diff will
| include the changes that have been rebased onto.
|
| > ~90% of your post
|
| This is literally what GitHub does, down to the very
| word, and it it is inferior to Gerrit, and it is not
| sufficient to get 90% of the way, the last 10% matters.
| As I have explained a dozen times in this thread. Lol.
| another-acct wrote:
| > GitLab [...] compares different iterations of the
| force-push "naively", so if your force-push includes for
| example a rebase onto master because another MR has been
| merged ahead of yours, the diff will include the changes
| that have been rebased onto
|
| That's quite the deal breaker IMO; for example it
| couldn't be used to compare a backport series (targeting
| an older stable branch, for example) against the original
| commit range on the master branch.
| atombender wrote:
| Graphite is OK. It's not great.
|
| It's very buggy. My stacks frequently just hang and become
| unmergeable.
|
| It frequently gets confused about its state and "gt sync"
| thinks that the upstream has changed even when it hasn't.
|
| Because of the three states involved (local, GitHub, and
| Graphite), there is a lot of "syncing". "gt submit", "gt
| track", "gt untrack", "gt sync" are all needed constantly,
| which adds mental overhead over the usual git pull/push.
|
| You end up with a _lot_ of force-push churn in the GitHub
| PR activity when re-stacking. In fact, I dread syncing
| anything because in 6-PR stack it could cause an avalanche
| of force-pushes. This is of course a side effect of the
| GitHub PRs model and their rebase approach.
|
| The web UI is pretty terrible. It's full of AI crap, and
| there's a "meme library". It kind of seems like they want
| people to live in their app, taking over from GitHub. But
| they don't really offer a reason whatsoever to go to
| graphite.dev instead of GitHub. Graphite doesn't really
| help the review process.
|
| They _really_ want you to sell you their merge queue
| product.
|
| In short, when it works it's fine. It's nice to do a series
| of PRs and then watch as it merges them. But it's not a
| tool I enjoy using, for the above reasons.
|
| I'd prefer a lightweight tool that just managed PRs and
| then did the merge magic. Graphite just wants to be too
| much.
| WorldMaker wrote:
| GitHub does have some stacked diff/"merge train" tools on the
| _deeply_ paid side (the "Call Us" sorts of Pricing tiers)
| that I've only seen screenshots and demos of.
|
| On the other side: If you get into the habit of "Reviews" on
| GitHub, which are in the free part, too, GitHub gives you a
| quick button for "Review commits since your last Review"
| under the Commits dropdown in the Files view. That mostly
| only works if you add commits rather than rebase, hence the
| complaints about contributing to "diff soup", but it's a
| reasonably useful workflow and there are workarounds on the
| "other side" to help deal with "diff soup".
|
| This is why some encourage Squash Merging as the GitHub
| preferred merge button. Review as a bunch of small commits
| over time, merge an entire PR to a single final commit.
|
| That said, as an alternative to squash merging, git itself
| provides some useful tools for dealing with "diff soup" style
| repositories using real merge commits: `--first-parent`. `git
| log --first-parent`/`git blame --first-parent`/`git bisect
| --first-parent` and more give you a "PR top-level view" in
| your integration branches (such as main branch) without you
| needing to rebase/squash.
|
| I wish more UIs took a `--first-parent` by default approach,
| including/especially GitHub's weak commit views (though it is
| understandable why GitHub pushes you to wanting to use its
| PRs list instead of commit views by keeping them weak).
| torarnv wrote:
| Amazing summary of the mental model of the interdiff review
| style, and the problems with GitHub's approach to code review.
| Thank you!
|
| One thing you don't touch upon (yet) is that the diff soup may
| lead people to prefer the squash merge strategy, to get rid of
| the "noise" of the fixup commits, which throws away the "good"
| initial 3 atomic commits as well.
|
| With interdiff review style you're left with the initial 3
| commits, and the choice of whether to land them individually or
| squash them is based entirely on the commits themselves and how
| atomic they really are.
| dijksterhuis wrote:
| What do you think of GitLab set up with Merge Request
| Dependencies + Squash+Merge merge strategy?
|
| I remember it being pretty easy to have that series of MRs
| pattern set up. On merging 3x MRs it would be 3x merge commits
| w/ a single squash commit for each. Regardless of the MR's
| commit history.
|
| Tradeoffs are having to merge earlier series branches into
| later series branches after changes are made during review.
|
| But people can do what they want to the commit history during
| review. Don't matter as it just gets squashed.
|
| Been a while since I've done this mind, been slumming it with
| GitHub so I might be looking at it with rose tinted sunglasses.
| chipx86 wrote:
| We first built interdiffs in Review Board
| [https://www.reviewboard.org] way back in 2006 (in fact I think I
| may have coined the term, or arrived at it independently). It's
| still my favorite part of the product and my process when doing
| code reviews. And it's one of the things we hear the most
| nostalgia for when people move to something like GitHub.
|
| I've never felt that fix-it commits are really a proper
| alternative, since:
|
| 1) They don't tell you what upstream changes have been
| incorporated into a series of commits.
|
| 2) They tend to mess up the commit graph, even if temporarily,
| and make it more difficult to review. If you've been following
| along with a review, you may have already read the code being
| fixed in fix-it commits, but if you're coming in fresh, you may
| start off with a bad sense of what that code's trying to do or
| how it's structured.
|
| 3) Not everyone uses Git or other multi-commit-capable SCMs.
| Plenty of people are on Perforce (such as in gamedev) or on
| specialized SCMs like Keysight SOS (such as chip manufacturers).
| So fix-it commits aren't even an option there.
|
| A proper interdiff-capable code review systems means one reviewer
| can start off from the first published review request and follow
| along with every update, seeing only what's changed, while
| another can jump in to the latest full change and not have to
| worry about the series of fix-its that led up to it. And it can
| do this regardless of the SCM.
|
| If done right, it can also exist alongside multi-commit changes.
|
| Let's say I have a small project I've broken up into multiple
| commits to help with the review process (say, an API handler,
| front-end UI, and documentation), and have decided this is
| suitable for posting as one review request (since the commits are
| largely interrelated and having these as one change helps lend
| context to the reviewers -- if they aren't, multiple review
| requests in a dependency chain are probably ideal).
|
| Based on review feedback, I may end up making a series of changes
| to one or all of those commits. When people go to review my
| updates, it's nice to be able to see how each piece evolved,
| without trying to do the mental arithmetic of mapping fix-it
| commits and their changes to their corresponding changes.
|
| So yes, interdiffs are fantastic! More people should use them,
| whether they're working with lots of small commits or large
| commits, single-commit review requests or multi-commit.
| aseipp wrote:
| That's sick as hell, friend. Actually, I have a second part to
| this article discussing some of the history and politics of
| what brought me to these tools.
|
| In about 2013, I migrated the Glasgow Haskell Compiler from
| "read .patch files on bug reporter" that I joked about, to
| using Phabricator. For a couple reasons, but at the time one of
| them was _not_ stacked diffs. It was because GitHub was so bad
| for review it didn 't even have _side-by-side diffs_! It was a
| total non-starter for me for those reasons among others.
|
| But that actually wasn't the first time I migrated a team to a
| code review tool. My first job in 2009 was a very small tight
| knit team of engineers in a single room in Houston, and I
| remember thinking it would be really good to get reviews of my
| code from other people, and to read the things they wrote so I
| could better understand the codebase. So, the first thing I did
| in the first few months was pester my manager to set up...
| ReviewBoard! And we all really liked it.
|
| So I guess this is a way of saying thanks for RB! I still think
| of it fondly from time to time. And because of it, Code Review
| has always just been a huge part of my career, almost since day
| one (and I could still do more of it.)
| chipx86 wrote:
| I love that! Thanks for sharing that with me :) It made my
| day. Looking forward to the second part! If you're on really
| any of the current social platforms, I'd love to connect. I'm
| chipx86 everywhere.
|
| I still develop Review Board full-time with a small team :)
| The development world has changed a lot, and much of the
| world has converged on GitHub (it's a hard market to be in
| right now), so we still look for opportunities to build
| review capabilities that target problems people have that
| aren't being addressed elsewhere.
|
| Just as an example, we launched PDF Review and diffing a
| while back, which we see companies use for things like
| industrial designs and schematics. It's neat, we actually
| diff two rendered PDFs without just converting them into text
| files first, like you usually see. Following that up with
| full Office Document Review soon.
|
| Dark mode _finally_ shipped earlier this year (I should write
| about that endeavor sometime). And there 's a couple of
| super-neat code review capabilities we've come up with that
| nobody's doing, which I'm keeping under wraps for next year.
| I think they're going to be pretty awesome.
| sesuximo wrote:
| If someone brought back phabricator i think they'd make easy $
| roryokane wrote:
| Phorge (https://phorge.it/) already exists. It's a still-
| maintained fork of Phabricator.
| deathanatos wrote:
| Yes! This is what I imagine in my head as a real code review
| style, not the stuff Github does. Glad to have a name for it.
|
| I'd add I'd also like my review system to be able to kick patches
| "out" of the review once they're ready. E.g., the small bugfixes
| that you make while working on that bigger feature should
| hopefully be small, isolated patches, ones that are going to find
| consensus with a reviewer quite quickly. Once that happens ... I
| want to just eject that patch from the entire series & cherry-
| pick it to main, and rebase the review on the new HEAD. (Or,
| differently, rebase the latest version of the patch series on
| main, reordered such that the agreed patch is fist, and then FF
| main onto that agreed upon patch.)
|
| Essentially, narrow the scope of the review to "the part we're
| still talking about", but let bug fixes see merging as soon as
| they're ready.
|
| The argument I'd have against this is "just make that a separate
| review/PR". But then you get into the hairiness of patchset A
| depends on patchset B, until B is merged, then it just depends on
| main.
| zeotroph wrote:
| That's exactly what Gerrit can do. When you push an x-b-c-d-e
| chain, these show up stacked in the UI, but you can easily
| cherry-pick b onto main (see that the CI passes, and the usual
| review), and rebase everything on top of that. If it is x, the
| bottom one, you can directly submit it and continue with the
| others.
| lolc wrote:
| All this rebasing sounds like constant pain to pull from
| Gerrit. Does it actually create new branches v2 v3 after a
| rebase? Or how do I switch my local checkout to the rebased
| branch from remote?
| aseipp wrote:
| I keep saying this over and over but, Gerrit basically does
| that. :) You can see the relationships between any two patches
| on Gerrit, and more importantly, Gerrit shows you _each patch
| individually_. So you can see in a series A - > B -> C that
| yeah, B is small, let's go ahead and get that in.
|
| Part of this is that UX has some really smart ideas like the
| "Attention Set". The attention set is basically "Which people
| need to take the next step?" Like a turn-based game. So, if you
| just did a review, you're not in the attention set for that
| patch anymore -- the _author_ is.
|
| That means Gerrit puts it down at the bottom of your queue in
| the UX. And what's at the top of the queue? Things where _you_
| are in the attention set! So it naturally groups things this
| way.
|
| I didn't get into all the other really annoying papercuts with
| GitHub's UX, but even the pull request listing is worse than
| the alternatives. How do you know what state _anything_ is in?
| You don 't, you have to go read the whole thing.
| zeotroph wrote:
| Attention Set <3 - that alone is worth another post. Gerrit
| really is one of the best kept dev secrets, and if you never
| had the luck of seeing it in person at a company where you
| worked at, well...
|
| Makes me wonder what other git or dev-in-general blindspots I
| have.
| ankitdce wrote:
| Yeh, Attention Set is a game changer. We (Aviator) also
| took inspiration (ahem.. copied) attention set from Gerrit:
| https://docs.aviator.co/attentionset
| deathanatos wrote:
| I guess I missed it in your article, and I've never had the
| opportunity otherwise to use Gerrit. (Since Github is
| essentially so pervasive. I've only used that, Gitlab, and an
| internal review system that didn't do interdiff.)
| SlySherZ wrote:
| Do you know how to create a local branch that tracks a gerrit
| commit? Usually I just do git commit --amend to update
| gerrit, but then I lose access to my patch's history (it's
| still on gerrit, but I want it locally).
| pronoiac wrote:
| I'm midway through, but a nitpick:
|
| > You're on your own figuring out the Commit IDs and punching
| them into the URL bar if you want something more granular.
|
| There's a "commits" tab at the top, like: Conversation |
| [Commits] | Checks | Files changed
|
| Example: https://github.com/raspberrypi/linux/pull/6330/commits
| aseipp wrote:
| I more meant that it's hard to diff between _arbitrary_ commits
| without using the URL bar. The results are also... Weird. For
| example, let 's say you have the base B, and commit X:
| B ---> X
|
| Now someone pushed to main, so you rebase on B'
| B' ---> X
|
| Now, you modify X to address something (maybe just a spelling
| error) B' ---> X'
|
| Now you push the new rebased branch. Question: how do you view
| the difference between X' and X?
|
| Well, you have to use that little "Compare" button, but there's
| a really big problem with it: it shows you the diff from X to
| X' _and_ the diff from B to B ' at the EXACT same time. Which
| is really bad! Imagine if the difference between B and B' is
| 500 lines; it will completely dwarf the 1 line typofix from X
| to X', making it impossible to read.
|
| Now, this is kind of a problem in Gerrit too. But they use a UX
| technique to make it manageable, which is very smart: they
| color-code the lines of the diff, depending on if the diff
| comes from B..B' or from X..X' -- so you can see _at a glance_
| if the hunk is relevant.
|
| More broadly, interdiffing between commit X and commit Y can be
| tricky, because what you _really_ want to do is something like
| "Rebase Y onto the parent of X, then diff X and Y", because
| otherwise you get the included differences between their
| baselines. (We do "Rebase Y on X's parent" for Jujutsu's
| "interdiff" command IIRC?) But sometimes you DO want to include
| the base diff, because the changes from B..B' can be VERY
| relevant to your patch.
|
| So, you need all these options, really. But once you go off
| this beaten path where you want to compare X to Y... yeah, you
| have to start typing into the URL bar, I think.
|
| But I will say, the commits tab and the little n/p keyboard
| shortcuts to "flip through the commits" like book pages is at
| least a HUGE improvement over the basic UX though. I use that
| all the time on GH projects these days, even if I have tons of
| other problems.
| skydhash wrote:
| At my last job, the workflow was mostly merge based instead
| of rebasing. After you create a branch, you merge from
| develop of there's anything new you want or to resolve
| conflict. And the whole PR will be squashed and merged
| (there's an ID referring to the ticket). Force pushing to an
| open PR was frowned upon.
|
| I think it reflected the email based approach a bit better as
| you can't alter the first patch you sent, only send new ones.
| So even if the history of a PR is messy, we preserve its
| chronological aspect alongside the discussion.
| strken wrote:
| This is interesting. At work we use PRs like the author uses
| commits, and in fact we squash-and-merge them at the end, but our
| approach requires rebasing the later PRs whenever we make a
| change to the earlier PRs. This can be quite laborious, falls
| afoul of the "don't force-push" rule, takes a long time for
| engineers to learn, and tends to break existing code review
| comments in the GitHub interface, but works out okay for two or
| three PRs. In our workflow a commit is less a unit of work and
| more a savepoint.
|
| I also use heavily use stashes, as well as undo-tree-mode in
| Emacs. This means we have four different ways of tracking the
| history of source code, which sounds redundant but works out okay
| in practice.
|
| The ergonomics of doing this in Git are pretty bad. I found
| Phabicator better but still unnecessarily difficult. Perhaps a
| new source control management tool could have first class support
| for higher level concepts than just commits and branches, or
| perhaps that would be even worse to use.
| steveklabnik wrote:
| > Perhaps a new source control management tool could have first
| class support for higher level concepts than just commits and
| branches, or perhaps that would be even worse to use.
|
| You should check out jj, sapling, or mercurial.
| eddd-ddde wrote:
| +1 for jj
|
| It simplifies my workflow a lot when compared to plain git. I
| love not having to track and stage manually, plus undo is
| godsend.
| rfoo wrote:
| If you are stuck with git, try git-branchless. At least it
| makes rebasing your patch stack easier (git sync && git
| submit).
| keybored wrote:
| Git-branchless gave me thousands of refs that a simple
| deinit wouldn't delete. It's so heavyweight for just
| experimenting with a few new workflows here and there.
|
| I can't recommend Git-branchless until it either tells me
| what it is about overall or figures it out for itself.
| steveklabnik wrote:
| jj (and sapling, in my understanding) uses git as a
| backend, so you can use it even if you're stuck with git.
| Not that I think git-branchless is bad, mind you, but just
| to be clear about it.
| keybored wrote:
| You use commits as PRs. This complicates things because of PR
| dependencies.
|
| Why not a better tool than Git? Sure, I would like a more user-
| friendly tool as well. Less warts. But what about the stopgap
| solution of simply treating commits like commits instead of
| making things harder for yourself?
|
| (I wonder how much GitHub is to blame for giving people these
| PR-colored glasses)
| steveklabnik wrote:
| Given that you can use git with Gerrit, I am confident in
| placing the blame on GitHub here.
| codeapprove wrote:
| Love the blog post, it's great to see people actually thinking
| about how code review should work!
|
| I've used four different code review systems extensively, all
| with different strengths and weaknesses: Critique (Google
| internal), Gerrit (at Google, but same as external), GitHub
| (duh), and CodeApprove (the one I built).
|
| Critique was far and away the best, but it only works because
| it's perfectly fit to Google's monorepo and the custom VCS
| they've built as well as all of their custom lint/test tooling. I
| designed CodeApprove to bring as much of that as I could to
| GitHub, but it will never really be close.
|
| Gerrit was the second best in terms of the reviewer experience
| ... but as an author I always hated it. It just seemed to be so
| author-hostile. There were more wrong ways to do something than
| right ways. And the UI is not exactly beautiful.
|
| GitHub is extremely author friendly, it works how we think. You
| write code, you get feedback, you write more code, etc. If you
| squash and merge at the end of a PR you don't have the history
| problems the author mentioned. It's not very reviewer or team
| friendly though. Incremental diffs are not highlighted. Diffs and
| conversation are in different tabs. Force pushes and rebases
| destroy history. Comments are lost as "outdated". You can't
| comment on files outside the diff window. Large files are hidden
| by default, etc etc. They clearly don't care about this too much
| and maybe they know something I don't.
|
| In the end, the thing I find most frustrating is how many teams
| just accept whatever code review tool is built in to their VCS
| platform. That would be like using whatever IDE shipped with your
| laptop! There are so many better options out there today. My
| favorites (besides CodeApprove) are GitContext, Reviewable, and
| Graphite but I can name half a dozen other excellent choices.
| Don't accept the defaults!
| another-acct wrote:
| > GitHub is [...] not very reviewer or team friendly though
|
| The problem is that reviewers / maintainers are much scarcer
| than contributors. Workflows and UIs should optimize for
| reviewer throughput, IMO.
|
| (I've never used the three other tools you mention, so my
| argument is general.)
| mstachowiak wrote:
| It's always exciting to see new approaches to code reviews -
| GitHub has its strengths, but it's far from perfect.
|
| For the scenario you've outlined, have you thought about
| splitting the 3 patches into separate, dependent pull requests?
| While GitHub doesn't natively support this, the right code review
| tool (shameless plug - I'm part of a team building one called
| GitContext) should allow you to keep pull requests small while
| maintaining dependencies between them. For example, patch 3 can
| depend on patch 2, which in turn depends on patch 1. The
| dependency tracking between them - provided by the code review
| tool - can ensure everything is released in unison if that's
| required.
|
| Each patch can then be reviewed on its own, making feedback more
| targeted and easier to respond to. You can even squash commits
| within a pull request, ensuring a clean commit history with
| messages that accurately reflect the individual changes. Better
| still, with the right tool, you can use AI to summarize your pull
| request and review, streamlining the creation of accurate commit
| messages without all the manual effort.
|
| A good code review tool also won't get bogged down by git
| operations like rebases, merges, or force pushes. Reviewers
| should always see only the changes since their last review, no
| matter how many crazy git operations happen behind the scenes.
| That way, you avoid having to re-review large diffs and can focus
| on what's new. The review history stays clean, separate from the
| commit history.
|
| I'd be curious if this approach to splitting up pull requests and
| tracking their inter-dependencies would address your needs?
| fHr wrote:
| Why not just do good old mergetrains with pullrequest A points
| to branch B amd then B points to master, merge B into master
| and thereafter point A back to master or am I missing the
| point?
| steveklabnik wrote:
| This is called "stacked diffs" and it's a good workflow; the
| issue is that it's annoying to use on GitHub without tooling.
| The "point A back to master" bit isn't easy/obvious with pull
| requests.
| keybored wrote:
| From the peanut gallery of HN I've never understood Stacked
| Diffs. It looks like they reinvented commits as dependent
| PRs. Which are stacked on top of each other like commits
| are.
| steveklabnik wrote:
| Fun fact: part of the reason that this article is on HN,
| I believe, is because I linked it to someone on another
| site as a means of explaining stacked diffs.
|
| > It looks like they reinvented commits as dependent PRs.
|
| Sort of kind of. It really depends on what you mean by
| PR: if we're talking about "review this branch please,"
| which is what I would argue most mean by PRs, then yes,
| in the context of "stacked PRs for GitHub" it's largely
| about tooling that makes dependent PRs easier.
|
| But there are other, non-GitHub tools. With those tools,
| you don't say "here's a branch, review it please," you
| say "here is a stack of commits, review them please."
| There's no branch going on inside. It's just a sequence
| of commits. This matters because it centers the commit as
| the unit of code review, not a branch. This also means
| that you can merge parts of the stack in at different
| times: to use the example from the article, once "small
| refactor" is good to go, it can be landed while "new API"
| is awaiting review. etc.
|
| I think it takes actually using some of these tools to
| really "get it." I never understood them either, until I
| actually messed around with them. I am currently on my
| project solo, so I don't really stack at the moment, I
| think it really helps more the larger of a team you're
| working with is.
| keybored wrote:
| Oh hey, thanks for the explanation! I've been wondering
| about this for a while. The linked articles on HN tended
| to be heavy on arguing how unlike the workflow is to what
| "you are used to" that the description of what it was
| about got obfuscated.
|
| I've used Git with email a little bit which also lets you
| review commits in isolation. It's too bad that so many
| review tools bury the commits (ask me how many times
| someone at work has asked "what this is about" on the PR
| diff when the relevant commit message explains exactly
| that).
|
| But what I like about email is that the whole series/PR
| also gets reviewed as a unit. Both worlds.
| steveklabnik wrote:
| You're welcome. Yeah, it as a flow is much closer to the
| email workflow than the PR workflow, when you commit to
| it.
| codethief wrote:
| As mentioned elsewhere in this thread, this is also the
| approach that Sapling follows.
|
| As for GitContext, how do you keep track of commits across
| fixups, rebases, reordering, etc.?
| crabbone wrote:
| What's the point of keeping track of commits? I honestly
| never understood people wanting that. Is this for some kind
| of weird accounting / social-score system where the number of
| commits decides your yearly bonus?
|
| It's useful to see how the system evolved (because you might
| want to go back a bit and redo the newer stuff), but it's
| pointless to see the mistakes made along the way, for
| example, unless you have some administrative use for that.
|
| Similarly, if a sequence of commits doesn't make sense as
| committed, but would make better sense if split into a
| different sequence: then I see no problem doing that. What's
| the point of keeping history in a bad shape? It's just harder
| to work with, if it's in a bad shape, but gives no practical
| advantages.
| torarnv wrote:
| As far as I know, splitting the series into individual PRs only
| works if you have commit rights to the repository, so you can
| base one PR on a different branch (in the main repository) than
| main.
|
| As an outside contributor, with a fork of the repository, your
| three PRs will incrementally contain change A, A+B, and A+B+C,
| making the review of the last two PRs harder, because you need
| to review diffs for code you're already reviewed in another PR.
| meling wrote:
| Not sure about the fork workflow but otherwise it is possible
| to change the base branch (manually on GH's web interface) so
| that you don't have to see the original branches and review
| the changes from A to B and from B to C. Maybe this is not
| possible with fork workflows?
| dogleash wrote:
| > It's always exciting to see new approaches to code reviews -
| GitHub has its strengths, but it's far from perfect.
|
| This is nice sentiment, it's positive reception to an idea and
| polite to the incumbent.
|
| But it's so thoroughly not a new idea. It's literally the
| workflow git was designed to support, and is core to many long-
| standing criticisms about GitHub's approach for as long as
| GitHub has had pull requests.
|
| And I'm over here wondering why this idea took _*checks
| calendar*_ over 15 years to graduate from the denigrated
| mailing list degens and into hip trendy development circles.
|
| I thought we were knowingly choosing shit workflows because we
| had to support the long-standing refusal by so many software
| devs to properly learn one of their most-used tools. That's why
| I chose the tools I chose, and built the workflows I built,
| when I migrated a company to git. Nobody gets fired for buying
| IBM after all.
| aseipp wrote:
| I mean, the answer is simple. Even if email-based flows use
| range-diff, which is the correct _conceptual_ model, all the
| actual details of using email are, I would estimate, about
| 1,000x shittier in 2024 than using GitHub in 2008 when I
| signed up for the beta as user #3000-something.
|
| Email flows fucking suck ass. Yes I have used them. No, I
| won't budge on this, and no, I'm not going to go proselytize
| on LKML or Sourcehut or whatever about it, in Rome I'll just
| do as the Romans even if I think it sucks. But I've used
| every strategy you can think of to submit patches, and I
| can't really blame anyone for not wading through 500 gallons
| of horrendous bullshit known as the mailing list experience
| in order to glean the important things from it (like range-
| diff), even if I'm willing to do it because I have high pain
| tolerance or am a hired gun for someone's project.
|
| Also, to be fair, Gerrit was released in 2009, and as the
| creator of ReviewBoard (in this thread!) also noted it
| supports interdiffs, and supported them for multiple version
| control backends, was released in 2006! This was not a
| totally foreign concept, it's just that for reasons, GitHub
| won, and the defaults chosen by the most popular software
| forge in history tend to have downstream cultural
| consequences, both good and bad, as you note.
| dogleash wrote:
| Dude, I'm not making a defense of mailing list workflows
| here. I'm just pondering the nature of the world where
| despite all the yapping about git I've seen floating around
| on the internet for as long as I've been lurking social
| media, the yappers are just recently keying in on
| something.
| aseipp wrote:
| If you're asking "Why did this take 15 years for people
| to understand" and my reply is "Because it was under 1000
| layers of other bullshit", then that's the answer to your
| pontification. It has nothing to do with whether you
| think email is good or not. You pondered, I answered.
| That simple.
| another-acct wrote:
| > Because it was under 1000 layers of other bullshit
|
| Not only because of that.
|
| git-range-diff, while absolutely a killer feature, is a
| relatively new feature of git as well (a bit similarly to
| "git rebase --update-refs" -- which I've just learned of
| from you <https://news.ycombinator.com/item?id=41511241>,
| so thanks for that :)).
|
| Namely, git-range-diff existed out-of-tree as "git
| tbdiff" <https://github.com/trast/tbdiff> originally. It
| was ported to git proper in August 2018
| <https://github.com/git/git/commit/d9c66f0b5bfd>; so it's
| not a feature people could have used "15 years ago".
|
| (FWIW, before git-range-diff was a thing, and also before
| I had learned about git-tbdiff, I had developed a silly
| little script for myself, for doing nearly the same.
| Several other people did the same for themselves, too.
| Incremental review was vital for most serious
| maintainers, so it was a no-brainer to run "git format-
| patch" on two versions of a series, and colordiff those.
| The same workflow is essential for comparing a backport
| to the original (upstream) version of the series. Of
| course my stupid little script couldn't recognize
| reorderings of patches, or a subject line rewrite while
| the patch body stayed mostly the same.)
| another-acct wrote:
| > all the actual details of using email are, I would
| estimate, about 1,000x shittier in 2024 than using GitHub
| in 2008
|
| Disagree about "all". Tracking patches in need of review is
| better done in a good MUA than on github. I can suspend a
| review mid-series, and continue with the next patch two
| days later. Writing comments as manually numbered,
| plaintext paragraphs, inserted at the right locations of
| the original patch is also lightyears better than the
| clunky github interface. For one, github doesn't even let
| you attach comments to commit message lines. For another,
| github's data model ties comments to lines of the
| _cumulative diff_ , not to lines of specific patches. This
| is incredibly annoying, it can cause your comment for patch
| X to show up under patch Y, just because patch Y includes
| _context_ from patch X.
|
| Edited to add: github also has no support for git-notes.
| git-notes is essential for maintaining patch-level
| changelogs between rebases. Those patch-level changelogs
| are super helpful to reviewers. The command line git
| utilities, such as git-format-patch, git-rebase, git-range-
| diff, all support git-notes.
| juped wrote:
| Nope, none of it was knowingly done, and plenty of teams are
| almost trivially convertible to the normal workflow, even
| without inventing a buzzword like TFA did!
|
| Though plenty aren't. I get it. (But one of the magic phrases
| that really works well is "this is what git, itself, does,
| and there's a man page installed on your system at this very
| moment explaining it")
| keybored wrote:
| The author already has a branch and its commits. What then
| would be the purpose of splitting the branch into _three pull
| requests_ with one commit each?
|
| - Each commit can be reviewed on its own
|
| - Dependency tracking between commits? Yes...
|
| - They don't have AI built-in (a good thing)
|
| - You _can_ have an interdiff between commits unlike PRs
|
| Commits are the bread and butter of Git. Just use commits.
| EPWN3D wrote:
| Is it just me or is the author just arguing for fixup commits and
| squashing them, something GitHub and Stash/Bitbucket handle just
| fine? I don't see how the idea of "publish a new version of the
| three commits" is new with or exclusive to Gerritt.
| wodenokoto wrote:
| When doing code reviews, I think it is annoying that every time I
| comment on a line, PR author gets a notification.
|
| This is not a simultaneous, real-time thing.
|
| I'm in the middle of doing my review, and my comments are not
| ready to be read. Maybe I'll change my mind on my comment on line
| 8 when I reach line 80.
| globular-toast wrote:
| GitLab lets you compose the entire review before submitting it
| so the author won't see anything until you're done.
| djur wrote:
| Github does, too. You just click "Start a review" when
| filling out your first comment.
| senko wrote:
| GitHub as well.
| aseipp wrote:
| GitHub lets you do this, but _only_ when writing a review for
| someone else. Not when _addressing_ someone else 's review, in
| which case your complaint 100% stands.
|
| And yes, it absolutely drives me nuts, honestly. Why can I
| batch review comments, but not resolutions! Another thing
| Gerrit gets right! GitHub... Please...
| another-acct wrote:
| ... Another thing that mailing list-based development gets
| right ;) Most MUAs should know about a concept called
| "Drafts". I can have as many draft messages concurrently as I
| want, I can work on them over several days, and I can send
| them out (in practice) as a batch.
| globular-toast wrote:
| GitLab supports this. Every time someone pushes _or force pushes_
| it tags that as a version which you can diff. If your developers
| know how to generate new commits then you can do it right away
| with GitLab.
|
| The problem is generating the new commits. Developers just aren't
| very good at doing this. They can modify a single commit just
| fine, but modify a commit that isn't the latest commit involves a
| rebase.
|
| Magit has the "instant fixup" option which is basically like
| amending an arbitrary commit instead of just the latest. What is
| actually doing is doing a commit with `--fixup` then `rebase
| --autosquash`. This technique can be used manually. Fixup/squash
| commits should be part of all developers' toolkits.
| zeotroph wrote:
| > Developers just aren't very good at doing this.
|
| GitHub provided a way to contribute, but also to avoid learning
| to rebase, thus making it more welcoming to devs who only know
| about commit and pull - that is what made it so popular. The
| squash then rebase or merge step is done on server side. Plus
| it has a very "harmless" UI, but that hides a lot of details
| (patchsets) and the layout wastes so much space imo.
|
| This also means devs could avoid learning more about git, and
| this lowest common denominator git workflow makes it so
| frustrating for those of us who learned git all the way. I
| can't even mark a PR as "do not squash" to prevent it being
| merged in the default way which throws out all history.
| globular-toast wrote:
| Yeah, this annoys me too. I actually find the "forced merge"
| (`--no-ff`) style even worse because you can never tell if
| you're looking at "real" commits or just crap because they
| couldn't be bothered to rebase.
|
| Must say, though, I think "history" is completely the wrong
| way to think about version control. It's not about tracking
| history, it's about tracking versions. History is the crap,
| unrebased commits. Rebasing turns the history (throwaway,
| works in progress) into versions.
| kspacewalk2 wrote:
| Rewriting history and breaking N sloppy commits into M well-
| thought-out, logical commits is an essential git-based version
| control skill for developers. Thus, interactive rebase should
| be considered essential for anyone using git for anything non-
| trivial. It's TUI-like interface is a bit quirky for some
| people, but it is rock-solid once you figure it out and
| therefore worth investing a bit of time into learning. (That
| also describes git in its entirety quite well).
| cryptonector wrote:
| > Interlude: Can you please just tell me if git rebase is evil or
| not so that we can derail the entire discussion over it?
|
| Ha, that's funny. But yes, please we need interdiffs in GitHub
| and GitLab. I want to have my PRs/MRs always rebased and I don't
| want "fix code review comments" commits.
| crabbone wrote:
| Why would you want that in GitHub or Gitlab? :| The Web
| interface is atrocious no matter what you are using it for.
| Just the fact that you need to edit text in a dysfunctional Web
| editor negates all the benefits of any workflow you can
| imagine...
|
| The most natural way to deal with PRs is to use your editor you
| use to write your code. Here you have all the tools necessary
| to navigate the code, to look up the history of the change, to
| run tests and so on.
|
| This is just not the part that needs to be done by a Git
| hosting service. Having to do it through a Web interface just
| feels like random punishment...
| lynguist wrote:
| I did code reviews on Azure devops and on Bitbucket. I noticed
| that Azure devops shows exactly this diff soup, and Bitbucket
| shows interdiffs, and comments do point to previous points in
| time etc. It is much better from both points of view.
| senko wrote:
| I usually deal with "isolated patch series followed by a bunch of
| fixups" by letting them pile on top, then rebasing just before
| merging (the setting is usually feature branches all branched
| from main so that's fine).
|
| It is extra work and not everyone appreciates the benefits, so
| it's hard to convince coworkers to do the same.
| codethief wrote:
| Nice article, it captures the issues I've had with code reviews
| very well. I'm just not sure the "pairwise diff" would work well
| in practice. Sometimes you do forget a change which should be
| separate commit (in between the existing commits), etc.
|
| I recently got introduced to Sapling, specifically to the
| approach of never merging more than a single commit and also
| doing code reviews commit by commit. I like that idea even
| better!
|
| Of course with existing tools like git & GitHub this seems rather
| difficult to implement, but Sapling automatically keeps track of
| how commits change across fixups & rebases, and it also handles
| entire "stacks" of merge requests / commits.
| snatchpiesinger wrote:
| Another way to look at this that this is 3 linearly dependent PRs
| masquerading as one. Make each a distinct PR and the problem goes
| away, especially if you can mark the PR to depend on another one
| (on Gitlab you can, not sure about Github). If you want to see
| each change as a single logical unit, then they will each be a
| distinct merge commit on your master branch, use `git log
| --first-parent-only master` to only see these kind of changes.
| zeotroph wrote:
| > [GitLab] you can mark the PR to depend on another
|
| How much user interaction does that require, and how is this
| visualized in the review UI? Gerrit creates this dependency
| with a single `git push`.
| snatchpiesinger wrote:
| It's a bit cumbersome, and I think only recently you can make
| longer dependency chains. It's certainly not automated away
| with just git commands, but maybe there there is a Gitlab API
| way. The only way I know is to "edit" the PR (or MR in Gitlab
| speak) and paste the URL into some "depends on" field, then
| save.
|
| There are certainly other problems as well, like you might
| have an MR 1 from feature1 to master, and MR 2 from feature2
| to master which in turn depends on MR 1. Most likely your
| feature2 branch is off your feature1 branch, so it contains
| feature1's changes when compared to master, and that's what
| is shown in the Gitlab review UI. This makes reviewing MR 2's
| changes in parallel to MR 1 frankly impossible.
|
| Having said that, I still think that this would be the right
| way to organize this kind of work, however Gitlab's execution
| is not great, unfortunately. Any of this is probably
| impossible in Github too. I wonder if Gerrit gets this right,
| I have no experience with it.
|
| edit:
|
| One interesting point of MR dependencies in Gitlab is that I
| think you can depend on MRs from other projects. This is
| sometimes useful if you have dependent changes across
| projects.
| rfoo wrote:
| I usually just point the target branch of MR 2 to MR 1.
| After merging MR 1 GitLab automatically change it to the
| default branch so it's more or less okay.
|
| However this makes updating these MRs very rebase heavy and
| as said in OP it is hostile to reviewers.
| aseipp wrote:
| > Another way to look at this that this is 3 linearly dependent
| PRs masquerading as one.
|
| The first commit in the example is not dependent on anything
| else and may not exist at all. Rather its existence is illusory
| to show that sometimes when you write commits, a few things
| happen at once. You might just churn out a doc fix, a bug fix,
| and a small other thing all at once. It's just the nature of
| the work.
|
| > (on Gitlab you can, not sure about Github)
|
| You cannot do this on GitHub without write access to the
| repository, so it's effectively a non-option for anyone who is
| not a committer in an open-source context. Don't ask me why
| this limitation exists.
|
| If you do have write access, you can kind of do some similar
| things like open N pull requests where each PR 1...N has
| commits 1...N. Then you do something like "Read every PR, then
| merge only the last one which contains all N commits, and close
| all the others." Weird but OK, I guess.
|
| Still, organizing this is a pain, and I don't think GitHub
| really emphasizes it -- and also rebasing the dependent
| branches really requires you to use something like --update-
| refs to make it sane at all. So, you can also use tools like
| <https://github.com/spacedentist/spr> in order to organize it
| for you. Not the end of the world considering everyone uses 50
| different Git wrappers, but not ideal.
| Vinnl wrote:
| I'm using mostly this workflow with GitHub, with the main
| disadvantages being that it's more work on my side, and not
| obvious to my collaborators. But it does carry the same
| advantages of allowing reviewers to view diffs with just their
| feedback incorporated, without breaking `git blame` and `git
| bisect`.
|
| When I incorporate a reviewer's feedback, I'll commit that with
| `git commit --fixup <hash of commit that I want to update>`. I'll
| then push that up and leave a comment reply to the review
| feedback sharing the fixup commit hash.
|
| Then when the PR is approved and I'm about to merge, I'll do
| git rebase --interactive origin/main --autosquash
|
| This will then combine the fixup commits with the correct
| original commits. I then do a final `git push --force-with-lease`
| and merge it. (Make sure to note force push before the review is
| done, because then reviewers lose the ability to see what you
| added since their last review.)
|
| This relies heavily on autocomplete in my terminal, so that I
| only have to type `git re<right arrow>` to get to that long
| command above, for example. And it's a bit clunky, so using a
| tool that supports and encourages this workflow would be nice.
|
| But given that I'm stuck with GitHub, it's OK.
| Piraty wrote:
| https://news.ycombinator.com/item?id=37086022
| tome wrote:
| Wow, my jaw is on the floor. This is such a good idea! I
| already had the idea of tracking issues in the same repo as
| the code -- not that I actually use that idea -- but I didn't
| have the idea of doing PRs in the repo itself. Love it!
| mizzao wrote:
| I was going to say... interactive rebase addresses a lot of the
| "diff soup" comments that the writer complains about. It's
| really only done by disciplined engineering teams though (who
| bother to learn some more advanced features of git)
| aseipp wrote:
| Interactive rebase is fine, I used it probably 50x a day for
| many years. The problem isn't really with Git here, it's more
| a bunch of interrelated problems with GitHub's UX, and the
| fact that many people culturally have never approached the
| problem in any other way. A lot of places basically just
| outright copy GitHub's UX, which I think is a good example of
| this.
|
| For example, using 'git absorb' or autosquash doesn't solve
| the problem of the review UX just doing badly when you say
| "What is the difference between the last version of commit Y,
| and the new version." If those changes include a rebase, it
| shows you the entire diff including the rebase. That's often
| not good, but sometimes extremely necessary. (See my other
| posts in the thread about how Gerrit does this better.) And
| if you have to do something like rebase, then solve a
| conflict as a result of the rebase, it is basically
| unavoidable that you have to do all of them at once if you
| want the CI to stay clean (which I find important, at least.)
|
| Also, a lot of teams just culturally hate shit like
| autosquash, or any squash/rebasing at all. Do I think they're
| misinformed? Yes. Do I think they could use tools that make
| those flows 1000x better and faster? Yes. But at the end of
| the day default user interface and design philosophy of one
| of the most popular software forges in the world _does_
| actually matter and have downstream cultural impact. So you
| have to interrogate that and break it down for people. GitHub
| does not really encourage, explain, or smooth out workflows
| like this. People learn it, then hate anything else. (I mean,
| fucking hell, Git rebase didn 't even have --update-refs, a
| huge QOL improvement for stacked diffs, until like 2022!) So,
| you often don't have a choice. I've had many places where
| engineers despised GH force pushes, but only because...
| GitHub's UX is bad at tracking addressed/reviewed comments on
| a PR after a force push! They get hung by their own noose, so
| to speak, and don't even know it.
|
| I do have other problems with Git, but this post is mostly
| about GitHub/Gitlab style PR review flows. The same problems
| still happen even with autosquash or tools that accomplish
| the same thing.
| jrochkind1 wrote:
| I realized reading the first part of this article that I
| often want to set up a _sequence_ of PR 's, for very similar
| reasons as in the begininng of OP. Say, a prefatory refactor,
| then the main work, then some data cleanup.
|
| When I do this, the problem with rebase is that it kind of
| breaks the additional "next in sequence" PRs "on top", or at
| least requires (confusing to me) cleanup in all of them when
| I rebase the base.
| shambulatron wrote:
| Have you come across `git rebase --update-refs`? This
| automatically moves your "intermediate" branches during a
| rebase and sounds like it could be useful in your
| situation.
| jrochkind1 wrote:
| I have not! I'll try to check it out, thanks!
|
| The git cli still scares me when I get off the familiar
| path. :(
| aseipp wrote:
| Yes, fixup commits are a good way to approach this, though I
| don't personally like them, though. I think Sapling's "absorb"
| command which works on an underlying SCCS weave to
| automatically absorb changes into the relevant diff is much
| more elegant. It prompts you with an interactive UI. (For me,
| it sorta falls into the same space as rebasing a series that
| has multiple branches on it. You need --update-refs for that.
| Because otherwise it's like, why am I, the human, doing the
| work of tracking the graph relationships and manually punching
| in the commits, and moving the branches, and doing all this
| bullshit? Computers are good at graphs! Let them handle it!)
|
| There is also a `git absorb`, but it isn't as robust as
| Sapling's implementation[1].
|
| Really the problem isn't interactive rebase or not. It's mostly
| a problem of the UX of the review tool itself more than
| anything, and the kind of "cycle" it promotes. I mentioned it
| elsewhere here, but fixup commits for example still won't solve
| the problem of GitHub showing you diffs between baselines, for
| example, which can absolutely ruin a review if the baseline is
| large (e.g. you rebased on top of 10 new commits.)
|
| I do have problems with Git's UX beyond this, but the original
| post is mostly a gripe about GitHub.
|
| [1] There is an example in this GitHub issue that captures the
| difference between the two underlying algorithms:
| https://github.com/martinvonz/jj/issues/170
| faangguyindia wrote:
| Biggest problem working with Git is remembering all these flags
| and commands.
|
| If you do it everyday for years, sure you remember most of
| them.
|
| But for weekend hackers or those who specialise in some other
| areas, it creates a lot of frustration.
|
| I often forget all these flags etc....
|
| My brother made this https://github.com/zerocorebeta/Option-K
|
| This enables me to simply write in thermal, "interactive rebase
| main, auto squash.
|
| And then I hit option+k and it replaces it with the above
| command.
| ipaddr wrote:
| If you are a weekend warrior you are not working with a team.
| Just commit and push.
| keybored wrote:
| You've only mentioned your bros GitHub projects four times in
| the last two days. Time to step it up maybe?
| riquito wrote:
| You can add to your ~/.gitconfig [rebase]
| autosquash = true
|
| from then on `git rebase -i origin/main` automatically reorder
| fixup and squash commits. It's a small thing but greatly
| improved my workflow
| dietr1ch wrote:
| Squashing is nice because it keeps the amount of commits
| minimal, but as mentioned in the article, it has the problem
| that now reviewers have to figure out what changed since
| their last review, which can get hard to do if the tree of
| changes originally proposed needed updates in dependencies of
| the commits they reviewed.
|
| I really like the idea that I'm working on a tree (often a
| stack/list) of changes, and as the review progress I get this
| cross product of iterations over time like [a-better-way-
| interdiff-review-aka-git-range-diff](https://gist.github.com/
| thoughtpolice/9c45287550a56b2047c631...) showcases. In the
| end, the time dimension is useless after things get
| submitted, so the repository only gets the latest state of
| the change tree committed, which is simple and like the auto-
| squashed version you mention, but during review reviewers get
| to see the change tree evolve in an easy way.
| pinkorchid wrote:
| > the problem that now reviewers have to figure out what
| changed
|
| That's something that the review tool can solve, and I
| agree that github doesn't handle it well. But other code
| review tools can show diffs between revisions independently
| of how the commits have been rebased or squashed over time.
| Guvante wrote:
| Git aliases are magical BTW
| outsomnia wrote:
| Guys... look into stgit if you like the sound of this iconoclasm
|
| https://stacked-git.github.io/
| Piraty wrote:
| using github's UI as preferred way to interact with code in
| review is a bad idea, as it encourages lazy from-the-couch-just-
| yolo-approve-it-looks-alright style of review. this is where
| gerrit+mail based workflows shine as reviewer is more encouraged
| to apply the series, compile/run it in their env (which might
| differ from your's); here is an example [0].
|
| here are some useful notes on how to have a purely branch centric
| review process regardless of a webUI: [1]
|
| [0]: https://drewdevault.com/2022/07/25/Code-review-with-
| aerc.htm... [1]: https://news.ycombinator.com/item?id=37086022
| another-acct wrote:
| > mail based workflows shine as reviewer is more encouraged to
| apply the series, compile/run it in their env
|
| not to mention: if the reviewer does this for every version of
| the posted series, on appropriately named (versioned) local
| branches, then they can trivially run git-range-diff between
| adjacent versions!
| kgeist wrote:
| After JetBrains discontinued Upsource, we tried several code
| review tools and it shocked us that many of the tools (both
| commercial and open source) don't have built-in tools to review
| incrementally. It's just just on big soup of code, or you have to
| create new PR's for each small change. Now we have to use
| JetBrains SpaceCode, which still lacks many of the niceties of
| Upsource.
| dusted wrote:
| We're working with gitlab and not a day goes by where I don't
| miss gerrit with it's fast and functional ui, good and practical
| diff viewer, patch-sets, topics, review workflow, and
| manifests...
|
| The thing that gitlab does really well, is making it clear how
| good gerrit is.
| Izkata wrote:
| As someone who has been on a maintenance team for years and
| regularly has to dig through the history to figure things out, I
| strongly prefer the original "bad" version with 7 individual
| commits. Yes "git blame" takes a little bit of extra work to get
| through all the commits, but knowing what initial mistakes were
| made and refactors done makes it much easier to tell what the
| original intent was.
|
| For example, if "fix bob review", "fix alice review", or "minor"
| introduced something that wasn't noticed until later, by having
| them separate we can tell whether it was intended functionality
| or a bug. This has happened to us a whole bunch of times with
| rarely seen edge cases, so the bug wasn't found until years
| later, or some other part of the code was masking the issue so it
| didn't manifest as a bug until years later. At least one of these
| was even caused in a "linting" commit, and all of these were much
| more easily fixable because we could tell the bug was introduced
| in one of these code-review-update commits, rather than the core
| feature commit.
| travisb wrote:
| I concur.
|
| I don't know why people are so happy to throw out the history.
| I find it's not uncommon that bugs are introduced during
| rebasing, and merging in from master. Sometimes this is due to
| incorrectly resolved conflicts, sometimes due to concurrent
| changes elsewhere in the codebase.
|
| With --first-parent it isn't hard to have your cake (full
| history) and eat it too (linear mainline history).
| crabbone wrote:
| This is solving the problem with the wrong tool. What you need
| is documentation. But people working on the project you have to
| maintain didn't write one. So, you are trying to use git blame,
| astrology and ouija board to guess what the project authors
| wanted.
|
| And, while doing so, you are arguing for never cleaning the
| house, keeping all the garbage where it falls. Which will only
| make your situation worse, because the history will bloat with
| a lot of contradictory or false information.
| Izkata wrote:
| > What you need is documentation. But people working on the
| project you have to maintain didn't write one.
|
| Oh there is documentation, but it's old and contradictory.
| Commits are self-organizing by history.
|
| > So, you are trying to use git blame, astrology and ouija
| board to guess what the project authors wanted.
|
| There's no guessing involved. Keep the original commits and
| it's right there in the commit message and order of changes.
|
| Guessing is what happens when you destroy the history by
| using rebases like this.
|
| > And, while doing so, you are arguing for never cleaning the
| house, keeping all the garbage where it falls. Which will
| only make your situation worse, because the history will
| bloat with a lot of contradictory or false information.
|
| I'm not saying "don't refactor" at all. Clean the code up as
| much as you want. The history however is completely hidden
| and not causing any clutter, it's there when you need it and
| invisible the rest of the time, so erasing useful history is
| just busywork for negative benefit.
|
| Any apparent contradictions are also easily resolved because
| commits have timestamps and ordering and are part of the
| larger completely automatically generated history.
| planetpluta wrote:
| I'm confused by how pushing to new branches would work on GitHub
| (or is the point that it doesn't...)? Are you able to change the
| branch of a PR from `v1` to `v2` without making a new PR?
| aseipp wrote:
| Yes, the point is that it basically doesn't support that.
|
| Well, OK. You can push two branches, v1 and v2, each with the
| commits. Then to do pairwise diffs, you type in the commit
| object hashes directly into the URL bar to diff the two objects
| in the repository using the 'blobs' API but like... I don't
| think that qualifies so much as "supporting" it as much as an
| absurd hack, right?
|
| > Are you able to change the branch of a PR from `v1` to `v2`
| without making a new PR?
|
| No, you have to open a whole different PR.
| planetpluta wrote:
| Okay that makes sense. Agreed, wouldn't quite consider that
| "supporting" as far as I'm concerned.
| typeofhuman wrote:
| I just wish I could group related files in the my PR so the
| reviewer has a better way of reading the changes.
|
| It would be like a "PR Journey".
| tuckerpo wrote:
| Looks like the author really wants `rebase -i`
| hoten wrote:
| Some of the problem stated in the post is a little forced, namely
| the issues with bisect and blame. That's only an issue if the
| review doesn't end with a squash.
|
| Also as a user of gerrit via the Chromium project, I'm not aware
| of a way to structure the patch sets uploaded as individual
| changes without infinite foresight. It's always appended into the
| last. Whereas with github, at least you can rebase. I fully admit
| I could be missing a gerrit feature.
|
| That said... I would really love an interdiff feature for the
| GitHub rebase-to-fold-in-feedback-to-meaningful-commits workflow.
| jrochkind1 wrote:
| I have found myself arranging PR's on github in sequence, for the
| exact sort of use case OP talks about. (Each PR might have more
| than one commit, so it's not exactly the same practice). And been
| frustrated that Github's interface does not make it very easy to
| make the sequence apparent or have good DX for the reviewers.
|
| One could definitely imagine a UX/DX that would. (I do apprecaite
| github's UI generlaly).
| loeg wrote:
| This workflow is exactly what Phabricator[1] facilitates, for
| what it's worth. Also, if I remember correctly, ReviewBoard.
| (Though I have not used that in some time and might misremember.
| Also, it has its own flaws.)
|
| Sadly the open source version is unmaintained now. It is still
| used by FreeBSD. Facebook uses it internally for every single
| diff, of millions.
|
| [1]:
| https://secure.phabricator.com/book/phabricator/article/diff...
| swolchok wrote:
| https://github.com/ezyang/ghstack
| clktmr wrote:
| I agree in general, but running git bisect on individual PR
| commits is just doing it wrong. There will always be commits that
| break stuff temporarily. Run git bisect only on the merge commits
| instead, which are typically already tested by CI.
| another-acct wrote:
| > I agree in general, but running git bisect on individual PR
| commits is just doing it wrong. There will always be commits
| that break stuff temporarily.
|
| That's unacceptable in my book. Before submitting any patch set
| for review, the contributor is responsible for ensuring that
| the series builds (compiles) at every stage -- at every patch
| boundary. Specifically so that a later git bisect never fail to
| build any patch across the series.
|
| This requries the contributor to construct the series from the
| bottom up, as a graph of dependencies, serialized into a patch
| set (kind of a "topological sort"). It usually means an
| entirely separate "second pass" during development, where the
| already working and tested (test-case-covered) code is
| reorganized (rebased / reconstructed), just for determining the
| proper patch boundaries, the patch order, and cleaning up the
| commit messages. The series of commit messages should read a
| bit like a math textbook -- start with the basics, then build
| upon them.
|
| Furthermore, the patch set should preferably also pass the test
| suite at every stage (i.e., not just build at every stage).
| Existent code / features should never be functionally
| regressed, even temporarily. It's possible to write code like
| this; it just takes a lot more work -- in a way you need to see
| the future, see the end goal at the beginning. That's why it's
| usually done with a separate, second pass of development.
| keybored wrote:
| As the author is aware of[1][2] the Git project uses this
| interdiff approach with email.
|
| - Patch series (PR equivalent) go through a round of reviews
|
| - Each version has a cover letter (like PR desription) unless
| it's only a one-patch series
|
| - Each version has that optional cover letter with each patch as
| a reply email to it
|
| - The next version is a reply to the previous cover letter
|
| - And each version 2 and above cover letter has a git-range-diff
| in it (courtesy of git-format-patch)
|
| - And also a human/manual summary of changes between versions
|
| - Optionally you can have little comments of each patch that are
| _not_ part of the commit /patch message: just put it between the
| three dashes and the diff. Or use Git Notes and let it handle it
| for you (it will put it in the same space).
|
| In turn there are no "address feedback" commits in the final
| (merged) series. Only the changes themselves.
|
| Just look at any `[PATCH v2 0/...` or higher (v3...) email on the
| mailing list: https://lore.kernel.org/git/
|
| Of course this isn't the easiest workflow:
|
| - Email
|
| - You ought to keep track of the base commit between versions
| (you could have rebased on the main branch)
|
| - You need to store versions of your branches
|
| - You need to keep track of who to CC on the emails. Well,
| perhaps not if they are the same people throughout, but it is
| good courtesy to add people who reply to these versions to the CC
| list
|
| - You need to harvest the email message id on the cover letters
| and use that `In-Reply-To`
|
| Phew!
|
| But this is quite sublime for reviewers and people who come back
| to the series years later:[3]
|
| - All of the review in the same thread overview
|
| - Each version moves the thread to the right
|
| - Each patch (to be commit) is commented on individually
|
| - You can reply to the commit message and the diff by quoting
| them directly
|
| - The contributor will both give you the range diff (which will
| highlight diff changes _and_ metadata changes like edited commit
| messages) _and_ a manual summary of the changes
|
| [1] https://news.ycombinator.com/item?id=41511649
|
| [2] And it is with _some_ trepidation that I bring this up
| because of the aversion some people have to email workflows.
| Because the "interdiff style" of PRs is useful!
|
| [3] Git Notes `amlog` records the message id of the patch email
| where the commit came from
| samatman wrote:
| Posts like this add fuel to the fire of my conviction that the
| future of revision control is Pijul, or something much like it.
|
| I can't completely justify the intuition here, but this seems
| like an artificial distinction forced by a snapshot model of a
| codebase. It's literally about the same patches, but arranged
| differently to give clarity to the history of the project.
|
| So it seems obvious on that level that a system where patches are
| the fundamental stratum has an advantage there, although I
| confess that I have yet to use Pijul for a substantial project,
| so I can't describe in detail how it might make this better. I'm
| just going off things like how a cherry-pick in Pijul applies the
| _actual_ change from one branch to the other, rather than
| duplicating it over.
|
| I get frustrated a bit when I think about the gap between the
| kind of difference a leap forward in revision control could make
| for our profession, and the amount of resources the Pijul project
| has to work with. So I'm just putting this out there: MSFT bought
| GitHub for about 9 billion dollars in 2024 money. Anything which
| displaces it would be worth more than that. That's a lot of
| leverage for anyone forward-looking enough to apply it.
| travisb wrote:
| While patches and snapshots are duals of each other, patches
| are generally less easy to reason about and work with, in no
| small part because compilers and humans work on the snapshot,
| not the patch.
|
| Having worked extensively with patches as the semantic unit via
| patch(1) and quilt and stgit, I'm very skeptical that a VCS
| based on patches is actually superior in most circumstances.
|
| In this particular case, perhaps it would be helpful to view
| code reviews as a snapshot where the differences (possibly
| intra-review revision differences) are highlighted instead of
| as patches.
| zoogeny wrote:
| I've heard people argue for this strategy a few times but I am
| not convinced. Most projects I work on have feature branches that
| get squashed into a single commit (erasing the history of the
| branch). If we even have the case described by the author we
| would just do the three steps (refactor, new api, update) as 3
| commits.
|
| One thing that has been a solid practice for me is to avoid long-
| lived branches. That seems to be what leads to this kind of
| multi-stage commit scenario. Someone wants to work for several
| days (or worse, weeks) on a feature and then drop the whole thing
| all at once. I strongly prefer to move things into main every day
| (or multiple times a day). One way to do this is to use feature
| flags to allow work-in-progress commits to land without causing
| problems. This also has the advantage (if systems are set up
| correctly) to enable on dev and staging environments for testing.
| It is also a hop-and-skip jump away from blue/green rollouts.
|
| I don't want ways to make big commits easier to review. I want to
| force the team to commit small changes early and often. I
| understand not everyone agrees.
| steveklabnik wrote:
| The people I know who prefer stacked diffs argue that it makes
| it easier to integrate changes faster, no matter the size. Part
| of this is because unlike a PR on GitHub, you can land parts of
| a stack: to use the example from the article, if the "small
| refactor" diff is good to go, it can be landed without landing
| the "new api" and "migrate API users" diffs. Centering commits
| rather than branches has the effect of making for smaller
| overall commits rather than long-lived branches.
| zoogeny wrote:
| I'm not sure I understand your exact logic, so let me talk
| through a simplified scenario for arguments sake. Imagine
| each change takes 1 day. Imagine the team releases code to
| production every day. On Monday a developer picks up the
| task, does the refactor, puts up a PR and the refactor is
| shipped Tuesday morning. Tuesday he works on the api and it
| ships Wednesday. Wednesday he finishes the migration which
| ships on Thursday.
|
| In the alternate scenario the entire stacked commits go out
| together on Thursday after the entire 3 days of work is
| completed. So I do not see how it would make it easier to
| integrate changes faster. This problem becomes worse, as you
| can imagine, if the second task in the stack takes a week vs.
| a day. I believe this kind of logic can be extended to any
| timescale.
| another-acct wrote:
| > have feature branches that get squashed into a single commit
| (erasing the history of the branch)
|
| Terrible.
|
| It makes git-blame and git-bisect essentially unusable.
|
| If you have a regression, git-bisect can help you narrow it
| down to a single patch. Because of that, you want to have fifty
| 160-line patches in the git history, for a particular feature,
| rather than one 8000 line patch.
|
| If a given line of code looks fishy, you want git-blame (or a
| series of git-blame commands) to lead you to a 160-line commit,
| with its own detailed commit message, rather than to a
| 8000-line commit.
|
| You also want to preserve the _order_ of the original commits.
| Just reading through the individual commit messages, in order,
| a few years later, can be super helpful for understanding the
| original design. (Of course the original patch set has to be
| constructed in dependency order; it needs to compile at every
| stage, and so on. That 's a separate development step that
| comes on top of just implementing the functionality. The code
| must be _presented_ in logical stages as well.)
| knallfrosch wrote:
| In the future, we'll just have AI group the differences, write
| the summaries and understand the changes.
| quodlibetor wrote:
| I have been chasing the gerrit code review high since I left a
| company that used it almost 5 years ago.
|
| Stacked pull requests are usually what people point to to get
| this back, but this article points out that _just_ stacked pull
| requests don't handle it correctly. Specifically with github, you
| can't really see the differences in response to code review
| comments, you just get a new commit. Additionally, often github
| loses conversations on lines that have disappeared due to force
| pushes.
|
| That said, I have a couple scripts that make it easier to to work
| with stacks of PRs (the git-*stack scripts in[1]) and a program
| git-instafix[2] that makes amending old commits less painful. I
| recently found ejoffe/spr[3] which seems like a tool that is
| similar to my scripts but much more pleasant for working with
| stacked PRs.
|
| There's also spacedentist/spr[4] which gets _much_ closer to
| gerrit-style "treat each commit like a change and make it easier
| for people to review responses" with careful branch and commit
| management. Changes don't create new commits locally, they only
| create new commits in the PR that you're working on. It's,
| unfortunately, got many more rough edges than ejoffe/spr and is
| less maintained.
|
| [1]:
| https://github.com/quodlibetor/dotfiles/tree/main/dot_local/...
| [2]: https://github.com/quodlibetor/git-instafix/ [3]:
| https://github.com/ejoffe/spr [4]:
| https://github.com/spacedentist/spr
| crabbone wrote:
| I really don't like either idea, and don't understand why is this
| such a big issue...
|
| Working with Web interface for anything code-related is a huge
| downside. Workflow doesn't matter at this point, the usability of
| this approach, no matter what it does is so bad, it's not worth
| discussing further.
|
| The best way it ever worked for me is that:
|
| 1. PR author creates a PR and gets assigned a reviewer.
|
| 2. The reviewer leaves a commit with #
| REVIEW(reviewer): Comments
|
| 3. Then the PR author changes something or argues back.
|
| 4. If reviewer is happy, the author gets to organize the commits
| in whatever sequence they want (this would typically involve
| something like squashing everything, removing review comments if
| they are no longer necessary, and then splitting the PR into
| logical parts). Otherwise we go back to (2).
|
| No need for complicated workflows, no need for any kind of
| external system...
___________________________________________________________________
(page generated 2024-09-11 23:01 UTC)