[HN Gopher] When it comes to Git history, less is more
___________________________________________________________________
When it comes to Git history, less is more
Author : maximilianroos
Score : 172 points
Date : 2021-06-26 16:09 UTC (6 hours ago)
(HTM) web link (brennan.io)
(TXT) w3m dump (brennan.io)
| foreigner wrote:
| On this topic, is there a JavaScript lint tool that integrates
| with Git to enable you to gradually apply style changes to code
| as it evolves? E.g. if the change is tabs-to-spaces the lint tool
| would require spaces in new or changed lines of code, but not
| complain about tabs in unchanged lines of existing code.
| jabo wrote:
| You could have the linter only run on changed files in a
| precommit hook, instead of on all files.
| foreigner wrote:
| That would work at file level granularity, but I'd like it to
| work at line level.
| masklinn wrote:
| You can usually take the output of the lint tool and filter
| it by the lines modified by the patch (which you get from
| the diff).
|
| Few tools do it out of the box (in the python world flake8
| supports diff filtering out of the box, and there's diff-
| cover which can filter analysis beyond just test coverage)
| but it's easy enough to do with a wrapper, especially if
| the linter has a configurable or even parseable (e.g. json)
| output option.
| jabo wrote:
| Found an ESLint plugin that will only run on staged
| changes: https://www.npmjs.com/package/eslint-plugin-diff
| surfmike wrote:
| This still leads to frustrating behavior, where changing one
| line of a file might end up reformatting all of it (and this
| often obscures the actual change itself).
| klodolph wrote:
| Tools like git-clang-format only reformat the code you've
| changed, not the entire file. A lot of formatting tools
| work this way (certainly not all).
| jabo wrote:
| There's an eslint plugin to run only on staged changes:
| https://news.ycombinator.com/item?id=27643526
| foreigner wrote:
| That looks perfect.
| onion2k wrote:
| There's also the 'lint-staged' NPM package if you want to
| do pretty much anything only on staged files.
| foreigner wrote:
| I guess one way this might work is the lint tool could run
| against the whole file but then cross-check the line numbers of
| any errors found with the git diff. If the diff shows the line
| is unchanged then ignore the error. That doesn't sound too
| hard.
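|
| A minimal Python sketch of that cross-check, assuming the diff text
| comes from `git diff` and that lint findings are available as
| (line number, message) pairs. The names `changed_lines` and
| `filter_findings` are made up for illustration and are not part of
| any linter:

```python
import re

# Matches a unified-diff hunk header such as "@@ -10,3 +12,4 @@".
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def changed_lines(diff_text):
    """Return the new-file line numbers that a single-file diff adds or modifies."""
    changed = set()
    new_lineno = None  # None until the first hunk header is seen
    for line in diff_text.splitlines():
        match = HUNK_HEADER.match(line)
        if match:
            new_lineno = int(match.group(1))
        elif line.startswith("diff "):
            new_lineno = None  # a new file section starts; skip its headers
        elif new_lineno is None:
            continue  # "index", "---" and "+++" header lines
        elif line.startswith("+"):
            changed.add(new_lineno)
            new_lineno += 1
        elif line.startswith("-") or line.startswith("\\"):
            continue  # removed lines and "\ No newline" markers have no new-file line
        else:
            new_lineno += 1  # unchanged context line
    return changed


def filter_findings(findings, changed):
    """Keep only the (lineno, message) findings that sit on changed lines."""
    return [(lineno, message) for lineno, message in findings if lineno in changed]
```

| In practice the diff would be produced per file (for example with
| `git diff --cached -- path`) and the findings would be parsed from
| the linter's machine-readable output.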
| FalconSensei wrote:
| On IntelliJ, if you use the Save Actions plugin, you can set it
| to only reformat lines that were changed. Although I'm not sure
| if it would work with Prettier.
| scns wrote:
| Thank you
| daneel_w wrote:
| What was the point the author tried to get across with the first
| paragraph bringing up the large whitespace commit? It didn't add
| anything to the story, it just ended up as an initial
| distraction.
|
| Also, is he implying that thousands of commits (one for each
| file) would be a better way to convert that large code base? Why
| not solve this problem with the right tools, such as ignorerevs?
| ghshephard wrote:
| The whitespace commit breaks cherrypicking. What seemed like a
| harmless change could cripple automated backports of fixes. I
| thought it was the entire crux of his story.
| daneel_w wrote:
| Yes but the whitespace commit - a change that unavoidably
| affects almost every single line of its target - is possibly
| the worst example one can come up with.
| looperhacks wrote:
| I believe his point is that the commit shouldn't have happened
| at all, keeping the old style instead (or maybe applying the
| new style together with other changes ... Yuck. Don't do this)
|
| It kinda serves as a hook to describe why these changes are
| bad. I think it's a good example for the rest of the article.
| daneel_w wrote:
| I feel it's a very contrived example because there's no
| compromise to how such a change is to be done. If there's an
| argument to make it's about making the decision to switch
| whitespace tactic at all.
| CalChris wrote:
| What holds true for the Linux kernel may not hold true for other
| repos.
|
| In this case, the OP is supporting old Linux releases and
| remarking how naturally cherry picking can work. LLVM by
| comparison, doesn't have long-term stable releases. So this
| conservative approach doesn't necessarily apply for LLVM.
|
| So I like the analysis but I disagree with the generality.
| mojolozzo wrote:
| Agreed! The author makes exactly this point towards the end of
| the post.
| gumby wrote:
| > So I like the analysis but I disagree with the generality.
|
| Any organization that releases software revisions has this
| issue. Yes, we do live in a world where much software is now
| subject to "continuous release" (either applications which are
| auto-updated to latest, on phones and some laptop OSes or web-
| only applications where the latest is pushed to deployment).
| And indeed the specific issue described here is much less
| important.
|
| However I suspect the majority of software in use is not like
| that. On the Enterprise side, OSes and applications are updated
| carefully; an ERP upgrade can be a major undertaking. All sorts
| of machine control software is either never upgraded, with
| limited patching, or treated the same way as other enterprise
| software. Etc.
|
| The difference between the two is why there is both "git flow"
| and "GitHub flow" -- it's not like one is absolutely superior
| to the other.
| pansa2 wrote:
| > _The commit was about 10 years old, and it replaced every tab
| with 4 spaces._
|
| Reminds me of a project I worked on a while back - whenever you
| looked at the git history of a file, if you went back far enough
| you'd hit a dead end at the point where every source file was
| renamed from *.c to *.cpp.
| iveqy wrote:
| --follow
|
| We have a few files at work that change name with every
| commit...
| alkonaut wrote:
| Git tracks changes across renames though right?
| [deleted]
| muxxa wrote:
| git log -S'text of interest'
|
| has always served me better than blame. It can better jump file
| boundaries and find e.g. prior code that the code in question was
| copied from.
|
| I've always thought though that we need better conflict
| resolution that is code aware. And also better changeset
| specification, e.g. if a commit is equivalent to a s/foo/bar/
| then this information should be included in the commit, or if a
| function signature has changed, then record that fact, rather
| than the dumb line by line diffs.
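|
| As a rough illustration of why the `git log -S` search mentioned
| above finds where code was introduced or removed: as far as I
| understand it, it reports commits that change the number of
| occurrences of the string. A toy Python model of that rule, with a
| file's history given as a list of snapshots (oldest first); the
| function name `pickaxe` is borrowed loosely and this is not how git
| implements it:

```python
def pickaxe(snapshots, needle):
    """Return indices of snapshots where the occurrence count of needle changes."""
    hits = []
    previous_count = 0  # before the first snapshot the file does not exist
    for index, content in enumerate(snapshots):
        count = content.count(needle)
        if count != previous_count:
            hits.append(index)  # the string was added or removed here
        previous_count = count
    return hits
```

| A snapshot that only moves or re-indents the string keeps the count
| the same, so it is not reported, which is the behaviour that makes
| this search robust against formatting commits.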
| mananaysiempre wrote:
| On one hand, I agree that these are all good points. I haven't
| ever played with kernel code, but I have tried to backport years-
| old Glibc commits and it's an absolutely miserable experience
| because "minor" things like small-scale reformatting would
| frequently be bunched together with unrelated changes. They also
| maintained their configure script in version control and
| apparently even merged it manually, because I've found that at
| many points in its history it would be composed of pieces
| generated by different versions of autoconf.
|
| On the other hand, I don't think that the conclusions of this
| story are as inevitable as it makes them seem. Much of the pain
| it describes is due to the fact that Git is, as it proudly calls
| itself, _stupid_: it doesn't understand that your spacing or
| line breaking or bracketing changes are incidental to what the
| code is trying to accomplish. I'm not saying that structural or
| otherwise language-aware editing or source control is the silver
| bullet because I'm well aware that so far most attempts at it
| suck, but I think it's important to remember that whitespace
| pedantry is to a large extent a tooling issue.
|
| Could a merge tool coupled less than completely to clang-format
| (or yapf, or black, or gofmt ...) use its settings to improve the
| merges, I wonder?
|
| (Tangentially related: Monticello, Paredit, the "skeleton syntax
| tree" idea from Dylan.)
| laurent92 wrote:
| Next generation of version control will diff the AST of the
| parsed code, so line endings don't show up in the diff ;)
| Bonus, developers will be able to each see the code with their
| own code style or open a Java file in Scala ;)
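|
| A small Python sketch of the idea, using the standard `ast` module:
| two sources that differ only in formatting parse to the same tree,
| so a tree-level comparison sees no change. The helper name
| `same_ast` is made up for this example, and note that this
| comparison also ignores comments, which a real tool would have to
| keep:

```python
import ast


def same_ast(source_a, source_b):
    """Return True if both sources parse to structurally identical ASTs."""
    # ast.dump omits line/column attributes by default, so layout is ignored.
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))


TABS = "def f(x):\n\treturn x + 1\n"
SPACES = "def f(x):\n    return (x +\n            1)\n"
CHANGED = "def f(x):\n    return x + 2\n"
```

| Here `same_ast(TABS, SPACES)` is true even though every line
| differs textually, while `same_ast(TABS, CHANGED)` is false.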
| mananaysiempre wrote:
| I'd love for that to be true, but it's not like people
| haven't had that idea before (if maybe not with source
| control, because it's become ubiquitous only relatively
| recently). Pushing code beyond a textual representation has
| been _slooow_ and I haven't yet seen a breakthrough large
| enough to change that.
|
| The Scala thing isn't happening, though -- it's a difference
| of style and approach, not syntax.
| breck wrote:
| > so far most attempts at it suck
|
| Agreed. I'm pretty sure I have it solved with Tree Notation
| though. I've got a thing in the works which I call 3D git. Tree
| languages get it for free. The trick was simplifying the syntax
| so there's nothing left but space, and then your syntactic
| shape matches the semantic shape.
|
| A sneak peek:
|
| https://twitter.com/breckyunits/status/1408546408911695873?s...
| Ericson2314 wrote:
| Yeah absolutely. My personal measure for the "cohesion-
| prowess" of a society is how much they screw around with plain
| text, opaque binary data, etc. vs take the time to properly
| create rich data structures for interfaces.
|
| Taking the time and deep breath to write nice interfaces rather
| than scrambling and putting all the effort into implementations
| ignoring the larger picture of how human endeavors _compose_
| seems very fundamental. It reminds me also of the slogan
|
| > A developed country is not a place where the poor have cars.
| It's where the rich use public transportation.
| jolux wrote:
| > > A developed country is not a place where the poor have
| cars. It's where the rich use public transportation
|
| This is brilliant, stealing it for my transit advocacy.
| Ericson2314 wrote:
| It is! Credit to
| https://en.wikipedia.org/wiki/Gustavo_Petro
| mananaysiempre wrote:
| I'm not sure it's quite as clear-cut, for three reasons:
|
| - Text (or binary) streams are the ultimate unnormalized
| representation, and you don't always _want_ your data to be
| normalized. When storing it, maybe (though I'm sure the DBAs
| among us will disagree even on this point), but the
| intermediate values during processing, almost certainly not.
|
| - Most representations of structured values are all or
| nothing, whereas in reality most types of structured data are
| usually more conveniently viewed as having a hierarchy of
| supertypes, possibly with things like "chunk of text" on the
| top, maybe "list of chunks of text" below that, and so on;
| otherwise your tools are much less reusable. (Compare
| "generic programming" in the Haskell/Scala not Java/C# sense,
| _i.e._ metaprogramming over algebraic datatypes not
| parametric polymorphism. See also the "nanopass" approach to
| compilers.)
|
| These two (largely overlapping) points are how I explain to
| myself the (empirically if not theoretically evident)
| advantage of Unix's "getta byte" approach to IPC (as Cutler
| sarcastically put it) over the myriads of structured
| approaches that came before and after. By not structuring
| your data exchange, it doesn't force you into a framework
| that's more rigid than it needs to be. People barely
| understand how to make structured data amenable to processing
| by composable tools _now_; back in the 80s I don't think
| anybody even knew to ask the question.
|
| This is why I mentioned Paredit and Dylan's "skeleton syntax
| trees": they are, respectively, a structural editor and a
| macro system that work where others fail by the virtue of
| being expressed in terms of more than text or even tokens,
| but less than a full syntax tree.
|
| Finally, in this specific case,
|
| - We're talking about programming languages, about making a
| computer understand the programmer's intent to some degree.
| This is a maze where making even the slightest wrong turn has
| Godel and Turing together slapping you across the face with a
| giant NOPE sign. It's kind of endearing to see the luminaries
| on the Algol committee (essentially all of CS at the time)
| state that they want, like, pseudocode, from their papers,
| except, y'know, with a compiler, and not suspect anything can
| go wrong with that. But we are hardly smarter than them, so
| we should probably remember the bitter lessons they've been
| forced to learn and keep in mind the possibility that there
| might not _be_ a good solution to our problem.
| Ericson2314 wrote:
| > and you don't always want your data to be normalized.
|
| I think "normalized" is doing a lot of work in this
| sentence. In the context of data keeping, normalization
| usually doesn't mean something like "give a unit
| vector", where they both have value and the normalization
| is lossy, but rather cleanups of messy stuff that _ought
| not to have occurred in the first place_.
|
| What I am saying is not that the original input has no value,
| but we should take the steps to make those problems
| actually not happen in the first place. Rich interfaces do
| that, and are an investment that pays for itself eventually
| so this isn't manifestly unreasonable.
|
| > ...metaprogramming over algebraic datatypes not
| parametric polymorphism...
|
| I worked on a little toy "tree editor" where we did just
| that. The use of a good type system to operate
| "generically" without "upcasting" and losing the structure
| forever is key.
|
| > See also the "nanopass" approach to compilers.
|
| I do like that a lot :) but I also like that there is an IR
| for every pass, like with CakeML. I thought those went hand
| in hand?
|
| > (empirically if not theoretically evident) advantage of
| Unix's "getta byte" approach to IPC
|
| I would argue the (over-)prevalence of HTTP is effectively
| empirically the opposite, namely that people do prefer
| structured things in a vacuum, but coordination failures
| have repeatedly gotten us stuck in local maxima of already
| widely established things like Unix's various binary
| streams and then HTTP.
|
| > This is why I mentioned Paredit and Dylan's "skeleton
| syntax trees": they are, respectively, a structural editor
| and a macro system that work where others fail by the
| virtue of being expressed in terms of more than text or
| even tokens, but less than a full syntax tree.
|
| I do agree the jump from text to sexprs / json / token
| trees (Rust's name) etc. is the most important step. First
| get rid of flatness, then worry about higher invariants.
|
| > We're talking about programming languages, about making a
| computer understand the programmer's intent to some degree.
|
| And surely text doesn't help with that. Too much structure
| makes some edits hard if the user can't temporarily ignore
| the rules mid-edit, but the computer should love it.
|
| > It's kind of endearing to see the luminaries on the Algol
| committee (essentially all of CS at the time) state that
| they want, like, pseudocode, from their papers, except,
| y'know, with a compiler, and not suspect anything can go
| wrong with that.
|
| This is about Algol 1968 not Algol 1960? My understanding
| was that was too many features, not too many invariants.
| zmmmmm wrote:
| It reminds me that one of my grievances with whitespace
| significant languages (hello Python) is that I _can't_
| effectively just skip commits that are pure whitespace changes in
| code review. I am curious how people approach this ...
| lilyball wrote:
| It sounds like the real advice here is "have a strong coding
| style guide that is rigorously enforced so nobody ever has to go
| make style fixes to existing code".
| pdw wrote:
| > At a previous company, there was an "infamous" commit in our
| main repository. The commit was about 10 years old, and it
| replaced every tab with 4 spaces.
|
| If you have commits like this, add their ids to a file `ignorerevs`,
| and then tell git about it:
|
|     git config --local blame.ignoreRevsFile ignorerevs
|
| Then at least `git blame` will still give useful results. (This
| is a relatively new git feature, added a year or two ago.)
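|
| For reference, the file is just a list of full commit ids, one per
| line. A hedged Python sketch of a checker for such a file, assuming
| 40-character SHA-1 ids and that everything after a `#` is a comment
| (which I believe matches what git accepts, but check the git-blame
| docs). `parse_ignore_revs` is a hypothetical helper, not a git API:

```python
import re

# Assumption: SHA-1 repository, so a full commit id is 40 hex digits.
FULL_SHA1 = re.compile(r"^[0-9a-fA-F]{40}$")


def parse_ignore_revs(text):
    """Return the commit ids listed in an ignore-revs file, validating each one."""
    revs = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue  # blank or comment-only line
        if not FULL_SHA1.match(line):
            raise ValueError(f"not a full commit id: {line!r}")
        revs.append(line)
    return revs
```

| Running something like this in CI catches abbreviated ids before
| they end up in the checked-in file.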
| forrestthewoods wrote:
| Who gives a shit about a single commit that converts tabs-to-
| spaces? How is this a problem in any way?
|
| The real problem here is that "git blame" is a garbage tool.
| Perforce has "timelapse view" that is radically better than any
| historical view I've seen in Git/Hg.
|
| With respect to this issue, git just needs to suck less.
| xmprt wrote:
| You answered your own question. Git blame is a pretty crappy
| tool but it's a tool that a lot of people use nevertheless.
| If you have a commit that converts tabs to spaces, then the
| blame for pretty much every single line will be lost before
| that point.
| azernik wrote:
| Just git blame -w to ignore whitespace. There's also an
| equivalent git config option.
| shepherdjerred wrote:
| Wow, this is incredible! I convinced my team to adopt Prettier
| for many of our repositories. The only real criticism was that
| our git history would be cluttered since it would reformat all
| of our code. This seems like the perfect solution.
| catlifeonmars wrote:
| Nice! Is there a way to check this into a repository (a la
| .gitignore)?
| banana_giraffe wrote:
| You can specify a file of such commits with
| --ignore-revs-file, and of course check in that file. Naming that file
| .git-blame-ignore-revs seems to be a convention I've seen
| more than once.
|
| Still need to specify the file yourself, so hopefully someone
| can point out the missing magic to have git pull in .git-
| blame-ignore-revs on its own.
| NegativeLatency wrote:
| A shell alias?
| karlding wrote:
| That's what the blame.ignoreRevsFile [0] config option is
| for.
|
| [0] https://git-scm.com/docs/git-config#Documentation/git-
| config...
| nickysielicki wrote:
| Depending on the size of the team and how much agreement you
| can get on the importance of such a change, I think the better
| way to do this on an older repository is to get everyone to
| nuke their checkouts and use git-filter-branch to rewrite the
| history so that nobody ever used tabs.
|
| https://stackoverflow.com/questions/58042532/how-can-i-clang...
| Kinrany wrote:
| git-filter-repo is now recommended over -branch by the docs
| emmelaich wrote:
| With the unfortunate side effect of invalidating external
| refs for instance in your issue tracker or code review tool.
| bhaak wrote:
| 10 years ago would probably be long enough ago to not care for me.
|
| But wouldn't the standard -w be enough to ignore most of this
| commit?
|
| But I used ignoreRevs in repositories that messed up their
| history by using an automatic indenter with atrocious settings.
| gumby wrote:
| Unfortunately -w won't help you with patch, which uses
| character positions. But certainly helps with searching.
| harikb wrote:
| In addition, if their code is python, -w would incorrectly
| ignore legitimate changes
| jrochkind1 wrote:
| do you have a realistic example?
| contravariant wrote:
| A simple one would be:
|
|     total = 0
|     for value in list:
|         total += value
|     return total
|
| vs.
|
|     total = 0
|     for value in list:
|         total += value
|         return total
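|
| A runnable Python version of the example above, as a sketch. The
| two sources differ only in the indentation of the `return`, so a
| comparison that strips whitespace (roughly what `-w` does per line)
| sees them as identical, yet they compute different results. The
| helper names are made up for the demonstration:

```python
BEFORE = """\
def f(values):
    total = 0
    for value in values:
        total += value
    return total
"""

AFTER = """\
def f(values):
    total = 0
    for value in values:
        total += value
        return total
"""


def ignoring_whitespace(source):
    """Strip all whitespace from each line, approximating a -w comparison."""
    return ["".join(line.split()) for line in source.splitlines()]


def run(source, values):
    """Execute the source and call the function f it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["f"](values)
```

| With the list [1, 2, 3], the first version sums everything and the
| second returns after the first element, even though
| `ignoring_whitespace(BEFORE) == ignoring_whitespace(AFTER)`.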
| tazjin wrote:
| > 10 years ago would probably be long enough ago to not care for
| me.
|
| Hm, especially for the case of blaming (mostly to figure out
| why something was done a certain way) I frequently happen
| upon 10+ year old commits. This happens both in open-source
| projects, as well as at work (where we don't use git, but the
| same concepts apply).
| prpl wrote:
| It would be cool if you could mark whitespace only changes at
| least with an option to smart-skip them with `git blame` or
| something
| hnra wrote:
| Isn't this possible with just the normal git blame -w? Or if
| you want to ignore specific commits there is ignore-revs.
| nailer wrote:
| Storing changes as text is a fundamentally bad idea. Not only
| does it make merge conflicts more likely to happen, as transforms
| (me adding a function, you renaming something I use in the
| function) are more likely to occur, it also means that formatting is
| committed and discussed rather than being a matter of personal
| preference.
|
| Hopefully the next item in this RCS CVS Subversion Git chain is
| just storing ASTs and transforms on top of them so we can spend
| less time fixing basic conflicts and discussing formatting.
| _ix wrote:
| I've seen the suggestion of VCS/SCM storing ASTs rather than
| plaintext for a decade now. Are you aware of any projects that
| are trying to address this?
| arp242 wrote:
| I heard Unison[1] does this, or something like it.
|
| Never looked at it myself, so not an endorsement. Just came
| up in the Lobsters discussion[2] on this last week where
| someone mentioned it.
|
| [1]: https://www.unisonweb.org/docs/tour/
|
| [2]: https://lobste.rs/s/b9pddy/when_it_comes_git_history_les
| s_is...
| pjc50 wrote:
| Does that mean git would have to support all known programming
| languages, and any language syntax change would require a
| backwards incompatible upgrade of the VCS?
| nn3 wrote:
| As well as any known config file formats.
| globular-toast wrote:
| More likely git would support a common AST format/protocol
| and language tooling would be responsible for providing that.
| iveqy wrote:
| The way we store data is not the same as using that format to
| solve conflicts. Take a look at semanticmerge.com
| _ix wrote:
| In a sibling comment, I was wondering about projects that are
| trying ASTs. I haven't read deeply about it, but I recalled
| that [pijul][1] might be a way forward.
|
| [1]: https://pijul.com/manual/why_pijul.html#comparisons-with-
| oth...
| morelisp wrote:
| Pijul's patch algebra, and storing diffs rather than
| snapshots generally, makes it less amenable to these kinds of
| experiments than Git. Changing your diff/merge strategy would
| be akin to rewriting your entire project history.
| breck wrote:
| In Tree notation the text and ast have the same shape. You can
| then have a semantic git
|
| https://arxiv.org/pdf/1703.01192.pdf
| luffapi wrote:
| Not sure why you are downvoted. This is a genuinely interesting
| idea and seems like the basis for next gen change management.
| cellularmitosis wrote:
| This isn't specific to VCS, but I'll drop a link to this
| thread about structural editors which PaniczGodek has been
| maintaining for a few years now, just to give it more
| exposure:
| https://twitter.com/PaniczGodek/status/1195784199250284545
| morelisp wrote:
| They are being downvoted for not understanding how git works.
| If you have a 3-way diff/merge tool for ASTs you can plug it
| into git and use it _today_, and you can use it on all
| existing branches and historical changesets.
|
| The "problem" is no one actually wants to resolve merges that
| way.
| luffapi wrote:
| There's been plenty of times my code has been "lost" in a
| later commit because Git didn't know it was the same thing.
| morelisp wrote:
| I don't understand how this relates to what I said
| without more details about what "lost" means, sorry.
|
| If you mean Git had issues finding some specific code
| motion to show in a diff, you can try one of the other
| diff algorithms, or adjust the threshold for rename/copy
| detection. AST-based differs would also suffer this
| issue; a "nice diff" is not a formal problem and does not
| have a universal solution.
|
| If you mean you once had a mis-merge that dropped some
| code you didn't want to drop, this won't go away with
| AST-based diffs. It will just happen at the token level
| instead of the line level.
|
| If you mean it's generally hard to deal with collapsing
| lots of branches with shared history, that's true but
| would also be true with AST-based approaches. This is the
| situation something like pijul could help with, but also
| raises all the other tradeoffs of snapshot vs changeset
| based approaches.
| morelisp wrote:
| Git stores snapshots, not diffs, and therefore could just as
| well be considered storing ASTs. The trick is writing a useful
| diff/merge for them. Programmers also don't think in ASTs,
| arguably even less so than lines. The problem is not as formal
| as it looks.
| luffapi wrote:
| There's great irony in someone whose name is "morelisp"
| saying that programmers don't think in ASTs. Lisp syntax is
| the AST.
| morelisp wrote:
| Which also means the Lisp _AST_ is not great to diff with.
|
| First, it's too weak - you need to at least recognize
| special top-level defun-style forms, or you'll generate
| some minimal diff between two totally different functions
| just because they both use the same cond pattern or
| whatever.
|
| Second, reader macros mean you can't really work on the
| source AST in the first place, unless you also teach the
| diff tool all your reader macros.
| rwbhn wrote:
| > Programmers also don't think in ASTs,
|
| Citation needed
| morelisp wrote:
| T_PAAMAYIM_NEKUDOTAYIM
| cellularmitosis wrote:
| It is a shame that we don't have better tools, and that we are
| still hand-editing text files in order to write programs.
|
| Imagine having an editor which automatically applied your local
| preferences around tabs/spaces, code formatting, variables up top
| vs nearest use, function definitions nested to minimize top-scope
| surface area vs all functions flat at top scope, etc etc etc
|
| And when you are done editing, all of these local changes are
| reversed and you submit the minimal possible diff.
|
| (and if we want to really talk pipe dreams, the dev only sees an
| AST editor and the underlying text is never even exposed in the
| first place)
|
| Our tools are so far behind mostly because everyone's thinking is
| still chained to 1970's hand editing text files mentality. This
| is the flying car which isn't being worked on because everyone is
| still thinking about making better bicycles.
| ajuc wrote:
| The benefit is minimal, the task is hard, there's decades worth
| of tools that won't work, and it's easy to mess up introducing
| "impossible bugs" with behind-the-scenes transformations.
|
| Also - AST isn't THAT important when reading the code. Let's
| say I give you this:
|
|     X (X X X X; X X X; X X)
|     {
|         X (X X X) {
|             X(X X X X X)
|         }
|     }
|
| Do you know what this code does? How about this:
|
|     for int i = 0 i < 10 i ++ if i % 2 printf " %d " , i
|
| I've used several graphical languages professionally (not AST-
| based, graph-based, but the problem remains) and the main
| problem was - structure wasn't fully describing what happens -
| the "meat" of the behaviour was still in text form and was
| hidden behind the pretty graphic form - in case of both of
| these languages the meat was in the names of the subprocesses
| called and in the mapping of process variables <-> subprocess
| parameters.
|
| And there were A LOT of these, so you couldn't show them at
| once on the same screen as the graphical view of the process.
| So programming with both of these languages was very
| frustrating - you had to click on each node and look through
| long lists of x:y substitutions to track how parameters flow
| through the system.
| spaetzleesser wrote:
| Agreed. It seems silly that we still have to deal with things
| like tabs vs spaces or formatting in different ways. This
| should be handled by IDEs and editors.
| al2o3cr wrote:
| > Our tools are so far behind mostly because everyone's
| thinking is still chained to 1970's hand editing text
| files mentality.
|
| People IN THE LITERAL 1970s were talking about this exact idea.
| We still don't have this, for anything other than highly-
| specialized applications (for instance, equation editors in
| word processors).
|
| It's almost like it's way, way, way harder to deliver software
| than it is to handwave about how much better things would be if
| people just listened to YOU.
| cjfd wrote:
| AST editing would be kind of okay. But never better than kind
| of okay. Thoughtfully formatted text can be quite a bit nicer.
|
| Also, cities where one can get around on bicycles are much nicer
| than where one needs to use cars. Flying or otherwise.
| https://www.youtube.com/watch?v=ul_xzyCDT98
|
| To summarize both points: what looks like progress
| superficially may actually not be progress at all.
| spankalee wrote:
| This isn't good general advice. The vast majority of projects
| aren't going to need to cherry pick commits to such old branches,
| and overly limiting style fixes and refactorings is a good way to
| ossify a code base.
| cypressious wrote:
| This is exactly what the author is saying in the final
| paragraph.
| sharken wrote:
| The same problems exist when you merge between multiple release
| branches, there is an inherent resistance to refactoring as it
| interferes with merges.
|
| To me the solution is to migrate to trunk based development and
| hide new features behind feature flags.
|
| The biggest issue with this transformation seems to be
| implementing feature flags in the code.
| geofft wrote:
| I agree with the author re Git as it exists right now, but this
| feels like an unsolved technical problem that ideally should be
| handled in your VCS rather than something to be worked around by social
| conventions not to do cleanups.
|
| git blame, for instance, has the "--ignore-rev" and
| "--ignore-revs-file" options which let you specify some commits that can be
| ignored. This is a little helpful for the sort of code spelunking
| the author does (which I do very frequently), but it's a manual
| process. There's no built-in convention to Git for what this file
| should be named and how to have git blame pick it up
| automatically. Moreover, you have to create the commit in order
| to know what its revision is, which means that in squash/rebase
| workflows, you have to land the style cleanup on the target
| branch and only then can you add it to the ignore list.
|
| One option that occurs to me is a special marker string to put in
| the commit message like "Git-Ignore-For-Blame: yes" or something.
| I'm not sure if there's a better way to do it.
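|
| A sketch of how such a marker could be turned into an ignore list
| today, outside of git itself. Everything here is hypothetical: git
| does not recognise a "Git-Ignore-For-Blame" trailer, and
| `ignore_revs_from_log` is an illustrative name. It assumes the
| (commit id, message) pairs were already extracted, e.g. from
| `git log` output:

```python
# Hypothetical marker proposed in the comment; git itself gives it no meaning.
TRAILER = "Git-Ignore-For-Blame: yes"


def ignore_revs_from_log(commits):
    """Build ignore-revs file content from (commit_id, message) pairs."""
    marked = []
    for commit_id, message in commits:
        # A commit is marked when any line of its message is exactly the trailer.
        if any(line.strip() == TRAILER for line in message.splitlines()):
            marked.append(commit_id)
    # One commit id per line, with a trailing newline when non-empty.
    return "".join(commit_id + "\n" for commit_id in marked)
```

| The result could be written to a file and passed to git blame via
| --ignore-revs-file, which sidesteps having to know the commit id
| before the cleanup lands.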
|
| More generically, if the commit is _only_ swapping out tabs and
| spaces, "git blame -w" ought to take care of it. For more
| involved reformatting, git diff and git merge both have "ignore-
| all-space" modes, which will handle things like line breaks.
|
| But really what you want for blames and cherry-picks, I think, is
| blames at the _syntax_ level. The Git commands only know about
| lines - git blame runs on lines, git cherry-pick constructs diffs
| based on lines and tries to re-apply them, etc. But at the
| repository level there's nothing special about line breaks. Why
| can't we run git blame or cherry-pick on an AST, so that it
| operates regardless of formatting?
|
| Are there tools that do this? (They don't need to be built into
| git; a third-party tool ought to be able to do this just fine.)
| bsmedberg wrote:
| In theory you can plug any three-way merge tool into git. I'd
| love to see merge tools that are syntax aware and can merge
| ASTs rather than lines of text.
| a-dub wrote:
| i think what you (and the OP) are describing goes beyond making
| merges more robust to nonfunctional changes by using something
| like an AST representation. making text edits apply cleanly is
| one part of the problem, but as the OP notes, the harder part
| of these merges is actually when interfaces stay the same but
| behavior changes. i think that to solve that, you'd need some
| kind of static analyzer that could jump from changed
| lines/syntax to execution paths and then to diffs of them.
|
| i bet there are some cool papers out there on trying to do
| this... but, this is essentially a restatement of the halting
| problem, so building something that is guaranteed to be correct
| using only static analysis may be impossible. (but this does
| not preclude a solution that might be good enough)
| catlifeonmars wrote:
| Well in an ideal world, Liskov's substitution principle
| applies and you only ever care about interface changes.
| catlifeonmars wrote:
| In this same world, you don't make breaking changes
| without a new version of the interface. I like living in
| fantasy land :)
___________________________________________________________________
(page generated 2021-06-26 23:00 UTC)