hngopher.com

       [HN Gopher] When it comes to Git history, less is more
       ___________________________________________________________________
        
       When it comes to Git history, less is more
        
       Author : maximilianroos
       Score  : 172 points
       Date   : 2021-06-26 16:09 UTC (6 hours ago)
        
 (HTM) web link (brennan.io)
 (TXT) w3m dump (brennan.io)
        
       | foreigner wrote:
       | On this topic, is there a JavaScript lint tool that integrates
       | with Git to enable you to gradually apply style changes to code
       | as it evolves? E.g. if the change is tabs-to-spaces the lint tool
       | would require spaces in new or changed lines of code, but not
       | complain about tabs in unchanged lines of existing code.
        
         | jabo wrote:
         | You could have the linter only run on changed files in a
         | precommit hook, instead of on all files.
        
           | foreigner wrote:
           | That would work at file level granularity, but I'd like it to
           | work at line level.
        
             | masklinn wrote:
             | You can usually take the output of the lint tool and filter
             | it by the lines modified by the patch (which you get from
             | the diff).
             | 
             | Few tools do it out of the box (in the python world flake8
             | supports diff filtering out of the box, and there's diff-
             | cover which can filter analysis beyond just test coverage)
             | but it's easy enough to do with a wrapper, especially if
             | the linter has a configurable or even parseable (e.g. json)
             | output option.
        
             | jabo wrote:
             | Found an ESLint plugin that will only run on staged
             | changes: https://www.npmjs.com/package/eslint-plugin-diff
        
           | surfmike wrote:
           | This still leads to frustrating behavior, where changing one
           | line of a file might end up reformatting all of it (and this
           | often obscures the actual change itself).
        
             | klodolph wrote:
             | Tools like git-clang-format only reformat the code you've
             | changed, not the entire file. A lot of formatting tools
             | work this way (certainly not all).
        
             | jabo wrote:
             | There's an eslint plugin to run only on staged changes:
             | https://news.ycombinator.com/item?id=27643526
        
               | foreigner wrote:
               | That looks perfect.
        
               | onion2k wrote:
               | There's also the 'lint-staged' NPM package if you want to
               | do pretty much anything only on staged files.
        
         | foreigner wrote:
         | I guess one way this might work is the lint tool could run
         | against the whole file but then cross-check the line numbers of
         | any errors found with the git diff. If the diff shows the line
         | is unchanged then ignore the error. That doesn't sound too
         | hard.
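
The cross-check described above (lint the whole file, then keep only the findings on lines the diff touched) can be sketched roughly as follows. This is a minimal illustration, not any particular linter's feature: it assumes a unified diff for a single file, and the `(line_number, message)` shape of the warnings is hypothetical.

```python
import re

# Hunk headers look like "@@ -12,4 +12,5 @@"; only the new-file start is needed.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def changed_lines(unified_diff):
    """Return new-file line numbers that a single-file unified diff adds or modifies."""
    changed = set()
    new_line = None  # stays None until the first hunk header is seen
    for line in unified_diff.splitlines():
        header = HUNK_HEADER.match(line)
        if header:
            new_line = int(header.group(1))
        elif new_line is None or line.startswith("\\"):
            continue  # file headers, or the "\ No newline at end of file" marker
        elif line.startswith("+"):
            changed.add(new_line)
            new_line += 1
        elif not line.startswith("-"):
            new_line += 1  # context line: present in the new file, but unchanged
        # removed lines ("-") do not exist in the new file, so nothing advances
    return changed


def filter_warnings(warnings, changed):
    """Keep only (line_number, message) warnings that fall on changed lines."""
    return [(line, message) for line, message in warnings if line in changed]
```

In practice the diff text would come from something like `git diff -U0` on the file being linted, and the warnings from the linter's machine-readable output.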
        
         | FalconSensei wrote:
         | On IntelliJ, if you use the Save Actions plugin, you can set it
         | to only reformat lines that were changed. Although I'm not sure
         | if it would work with Prettier.
        
           | scns wrote:
           | Thank you
        
       | daneel_w wrote:
       | What was the point the author tried to get across with the first
       | paragraph bringing up the large whitespace commit? It didn't add
       | anything to the story, it just ended up as an initial
       | distraction.
       | 
       | Also, is he implying that thousands of commits (one for each
       | file) would be a better way to convert that large code base? Why
       | not solve this problem with the right tools, such as ignorerevs?
        
         | ghshephard wrote:
         | The whitespace commit breaks cherrypicking. What seemed like a
         | harmless change could cripple automated backports of fixes. I
         | thought it was the entire crux of his story.
        
           | daneel_w wrote:
           | Yes but the whitespace commit - a change that unavoidably
           | affects almost every single line of its target - is possibly
           | the worst example one can come up with.
        
         | looperhacks wrote:
         | I believe his point is that the commit shouldn't have happened
         | at all, keeping the old style instead (or maybe applying the
         | new style together with other changes ... Yuck. Don't do this)
         | 
         | It kinda serves as a hook to describe why these changes are
         | bad. I think it's a good example for the rest of the article.
        
           | daneel_w wrote:
           | I feel it's a very contrived example because there's no
           | compromise to how such a change is to be done. If there's an
           | argument to make it's about making the decision to switch
           | whitespace tactic at all.
        
       | CalChris wrote:
       | What holds true for the Linux kernel may not hold true for other
       | repos.
       | 
       | In this case, the OP is supporting old Linux releases and
       | remarking how naturally cherry picking can work. LLVM by
       | comparison, doesn't have long-term stable releases. So this
       | conservative approach doesn't necessarily apply for LLVM.
       | 
       | So I like the analysis but I disagree with the generality.
        
         | mojolozzo wrote:
         | Agreed! The author makes exactly this point towards the end of
         | the post.
        
         | gumby wrote:
         | > So I like the analysis but I disagree with the generality.
         | 
         | Any organization that releases software revisions has this
         | issue. Yes, we do live in a world where much software is now
         | subject to "continuous release" (either applications which are
         | auto-updated to latest, on phones and some laptop OSes or web-
         | only applications where the latest is pushed to deployment).
         | And indeed the specific issue described here is much less
         | important.
         | 
         | However I suspect the majority of software in use is not like
         | that. On the Enterprise side, OSes and applications are updated
         | carefully; an ERP upgrade can be a major undertaking. All sorts
         | of machine control software is either never upgraded, with
         | limited patching, or treated the same way as other enterprise
         | software. Etc.
         | 
         | The difference between the two is why there is both "git flow"
         | and "GitHub flow" -- it's not like one is absolutely superior
         | to the other.
        
       | pansa2 wrote:
       | > _The commit was about 10 years old, and it replaced every tab
       | with 4 spaces._
       | 
       | Reminds me of a project I worked on a while back - whenever you
       | looked at the git history of a file, if you went back far enough
       | you'd hit a dead end at the point where every source file was
       | renamed from *.c to *.cpp.
        
         | iveqy wrote:
         | --follow
         | 
         | We have a few files at work that change name with every
         | commit...
        
         | alkonaut wrote:
         | Git tracks changes across renames though right?
        
       | muxxa wrote:
       | git log -S'text of interest'
       | 
       | has always served me better than blame. It can better jump file
       | boundaries and find e.g. prior code that the code in question was
       | copied from.
       | 
       | I've always thought though that we need better conflict
       | resolution that is code aware. And also better changeset
       | specification, e.g. if a commit is equivalent to a s/foo/bar/
       | then this information should be included in the commit, or if a
       | function signature has changed, then record that fact, rather
       | than the dumb line by line diffs.
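
As a rough picture of what the `git log -S` search mentioned above looks for: it reports commits where the number of occurrences of the string changes. The sketch below imitates that on a hypothetical in-memory history of one file (a list of `(commit_id, content)` pairs, oldest first); real Git works per file across the whole commit graph.

```python
def pickaxe(history, needle):
    """Return ids of commits whose content changes the occurrence count of `needle`."""
    hits = []
    previous = ""  # the file did not exist before the first commit
    for commit_id, content in history:  # oldest first
        if content.count(needle) != previous.count(needle):
            hits.append(commit_id)
        previous = content
    return hits
```

Under this rule, a commit that only moves the text within the file leaves the count unchanged and is not reported.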
        
       | mananaysiempre wrote:
       | On one hand, I agree that these are all good points. I haven't
       | ever played with kernel code, but I have tried to backport years-
       | old Glibc commits and it's an absolutely miserable experience
       | because "minor" things like small-scale reformatting would
       | frequently be bunched together with unrelated changes. They also
       | maintained their configure script in version control and
       | apparently even merged it manually, because I've found that at
       | many points in its history it would be composed of pieces
       | generated by different versions of autoconf.
       | 
       | On the other hand, I don't think that the conclusions of this
       | story are as inevitable as it makes them seem. Much of the pain
       | it describes is due to the fact that Git is, as it proudly calls
       | itself, _stupid_: it doesn't understand that your spacing or
       | line breaking or bracketing changes are incidental to what the
       | code is trying to accomplish. I'm not saying that structural or
       | otherwise language-aware editing or source control is the silver
       | bullet because I'm well aware that so far most attempts at it
       | suck, but I think it's important to remember that whitespace
       | pedantry is to a large extent a tooling issue.
       | 
       | Could a merge tool coupled less than completely to clang-format
       | (or yapf, or black, or gofmt ...) use its settings to improve the
       | merges, I wonder?
       | 
       | (Tangentially related: Monticello, Paredit, the "skeleton syntax
       | tree" idea from Dylan.)
        
         | laurent92 wrote:
         | Next generation of version control will diff the AST of the
         | parsed code, so line endings don't show up in the diff ;)
         | Bonus, developers will be able to each see the code with their
         | own code style or open a Java file in Scala ;)
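
For Python source, the idea of comparing parsed structure rather than text can be tried today with the standard `ast` module. This is only a sketch of the comparison step, not a version control system, and the helper name `same_structure` is made up for illustration.

```python
import ast


def same_structure(source_a, source_b):
    """True if both sources parse to the same syntax tree, ignoring layout."""
    # ast.dump omits line and column attributes by default, so only the
    # structure of the code is compared, not where it sits on the page.
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))
```

One known limitation: comments are not part of this tree, so a change that only edits a comment would also compare as identical.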
        
           | mananaysiempre wrote:
           | I'd love for that to be true, but it's not like people
           | haven't had that idea before (if maybe not with source
           | control, because it's become ubiquitous only relatively
           | recently). Pushing code beyond a textual representation has
           | been _slooow_ and I haven't yet seen a breakthrough large
           | enough to change that.
           | 
           | The Scala thing isn't happening, though, -- it's a difference
           | of style and approach, not syntax.
        
         | breck wrote:
         | > so far most attempts at it suck
         | 
         | Agreed. I'm pretty sure I have it solved with Tree Notation
         | though. I've got a thing in the works which I call 3D git. Tree
         | languages get it for free. The trick was simplifying the syntax
         | so there's nothing left but space, and then your syntactic
         | shape matches the semantic shape.
         | 
         | A sneak peek:
         | 
         | https://twitter.com/breckyunits/status/1408546408911695873?s...
        
         | Ericson2314 wrote:
         | Yeah absolutely. My personal measure for the "cohesion-prowess"
         | of a society is how much they screw around with plain
         | text, opaque binary data, etc. vs take the time to properly
         | create rich data structures for interfaces.
         | 
         | Taking the time and deep breath to write nice interfaces rather
         | than scrambling and putting all the effort into implementations
         | ignoring the larger picture of how human endeavors _compose_
         | seems very fundamental. It reminds me also of the slogan
         | 
         | > A developed country is not a place where the poor have cars.
         | It's where the rich use public transportation.
        
           | jolux wrote:
           | > > A developed country is not a place where the poor have
           | cars. It's where the rich use public transportation
           | 
           | This is brilliant, stealing it for my transit advocacy.
        
             | Ericson2314 wrote:
             | It is! Credit to
             | https://en.wikipedia.org/wiki/Gustavo_Petro
        
           | mananaysiempre wrote:
           | I'm not sure it's quite as clear-cut, for three reasons:
           | 
           | - Text (or binary) streams are the ultimate unnormalized
           | representation, and you don't always _want_ your data to be
           | normalized. When storing it, maybe (though I'm sure the DBAs
           | among us will disagree even on this point), but the
           | intermediate values during processing, almost certainly not.
           | 
           | - Most representations of structured values are all or
           | nothing, whereas in reality most types of structured data are
           | usually more conveniently viewed as having a hierarchy of
           | supertypes, possibly with things like "chunk of text" on the
           | top, maybe "list of chunks of text" below that, and so on;
           | otherwise your tools are much less reusable. (Compare
           | "generic programming" in the Haskell/Scala not Java/C# sense,
           | _i.e._ metaprogramming over algebraic datatypes not
           | parametric polymorphism. See also the "nanopass" approach to
           | compilers.)
           | 
           | These two (largely overlapping) points are how I explain to
           | myself the (empirically if not theoretically evident)
           | advantage of Unix's "getta byte" approach to IPC (as Cutler
           | sarcastically put it) over the myriads of structured
           | approaches that came before and after. By not structuring
           | your data exchange, it doesn't force you into a framework
           | that's more rigid than it needs to be. People barely
           | understand how to make structured data amenable to processing
           | by composable tools _now_ , back in the 80s I don't think
           | anybody even knew to ask the question.
           | 
           | This is why I mentioned Paredit and Dylan's "skeleton syntax
           | trees": they are, respectively, a structural editor and a
           | macro system that work where others fail by the virtue of
           | being expressed in terms of more than text or even tokens,
           | but less than a full syntax tree.
           | 
           | Finally, in this specific case,
           | 
           | - We're talking about programming languages, about making a
           | computer understand the programmer's intent to some degree.
           | This is a maze where making even the slightest wrong turn has
           | Godel and Turing together slapping you across the face with a
           | giant NOPE sign. It's kind of endearing to see the luminaries
           | on the Algol committee (essentially all of CS at the time)
           | state that they want, like, pseudocode, from their papers,
           | except, y'know, with a compiler, and not suspect anything can
           | go wrong with that. But we are hardly smarter than them, so
           | we should probably remember the bitter lessons they've been
           | forced to learn and keep in mind the possibility that there
           | might not _be_ a good solution to our problem.
        
             | Ericson2314 wrote:
             | > and you don't always want your data to be normalized.
             | 
             | I think "normalized" is doing a lot of work in this
             | sentence. In the context of data keeping, normalization
             | usually doesn't mean something like "give a unit
             | vector", where both forms have value and the normalization
             | is lossy, but rather cleanups of messy stuff that _ought
             | not to have occurred in the first place_.
             | 
             | What I am saying is not that the original input has no
             | value, but that we should take the steps to make those
             | problems not happen in the first place. Rich interfaces do
             | that, and are an investment that pays for itself eventually
             | so this isn't manifestly unreasonable.
             | 
             | > ...metaprogramming over algebraic datatypes not
             | parametric polymorphism...
             | 
             | I worked on a little toy "tree editor" where we did just
             | that. Using a good type system to operate "generically"
             | without "upcasting" and losing the structure forever is
             | key.
             | 
             | > See also the "nanopass" approach to compilers.
             | 
             | I do like that a lot :) but I also like that there is an IR
             | for every pass, like with CakeML. I thought those went hand
             | in hand?
             | 
             | > (empirically if not theoretically evident) advantage of
             | Unix's "getta byte" approach to IPC
             | 
             | I would argue the (over-)prevalence of HTTP is effectively
             | empirically the opposite, namely that people do prefer
             | structured things in a vacuum, but coordination failures
             | have repeatedly gotten us stuck in local maxima of already
             | widely established things like Unix's various binary
             | streams and then HTTP.
             | 
             | > This is why I mentioned Paredit and Dylan's "skeleton
             | syntax trees": they are, respectively, a structural editor
             | and a macro system that work where others fail by the
             | virtue of being expressed in terms of more than text or
             | even tokens, but less than a full syntax tree.
             | 
             | I do agree the jump from text to sexprs / json / token
             | trees (Rust's name) etc. is the most important step. First
             | get rid of flatness, then worry about higher invariants.
             | 
             | > We're talking about programming languages, about making a
             | computer understand the programmer's intent to some degree.
             | 
             | And surely text doesn't help with that. Too much structure
             | makes some edits hard if the user can't temporarily ignore
             | the rules mid-edit, but the computer should love it.
             | 
             | > It's kind of endearing to see the luminaries on the Algol
             | committee (essentially all of CS at the time) state that
             | they want, like, pseudocode, from their papers, except,
             | y'know, with a compiler, and not suspect anything can go
             | wrong with that.
             | 
             | This is about Algol 1968 not Algol 1960? My understanding
             | was that was too many features, not too many invariants.
        
       | zmmmmm wrote:
       | It reminds me that one of my grievances with whitespace
       | significant languages (hello Python) is that I _can't_
       | effectively just skip commits that are pure whitespace changes in
       | code review. I am curious how people approach this ...
        
       | lilyball wrote:
       | It sounds like the real advice here is "have a strong coding
       | style guide that is rigorously enforced so nobody ever has to go
       | make style fixes to existing code".
        
       | pdw wrote:
       | > At a previous company, there was an "infamous" commit in our
       | main repository. The commit was about 10 years old, and it
       | replaced every tab with 4 spaces.
       | 
       | If you have commits like this, add the ids to a file `ignorerevs`,
       | and then tell git about it:
       | 
       |     git config --local blame.ignoreRevsFile ignorerevs
       | 
       | Then at least `git blame` will still give useful results. (This
       | is a relatively new git feature, added a year or two ago.)
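
For anyone generating such a file from a script, here is a small sketch. Git's documentation describes the format as one unabbreviated object name per line, with `#` comments allowed. The helper name and the `(commit_id, note)` input shape are made up for illustration, and the check assumes 40-character SHA-1 ids.

```python
import re


def render_ignore_revs(entries):
    """Build the text of an ignore-revs file from (commit_id, note) pairs."""
    lines = []
    for commit_id, note in entries:
        # Abbreviated hashes are rejected here because the file format
        # expects full object names (assumption: SHA-1, 40 hex characters).
        if not re.fullmatch(r"[0-9a-f]{40}", commit_id):
            raise ValueError(f"expected a full 40-character hash, got {commit_id!r}")
        lines.append(f"# {note}")
        lines.append(commit_id)
    return "\n".join(lines) + "\n"
```

The returned text would be written to the file named in `blame.ignoreRevsFile` (`ignorerevs` in the comment above).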
        
         | forrestthewoods wrote:
         | Who gives a shit about a single commit that converts tabs-to-
         | spaces? How is this a problem in any way?
         | 
         | The real problem here is that "git blame" is a garbage tool.
         | Perforce has "timelapse view" that is radically better than any
         | historical view I've seen in Git/Hg.
         | 
         | With respect to this issue, git just needs to suck less.
        
           | xmprt wrote:
           | You answered your own question. Git blame is a pretty crappy
           | tool but it's a tool that a lot of people use nevertheless.
           | If you have a commit that converts tabs to spaces, then the
           | blame for pretty much every single line will be lost before
           | that point.
        
         | azernik wrote:
         | Just git blame -w to ignore whitespace. There's also an
         | equivalent git config option.
        
         | shepherdjerred wrote:
         | Wow, this is incredible! I convinced my team to adopt Prettier
         | for many of our repositories. The only real criticism was that
         | our git history would be cluttered since it would reformat all
         | of our code. This seems like the perfect solution.
        
         | catlifeonmars wrote:
         | Nice! Is there a way to check this into a repository (a la
         | .gitignore)?
        
           | banana_giraffe wrote:
           | You can specify a file of such commits with --ignore-revs-file,
           | and of course check in that file. Naming that file
           | .git-blame-ignore-revs seems to be a convention I've seen
           | more than once.
           | 
           | Still need to specify the file yourself, so hopefully someone
           | can point out the missing magic to have git pull in .git-
           | blame-ignore-revs on its own.
        
             | NegativeLatency wrote:
             | A shell alias?
        
             | karlding wrote:
             | That's what the blame.ignoreRevsFile [0] config option is
             | for.
             | 
             | [0] https://git-scm.com/docs/git-config#Documentation/git-
             | config...
        
         | nickysielicki wrote:
         | Depending on the size of the team and how much agreement you
         | can get on the importance of such a change, I think the better
         | way to do this on an older repository is to get everyone to
         | nuke their checkouts and use git-filter-branch to rewrite the
         | history so that nobody ever used tabs.
         | 
         | https://stackoverflow.com/questions/58042532/how-can-i-clang...
        
           | Kinrany wrote:
           | git-filter-repo is now recommended over -branch by the docs
        
           | emmelaich wrote:
           | With the unfortunate side effect of invalidating external
           | refs for instance in your issue tracker or code review tool.
        
         | bhaak wrote:
         | 10 years ago would probably long enough ago to not care for me.
         | 
         | But wouldn't the standard -w be enough to ignore most of this
         | commit?
         | 
         | But I used ignoreRevs in repositories that messed up their
         | history by using an automatic indenter with atrocious settings.
        
           | gumby wrote:
           | Unfortunately -w won't help you with patch, which uses
           | character positions. But certainly helps with searching.
        
             | harikb wrote:
             | In addition, if their code is python, -w would incorrectly
             | ignore legitimate changes
        
               | jrochkind1 wrote:
               | do you have a realistic example?
        
               | contravariant wrote:
               | A simple one would be:
               | 
               |     total = 0
               |     for value in list:
               |         total += value
               |         return total
               | 
               | vs.
               | 
               |     total = 0
               |     for value in list:
               |         total += value
               |     return total
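
To make the point concrete, here is a small sketch of two functions that differ only in the indentation of `return total`. The comparison helper is a crude stand-in for whitespace-insensitive diffing (it strips all whitespace from each line), not Git's actual implementation, and the function and constant names are invented for the example.

```python
RETURN_AFTER_LOOP = """\
def sum_values(values):
    total = 0
    for value in values:
        total += value
    return total
"""

RETURN_INSIDE_LOOP = """\
def sum_values(values):
    total = 0
    for value in values:
        total += value
        return total
"""


def ignoring_whitespace(source):
    """Strip all whitespace from every line, roughly what a -w comparison sees."""
    return ["".join(line.split()) for line in source.splitlines()]


def run(source, values):
    """Define sum_values from the given source text and call it."""
    namespace = {}
    exec(source, namespace)
    return namespace["sum_values"](values)
```

With whitespace ignored the two sources compare equal, yet for `[1, 2, 3]` one version sums the whole list and the other returns after the first element.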
        
           | tazjin wrote:
           | > 10 years ago would probably long enough ago to not care for
           | me.
           | 
           | Hm, especially for the case of blaming (mostly to figure out
           | why something was done a certain way) I frequently happen
           | upon 10+ year old commits. This happens both in open-source
           | projects, as well as at work (where we don't use git, but the
           | same concepts apply).
        
       | prpl wrote:
       | It would be cool if you could mark whitespace only changes at
       | least with an option to smart-skip them with `git blame` or
       | something
        
         | hnra wrote:
         | Isn't this possible with just the normal git blame -w? Or if
         | you want to ignore specific commits there is ignore-revs.
        
       | nailer wrote:
       | Storing changes as text is a fundamentally bad idea. Not only
       | does it make merge conflicts more likely to happen, as
       | overlapping transforms (me adding a function, you renaming
       | something I use in the function) are more likely to occur, it
       | also means that formatting is committed and discussed rather
       | than being a matter of personal preference.
       | 
       | Hopefully the next item in this RCS -> CVS -> Subversion -> Git
       | chain is just storing ASTs and transforms on top of them so we
       | can spend less time fixing basic conflicts and discussing
       | formatting.
        
         | _ix wrote:
         | I've seen the suggestion of VCS/SCM storing ASTs rather than
         | plaintext for a decade now. Are you aware of any projects that
         | are trying to address this?
        
           | arp242 wrote:
           | I heard Unison[1] does this, or something like it.
           | 
           | Never looked at it myself, so not an endorsement. Just came
           | up in the Lobsters discussion[2] on this last week where
           | someone mentioned it.
           | 
           | [1]: https://www.unisonweb.org/docs/tour/
           | 
           | [2]: https://lobste.rs/s/b9pddy/when_it_comes_git_history_les
           | s_is...
        
         | pjc50 wrote:
         | Does that mean git would have to support all known programming
         | languages, and any language syntax change would require a
         | backwards incompatible upgrade of the VCS?
        
           | nn3 wrote:
           | As well as any known config file formats.
        
           | globular-toast wrote:
           | More likely git would support a common AST format/protocol
           | and language tooling would be responsible for providing that.
        
         | iveqy wrote:
         | The way we store data is not the same as using that format to
         | solve conflicts. Take a look at semanticmerge.com
        
         | _ix wrote:
         | In a sibling comment, I was wondering about projects that are
         | trying ASTs. I haven't read deeply about it, but I recalled
         | that [pijul][1] might be a way forward.
         | 
         | [1]: https://pijul.com/manual/why_pijul.html#comparisons-with-
         | oth...
        
           | morelisp wrote:
           | Pijul's patch algebra, and storing diffs rather than
           | snapshots generally, makes it less amenable to these kinds of
           | experiments than Git. Changing your diff/merge strategy would
           | be akin to rewriting your entire project history.
        
         | breck wrote:
         | In Tree notation the text and ast have the same shape. You can
         | then have a semantic git
         | 
         | https://arxiv.org/pdf/1703.01192.pdf
        
         | luffapi wrote:
         | Not sure why you are downvoted. This is a genuinely interesting
         | idea and seems like the basis for next gen change management.
        
           | cellularmitosis wrote:
           | This isn't specific to VCS, but I'll drop a link to this
           | thread about structural editors which PaniczGodek has been
           | maintaining for a few years now, just to give it more
           | exposure:
           | https://twitter.com/PaniczGodek/status/1195784199250284545
        
           | morelisp wrote:
           | They are being downvoted for not understanding how git works.
           | If you have a 3-way diff/merge tool for ASTs you can plug it
           | into git and use it _today_, and you can use it on all
           | existing branches and historical changesets.
           | 
           | The "problem" is no one actually wants to resolve merges that
           | way.
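
As a toy illustration of what a structure-aware three-way merge could look like, the sketch below merges Python sources per top-level function, using the standard `ast` module (Python 3.9+ for `ast.unparse`). It is an assumption-laden sketch, not an existing Git merge driver: it ignores everything except top-level `def`s, drops comments, and the function names are invented here.

```python
import ast


def top_level_defs(source):
    """Map each top-level function name to its layout-normalized source text."""
    tree = ast.parse(source)
    return {
        node.name: ast.unparse(node)
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


def three_way_merge(base, ours, theirs):
    """Merge per function; return (merged_defs, names_in_conflict)."""
    b, o, t = (top_level_defs(src) for src in (base, ours, theirs))
    merged, conflicts = {}, []
    for name in sorted(set(b) | set(o) | set(t)):
        vb, vo, vt = b.get(name), o.get(name), t.get(name)
        if vo == vt:
            result = vo  # both sides agree (including both deleting it)
        elif vo == vb:
            result = vt  # only their side changed this function
        elif vt == vb:
            result = vo  # only our side changed this function
        else:
            conflicts.append(name)  # both sides changed it differently
            continue
        if result is not None:
            merged[name] = result
    return merged, conflicts
```

Because each function is normalized before comparison, a side that only reformats a function counts as unchanged, so the other side's real edit wins without a conflict.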
        
             | luffapi wrote:
             | There's been plenty of times my code has been "lost" in a
             | later commit because Git didn't know it was the same thing.
        
               | morelisp wrote:
               | I don't understand how this relates to what I said
               | without more details about what "lost" means, sorry.
               | 
               | If you mean Git had issues finding some specific code
               | motion to show in a diff, you can try one of the other
               | diff algorithms, or adjust the threshold for rename/copy
               | detection. AST-based differs would also suffer this
               | issue; a "nice diff" is not a formal problem and does not
               | have a universal solution.
               | 
               | If you mean you once had a mis-merge that dropped some
               | code you didn't want to drop, this won't go away with
               | AST-based diffs. It will just happen at the token level
               | instead of the line level.
               | 
               | If you mean it's generally hard to deal with collapsing
               | lots of branches with shared history, that's true but
               | would also be true with AST-based approaches. This is the
               | situation something like pijul could help with, but also
               | raises all the other tradeoffs of snapshot vs changeset
               | based approaches.
        
         | morelisp wrote:
         | Git stores snapshots, not diffs, and therefore could just as
         | well be considered storing ASTs. The trick is writing a useful
         | diff/merge for them. Programmers also don't think in ASTs,
         | arguably even less so than lines. The problem is not as formal
         | as it looks.
        
           | luffapi wrote:
           | There's great irony in someone whose name is "morelisp"
           | saying that programmers don't think in ASTs. Lisp syntax is
           | the AST.
        
             | morelisp wrote:
             | Which also means the Lisp _AST_ is not great to diff with.
             | 
             | First, it's too weak - you need to at least recognize
             | special top-level defun-style forms, or you'll generate
             | some minimal diff between two totally different functions
             | just because they both use the same cond pattern or
             | whatever.
             | 
             | Second, reader macros mean you can't really work on the
             | source AST in the first place, unless you also teach the
             | diff tool all your reader macros.
        
           | rwbhn wrote:
           | > Programmers also don't think in ASTs,
           | 
           | Citation needed
        
             | morelisp wrote:
             | T_PAAMAYIM_NEKUDOTAYIM
        
       | cellularmitosis wrote:
       | It is a shame that we don't have better tools, and that we are
       | still hand-editing text files in order to write programs.
       | 
       | Imagine having an editor which automatically applied your local
       | preferences around tabs/spaces, code formatting, variables up top
       | vs nearest use, function definitions nested to minimize top-scope
       | surface area vs all functions flat at top scope, etc etc etc
       | 
       | And when you are done editing, all of these local changes are
       | reversed and you submit the minimal possible diff.
       | 
       | (and if we want to really talk pipe dreams, the dev only sees an
       | AST editor and the underlying text is never even exposed in the
       | first place)
       | 
       | Our tools are so far behind mostly because everyone's thinking is
       | still chained to 1970's hand editing text files mentality. This
       | is the flying car which isn't being worked on because everyone is
       | still thinking about making better bicycles.
        
         | ajuc wrote:
         | The benefit is minimal, the task is hard, there's decades worth
         | of tools that won't work, and it's easy to mess up introducing
         | "impossible bugs" with behind-the-scenes transformations.
         | 
         | Also - AST isn't THAT important when reading the code. Let's
         | say I give you this:
         | 
         |     X (X X X X; X X X; X X) {
         |         X (X X X) {
         |             X(X X X X X)
         |         }
         |     }
         | 
         | Do you know what this code does? How about this:
         | 
         |     for int i = 0 i < 10 i ++ if i % 2 printf " %d " , i
         | 
         | I've used several graphical languages professionally (not AST-
         | based, graph-based, but the problem remains) and the main
         | problem was - structure wasn't fully describing what happens -
         | the "meat" of the behaviour was still in text form and was
         | hidden behind the pretty graphic form - in case of both of
         | these languages the meat was in the names of the subprocesses
         | called and in the mapping of process variables <-> subprocess
         | parameters.
         | 
         | And there were A LOT of these, so you couldn't show them at
         | once on the same screen as the graphical view of the process.
         | So programming with both of these languages was very
         | frustrating - you had to click on each node and look through
         | long lists of x:y substitutions to track how parameters flow
         | through the system.
        
         | spaetzleesser wrote:
         | Agreed. It seems silly that we still have to deal with things
         | like tabs vs spaces or formatting in different ways. This
         | should be handled by IDEs and editors.
        
         | al2o3cr wrote:
         | > Our tools are so far behind mostly because everyone's
         | > thinking is still chained to 1970's hand editing text
         | > files mentality.
         | 
         | People IN THE LITERAL 1970s were talking about this exact idea.
         | We still don't have this, for anything other than highly-
         | specialized applications (for instance, equation editors in
         | word processors).
         | 
         | It's almost like it's way, way, way harder to deliver software
         | than it is to handwave about how much better things would be if
         | people just listened to YOU.
        
         | cjfd wrote:
         | AST editing would be kind of okay. But never better than kind
         | of okay. Thoughtfully formatted text can be quite a bit nicer.
         | 
         | Also, cities where one can get around on bicycles are much nicer
         | than where one needs to use cars. Flying or otherwise.
         | https://www.youtube.com/watch?v=ul_xzyCDT98
         | 
         | To summarize both points: what looks like progress
         | superficially may actually not be progress at all.
        
       | spankalee wrote:
       | This isn't good general advice. The vast majority of projects
       | aren't going to need to cherry pick commits to such old branches,
       | and overly limiting style fixes and refactorings is a good way to
       | ossify a code base.
        
         | cypressious wrote:
         | This is exactly what the author is saying in the final
         | paragraph.
        
         | sharken wrote:
         | The same problems exist when you merge between multiple release
         | branches, there is an inherent resistance to refactoring as it
         | interferes with merges.
         | 
         | To me the solution is to migrate to trunk based development and
         | hide new features behind feature flags.
         | 
         | The biggest issue with this transformation seems to be
         | implementing feature flags in the code.
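         | 
         | A minimal sketch of what hiding a feature behind a flag can look
         | like, assuming flags are read from environment variables. The
         | FEATURE_ prefix, flag_enabled and render_total are hypothetical
         | names for illustration, not from any particular library:

```python
import os


def flag_enabled(name, environ=None):
    """Return True if the (hypothetical) FEATURE_<NAME> variable is switched on."""
    env = os.environ if environ is None else environ
    value = env.get("FEATURE_" + name.upper(), "")
    return value.lower() in {"1", "true", "yes", "on"}


def render_total(amount, environ=None):
    """Old and new behaviour live side by side on trunk; the flag picks one."""
    if flag_enabled("new_totals", environ):
        # New code path, merged early but dark until the flag is set.
        return f"Total: {amount:,.2f}"
    # Existing behaviour stays the default.
    return "Total: " + str(amount)
```

         | Once the new path is proven, the flag check and the old branch
         | can be deleted in one small commit.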
        
       | geofft wrote:
       | I agree with the author re Git as it exists right now, but this
       | feels like an unsolved technical problem that ideally should be
       | handled in your VCS rather than something to be worked around by
       | social conventions not to do cleanups.
       | 
       | git blame, for instance, has the "--ignore-rev" and
       | "--ignore-revs-file" options which let you specify some commits
       | that can be ignored. This is a little helpful for the sort of code
       | spelunking the author does (which I do very frequently), but it's
       | a manual process. There's no built-in convention in Git for what
       | this file should be named and how to have git blame pick it up
       | automatically. Moreover, you have to create the commit in order
       | to know what its revision is, which means that in squash/rebase
       | workflows, you have to land the style cleanup on the target
       | branch and only then can you add it to the ignore list.
       | 
       | One option that occurs to me is a special marker string to put in
       | the commit message like "Git-Ignore-For-Blame: yes" or something.
       | I'm not sure if there's a better way to do it.
       | 
       | More generically, if the commit is _only_ swapping out tabs and
       | spaces, "git blame -w" ought to take care of it. For more
       | involved reformatting, git diff and git merge both have
       | "ignore-all-space" modes, which will handle things like line
       | breaks.
       | 
       | But really what you want for blames and cherry-picks, I think, is
       | blames at the _syntax_ level. The Git commands only know about
       | lines - git blame runs on lines, git cherry-pick constructs diffs
       | based on lines and tries to re-apply them, etc. But at the
       | repository level there's nothing special about line breaks. Why
       | can't we run git blame or cherry-pick on an AST, so that it
       | operates regardless of formatting?
       | 
       | Are there tools that do this? (They don't need to be built into
       | git; a third-party tool ought to be able to do this just fine.)
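       | 
       | A rough sketch of the marker-string idea, assuming the caller has
       | already extracted (hash, message) pairs from the history. The
       | trailer name is the hypothetical one suggested above; Git does not
       | recognize it, so a script like this would have to generate the
       | ignore list. ignored_revs and ignore_revs_file are made-up names:

```python
import re

# Hypothetical trailer from the suggestion above; Git itself ignores it.
TRAILER = re.compile(
    r"^Git-Ignore-For-Blame:\s*yes\s*$", re.IGNORECASE | re.MULTILINE
)


def ignored_revs(commits):
    """Return the hashes of commits whose message carries the marker line.

    `commits` is an iterable of (sha, message) pairs supplied by the caller.
    """
    return [sha for sha, message in commits if TRAILER.search(message)]


def ignore_revs_file(commits):
    """Render the marked hashes one per line, as an ignore-revs file body."""
    return "".join(sha + "\n" for sha in ignored_revs(commits))
```

       | The resulting text could be written to a file and handed to
       | "git blame --ignore-revs-file", which expects full commit hashes,
       | one per line.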
        
         | bsmedberg wrote:
         | In theory you can plug any three-way merge tool into git. I'd
         | love to see merge tools that are syntax aware and can merge
         | ASTs rather than lines of text.
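         | 
         | As a small illustration of syntax-level comparison (not a merge
         | tool), here is a sketch using Python's standard ast module on
         | Python source. same_syntax and changed_definitions are
         | hypothetical helper names, and comparing top-level definitions
         | by name is an assumption that breaks down on renames:

```python
import ast


def same_syntax(old_source, new_source):
    """True if both sources parse to the same AST.

    ast.dump omits line/column attributes by default, so whitespace,
    redundant parentheses and comments do not affect the result.
    """
    return ast.dump(ast.parse(old_source)) == ast.dump(ast.parse(new_source))


def changed_definitions(old_source, new_source):
    """Names of top-level functions/classes whose AST differs, was added or removed."""

    def index(source):
        kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
        return {
            node.name: ast.dump(node)
            for node in ast.parse(source).body
            if isinstance(node, kinds)
        }

    old, new = index(old_source), index(new_source)
    return sorted(n for n in old.keys() | new.keys() if old.get(n) != new.get(n))
```

         | A blame or cherry-pick built on this would still need a way to
         | map AST nodes back to commits, which is the hard part.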
        
         | a-dub wrote:
         | i think what you (and the OP) are describing goes beyond making
         | merges more robust to nonfunctional changes by using something
         | like an AST representation. making text edits apply cleanly is
         | one part of the problem, but as the OP notes, the harder part
         | of these merges is actually when interfaces stay the same but
         | behavior changes. i think that to solve that, you'd need some
         | kind of static analyzer that could jump from changed
         | lines/syntax to execution paths and then to diffs of them.
         | 
         | i bet there are some cool papers out there on trying to do
         | this... but, this is essentially a restatement of the halting
         | problem, so building something that is guaranteed to be correct
         | using only static analysis may be impossible. (but this does
         | not preclude a solution that might be good enough)
        
           | catlifeonmars wrote:
           | Well in an ideal world, Liskov's substitution principle
           | applies and you only ever care about interface changes.
        
             | catlifeonmars wrote:
             | In this same world, you don't make breaking changes
             | without a new version of the interface. I like living in
             | fantasy land :)
        
       ___________________________________________________________________
       (page generated 2021-06-26 23:00 UTC)