[HN Gopher] Converting a Git repo from tabs to spaces (2016)
___________________________________________________________________
Converting a Git repo from tabs to spaces (2016)
Author : keybored
Score : 35 points
Date : 2025-05-02 13:06 UTC (9 hours ago)
(HTM) web link (eev.ee)
(TXT) w3m dump (eev.ee)
| Alifatisk wrote:
| Why would you want to convert from tabs to spaces?
| diggan wrote:
| > their mostly-Python codebase had always been indented with
| tabs
|
| Tabs VS spaces isn't usually very important, but what's more
| important is that all the stuff is the same way. So if all the
| other codebases (in the same language) are using tabs, then
| make everything (in that language) use tabs. Consistency
| basically :)
| Alifatisk wrote:
| I agree.
| gwbas1c wrote:
| I used to agree with that, until I read this article. I would
| always use the IDE's default and "not care" as long as the
| code was consistent.
|
| The problem with tabs is that they render as different widths
| in different contexts. For example, Visual Studio shows them
| as 4 spaces, but GitHub shows them as 8.
|
| Puts me firmly in the spaces camp now.
| diggan wrote:
| > I would always use the IDE's default and "not care" as
| long as the code was consistent.
|
| I mean, "just use the IDE's default" isn't really agreeing,
| unless that's what your entire organization does too, and
| you all use the same IDE :)
| InsideOutSanta wrote:
| _> The problem with tabs is that they render as different
| widths in different contexts_
|
| The funny thing is that this is why I prefer them. It means
| I control how indents render rather than the person who
| wrote the code.
| mabster wrote:
| I would agree, except that only deals with the left hand
| side of the code. We are also making decisions on the
| right hand side of the code to deal with line width as
| well which only really works if all developers have the
| same tab size.
|
| Nowadays I just chuck format on save on all the code I
| deal with so I don't have to deal with any of this stuff
| anymore.
|
| If we take this to its logical conclusion though, it would
| be pretty good if our tooling supported a difference
| between the view (using your own preferences) and storage
| (consistent code for committing to git or whatever).
| mmastrac wrote:
| Because it's the one true way, and tabs are WRONG.
|
| Also Vim > Emacs, the new BSG was better than the old BSG, TNG
| is the best Trek, and all the other hashed-out flamewars of the
| 90's and 2000's. :)
| evbogue wrote:
| There's a debate about new BSG being better than old BSG?
| mmastrac wrote:
| I posit:
|
| For every topic of A vs B where A and B are related in some
| way, no matter how small, there exists an argument C where
| two people take increasingly opposed positions about which
| is better.
| HideousKojima wrote:
| I actually love the original BSG. And the new one started
| out strong but the writers clearly didn't have a plan for
| where they wanted things to go despite the opening credits
| insisting the Cylons have a plan.
| mixmastamyk wrote:
| Agreed. Not to mention the original BSG was strangled in
| its crib for costing too much. Something a production in
| the aughts didn't have to worry much about.
| daneel_w wrote:
| Because the latter is universal, and it can always align
| perfectly.
|     # using tabs with tabsize 4
|     some_func( eyesore, blah );
|     some_func( eyesore, blah );
|     some_func( eyesore,
|     blah );
| Asooka wrote:
| We use 4-wide tabs and in our code style it would be
|     some_func(
|         verylongarg0,
|         verylongarg1
|     );
|
| Which I feel is the most readable option. If you have to
| break the args into a vertical list, you want to use the
| least amount of whitespace before each arg. It's also a bit
| easier to read with every term starting at a tab break.
| daneel_w wrote:
| With a large set of arguments broken down to multiple lines
| I prefer to keep them clear of the function name.
|     with_long_func_names( this_scheme,
|                           looks_muddled );
|
|     long_func_name(
|         tidier, scheme
|     );
|
| But my main gripe with tabs is that no one agrees on the
| width.
| mcdonje wrote:
| Not agreeing on width is an argument in favor of tabs.
| jdrek1 wrote:
| > But my main gripe with tabs is that no one agrees on
| the width.
|
| That's the entire point of tabs. One tab means one
| indentation level and you as the user can decide how
| that's displayed. Spaces forces everyone to see the code
| exactly as whoever decided on his favourite width and
| that is in the best case "only" annoying to people with
| different preferences and in the worst case actively
| hurtful to people with disabilities.
|
| The only argument spaces people ever have is "some of my
| colleagues are too stupid to properly indent with tabs
| and align with spaces" and that is trivially fixed by
| either of those:
|
| - don't use alignment, it's useless anyway
|
| - get better coworkers
|
| - educate your coworkers
|
| - use commit hooks to check wrong usage
|
| So basically there is no argument left on the spaces side
| at all^[1]. Meanwhile tabs semantically mean "one
| indentation level", take up less bytes, and most
| importantly allow everyone to have their own preferences
| without affecting other people. And honestly I am
| insanely baffled by how many people don't get the
| importance of that last part. Accessibility like that
| costs you nothing but means the world to other people,
| similarly to how we have ramps at public buildings for the
| elderly, wheelchair users, strollers, and so on. And not to
| mention the fact that there are a lot of autistic people
| in programming, who often have a harder time dealing
| with things not being as they want them to be. Is there
| any reason to choose an objectively inferior method and
| force that onto those demographics just because "muh
| alignment"?
|
| [1] Okay fine, there is one: "Tools I don't own don't
| display tabs as I want them, for example GitHub with
| their retarded default of 8". But first of all you can
| change that if you're logged in and second you're
| supposed to use your IDE and not a web interface...
| Asraelite wrote:
| I would agree that there aren't any arguments for spaces
| and would be 100% on the side of tabs, except for one
| problem: variable width means you can't enforce a maximum
| column limit.
|
| Some people don't care about column limits, but they're
| important to me because I like to tile multiple editor
| panes side-by-side with no space wasted.
|
| The entire debate is stupid anyway and should already be
| a solved problem. If we used tooling that operates on
| syntax trees instead of source text, then every developer
| could have exactly the formatting they want without
| conflicts. I don't know why that isn't more widespread;
| the only language I know of to do it is Unison.
| aeonik wrote:
| Why can't you just have a linter or a hook check that
| (tabs*2 + chars) < $defined_width
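A minimal sketch of the check described above, assuming each leading tab is counted as a fixed number of columns. The names `visual_width`, `overlong_lines`, `tab_width`, and `defined_width` are illustrative, not taken from any existing linter.

```python
def visual_width(line: str, tab_width: int = 2) -> int:
    """Width of a line when each leading tab counts as tab_width columns."""
    stripped = line.rstrip("\n")
    leading_tabs = len(stripped) - len(stripped.lstrip("\t"))
    other_chars = len(stripped) - leading_tabs
    return leading_tabs * tab_width + other_chars


def overlong_lines(text: str, defined_width: int = 80, tab_width: int = 2) -> list[int]:
    """Return 1-based numbers of lines where tabs*tab_width + chars is not
    strictly below defined_width (the inequality from the comment above)."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if visual_width(line, tab_width) >= defined_width
    ]
```

A hook or CI step could call `overlong_lines` on each changed file and fail when the list is non-empty. The limit then holds for anyone whose tab width is at most `tab_width`, which is the trade-off being debated here.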
| creeble wrote:
| The beauty is you don't have to.
| Vendan wrote:
| that's the entire point of tabs, they can be customized
| to what the person reading them wants. It's an accessibility
| issue
| (https://www.reddit.com/r/javascript/comments/c8drjo/nobody_t...).
| SoftTalker wrote:
| Well, obviously tabs should always be 8 spaces.
| daneel_w wrote:
| Not sure if you're joking since 8 makes the whole problem
| even worse :)
| DaiPlusPlus wrote:
| > and it can always align perfectly.
|
| I'm firmly in Team Tab, and I want to arrest any
| misconception that us Tabbers would do anything as
| nonsensical as using our precious variable-width tab-stop
| chars for anything like column-aligning identifiers: we
| don't.
|
| My very hard and fast rule is that tabs are for only
| indenting at the block level, while spaces are used for
| alignment after the initial tab chars; tabs must never be
| used on a line if preceded by any non-tab char.
|
| Whereas I can't stand always-using-only-spaces-for-indenting-
| and-alignment - especially because when you're drag-selecting
| text most editors won't snap your selection to the indent
| level, so you get RSI in your wrist from having to make
| micro-movements to _make sure_ you don't select more - or
| less - spaces than the intended indent. ...or worse: when
| moving the caret via the keyboard and having to tap your
| arrow-keys 4 or 8 times per indent instead of just once.
|
| You spaces-only people are totally spaced out, man.
| daneel_w wrote:
| Sir, you are way out of alignment. Detabulate immediately.
| Cieric wrote:
| I personally agree with this, but a lot of the tools out
| there break this easily. I'm curious if you have any tools
| that handle the formatting like this properly. I've written
| my own tool that will report invalid whitespace when
| following this, but it can't fix any of it automatically.
| The commonly used clang-format also messes up this scheme
| as it will convert alignment space to tabs.
| DaiPlusPlus wrote:
| I'll admit that I spend most of my time in Visual Studio
| which supports my preferences very well, including my
| .editorconfig (which is now the .editorconfig for my
| entire org... it's almost as if good ideas have a
| following ;)
|
| I do understand the appeal and advantages of having
| automated+opinionated re-formatting as part of a gated
| check-in process, because it's about having a normalized
| and consistent representation in the canonical repo; the
| idea being that you'd have a git-hook that would apply
| your own preferred formatting style on checkout which
| would be undone on commit; alas, we're not quite there
| yet.
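The checkout/commit round trip described above can be sketched as a pair of text transforms. Git does have a mechanism for this kind of thing (clean and smudge filters selected through gitattributes), but how these functions would be wired into it is left out here. The function names and the canonical width of 4 are assumptions for illustration.

```python
import re

INDENT_WIDTH = 4  # assumed width of the canonical, stored indentation


def smudge(text: str) -> str:
    """Storage -> view: turn each full group of leading spaces into one tab."""
    def to_tabs(match: re.Match) -> str:
        levels, remainder = divmod(len(match.group(0)), INDENT_WIDTH)
        # Leftover spaces (alignment) are kept after the tabs.
        return "\t" * levels + " " * remainder
    return re.sub(r"^ +", to_tabs, text, flags=re.MULTILINE)


def clean(text: str) -> str:
    """View -> storage: turn each leading tab back into INDENT_WIDTH spaces."""
    def to_spaces(match: re.Match) -> str:
        return " " * (INDENT_WIDTH * len(match.group(0)))
    return re.sub(r"^\t+", to_spaces, text, flags=re.MULTILINE)
```

For space-indented input, `clean(smudge(text))` returns the original text, which is the property a commit-side filter would need. This sketch is purely line-based, so it would also rewrite leading whitespace inside multi-line string literals.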
|
| ...but having a single, normalized format (even if
| everyone hates it for different reasons) is _the_ reason
| why gofmt and clang-format stick to spaces. I remember
| (back in 2017) being forced to submit to gofmt's
| dominion over my code and it ruining my beautifully
| aligned mass-assignments - and in my frustration I
| complained about this on StackOverflow and almost
| immediately someone replied with a working solution: use
| C-style comments to "protect" whitespace from being
| mangled by gofmt, see here:
| https://stackoverflow.com/questions/46940772/how-can-i-use-g...
|
| Also, apparently clang-format now supports tabs with some
| hoops:
| https://stackoverflow.com/questions/69135590/how-make-clang-...
| - does that work for you?
| Cieric wrote:
| I'm mainly in Visual Studio at my job as well, I was more
| asking for my personal projects since at work the issue
| has been "solved." Sadly the clang-format stuff doesn't
| work, while it looks like it supports tabs on the surface
| all those settings do (at least last I used it a year or
| 2 ago) is convert all of the tabs to spaces, do all the
| formatting it typically does, and then convert all x
| number of spaces back to tabs if they're at the beginning
| of the line. Effectively converting all the alignment
| spaces to tabs (leaving a few spaces at the end if it's
| not an even multiple.)
|
| My tool at this point basically just has a bunch of rules
| like:
|
| 1) if tab indentation changes, spaces for alignment aren't
| allowed
|
| 2) tab indentation can never be off by more than 1 of the
| prior line
|
| Also flags cases of trailing whitespace and I believe
| tabs not at the beginning of a line. Still debating how
| I'd like to handle fully spaced files as my current
| program reports no errors in that case, maybe just throw
| a warning somewhere that the file looks suspicious.
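A rough sketch of a checker along the lines of the rules listed above; it is not the commenter's tool. It assumes that "off by more than 1" only matters for increases (dedenting several levels at once is legal in languages such as Python), and the function name and messages are made up for illustration.

```python
import re

# Leading tabs (indentation) followed by leading spaces (alignment).
LEADING = re.compile(r"^(\t*)( *)")


def check_whitespace(text: str) -> list[tuple[int, str]]:
    """Return (line_number, message) pairs for lines breaking the rules."""
    problems = []
    previous_tabs = 0
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            if line:
                problems.append((number, "trailing whitespace"))
            continue  # blank lines do not affect indentation tracking
        match = LEADING.match(line)
        tabs, spaces = len(match.group(1)), len(match.group(2))
        rest = line[match.end():]
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        if "\t" in rest:
            problems.append((number, "tab after a non-tab character"))
        if tabs - previous_tabs > 1:
            problems.append((number, "indentation jumps by more than one level"))
        if tabs != previous_tabs and spaces:
            problems.append(
                (number, "alignment spaces on a line that changes indentation")
            )
        previous_tabs = tabs
    return problems
```

As with the tool described above, a file indented purely with spaces passes without errors, because every line then has zero tabs and the spaces are treated as alignment.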
| Ferret7446 wrote:
| > I want to arrest any misconception that us Tabbers would
| do anything as nonsensical as using our precious variable-
| width tab-stop chars for anything like column-aligning
| identifiers: we don't.
|
| The irony is that this is exactly what tab characters are
| used for. Have you wondered why they're called tabs?
| Because they're used for tabulation, making tables. They
| are intended for aligning columns in a table. Not for
| indentation.
| DaiPlusPlus wrote:
| We aren't using typewriters anymore.
| zahlman wrote:
| My style using spaces is
|     some_func(
|         eyesore, blah
|     )
|
| which would work just as well with tabs.
|
| Many years ago, I used tabs, and set them to two-space
| indent. The former because the entire point is that tabs
| carry different semantic information - this is a level of
| indentation, not just making things align vertically - and
| allow each developer to set the indentation width to their
| preference. (The other comment from DaiPlusPlus explains the
| proper use of tabs, just as I did it.)
|
| The latter because that makes them more square. Aesthetics
| matter.
|
| I switched mostly out of peer pressure. But one argument I
| did find convincing is that setting some specific limit on
| line length - whether it's 72 or 78 or 80 or 100 or anything
| else - makes sense, and letting people change the amount of
| indentation defeats that purpose. That is: the guy who likes
| 8-space indents can't actually have them, because it produces
| a horizontal scroll for code that "conformed to the style
| guidelines" when written by the 2-space guy.
|
| But now I alias names, break up complex subexpressions etc.
| to avoid questions of how to split code across multiple lines
| - and most lines in my code are nowhere near any such length
| limit. And I write short functions, so there aren't enough
| levels of indentation to matter.
|
| And I use 4-space indents, because standards have value after
| all.
| ooterness wrote:
| To make more money:
|
| https://stackoverflow.blog/2017/06/15/developers-use-spaces-...
| zahlman wrote:
| My best guess: using spaces selects for developers who
| understand how their editor works (which correlates with
| higher overall cluefulness), because they'd go insane
| otherwise.
| y-curious wrote:
| https://youtu.be/oRva7UxGQDw?si=XUvMWQqIF-Uy1gNp
| rascul wrote:
| This is the discussion I came here for.
| mcdonje wrote:
| Because they're deranged control freaks who need to convert a
| single character that is semantically a tab into multiple
| characters that are an opinionated representation of a tab.
|
| Devs: We need to separate concerns and split the view from the
| model.
|
| Also devs: Someone might view the code differently!!1!
| maw wrote:
| A codebase that's formatted notgivingashittily is an
| accessibility issue. It's not just deranged control freakism.
|
| Maybe Yelp's codebase was otherwise clean, but aside from
| golang projects (and the Linux kernel) I've come to associate
| tabs with unreadable slop code. Maybe your experience is
| different.
| smrq wrote:
| Forcing a single opinionated tab width is an accessibility
| issue -- a real one, not a weird heuristic that boils down
| to "tab fans can't format". I've read multiple accounts
| from people who need either very small tab widths (to
| accommodate unusually large font sizes for eyesight reasons
| without cascading off the side of the screen), or very
| large tab widths (to accommodate difficulty in seeing
| indentation differences, again for eyesight reasons).
| Defletter wrote:
| I've been _firmly_ pro-spaces ever since I discovered there was
| an everlasting war over this, and it came about primarily over
| documentation. Say you're writing documentation within a /***/
| block, so each line is prefixed by three characters. Now say
| your documentation includes a code snippet. Or let's say that
| particular sections of the documentation (such as JavaDoc's
| @see) are indented so each line always starts after the @see.
| You end up with documentation indented with spaces because it's
| the only way to ensure consistency. And if you're doing it with
| your documentation... why not your code too?
|
| However, my conviction has since been tested by Dart which
| opinionatedly forces you to use two-space indentation. There's
| no way to disable this and its IDE plugins enforce the style. I
| just find it so difficult to read, even with Rainbow Brackets.
| I yearn for Dart to use tabs just so I can configure the tabs
| to appear as four-space indentation. Or better yet, stop trying
| to coerce how people write their own code.
| david2ndaccount wrote:
| Tabs are a control character and have no business being in a
| text file. Do you use ascii record separator characters too?
| IvyMike wrote:
| Galaxy brain: indent using U+001F Unit Separator
| mmastrac wrote:
| I wish Git had a way to "skip" a commit for blame for mechanical
| changes like this. It's the one big shortcoming I keep running
| into. A commit should be able to be marked as "blame-free" and
| git blame should then walk up to the parent commit.
|
| It might be expensive to compute but man it would be so useful.
|
| Edit: TIL about .git-blame-ignore-revs. I am the 1 in 10000 for
| this one today, thanks.
| joshstrange wrote:
| `.git-blame-ignore-revs` is probably what you are looking for
|
| Example:
| https://gist.github.com/kateinoigakukun/b0bc920e587851bfffa9...
| y-curious wrote:
| My one gripe with this is that devs need to point their IDE
| to the file in the IDE settings. When I implemented
| .git-blame-ignore-revs, I got a lot of people complaining about
| blame disappearing completely and I had to point them all to
| editing IDE settings
| js2 wrote:
| It does. See `--ignore-revs-file`:
|
| https://git-scm.com/docs/git-blame
|
| You can configure a default:
|     git config blame.ignoreRevsFile .git-blame-ignore-revs
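As a small illustration of maintaining such a file, the sketch below appends a commit to the ignore list. The file holds one unabbreviated commit hash per line, and lines starting with `#` are comments. The helper name `with_ignored_rev` is hypothetical, and the 40-character check assumes a SHA-1 repository.

```python
import re


def with_ignored_rev(current: str, commit: str, reason: str) -> str:
    """Return the ignore-file text with commit appended, unless already listed."""
    if not re.fullmatch(r"[0-9a-f]{40}", commit):
        raise ValueError("expected a full 40-character commit hash")
    if commit in current.splitlines():
        return current  # already ignored, nothing to add
    if current and not current.endswith("\n"):
        current += "\n"
    # A comment line records why the commit is skipped in blame.
    return f"{current}# {reason}\n{commit}\n"
```

One way to use it would be to read `.git-blame-ignore-revs` (or start from an empty string), pass the text through `with_ignored_rev` with the hash of the reformatting commit, and write the result back in a follow-up commit.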
|
| GitHub supports it too:
|
| https://docs.github.com/en/repositories/working-with-files/u...
|
| I'm really curious though. This is a feature you've wished for:
| have you never bothered to run `man git-blame`, `git blame
| --help`, or Google for it? Git has supported it for ages and
| it's a trivially easy feature to find. Using your own
| description:
|
| https://www.google.com/search?hl=en&q=git%20skip%20commit%20...
| mmastrac wrote:
| I've been using Git since the early 2010s and this feature
| was released in Aug 2019
| (https://github.blog/open-source/git/highlights-from-git-2-23...).
|
| You don't think I looked for it for the first 7-8 years of
| using Git at least a few times and came up empty? Seems a
| little uncharitable. Hacker News is a place to learn about
| stuff, not be chided for missing a point note in a release.
|
| Come on man, you've been using HN for almost as long as I
| have. Be curious, treat people's comments with charity,
| continue the life-long learning tradition.
|
| Obligatory XKCD lucky 10,000 link: https://xkcd.com/1053/
| js2 wrote:
| You're right. My apologies. It wasn't meant as a critique.
| I've been using git even longer and my memory was that the
| feature had been there way before 2019. Time flies.
| Relevant commit:
|
| https://github.com/git/git/commit/ae3f36dea16e51041c56ba9ed6...
| keybored wrote:
| Thanks. Git (proj.) commits can be an enjoyable read.
| xiaoyu2006 wrote:
| I like how the length of the commit message is at the
| same order of magnitude as the commit itself.
| IshKebab wrote:
| It would be a lot more usable if you could put that info _in
| the commit_.
| OptionOfT wrote:
| That on its own is a security risk, as it would introduce
| means to hide a commit in the commit itself.
|
| At least with the . file you have to make 2 separate
| transactions.
| IshKebab wrote:
| No it wouldn't. You would still be able to see the commit
| in logs and file histories and if you ran blame without
| the skipping option.
| prepend wrote:
| No. I don't want the author to make that decision for me.
| I'd rather git record everything and then I can choose how
| to view or render it.
|
| Different people have different view preferences.
| jayd16 wrote:
| The annoying thing about git is that you can't really set
| this kind of stuff up globally for a project w/o digging into
| some custom hook solutions. They should really have some kind
| of default config file with all these things. I really don't
| understand why everything needs to be per user settings ONLY.
| McP wrote:
| Nice to see ignore-revs getting some love :)
|
| I originally wrote it because I wanted to do a mass-
| refactoring to llvm-project to change its weird naming
| convention and "it will mess up git blame" was an objection
| that was raised. Getting ignore-revs landed took many
| iterations over several months (thanks Barret!) and at the
| end of it I felt so drained that I didn't have the energy to
| do the mass refactoring I originally planned. Oh well. Maybe
| someday.
| mabster wrote:
| A big thank you! Blame history being correct is something I
| care quite a bit about and I always add one of these files
| when I do formatting changes. I think I'm probably the only
| developer on my teams with this configured on though haha!
| patrickthebold wrote:
| Is .git-blame-ignore-revs what you are looking for?
| braiamp wrote:
| `blame -w` ignores the ones that are described in the article.
| PhilipRoman wrote:
| git-blame-ignore-revs is great, but ultimately a half measure.
| Replace blame with log -L
| kwk1 wrote:
| See also `git blame -w`
| joshstrange wrote:
| > Then, commit! As per Yelp tradition when rewriting every single
| file in the whole codebase, I attributed the commit to Yelp's
| lovable mascot Darwin. It stands out better in git blame, and it
| preserved the extremely critical integrity of my commit stats.
|
| Interesting, I fully expected this blog post to touch on
| `.git-blame-ignore-revs` as a way to not "pollute" the git history but
| I'm not sure when that "came out". I found a Github issue from
| 2021 asking for support to be added to Github so it may just be
| newer.
|
| How do other people feel about this? Massive code changes across
| the codebase? Where I work some people are (understandably)
| concerned about it "ruining" `git blame` or IDE tools to blame.
| It's not useful to see "Converting to spaces!" on every line you
| want more context on. Yes, you can step further back but that's
| always been a little awkward for me (at least in IntelliJ) but
| maybe I'm missing something. I just find it incredibly helpful to
| understand the context of why a line was last changed and I'd
| want to skip over any edits like tabs->spaces.
| matsemann wrote:
| What if one instead rewrote the last commit for each line to
| use spaces for that line? Or just rewrite the whole history to
| have used spaces. Might break something in the history if one
| were to check out an old commit, though. And makes it hard to
| revert if something breaks due to changing to spaces
| (impossible to find the offender in the diff).
| _Algernon_ wrote:
| >Or just rewrite the whole history to have used spaces.
|
| Ah, yes. The 1984 approach to coding
| woodrowbarlow wrote:
| `git blame -w` ignores whitespace-only changes, for what it's
| worth.
| johnmaguire wrote:
| Added to Git in 2019:
| https://github.com/git/git/commit/209f0755934a0c9b448408f9b7...
|
| Supported on Github in 2022:
| https://github.blog/changelog/2022-03-24-ignore-commits-in-t...
| zahlman wrote:
| > I fully expected this blog post to touch on
| `.git-blame-ignore-revs` as a way to not "pollute" the git history
| but I'm
| not sure when that "came out".
|
| Per https://news.ycombinator.com/item?id=43869828, it appeared
| August 2019 - so, indeed too late for OP.
|
| e: Also, FTA:
|
| > Blame is not, in fact, permanently ruined. git blame -w
| ignores whitespace-only changes.
| gwbas1c wrote:
| FYI: If you're in the .net ecosystem, you can choose your tabbing
| style (tabs or spaces) with an .editorconfig file. Then running
| "dotnet format" will change everything for you. (And, if you use
| github, you can create actions to assert that the .editorconfig
| is followed.)
| diggan wrote:
| FWIW: EditorConfig isn't a ".net ecosystem" thing but works
| across a ton of languages, editors and IDEs:
| https://editorconfig.org/
|
| Also, rather than using GitHub Actions to validate if it was
| followed (after branch was pushed/PR was opened), add it as a
| Git hook (https://git-scm.com/docs/githooks) to run right
| before commit, so every commit will be valid and the
| iteration<>feedback loop gets like 400% faster as you don't
| have to wait for Actions to finish.
| gwbas1c wrote:
| Git hooks require environment-specific configuration. CI
| enforcement makes sure that everyone follows the rules, even
| if they "forget" to set up the git hook.
|
| Also: dotnet format is kind of slow, which is why Git hooks
| aren't used where I work.
| diggan wrote:
| > CI enforcement makes sure that everyone follows the
| rules, even if they "forget" to set up the git hook.
|
| Yeah, my wording was a bit poor (shouldn't have said
| "rather"), both are needed, one just helps you fix stuff
| faster :)
|
| And if you write your hook in a language that can cross-
| compile and can easily deal with multiple platforms (Go,
| Rust, NodeJS, many options [probably .net too?]), it's
| really easy. Just need to make the setup of them part of
| the onboarding.
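A hedged sketch of what such a pre-commit hook might look like for a spaces-only Python codebase. `git diff --cached --name-only --diff-filter=ACM` lists the staged paths; the function names are illustrative, and for simplicity the check reads the working-tree copy of each file, not the staged blob.

```python
import subprocess


def tab_indented_lines(text: str) -> list[int]:
    """Return 1-based numbers of lines whose indentation contains a tab."""
    numbers = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            numbers.append(number)
    return numbers


def staged_python_files() -> list[str]:
    """List staged (added, copied, modified) .py paths."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [path for path in result.stdout.splitlines() if path.endswith(".py")]


def main() -> int:
    """Print offending lines; a non-zero return value should abort the commit."""
    failed = False
    for path in staged_python_files():
        with open(path, encoding="utf-8") as handle:
            for number in tab_indented_lines(handle.read()):
                print(f"{path}:{number}: tab in indentation")
                failed = True
    return 1 if failed else 0
```

Saved as an executable `.git/hooks/pre-commit` that ends with `sys.exit(main())`, this would reject commits with tab indentation locally. The same `tab_indented_lines` check can run in CI for contributors who never installed the hook.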
| gwbas1c wrote:
| > One way or another, you must get this block in your devs' Git
| configuration
|
| Uhm, things like this should be enforced in CI. IE, as a rule
| that must pass in order for a pull request to be merged.
| kgwxd wrote:
| I never understood why programmers universally like fixed width
| fonts, but then about half want just 1 of those characters to be
| batshit crazy.
| gwbas1c wrote:
| One funny anecdote: I once did a similar cleanup on a codebase
| that was _mostly_ spaces, but a few tabs slipped in. (I just did
| a find and replace on \t -> " ")
|
| Suddenly, one unit test broke. On closer inspection, whoever
| wrote it put a tab character into a string. I changed the test to
| use \t.
| baobun wrote:
| I would just approach this like text. Something like:
|     find . -type f -name '*.py' -exec sed -i 's/^\( *\)\t/\1    /' {} \+
|
| , until you don't see a diff
|
| Seems simpler to adjust that general approach to whatever
| codebase and replacement.
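A sketch of the same idea in Python, restricted to leading whitespace so that a tab inside a string literal (as in the unit-test anecdote above) is left alone. The function name and the default width of 4 are assumptions.

```python
def expand_leading_tabs(text: str, width: int = 4) -> str:
    """Replace tabs in each line's leading whitespace with spaces.

    Tabs that appear after the first non-whitespace character are untouched.
    """
    converted = []
    for line in text.splitlines(keepends=True):
        body = line.lstrip(" \t")
        indent = line[: len(line) - len(body)]
        # expandtabs advances to the next tab stop, so "  \t" becomes 4 columns.
        converted.append(indent.expandtabs(width) + body)
    return "".join(converted)
```

Unlike the repeated substitution, this converts every indentation level in one pass. Continuation lines of multi-line string literals that begin with tabs would still be rewritten, so running the test suite afterwards remains worthwhile.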
| s09dfhks wrote:
| what is this furry tomfoolery
| user9999999999 wrote:
| whitespace is a terrible block scope definition, it's literally
| using 'invisible' characters to determine block scope! just use
| semi colons. LONG LIVE SEMI COLONS ;;;;;;;;;;;
___________________________________________________________________
(page generated 2025-05-02 23:01 UTC)