[HN Gopher] Commit often, perfect later, publish once: Git best ...
___________________________________________________________________
Commit often, perfect later, publish once: Git best practices
(2013)
Author : hidden-spyder
Score : 230 points
Date : 2021-07-04 05:45 UTC (17 hours ago)
(HTM) web link (sethrobertson.github.io)
(TXT) w3m dump (sethrobertson.github.io)
| globular-toast wrote:
| One of the more challenging parts of git for people to understand
| is that many of its tools and concepts serve many purposes.
| Commits are a good example of this. To git, a commit is a very
| simple and precisely defined thing. But what they represent is
| basically up to you.
|
| One thing they could represent is your current work as of this
| minute. This is really useful to you while you're working in case
| you break something and need to revert. But it has very little
| use to anyone else. Nobody needs or wants to know the history of
| a project down to the minute. How long it took you to develop a
| feature and how many times you messed up is completely irrelevant
| to the project a week from now. So these commits should remain
| private.
|
| Another thing they could represent is _versions_. What's a
| version? It's a fully valid and working copy of the project that
| anyone could use. These are the commits that should end up on the
| master branch. Sometimes it's possible to write these first time
| (for bug fixes or trivial stuff), but most of the time they need
| to be curated and constructed from the work done using the
| previous type of commit.
|
| If history is worth keeping, it's worth maintaining. If you don't
| think it's worth maintaining, delete your history.
| letmeinhere wrote:
| > Does this mean one per product, program, library, class? Only
| you can say. However, dividing stuff up later is annoying and
| leads to rewriting public history or duplicative or missing
| history. Dividing it up correctly beforehand is much better.
|
| Got it, simply devise the correct level of modularization for an
| increasingly complex project at the beginning of time, to avoid
| annoyances. Easy peasy. /s
|
| Git actually has tools to separate paths into their own
| repositories with intact histories, so I recommend the opposite.
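|
| A minimal sketch of one such tool, `git filter-branch
| --subdirectory-filter`, run on a throwaway clone. The repository
| layout (`app/`, `lib/`) and file names are made up for
| illustration; `git subtree split` and the third-party
| git-filter-repo are alternatives for the same job.

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/mono"
cd "$tmp/mono"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

# A small "monorepo" with two top-level directories (hypothetical layout).
mkdir app lib
echo "app v1" > app/main.txt; git add app; git commit -q -m "Add app"
echo "lib v1" > lib/util.txt; git add lib; git commit -q -m "Add lib"
echo "lib v2" > lib/util.txt; git commit -q -am "Improve lib"

# Work on a clone so the original repository stays untouched.
git clone -q "$tmp/mono" "$tmp/lib-only"
cd "$tmp/lib-only"

# Rewrite history so lib/ becomes the project root. Commits that never
# touched lib/ are dropped; the others keep their messages and order.
FILTER_BRANCH_SQUELCH_WARNING=1 \
  git filter-branch --subdirectory-filter lib HEAD >/dev/null 2>&1

git log --format=%s
```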
| phone8675309 wrote:
| > Got it, simply devise the correct level of modularization for
| an increasingly complex project at the beginning of time, to
| avoid annoyances. Easy peasy. /s
|
| I know you're being sarcastic, but this is the "you know what
| every late project has in common - it should have started
| earlier" observation, just applied to program design. A killer
| thought terminating cliche.
| mtVessel wrote:
| > "...your work will not be lost for at least two weeks unless
| you really work at it"
|
| What happens after two weeks?
| phone8675309 wrote:
| git will prune inaccessible objects after 2 weeks by default:
| https://git-scm.com/docs/git-gc#Documentation/git-gc.txt-gcp...
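|
| A small sketch of what that grace period means in practice,
| assuming default settings: a commit that no branch points at any
| more survives an ordinary `git gc`, because the reflog still
| references it and `gc.pruneExpire` defaults to two weeks. The file
| name and commit messages are invented for the demo.

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

echo one > f.txt; git add f.txt; git commit -q -m "First"
echo two > f.txt; git commit -q -am "Second"
lost=$(git rev-parse HEAD)

# Throw the second commit away: no branch points at it any more.
git reset -q --hard HEAD~1

# An ordinary gc does not delete it while the reflog still mentions it.
git gc -q
git cat-file -t "$lost"

# Show the configured grace period, if any has been set explicitly.
git config --get gc.pruneExpire || echo "gc.pruneExpire unset, default applies"
```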
| laurent92 wrote:
| GDPR and GIT: Should we have personally identifiable
| usernames/emails in Git?
| tremon wrote:
| Yes, because of
| https://en.wikipedia.org/wiki/Moral_rights_(copyright_law):
|
| > The moral rights include the right of attribution, the right
| to have a work published anonymously or pseudonymously, and the
| right to the integrity of the work
|
| Attribution and integrity are guaranteed by Git's committer and
| hash fields, so it's just complying with the law here. If the
| artist wants to publish anonymously or pseudonymously they can
| do so by choosing the committer name and e-mail appropriately,
| but to suggest that git is somehow violating the GDPR here is a
| stretch.
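|
| As a sketch of what "choosing the committer name and e-mail"
| looks like, the identity can be set per repository. The pseudonym
| and address below are placeholders, not a recommendation.

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/main

# Stored in this repository's .git/config only; other repositories
# keep whatever identity is configured globally.
git config user.name "nightowl"
git config user.email "nightowl@example.invalid"

echo hi > README; git add README; git commit -q -m "Initial commit"

# Both the author and the committer fields carry the chosen identity.
git log -1 --format='author:    %an <%ae>%ncommitter: %cn <%ce>'
```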
| dahart wrote:
| I'd love to hear the case against, what reasons are you
| thinking of, and how would you imagine git working without PII
| (personally identifiable information)? In the general case,
| since you're asking about _all_ git, I'm not sure it makes
| sense, but there probably are some cases where it might. (And
| maybe you're stumbling onto a business opportunity...)
|
| My reasons for thinking the default status-quo username/email
| setup is not generally an issue wrt GDPR are:
|
| - GDPR is for EU residents, not git repos that don't have any
| contributors residing in the EU.
|
| - GDPR mainly applies to "professional or commercial"
| organizations of more than 250 people, insofar as they sell
| within the EU or "monitor" EU residents.
|
| - Git is a tool anyone can use, and it's not a company or
| organization, so there isn't any way to enforce GDPR against
| git per se.
|
| - Many git repos are private, only accessible within a company
| or organization, so no need to keep out PII.
|
| - Emails & usernames in git are already opt-in, mostly. People
| using github can use throwaway identifiers, if they want. But
| it usually makes more sense to want your public git
| contributions to be personally identifiable and publicly
| associated with you, so that people can reach you, for people
| building resumes with OSS contributions, etc.
|
| It might make sense for public repos that are associated with a
| company or professional org to hide emails (and maybe usernames
| too, but usernames are not automatically PII, and anyway you'd
| need _some_ kind of author identifier).
|
| Your question does potentially bring up what I think is an
| interesting problem with GDPR which is the ability to revoke
| permission to PII. Someone could opt-in to a company's public
| repo with their email, and later want to revoke it, which would
| be problematic with git in a public repo, to say the least.
| dang wrote:
| Discussed a bit at the time:
|
| _Commit Often, Perfect Later, Publish Once: Git Best Practices_
| - https://news.ycombinator.com/item?id=6138928 - Aug 2013 (4
| comments)
| andreineculau wrote:
| I haven't read it all but this could be great material as a git-
| mindset booklet for a team. Great to see someone taking the time
| to put down common sense notes about working with git, rather
| than yet another dry "this is how git works" or "this is how you
| should use git (because i told you)".
| 29athrowaway wrote:
| This is not a best practice.
|
| Detecting problems is finding a needle in a haystack.
|
| If you want to be efficient at finding the needle in the
| haystack, you make the haystack smaller.
|
| That is what code review is about. You do not scan the entire
| source code for defects, only the new code being added.
|
| You can commit often to your own branch, and submit for code
| review once it's stable.
| darepublic wrote:
| Create a branch, do frequent commits with messages like "wip",
| "bug city", "wtf", prior to merge squash and create message
| referencing the objective of the commits and what they
| accomplished.
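|
| Roughly, that workflow could look like the following sketch. The
| branch name, file and final message are invented; `git merge
| --squash` stages the combined result without creating a merge
| commit, so the follow-up `git commit` supplies the real message.

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

echo base > app.txt; git add app.txt; git commit -q -m "Initial commit"

# Frequent throwaway commits on a private branch.
git checkout -q -b feature
echo step1 >> app.txt; git commit -q -am "wip"
echo step2 >> app.txt; git commit -q -am "bug city"
echo step3 >> app.txt; git commit -q -am "wtf"

# Back on main: stage the branch's combined changes as one change set.
git checkout -q main
git merge --squash feature >/dev/null
git commit -q -m "Add three-step processing to app

Squashes the work-in-progress commits from the feature branch."

git log --format=%s main
```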
| war1025 wrote:
| Was with you for the first half.
|
| For me it's lots of XXX commits to save work in progress,
| followed by a `git reset <base>` and several passes of `git add
| -p` to build up a set of reasonable standalone commits.
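|
| A non-interactive sketch of the same idea. Real use would pick
| hunks with `git add -p`; here whole files are staged instead so
| the example can run unattended. File names and messages are
| hypothetical.

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

echo base > README; git add README; git commit -q -m "Initial commit"
base=$(git rev-parse HEAD)

# Save work in progress with meaningless messages.
echo "parser" > parser.txt;      git add parser.txt; git commit -q -m "XXX"
echo "docs" > docs.txt;          git add docs.txt;   git commit -q -m "XXX"
echo "parser fix" >> parser.txt; git commit -q -am "XXX"

# Mixed reset: the branch moves back to base, the files stay as they are.
git reset -q "$base"

# Rebuild the history as standalone commits, one logical change each.
git add parser.txt; git commit -q -m "Add parser"
git add docs.txt;   git commit -q -m "Document the parser"

git log --format=%s
```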
| chiefalchemist wrote:
| I use Git the best I can (read: I am by no means Mr Git). I try
| to read these types of articles and I too often feel inadequate
| (i.e., there's not enough explanation on what, why, and why that
| matters). Real life examples would help a lot. Else I have to try
| to imagine the appropriate situation for each recommendation; and
| frankly I can't always do that. There are more possible scenarios
| than I have experiences.
|
| There's got to be a better way. Does learning and/or using have
| to feel so complicated?
| cjfd wrote:
| I kind of feel that this kind of git advice is way beyond the
| point of diminishing returns. As conscientious developers we
| have a lot of work. We write code of good quality. We refactor
| that code regularly. We write automated tests. We test the
| program manually. We use linters and type checkers. We talk to
| people to find out whether what they requested is actually what
| they need. But the day only has 24 hours. At some point one has
| to say that enough is enough. I really want to put the 'enough is
| enough' point before worrying about a good looking commit
| history. Some years ago we all went from svn to git and I am not
| really sure the improvement was worth it. Sure, git is
| objectively the better version control system. One can do a lot
| more things in git. But then the disadvantage is that one
| actually gets to think about all of these 'a lot more things'.
| One thinks about questions like should one merge or rebase and so
| on. It seems like an activity that falls under what is commonly
| called 'bike shedding'. But like I was asking before: is this
| really worth it?
| TimTheTinker wrote:
| > Some years ago we all went from svn to git and I am not
| really sure the improvement was worth it.
|
| I used svn for years before switching to git. Trust me, git is
| a _huge_ usability improvement, even for use as an individual
| developer or small team.
|
| Merging in svn was _significantly_ harder. Cherry-picking was
| unheard of, not to mention diffs and patches.
| tovej wrote:
| Some things are more worth it than others, and you can choose
| how much effort you put in. Small commits are definitely worth
| it when debugging. I've lost hours, maybe even days because of
| long commits that were a pain to bisect. The help you get from
| singling out a commit of 3 changes compared to one with 20
| changes is immense.
| LAC-Tech wrote:
| I agree 100%. I wish we could use a version control tool that
| didn't require so much attention. I don't want to read article
| after article for something that should just get out of my way.
| sharken wrote:
| A set of basic Git commands for everyday use is a good
| starting point and of course knowing a bit about the staging
| area and remote/upstream sources as well.
|
| But if you move into the enterprise with multiple releases in
| active development and multiple teams, then it's hard to keep
| Git out of the way.
| bonzini wrote:
| It's out of your way _if you so desire_, just squash and
| commit. What you pay the price for isn't satisfying git's
| whims; it's a better debugging or learning experience months
| or years down the line, which is something that git enables
| you to do.
| t-writescode wrote:
| This is honestly where I think good git GUI tools help, and
| for those I go to Git Extensions, or did until I had to use a
| Mac for work.
|
| I learned enough using it that I can work through most issues
| in a regular and natural way nowadays.
| simonw wrote:
| I've realized that as software engineers our work isn't writing
| code: it's changing code.
|
| As such, the unit of work that we produce is the code change.
|
| Which means crafting a good commit - with a good commit message
| - is key to our craft.
| veidr wrote:
| I think your question is two questions: is git (with its
| massive additional complexity over svn) worth it, and then is
| the level of git fuckery described in the article worth it?
|
| Obviously, most of the world thinks git as a whole is worth it.
| Otherwise, it wouldn't have so thoroughly and totally dominated
| the software revision control system space.
|
| OTOH, probably most of us are using a tiny fraction of the full
| git feature set, and would perhaps have a different opinion if
| we were forced to learn and use all of it.
|
| But is it worth it to learn, remember, and follow all the
| guidelines in this article?
|
| YMMV and it depends on the project, but personally I would say
| no, unless it is a project of unusual quality and importance.
| Life is short and it probably won't matter at all in the end.
| BugsJustFindMe wrote:
| > _But the day only has 24 hours_
|
| 8. The day only has 8 hours. I'm not getting paid for more than
| that.
| smartbit wrote:
| 6. you might get paid for 8 hours, but _effectively_ work 6
| hours. The rest of the _working hours_ are for mundane tasks.
| war1025 wrote:
| If we're going down that road, I generally try to be
| productive from roughly 2pm to 4:30. The rest of the day is
| for waiting to see if something unexpected catches fire and
| figuring out what it is I should be working on for my two
| and half hours of focus time.
| avar wrote:
| You're right, for a lot of people it isn't worth it. The linked
| article contains way more detail than I daresay the vast
| majority of users of git need to know about these days.
|
| If you're a programmer who heavily uses it in your workflow you
| might find this sort of advice useful, but I've helped plenty
| of programmers with some issue in git that wouldn't have
| required my help if they had even intermediate knowledge of it.
|
| Does that mean that their time would be best spent knowing more
| about it? Maybe, but maybe not. In some cases I'd say
| definitely not. A lot of people are productive with it knowing
| no more than "git status/add/commit/push/pull".
|
| I've got expert-level knowledge on git, but I use plenty of
| tools that I've got at best a novice or beginner-level amount
| of knowledge of. You can't know everything. The tricky part is
| that until you know something you can't know what you're
| missing.
| dkarl wrote:
| I disagree. Like code, commit history is read much more often
| than it is written, so care in writing is repaid over time.
|
| You don't even have to take that much care and effort. The vast
| majority of the time, you only need to rebase WIP commits
| together into a single commit with a descriptive message. The
| effort of this is insignificant compared to the value of a
| meaningful, readable commit history.
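|
| One way to do that folding without hand-editing a todo list is
| `--fixup` plus `--autosquash`, sketched below. Setting
| `GIT_SEQUENCE_EDITOR=:` just accepts the generated list as is; the
| branch and file names are invented.

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

echo base > README; git add README; git commit -q -m "Initial commit"

git checkout -q -b feature
echo v1 > feature.txt; git add feature.txt
git commit -q -m "Add feature with a descriptive message"
target=$(git rev-parse HEAD)

# Later WIP changes are recorded as fixups of that first commit.
echo v2 >> feature.txt; git commit -q -a --fixup="$target"
echo v3 >> feature.txt; git commit -q -a --fixup="$target"

# Fold the fixups into their target; ":" accepts the todo list unedited.
GIT_SEQUENCE_EDITOR=: git rebase -q -i --autosquash main

git log --format=%s main..feature
```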
| Retric wrote:
| I personally spend significantly more time writing git commit
| messages than reading them. Spending lots of time looking at
| old commits is IMO a sign of poor code quality. Similarly,
| summarizing each change deserves some real thought so when
| you are reading these messages they're actually helpful.
| vngzs wrote:
| Ever spend significant time spelunking in the Linux kernel
| commit history [0]? They have a guide [1] on submitting
| good patches, and lots of care is taken to commit message
| quality.
|
| The commits are quite helpful, not just for diagnosing
| errors but explaining philosophy and reasoning behind code
| changes [2]. I'm tremendously thankful for the amount of
| effort those developers put into this, because it helps
| newcomers come up to speed and follow development. In that
| kind of codebase, it would be virtually unapproachable
| otherwise.
|
| [0]: https://git.kernel.org/pub/scm/linux/kernel/git/torval
| ds/lin...
|
| [1]:
| https://www.kernel.org/doc/html/v4.10/process/submitting-
| pat...
|
| [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torval
| ds/lin...
| SergeAx wrote:
| When your codebase is 5+ years old and was written by 10+
| engineers, 2/3 of those are not working here anymore, all
| you have is commit history linked to task tracking
| database. So it is naturally the first step of working on a
| new feature: get to know the subdomain, find corresponding
| sub-namespaces, look into commit history of those and find
| related task and specs.
|
| Or you may just punch in a lot of code in the hope that it will
| somehow find its place in the codebase, be consistent with other
| code, be readable by your successors and not break some
| edge cases not covered by autotests.
| Retric wrote:
| Interesting, I favor reading the entire codebase when
| possible assuming I am going to be working with it for a
| few months. I really start looking at old commits
| only when I need to edit code that's particularly brittle
| or incomprehensible.
| Quikinterp wrote:
| If I ever have to fix bugs I'll go and look at the commit
| history to see where it likely started. Especially if it
| was code that belonged to me but someone inadvertently
| committed something that broke it, that's why I like a
| nice commit history.
| SergeAx wrote:
| Let's take exactly 10 engineers and let them work on the
| project for exactly 5 years. That would be roughly 2500
| mythical man-weeks. Let's say 2000 due to vacations,
| holidays, non-coding tasks, other tasks and Googlesque
| 20% pet project time.
|
| Let's say every engineer produces 100 lines of code
| including blanks and comments per day and deletes another
| 50. This is anecdotal, you may check your own numbers.
| This is +250 LOC per week and for our team it means that
| entire codebase will be 0.5m LOC in 5 years.
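|
| The estimate can be checked with shell arithmetic. Note that the
| numbers only add up if the 2500 figure is read as person-weeks
| (10 engineers x 5 years x 52 weeks); the five-day working week is
| an assumption implied by the +250 LOC per week figure.

```shell
engineers=10
years=5
weeks_per_year=52

# Gross effort in person-weeks: 2600, i.e. "roughly 2500".
gross=$((engineers * years * weeks_per_year))

# After vacations, holidays and non-coding work (figure from the comment).
effective=2000

# 100 lines added and 50 deleted per day, over a 5-day week.
net_per_day=$((100 - 50))
net_per_week=$((net_per_day * 5))

# Size of the codebase after five years.
total=$((effective * net_per_week))

echo "gross person-weeks: $gross"
echo "net LOC per week:   $net_per_week"
echo "total LOC:          $total"
```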
|
| I saw a codebase this size once. In 2 years I've read
| about 1/10 of it.
| aelzeiny wrote:
| > a sign of poor code quality
|
| Or really complex software. Those of us who work in B2B
| SaaS products know what it's like to understand what a
| block of code does, but not comprehend why it was written.
|
| However, I don't disagree with this sentiment. I think what
| most devs fail to understand about git-history is that it's
| just another place for sparsely populated documentation.
| Your commit history might be AMAZING, but chances are that
| nobody knows to look for it. There's just too many places
| where one has to reiterate intent and purpose (doc-string,
| inline comments, project specs). It's a lot to keep up with
| - especially for non-technical folk.
|
| Solutions:
|
| What worked best for my team was a simple git pre-commit-hook
| shell script that just auto-tags the ticket number
| onto the start of every commit message. So `git blame` just
| points back to JIRA/Asana/Trello/Whatever you're using to
| document intent. That way, there's a single source of truth
| for developer documentation that is accessible by everyone.
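|
| A hedged sketch of such a hook, not the commenter's actual
| script. Git's `pre-commit` hook runs before a message exists, so
| this version uses `prepare-commit-msg`, which receives the message
| file as its first argument. It assumes the ticket number is the
| start of the branch name (e.g. `PROJ-123-add-parser`), which is a
| made-up convention.

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

mkdir -p .git/hooks
cat > .git/hooks/prepare-commit-msg <<'EOF'
#!/bin/sh
# $1 is the path of the file holding the proposed commit message.
branch=$(git symbolic-ref --short -q HEAD) || exit 0   # detached HEAD: skip
ticket=$(printf '%s\n' "$branch" | grep -oE '^[A-Z]+-[0-9]+') || exit 0
msg=$(cat "$1")
case $msg in
  "$ticket"*) exit 0 ;;   # already tagged, e.g. when amending
esac
printf '%s %s\n' "$ticket" "$msg" > "$1"
EOF
chmod +x .git/hooks/prepare-commit-msg

echo base > README; git add README; git commit -q -m "Initial commit"

# Branch named after a (hypothetical) ticket.
git checkout -q -b PROJ-123-add-parser
echo parser > parser.txt; git add parser.txt; git commit -q -m "Add parser"

git log -1 --format=%s
```

| Branches that do not start with a ticket-like prefix, such as
| `main` here, are left alone.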
|
| At Google if there's something that needs explaining you'll
| see a tiny-url-link that just points to the Google Doc as
| an in-line comment (i.e: go/my-project-specification-doc).
| Commit messages do the same. This saves on copy-pasting
| multiple lines of purpose on every file and line of code.
| jackhab wrote:
| As someone who also switched from SVN to git many years ago I
| understand this "is this really worth it" thing. It came to my
| mind many times in the beginning when I said to myself "do we
| really need a distributed SCM if everyone is always working
| against the same server anyways".
|
| But putting git's technical advantages aside, for me, one of
| its most important values is that it has become de facto
| industry standard. It's like IP/TCP/UDP protocols which
| everyone understands be it a tiny IoT device or 10K-core
| cluster.
|
| With all this enormous amount of programming languages,
| frameworks and tools we have in the industry it's so nice we've
| managed to agree, at least, upon one very important element of
| our work.
| tharkun__ wrote:
| This just reminded me of when I last didn't use git. It's
| been so long and really worth it.
|
| It must have been around 2010. I was with a client and they
| were using SVN in the projects I worked on (and some other
| obscure - to me - versioning system that they literally had
| "build masters" for that were the only ones allowed to merge
| and solve conflicts for the entire company, even though they
| never ever worked on the code itself). I had been using git
| before that already for some time and couldn't fathom why
| anyone would not use that.
|
| They didn't believe my raving about git. Almost needless to
| say that they were always complaining about merge conflicts,
| about not being able to do X or Y or Z or that things were
| slow or error prone (like branches) or that they messed up a
| conflict resolution upon merging and needed help because they
| forgot to copy their entire source tree to a different
| location before doing so etc.
|
| I had zero problems like that because I simply used `git-
| svn`.
|
| Why is this significant? Because I hear so many people
| complain about how hard a "trunk based development workflow
| with rebases" is. Well guess what, that's basically what git-
| svn is. Nobody in their right mind used SVN any other way but
| with a single trunk, because branches sucked so much (except
| for release branches and yes, if you had to bring in
| something from `trunk` it wasn't just a simple cherry-pick).
| And because of SVN being what SVN was, before committing to
| SVN git-svn would simply rebase my work on the current SVN
| trunk automagically and then issue the commit to SVN. Any
| conflicts could be solved locally in git, because I had to
| commit anyway before it could do this rebase. In those two
| years I had literally one conflict and it was easy to solve,
| while everyone around me kept having problems. Heck even the
| feature branches were easy for me, because it's just a
| different 'trunk' to rebase onto and I actually _could_
| cherry pick for them :)
|
| I learned the one lesson I think everyone just has to keep in
| mind with git: Commit before you do anything else and you
| won't ever lose work (save bugs in git or you doing stupid
| low level stuff). You can always just reset back to your
| commit and redo the botched attempt to resolve conflicts and
| this time around it should be easier with the knowledge you
| gained in the first try. That's also why I never use git-
| stash to move stuff around. There's no commit to go back to
| if you accidentally did a `git stash pop` instead of `git
| stash apply`. Heck there was even this one guy who got some
| rebasing and squashing so wrong that he force pushed a
| completely different branch onto his branch. "Lost" all of
| his week's work. He was so so happy when he learned how to
| find the commit hash for the 'lost work' and that all
| branches in git are labels for those commit hashes and we
| magically brought it all back by reattaching this label to
| the right commit and force pushing again.
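|
| A sketch of that kind of rescue using the reflog, with invented
| branch and file names. The "botched" step is simulated with a hard
| reset; the branch's reflog still records where the label used to
| point.

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

echo base > work.txt; git add work.txt; git commit -q -m "Initial commit"

git checkout -q -b feature
echo "a week of work" >> work.txt; git commit -q -am "Week of work"

# Simulated accident: the branch label now points at the wrong commit.
git reset -q --hard main

# The reflog lists the label's previous positions, newest first.
git reflog show feature
lost=$(git rev-parse 'feature@{1}')

# Reattach the label to the commit that held the work.
git reset -q --hard "$lost"
git log -1 --format=%s

# On a shared remote this would be followed by a forced push, for
# example with `git push --force-with-lease` (not run here).
```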
| sharken wrote:
| As someone about to move from Mercurial to Git i absolutely
| agree, even though Git has its quirks, you can always find
| some help for a specific problem.
|
| Refactoring was mentioned and i think a good Git branching
| strategy is vital in that regard. If you have multiple
| branches and merge between them, then refactoring tends to
| not happen due to developers not wanting to have difficult
| merges.
|
| The obvious choice is Trunk-based Development, but it's
| almost a bigger transition than moving to Git.
| seba_dos1 wrote:
| Isn't the answer to "is this worth it" quite obvious when doing
| code reviews? Properly and logically split commits can make it
| so much easier and more pleasant, and things like "commit
| often, perfect later" are in my experience less, not more work.
|
| The rest seems like a pretty basic set of tips that I mostly
| learned over the years because I needed them, so it would
| definitely have been nice to read such an article several years
| ago to save me some time ;)
| danny_sf45 wrote:
| Do people actually do code reviews per commit? I know it's a
| thing, but in all my years of experience, I haven't met
| anyone who actually does this. The usual practice is to
| review all the changes at once (e.g., go to the "Files
| changed" tab of a PR in GitHub and start reviewing the
| changes). This, of course, means that PRs are "small". If a
| PR is too "big" then one politely asks the author to split
| the PR in many.
| alecbz wrote:
| > Do people actually do code reviews per commit?
|
| This is sort of a meaningless question without an
| understanding of how often people commit. Some people keep
| a single commit for a single PR (and just constantly update
| that one commit), others make tiny little commits for every
| individual change.
|
| The only real question is: what is the granularity you
| should review code changes at? The number of git commits
| that that maps to doesn't really matter.
|
| IMO: the ideal is to try to keep pull requests as small as
| possible while still having each PR be a coherent,
| justifiable change on its own. I don't know if it's
| realistic to treat that as a hard rule, but I think it's
| the right thing to aim for.
| angrais wrote:
| Yes, but the OP asked if people are reading specific
| commits in a PR.
|
| The answer for me at least is no. Because the history in a
| PR should not matter but only the final commit. That's
| why we choose to squash commits in my current role.
| sime2009 wrote:
| Not really. PRs have become the new commits. I only pay
| attention to the changed files tab in GH. I encourage other
| people to commit early and often and share their progress
| with the rest of the team.
| dkubb wrote:
| My current team does it, and it is so enjoyable.
|
| We practice atomic commits that change only one thing, and
| in one way. We separate Fix, Refactor, Change, Add, Move
| and Remove commits from each other. Our commit summaries
| begin with those keywords so we can tell what "type" the
| change is. Each commit must pass CI 100% on its own.
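|
| One way to check locally that every commit passes on its own is
| `git rebase --exec`, which runs a command after each replayed
| commit and stops at the first failure. In this sketch `check.sh`
| is a hypothetical stand-in for the real CI run, and `runs.log`
| only exists to count how often it ran.

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

# Stand-in for CI: passes when answer.txt contains a number.
cat > check.sh <<'EOF'
#!/bin/sh
grep -qE '^[0-9]+$' answer.txt
EOF
echo 1 > answer.txt
git add check.sh answer.txt
git commit -q -m "Add: answer file and check script"

git checkout -q -b feature
echo 2 > answer.txt; git commit -q -am "Change: bump answer to 2"
echo 3 > answer.txt; git commit -q -am "Change: bump answer to 3"

# Replay each commit of the branch and run the check after every one.
git rebase -q --exec 'sh check.sh && echo ok >> ../runs.log' main

# One log line per commit that passed.
wc -l < "$tmp/runs.log"
```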
|
| There are specific characteristics we've noticed. For
| example, for us, a "pure Refactor" commit either touches
| code or tests, but not both at the same time. If we touch
| too much we try to split it, or we call it a Change and it
| gets even more/different scrutiny.
|
| Reviewing by commit allows me to give my full focus to each
| without having to keep all state in my head at once. I can
| step through and understand the series of changes. It's
| like telling a story, once I've stepped through each commit
| I am better able to understand the whole picture. If I only
| look at the summary some of the finer points are missed.
|
| Also we've found we can extract commits more easily and get
| those merged early while we work on our main branches. If
| we see a Refactor, we can just do it, extract it and then
| keep our main PR focused.
| vlovich123 wrote:
| When I was at FB, there was no PR as defined at GitHub. Each
| independent commit was a separate review. So in that sense,
| all of the 10s of thousands of engineers at Facebook daily
| review "PRs" at the commit level.
| angrais wrote:
| In that case, were you then recommended to only make one
| commit? Are there guidelines for the average/ideal commit
| length? Did this add to extra workload in creating
| "perfect" commits? Likewise, did this mean that pushing
| "WIP" commits to remote was rare?
| Izkata wrote:
| For me, it's useful to peek into the thought process behind
| the commits if I notice something odd or interesting in the
| aggregate review. On occasion I end up with a totally
| different set of comments because I find they'd already
| tried something I was going to suggest.
|
| More important to me, the "logically split" commits are
| less useful when hunting down a bug months or years later.
| They explain the final intent, but not the path to get
| there, when the path to get there reveals when the bug was
| actually introduced, and following the thought process
| using those original commits reveals what the developer was
| intending instead of just what the end result was.
|
| For example, was the bug introduced in the
| refactoring/cleanup? This happens a lot, and an earlier
| state of the code reveals what it should be doing. Was it
| introduced during initial development? If so, is it just an
| edge case they didn't think about (because it should have
| been impossible and a recent change elsewhere made it
| possible), or is it a remnant of some earlier version of
| the feature? ..etc
|
| (That remnant-of-an-earlier-version one has actually
| happened to me and straight removing some code rather than
| trying to fix it was the correct way to handle it)
| jrockway wrote:
| It really depends on the timescale of your changes. Change a
| few lines in an afternoon? Nobody cares about the history. But
| if you ever have a long-running branch, you will care about
| your local history. You'll merge in the main branch, and it
| will conflict. Someone already renamed the thing that you're
| renaming in your branch. An auto-formatter changed its mind.
| The API changed. Some stuff was refactored. The difference
| between "hmm, annoying" and "I'll just delete this work and
| start over" after struggling for a week pretty much comes down
| to the history -- remembering what you did and why, and being
| able to apply pieces of that relative to the new state of the
| main branch.
|
| This is one of those insidious things that will only affect
| you. Nobody else has your working copy, and nobody cares what
| you do in it. But they will be doing their thing while you do
| yours, and to thrive in that environment, you can take
| advantage of tools.
|
| (As for keeping PRs/CLs focused, it is sure nice when you hit
| some weird bug 3 months later and can identify the candidate
| commit with a bisect. You don't NEED the history to debug
| something; just debug it. But it can sure help with the "why"
| and "how" and get you from debugging to fixing much faster.
| Bisecting is at the top of our debugging checklist -- do it no
| matter what once you have a reproduction. It saves so much time
| that it's not even worth thinking about.)
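|
| With a reproduction in hand, the bisect can be automated with
| `git bisect run`. The sketch below builds a toy history in which
| the "bug" appears in the sixth commit; the `grep` one-liner is a
| hypothetical reproduction script that exits 0 on good commits and
| non-zero on bad ones.

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

# Eight small commits; the state file turns "broken" from commit 6 on.
i=1
while [ "$i" -le 8 ]; do
  if [ "$i" -ge 6 ]; then
    echo "broken $i" > state.txt
  else
    echo "ok $i" > state.txt
  fi
  git add state.txt
  git commit -q -m "Change $i"
  i=$((i + 1))
done

good=$(git rev-list --max-parents=0 HEAD)   # the root commit is known good

# HEAD is bad, the root is good; let git drive the search.
git bisect start HEAD "$good" >/dev/null
git bisect run sh -c 'grep -q "^ok" state.txt' >/dev/null

# refs/bisect/bad now names the first bad commit.
first_bad=$(git rev-parse refs/bisect/bad)
git log -1 --format=%s "$first_bad"

git bisect reset >/dev/null 2>&1
```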
| jollybean wrote:
| "Change a few lines in an afternoon? Nobody cares about the
| history."
|
| If you changed a couple lines of code as part of a module
| refactor or something i.e. ongoing changes, then sure.
|
| But if you were fixing a bug or doing some very specific
| thing, those are actually the times where hyper-specific
| contextual commits are necessary.
|
| Git is this generic tool for which we still have not yet
| learned all the best practices and idiomatic usages.
|
| Probably there should be a handful of types of commits, not
| more than 3-5 i.e. 'change' 'bug fix' 'upgrade' 'internal
| release' 'major release' (these are probably bad examples).
| ChrisMarshallNY wrote:
| The old Perforce Mainline Model[0] prescribes regular merges
| from the mainline into your working branch, in order to
| reduce the hit, when you merge back down, into the mainline.
|
| Basic common sense, and it also applies to git. With git,
| those regular up-merges are a lot easier.
|
| Personally, I have been using git for years, and have never
| looked back at Perforce, but learning on more primitive VCSes
| taught me a disciplined, careful approach.
|
| I did have to "unlearn" a few things (mostly, relaxing, and
| having more faith in the tool), but the transition has been
| fairly smooth.
|
| [0] https://www.perforce.com/video-tutorials/vcs/mainline-
| model-...
| marton78 wrote:
| I agree with this advice in spirit, but I prefer rebasing
| regularly onto master, rather than merging. Especially in a
| long lived branch this keeps the list of changes small.
| Imagine that someone performed a huge refactor in master,
| which conflicts with some small thing you did in your
| branch. It doesn't help to have another commit of the huge
| refactor in your history: the merge commit. I very much
| prefer to see only the logical changes I did in my branch.
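|
| A small sketch of the difference, with invented file names: after
| rebasing, the branch contains only its own commit on top of the
| refactored main, with no merge commit in between.

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

echo base > core.txt; git add core.txt; git commit -q -m "Initial commit"

# A small change on a long-lived branch.
git checkout -q -b feature
echo "small thing" > mine.txt; git add mine.txt
git commit -q -m "Add small thing"

# Meanwhile, someone lands a big refactor on main.
git checkout -q main
echo refactored > core.txt; git commit -q -am "Huge refactor"

# Replay the branch's own commits on top of the new main.
git checkout -q feature
git rebase -q main

# Only the branch's logical change shows up relative to main.
git log --format=%s main..feature
```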
| ChrisMarshallNY wrote:
| That makes sense, as long as the work I do isn't that
| "huge refactor," for someone else.
| cjfd wrote:
| You talk about 'long running branch'. I would say that as
| soon as you have a 'long running branch' you have already
| lost. Good git practices or bad git practices don't really
| matter. You will suffer anyway. The only way to make it
| better is to find a way to do your changes while avoiding the
| long running branch. I have seen a long running branch that
| lived some five years or so. It was horrible. And it would
| have been possible, in this case, to avoid most of the
| problems using a feature switch instead.
| sethammons wrote:
| Five years?! Wow. I was thinking that long running was like
| a week or two.
| danielheath wrote:
| I find this sort of process useful, but not necessarily for the
| obvious reasons.
|
| Picking apart changesets after the fact helps me find issues
| that are likely to crop up in review, because I am looking at a
| patch.
|
| It also helps me separate contentious changes from obviously
| correct ones, which means the clearly good stuff isn't held up
| waiting for approval on the complex stuff.
| morpheos137 wrote:
| A human edit goes some way to limiting the set of
| controversial changes - it means there is a small human
| oversight that we can read through to determine.
|
| All those are valuable, and it's only now starting to get the
| sort of support that would allow me to say "No, none of those
| changes are contentious, and no, you can approve it.". In the
| absence of that, some changes are contentious enough that I
| can't approve it. We can and should argue about the edge
| cases (especially with some form of non-subjective metrics
| for controversiality).
|
| However, given we are looking to be useful at work, I think
| the anti-semantic review process is what will get us there.
|
| From a technical POV, the major issues would be to introduce
| syntactic analysis of the text, to support dictionary.
|
| I think that is probably fine, given that it won't happen
| anyway, and I suspect that the direction of that work will
| help with parsing and analysis.
|
| Of course, we could move past the scope of the syntax.
| Shacklz wrote:
| > I really want to put the 'enough is enough' point before
| worrying about a good looking commit history.
|
| I see where you're coming from, but I'd like to add a counter-
| argument to that. I'm currently working on a (mono-) repository
| with 40-something devs working on it, and we've recently
| switched from an "anything goes"-commit-history-approach to
| enforced linear history (while only a handful of people are
| allowed to directly commit on main-lines without a pull
| request).
|
| The main reasoning was this: It became almost impossible to
| understand why a build broke on the mainline by just looking at
| the commit-history itself. We always had to go through
| build-logs and such to get the picture of the how and why, and
| often even to get to the 'who', because the commit-history
| itself was just riddled with merge-commits. For the few devs
| who took care of that, this was a huge issue, while everyone
| else was just happily committing away.
|
| So going to the linear-history-approach made "analyze,
| understand" a breeze, but made "insertion" harder. We had to
| put in quite a bit of effort to get everyone up to speed
| (rebase, squash, reset, cherry-pick etc.) and set up some
| tooling for basic sanity-checks (pre-push-hooks etc.), but it
| was well-worth it, and a lot of devs were actually happy to be
| guided through this because for them it's clear that this will
| also be useful further down the road (in other jobs), not just
| for the current task at hand.
|
| And at last: It's really not that big of a deal. Just before
| opening a pull-request (or whatever your equivalent is), have a
| look through your change-set, run a bunch of commands if
| necessary, and done. Once you get the hang of it, it's pretty
| straight forward. It might not be worth it for you personally,
| but if you work on a repository with many other devs, there
| might be others who are grateful for that.
| [deleted]
| tharkun__ wrote:
| This, so much this. Same boat here (for long-ish values of
| "recent").
|
| I can only second all of what you've said. Rebasing and
| squashing really aren't that hard. If you ask me, selecting
| who you want to work with simply based on whether they can be
| taught to rebase and squash is a really good filter. If
| someone can't manage that, it is very very likely that you
| won't be happy to talk to them about small commits (easy to
| PR), good code hygiene and maintainable code, continuous
| deployments throughout the day (yes, OMG, you have to keep
| master green at all times, you have to follow up etc.) and a
| bunch of other things. That's fine by me, but I'd prefer not
| to work for the same small company as them or at least a few
| departments away in a larger one.
|
| All of these practices have so many advantages but many
| people don't or don't want to understand. You can generally
| teach this to people but it does need everyone to understand
| and pull in the same direction. You can't have a bunch of people
| in such a system who just never check the master build
| after merging.
| jheriko wrote:
| this is the opposite of my experience at any large workplace.
|
| you can look at merge commits on the history to achieve the
| same without losing power or data.
| Nullabillity wrote:
| And how does avoiding merge commits help you here? All you've
| done is throw away useful context.
|
| If you want the squashed history, just do a --left-only
| traversal.
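A toy sketch of the kind of mainline-only traversal described above,
assuming the merge commits are kept. The commit graph and helper names
are made up for illustration; in git itself the closest equivalent is
probably `git log --first-parent`, which follows only the first parent
of each merge.

```python
# Toy commit graph: commit id -> list of parent ids, first parent first.
# All ids are hypothetical.
HISTORY = {
    "A": [],
    "B": ["A"],
    "f1": ["B"],       # feature branch work
    "f2": ["f1"],
    "C": ["B"],        # mainline work done in parallel
    "M": ["C", "f2"],  # merge commit; its first parent is the mainline
    "D": ["M"],
}


def first_parent_log(graph, tip):
    """Walk from tip, following only the first parent of every commit."""
    log = []
    current = tip
    while current is not None:
        log.append(current)
        parents = graph[current]
        current = parents[0] if parents else None
    return log


def full_log(graph, tip):
    """Collect every commit reachable from tip, feature commits included."""
    seen = set()
    stack = [tip]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen
```

Here first_parent_log(HISTORY, "D") visits D, M, C, B, A and skips f1
and f2, which is the "squashed" view; full_log(HISTORY, "D") still
reaches both feature commits, so nothing was thrown away to get it.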
| kevin_b_er wrote:
| The way the information is organized can often be just as
| important. Good data keeping practices can be applied to the
| history of changes in the software code as well.
|
| The problem is we're still discussing (arguing) over what good
| practices are.
| colordrops wrote:
| Yes, it is worth it. Yes the learning curve was somewhat long,
| but once you get it, it becomes background. Working with SVN or
| perforce or CVS or VSS was such a chore and prevented
| experimentation or simultaneous work on the same code. I
| suspect it's not even a debate as to whether it's worth it
| among many of us.
| globular-toast wrote:
| Like anything, this becomes easier with practice. Once you get
| into the swing with a good workflow, it no longer requires
| thought. Make the investment once, reap the rewards later when
| you have to track down a regression that was introduced some
| time in the past year.
| alecbz wrote:
| You should only ever do things when they're worth it. Which
| sounds obvious, but a lot of people argue for "correctness" of
| things in the abstract without tying back to why that
| correctness matters, how much it's worth, and what the cost of
| getting it is.
|
| In the case of git, I think
|
| a) a good "published" code history can actually be pretty
| valuable
|
| b) git's UX is bad
|
| The fact that a tidy history is valuable should be totally
| independent from the fact that git's UI doesn't make it easy to
| do that.
| peakaboo wrote:
| Worth it for a few companies, but not for the majority. Most
| tests are also useless. Going for 100% test coverage is
| insanity. I think there are a lot of people in tech with
| autistic personalities who really hate when things are not
| perfect.
| kossTKR wrote:
| True, I can also get OCD'ey vibes, myself included. I think
| it comes with the mindset one fosters when looking at a screen
| for long. The same thing can happen when doing music or
| digital art: you can fall into a perfectionism loop, where
| you spend an exponential amount of time on smaller and
| smaller tasks that no one will ever care about.
|
| For lots of projects writing some quick end-to-end or story
| based tests is sufficient, maybe with some randomisation
| thrown in.
|
| I honestly think the current framework hell we're in has
| pushed so many hyper-complex best practices that we just
| waste way too much time on perfecting process over actually
| coding - tests being one of the paradigms that has gone
| overboard.
| phone8675309 wrote:
| A widely used, large framework having really good code
| coverage is important, especially if not everyone uses all
| of the framework.
| runawaybottle wrote:
| Couple that with ADHD meds, which induce a state of
| compulsiveness where one obsesses and marvels delusionally at
| their code writing ability.
| rendall wrote:
| Some great advice. Interesting the explicit references to IRC and
| email, which are (unfortunately? perhaps?) out of date in these
| days of github issues and Slack.
|
| The original article is 2012 with PRs back to 2016.
|
| My current gig is part of a microfrontend/microservices scheme
| where each team owns the entire vertical from concept and design
| through full stack, and there are, gee, 40+ services working in
| concert with more planned. All of these merge into a single web
| application. It all works shockingly better than might be
| expected, but it takes a culture of constant vigilance and care
| along with automated integration including lerna, yalc, renovate,
| artifactory, et al.
|
| For our team, the branch management scheme is 1 branch per JIRA
| ticket, then PR with automated checks including tests, lints,
| code review. The dev decides whether to merge, squash or rebase
| to master with some input from the lead. That master is just for
| the microservice, which may or may not need to be merged into
| another parent service.
| prionassembly wrote:
| > each team owns the entire vertical from concept and design
| through full stack, and there are, gee, 40+ services working in
| concert with more planned. All of these merge into a single web
| application.
|
| My startup is developing that way, although with more like 9
| services owned by 3 developers. (We have no bus factor). Is
| there anything interesting I can read about your methodologies
| -- I mean standard references, books, tutorials? (Lerna? Yalc?)
| rendall wrote:
| It's a really unusual setup, so I'd be surprised if there
| were a reference book on it. What makes it work is the
| engineering culture there, and not so much the tech stack
| recipe.
|
| The unusual features are that you're expected to prioritize
| helping out a fellow engineer if they ask for help, even from
| another team; and there is a culture of being blunt with
| criticism. You're expected to be straightforward. I've heard
| a junior engineer bluntly tell the team that we're wasting
| time in a meeting, and everyone agreed, apologized and got
| back on track. This is on top of the standard cultural best
| practices of no-blame, psychological safety, team autonomy,
| mandatory work-life balance, etc.
|
| The downside is a long onboarding process.
| heurisko wrote:
| I don't know if anyone felt the same way, but I felt I "knew" how
| to use git before I read tutorials about how to use git.
|
| Sometimes I felt tutorials were making it seem harder, more
| mystical, than it really was, or relied on "marble diagrams" with
| arrows pointing backwards, which I felt was unintuitive.
|
| I had used SVN quite a bit, and found you can use git in pretty
| much the same way, but branching was easier.
|
| And if you're using branches more, then "rebasing" your branch
| made sense, and the fact that you could "rewrite history", which
| I remember to some people seemed controversial, but the idea was
| it was fine, if you kept your branch private, or added caveats.
| Igelau wrote:
| I had a similar feeling of "finally, _this_ makes sense" after
| enduring ClearCase and SVN for a big chunk of my career.
| daitangio wrote:
| My simple strategy was to avoid fast-forward on merge and to use
| "git pull --rebase" when my changes are not critical. History
| rewriting? Hmm, I do not know... I do not like changing history.
| SergeAx wrote:
| I believe that in feature-branch workflow it is totally okay to
| rewrite history of the branch before merging it into master. The
| only case when it is obviously wrong is when several engineers
| are working on the same feature branch, which is against all
| methodologies known to me (except the duration of code review,
| when reviewers may pull the branch to run tests or use IDE on
| it).
|
| Having atomic semantic idiomatic changes in history beats short
| term branch immutability with one hand tied.
| lukicdarkoo wrote:
| In my workflow, I typically commit often and use the commits as
| personal checkpoints. Once a pull request is ready I simply
| squash the commits and merge. That way, the history in the main
| branch is clean and I have my checkpoints. I assume that is a
| typical workflow for many teams.
| juped wrote:
| Assuming you're referring to something like the github "squash
| and merge" button, your history is far from "clean". You make
| it significantly less useful by destroying information and
| creating megadiffs incorporating many different changes.
|
| Instead, do what you want with personal checkpoints, but
| refactor them into logical steps before publishing and merging
| (with a real merge) them.
| fuzzy2 wrote:
| > You make it significantly less useful by destroying
| information and creating megadiffs incorporating many
| different changes.
|
| You'll rarely review single commits anyway. However, I'd
| rather have a single "giga-commit" instead of dozens of
| commits that are not correctly divided plus dozens of
| "remarks from code review" commits because what's rebase.
|
| Many of my colleagues view using Git not as part of their
| core work but an inconvenient chore.
| juped wrote:
| > You'll rarely review single commits anyway.
|
| This is just because Github and its imitators are bad
| software - which isn't really git's fault. git and Linux
| practice only commit-level review.
|
| > Many of my colleagues view using Git not as part of their
| core work but an inconvenient chore.
|
| Many people don't care about version history, and ignorance
| of how git works (or adherence to superstitious rulesets)
| on the part of the people who _do_ care provides them cover
| for trashing the history.
|
| Many people don't care about code quality or
| maintainability. However, these people are more likely to
| be prevented from trashing the codebase itself than the
| ones who don't care about history are from trashing the
| history.
|
| Commit history is just as subject to review as the contents
| of diffs.
| fuzzy2 wrote:
| > git and Linux practice only commit-level review.
|
| I don't think that's an accurate way to put it. AFAIK you
| just send in patch files. They create a single commit,
| yes, but I see it as equivalent to a PR. The rules of
| what can be in a single patch could be stricter than
| typical PRs on other projects, dunno.
|
| > Commit history is just as subject to review as the
| contents of diffs.
|
| I wish. Maybe I'll work on a better team in the future.
| juped wrote:
| what I mean is, take any arbitrary patch thread off the
| front page of https://public-inbox.org/git/ and look at
| how people review them. they reply to the relevant
| commit. if you reply to the cover letter it's not a code
| review at all, but a more general comment on the whole
| branch (e.g., "do we want this feature" or something).
|
| you definitely don't send your branch as a single patch
| (unless it is small and really is best expressed as a
| single patch). if you did you would be asked to break it
| up.
| lukicdarkoo wrote:
| The squash and merge button is convenient, but not the only
| way. As you mentioned, sometimes I do squashing locally into
| multiple meaningful commits. It's important that my mess
| doesn't end up in the main branch. Also, another approach
| would be creating smaller pull requests.
| juped wrote:
| the size of your branch doesn't matter, if it really is
| logically separable into a lot of changes. what matters is
| usefulness to the future reader (you, or someone else).
| lowering the number of commits for its own sake makes
| things less useful, not more, by destroying useful
| information.
|
| you are correct to realize that e.g. "my working directory
| as of this timestamp when i got up from my computer" is not
| useful information, and should not be published. however, a
| diff that does too many things at once is also not useful
| information - you are forcing the reader to separate it
| themselves, possibly wrongly, every time they read the
| diff. very few branches in actual practical work are
| organically one commit long in their most useful
| expression.
|
| finally, the merge to upstream is useful information, as
| the only source of a high-level view of the progression
| towards a release, and if you (i'm only guessing because of
| the use of the "clean history" shibboleth) avoid actual
| merges (with merge commits) to upstream, you're destroying
| important information about the integration work of the
| project, making the history significantly messier (from the
| perspective of someone reading it, which is the only
| perspective that matters).
|
| (at the advanced level of writing-history-for-usefulness,
| you also realize that random-place-on-master branch
| starting points are not useful information, and learn where
| to start them from, but this is low-impact in comparison to
| not destroying branches. still, learning things like "base
| a bugfix on the commit that introduced the bug" are real
| force multipliers.)
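A toy sketch of why the starting point of a bugfix branch matters,
under the assumption that a merge brings in every commit reachable from
the source that the target lacks. The graph and the commit ids are
hypothetical.

```python
# Toy commit graph: commit id -> list of parent ids. All ids are made up.
GRAPH = {
    "A": [],
    "bug": ["A"],             # commit that introduced the bug
    "C": ["bug"],
    "D": ["C"],               # tip of master
    "R1": ["bug"],            # release branch that also contains the bug
    "fix_on_bug": ["bug"],    # fix based on the offending commit
    "fix_on_master": ["D"],   # same fix based on a random point on master
}


def ancestors(graph, commit):
    """Return the commit itself plus everything reachable from it."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen


def brought_in_by_merge(graph, source, target):
    """Commits that merging source into target would add to target."""
    return ancestors(graph, source) - ancestors(graph, target)
```

Merging fix_on_bug into the release branch R1 adds only the fix itself,
while merging fix_on_master also drags in C and D, which the release
may not want. The fix based on the offending commit merges just as
cleanly into master.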
| bcrosby95 wrote:
| > If you think about it, movies are made this way. Scenes are
| shot out of temporal order, multiple times, and different bits
| are picked from this camera and that camera. Without examining
| the analogy too closely, this is similar to how different git
| commits might be viewed. Once you have everything in the "can"
| (repository) you go back and in post-production, you edit and
| splice everything together to form individual cuts and scenes,
| sometimes perhaps even doing some digital editing of the
| resulting product.
|
| Except the end product is a movie, and my end product is
| software, not a commit history. If you want to draw an analogy to
| movies, then rebasing your history would be like movies throwing
| out raw camera footage because they have the end product. They
| keep _all_ of the _raw_ footage. They do _not_ throw it out by
| "rebasing".
| Dumblydorr wrote:
| I'm a beginner at Git. I coded in a hospital system where I was
| told not to use Git. Upgraded jobs, now I'm using it for the
| first time.
|
| Wow! What a difference. I have to say the ability to see what
| I've done over time, to not have endless files labeled with the
| date and my initials, not having to manually write down what I'm
| up to...this is heavenly.
|
| My main gripe would be the opaqueness of git. It really wasn't
| intuitive, I had to use it a few days before the vocabulary made
| sense, push, pull, commit, add, origin, they didn't logically
| click for me that fast.
| banana_giraffe wrote:
| I'm confused: Did you use a revision control system at all? It
| kinda sounds like you're describing the difference between RCS
| and no RCS.
| bobbylarrybobby wrote:
| What was the rationale for not using git?
| copirate wrote:
| > Personally, I commit early and often and then let the sausage
| making be seen by all except in the most formal of circumstances.
| [...] For a less formal usage [...] I let people see what really
| happened.
|
| > Whenever I pull, under most circumstances I git pull --rebase.
|
| These 2 statements are contradictory. By doing a "pull --rebase"
| you hide the (maybe important) fact that your commits were
| written in another context.
| juped wrote:
| This is still pretty superstitious, even if it's better than a
| lot of the stuff Github has unfortunately trained people to do.
| Maybe I'm naive, but I don't think git usage has to be based on
| superstition and fear. I have never not been able to clear up
| people's random-internet-post-induced / github-induced confusion
| in about half an hour of personal communication.
|
| Your goal in making a history is to make a meaningful, useful
| history that expresses information. I don't think this is hard. I
| think a lot of people don't want to do it, which isn't my problem
| (unless I have to work with you), but I also think a lot of
| people do want to do it and are stymied by posts like this, or by
| services like Github.
| jdowner wrote:
| I am beyond sick of hearing about 'best practices' for
| everything. It is such an obnoxious statement! So often it is
| used as replacement for 'in my opinion' because it brooks no
| argument -- these are the BEST practices!
| alecbz wrote:
| I can't stand when I ask someone "why?" and they answer "it's a
| best practice". Yeah, I'm asking _why_ you think this is a best
| practice.
| fuzzy2 wrote:
| I agree with most of TFA. However, I urge everyone to start with
| a monorepo. Splitting a single project (developed by one team)
| into multiple repos will seriously slow things down later. I've
| seen it happen.
| Hackbraten wrote:
| It gets really bad when two repositories depend on one another,
| and you forget to track which commits are meant to work
| together.
| lamontcg wrote:
| > Once you git push (or in theory someone pulls from your repo,
| but people who pull from a working repo often deserve what they
| get) your changes to the authoritative upstream repository or
| otherwise make the commits or tags publicly visible, you should
| ideally consider those commits etched in diamond for all
| eternity.
|
| I've broken this rule multiple times per day for the past 10
| years.
|
| On your own feature branches, rebase your fucking shit and force
| push. I see so many people creating ungodly messes because they
| never want to erase the history of PRs that they've submitted and
| it's just a nightmare of merge commits pulled back into their
| branch from master.
|
| I've watched a decade of git n00bs practice this "never under any
| circumstances rewrite history" advice and it fucks them up over
| and over and over again.
|
| Nobody actually cares about the exact commit process you went
| through to fix the bug. Squash everything and rebase. Leave a
| SUMMARY of why you did what you did in the PR and/or commit
| message. Humans have this amazing ability to write stories about
| what they did after the fact.
|
| I'll frequently leave my-future-self notes on closed and merged
| PRs as I think about them post-merge, where I document what
| approaches were rejected, and what approaches might be worthwhile
| if the change isn't sufficient. Stuff like "I could not do WWWW
| because of <great sadness>, so instead we must do XXXX, if this
| is not sufficient because <unlikely edge condition turns out to
| be not so unlikely after all> then we must consider that YYYYY
| will be less preferable due to <stuff i was thinking about hard>
| and we should consider doing ZZZZZ first". If all you do is
| capture "what did I change" and don't capture your thinking and
| what you view to be all the different alternatives while your
| mind is still fresh with the problem then you're just throwing
| away useful information, which is what the "preserve your git
| history" approach does.
|
| In the future, when I read my PRs I just don't care about how I
| got there. I care about what I was thinking about. So I write
| down, long-form, what I was thinking about. A git history is
| about going from A to B, it might document why you bailed on
| going to C, but it probably doesn't capture the stuff you
| rejected the whole time like D E and F and why. And really I'd
| PREFER to read a good note on why C sucked as a solution. I don't
| need to see the code that went down the route of C until it
| turned into a mess and then had to be backed out.
| dahart wrote:
| I interpreted the part you quoted to mean avoid rebase once you
| push into a public stream other people are using.
|
| Otherwise I totally agree with you; git was designed with
| rebase use in mind. There's a misleading meme about rebase
| being a "lie" that just can't die soon enough. It's done more
| damage than good. The problem is that it's specious - tempting,
| persuasive, and easy to believe, even if it's wrong and/or
| misguided - and the narrative of rebase being bad is supported
| and spread by respectable people like SQLite's author. What
| some people don't consider is that the story about rebase being
| harmful is frequently a sales pitch for a _different_ DVCS
| entirely - the message isn't to not rebase, it's to not use
| git.
|
| Don't rebase public/main branches (except in emergencies). Do
| rebase your local work before push. If using rebase in feature
| branches, use it (along with communication) in inverse
| proportion to how many other people are using it, because they
| have to force pull and so nobody stomps on anyone's work.
|
| The idea that the exact order of every character typed is
| sacrosanct and should be immutable is strange. But there is a
| valid point behind some of the rebase criticism, which is that
| git does not have the best facilities for controlling how
| history is presented, and if it did, rebase might not be needed
| to the same degree that it is now. Some DVCSs are designing
| ways to have a plumbing history, and a separate porcelain
| history, to use git terminology. That seems like a genuinely
| good idea, and maybe in the future git can incorporate
| something like it.
| xyzzy_plugh wrote:
| > On your own feature branches, rebase your fucking shit and
| force push. I see so many people creating ungodly messes
| because they never want to erase the history of PRs that
| they've submitted and it's just a nightmare of merge commits
| pulled back into their branch from master.
|
| Strong agree. This is probably my sole complaint about Git, in
| that changes to branches are not tracked through history in an
| accessible manner. Being able to see "revisions" of a branch
| would be very useful, rather than erasing history.
|
| Gerrit works around this with metadata in the commit message
| but it would be nice for this to be a first class citizen.
| Knowing when branches were modified and by whom would be very
| useful -- author and committer fields are insufficient.
| mattgreenrocks wrote:
| > [don't] use checkout in file mode
|
| What is the alternative? Sometimes I edit a file and don't need
| the changes after all.
| eikenberry wrote:
| Many of those misc "do"s/"don't"s read as more good beginner
| practices than as general lessons. It recommends against that
| one particularly because it can't be undone. Sort of like
| recommending a *nix newb to map `rm` to `rm -i`.
___________________________________________________________________
(page generated 2021-07-04 23:02 UTC)