hngopher.com

       [HN Gopher] The High-Risk Refactoring
       ___________________________________________________________________
        
       The High-Risk Refactoring
        
       Author : ben_s
       Score  : 33 points
       Date   : 2024-03-17 16:25 UTC (6 hours ago)
        
 (HTM) web link (webup.org)
 (TXT) w3m dump (webup.org)
        
       | simonw wrote:
       | A friend once told me that any time he takes on a big refactoring
       | project at work he works with the assumption that the project
       | could get cancelled at any moment, so his goal is to ensure that
       | if it DOES get cancelled he'll still be leaving the code in a
       | better place than it was when he started.
       | 
       | I think this is a really smart strategy.
        
         | bigkm wrote:
         | how does that assumption work? I would have thought that
         | assumption would lead one to care less.
        
           | Jtsummers wrote:
           | It works if simonw means the
           | refactoring/rearchitecting/redesign project and not the
           | broader encompassing project. In the former case, it means
           | making sure your changes are mergeable (ideally merged) all
           | along the way so that if that effort is canceled the broader
           | project can still benefit from it. In the latter case, then
           | you're correct it doesn't matter. If the encompassing project
           | is canceled who cares what state the change effort is in
           | since everything is getting tossed.
        
         | seadan83 wrote:
         | That strategy is something the heart of "agile development."
         | Ensure you can do a little bit and then walk away and work on
         | something else while still having delivered some value.
         | 
         | A crux though for "big refactoring project" - first, that is
         | kinda misnomer. It's not refactoring, it's re-architecture.
         | Second, a number of refactorings can make things worse before
         | they are better. For example, fixing bad abstractions,
         | sometimes the first step is to inline stuff and copy/paste the
         | abstractions to first "flatten the code" before restructuring
         | it. That re-flattening step can lead to worse code, but it's in
         | a place where it can be then made better (one step back to
         | enable three steps forward)
        
           | nradov wrote:
           | That should be done on a branch and not merged to the trunk
           | until there is a net improvement.
        
             | seadan83 wrote:
             | Maybe, though as a hard and fast rule I strongly disagree.
             | 
             | Delaying merge increases risk. It can also make the diff
             | impossible to review. Overhead can be increased by way of
             | merge conflict and re-doubling of efforts that would not be
             | necessary had the update been merged.
             | 
             | Other times, sometimes you want the one step to be launched
             | so you can ensure there is no regression. Sometimes the
             | next steps are going to take a long time and themselves are
             | multi-pronged and would be merged in multiple parallel
             | phases.
             | 
             | Sometimes you just want to get rid of the inappropriate
             | abstraction so that someone can complete task X where
             | inappopriate abstraction was hindering that. In other
             | times, "worse" is in the eye of the beholder, WET'ing the
             | code first can sometimes be the right thing to do, even if
             | it is never re-DRY'ied.
        
             | seadan83 wrote:
             | One more example, the phrase "consistently bad is better
             | than inconsistently good"
             | 
             | Sometimes you have 10 modules that all need the same update
             | to be better. Updating just one module, while that module
             | might be a lot better - breaks consistency and makes the
             | codebase harder to understand overall. It is worse, even
             | though that one module is better. Though, updating all 10
             | modules at the same time might be just impractical and
             | super risky. It can make sense to do that in phases and do
             | just one at a time. Sometimes the effort for the one module
             | will take a few weeks. Holding all that up for most of a
             | year does not make a lot of sense.
             | 
             | At the same time, trying to do it all across the whole
             | code-base all at once, at some point that is no longer even
             | refactoring:
             | 
             | "Refactoring isn't a special task that would show up in a
             | project plan. Done well, it's a regular part of programming
             | activity. When I need to add a new feature to a codebase, I
             | look at the existing code and consider whether it's
             | structured in such a way to make the new change
             | straightforward. If it isn't, then I refactor the existing
             | code to make this new addition easy. By refactoring first
             | in this way, I usually find it's faster than if I hadn't
             | carried out the refactoring first."
             | (https://refactoring.com/)
             | 
             | Of note, sometimes that initial refactoring _should_ land
             | as its own change first. Though, that is a philosophical
             | disagreement that is okay to have. Some reviewers want you
             | to fix up existing problems in code you touched, and they
             | want that in the same diff as the bug fix and /or feature
             | update. Other reviewers want those updates to be completely
             | separate and done one-by-one.
        
               | mcsimmons wrote:
               | I like this take a lot.
               | 
               | At my work we often do a lot of "pre-commits" for
               | projects that require any significant amount of work. A
               | pre-commit is usually pure moving/repackaging of code or
               | small re-writes to how we do things now to make the PR
               | (or PRs) for the actual feature the most straightforward
               | and pure business-logic-only changes that we can. So, I
               | usually tell people we rarely refactor, but maybe we do
               | kinda what you're recommending here.
        
               | loloquwowndueo wrote:
               | "Make the change easy, then make the easy change"
        
           | marcosdumay wrote:
           | > It's not refactoring, it's re-architecture.
           | 
           | If you change the code without changing the functionality,
           | it's a refactoring.
           | 
           | If you change the main organization of the code, it's re-
           | architecturing.
           | 
           | It being one of those has no impact on whether it's also the
           | other.
        
             | chr1ss_code wrote:
             | 100% my thought ;p
        
         | dheera wrote:
         | > works with the assumption that the project could get
         | cancelled at any moment
         | 
         | If I have that feeling, I have very, very little incentive to
         | contribute to the project until I get better clarity on whether
         | leadership wants to continue supporting the project and see it
         | to launch.
         | 
         | I like working on tech because I want to see it launched or at
         | LEAST open-sourced if not launched, not die in a NDA graveyard.
        
         | latenightcoding wrote:
         | Which makes cancelling the project an easier decision. In the
         | beginning of my career I was only accepting short term
         | projects, so I learned to always leave the code in a state
         | where anybody can pick it up. I still code in that way, but I
         | know it makes me more replaceable.
        
           | jrajav wrote:
           | If anyone at any company judges an engineer who crafts code
           | that is easy for other engineers on their team to immediately
           | pick up and start work on as more replaceable, that person is
           | hopelessly incompetent. That would normally be a basic skill
           | every engineer is expected to pick up before getting too far
           | in their career.
           | 
           | If anything, an engineer with a clear tendency to try to
           | write themselves job security is the one that should be
           | replaced ASAP.
        
         | 0xferruccio wrote:
         | I love this thanks for sharing this
        
       | sinuhe69 wrote:
       | For me, refactoring means simple, no-risk code re-organization
       | only. It should be executed often during the production and it
       | should serve foremost code-reuse and better readability. Anything
       | going further than that is for me rewriting and should be
       | considered appropriately.
        
         | ck45 wrote:
         | You are absolutely right, see the definition on
         | https://refactoring.com/ Unfortunately, it has become common to
         | refer to any kind of code changes as refactorings (not backed
         | by any proof, just an observation)
        
         | begueradj wrote:
         | Indeed. Re-writing is often dangerous:
         | https://www.joelonsoftware.com/2000/04/06/things-you-should-...
        
       | IshKebab wrote:
       | I have yet to work anywhere where people refactor anywhere near
       | as much as they _should_ , so I think this is really sending the
       | wrong message. People massively overestimate the risks of
       | refactoring and massively underestimate the risks of _not_
       | refactoring - which many people incorrectly assume don 't exist.
        
       | mjr00 wrote:
       | Refactoring and related maintenance activities, like
       | library/framework upgrades, are high risk and low immediate
       | reward. In my experience, they're also one of the hardest overall
       | software development tasks; large-scale refactors will end up
       | touching a large percentage of existing code, and will require a
       | ton of reading/understanding existing code to figure out what it
       | does, as well as the domain knowledge to know what it _should_
       | do. Often during refactoring I 've found hairy bits of code which
       | seem like they'll be intractable, until I stepped up a level and
       | realized it's for a half-implemented feature that never got
       | released, and can be completely removed.
       | 
       | If you try to measure it on an incremental level, refactoring is
       | never worth it. Your code changes can and will cause unexpected
       | bugs; customers get no perceivable value; it's engineering time
       | not spent on revenue-generating features, and per the previous
       | paragraph, you should be getting your _best_ engineers doing
       | large refactors. But if you look at it across a longer horizon,
       | it makes a massive difference in the maintainability of your
       | codebase. It 's like cutting out 300 calories/day from your diet;
       | you won't notice a difference if you weigh yourself on two
       | consecutive days, but if you do it for 6 months it has a
       | noticeable impact.
        
         | BurningFrog wrote:
         | The way to make refactoring safe is to have a solid test suite.
         | 
         | With a good enough test suite, if all tests still pass, you can
         | be pretty confident.
         | 
         | To me, making refactoring easier/safer is probably the biggest
         | benefit of having tests!
        
       | keybored wrote:
       | What I did last time for a moderate refactoring was to do
       | everything step by step in the commits. Intermediate broken state
       | was fine. Them purposefully commit in such a way that the tools
       | could verify that I did the intended change. Like line moves:
       | verify with `git diff --color-moved`. Then I told the reviewers
       | about my breadcrumbs. Then finally for the merge I could rewrite
       | the history.
       | 
       | Ideally I want to not just post a 500-line diff as a pull
       | request. Ideally I want to really write a program/script which
       | does as much of the changes as I did (intermediate commits for
       | the changes that need to be done manually). Refactorings are
       | often done through and IDE so _hopefully_ the IDE can output a
       | script of the refactoring steps. Then the reviewers don't even
       | have to look at the diff (really): they can look at the script,
       | see if it is reasonable, then run it themselves and verify that
       | it produces the same output (the same tree).
       | 
       | Coccinelle is a tool for C (and C++?) which lets you write
       | "semantic patches" like "replace these boolean expressions with
       | this one". Maybe replace some considered-bad C standard library
       | calls with what you consider to be better. Then you can leave
       | those patches in and check that people don't check in new code
       | with the old pattern.
       | 
       | All of this is just for refactoring though. Once you start
       | talking about _rewriting_ I feel like you have moved beyond that
       | point.
        
       | jimbokun wrote:
       | One of the subtler but quite dangerous risks of refactoring, is
       | making the output more "correct" but different from what it was.
       | 
       | Clients of the application likely have adapted to the less
       | correct output in ways that will break in unanticipated ways if
       | it changes. I've seen this lead to production issues, and the
       | explanation that the new behavior was more correct did not go
       | over well.
        
       ___________________________________________________________________
       (page generated 2024-03-17 23:00 UTC)