hngopher.com

       [HN Gopher] I don't buy "duplication is cheaper than the wrong a...
       ___________________________________________________________________
        
       I don't buy "duplication is cheaper than the wrong abstraction"
       (2021)
        
       Author : Akronymus
       Score  : 127 points
       Date   : 2023-08-29 12:31 UTC (10 hours ago)
        
 (HTM) web link (www.codewithjason.com)
 (TXT) w3m dump (www.codewithjason.com)
        
       | gpderetta wrote:
       | I think it is worth distinguishing proper opaque abstractions,
       | that are defined by a contract, from convenience macro-like
       | "abstractions" that are defined by their implementation.
       | 
       | The former are for abstracting different implementations behind
       | an interface and/or decoupling, and require thought, planning,
       | and careful consideration for their evolution.
       | 
       | The latter are purely for convenience, to save some typing, some
       | mental overhead when understanding code (although they can
       | increase it just as well) and to centralize minor bug fixes or
       | common features. For long term evolution and divergence, these
       | abstractions should simply be macro-expanded instead of trying to
       | refit them for the new requirements.
       | 
       | Of course reality is a continuum.
        
       | vendiddy wrote:
       | Does reducing duplication make your code easier to maintain? Or
       | does it not? Make your decision accordingly.
       | 
       | Start treating duplication as a means to an end. It's
       | not an end in itself.
        
       | groby_b wrote:
       | Duplication is neither "cheaper than the wrong abstraction" nor
       | is it "one of the most dangerous mistakes in coding".
       | 
       | There's a cost to abstraction. There's a cost to duplication. Our
       | job, as engineers, is to stop applying blanket statements and
       | instead reason about the tradeoffs. And no, they aren't static
       | tradeoffs either, because requirements and constraints don't stay
       | static.
        
       | waffletower wrote:
       | I think the answer here can be different depending upon the
       | ecosystem. I confidently believe that abstraction is better
       | instrumented and practiced in functional programming languages
       | than those of the still-dominant object-oriented paradigm.
       | Awkward abstractions are much easier to grow and stumble upon
       | when the basic unit (an object) encourages private, greedy,
       | encapsulation of data and method implementations. In functional
       | languages, living up to DRY (don't repeat yourself) is a much
       | more immediate and clear proposition.
        
       | lvncelot wrote:
       | > Except in very minor cases, duplication is virtually always
       | worth fixing.
       | 
       | I disagree with the severity of this, and would posit that there
       | are duplications that can't be "fixed" by an abstraction.
       | 
       | There are many instances I've encountered where two pieces of
       | code coincided to look similar _at a certain point in time_. As
       | the codebase evolved, so did the two pieces of code, their usage
       | and their dependencies, until the similarity was almost gone. An
       | early abstraction that would've grouped those coincidentally
       | similar pieces of code would then have to stretch to cover both
       | evolutions.
       | 
       | A "wrong abstraction" in that case isn't an ill-fitting
       | abstraction where a better one was available, it's any (even the
       | best possible) abstraction in a situation that _has no_ fitting
       | generalization, at all.
        
         | MilStdJunkie wrote:
         | I might have to say some unkind things here, but statements
         | like:
         | 
         | > instead of "duplication is cheaper than the wrong
         | abstraction", I would say "duplication is cheaper than
         | confusing code littered with conditional logic".
         | 
         | seems like it's looking at this problem from an extremely
         | narrow context.
         | 
         | The truth is that the phrase "wrong abstraction" is (more or
         | less) unquantifiable, which makes the original phrase, as
         | employed, sort of like a koan. It addresses the very human
         | tendency to see patterns in noise, and our ability to
         | "transmit" such hallucinations to other humans via natural
         | language and other means.
         | 
         | The closest I can get to - given my at-best-apprentice status
         | as a formal programmer - is the quantitative test I developed
         | for CCS (conditional content systems), where the abstraction
         | lies in the SNS[1], and the de-duplication mechanism is
         | applicability[2]. Since each applicability statement carries
         | its own overhead, there's a limit on how much "abstraction" the
         | model can take before it's using quantitatively more keystrokes
         | than duplication.
         | 
         | The test goes like this: take the flat text procedures for ALL
         | the configurations, and add it together. Now, take the
         | conditionalized, applicability-laden procedure that unifies the
         | procedure, and measure its file size. If the latter is LARGER
         | than the former, then you're using the wrong SNS/applicability
         | model for rolling up this content.
         | 
         | Thing is, this is _inevitable_ if you throw enough dissimilar
         | configurations at a CCS, because each configuration has its own
         | overhead, and eventually that outpaces the content itself.
         | 
         | You can address this in a bunch of ways - like adding a
         | containing pseudo-product that has all the configurations
         | inside of it - but the actual real Product Management might not
         | let you build on the applicability like that, because the
         | Product itself isn't sold that way. Any other abstraction isn't
         | available to you, because in the end this is natural language,
         | which - unlike structured language - resists first order
         | abstractions _really well_. This is one of those instances
         | where, yes, the abstraction of the SNS/Applicability is
         | _worse_ - quantifiably - than duplication. All that complexity
         | would be better handled via version control fork/branch
         | relationships - _far outside_ of the realm of natural language.
         | 
         | [1] standard numbering system, a sort of numeric designator of
         | functional systems, the primary way that content is designated
         | as semi-independent modules.
         | 
         | [2] conditional "chunks" that turn on and off depending on the
         | applicability statement
        
           | feoren wrote:
           | It'd be wonderful if we could measure the utility of software
           | engineering choices by counting keystrokes or measuring file
           | sizes or putting them in a turbo encabulator and seeing which
           | one has more modial interaction with its magneto-reluctance.
           | Unfortunately, reality is just too complicated, with far too
           | many tradeoffs to be balanced. I'd recommend deep thought and
           | discussion about the domain over looking at a graph of your
           | codebase's sinusoidal repleneration.
           | 
           | > All that complexity would be better handled via version
           | control fork/branch relationships
           | 
           | Probably not.
        
             | MilStdJunkie wrote:
             | Holy smokes, my turbo-sarcasmo detector just broke! But
             | yeah, that's more or less the TLDR of my point. The phrase
             | "wrong abstraction" does some heavy lifting, but it's not a
             | bad concept, even if largely a qualitative one. No one
             | should use a single metric to toss ginormous architecture
             | decisions - they're tools to inform educated judgement, not
             | replace it.
             | 
             | Re: fork/branch shenanigans, no, you're right, that's not
             | an optimal way to handle variance . . in a normal
             | programming language. In the context of natural language,
             | it's not the same kettle of fish, because, well, lots of
             | reasons, probably the most prominent being the "messy
             | unidirectionality" of NL that's all mish mished with its
             | extremely complex grammar vs constructed languages.
             | Chopping up giant documents into tiny pieces a la CCS[1]
             | systems has made this a stew of problems, but for some
             | reason Leadership is fond of the idea. It's not unlikely
             | that specialized on-prem LLMs are going to nuke the CCS
             | concept from orbit in the next five years, except for those
             | cases where the CCS is a contractual requirement for doing
             | the work.
             | 
             | [1] component content systems
        
         | rdedev wrote:
         | You got a good point about code evolution. Has anyone taken a
         | look at it from a biological perspective? Seems like such
         | problems can occur in genetics and nature might have come up
         | with some tricks we can use
        
         | ilyt wrote:
         | > There are many instances I've encountered where two pieces of
         | code coincided to look similar at a certain point in time. As
         | the codebase evolved, so did the two pieces of code, their
         | usage and their dependencies, until the similarity was almost
         | gone. An early abstraction that would've grouped those
         | coincidentally similar pieces of code would then have to
         | stretch to cover both evolutions.
         | 
         | Then you split that abstraction again. It's very cheap and very
         | quick.
         | 
         | Many people talk about the issue like it was an absolute in the
         | code, but that's the wrong approach. If you end up writing 4
         | functions that are the same, by all means, merge them into one.
         | 
         | If then you need to add a parameter only this code path uses
         | and the rest doesn't care about, by all means split it back. Moving
         | blocks of code around is cheap.
        
           | oxfordmale wrote:
           | Splitting the abstraction is never cheap and quick, mostly
           | because of politics. With duplicated code you often can
           | assign a single responsible owner to each duplication.
           | 
           | However, once abstracted, the code may suddenly be used by a
           | number of different teams. You will need to get this work on
           | their roadmap, increasing the friction to get this done. In
           | many companies, this will also end up in endless discussions
           | about the new approach.
        
             | yellowapple wrote:
             | Solution there would be to make the abstraction "opt-in",
             | such that a team can elect to duplicate or abstract as
             | desired. Also helps if the "main" abstraction is itself
             | composed from smaller abstractions, from which downstream
             | teams could then pick-and-choose rather than having to
             | either fully abstract or fully duplicate.
        
             | esafak wrote:
             | This is a good point. Following Conway's Law, a team may
             | choose to duplicate code or do things theoretically
             | sub-optimally simply to avoid having to deal with other
             | teams.
        
           | jrumbut wrote:
           | I think the key here is the oft repeated but often poorly
           | understood maxim to favor composition ("has a") over
           | inheritance ("is a").
           | 
           | If you have a mixin (or other means of composition) that you
           | use in several places and one diverges, it's easy to remove
           | it. If you use inheritance, it's going to be more painful.
           | 
           | A language that offers OOP via prototypes instead of classes
           | like JS can (sometimes) give you the best of both worlds, but
           | it will confuse a lot of devs who aren't familiar with that
           | kind of OO design.
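
A minimal Python sketch of the composition point above, under stated assumptions: `Invoice`, `Receipt`, and the two formatter functions are hypothetical names invented for illustration, not anything from the thread. The shared behaviour is held as a component ("has a"), so a user that diverges swaps its own component and nothing else has to change.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical shared behaviour, passed in as a component rather than inherited.
def format_plain(amount: float) -> str:
    return f"{amount:.2f}"


# Hypothetical divergent behaviour needed by only one user.
def format_with_currency(amount: float) -> str:
    return f"EUR {amount:.2f}"


@dataclass
class Invoice:
    total: float
    formatter: Callable[[float], str] = format_plain  # "has a" formatter

    def render(self) -> str:
        return f"Invoice total: {self.formatter(self.total)}"


@dataclass
class Receipt:
    total: float
    formatter: Callable[[float], str] = format_plain  # same component, for now

    def render(self) -> str:
        return f"Receipt total: {self.formatter(self.total)}"


# Receipt diverges: it swaps its own component; Invoice is untouched.
invoice = Invoice(10.0)
receipt = Receipt(5.0, formatter=format_with_currency)
```

With a shared base class instead, the same divergence would likely mean overriding or restructuring the hierarchy, which is the more painful path the comment describes.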
        
         | BWStearns wrote:
         | Agreed. Abstractions also tend to be more resistant to change,
         | both from a technical level, and a social level.
         | 
         | At a technical level an abstraction will have more call sites
         | to worry about in different contexts, the more wrong the
         | initial abstraction the harder it will be to change.
         | 
         | The social level is maybe even more problematic. Abstractions
         | seem more important than calling code and will experience more
         | friction in code review. This change friction can also increase
         | with the "wrongness" of the initial abstraction. The starting
         | point makes less sense so a reviewer needs to work more to
         | understand the context. If the abstraction is gnarly enough
         | then it's possible that the reason for the abstraction is
         | almost obscured. Even someone who knows _how_ it works might
         | have lost the forest through the trees and push back on changes
         | that simplify it or improve it if the change is a sufficiently
         | large departure from the initial state. In this case you can
         | often see small incremental changes get added easier but this
         | just makes the shared code a bit gnarlier for next time.
        
           | tetha wrote:
           | > At a technical level an abstraction will have more call
           | sites to worry about in different contexts, the more wrong
           | the initial abstraction the harder it will be to change.
           | 
           | As I recently called it, infrastructure and systems lose
           | agility as they gain dependency and move down the stack.
           | 
           | If you have like 1 customer and they have good retries,
           | honestly: fuck everything. Deploy master, in fact, deploy
           | every keystroke to prod. It'll be fine.
           | 
           | At the same time, about 30k - 40k FTEs of our B2B customers
           | depend on one of my Postgres instances during business hours
           | and about twice of that during different holiday seasons.
           | Honestly? Nothing touches the system-level settings of these
           | database systems unless we have pondered a change for 2
           | weeks. And even then we will schedule an approved change over
           | 4 weeks across applicable postgres clusters. The carnage a
           | bad change at this level can cause is ridiculous enough to
           | not be.
        
           | ljm wrote:
           | This is my beef with naively applied DDD, separation of
           | concerns, and design patterns.
           | 
           | Usually what happens is the 'clean' code ideal comes first,
           | and then the implementation is squeezed into it. This then
           | informs the organisation (or architecture) of the rest of the
           | codebase and your software design has become a matter of
           | putting pegs into the right-shaped holes.
           | 
           | I have _never_ found that kind of highly abstracted code
           | easier to work with than some simple procedural alternative
           | that is easy to delete and easy to refactor, so long as
           | effort was put into writing it well.
           | 
           | Of course, the patterns have a purpose and do help when used
           | nicely - a lot of code you write will fall into some of those
           | patterns even without you explicitly mentioning it. It's
           | just...doing it for the sake of it is a problem.
        
         | yellowapple wrote:
         | > An early abstraction that would've grouped those
         | coincidentally similar pieces of code would then have to
         | stretch to cover both evolutions.
         | 
         | In that case, my takeaway would be that it ain't the
         | abstraction itself that's wrong, but the unwillingness to get
         | rid of it (or decompose it) when it no longer serves its
         | purpose.
        
         | bcrosby95 wrote:
         | Given a long enough timeline, every abstraction turns wrong.
         | 
         | The answer isn't to not abstract, the answer is to tear it out
         | when it turns wrong. That was actually the original point of
         | the popular article that streamlined this view - that we
         | shouldn't be afraid of tearing them out, not that we shouldn't
         | make them in the first place. Most people just read headlines
         | though.
        
           | lolinder wrote:
           | The resistance to tearing out a bad abstraction isn't just
           | cultural: combining two different functions into one is a
           | lossy operation, which makes splitting an abstraction harder
           | than creating it in the first place.
           | 
           | While the functions are distinct the call sites are self-
           | documenting. You know which calls are for which purpose
           | because the names are different. After combining them to
           | deduplicate the code, you've lost that information, and to
           | disentangle the abstraction now requires you to infer and
           | reintroduce that lost information.
           | 
           | It's not that it can't be done, but there is real friction
           | that doesn't just exist in people's heads.
        
             | abathur wrote:
             | I think the difficulty of making the right decisions
             | without this lost information is well-observed.
             | 
             | I wrote a short post in roughly this idea space last year: 
             | https://t-ravis.com/post/doc/what_functions_and_why_functio
             | n...
             | 
             | It feels like the same thread you're describing, but I
             | guess it's pulling on the other end of it. It's thinking
             | about how to name things in a way that makes it easier to
             | see that the implementations might diverge later, and
             | simplify actually doing so (by preserving more of this
             | intentional context).
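
One possible way to keep the call-site information discussed in the two comments above, sketched in Python. All names here (`_normalize`, `normalize_username`, `normalize_tag`) are hypothetical and chosen only for illustration. The body is shared, but each purpose keeps its own named entry point, so a later split does not require inferring which caller meant what.

```python
def _normalize(raw: str) -> str:
    """Shared implementation (hypothetical): trim whitespace and lowercase."""
    return raw.strip().lower()


# Thin, intention-revealing wrappers. Call sites still say *why* they
# normalize, even though the implementation is currently identical.
def normalize_username(raw: str) -> str:
    return _normalize(raw)


def normalize_tag(raw: str) -> str:
    return _normalize(raw)


# If tags later need different rules, only normalize_tag changes;
# no call site has to be re-examined to recover its intent.
user = normalize_username("  Alice ")
tag = normalize_tag(" Python ")
```

Collapsing both wrappers into a single public `normalize` would remove a few lines, at the cost of exactly the information loss described above.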
        
         | mostlylurks wrote:
         | > An early abstraction that would've grouped those
         | coincidentally similar pieces of code would then have to
         | stretch to cover both evolutions.
         | 
         | This seems to be the underlying assumption behind most uses of
         | the "duplication is cheaper than the wrong abstraction" quote,
         | but the assumption is simply incorrect. You should almost never
         | try to expand abstractions in this manner. If you don't treat
         | the abstractions relating to the thing you want to change in
         | your codebase as "the" place where you need to make your
         | change, and instead eagerly make new abstractions and throw old
         | ones away as required, you won't really run into this problem.
         | 
         | In fact, this predominant mindset where creating abstractions
         | is strongly discouraged leads to the very problem that mindset
         | is based on, as it will simply encourage junior developers and
         | the like to modify the existing abstraction, creating the
         | aforementioned kind of mess where abstractions become
         | complicated through repeated modification, instead of creating
         | new abstractions when appropriate, because creating
         | abstractions has a stigma attached to it.
         | 
         | Additionally, if someone has made a "wrong" abstraction based
         | on something silly like two pieces of code simply being similar
         | in terms of their structure and those use cases start to drift
         | apart, you should feel eager to simply split apart the
         | abstraction, be it into bare implementations or two new
         | abstractions, or any other combination. Abstractions are cheap
         | as long as you don't give them special significance.
        
           | jameshart wrote:
           | When an abstraction evolves to a point where it needs to be
           | split into two separate implementations to meet diverging
           | needs...
           | 
           |  _you will need to replace that abstraction with
           | duplication_.
           | 
           | Which is the right thing to do because that duplication is
           | cheaper than maintaining the wrong abstraction.
           | 
           | I think this post makes the mistake of thinking that the only
           | way in which duplication comes up is that it is discovered in
           | the codebase, and we have the choice of abstracting it away
           | or keeping it.
           | 
           | On the contrary, duplication can - and should - be
           | consciously introduced to fix bad abstractions when we find
           | _them_ in the codebase.
        
             | cratermoon wrote:
             | > When an abstraction evolves to a point where it needs to
             | be split into two separate implementations to meet
             | diverging needs... _you will need to replace that
             | abstraction with duplication._
             | 
             | Hard disagree. When the formerly common parts of an
             | abstraction evolve to no longer be common, then that
             | duplication no longer exists. There now exists two
             | abstractions, one for each of the diverging needs. There
             | may be some leftover commonality that can be abstracted
             | out, but it's no longer the original abstraction.
        
               | hannasanarion wrote:
               | The point is that they were never actually common in the
               | first place, only superficially similar.
               | 
               | You're saying we should look for duplications, abstract
               | them, and then every time a change needs to be made to
               | the abstraction to suit only one of the use cases,
               | refactor the codebase to de-abstract and re-duplicate,
               | undoing the work we did in the name of DRY in the first
               | place.
               | 
               | That is a lot more work and a lot more confusion and a
               | lot more headache for maintainers and reviewers than
               | copy-pasting the thing the first time, having realized
               | that the duplication was incidental, not structural.
               | 
               | Let's take this line of reasoning to its extreme:
               | 
               | I notice that there's a section of my code that's
               | repeated twice where we add one to a value, so I abstract
               | it into a function called add1(x:int). Some time later,
               | at places where add1 is used we sometimes need to
               | actually add a value other than one, so we need to make a
               | decision: do we refactor everything and re-duplicate, or
               | do we stick to the DRY principle and make our abstraction
               | more accommodating? The path of least resistance is to
               | stick to DRY because it's a smaller and more
               | comprehensible commit, so we add an optional arg,
               | add1(x: int, operand?: int). Some time later one of the callers
               | to this function needs to pass a vector instead of a
               | single value, so we need our add1 function to have
               | polymorphism and conditional logic in it now, and
               | potentially more arguments. Sooner or later we have a
               | frankenfunction that's hundreds of lines long and
               | branches a bazillion ways and might as well be a turing
               | machine in itself.
               | 
               | Dogmatic adherence to DRY leads to madness.
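
The progression in the comment above can be sketched in Python. This is an illustration under assumptions, not the commenter's code: the comment only gives the signatures `add1(x:int)` and `add1(x: int, operand?: int)`, and the function bodies, the `add1_v1` name, and the list/tuple handling are invented here.

```python
from typing import Optional, Sequence, Union

Number = Union[int, float]


# Stage 1: the original extraction, a name for "add one".
def add1_v1(x: int) -> int:
    return x + 1


# Stage 3: the same function after an optional operand and then
# vector support were bolted on to keep every caller on one abstraction.
def add1(x: Union[Number, Sequence[Number]], operand: Optional[Number] = None):
    step = 1 if operand is None else operand
    if isinstance(x, (list, tuple)):
        # Vector callers get a list back; scalar callers get a scalar.
        return [value + step for value in x]
    return x + step


# The "duplicated" alternative: each call site states what it does.
count = 41 + 1
offsets = [value + 10 for value in [1, 2, 3]]
```

Even at this small size the name `add1` no longer describes the behaviour, and the return type depends on the argument type; each further caller-specific requirement adds another branch.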
        
               | cratermoon wrote:
               | > You're saying ... refactor the codebase to de-abstract
               | and re-duplicate, undoing the work we did in the name of
               | DRY in the first place.
               | 
               | That's the exact opposite of what I'm advocating for, but
               | perhaps I didn't express myself well.
               | 
               | > Sooner or later we have a frankenfunction that's
               | hundreds of lines long and branches a bazillion ways and
               | might as well be a turing machine in itself.
               | 
               | Yeah, that's not a good abstraction, and not at all what
               | I meant.
        
               | seadan83 wrote:
               | To some extent I agree, though I don't think DRY means to
               | remove all similar looking lines of code and put that
               | behind a procedure. Generic code vs abstractions are
               | different.
               | 
               | Instead, any given task (which already is an abstraction)
               | should exist in only one place. That is DRY, I would
               | paraphrase it to mean any given abstraction should be
               | done in one place (and combine with SRP to say further
               | that one place should only do that one abstraction)
               | 
               | If one place can be updated independently of another, it
               | argues it is not the same task to begin with. DRY'ing
               | that code is a misnomer IMHO, instead that code is being
               | put behind a procedure and is being made generic (and not
               | necessarily more abstract. Abstracting hides details,
               | putting a block of code behind a procedure with full
               | parameterization is not hiding details, it's just a
               | procedure [and let us hark back to the days of procedural
               | programming and ways that can become mess])
               | 
               | DRY and SRP (single responsibility principle, AKA the DnD
               | principle) need to be considered together.
        
               | ajuc wrote:
               | In many cases you cannot see the correct abstraction
               | without introducing the duplication back. When working
               | with particularly messy code I often do sort of
               | https://en.wikipedia.org/wiki/Karnaugh_map of important
               | variable states to see what actually happens before I can
               | refactor it.
               | 
               | This is basically introducing the duplication back.
               | 
               | Whether you keep the duplicated code or refactor it in a
               | different way is another question, what matters for the
               | "duplication is cheaper than wrong abstraction" to be
               | true is just the fact that by introducing abstraction
               | early you wasted time refactoring one way and back.
               | Refactoring isn't free. So in fact leaving the
               | duplication there would have been cheaper - Q.E.D.
               | 
               | It doesn't mean you should never risk it, but it does
               | mean you should think hard before you do it.
        
           | sweezyjeezy wrote:
           | I think there's a middle ground here. The original quote does
           | not mean DRY=bad, abstraction=bad. The point is there is a
           | non-zero cost to these things. A bad abstraction can, as you
           | say, accumulate to something terrible through inertia or
           | inexperience. A bad abstraction, even if caught early, was
           | probably not worthwhile - I mean, it took time just to make
           | the original one right? This does not mean that we should be
           | scared of abstraction in general, but in my opinion
           | abstractions that are purely for the sake of reducing
           | duplication should be viewed with an extra level of
           | apprehension.
        
           | TOGoS wrote:
           | > You should almost never try to expand abstractions in this
           | manner
           | 
           | But somebody on your team /will/.
           | 
           | > you should feel eager to simply split apart the abstraction
           | 
           | Sure, but it's going to be a lot more work at this point than
           | if we had avoided the mess in the first place.
        
         | bheadmaster wrote:
         | There's a very good quote on a programming blog [0] that I
         | enjoyed reading:
         | 
         | > Repeat yourself to avoid creating dependencies, but don't
         | repeat yourself to manage them.
         | 
         | [0] https://programmingisterrible.com/post/139222674273/write-
         | co...
        
         | CuriouslyC wrote:
         | Duplication can sometimes be useful, for instance if you have
         | many small variations on a central process. Trying to make one
         | process with all the edge cases baked in leads to overly-
         | complex, hard to reason about, expensive software.
         | 
         | In my experience, the right way to handle this sort of
         | situation is to create a functional mini-DSL for the process
         | that handles all the implementation details, then create a
         | "default" process which serves as a template. If a process
         | needs slightly different logic, just copy the template, update
         | the DSL to support any new logic, and update the template with
         | the new DSL statements. This approach lets you give semantic
         | meaning to implementation details, and you can see where all
         | the different custom logic is at a glance by looking at all the
         | template copies. As long as the template is only calling out to
         | DSL actions with no internal logic of its own and process flow
         | is correctly encapsulated in the DSL, you should never need to
         | update templates to change behavior, only update the DSL.
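
A rough Python sketch of the template-plus-mini-DSL approach described above, assuming a toy delimited-text import. Everything here is hypothetical: the helper names, the "acme" variant, and the data format are invented for illustration. The helpers play the role of the DSL; each template is a flat list of DSL calls with no logic of its own.

```python
from typing import List

Rows = List[List[str]]


# --- The "DSL": small named actions that own the implementation details ---
def read_rows(text: str, sep: str = ",") -> Rows:
    return [line.split(sep) for line in text.strip().splitlines()]


def drop_header(rows: Rows) -> Rows:
    return rows[1:]


def pick_columns(rows: Rows, indexes: List[int]) -> Rows:
    return [[row[i] for i in indexes] for row in rows]


def strip_cells(rows: Rows) -> Rows:
    return [[cell.strip() for cell in row] for row in rows]


# --- The "default" template: only calls DSL actions ---
def import_default(text: str) -> Rows:
    rows = read_rows(text)
    rows = drop_header(rows)
    return strip_cells(rows)


# --- A copied template for a hypothetical variant: semicolon-separated,
# --- and only the first and third columns are wanted.
def import_acme(text: str) -> Rows:
    rows = read_rows(text, sep=";")
    rows = drop_header(rows)
    rows = pick_columns(rows, [0, 2])
    return strip_cells(rows)
```

The templates are deliberately duplicated, so diffing `import_default` against `import_acme` shows the custom logic at a glance, while a fix to `strip_cells` or `read_rows` still lands in one place.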
        
           | augustk wrote:
           | DSL = Domain-specific language (I guess).
           | 
           | Always a good idea to expand an abbreviation the first time
           | it's used.
        
           | amalcon wrote:
           | This way of doing things (which I agree is often the correct
           | way) is the reason for Greenspun's Tenth Rule:
           | https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule
           | 
           | Though it's less true today and in languages that are not C
           | or Fortran. Even something like C++ or Java has the template
           | method pattern, which gets you 80% of the way there. Dynamic
           | languages like Python or Ruby tend to have pretty reasonable
           | facilities for building DSLs, as do more modern languages
           | like Scala and Rust.
        
           | crazygringo wrote:
           | > _is to create a functional mini-DSL_
           | 
           | Exactly. Is there a formal term for this?
           | 
           | Instead of one gigantic function with 50 parameters, you have
           | 100 "template" functions, that all make use of 60 different
           | "helper" functions (what you're calling the DSL).
           | 
           | Instead of castles-of-logic abstraction, it's nuts-and-bolts
           | or grass-roots abstractions. I've never come across a name
           | for this development style.
           | 
           | But it generally works extremely well when building processes
           | for tens/hundreds of data formats or customers or what have
           | you.
        
             | fiddlerwoaroof wrote:
             | This sounds like what lispers call something like "language
             | driven design" or "growing a language"
        
             | consilient wrote:
             | > I've never come across a name for this development style.
             | 
             | Libraries designed like this are sometimes called
             | "combinator libraries".
        
             | code_biologist wrote:
             | Embedded DSLs is the term I've seen in the Haskell, Scheme
             | and Ruby communities.
        
           | tracker1 wrote:
           | This is generally my approach to data ingress/egress (ETL)...
           | I'd rather have a hundred similar, small scripts for each
           | data source than try to create one complex (monstrosity)
           | application to handle them all.
        
         | codegeek wrote:
         | Also, you become a better programmer if you write duplicate
         | code and then learn how to abstract it for cases that make
         | sense. I also don't believe that dupe code is always a bad
         | thing. Like everything else in software engineering, IT
         | DEPENDS.
        
         | bena wrote:
         | The answer, as always, is "sometimes".
         | 
         | _Sometimes_ duplication is cheaper than the wrong abstraction.
         | 
         | And
         | 
         | _Sometimes_ it's better to abstract away a duplication rather
         | than let it lie.
         | 
         | And that's the mark of becoming a master at the craft. Being
         | able to recognize all of these various slight permutations of
         | state and what to do about them.
        
           | rightbyte wrote:
           | Rules of thumb really need to be told like this. Otherwise they
           | will be misused, either by newbies who don't know any better
           | or by unpleasant programmers who will shove their dogmatic
           | beliefs down your throat with the common wisdom as an excuse.
        
         | awkward wrote:
         | A good example of this is operations type stuff, like the pile
         | of shell scripts or terraform files or whatever that get used
         | to deploy your app. These scripts benefit greatly from a one to
         | one relationship between the thing you're creating and the
         | written text describing it. Not having a situation where
         | changing one thing breaks everything else is a huge help there.
        
         | capableweb wrote:
         | As a FYI, just as it's OK to abstract away duplication in code,
         | it's OK to do the opposite, remove abstraction and add
         | duplication.
         | 
         | So in your particular case, it could have been possible to
         | abstract away the code _at that point in time_ and once they
         | diverge, remove the abstraction and duplicate, then adjust one
         | of the duplicates (which no longer is a proper duplicate
         | really).
         | 
         | But, might be more work than it's worth. YMMV.
        
           | patrick451 wrote:
           | > As a FYI, just as it's OK to abstract away duplication in
           | code, it's OK to do the opposite, remove abstraction and add
           | duplication.
           | 
           | > So in your particular case, it could have been possible to
           | abstract away the code at that point in time and once they
           | diverge, remove the abstraction and duplicate, then adjust
           | one of the duplicates (which no longer is a proper duplicate
           | really).
           | 
           | This sounds nice in theory, but the reality is that the
           | effort required to make these two kinds of changes is not
           | symmetric. It's about 10 times easier to get a PR approved
           | and merged that combines similar looking code into a function
           | than vice versa. If you have any suspicion at all that an
           | abstraction you're making may need to be removed and
           | duplicated in the future, you're better off just never
           | abstracting in the first place.
           | 
           | It sucks pushing a change which unwinds an abstraction like
           | that through code review. It's usually a lot easier to just
           | never abstract it in the first place.
        
         | mdiesel wrote:
         | Equality doesn't necessarily mean Equivalence.
        
         | Gibbon1 wrote:
         | My problem is that writing a function to remove duplication
         | often brings up the question of where to put it. If it's only
         | called inside one module, it doesn't really matter. But if not,
         | you've created a dependency, which is bad.
         | 
         | I think how much you hate that may depend on your language and
         | the program. Some big enterprise Java monolith is a garbage
         | dump of thousands of small files. So who cares. In C, without
         | namespaces and with the need for headers, you care more.
        
         | schwartzworld wrote:
         | The problem is that "duplication is cheaper than the wrong
         | abstraction" is basically an excuse that lazy devs use not to
         | engineer their code.
         | 
         | The other one I hear a lot is "it's not realistic to reach 100%
         | test coverage / type safety" when submitting code with `any` all
         | over it and zero tests.
        
         | __alias wrote:
         | I buy into the same belief as you here, but I guess you could
         | easily argue that you could create a suitable fitting
         | abstraction earlier on with the understanding that you can
         | "detach" them once the point that they're fundamentally
         | different comes
        
           | consilient wrote:
           | The point of abstraction is to reduce the number of concepts
           | in play. If you're still tracking which old concept is
           | "really" being used every time, you haven't actually
           | abstracted over anything, you're just naming things badly.
        
             | capableweb wrote:
             | > The point of abstraction is to reduce the number of
             | concepts in play.
             | 
             | I'm not sure I agree with this. For me, the point of
             | abstraction is to divide the number of concepts between the
             | layers you introduce, effectively to hide concepts from the
             | layers where you don't want to have to care about them.
             | Oftentimes, abstractions add to the total number of concepts
             | in play, but hide them beneath/above the layers.
        
           | fluoridation wrote:
           | The problem is that there's an impetus to continue working on
           | top of established facilities, because it's usually
           | incrementally less work than reworking a piece of code into
           | something else. Plus it's difficult to recognize ahead of
           | time when something is about to become a problem, rather than
           | fix something that's already a problem.
        
         | feoren wrote:
         | You're absolutely right that it's important to look beyond how
         | two modules superficially look right now, and look instead at
         | how they _change_. However, if you've always defined your
         | abstractions based on what their consumers _need_ rather than
         | what their implementations _have_, then you shouldn't ever
         | need to stretch them. They're not trying to "cover" both cases,
         | they're trying to solve a problem that both cases have. Your
         | two cases are not implementations of the abstraction, they are
         | consumers of it. If one case grows to not have that problem, it
         | just stops asking for that abstraction. If it grows to have
         | more problems, it just asks for more abstractions. The original
         | abstraction, if based on a common need, doesn't have to change.
         | 
         | That's not to say abstractions never change -- they do. But
         | they change because your understanding of the sub-problem
         | they're solving has changed, not because their implementations
         | or consumers have changed.
        
         | peeters wrote:
         | I try to think about whether two concepts are innately similar
         | or incidentally similar. Computing compounding interest for a
         | home equity loan and a mortgage might be innately similar. A
         | desired change to one will probably make a desired change to
         | the other. Computing growth of a fruit fly population and
         | computing compounding interest for a loan might be
         | incidentally similar. Until you change your
         | "computeExponentialGrowth" function to now handle occasional
         | decimations from environmental sources, and anyone looking at
         | the code wonders what the heck that looks like for a loan.
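
A rough Python sketch of the hazard described above, under the
assumption that one shared helper serves both the loan and the fruit-fly
code. The function names and the decimation parameters are hypothetical;
only "computeExponentialGrowth" comes from the comment.

```python
def compute_exponential_growth(initial: float, rate: float, periods: int,
                               decimation_every: int = 0,
                               decimation_fraction: float = 0.0) -> float:
    """Shared helper after the fruit-fly requirement was bolted on."""
    value = initial
    for period in range(1, periods + 1):
        value *= 1 + rate
        # Only meaningful for populations: every N periods a fraction dies off.
        if decimation_every and period % decimation_every == 0:
            value *= 1 - decimation_fraction
    return value


def loan_balance(principal: float, annual_rate: float, years: int) -> float:
    # The decimation parameters are reachable from here, yet they have
    # no obvious meaning for a loan.
    return compute_exponential_growth(principal, annual_rate, years)


def fly_population(initial: float, growth_rate: float, generations: int) -> float:
    # Hypothetical rule: half the population is lost every third generation.
    return compute_exponential_growth(initial, growth_rate, generations,
                                      decimation_every=3,
                                      decimation_fraction=0.5)
```

The two callers only ever shared the arithmetic, not a reason to change
together, which is what makes the similarity incidental here.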
        
           | HWR_14 wrote:
           | As interest rates go back up, paying off (part of) a mortgage
           | early might come back.
        
           | adammarples wrote:
           | If you've got your abstractions correct, then the exponential
           | growth term and the decimation term will be partial
           | differentials which will compose together nicely
        
           | AnimalMuppet wrote:
           | For a loan, maybe it looks like payments on the principal?
           | 
           | But your overall point is _very_ correct. Don't make an
           | abstraction because of coincidence.
        
         | msluyter wrote:
         | I think one example where duplication > abstraction is in
         | tests. I personally find tests that have a ton of extra helper
         | classes/functions to do stuff like set up fixtures or do
         | assertions to be painful to deal with. Taken to an extreme you
         | end up with a mini test framework that obscures the actual test
         | cases and is as hard to understand as the code in question.
         | 
         | I'm not against shared test fixtures or some utility functions,
         | but IMHO, it's better to have some duplication but clearer
         | tests.
        
           | echelon wrote:
           | Fully agree.
           | 
           | I would add that you should duplicate the common, cross-
           | cutting setup (eg. faked/mocked dependencies that don't
           | matter), but make the test conditions themselves explicit.
           | 
           | You get a feel for the correct granularity the more tests you
           | write within the codebase. If you try to be too clever in
           | saving boilerplate, you'll cause pain for future
           | modifications and maintainers. Sometimes fixing "clever"
           | tests takes longer than the code change itself.
        
           | abcdaiojjdfoj wrote:
           | I like it when you have nice, composable utility functions.
           | Ideally each test contains a short preamble setting up the
           | appropriate context for the test to run. The preamble
           | elucidates what the tests are actually testing. It can also
           | serve as documentation on how to use those functions.
           | 
           | There will probably be _some_ duplication across tests, but
           | if the utility functions are idempotent/composable, they're
           | usually pretty easy to read/understand and equally mechanical
           | to write/update.
        
           | ezekg wrote:
           | > I personally find tests that have a ton of extra helper
           | classes/functions to do stuff like set up fixtures or do
           | assertions to be painful to deal with.
           | 
           | I think it depends on the context. For example, I typically
           | agree, but when I was writing authz tests [0], I ended up
           | writing a DSL so that 1) I'd be more inclined to write the
           | thousands and thousands of tests, and 2) I'd be able to focus
           | on the actual authz assertion and not on verbose setup.
           | 
           | I couldn't imagine writing those policy tests without that
           | abstraction. I would have lost my mind with all of the
           | repetition, and would have almost assuredly made mistakes.
           | 
           | [0]: https://github.com/keygen-sh/keygen-
           | api/blob/master/spec/pol...
        
             | HumanOstrich wrote:
             | Thank you for the link. This is inspiring. Do you have any
             | resources you could link to that would explain some or all
             | of the style for these tests?
        
         | crabbone wrote:
         | So... you kept modifying the two similar pieces of code until
         | they became dissimilar. Why do you think that you wouldn't be
         | able to modify the abstraction if you saw that it doesn't fit
         | anymore?
        
           | convolvatron wrote:
           | I think part of the issue here is that a fair number of
           | programmers work in shops where they have very limited
           | agency. They are tasked with making the minimum defensible
           | change to add a feature or fix a bug. They are not allowed to
           | change the tests or suggest refactoring. So those things just
           | don't occur.
        
         | AnimalMuppet wrote:
         | In that situation, the correct thing to do is, when the two
         | pieces drift away from each other, to recognize that they are
         | no longer the same abstraction and to break the connection.
         | That may be painful - you have to look at everywhere that
         | abstraction is used and figure out which thing it really is,
         | and change the code to reflect it.
         | 
         | But if that's going to happen, then in the early days, a little
         | duplication was probably better.
        
           | [deleted]
        
         | lukeramsden wrote:
         | > There are many instances I've encountered where two pieces of
         | code coincided to look similar at a certain point in time. As
         | the codebase evolved, so did the two pieces of code, their
         | usage and their dependencies, until the similarity was almost
         | gone
         | 
         | https://connascence.io/
        
       | NewEntryHN wrote:
       | The author did not understand the idea.
       | 
       | His description of his understanding does not include any
       | reference to the "wrong"-ness of abstractions that shouldn't
       | exist. If I read him as-is, I should conclude that the idea is to
       | never make any abstraction at all. It obviously cannot be it
       | since that would be stupid.
       | 
       | "Wrong" abstractions are already bastardized, from their first
       | iteration. Developers decide to code them nonetheless because
       | they estimate that their "awkwardness" is worth it in comparison
       | to code duplication. What they fail to realize is that, unlike
       | code duplication, which just "is there", the awkwardness of the
       | abstraction will compound.
       | 
       | Duplication is the last resort, when one has established that he
       | couldn't find any non-wrong abstraction.
        
       | gumby wrote:
       | An important context is the use case. Grossly speaking, business
       | applications tend to have a shorter lifetime and faster cycle
       | time than system code like, say, the Linux kernel or gcc. So the
       | cost of refactoring in the latter case is amortized over a longer
       | timescale; when you have rapid business needs it can often be
       | better to just make the change in two or three places and move
       | on because in a few years the whole thing will be replaced.
       | 
       | We all know of exceptions to those examples (quick-and-dirty code
       | that survives decades later) but I think that's the way to think
       | about it.
        
       | waffletower wrote:
       | I have my umbrella at the ready for a downvote hailstorm: it
       | makes perfect sense that the OP is hearing this repeated in the
       | Rails community, as they are already enmired in the wrong
       | abstraction ¯\_(ツ)_/¯
        
       | cochne wrote:
       | > Not every piece of code is an abstraction of course. To me, an
       | abstraction is a piece of code that's expressed in high-level
       | language so that the distracting details are abstracted away. If
       | I were to see a confusing piece of code littered with conditional
       | logic, I wouldn't see it and think "oh, there's an incorrect
       | abstraction", I would just think, "oh, there's a piece of crappy
       | code". It's neither an abstraction nor wrong, it's just bad code.
       | 
       | The wrong abstraction isn't crappy code itself. It is a
       | reasonable looking piece of code that will force the next person
       | into writing crappy code to accommodate it.
       | 
       | Edit: I think the entire project of TensorFlow is a good example
       | of this. They built the library around a "graph" entity, and
       | anything you did had to be shoehorned to fit that. That worked OK
       | for some straightforward neural networks and situations for a
       | while. As the area evolved though, it proved very burdensome.
       | They tried to evolve it into TensorFlow 2.0 which was more
       | forgiving, but by that point it was too late, the ecosystem
       | became a mess. PyTorch stole the thunder because they didn't make
       | the wrong abstraction (though I'm not sure if "duplicating" is
       | what helped them do that)
        
       | Strilanc wrote:
       | One of the major shifts in my coding style over the past ten
       | years has been to increase the amount of duplication. My
       | threshold for "I should really dedupe that" increased from ~3:7
       | lines to ~10:50. Looking back this was driven by two main
       | factors: testing and performance optimization.
       | 
       | The testing side is just that tests become awful much faster than
       | normal code if you dedupe them. Unit tests are supposed to be
       | simple and independent, but deduping makes them correlated and
       | complex. You think you'll make things simpler by extracting the
       | common setup from twenty tests into one method, but instead
       | you've coupled the tests so they can't individually be tweaked
       | and laid the seeds for a monster incomprehensible test object to
       | grow from.
       | 
       | The performance side is that often improving performance requires
       | removing abstraction layers so everything is in one spot,
       | allowing irrelevant cases to be removed. Adding the abstraction
       | layers ahead of time makes performance worse to start with, from
       | all the jumping and "paper over one more difference" flag
       | checking, and also makes performance improvements harder later.
       | 
       | If two things are supposed to behave analogously, I'm nowadays
       | much more likely to enforce this by testing the analogy rather
       | than by sharing the implementation.
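
One way to read "testing the analogy", sketched in Python. The two
parsers are hypothetical examples, deliberately kept as separate
implementations; the test is what ties their behavior together.

```python
def parse_dotted(version: str) -> tuple[int, ...]:
    """Parse versions written like '1.2.3'."""
    return tuple(int(part) for part in version.split("."))


def parse_dashed(version: str) -> tuple[int, ...]:
    """Parse versions written like '1-2-3'.

    Intentionally not sharing code with parse_dotted, so either one can
    be optimized or special-cased without touching the other.
    """
    return tuple(int(part) for part in version.split("-"))


def test_parsers_are_analogous() -> None:
    # The analogy is stated once, here, rather than in a shared helper.
    for parts in [(1,), (1, 2), (10, 0, 3)]:
        dotted = ".".join(map(str, parts))
        dashed = "-".join(map(str, parts))
        assert parse_dotted(dotted) == parse_dashed(dashed) == parts
```

If the two functions are ever meant to diverge, the test fails and
forces that decision to be made explicitly.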
        
       | yafbum wrote:
       | Let me give an example bad abstraction that isn't due to littered
       | conditionals, but still very bad.
       | 
       | One time company A had a database, and code that loaded persisted
       | object state from it. Some of the objects could be soft
       | deleted. Rather than check various objects for soft deletion, the
       | team decided to check all objects for soft deletion, regardless
       | of their type, by querying a table where objects had to be listed
       | if they were still live (not soft deleted).
       | 
       | Fast forward a few years, everybody follows this pattern, and
       | there is massive hotspotting of that central "object lifetime"
       | table that has basically two columns (object_id, is_deleted) that
       | becomes a latency bottleneck because absolutely everything is
       | joining on it all the time.
       | 
       | Truth is, it made it convenient to code with this, because you
       | never had two ways of checking whether an object was live, and by
       | construction you could never make the mistake of operating on a
       | soft deleted object or forgetting to implement lifecycle
       | deletion.
       | 
       | But man was that a poor abstraction. It was probably redundant
       | with database functionality. It gave soft deletion capabilities
       | even to things that didn't need soft deletion. It had a
       | significant latency cost. But everybody adding a new object type
       | just picked it because it was the way the company has decided it
       | would do soft deletion.
        
         | mannykannot wrote:
         | I feel you are describing an implementation that was once fine
         | but is no longer satisfactory, rather than an abstraction,
         | which perhaps could have been made easier to fix with a bit
         | more abstraction: a function to do the soft deletion if
         | possible, with a better-performing (albeit probably more
         | complex) way of determining whether soft deletion was an
         | option.
        
         | harrisonjackson wrote:
         | > Fast forward a few years
         | 
         | Sounds like it was just what was needed at the time and worked
         | better/longer than most abstractions.
        
           | RandallBrown wrote:
           | > Sounds like it was just what was needed at the time
           | 
           | The problem I've seen often in codebases is that as an
           | abstraction or pattern grows more unwieldy, they don't take
           | the time to update it.
           | 
           | They often don't get revisited until they're so bad that they
           | can't be ignored.
           | 
           | Handling something with a switch or if/else is fine if
           | there's only 2 or 3 options, but people will often just keep
           | piling on. When it's 10 things, changing it becomes much more
           | work so people will continue to add to it. Then when it
           | breaks at 20 things, someone will come in and say "Why did we
           | write it this way in the first place? It doesn't make any
           | sense!"
           | 
           | I'm often torn between pragmatically writing the simplest
           | code possible and being proactive about abstracting early to
           | prevent an eventual breakdown of the pattern.
        
             | edgyquant wrote:
             | Why is a switch bad? Python uses a giant switch statement
             | to run its opcodes
        
             | Pannoniae wrote:
             | How does a switch break at 20 items? Any respectable
             | compiler or interpreter should handle that fine. If it was
             | 32k cases, I could imagine why it would raise an error. But
             | 20? Seriously?
             | 
             | Often, writing more cases into a switch statement is way
             | easier and less boiler-plate-y than abstracting it out to
             | subclasses or a dictionary or whatever.
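
For comparison, a small Python sketch of the two shapes being discussed:
a plain chain of branches and a dictionary lookup. The shape names and
formulas are illustrative assumptions, not taken from the thread.

```python
import math


def area_branching(shape: str, size: float) -> float:
    """Adding a case means adding one more branch, right here."""
    if shape == "square":
        return size * size
    elif shape == "circle":
        return math.pi * size ** 2
    elif shape == "triangle":  # equilateral, side length = size
        return math.sqrt(3) / 4 * size ** 2
    raise ValueError(f"unknown shape: {shape}")


# The table-driven alternative: the cases live in data, not control flow.
AREA_BY_SHAPE = {
    "square": lambda s: s * s,
    "circle": lambda s: math.pi * s ** 2,
    "triangle": lambda s: math.sqrt(3) / 4 * s ** 2,
}


def area_table(shape: str, size: float) -> float:
    try:
        return AREA_BY_SHAPE[shape](size)
    except KeyError:
        raise ValueError(f"unknown shape: {shape}") from None
```

Both versions behave the same for these three cases; which one is easier
to extend depends on whether the cases stay this uniform.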
        
       | arein3 wrote:
       | It depends. If it's not some core area of the code, but more like
       | a script, some code that lives at the periphery, it might be
       | better to "duplicate" almost similar code that is hard to
       | abstract.
       | 
       | I saw attempts to remove "duplication" that made the code so
       | hairy and hard to read, as opposed to very readable. I put
       | duplication in quotes, because code might be similar, but not
       | 100%.
       | 
       | Some code is easy to deduplicate.
       | 
       | Some code might be hard, and if the overengineering is done to
       | remove 2 occurrences at some code periphery, it is not worth it.
        
       | geoguess wrote:
       | I'm a copy+paste programmer, and proud of it. It's quicker,
       | easier, and most importantly: someone else's problem to fix, if
       | they're the type of developer who disagrees with this coding
       | style.
       | 
       | I'll keep churning out duplicated code and you guys can keep
       | refactoring against it. We all get paid so what's the problem?
        
         | disintegore wrote:
         | Pragmatic genius
        
         | Bjartr wrote:
         | For some, there's more to job satisfaction/QoL than just money.
        
         | seattle_spring wrote:
         | I have a feeling you and I are not paid the same, so there's
         | that.
        
       | saurik wrote:
       | As someone who has made a good life over the years by taking
       | advantage of the security bugs (either to build my embedded
       | empires--aka, jailbreaking--or to directly collect bounties)
       | caused by all of the people who hate abstraction so much (or are
       | merely so bad at doing it that they don't know how to do it well)
       | that they vehemently argue that duplication is not merely a
       | temporary pragmatic decision to incur potentially-dangerous
       | architectural debt which you intend to come back and fix later
       | but is somehow _better_ than even _trying_ to address it, I guess
       | I find this discussion thread of people almost 100% tearing into
       | this article's fundamental premise... kind of fun? ;P
       | 
       | So, yes, yes: _please_ do continue to ensure you have so much
       | boilerplate in your "flat and easy to understand" code that you
       | eventually make a fatal mistake (potentially simply while doing a
       | merge commit), refuse to factor your safety checks out into
       | abstractions that prevent you from making the same mistake twice
       | due to your refusal to "obfuscate the underlying API everyone
       | knows how to use", and (my true favorite) litter your code with
       | multiple implementations of the same algorithms that have _very
       | subtle_ differences in them (so-called "parser differentials")
       | as you insist on every single programming language in use having
       | its own copy of the algorithm "for ergonomic reasons, as IPC/FFI
       | would be crazy when I can just import a second one off-the-
       | shelf".
        
         | rightbyte wrote:
         | How do you know the code you reverse engineered was flat and
         | simple and not Best Practice with Scrum on top?
        
       | 1vuio0pswjnm7 wrote:
       | Abstractions that are software language idioms are more palatable
       | than bespoke ones.
        
       | amelius wrote:
       | What is wrong with duplication if you can just ask Copilot to
       | deduplicate it whenever you want?
        
       | eweise wrote:
       | DRY to me means having a single authoritative source. So for
       | instance, if I need to define a person data structure then I use
       | protobuf. I can add validation rules, and types to it. I can
       | generate bindings for java, go, ruby, etc and they can all rely
       | on the same person structure, with the same validations. Code is
       | technically copied but there is still a single authoritative
       | source.
       | 
       | If I need to handle bank transactions, then I will create a single
       | "microservice" that knows how to create a transaction and update
       | the account balance. I wouldn't want that logic duplicated in
       | multiple places.
        
       | bingemaker wrote:
       | Wrong abstractions make the abstraction configurable, and it is a
       | slippery slope. Keep adding more arguments, configuration
       | options, and there is no end to it. Sometimes duplication is
       | indeed cheaper
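
A hedged Python illustration of the slippery slope described above. The
helper and its flags are hypothetical; the point is only that every new
caller adds another option to the shared function.

```python
def format_name(first: str, last: str, *, last_first: bool = False,
                upper: bool = False, initial_only: bool = False) -> str:
    """One shared helper; each new caller contributed another flag."""
    if initial_only:
        first = first[0] + "."
    name = f"{last}, {first}" if last_first else f"{first} {last}"
    return name.upper() if upper else name


# The duplicated alternative: two tiny functions, no flags to combine.
def badge_name(first: str, last: str) -> str:
    return f"{first} {last}".upper()


def index_name(first: str, last: str) -> str:
    return f"{last}, {first[0]}."
```

With three boolean flags the shared helper already has eight possible
behaviors, most of which no caller asked for.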
        
       | geophile wrote:
       | Don't generalize too soon. But don't wait too long either. If you
       | have to choose one, wait too long and then refactor.
        
       | 49531 wrote:
       | I think mislabeling something as a duplication is where most of
       | these issues stem from.
       | 
       | Humans love to pattern match, we find patterns in things that
       | often have no real pattern. It is not uncommon in my experience
       | to see patterns in code, label the code as not DRY, and attempt
       | to DRY it up. If the "duplication" detected was, in fact, not a
       | duplication but rather code that just happens to be similar, the
       | abstraction will often go awry.
       | 
       | My rule-of-thumb is to prioritize maintenance over authorship. Am
       | I writing this code in a way that makes it easier for future me
       | or another programmer to change it, or am I optimizing for a
       | sleek diff in my code review? I think our code can look like
       | breadboards instead of a bespoke printed circuit board, we have
       | compilers for that.
        
       | cm2012 wrote:
       | Not coding related, but the worst sins of corporate life (like
       | strict procurement teams) stem from efforts at deduplication.
        
       | shadowfoxx wrote:
       | In my career as a software dev I've found one thing to be true:
       | every paradigm I ingest that opens new windows of opportunity
       | is great at first pass, and the more I learn, the narrower the
       | scope in which it can be applied. (This is kinda true in life,
       | too. Like when people say, "It's econ 101 or bio 101." etc. What
       | seems like a statement about 'common' knowledge is actually an
       | indication of how shallow your knowledge is!)
       | 
       | Specifically related to this topic is a talk by Dan Abramov
       | called, "The Wet Codebase" - He says it better than I can sum up
       | and has visual aids : https://www.youtube.com/watch?v=17KCHwOwgms
       | 
       | Others have pointed out code that is similar in function vs
       | similar by coincidence and I think that thought alone is worth
       | chewing on.
        
       | linuxftw wrote:
       | If working in a solitary codebase, this problem isn't very
       | interesting. Do whatever makes your life easier.
       | 
       | If you're working on any kind of code that serves as a library to
       | other code, don't mutate the signatures of your public
       | methods/functions. Once that signature is released, the only
       | changes to its output should be bug fixes. If you have a need
       | for two very similar functions, you should use 2 wrapper
       | functions with the common code in the 3rd.
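
The wrapper arrangement described above might look like this in Python.
All three function names are hypothetical, and the sketch assumes
non-negative amounts given in cents.

```python
def _render(amount_cents: int, symbol: str, symbol_first: bool) -> str:
    """Private shared core; free to change because nothing public calls it.

    Assumes amount_cents >= 0.
    """
    text = f"{amount_cents // 100}.{amount_cents % 100:02d}"
    return f"{symbol}{text}" if symbol_first else f"{text} {symbol}"


# Public wrappers: their signatures stay fixed once released.
def format_usd(amount_cents: int) -> str:
    return _render(amount_cents, "$", symbol_first=True)


def format_eur(amount_cents: int) -> str:
    return _render(amount_cents, "€", symbol_first=False)
```

If the two formats later diverge, each wrapper can grow its own body
without changing what callers see.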
        
       | mcqueenjordan wrote:
       | The crux of the argument is:
       | 
       | > I think "the wrong abstraction" is a confused way of referring
       | to poorly-de-duplicated code.
       | 
       | But I believe this is similar to a no true Scotsman fallacy. "If
       | you just make the right abstraction, de-duplicating is fine!"
       | 
       | Yes, if you're good at making the right abstraction, it's not
       | worse! Those are the cases when I definitely do the refactoring:
       | when I know for sure I know the right abstraction. Otherwise, I
       | defer the decision for an older, smarter, wiser me (or future
       | maintainer).
        
         | jkubicek wrote:
         | It _is_ the "no true Scotsman" fallacy.
        
       | vlunkr wrote:
       | > To me, an abstraction is a piece of code that's expressed in
       | high-level language so that the distracting details are
       | abstracted away
       | 
       | That might be what an abstraction is to the author, but it's not
       | a correct definition. Abstraction has nothing at all to do with the
       | high or low level languages.
       | 
       | https://en.wikipedia.org/wiki/Abstraction_(computer_science)
        
       | TheOtherHobbes wrote:
       | It's curious there's no formal concept of "unduplication" -
       | splitting a single abstraction originally created to avoid
       | duplication, now littered with conditionals and spaghetti, into
       | separate abstractions that now do something unrelated.
        
       | itsafarqueue wrote:
       | This code golfing always boils down to "it depends". Senior
       | engineers by definition nod sagely and everyone else looks around
       | nervously. It's a tough break. Both approaches are correct and
       | wrong. It depends.
        
       | msie wrote:
       | When I was younger I was more productive when I didn't
       | contemplate such matters. Maybe I wrote a lot of junky code but I
       | got a lot of working stuff done. Now my time is wasted reading
       | clout chasers and their opinions. Reading about coding is such a
       | bad habit when it stops you from coding.
        
       | twodave wrote:
       | I have been down both roads. I've seen unwieldy abstractions
       | reduce a codebase down to a giant pile of edge cases, and I've
       | seen codebases where making a single change to the design has
       | required editing dozens of files. Where I've ended up over the
       | years is to abstract the "big" things. The types that represent
       | your domain. The pieces of the data layer that need to be exactly
       | the same every time. After that, solve for large classes of
       | problems. This may be an abstraction, a usage pattern, or just a
       | function. Transaction management, logging, etc.
       | 
       | Know that if you try to wrap ANYTHING in an adapter "in case we
       | want to swap it out later" that this almost never happens, and
       | when it does the abstraction you came up with is probably
       | inadequate. Transaction handling in one tech is different than
       | another. Or logging context is handled via disposable scopes
       | instead of as part of the log entry. For those cases, if someone
       | isn't already maintaining a good abstraction (like MassTransit)
       | then it probably doesn't exist.
        
       | EugeneOZ wrote:
       | There is a simple merit: if some code is complicated enough to
       | make you think twice before modifying it because you'll need to
       | modify all the copies (and you realize that it will be not easy)
       | - then it is better to make this code DRY.
       | 
       | There are some simple pieces of code that are cheap to copy and
       | modify later. And nothing wrong will happen if you do not apply
       | future modifications to every copy. A code like this doesn't have
       | to be DRY.
        
       | sozin wrote:
       | I think the author gets it wrong.
       | 
       | The cost of DRY (Don't Repeat Yourself)-ing up your code can be
       | high, in that it increases the coupling of your code, and
       | potentially lowers its cohesion.
       | 
       | Consider function def foo(a: int), called from call sites C1 and
       | C2. Eventually C1 wants something out of foo() that it doesn't
       | offer, but, critically, something that C2 _doesn't care about or
       | need_. The author of foo() adds a new default argument: def
       | foo(a: int, b:int = 0), and then there is a conditional block in
       | foo() that deals just with this new b argument.
       | 
       | You've now potentially broken callsite C2, by exposing it to
       | changes that it doesn't care about. Put another way: you
       | should only deduplicate the code if _all_ the call sites will
       | _always_ change for the same reason. Otherwise, you're lowering
       | the quality of the code by increasing coupling and lowering
       | cohesion. Copy and pasting the code in this case makes sense,
       | because C1 and C2 both have entirely different needs out of
       | foo(). Over time, foo() will accumulate more and more default
       | arguments as the author stridently attempts to keep everything
       | DRY, and the overall code base becomes more and more fragile.
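A small sketch of the situation described above, assuming a made-up body for `foo` (the comment gives only its signature). The helpers `call_site_c1` and `call_site_c2` are hypothetical stand-ins for C1 and C2.

```python
def foo(a: int, b: int = 0) -> int:
    """Shared function after C1's request was bolted on (body is illustrative)."""
    result = a * 2
    if b:  # this branch exists only for call site C1
        result += b
    return result


def call_site_c1() -> int:
    # C1 is the only caller that wants the new behavior.
    return foo(10, b=5)


def call_site_c2() -> int:
    # C2 never passes b, yet it still runs through the function that
    # now carries C1's branch, so edits made for C1 can affect it.
    return foo(10)


print(call_site_c1(), call_site_c2())
```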
        
         | jackcviers3 wrote:
         | So you make:
         | 
         |     // new foo block
         |     // using b:int stuff,
         |     // calling fooInternalsX,
         |     // fooInternalsY, etc.
         |     // for common functionality
         |     foo(a: int, b:int) = ...
         | 
         | and
         | 
         |     // stays the same
         |     // except replacing
         |     // common functionality with
         |     // calls to fooInternalsX,
         |     // fooInternalsY, etc.
         |     foo(a: int) = ...
         | 
         | and enough
         | 
         |     private fooInternalsX(a:Int) = ...
         |     private fooInternalsY(a: Int) = ...
         | 
         | methods to cover the common functionality.
         | 
         | Your code is still DRY, and you are using polymorphism (foos of
         | different type signatures) instead of if/else, the _behavior_
         | of foo(int) doesn't change, so you don't require additional
         | tests for _foo(int)_, the fooInternals<X,Y,Z> aren't public,
         | and you have now added tests for foo(int, int). You aren't
         | paying any additional costs in terms of maintenance. You aren't
         | increasing behavioral risk at C2 for calling foo(int). You are
         | _only_ paying more for foo(int, int), and those are costs that
         | you would have to pay regardless of if foo(int, int) literally
         | duplicated the body of foo(int) for common pieces or refactored
         | the common pieces out. You save cost for maintaining both
         | foo(int) and foo(int, int) if the common pieces need to change,
         | as you are adding tests for the behavioral changes to both
         | foo(int) and foo(int,int) tests, but are only making a single
         | change in the common code.
         | 
         | Also, when doing this, the abstraction is the original
         | foo(int), not the new, additional foo(int, int). Abstraction is
         | the assumption of some parameterized behavior via hard-coding.
         | Here, the new, additional parameterized behavior introduced by
         | the second b:int parameter is abstracted away in the original
         | foo(int), not in the new foo(int, int). That doesn't make the
         | original foo(int) abstraction _wrong_, because it is used in
         | at least one call site (C2).
         | 
         | Only when all call sites must change to accommodate something
         | that a new parameter allows through _more than one change-set_
         | can you begin to call an abstraction wrong. Otherwise, it is a
         | simple bug that was fixed by a single change-set.
        
       | mannykannot wrote:
       | _" To me, an abstraction is a piece of code that's expressed in
       | high-level language so that the distracting details are
       | abstracted away. If I were to see a confusing piece of code
       | littered with conditional logic, I wouldn't see it and think "oh,
       | there's an incorrect abstraction", I would just think, "oh,
       | there's a piece of crappy code". It's neither an abstraction nor
       | wrong, it's just bad code."_
       | 
       | Of course, if bad code is not an abstraction, then there can be
       | no such thing as a bad abstraction!
       | 
       | More to the point, code littered with conditional logic might
       | well be both good code and a good abstraction. There's a somewhat
       | well-known article out there claiming that Netscape shot itself
       | in the foot by deciding to rewrite the browser from scratch. As
       | an example of how that went wrong, the author mentions the
       | hapless developer trying to write code to work with some hardware
       | component (the great many different dial-up modems that were out
       | there at the time, IIRC), discovering that most of them had
       | unique quirks that had to be respected, even when they nominally
       | conformed to the same spec.
       | 
       | The thing is, you can no more apply abstraction to a program
       | until everything is simple than you can apply compression to a
       | file until it's down to a byte. What's really at issue here, as
       | Fred Brooks noted many years ago, is the difficult problem of
       | satisfying the demands of the context's essential complexity
       | while keeping a lid on the implementation's accidental
       | complexity.
        
         | tikhonj wrote:
         | There are a lot of ways for good code to express bad
         | abstractions. The abstraction could be inconsistent with other
         | parts of the system, inconsistent with the concepts it is meant
         | to represent, inconsistent with its own observable behavior,
         | inherently complex or hard to reason about, inconvenient to
         | actually use, poorly suited to whatever people _actually_ use
         | it for...
         | 
         | I've seen a lot of code that is perfectly clean and "well-
         | organized" _as code_ but organized into _absolutely awful_
         | abstractions.
         | 
         | None of that goes against your core point, I just think that
         | seeing the code and its abstractions separately is an important
         | perspective for understanding code design.
         | 
         | On the flip side, it's also totally possible to have bad code
         | but a good abstraction. Some of the best abstractions I've
         | worked with have painful implementations, and it didn't impinge
         | on the quality of the abstraction itself! Of course, the bad
         | code made life a lot more painful for the people responsible
         | for implementing and maintaining the abstraction, and I'm sure
         | it required some real skill and experience to keep that from
         | manifesting to users of the abstraction, but they managed it.
        
       | joshstrange wrote:
       | The whiplash I get from reading this article is massive. One
       | second they agree that bad abstraction (filled with conditionals)
       | is bad but then say:
       | 
       | > So instead of "duplication is cheaper than the wrong
       | abstraction", I would say "duplication is cheaper than confusing
       | code littered with conditional logic". But I actually wouldn't
       | say that, because I don't believe duplication is cheaper. I think
       | it's usually much more expensive.
       | 
       | (emphasis on the last sentence)
       | 
       | I couldn't disagree more. In fact it's an incredibly "junior dev"
       | mindset that sees 2 pieces of similar (or _even identical_) code
       | and is compelled to abstract it. Unless there are at a _minimum_
       | of 3 implementations I think it's always better to duplicate.
       | I've watched too many "common" functions grow over time with way
       | too many arguments, too many conditionals, and way too confusing
       | for anyone to easily follow. The most egregious is different
       | return values based on arguments passed in. I'm not talking
       | "array of strings" or "null" but "array of strings" or "single
       | string" (or worse).
       | 
       | Abstraction can be fun to write and it feels like you are doing
       | something to help "future proof" (also XKCD 927 [0]) but in
       | reality it boxes people in (especially if you try to abstract
       | with less than 3 real implementations) and leads to overly
       | complicated code, or worse "clever" code.
       | 
       | As I've grown as a dev I'm less and less inclined to write
       | "magic" or highly abstracted code and prefer dealing with
       | "boilerplate" that I can tweak as needed for the individual
       | use-case. Only once I have a clear pattern of code that's been
       | deployed and used for a good bit of time do I reach for
       | abstraction/reusable code.
       | 
       | [0] https://xkcd.com/927/
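The "different return values based on arguments passed in" problem called out above can be sketched as follows. The function names and the `record` shape are hypothetical; the point is only the contrast between a flag-dependent return type and two single-purpose functions.

```python
from typing import List, Union


def get_tags(record: dict, first_only: bool = False) -> Union[List[str], str]:
    """Anti-pattern: the return type depends on a flag."""
    tags = record.get("tags", [])
    return tags[0] if first_only else tags


def all_tags(record: dict) -> List[str]:
    """Alternative: one function, one return type."""
    return list(record.get("tags", []))


def first_tag(record: dict) -> str:
    """Alternative: the single-string case gets its own function."""
    return all_tags(record)[0]


record = {"tags": ["a", "b"]}
print(get_tags(record), get_tags(record, first_only=True))
print(all_tags(record), first_tag(record))
```

Callers of `get_tags` have to know which flag was passed before they can use the result; callers of `all_tags` and `first_tag` do not.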
        
         | mostlylurks wrote:
         | > I've watched too many "common" functions grow over time with
         | way too many arguments, too many conditionals, and way too
         | confusing for anyone to easily follow.
         | 
         | This is not the fault of the abstraction. This is the fault of
         | (especially junior developers) treating abstractions as sacred
         | and non-disposable, which is itself the result of a mindset in
         | which creating abstractions is discouraged. You should almost
         | never modify an abstraction. Don't modify abstractions to cover
         | new use cases, and you more or less won't run into any of these
         | issues. If you need to, create new abstractions and throw old
         | ones away.
         | 
         | > Unless there are at a _minimum_ of 3 implementations I think
         | it's always better to duplicate.
         | 
         | This is a silly rule to follow, except for the most
         | inexperienced of developers, perhaps. It doesn't take long to
         | gather enough experience to be able to recognize in most
         | cases whether some instance of duplication is coincidental
         | (structurally similar by happenstance, which could be
         | "abstracted" in a macro-like manner, resulting in something
         | quite fragile to changes) or if you're actually encoding some
         | piece of knowledge into an abstraction. Advice like waiting
         | until a piece of code repeats three times encourages developers
         | to think about abstractions in terms of structural similarity,
         | which is exactly the opposite of how abstraction should be
         | considered.
        
           | joshstrange wrote:
           | > This is a silly rule to follow, except for the most
           | inexperienced of developers, perhaps.
           | 
           | Perhaps you'd consider me inexperienced though I don't
           | consider myself to be so. I've learned enough times that
           | neither I, nor my colleagues, can accurately predict the
           | future and every time we think we know the cases that code
           | will need to handle in the future we guess wrong more often
           | than not.
           | 
           | What I'm trying to say is until you are sure a piece of code
           | is literally the same or with tiny differences that you can
           | cleanly abstract you shouldn't try to guess how future code
           | will use the abstraction. It's the same rule of mine where I
           | try to never proactively add functionality to a
           | function/piece of code. You think that you are saving your
           | future self (or peers) time but too many times I've seen
           | people guess wrong at what extra functionality we will need
           | and then that code never gets touched and/or gets
           | migrated/updated for years before someone realizes there is
           | no calling-code that uses that functionality but we have been
           | dragging it along this whole time.
           | 
           | Could you check everywhere and make sure it's not being used
           | and thus can be removed? Maybe but I understand the desire to
           | make as few changes as possible and preserve the
           | functionality as it was when you first went to edit the code.
           | Overall that's a good idea when making changes and sometimes
           | you don't always know what params all the clients are passing
           | to an endpoint to be sure of if something is still in use or
           | not.
        
         | jamil7 wrote:
         | > I couldn't disagree more. In fact it's an incredibly "junior
         | dev" mindset that sees 2 pieces of similar (or _even
         | identical_) code and is compelled to abstract it. Unless there
         | are at a _minimum_ of 3 implementations I think it's always
         | better to duplicate. I've watched too many "common" functions
         | grow over time with way too many arguments, too many
         | conditionals, and way too confusing for anyone to easily
         | follow. The most egregious is different return values based on
         | arguments passed in. I'm not talking "array of strings" or
         | "null" but "array of strings" or "single string" (or worse).
         | 
         | I agree with you here and tend to rather, if possible,
         | deduplicate subsystems or sub-functions of similar
         | looking/identical code and keep the duplicate public surfaces.
        
           | joshstrange wrote:
           | > I agree with you here and tend to rather, if possible,
           | deduplicate subsystems or sub-functions of similar
           | looking/identical code and keep the duplicate public
           | surfaces.
           | 
           | Completely agree, take the small parts that are
           | standalone/discrete and abstract them. I greatly prefer
           | something like
           | 
           |     function1() {
           |         commonCode1();
           |         commonCode2();
           |         commonCode4();
           |     }
           | 
           |     function2() {
           |         commonCode2();
           |         commonCode3();
           |         commonCode4();
           |     }
           | 
           | Over something like (assume I've inlined the commonCodeX
           | logic):
           | 
           |     functionCommon(branchingParam) {
           |         if(branchingParam) {
           |             commonCode1();
           |         }
           |         commonCode2();
           |         if(!branchingParam) {
           |             commonCode3();
           |         }
           |         commonCode4();
           |     }
        
         | jkubicek wrote:
         | > As I've grown as a dev I'm less and less inclined to write
         | "magic" or highly abstracted code and prefer dealing with
         | "boilerplate" that I can tweak as needed for the individual
         | use-case.
         | 
         | This is part of creating abstractions to benefit the reader,
         | not the writer of the code.
         | 
         | I'm currently refactoring a python package that was designed to
         | make writing ETLs very elegant (it worked!), but as a
         | consequence, when something goes wrong, figuring out what
         | happened involves poring through 4 different modules, class
         | hierarchies and trying to track variables through multiple
         | layers of abstraction. It's a nightmare for debugging.
         | 
         | Simple boilerplate is repetitive and boring, but man would it
         | be so much easier to read
        
           | joshstrange wrote:
           | > Simple boilerplate is repetitive and boring, but man would
           | it be so much easier to read
           | 
           | Yep, and I'll fully admit when I first started out I hated
           | this idea and wanted everything to be super-DRY but I've
           | swung back in the opposite direction (or at least to a good
           | mean). I had a developer ask why we had some boilerplate
           | semi-recently when the function in question was simply
           | calling another function on the parent class, why not just
           | call the parent function directly (it was protected, they
           | wanted to just make it public). I explained that yes, right
           | now we were doing a straight pass-through essentially (this
           | was for a CRUD layer) but that we had learned over and over
           | that over time we needed to add in things like business
           | logic, validation, or data migrations and this way we just
           | needed to change our "intermediate" function instead of
           | adding one later and having to change all the places that
           | were calling the "direct" function. Same idea as with
           | getters/setters, yes you don't "need" them always when you
           | first write them but having those hooks is invaluable down
           | the line.
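A rough sketch of the pass-through "intermediate" function described above, assuming a toy in-memory CRUD layer. `BaseRepository`, `UserRepository`, `_save`, and `create_user` are hypothetical names, not taken from the commenter's codebase.

```python
class BaseRepository:
    """Parent class with a protected storage primitive."""

    def __init__(self) -> None:
        self._rows: dict = {}

    def _save(self, key: str, row: dict) -> None:
        self._rows[key] = row


class UserRepository(BaseRepository):
    def create_user(self, key: str, row: dict) -> None:
        """Straight pass-through today.

        Validation, business logic, or data migrations can be added
        here later without changing any caller.
        """
        self._save(key, row)


repo = UserRepository()
repo.create_user("u1", {"name": "Ada"})
print(repo._rows)
```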
        
       | jongjong wrote:
       | I wish people would have this saying in the Node.js and
       | JavaScript community. I disagree with OP about this topic.
       | 
       | Abstractions are like the foundations of a building. Imagine that
       | you're building an apartment block and your job is to build the
       | foundations but you're unsure about how tall the building will
       | be.
       | 
       | If you build it on mud, that might be fine for a one story
       | construction but once other builders start adding additional
       | storeys on top, it will become totally unsuitable and the whole
       | thing will have to be rebuilt from scratch. Not only that, the
       | costs will begin to materialize immediately because those who
       | build on top of your foundation will make all sorts of bad
       | decisions because of your poor judgment; they might decide to
       | build the walls out of cheap wood instead of bricks simply
       | because it's lighter and they don't want the building to tilt and
       | sink into the mud... Then because wood was chosen as the
       | material, there may be a termite infestation and builders will
       | have to apply a special varnish on the entire surface of the
       | building... Then the varnish will turn out to be toxic and will
       | need to be removed. Every inch of the building will have to be
       | polished with sandpaper and painted over... And when the next
       | storey will need to be added, they will be forced to make it out
       | of cardboard... Then the tenants on the top floor will want their
       | money back and the whole building will need to be destroyed
       | anyway; all that back and forth will have been nothing but a
       | waste of time. You would have saved an entire decade and millions
       | of dollars if the foundation had been laid on solid bedrock in
       | the first place. Just one small sub-par decision which triggered
       | an avalanche of terrible decisions.
       | 
       | I think duplicating code makes sense and can be a wise decision
       | early in the project because it's essentially a refusal to lay
       | the foundation until there is more clarity about the scope of the
       | project. It's a lot easier to refactor and combine duplicated
       | code into a new abstraction than it is to refactor one
       | abstraction into a different abstraction. Not to mention that
       | developers become very attached to abstractions (including
       | incorrect abstractions) and it tends to upset people once they're
       | invested in it.
        
       | ekidd wrote:
       | I dislike code duplication. But do you know what I like even
       | less?
       | 
       | Giant functions with 12 keyword arguments passed up and down a
       | call stack, because those functions have many callers which want
       | _slightly_ different things.
       | 
       | Choosing the wrong abstraction often leads to endless kludges and
       | special cases. Two warning signs are functions with 12+ keyword
       | arguments, and strange class hierarchies full of callbacks that
       | only interact with a few functions.
       | 
       | The problem with all programming advice is that it needs to come
       | with George Orwell's classic advice to "Break any of these rules
       | sooner than say anything outright barbarous."
       | 
       | If programming advice makes your code look obviously gross,
       | ignore the advice.
        
         | CharlieDigital wrote:
         | That's not what abstraction is.
         | 
         | Abstraction tends to shift logic from _procedural_ to
         | _structural_.
         | 
         | Rather than 12 keyword arguments and 12 branches in 1 big
         | function, it should be 12 small classes (in OOP) or 12 small
         | functions (in FP) that each handle one of the branches. All
         | organized in some way that the logic of executing those parts
         | is shared in the structure of the code.
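One way to read "procedural to structural" is a lookup table of small functions in place of a chain of branches. This is a sketch under that assumption, with only two cases instead of twelve; `export_csv`, `export_tsv`, `EXPORTERS`, and `export` are hypothetical names.

```python
from typing import Callable, Dict, List


def export_csv(rows: List[list]) -> str:
    """Handle one case: comma-separated output."""
    return "\n".join(",".join(map(str, row)) for row in rows)


def export_tsv(rows: List[list]) -> str:
    """Handle another case: tab-separated output."""
    return "\n".join("\t".join(map(str, row)) for row in rows)


# The structure (a table of handlers) replaces an if/elif chain.
EXPORTERS: Dict[str, Callable[[List[list]], str]] = {
    "csv": export_csv,
    "tsv": export_tsv,
}


def export(rows: List[list], fmt: str) -> str:
    """Shared execution logic: pick the handler and run it."""
    return EXPORTERS[fmt](rows)


print(export([[1, 2], [3, 4]], "csv"))
```

Adding a new case means adding one function and one table entry; `export` itself does not grow a new branch.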
        
           | ekidd wrote:
           | > _Rather than 12 keyword arguments and 12 branches in 1 big
           | function, it should be 12 small classes (in OOP) or 12 small
           | functions (in FP) that each handle one of the branches._
           | 
           | I mean, sure, you could convert your library into 12 little
           | classes, or a collection of purely-functional combinators.
           | Sometimes that helps. Sometimes it makes the situation even
           | worse.
           | 
           | Some of the most terrifyingly inappropriate abstractions I've
           | seen in my career involved complex class hierarchies, or
           | worse, things like "abstract interpretation over the free
           | monad."
           | 
           | There's no substitute for asking, "Are these things I'm
           | trying to abstract over actually similar in any fundamental
           | way?" And "Is this code actually just horrible?"
        
             | CharlieDigital wrote:
             | > "Are these things I'm trying to abstract over actually
             | similar in any fundamental way?"
             | 
             | I'm not sure why this is even a point; if there's no
             | similarity of the things that are being abstracted, why
             | would one even discuss abstraction in the first place? The
             | point of abstraction is that there is some fundamental
             | similarity that the abstraction addresses. `ICloudStorage`
             | abstracts `GoogleCloudStorage` and `AwsS3Storage` because
             | at some level, they both have the same abstract operations:
             | read, write, delete, etc.
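The `ICloudStorage` idea above can be sketched with a Python `Protocol`. The two backends below are in-memory stand-ins named after the classes in the comment; they call no real cloud SDK, and `copy_object` is a hypothetical helper added for illustration.

```python
from typing import Dict, Protocol, Tuple


class ICloudStorage(Protocol):
    """The shared operations named in the comment."""

    def read(self, key: str) -> bytes: ...
    def write(self, key: str, data: bytes) -> None: ...
    def delete(self, key: str) -> None: ...


class FakeGoogleCloudStorage:
    """In-memory stand-in; calls no real SDK."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def read(self, key: str) -> bytes:
        return self._blobs[key]

    def write(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def delete(self, key: str) -> None:
        del self._blobs[key]


class FakeAwsS3Storage:
    """In-memory stand-in keyed by (bucket, key); calls no real SDK."""

    def __init__(self, bucket: str = "demo") -> None:
        self._bucket = bucket
        self._objects: Dict[Tuple[str, str], bytes] = {}

    def read(self, key: str) -> bytes:
        return self._objects[(self._bucket, key)]

    def write(self, key: str, data: bytes) -> None:
        self._objects[(self._bucket, key)] = data

    def delete(self, key: str) -> None:
        del self._objects[(self._bucket, key)]


def copy_object(src: ICloudStorage, dst: ICloudStorage, key: str) -> None:
    """Depends only on the shared operations, not on either backend."""
    dst.write(key, src.read(key))
```

The backends store data differently, but `copy_object` works with both because it relies only on read and write.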
        
         | overgard wrote:
         | Ah, but the solution is to turn that 12 argument function into
         | a class that does one thing (runs the function), and dependency
         | inject all those arguments. It still totally sucks, but you can
         | pretend you're writing "clean" code by obfuscating the
         | parameter passing.
        
         | gpderetta wrote:
         | Worse, the problem with those 12-parameter functions (the
         | parameters invariably booleans or, if you are lucky, enums) is
         | that usually only a small subset of all possible flag
         | combinations is tested (or even meaningful). Worst part is when
         | the flags are directly or indirectly under user (or
         | configuration) control and the application can go into
         | uncharted territory.
         | 
         | Worse still, good luck refactoring those functions when you
         | have no idea which combinations are actually meant to be
         | supported and what their original semantics were.
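To put a number on the flag-combination problem above: assuming all 12 parameters are independent booleans, the sketch below enumerates every combination and compares it with a made-up set of "tested" ones. The `tested` set is purely hypothetical.

```python
from itertools import product

FLAG_COUNT = 12

# Every possible assignment of 12 independent boolean flags.
all_combinations = list(product([False, True], repeat=FLAG_COUNT))

# Hypothetical: only two combinations are covered by tests.
tested = {
    (False,) * FLAG_COUNT,
    (True,) + (False,) * (FLAG_COUNT - 1),
}

untested = [combo for combo in all_combinations if combo not in tested]

print(len(all_combinations), len(untested))
```

With enums instead of booleans the count grows even faster, since each parameter contributes its number of values as a factor.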
        
       | ozim wrote:
       | I would rewrite it in a bit different way:
       | 
       | Fixing a bug in one place and being sure only one place was
       | affected and being sure that one place was really fixed is
       | cheaper
       | 
       | -
       | 
       | than fixing a bug in one place affecting 20 where in 15 places it
       | was a proper fix, in 5 places it will break in an unforeseen way
       | when user does something different and somehow additional 2
       | places totally broken because no one ever knew these were
       | affected.
        
       | sergiotapia wrote:
       | I disagree with OP. You cannot abstract after two or three dupes
       | because you don't even know what you have or need yet.
       | 
       | Let it breathe, let it stink for a bit - THEN make an informed
       | decision about what to refactor and abstract. You're just jerking
       | off otherwise, and I hate working with code that's been
       | abstracted early for no reason.
        
       | peeters wrote:
       | > I don't see how it can be said, without qualification, that
       | duplication is cheaper than the wrong abstraction.
       | 
       | I mean, this statement _IS_ qualified. The word "wrong" is doing
       | some heavy lifting. Part of what makes an abstraction wrong is
       | when it is expensive to use as tiny differences emerge in the
       | requirements.
        
         | hinkley wrote:
         | It's also wrong when active epics contain a third
         | implementation of the "same" pattern.
         | 
         | It's been a while since I've seen as much time wasted as trying
         | to abstract the second implementation only to be proven wrong by
         | the third. So instead of being for instance 8, 8 and 16 points
         | to implement, it ends up barely squeaking by as 8, 16 and then
         | 16 again.
         | 
         | It's one thing to fight the Rule of Three for things that might
         | happen. It's quite another when it _will_ happen.
        
       | lcuff wrote:
       | In about 1990 I got tasked with building an installation and
       | configuration system for the hardware and software package my
       | company built. It was an Ethernet card and a TCP/IP suite being
       | added to the PCs of the era (that had an AT/ISA bus where you had
       | to find a free address block, then jumper the card to have the
       | correct address, lotsa fun.)
       | 
       | I wrote the first system targeted at AT&T's Unix for the 386.
       | After it was completely done, I was assigned to do the same for
       | Xenix. After that was completely done, I got assigned to do SCO
       | (Santa Cruz Operation) Unix. After that, Interactive Systems
       | (ISC). Each system had its own architecture for installation and
       | configuration. I didn't know in advance anything about the
       | different systems, nor any knowledge that the other systems were
       | on the horizon. As I was writing the second system, I was
       | refactoring like mad to avoid duplicating code, and feeling very
       | proud due to previously learning the horrors of duplicate code. I
       | can't remember details, but among other things files had to be
       | placed in a specific directory hierarchy for each system, and
       | various files had to perform certain (different) functions on
       | each system. When I turned to the third and fourth target
       | systems, the refactoring just became weirder and more
       | complicated, but I was determined to avoid duplication.
       | 
       | Historically it turned out we never revised these releases. With
       | 20-20 hindsight, it's a case where the refactoring was completely
       | pointless, and code duplicated 4 times would have been way faster
       | to create, and easier to maintain if we had made new releases. I
       | think part of Sandi's point is that YAGNI applies as well ... a
       | higher level abstraction may accommodate changes that never
       | arrive, or the changes may be so large that NO abstraction will
       | cover it.
       | 
       | On the opposite end of the spectrum, in 1980 (yeah, I'm really
       | old) as a summer-hire, I'd written in HP-Basic this very funky
       | single-purpose very primitive data base system. When I returned 9
       | months later after graduating, two full time guys had made small
       | changes to the system, but one guy had made a breaking change,
       | the other guy got pissed off and duplicated _the entire program_
       | (a single file, to be sure) and made one small change. Thereafter
       | I had to maintain two versions of the thing. Gaaack. It was the
       | ultimate lesson in "don't duplicate code". (It was also in the
       | days before we had a version control system or diff, so backing
       | out and correcting the change wasn't practical.) Mel, where are
       | you now?
        
       | Kapura wrote:
       | I've never so viscerally disagreed with a link on this website.
       | Particularly this point:
       | 
       | > Duplication is bad. In fact, duplication is one of the most
       | dangerous mistakes in coding.
       | 
       | This to me reads insane, fanatical. One of the biggest benefits
       | of duplication that the author fails to identify is the locality of
       | logic. When, not if, things break, there's a large benefit to
       | having all of the logic contained to a few heavy-lifter classes
       | that contain bespoke logic and are fit-for-purpose.
       | 
       | "The wrong abstraction" in this case is bending over backward to
       | fit your data into another API just to cut down on code
       | duplication; it is better to have code with clean, uninterrupted
       | data flow than code that frequently needs to re-translate the
       | data to be consumed by different APIs, then decode the results
       | back to useful logic. The translation/decoding steps are new
       | places to introduce bugs, and the more translation or decoding
       | required, the more bug-prone the code will be.
       | 
       | A good abstraction to de-duplicate code should not add complexity
       | to the existing call sites. If you've squinted and decided that
       | two systems are close enough that they can be abstracted
       | together, you're likely making one or both of those code paths
       | much more treacherous.
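       | 
       | A minimal sketch of that last point, assuming two hypothetical
       | call sites (an invoice and a name badge) that looked "close
       | enough" to merge. None of these names come from the article:

```python
# Hypothetical "wrong abstraction": one helper shared by two callers,
# steered by flags that every call site now has to know about.
def format_name(first, last, *, for_invoice=False, for_badge=False):
    if for_invoice:
        return f"{last.upper()}, {first}"
    if for_badge:
        return first
    raise ValueError("caller must pick a mode")


# The duplicated alternative: two small functions, each local to its use.
def invoice_name(first, last):
    return f"{last.upper()}, {first}"


def badge_name(first, last):
    return first
```

       | The flags are where complexity leaks back into the call sites;
       | the two small functions keep each caller's logic local.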
        
         | wvenable wrote:
         | As a programmer, if you don't create an abstraction you'll
         | never be more than a 1x programmer. Abstraction is how to get
         | more productive than simply how fast you can type.
         | 
         | Yes, the wrong abstraction is bad. But almost every argument
         | whether it's for/against duplication or for/against abstraction
         | usually starts with the hidden premise that you're stuck with
         | whatever choice you've made and code you've written forever.
         | The underlying issue is the fear of change and the sunk cost
         | fallacy of already written code. If you have the wrong
         | abstraction, you can change it. If you created too much
         | duplication, you can remove it.
        
           | AmpsterMan wrote:
           | Wrong abstractions start off as right abstractions and slowly
           | become wrong abstractions. What's the point at which a right
           | abstraction becomes a wrong abstraction? Am I sure I can
           | identify that point? Can someone else? Can someone else who
           | has no knowledge of the original assumptions that were
           | implied during the initial abstraction?
           | 
           | There are two kinds of abstractions in your code, the ones
           | everyone complains about and the ones that no one has ever
           | seen.
           | 
           | My rule of thumb is thus: Have I repeated myself three times
           | doing the EXACT same thing? Then CONSIDER abstracting away.
           | Otherwise, make as many implicit dependencies explicit as
           | possible and slightly keep repeating yourself until you are
           | exactly repeating yourself.
        
             | wvenable wrote:
             | Your rule of thumb isn't great: If you have some important
             | logic duplicated twice but a year from now has a bug -- are
             | you going to remember to change it in both places? But, let's
             | be honest, you probably would not create that code in the
             | first place -- you'd have abstracted it automatically
             | without even thinking about it.
             | 
             | These conversations generally tend to completely discount
             | _experience_. Junior programmers are often terrible at
             | abstractions -- they either do way too much or way too
             | little. Can I give them a hard and fast rule that they can
             | use to never make that mistake? No, I can't. It doesn't
             | exist. The only reason I know what's good or bad is because
             | I've done it wrong thousands of times.
             | 
             | That's the problem with every single one of these articles
             | that prescribe one true solution. It's not at all that
             | simple.
        
           | tuyiown wrote:
           | I'm sorry, you are totally, utterly wrong.
           | 
           | Wrong abstractions percolate through your system. Assumptions
           | about how your abstraction is supposed to work ossify and hide
           | the concrete implementations and their actual, generally
           | simpler, constraints.
           | 
           | Basically your refactoring work now requires understanding
           | all user code that relies on the wrong aspects of your
           | abstraction, finding a way to correct it if you're lucky, and
           | making it work exactly the way the duplicated code would have.
           | 
           | And I didn't mention implementations that drift in
           | incompatible ways from the abstraction, a large source of
           | errors and regrets.
           | 
           | The good bet for productivity is recognizable implementation
           | patterns and duplication.
           | 
           | In the end refactoring duplicated code that had time to
           | settle and drifted in legitimate ways to find your correct
           | abstraction is a blast.
        
             | wvenable wrote:
             | I disagree and I can provide an example. I'm creating a
             | library to interface with a REST API. The creators of this
             | REST API obviously didn't do any abstractions and they have
             | multiple implementations for the same exact process: paged
             | queries of items. There's no reason for them to be
             | different -- they're just different because, I assume,
             | different developers built them differently at different
             | times. Nobody looked at this and said "This is all the same
             | so we should have one single common implementation
             | abstracting over all the endpoints."
             | 
             | However, as the developer of the interface library, I can
             | abstract over all the differences and give my consumers the
             | exact same experience regardless of the API endpoint.
             | And that's exactly what I did. So now they're all more
             | productive because they don't need to know all these
             | unimportant details. They don't even need to know that
             | there's a REST API. In fact, this REST API replaces a
             | previous API implemented with a completely different
             | technology and we are swapping the whole thing out with
             | minimal changes because I abstracted it years ago.
             | 
             | Not all abstractions are wrong. Not all concrete
             | implementations are simpler. My goal with an abstraction is
             | to take something else and make it closer to what we need,
             | because most technology has a wide audience with wide
             | requirements.
             | I'm a narrow audience with narrow requirements and so I can
             | hide the vast complexity that I simply don't care about.
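             | 
             | A rough sketch of that kind of wrapper, not the actual
             | library: the two endpoint functions and their response
             | shapes below are hypothetical stand-ins for an API that
             | pages one way in one place and another way elsewhere.

```python
from typing import Dict, Iterator, List, Optional

# Hypothetical data served by both fake endpoints.
ITEMS = [f"item-{i}" for i in range(7)]


def fetch_by_offset(offset: int, limit: int) -> Dict:
    """Endpoint style A (assumed): offset/limit, reports a total count."""
    return {"rows": ITEMS[offset:offset + limit], "total": len(ITEMS)}


def fetch_by_cursor(cursor: Optional[str], size: int) -> Dict:
    """Endpoint style B (assumed): opaque cursor, `next` is None at the end."""
    start = int(cursor) if cursor else 0
    end = start + size
    return {"data": ITEMS[start:end], "next": str(end) if end < len(ITEMS) else None}


def offset_pages(limit: int = 3) -> Iterator[List[str]]:
    """Adapter for style A: walks offsets until the total is reached."""
    offset = 0
    while True:
        page = fetch_by_offset(offset, limit)
        yield page["rows"]
        offset += limit
        if offset >= page["total"]:
            return


def cursor_pages(size: int = 3) -> Iterator[List[str]]:
    """Adapter for style B: follows `next` cursors until there are none."""
    cursor = None
    while True:
        page = fetch_by_cursor(cursor, size)
        yield page["data"]
        cursor = page["next"]
        if cursor is None:
            return


def all_items(pages: Iterator[List[str]]) -> Iterator[str]:
    """The one interface callers see: a flat stream of items."""
    for page in pages:
        yield from page
```

             | Callers iterate `all_items(...)` and never see which paging
             | scheme sits underneath; each difference lives in one small
             | adapter.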
        
           | kajecounterhack wrote:
           | > The underlying issue is the fear of change and the sunk
           | cost fallacy of already written code. If you have the wrong
           | abstraction, you can change it.
           | 
           | It's not that trivial. Consider that the wrong abstraction is
           | reflected into your API (common), and consider that your API
           | has many users. You are stuck with it, or you have to
           | convince multiple teams (or, god forbid, external customers)
           | to migrate to a better API. This can constitute a humongous
           | waste of SWE-hours ($$$$$) and take quarters to accomplish,
           | assuming you can get any buy in.
           | 
           | I think it comes down to what your organization looks like
           | and how many users are going to be touching your code. If
           | your abstraction is just for yourself internally and everyone
           | else is not allowed to touch it, then fine. You will own the
           | tech debt if the abstraction is wrong. If your abstraction
           | has other users at your company, or external customers, it
           | had better be the right one or at least an unavoidable
           | stepping stone.
           | 
           | > If you created too much duplication, you can remove it.
           | 
           | This is actually true. Refactoring duplicated logic is a lot
           | easier than fixing bad abstractions.
        
             | coliveira wrote:
             | Migrating to better APIs is done all the time. It is not an
             | issue worth discussing anymore. But even if you have
             | to maintain an API this doesn't mean you cannot change the
             | underlying implementation.
        
               | simonw wrote:
               | It absolutely is an issue worth discussing, any time you
               | are maintaining a library with more than a few other
               | people using it.
               | 
               | Breaking backwards compatibility in a library with
               | hundreds or thousands of users is not something to be
               | taken lightly!
        
               | wvenable wrote:
               | I'm working with an API right now that is absolutely
               | based on duplicated code. They have a system for querying
               | items and the API has 3 different ways depending on the
               | endpoint. I just found a new one the other day and I
               | hated it -- why is this one API unnecessarily different
               | from all the rest!
               | 
               | I'm building a library to call this API and I've
               | abstracted over all these differences so my callers never
               | have to know how messed up the underlying API is -- they
               | get a consistent experience regardless.
        
             | wvenable wrote:
             | It is that trivial. There is no alternative. You either
             | have an API or you don't and you either change it or you
             | don't. Hand wringing over the potential of making a mistake
             | is a waste of time and effort. You will make a mistake. You
             | will never get it perfect. You just have to deal with it.
             | 
             | > Refactoring duplicated logic is a lot easier than fixing
             | bad abstractions.
             | 
             | Then you've just created an abstraction with all that
             | potential to be bad sometime in the future.
        
               | mcpeepants wrote:
               | > Then you've just created an abstraction with all that
               | potential to be bad sometime in the future.
               | 
               | I would argue that you've then created an abstraction,
               | but with all the hindsight allowing you to create the
               | _correct_ abstraction (or at least a much better chance
               | at it approaching "correct")
        
               | kajecounterhack wrote:
               | > It is that trivial.
               | 
               | > You will make a mistake. You will never get it perfect.
               | You just have to deal with it.
               | 
               | These two comments sound at odds. First statement says
               | it's easy. Second statement says it's hard.
               | 
               | We can agree that hard things don't get solved without
               | iterating. But a productive response to abstraction
               | (which is really API design) being hard is not to say
               | "stop handwringing, just do it." Instead, you can employ
               | various strategies such as preferring experienced people
               | to do it, making sure they did a good job of gathering
               | requirements and considered the risks of their approach,
               | spending time testing customer/developer ergonomics, etc.
               | You can also defer producing an abstraction until your
               | system is a bit larger and the duplication is becoming
               | too much to handle, since you have a larger sample size
               | of potential uses for your abstraction to help you
               | converge on the correct API.
               | 
               | Good abstractions can be the difference between success
               | and failure, between organizational velocity and
               | technical debt quagmire. Saying "we should always build
               | abstractions" when it's difficult to build them correctly
               | in one go sounds totally wrong on its face.
        
             | Kapura wrote:
             | 100% agree, especially about refactoring duplicated logic.
             | Super-duplicated code begs to be refactored, and having
             | many examples of the same functionality helps you build an
             | API that is robust without adding "what-if" functionality
             | to try to futureproof code (impossible).
        
           | overgard wrote:
           | I'm not sure that's true, when I look at the code bases of
           | some of the most productive coders I can think of (John
           | Carmack and the Doom and Quake code bases for example), they
           | tend to be fairly conservative with how much they abstract
           | things. There's a lot of very thoughtful data structure usage
           | (git is another good example of this) and diligence in
           | maintaining a coding standard (which generally has little to
           | do with formatting), but most of the code seems to be more
           | concrete and task focused rather than abstract.
           | 
           | I think Casey Muratori has a good way of thinking about this,
           | with his concept of "semantic compression" (
           | https://caseymuratori.com/blog_0015 ). To me that's a lot
           | more valuable than the ideas that you get in say something
           | like Clean Code (which in my opinion, the popularity of that
           | book has been a disaster for the industry)
        
             | wvenable wrote:
             | I'm positive they don't have much in the way of duplicated
             | code.
             | 
             | Abstraction can be as simple as a function.
             | 
             | It seems odd to give really good examples of abstraction
             | and then sort of do a No true Scotsman argument on it. "It
             | can't be abstraction because it's not terrible."
        
               | Kapura wrote:
               | so your argument is, "you've got to write functions,
               | therefore abstraction is always good" ?
               | 
               | yeah man i don't start in main() and never write another
               | function ever again. You got me. But I also don't
               | aggressively police duplication, accepting that I would
               | always rather see what's happening without file-hopping.
               | The correct abstractions will simplify code, and they
               | will seem obvious (even if just in retrospect).
               | Abstractions that force me to continually re-frame the
               | problem i am trying to solve in terms of another's use
               | case are antithetical to writing the sort of code that I
               | do.
        
               | wvenable wrote:
               | You don't have to write functions. Have you ever seen a
               | 10,000-line program that was just a single function? I
               | have. It also had nested if statements 5 levels deep to
               | handle all sorts of logic with lots of duplication. It
               | was unmaintainable. Yet it did meaningful work and it
               | could have been refactored to a fraction of the size and
               | do the same work. But, to be honest, when I had to fix it
               | I just went in and fixed it in the 20 places that needed
               | changing because it was impossible to follow.
               | 
               | You like good abstractions and you hate bad abstractions.
               | I couldn't agree with that more.
        
             | coliveira wrote:
             | There are useful abstractions and useless abstractions. A
             | lot of the GoF designs are bad abstractions (if used
             | indiscriminately) and crutches for bad languages. However,
             | using problem-focused abstractions is a big time saving
             | strategy.
        
         | jasonswett wrote:
         | > One of the biggest benefits of duplication that the author
         | fails to identify is the locality of logic
         | 
         | There are a lot of things in the post that I fail to identify.
         | There are also some things I wrote there that I no longer
         | completely agree with.
         | 
         | Here's a newer post in which I go deeper, and in which I do
         | address the locality of logic.
         | 
         | https://www.codewithjason.com/duplication/
        
         | makeitdouble wrote:
         | > When, not if, things break, there's a large benefit to having
         | all of the logic contained to a few heavy-lifter classes that
         | contain bespoke logic and are fit-for-purpose.
         | 
         | Things break in cycles. You'll have worked around a first wave
         | and be happy that you didn't abstract your code too much as you
         | can just fix one side of your logic very locally. It also means
         | you didn't touch the other sides that were not directly
         | affected, but probably needed a fix in a slightly different but
         | overall similar way.
         | 
         | So you'll see your code instances all break one way or another
         | and fix them one by one instead of hardening a central piece
         | where you could focus your testing efforts.
         | 
         | Of course it's a topic that needs nuance, but if you identify a
         | piece of code as duplicate, there will be no free lunch. Either
         | you pay upfront the effort of abstracting, or you pay down the
         | line the local fixes, but there's not one approach or the other
         | that will be fundamentally wrong, I see it as a bet that pays
         | off or not.
        
         | [deleted]
        
       | mrbungie wrote:
       | You go this route of thinking, and eventually you're making
       | OOP/FP programs out of what could've been simple Bash scripts.
        
       | tracerbulletx wrote:
       | Did I miss where he justifies the statement "duplication is one
       | of the most dangerous mistakes in coding"? That has not been my
       | experience and it's the crux of the value judgement here so I'd
       | expect him to explain why it's so bad.
        
       | ae_throw wrote:
       | We need to have a kind of a footnote in the pedagogy of software
       | engineering, and engineering in general that states to avoid
       | advice ("wisdom", lol) expounded by blowhards. You can identify
       | it usually by the title - it'll have a Grand Style that betrays
       | arrogance.
       | 
       | Lots of people from GoF onwards think they qualify to preach
       | bullshit ultimatums, thinking they have it all figured out. I
       | don't think any of them have any fucking clue what should
       | actually be considered harmful, what should be the two/three
       | "hardest things in computer science", and other nonsensical
       | bullshit they write. With apologies to Dijkstra who I do find to
       | have been one of the shining lights of computer science and
       | engineering but is often misquoted/out-contexted for that
       | considered harmful thing. His letters do betray a higher plane of
       | wisdom.
       | 
       | The more recent "what programmers need to know about {x}" as if
       | the author has any clue is just the continuation of "I've learned
       | this last week/in my last project and it's the most important
       | thing," instead of the trivia that it really is, or shit that's
       | abstracted for us nowadays and only serves to make the author
       | feel superior. Just fuck off with all of that nonsense.
       | 
       | Coincidentally, I'm going to go and read the Hamming book as it's
       | got tangible value having been written by someone who has done
       | something worthwhile in their career.
        
         | [deleted]
        
         | efxhoy wrote:
         | What's GoF?
        
           | neuromanser wrote:
           | Gang of Four: Gamma, Helm, Johnson, Vlissides; authors of
           | Design Patterns.
        
           | mathstuf wrote:
           | "Gang of Four":
           | https://en.wikipedia.org/wiki/Gang_of_Four_(software)
        
           | [deleted]
        
           | Jtsummers wrote:
           | "Gang of Four", references
           | https://en.wikipedia.org/wiki/Design_Patterns.
        
         | ravenstine wrote:
         | A significant amount of things said about computer "science"
         | and engineering is opinions, more so than most believe or are
         | willing to admit. That doesn't mean it's all wrong, but that
         | not everything is universally applicable just because a smart
         | person said a thing.
        
         | esafak wrote:
         | I think Hamming's book is more appropriate for juniors. I found
         | it rambling and obvious. How do others feel?
        
         | coldtea wrote:
         | It sounds more like the idea you propose is "just do whatever"
         | and that there's absolutely no experience those guys (seasoned
         | devs and instructors) have.
         | 
         | There's nothing particularly nonsensical about the two/three
         | "hardest things in computer science" (although it was said half
         | in jest).
        
           | ilyt wrote:
           | The vast majority of advice like that _is_ garbage and is
           | trying to borrow the authority of the few good articles that come
           | out with similarly structured titles.
           | 
           | Usually by people that mistake "the product is successful"
           | for "the product is well engineered". Or mistaking their
           | rewrite from "the worst way to solve the problem" to "the
           | second worst way to solve the problem" <hyperbole> for "this
           | is the best way to solve problem"
        
         | greggsy wrote:
         | I enjoy the attitude, but sometimes people like to read someone
         | else's view on something, or to gain insights on something they
         | don't know anything about.
         | 
         | Writing authoritatively might be the only way people can get
         | people to read some things. I'm ok with that.
        
       | chrisan wrote:
       | I still find the rule of 3 to be the most pragmatic balance.
       | 
       | https://en.wikipedia.org/wiki/Rule_of_three_(computer_progra...
        
       | brohee wrote:
       | Oh... So that's how you end up with class names ending in
       | FactoryFactory... Factorisation at any cost without making sure
       | it makes sense and will keep making sense...
        
       | kylecordes wrote:
       | Maybe y'all are more talented developers than me;
       | 
       | But I have found repeatedly that building the wrong abstraction
       | is on the path toward discovering and building the right
       | abstraction.
        
         | hooverd wrote:
         | Right up until somebody else uses your wrong abstraction- and
         | now it's part of the bedrock.
        
         | CharlieDigital wrote:
         | Once you've seen enough code, the right abstraction becomes
         | easier to spot.
         | 
         | Applications are more similar than they are different. That's
         | why we have the concept of design patterns since these occur
         | with enough frequency that we should just give the abstraction
         | a name instead of re-inventing it each time.
         | 
         | Problem today -- my observation -- is that many younger devs
         | don't ever bother learning design patterns so we end up with 1)
         | devs who aren't aware of common, existing patterns codified
         | decades ago and then 2) think that the "wrong abstraction" is
         | expensive partially because of a lack of knowledge of the
         | "right abstraction" to use.
        
       | waysa wrote:
       | In a world of changing requirements it can be difficult to know
       | what the right abstraction is going to be. I am happy to accept
       | some duplication early in the development cycle until the
       | requirements have settled. Only then it's possible to go back and
       | refactor (which admittedly doesn't always happen in practice).
       | 
       | I believe duplication should raise eyebrows but it can be
       | justified.
        
       | DougBTX wrote:
       | This article isn't making a distinction between the interface
       | provided by an abstraction and the implementation details of that
       | abstraction, which I think causes it to come to the wrong
       | conclusions.
       | 
       | A bad abstraction is an interface which causes the implementation
       | to be more complex than necessary. Uses of the interface might
       | still look perfectly simple, but if the abstraction is bad the
       | overall complexity could be higher.
        
       | cogman10 wrote:
       | This is a very amateurish take. The author very clearly (at least
       | at the time of writing this) has not dealt with complex code
       | bases.
       | 
       | > If I were to see a confusing piece of code littered with
       | conditional logic, I wouldn't see it and think "oh, there's an
       | incorrect abstraction", I would just think, "oh, there's a piece
       | of crappy code". It's neither an abstraction nor wrong, it's just
       | bad code.
       | 
       | This is the primary issue. The author does not recognize that
       | poor abstractions can involve more than just a lot of conditional
       | logic. Sometimes that conditional logic bubbles up in places
       | secondary to where the bad abstraction was made.
       | 
       | A simple (real) example of this. I've seen code where "hey, these
       | two objects share a field, let's pull out a base object and have
       | them both inherit from it, after all, duplication is bad!". Then
       | later on "hey, here's two other objects with the same field, but
       | they don't have that old base object's field, duplication is bad,
       | so let's make a third base class".
       | 
       | This sort of thinking resulted in a really gnarly object graph.
       | But further, down stream code had to do type checks and casting
       | to compensate for this bad abstraction.
       | 
       | All because the original dev didn't want to duplicate a field on
       | two otherwise unrelated objects.
       | 
       | And worse, you, the dev that works on this code years later, are
       | left with the option "keep it as is, or rewrite and touch 100s of
       | files (potentially breaking large amounts of code)".
       | 
       | Oh, not to mention the unit tests that accompanied such code,
       | ironically, filled to the brim with duplication around this
       | hierarchy making minor changes massive.
       | 
       | On smaller less complex code bases you rarely see this
       | comedy/tragedy play out.
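       | 
       | The field-sharing case might be sketched like this, assuming two
       | hypothetical, otherwise unrelated record types (these are not
       | the classes from the real codebase). The field is simply
       | duplicated, and a structural `Protocol` describes the shared
       | surface without forcing a common base class:

```python
from dataclasses import dataclass
from typing import Protocol


class HasCreatedAt(Protocol):
    """Shared surface only: anything with a `created_at` string qualifies."""
    created_at: str


@dataclass
class Invoice:
    created_at: str  # duplicated on purpose
    amount_cents: int


@dataclass
class SupportTicket:
    created_at: str  # duplicated on purpose
    subject: str


def created_year(obj: HasCreatedAt) -> int:
    """Works on either type with no isinstance checks or casts."""
    return int(obj.created_at[:4])
```

       | Downstream code such as `created_year` accepts either type, so
       | no type checks or casts are needed and neither class inherits
       | anything it doesn't use.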
        
         | em-bee wrote:
         | i wouldn't create a base class until there are a non-trivial
         | amount of common properties shared by several classes and i
         | find that i am adding more such common properties. and when a
         | class appears that doesn't have one of these common properties,
         | then perhaps it makes more sense to move that one no-longer-
         | common property out of the base class back into the individual
         | classes so that again i can have all classes share the same
         | base class.
        
         | ozim wrote:
         | I was looking for a word to describe my feeling about the
         | article and "amateurish" fits the bill.
         | 
         | What mostly took me down was: (for example, the same several
         | lines of code duplicated across distant parts of the codebase
         | dozens of times, and with inconsistent names which make the
         | duplication hard to notice or track down)
         | 
         | It is a silly example because in such a scenario there is no way
         | you can even start writing abstraction to handle that.
         | 
         | Other part is what cogman10 wrote, wrong abstraction is not
         | "simply piece of code gathering if statements". Wrong
         | abstraction is piece of code or whole part of system where you
         | cannot simply add an if statement and get going. Wrong
         | abstraction might be something that actively prevents you from
         | changing code in meaningful way.
         | 
         | There is also another comment I would riff off about DevOps and
         | having scripts per team/domain even if mostly those look the
         | same you never know what the team will require. Nowadays domain
         | driven development is in vogue, mostly because it recognizes
         | separation of concerns is much more important than DRY.
         | 
         | To finish off, the author also assumes abstractions are born by
         | de-duplication of code. Yes, we are discussing "duplication is
         | cheaper", but to finish I wanted to rant about something. The
         | worst abstractions I
         | saw in practice were born in heads of "Astronaut Architects"
         | who built some system top down making stuff up "because it
         | should be like that". Other bad ones were done by junior devs
         | who were high on DRY.
        
           | laserlight wrote:
           | > It is silly example because in such scenario there is no
           | way you can even start writing abstraction to handle that.
           | 
           | You start writing the abstraction the first time you
           | duplicate, so that you don't end up with this mess down the
           | road.
        
         | bitblender wrote:
         | I think your disagreements are valid, but I don't think it is
         | fair to say this is an amateurish take or infer the author's
         | level of experience. Your example of unnecessary inheritance
         | hierarchies (which I have also faced many times in real world
         | scenarios) may even be a symptom of exactly what the author is
         | saying: what you might call a "bad fitting abstraction" the
         | author would just call "bad code". The implementation details
         | of how code gets shared (composition vs inheritance) is a
         | subtle but still vital consideration to the cost benefit
         | analysis. The author is observing that it might be misleading
         | or dangerous advice to urge developers to choose duplication
         | just because issues with abstraction have been historically
         | observed, which I completely agree with, and I do not consider
         | myself to be an amateur. I also agree with you and other posts
         | that the author fails to mention the (exponentially higher)
         | costs of abstraction boundaries that also span human
         | organizational boundaries.
        
         | feoren wrote:
         | Class inheritance is flawed because it tries to be two things
         | at once: a shared "surface" (public members, polymorphism,
         | etc.) and shared implementation. An abstraction is _only_ a
         | surface -- this could be an interface, a function declaration,
         | or even a data model. It almost never happens that
         | implementation-sharing and surface-sharing completely coincide,
         | and this is why class inheritance is falling out of favor and
         | something I completely avoid ( _occasionally_ I will use
         | abstract classes, but I usually regret it later). This is where
         | "favor composition over inheritance" comes from. I'd go so far
         | as to say that because they cannot be completely divorced from
         | implementation details, base classes cannot even be called
         | abstractions.
         | 
         | So if "wrong abstractions" includes shoddy base-class
         | shenanigans, then the statement becomes almost tautological. Of
         | course duplication is better than class inheritance --
         | _everything_ is better than class inheritance. So the real
         | statement there is "class inheritance is actually awful",
         | which is important to understand, but a side point to this
         | debate.
         | 
         | If you don't count class inheritance as abstraction, then the
         | tradeoff between code duplication vs. abstraction becomes much
         | more nuanced, and that's what all this discussion is about. I
         | certainly don't agree that ignoring class inheritance is a
         | signal that the author is amateurish. Many complex codebases
         | have no class inheritance at all.
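
A minimal Python sketch of the split described above between a shared surface and a shared implementation. Every name in it (`Exporter`, `RowJoiner`, `CsvExporter`, `TsvExporter`, `HtmlExporter`, `export_all`) is hypothetical and chosen only for illustration. The surface is a `typing.Protocol`; the one piece of shared implementation is a small helper that two of the three classes hold by composition, while the third shares the surface and nothing else.

```python
from typing import Protocol


class Exporter(Protocol):
    """The shared surface: any object with this method qualifies."""

    def export(self, rows: list[list[str]]) -> str: ...


class RowJoiner:
    """Shared implementation, reused by composition instead of a base class.

    Simplification: no quoting or escaping of cell contents.
    """

    def __init__(self, separator: str) -> None:
        self.separator = separator

    def join(self, rows: list[list[str]]) -> str:
        return "\n".join(self.separator.join(row) for row in rows)


class CsvExporter:
    def __init__(self) -> None:
        self._joiner = RowJoiner(",")  # implementation borrowed, not inherited

    def export(self, rows: list[list[str]]) -> str:
        return self._joiner.join(rows)


class TsvExporter:
    def __init__(self) -> None:
        self._joiner = RowJoiner("\t")

    def export(self, rows: list[list[str]]) -> str:
        return self._joiner.join(rows)


class HtmlExporter:
    """Same surface as the others, but shares none of their implementation."""

    def export(self, rows: list[list[str]]) -> str:
        body = "".join(
            "<tr>" + "".join(f"<td>{cell}</td>" for cell in row) + "</tr>"
            for row in rows
        )
        return f"<table>{body}</table>"


def export_all(exporters: list[Exporter], rows: list[list[str]]) -> list[str]:
    # Callers depend only on the surface, never on how each exporter works.
    return [exporter.export(rows) for exporter in exporters]
```

With a common base class, `HtmlExporter` would either inherit a joiner it never uses or force the base class to grow a special case; here the two kinds of sharing can change independently.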
        
           | jeremyjh wrote:
           | Inheritance is very useful in domains like game engines,
           | where it is very common to have a base object such as "Node"
           | that has some properties that every object in the scene graph
           | must have, which all share the same implementation. For
           | example they should all have a parent property and a
           | collection of children, and ways to modify those properties.
           | They'll also share methods such as "render" which probably
           | must be overridden in every subclass. It's not impossible to
           | solve this with interfaces and composition but those
           | solutions are sub-optimal.
           | 
           | An example you might be more familiar with is the DOM of a
           | web browser - every element has some basic properties and
           | methods that all share an implementation.
        
             | yellowapple wrote:
             | > It's not impossible to solve this with interfaces and
             | composition but those solutions are sub-optimal.
             | 
             | The growing popularity of ECS and data-oriented design in
             | game engines suggests otherwise: keeping components
             | separate from entities enables both performance
             | enhancements and separations of concerns that are much more
             | difficult to achieve with the traditional inheritance-based
             | approach. To illustrate a bit:
             | 
             | > it is very common to have a base object such as "Node"
             | that has some properties that every object in the scene
             | graph must have, which all share the same implementation.
             | For example they should all have a parent property and a
             | collection of children, and ways to modify those
             | properties.
             | 
             | You don't need subclasses for that; you just need a table
             | of entity IDs (where both the things to render and the
             | scene itself are entities) and parent IDs, which you can
             | then recursively walk to get the entities you want to
             | render:
             | 
             |     WITH RECURSIVE entity_children AS (
             |         SELECT id, parent FROM entities
             |         UNION ALL
             |         SELECT ec.id, ec.parent
             |         FROM entity_children AS ec
             |         JOIN entities AS e ON ec.id = e.parent
             |     )
             |     INSERT INTO scene_entities (scene, entity)
             |     SELECT $scene_entity_id, id
             |     FROM entity_children
             |     WHERE parent = $scene_entity_id;
             | 
             | (Obviously you probably won't actually be running SQL
             | queries in a game engine's rendering loop; this is just to
             | illustrate the logic.)
             | 
             | Once you've got that list...
             | 
             | > They'll also share methods such as "render" which
             | probably must be overridden in every subclass.
             | 
             | You don't need subclasses for that; you just need a table
             | of entity IDs and things to render, which you can then
             | query and send to the GPU:
             | 
             |     INSERT INTO some_buffer_in_GPU_memory
             |         (entity, mesh, texture, position)
             |     SELECT se.entity, em.mesh, et.texture, ep.position
             |     FROM scene_entities AS se
             |     JOIN entity_meshes AS em ON se.entity = em.entity
             |     JOIN entity_textures AS et ON se.entity = et.entity
             |     JOIN entity_positions AS ep ON se.entity = ep.entity
             |     WHERE se.scene = $scene_entity_id;
             | 
             | (Again: you probably ain't actually using SQL for this;
             | this is also overly simplified, since most modern game
             | engines use all sorts of other stuff besides a mesh,
             | texture, and position when rendering something. Note also
             | that "em.mesh", "et.texture", and "ep.position" need not be
             | actual meshes/textures/positions, but could instead be
             | indices into buffers already on the GPU.)
             | 
             | The key advantage in both of these cases is that the
             | parent/child data and the render data can live where they
             | make the most sense, and can be processed by independently-
             | running systems with minimal contention. This is critical
             | for processing game logic in parallel - something which the
             | game industry is learning the hard way with legacy engines
             | that can't fully exploit multicore hardware.
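
A runnable sketch of the same table-based idea, using Python's built-in `sqlite3` module so the recursive walk can actually be executed. It is an illustration under assumptions, not yellowapple's code: the sample data, the helper names `descendants` and `render_list`, and the reduced schema (only `entities` and `entity_meshes`) are made up here. This variant seeds the recursion at the scene root and selects the joined child row at each step, so it collects descendants at every depth and terminates on a tree.

```python
import sqlite3

# Recursive walk shared by both helpers below (hypothetical names).
WALK = """
WITH RECURSIVE entity_children(id, parent) AS (
    -- Seed: direct children of the scene entity.
    SELECT id, parent FROM entities WHERE parent = :scene
    UNION ALL
    -- Step: children of anything already collected.
    SELECT e.id, e.parent
    FROM entity_children AS ec
    JOIN entities AS e ON e.parent = ec.id
)
"""


def descendants(conn: sqlite3.Connection, scene_id: int) -> list[int]:
    """IDs of every entity below scene_id, at any depth."""
    rows = conn.execute(
        WALK + "SELECT id FROM entity_children ORDER BY id",
        {"scene": scene_id},
    ).fetchall()
    return [row[0] for row in rows]


def render_list(conn: sqlite3.Connection, scene_id: int) -> list[tuple]:
    """(entity, mesh) pairs for scene entities that have a mesh component."""
    return conn.execute(
        WALK
        + """
        SELECT ec.id, em.mesh
        FROM entity_children AS ec
        JOIN entity_meshes AS em ON em.entity = ec.id
        ORDER BY ec.id
        """,
        {"scene": scene_id},
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entities (id INTEGER PRIMARY KEY, parent INTEGER);
    CREATE TABLE entity_meshes (entity INTEGER, mesh TEXT);
    -- Entity 1 is the scene; 5 is an unrelated root.
    INSERT INTO entities VALUES (1, NULL), (2, 1), (3, 1), (4, 2), (5, NULL);
    -- Entity 3 has no mesh component, so it is never rendered.
    INSERT INTO entity_meshes VALUES (2, 'tank'), (4, 'turret');
    """
)
```

Entity 3 needs no `DoNotActuallyRender()`-style override: it simply has no row in `entity_meshes`, so the join leaves it out.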
        
             | feoren wrote:
             | Quite the opposite: game engines are one of the few places
             | where the sub-optimality and fundamental problems with
             | object inheritance became so overwhelming that people
             | started abandoning their deeply ingrained CS 101 models of
             | Dog : Animal and invented Entity-Component-System
             | architecture, which at its extreme uses no object
             | inheritance at all and is a deeply "relational" model. Game
             | engines which _don't_ do this were either mostly developed
             | before ECS was invented/popularized (Unreal) or are
             | specifically targeting beginners who have little more than
             | a CS 101 understanding of OO programming (and also
             | following Unreal's lead).
             | 
             | DOM elements are a better example, but just because that's
             | how they _are_ done doesn't mean that's how they _should_
             | be done. Does a <script> element really need a "focus()"
             | method? It has one. Does a <br> element need an "innerHTML"
             | property? It has one. Does a <head> element need an
             | "offsetHeight" property? It has one. If you look at the
             | history of the development of HTML and JavaScript as a
             | shining ideal of software engineering, you're certainly in
             | the minority (this is all before TypeScript, which _is_ a
             | shining ideal of type systems!). The HTMLElement class has
             | 134 properties, most of which make no sense for most
             | elements. It has a long history and a lot of excuses for
             | becoming what it is today, but I would not recommend you
             | follow that lead in your own designs.
        
             | whstl wrote:
             | Not really. Composition has been the preferred technique
             | for a long time already.
             | 
             | A lot of Games and GUIs that use inheritance worked _in
             | spite_ of that inheritance. In more complex object graphs
             | there were always things like _override boolean
             | DoNotActuallyRender()_ in one or two children of the
             | _RenderableNode_ class to account for special behaviour.
             | 
             | ECS is just the nail in the coffin of inheritance in game
             | engines. And it's not even new anymore, it has been
             | fashionable for what, almost 15 years now?
        
       | bob1029 wrote:
       | Duplication is a superpower if you can put your OCD into a box
       | for a little bit and frame it as a temporary stepping stone.
       | 
       | Refactoring nightmare codebases _can_ become trivial if you
       | don't mind a few copies of "the same thing" being kept around to
       | satisfy serializers and other legacy APIs. Writing mappers
       | between nearly-equivalent types sucks really hard but it still
       | sucks a lot less than saying things like "let's just rewrite the
       | whole product".
        
       | disintegore wrote:
       | I resent how much we've trained developers to value concision
       | over everything else. I can't tell you how many times I've seen
       | people use DRY as a justification to alias stuff that's already
       | heavily abstracted by the framework that they use, ending up with
       | less useful interfaces. Either that, or they'll explode the
       | cognitive load by building crazy type hierarchies and inserting
       | opaque anti-patterns like factories and decorators and whatnot.
       | 
       | These are not "the wrong abstractions" in the sense of crappy
       | code full of conditionals; they are actually well written and
       | not all that hard to decipher. They're "the wrong
       | abstractions" in the sense that there's either a way to do it
       | that is simpler and makes fewer assumptions, or in the sense that
       | they are worse than "no abstraction" which is to say sticking to
       | the abstractions that have already been invented for you by
       | people whose jobs it is to do that exact work for millions of
       | engineers and are therefore probably way better equipped.
        
       | HumblyTossed wrote:
       | A little bit of code duplication tends to be much less toxic than
       | a lot of discussions on proper coding techniques.
        
       | tester756 wrote:
       | ehh, mediocre ideas, tricks, dogmas, religious approaches.
       | 
       | You have to evaluate decisions by the case, with the context.
       | 
       | It is called software *engineering* - you have some goals and
       | constraints and you design with those in mind.
       | 
       | Sometimes it is better to duplicate, sometimes it is better to
       | have a single source of truth.
        
       | lijok wrote:
       | > So far so good, perhaps. But, by creating this new abstraction,
       | the programmer signals to posterity that this new abstraction is
       | "the way things should be" and that this new abstraction ought
       | not to be messed with. As a result, this abstraction gets
       | bastardized over time as maintainers of the code need to change
       | it yet simultaneously feel compelled to preserve it.
       | 
       | How I've been thinking about abstractions that turn bad over time
       | is: It was likely the correct abstraction at the time it was
       | made, given the requirements the writer had on hand. Now that the
       | abstraction is wrong, don't muck with it. Gather the new
       | requirements and write a v2.
       | 
       | I think the vast majority of abstractions will go bad over time.
       | To abstract is to generalize, and generalizations become invalid
       | over time because the world evolves over time. It's sort of like
       | trying to preserve a summary of a book that is continuously
       | having new pages added and existing pages replaced.
        
       | tracker1 wrote:
       | What seems to serve me best is keep things as simple as possible.
       | If you add abstractions, do so to make the rest of the code
       | easier and less complex. If you must do something complicated,
       | break it apart as pragmatically as possible and do it in the
       | simplest way possible. Favor YAGNI (you aren't going to need it)
       | over corporate-wide libraries that lock you in.
       | 
       | Keep your codebase discoverable first. Structure by
       | feature/function not type. Favor the local developer experience
       | first. If you cannot open, follow and run the code easily, your
       | developers won't be able to onboard quickly. Someone else will
       | have to continue with your mess, so make it as orderly as
       | possible. I find that docker-compose can help a lot on this
       | front, as can developer containers.
        
       | aeturnum wrote:
       | I think the core of the difference can be found in the _What
       | exactly is meant by "the wrong abstraction"?_ paragraph.
       | Admittedly, the quoted article is also a bit confusing here, but
       | I think it's easy to resolve.
       | 
       | I think the wisdom of the original saying is hard to understand
       | when you just look at any piece of code as it exists. Instead,
       | imagine the future. You have two pieces of code that do similar
       | things - you can centralize them (with a bunch of conditionals)
       | to have a "single" code path, or you can allow them to stay
       | separate (perhaps confusing new people). The wisdom of
       | "duplication is cheaper" is to observe that it will generally be
       | less work to allow the duplication than to maintain the
       | circumstantial needs over time. Each time you need to "do the
       | same thing again but a little different" you can either add more
       | conditionals to a single piece of code, or add another instance
       | of 'duplication' which can just deal with the concerns at hand.
       | It's not about "crappy code" - it's about the difficulty of
       | having one piece of functionality serve many masters over time.
       | 
       | IMO, in general, you will also find that if you have many
       | 'duplicated' copies of code, it will often be easier to see the
       | truly duplicated sub-sections that you can DRY out into a common
       | subroutine. I find that is easier to see with duplicates than
       | with a single piece of complex code.
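
To make that trade-off concrete, here is a small Python sketch. The function names and the invoice/receipt scenario are hypothetical, and amounts are assumed to be non-negative integer cents. `render_line` is the centralized version steered by a flag; `invoice_line` and `receipt_line` are the "duplicated" versions; `format_money` is the kind of truly shared sub-section that becomes easy to spot once the duplicates sit side by side.

```python
def format_money(cents: int) -> str:
    """Truly shared sub-section: every caller wants identical behaviour."""
    return f"${cents // 100}.{cents % 100:02d}"


def render_line(item: str, cents: int, qty: int = 1, receipt: bool = False) -> str:
    """Centralized version: one code path, steered by a flag.

    Each new "same thing but a little different" adds another flag here.
    """
    if receipt:
        return f"{qty} x {item} = {format_money(cents * qty)}"
    return f"{item:<10}{format_money(cents):>10}"


def invoice_line(item: str, cents: int) -> str:
    """Duplicated version, invoice side: free to change on its own."""
    return f"{item:<10}{format_money(cents):>10}"


def receipt_line(item: str, cents: int, qty: int) -> str:
    """Duplicated version, receipt side: only deals with receipt concerns."""
    return f"{qty} x {item} = {format_money(cents * qty)}"
```

Today both styles produce the same output. The difference shows up later: a change to the receipt layout touches one small function in the second style, but has to be threaded past every other caller's flags in the first.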
        
       | coldtea wrote:
       | > _It seems to me that what's meant by "the wrong abstraction" is
       | "a confusing piece of code littered with conditional logic". I
       | don't really see how it makes sense to call that an abstraction
       | at all, let alone the wrong abstraction._
       | 
       | No, it means the wrong abstraction. Like forcing a one-size-fits-
       | all abstraction on a few pieces of duplicated code, and not
       | waiting for them to grow to enough cases to hint at what is the
       | best pattern/abstraction/architecture to handle them (perhaps
       | more than one, for different classes that somebody might just
       | shove in a single abstraction prematurely).
        
       | overgard wrote:
       | Every time you create an abstraction to remove duplication,
       | you're tying two pieces of code together and creating a common
       | dependency. The more dependencies you have, the harder it is to
       | change code, because a change in one place reverberates in many
       | places.
       | 
       | To me, that's the cost. You gain a decrease in code size and
       | verbosity at a cost of making localized changes more difficult.
        
         | LeifCarrotson wrote:
         | I call this a distinction between "inherent sameness" and
         | "incidental sameness".
         | 
         | Yes, right now, those two servers have the same number of
         | processor cores. But who's to say that after a hardware update
         | that will still be true?
         | 
         | Conversely, the fact that every processor has a certain number
         | of cores is inherent to the way we represent a processor.
         | 
         | In my line of industrial automation, it's almost always cheaper
         | to pay the cost of complexity up front, and assume that every
         | conveyor VFD might get replaced with a different model, or with
         | a contactor, somewhere down the line. That duplication is cheap
         | when the line is on the integrator's shop floor. Any downtime
         | later on, when enormous dependencies have come to rely on that
         | line, is more costly.
        
       | tflinton wrote:
       | > Duplication is bad. In fact, duplication is one of the most
       | dangerous mistakes in coding.
       | 
       | I have to disagree with this; the article feels lofty in its
       | assumption that when you start to program you know what to
       | abstract. More often people begin abstracting due to misguided
       | axioms like "DRY" rather than to solve a problem with a real
       | cost-benefit trade-off. DRY as a goal in itself is fairly
       | dangerous.
       | 
       | I can't count how many convoluted and confusing frameworks people
       | have put together under this misguided perspective. It's not
       | atypical for an abstraction born of "DRY" motivations to be more
       | code, and more brittle, than just copying and pasting 2 lines in
       | 15 places.
       | 
       | This is not to say abstractions are inherently bad, but
       | abstracting for the sake of DRY is a mistake.
        
         | lolinder wrote:
         | The problem is with being either dogmatic or thoughtless in
         | either direction. I've seen what you're talking about: people
         | combine code religiously because of DRY, leading to insane
         | pyramids of abstractions that are impossible to modify.
         | 
         | However, I've also seen people copy and paste everything they
         | ever need. When that happens, those offshoots gradually evolve
         | independently from one another, and introducing a proper
         | abstraction becomes a huge slog. I've spent hours reading
         | through git blame trying to piece together a phylogenetic tree
         | of the various copies of the same code so we can ensure that
         | the new abstraction contains all relevant features and bug
         | fixes. I wish those developers had thought more carefully about
         | DRY.
         | 
         | I think the best balance is to use these catch phrases as
         | principles to guide your decision making, while being willing
         | to make exceptions when they don't apply. If DRY makes you
         | think for a second before copying a piece of code, it's done
         | its job, even if you decide that this situation really does
         | call for a copy.
        
       | briantakita wrote:
       | "Duplication is cheaper than the wrong abstraction" makes sense
       | coming from the Rails community. Between the meta-programming,
       | lack of static types, large amount of unit tests, etc., Rails
       | has a tendency to lock a project into an abstraction choice that
       | is very expensive to change. The pain is particularly intense
       | during major version Rails upgrades. From my experience, the Rails
       | framework got in the way & bogged down project velocity. It was
       | difficult to move away up to ~2010 as many of the jobs were
       | locked into Rails. There were many frustrated Rails programmers
       | around that time. When node.js, Go, & other languages/platforms
       | came out, there were finally full stack libraries that did not
       | lock in abstractions as heavily as Rails. Nowadays, I use
       | astro.js, solid.js, & target isomorphic libraries. The
       | flexibility of JavaScript with the static types of TypeScript
       | makes changing abstractions significantly easier. The JavaScript
       | ecosystem spent far too long focusing on SPAs when the isomorphic
       | MPA was low hanging fruit.
        
       | emtel wrote:
       | I think it comes down to this:
       | 
       | - If your code has a bug, you will be better off without
       | duplication, so that the bug need only be fixed once.
       | 
       | - If you will have to change the behavior of your code for
       | product reasons, duplication is often better, because user needs
       | are idiosyncratic. If the code is fully factored, you may have to
       | pass in flags to indicate which behavior should be used in which
       | case.
       | 
       | Learning to anticipate which of these two cases you might find
       | yourself in in the future comes with experience.
        
       | BurningFrog wrote:
       | The question I ask myself is "If X changes, how many places in
       | the code need to change?", for any reasonable value of X.
       | 
       | If the answer is 2 or more, you probably want to deduplicate
       | something.
       | 
       | If not, it doesn't really matter if different code looks similar.
        
       | nraf wrote:
       | > for example, the same several lines of code duplicated across
       | distant parts of the codebase dozens of times, and with
       | inconsistent names which make the duplication hard to notice or
       | track down
       | 
       | While I think there's merit in deduplicating these situations,
       | one pitfall is introducing coupling and tangled dependencies when
       | DRYing.
       | 
       | There are ways around this of course, but I've come across a
       | number of instances where deduplication has led to unnecessary
       | coupling between modules.
        
       | andrewprock wrote:
       | The underlying problem is that the "don't repeat yourself"
       | principle is often in conflict with the "single responsibility
       | principle". Structurally, this comes down to the problem of
       | managing dependencies. Over the years, the problem of dependency
       | management has become bigger and more difficult to tame.
       | 
       | The same problem holds for internal code as well as external
       | code. Duplicating code creates one kind of dependency problem
       | (feature drift). Shared code creates another kind of dependency
       | problem (increased coupling). Broadly speaking, solutions which
       | reduce coupling are going to be cheaper to maintain.
       | 
       | Ideally, there would be clear, well defined layers with narrow
       | communication protocols.
        
       | tabtab wrote:
       | I find it's better to keep abstractions small and independent, so
       | you can mix and match. Too big, and they risk not fitting future
       | change well. Even if the smaller ones create a bit more work or
       | "mini duplication", it's worth it to have that flexibility.
        
       | andybak wrote:
       | Typing on phone so I'll be brief. The key concept I find missing
       | from this piece is "locality".
       | 
       | When dealing with a complex and/or unfamiliar codebase, locality
       | (by which I mean "I can understand this thing here without
       | jumping around the codebase") can make up for a lot of other
       | deficiencies.
       | 
       | And imho, deduped code with an excess of if statements is
       | actually one of the least worst things to encounter.
        
         | rightbyte wrote:
         | Many programmers believe that the more complex the better.
         | 
         | In my experience good code looks silly simple, such that you
         | might think the problem was easy. And thus underrate the author
         | ...
         | 
         | I have never read someone else's code at work and complained
         | that a function is too big with too many if clauses.
         | 
         | However, deep call trees are really hard to comprehend.
         | Especially if some function is called multiple times in the
         | same call stack (unless the algorithm is recursive in a good
         | way).
        
       | julienreszka wrote:
       | Says the amateur. There is never enough granularity. DRY is only
       | useful after a significant threshold of repetition
        
       | aranchelk wrote:
       | The elephant in the room is without strong static typing and a
       | good type checker changing abstractions is somewhere between a
       | significant pain in the ass and downright perilous.
       | 
       | In my experience when you have those things, whether you make
       | significant changes to your API or decide to dedupe old divergent
       | copy pastes, it's largely just busy work -- very little thought
       | involved. The type checker says change line 135 in file foo.
       | Okay, next.
        
       | luckycharms810 wrote:
       | Whenever this conversation is had - it seems to completely
       | dismiss the idea of domain. Duplication doesn't happen in a
       | vacuum - it happens within a certain context. Some acceptable
       | conditions for duplication include:
       | 
       | * If two things are semantically different within the context of
       | a domain but require similar functionality.
       | 
       | * Code paths with different risk profiles.
       | 
       | * When new functionality is evolving with domain learnings.
        
       | MathMonkeyMan wrote:
       | I read a blog post somewhere (don't remember where) that
       | describes the process of unfactoring (multiplying?) code as an
       | exercise. Copy/paste the code until there's one straight code
       | path per use case. Then examine the similarities and factor the
       | code again. What you end up with will often be different from
       | what you started with, and probably simpler, especially if the
       | code had begun to drift from its original author's design.
       | 
       | So, "unfactor" the code and then factor it again. Let's call
       | it... "refactoring."
       | 
       | My $0.02, then, is that "the wrong abstraction" assumes that you
       | are unwilling to change it. What if we were comfortable tearing
       | down our classes all willy nilly and replacing them with some
       | other thing? Is it too risky? Does it hurt too many feelings?
       | 
       | Maybe the problem lies there, instead of in duplicate vs.
       | abstract.
        
       | gorjusborg wrote:
       | The saying 'duplication is cheaper than the wrong abstraction' is
       | a gem of a saying, but like many pieces of wisdom, takes
       | experience to fully understand.
       | 
       | I first saw the saying when DRY was being applied without any
       | nuance. If a piece of code appeared in two places, it was
       | obvious, and important, to factor it out, because that was 'good
       | coding practice'.
       | 
       | The saying being discussed was pushback against that kneejerk,
       | thoughtless application of DRY. The 'cheaper than the wrong
       | abstraction' is pointing out that DRY isn't a 'no tradeoff'
       | policy. By factoring out any duplication, many uses pass through
       | the same code. If the uses don't quite match, there is a tendency
       | for the code to get modified to fit them anyway. This, over time,
       | makes the shared code simultaneously unfit for use, and widely
       | used. A recipe for poor code quality and system health.
       | Ironically, this is the outcome that DRY was called in to
       | address.
        
         | devjab wrote:
         | I think it's down to the systems, and I think the people who
         | favour abstraction often forget who needs to write it.
         | Duplication isn't just cheaper than the wrong abstraction, it's
         | cheaper than almost any abstraction. Not because it should be,
         | mind you, but because duplication works for a tired Thursday
         | afternoon programmer and abstraction doesn't. Maybe it's
         | because I spent some time in management, but a key concept I
         | worked with when I did that was how we have two modes of mental
         | capacity. One where we have the energy and wit to do the right
         | thing, and one where we haven't slept for a week, and, well...
         | it's Thursday afternoon after a day of too many useless
         | meetings.
         | 
         | I think the best way I saw it put was for a Theme-park to coin
         | a slogan that any employee would be able to find inspiration in
         | when dealing with a customer on that Thursday afternoon. To me
         | most abstractions are similar to having a slogan along the
         | lines of "Think Different", which is an absolutely useless
         | concept when you're tired and dealing with an angry customer in
         | your summer job about an hour before you clock out.
         | 
         | I obviously don't think you should avoid all abstraction. The
         | author of the article is right, theoretically at least, it's
         | just that this way of thinking rarely works out. Similar to
         | you, my experience is that it tends to fail after a few years
         | of changing needs.
         | 
         | These days I favour abstraction only when its use is never
         | altered in the slightest. For everything else duplication is so
         | much easier to handle over 5+ year periods. Of course there are
         | many ways to deal with this. Small single purpose functions are
         | abstractions as well, just don't build big OOP hierarchies.
         | Because they just don't work for those Thursday afternoons.
        
         | wvenable wrote:
         | The most important thing to note about DRY is that it's not
         | about code -- it's about knowledge. You should not repeat
         | knowledge -> logic, constants, etc. If the temperature is 87
         | and the price of the widget is 87 that is coincidence and not
         | repetition.
         | 
         | There should just be one source of truth for any logic or
         | process. If you duplicate that then bad things will eventually
         | happen.
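
A short Python sketch of that distinction. The constants, the 8% rate, and the function names are assumptions invented for the example; rounding is simplified to truncation. The two `87` values are a coincidence and stay separate, while the tax rule is a single piece of knowledge that both callers go through.

```python
# Coincidence, not repetition: two unrelated facts that share a value today.
# Merging them into one constant would couple the thermostat to the price list.
MAX_SAFE_TEMPERATURE_F = 87
WIDGET_PRICE_CENTS = 87

# Knowledge: one rule, stated once. 800 basis points = 8% (hypothetical rate).
SALES_TAX_BASIS_POINTS = 800


def sales_tax_cents(subtotal_cents: int) -> int:
    """Single source of truth for the tax rule (truncates fractions of a cent)."""
    return subtotal_cents * SALES_TAX_BASIS_POINTS // 10_000


def invoice_total_cents(subtotal_cents: int) -> int:
    # Uses the shared rule instead of re-deriving "subtotal * 1.08" locally.
    return subtotal_cents + sales_tax_cents(subtotal_cents)


def cart_preview_cents(item_cents: list[int]) -> int:
    # A different feature, but the same knowledge, so the same function.
    subtotal = sum(item_cents)
    return subtotal + sales_tax_cents(subtotal)
```

If the tax rate changes, one line changes and the invoice and the cart preview cannot drift apart. If the widget price changes, the temperature limit is unaffected.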
        
           | jasonswett wrote:
           | Totally agree, although I'd maybe replace "knowledge" with
           | "knowledge or behavior".
        
         | starbugs wrote:
         | Anything pushed to the extreme will result in its opposite.
         | 
         | There's an interesting Wikipedia page about a concept from
         | philosophy called "Unity of opposites":
         | https://en.wikipedia.org/wiki/Unity_of_opposites
         | 
         | Worth a read IMHO.
        
           | zogrodea wrote:
           | Thanks for the link. This sounds more useful to refer to
           | (because more general) than horseshoe theory.
        
         | serial_dev wrote:
         | > If the uses don't quite match, there is a tendency for the
         | code to get modified to fit them anyway.
         | 
         | And this is essential, this is how you'll end up with 5
         | arguments and 6 further bool flags to a 7-line function.
        
           | ilyt wrote:
           | And one of them for some reason is read from an environment
           | variable...
        
         | _the_inflator wrote:
         | DRY should be an option.
         | 
         | Results may vary and depend on the code in question as well as
         | the language you are using.
         | 
         | We - a former team a couple of years ago using Java - started
         | to duplicate code in Java, because we were totally tired of
         | interface'ing and class'ing everything away that was not DRY.
         | It became too tedious to bloat code with them as well as
         | understanding whole classes when all you got was references to
         | other interfaces etc.
         | 
         | If there is a small service architecture like in Angular with
         | TypeScript, abstracting away becomes fun and useful.
         | 
         | It all depends. But what I really do not miss is the pile of
         | interfaces in Java and C#. These became so tough to grasp and
         | entangled, that we DRY'ed this cesspool. DRY on DRY so to say.
        
           | switchbak wrote:
           | So your issue was with the nature of the language and the
           | size of the project more than the application of DRY?
           | 
           | I think I see what you're getting at, but I've certainly also
           | seen very large Java projects that are simple at a high level
           | and composed in such a way that they're still legible without
           | a ton of duplication. These might be somewhat orthogonal
           | concepts.
        
       | thecodrr wrote:
       | The author is going into technicalities without much actual
       | substance, ending with: it depends.
       | 
       | I think whenever we, as programmers, try to pin down a certain
       | principle, it bites us. Hard. DRY was cool as an observation but
       | when it got turned into a law we saw the spaghetti code.
       | 
       | Duplication, on the other hand, is detested almost as much as the
       | goto statement. Let me tell you, it's not that bad. Duplicate
       | code makes everything more flexible. It helps you to NOT bend
       | over backwards in order to change a line of code. It allows you
       | to NOT touch anyone else's code.
       | 
       | So many good things. Of course, I agree with the author's summary
       | of the bad things that can happen with duplicated code. But
       | there's a litmus test for that:
       | 
       | If you have to make changes in multiple blocks of duplicated code
       | in order to change the behavior of something, there's a problem.
       | DRY out the code so you only have to touch 1 place.
       | 
       | If, however, 2 blocks of code LOOK similar but aren't actually
       | the same, and changing one block doesn't make the other block
       | outdated and stale, you are good to go.
       | 
       | Judge and decide. It's just 2 approaches that when taken to an
       | extreme can cause a lot of pain, but if used with common sense,
       | nothing is simpler.
        
         | overgard wrote:
         | > Duplication, on the other hand, is detested almost as much as
         | the goto statement.
         | 
         | Honestly, even the goto statement isn't that bad. It's pretty
         | useful in C code. I'm not saying anyone should put it in a new
         | language, but the amount of hate it gets is really just related
         | to BASIC monstrosities from the 1970s, not any real-world
         | applications of it.
        
       | ants_everywhere wrote:
       | The wrong abstraction can completely destroy a startup. I've
       | never seen duplicate code with that ability to cause damage.
       | 
       | Or consider the centuries humans spent trying to make geocentric
       | astronomical models work.
        
         | edgyquant wrote:
         | You're lucky then. I joined a company that had a team of
         | inexperienced engineers where every form or details page was a
         | separate program and the render functions were several hundred
         | lines long by themselves. When I joined they had a dozen pages
         | that were each so buggy adding new ones was nearly infeasible
         | and fixing bugs took most of the dev time. Duplicate code can
         | certainly slow down the dev process and kill a startup.
        
         | waffletower wrote:
         | I have seen much damage from duplicate code at multiple
         | organizations. I have seen thoughtful abstractions work
         | successfully to mitigate it, and rarely encountered the
         | opposite. I have encountered multiple pejoratives: copypasta
         | coders, couch developers et al.
        
       | DustinBrett wrote:
       | Whatever is faster is cheaper because everything needs to
       | constantly justify its existence with new features.
        
       | gosukiwi wrote:
       | I like Dan Abramov's "The Wet Codebase"
       | (https://www.youtube.com/watch?v=17KCHwOwgms) -- I've been guilty
       | of doing just what he says in his talk at first, removing all
       | duplications and making the codebase DRY. But then I came to like
       | "prefer duplication over the wrong abstraction", as Sandi Metz
       | puts it.
       | 
       | Sometimes it's good to wait to have more data to make an easier
       | and more informed decision.
        
       | kristov wrote:
       | Abstraction is not just about hiding code - it's about reducing
       | options. You purposefully reduce options to make the system
       | easier to reason about. A "function" in a programming language is
       | an abstraction over machine code. It looks like variables have
       | scope in an isolated environment, and it looks like the braces
       | mean something, but it's compiled down to machine instructions
       | that have no such concept. Goto considered harmful, but compiled
       | machine code is littered with jump instructions (of course). You
       | can do a lot of funky tricks with machine code that the higher
       | abstraction of a programming language doesn't let you do. When
       | you create an abstraction you reduce options for the user of that
       | abstraction. So abstractions tend to gather cruft over time
       | because users want those restrictions relaxed to do their special
       | thing.
        
         | feoren wrote:
         | Absolutely right. One of the most important questions to ask an
         | abstraction is: what can I _not_ do with this? If the answer is
         | "nothing -- you can do everything you could before", then the
         | abstraction is an inner platform. The entire power that
         | abstraction brings is in "focusing" on the problems we care
         | about solving; it must make other problems impossible (ideally
         | ones we don't care about). It follows from the No Free Lunch
         | theorem.
         | 
         | One way to make sure your abstractions are focused on solving
         | the right problems is to always define them based on what you
         | _need_, not based on what you _have_. The root of the
         | abstraction vs. duplication debate comes down to this. Indeed
         | it's unhelpful to look at two pieces of code and say "these
         | look the same; I will abstract them!". Instead you say "wow
         | these have really similar needs; I will define exactly what
         | that need is and they'll both ask for it."
        
       | lamontcg wrote:
       | This whole article is based on a bad reading of the problem.
       | 
       | The problem that happens when code is first duplicated is that
       | the correct abstraction is a fundamental UNKNOWN.
       | 
       | If you knew the right way to de-duplicate it, you would of course
       | always construct that abstraction, because that would always be
       | better.
       | 
       | What happens in practice is that the wrong abstraction is usually
       | chosen.
       | 
       | Then that incorrect abstraction isn't usually held around because
       | of "[feeling] honor-bound to retain the existing abstraction" (if
       | that's a direct quote from Sandi then I disagree with the quote
       | and feel it has entirely the wrong emphasis). The problem is that
       | it is always easier to add a new knob to the bad abstraction than
       | it is to go back and de-dup the whole code and fix the
       | abstraction. So the bad abstraction tends to accrete more bad
       | abstractions on top of it until it becomes a mess because of
       | doing the cheap, easy thing.
       | 
       | We should not do that. But the realities of software development
       | are that when you are dealing with an orthogonal problem, you
       | WILL wind up adding a knob to something that can be done in a
       | day, rather than taking 2 weeks to refactor a different subsystem
       | that your original problem only barely touches and isn't the
       | primary concern of whatever business objective you are trying to
       | deliver.
       | 
       | So the advice is to let it sit for awhile. Let the code accrete a
       | few more requirements over the weeks or months ahead, and when
       | you find yourself doing a double edit to both sides of the code
       | and the right abstraction is clear to you then go ahead and de-
       | duplicate it.
       | 
       | Note that if the problem is TRIVIAL then go ahead and de-dup it
       | right from the start. This isn't advice for junior programmers
       | who are faced with something as simple as dropping two hash keys
       | into an array and then iterating over it so that it makes it easy
       | to add a third key. This is more about having two classes which
       | are fairly similar and extracting a whole base class and jamming
       | all the shared code into the base with a tightly-coupled poorly-
       | thought-out "wide" interface (using inheritance as a hammer to
       | de-dup code). And the whole problem becomes even worse if someone
       | external might come along and pick up that base class and start
       | using it with the existing API and you might be locking yourself
       | into a shitty API that you can't change without breaking
       | backwards compatibility.
       | 
       | And even if you're in a "non-OO" language like Go you can still
       | make this mistake by designing bad interfaces, it is the exact
       | same thing.
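       | 
       | For the TRIVIAL case mentioned above, a hedged sketch in Python
       | (the key names and both helper functions are hypothetical):
       | two near-identical checks collapse into a list plus a loop, so
       | a third key is a one-word change.

```python
def missing_keys_duplicated(cfg: dict) -> list[str]:
    """Duplicated form: one copy-pasted check per key."""
    missing = []
    if "host" not in cfg:
        missing.append("host")
    if "port" not in cfg:
        missing.append("port")
    return missing


# De-duplicated form: the keys become data, the check is written once.
REQUIRED_KEYS = ["host", "port"]


def missing_keys(cfg: dict, required: list[str] = REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent from cfg, in order."""
    return [key for key in required if key not in cfg]
```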
        
       | highwind wrote:
       | The article seems to be arguing against conditionals not
       | duplication.
        
       | danbruc wrote:
       | That something is the wrong abstraction is something you can only
       | know after the fact, at the time you build the abstraction it is
       | - or at least it should be - a reasonable choice. And later on as
       | the code evolves there are two possible outcomes, the abstraction
       | remains a good choice or the abstraction stops being a good
       | choice and you have to change it. Maybe it can be saved with some
       | refactoring, maybe it has to go completely.
       | 
       | But at the very least you had a working abstraction for some time
       | and you can easily figure out all the places where this
       | functionality is used and you have a single place to make changes
       | when you have to make them instead of having to hunt down all the
       | different places with slightly different implementations. Even if
       | an abstraction breaks completely down and has to be split up into
       | several implementations, each of those will usually have several
       | usages which would all still be repetitions without the
       | abstraction.
        
       | karmakaze wrote:
       | There's an aspect of "not seeing what others are seeing" here.
       | 
       | > I think "the wrong abstraction" is a confused way of referring
       | to poorly-de-duplicated code. Here's why. [...]
       | 
       | > So instead of "duplication is cheaper than the wrong
       | abstraction", I would say "duplication is cheaper than confusing
       | code littered with conditional logic". But I actually wouldn't
       | say that, because I don't believe duplication is cheaper. I think
       | it's usually much more expensive.
       | 
       | It seems the author is considering 'cost' to be the mechanical
       | effort of managing the sync/desync of the DRY code. What it's not
       | considering is that _distinct intents_ can _incidentally_ use the
       | same implementation at the moment. This is when it's not a good
       | idea to DRY because they are not _meant_ to stay in sync.
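       | 
       | As a hedged illustration in Python (both validators and their
       | rules are invented for this sketch): the two functions below
       | are identical today, but they answer different questions, so
       | merging them would couple rules that are free to diverge.

```python
def is_valid_username(text: str) -> bool:
    """Account policy: alphanumeric, at most 20 characters (assumed rule)."""
    return text.isalnum() and len(text) <= 20


def is_valid_coupon_code(text: str) -> bool:
    """Marketing policy: happens to match the username rule for now.

    Kept separate on purpose; if coupons later allow dashes, usernames
    must not change with them.
    """
    return text.isalnum() and len(text) <= 20
```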
        
       | seadan83 wrote:
       | I think a key distinction often lost here is that generic code
       | and abstract code are different. Abstract code hides details,
       | generic code allows its use in more places. When hiding details,
       | often it also becomes more generic. Making code generic does not
       | necessarily hide details, it can very well often expose
       | additional details.
       | 
       | Also seemingly not mentioned - SRP (single responsibility
       | principle). SRP & DRY should be considered together. If a person
       | DRY'ies up code without regard to SRP, they're making any code
       | that can be generic, generic. A rule of thumb is generic code is
       | 3x more expensive than non-generic code.
       | 
       | ==============
       | 
       | To illustrate, here is an example (and pretend that these
       | examples are duplicated in 20 different places that all need the
       | account balance sum):
       | 
       | --------------
       | 
       | Example (1) - non-generic, non-abstract
       | 
       | ```
       | 
       | int savingsBalance = 1;
       | 
       | int checkingBalance = 1;
       | 
       | int totalBalance = savingsBalance + checkingBalance;
       | 
       | ```
       | 
       | --------------
       | 
       | Example (2) - generic, minimally abstract
       | 
       | ```
       | 
       | int savingsBalance = 1;
       | 
       | int checkingBalance = 1;
       | 
       | int totalBalance = addBalances(savingsBalance, checkingBalance);
       | 
       | ```
       | 
       | --------------
       | 
       | Example (3) - abstract, potentially generic:
       | 
       | ```
       | 
       | int totalBalance = addBalances();
       | 
       | ```
       | 
       | ===============
       | 
       | Now consider what happens if we need to add a 'brokerage account
       | balance' to the mix (and let's say we get that value via an API
       | call). These examples change in the following ways:
       | 
       | Example (1), updated:
       | 
       | ```
       | 
       | int savingsBalance = 1;
       | 
       | int checkingBalance = 1;
       | 
       | int brokerageBalance = fetchBrokerageBalance();
       | 
       | int totalBalance = addBalances(savingsBalance, checkingBalance,
       | brokerageBalance);
       | 
       | ```
       | 
       | Example (2), updated:
       | 
       | ```
       | 
       | int savingsBalance = 1;
       | 
       | int checkingBalance = 1;
       | 
       | int brokerageBalance = fetchBrokerageBalance();
       | 
       | int totalBalance = addBalances(savingsBalance, checkingBalance,
       | brokerageBalance);
       | 
       | ```
       | 
       | Example (3), updated & unchanged:
       | 
       | ```
       | 
       | int totalBalance = addBalances();
       | 
       | ```
       | 
       | Example (1) & Example (2) have similar scaling behavior here
       | (scaling relative to complexity). This illustrates a very key
       | difference between abstract and generic code.
       | 
       | Now, let's say on another hand that whether we should include
       | brokerage balance is conditional. In example 1, we have the same
       | logic to be applied in 20 different places. We can mutate example
       | 3 to be more generic (EG: pass in flag -
       | `addBalances(Flags.includeBrokerageAccount)`). At this point we
       | can say that the abstraction is wrong and needs to be split into
       | different methods (which is fine!). Making example 3 more generic
       | is more complex, we incur the penalty of having generic code.
       | Example 1 is arguably the worst to have since we will get subtle
       | errors if we fail to update everything. In part these design
       | principles are also there to help protect updates and make them
       | safe (very similar to the ACID guarantees of a database that help
       | make it so you can update data without breaking the overall
       | database).
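       | 
       | A rough Python sketch of the two directions example 3 can take
       | (the balances, the stubbed API call and all function names are
       | assumptions, not real APIs): Option A makes the one function
       | more generic with a flag, Option B splits the abstraction.

```python
SAVINGS_BALANCE = 1
CHECKING_BALANCE = 1


def fetch_brokerage_balance() -> int:
    """Stand-in for the API call; returns a fixed hypothetical value."""
    return 5


# Option A: one generic function steered by a flag.
def add_balances(include_brokerage: bool = False) -> int:
    total = SAVINGS_BALANCE + CHECKING_BALANCE
    if include_brokerage:
        total += fetch_brokerage_balance()
    return total


# Option B: split into two named methods, no flag at the call site.
def bank_balance() -> int:
    return SAVINGS_BALANCE + CHECKING_BALANCE


def balance_with_brokerage() -> int:
    return bank_balance() + fetch_brokerage_balance()
```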
       | 
       | Another point, which I won't go into detail on: boilerplate
       | code has yet different characteristics.
       | 
       | In sum, it's largely a question of what kind of coupling is best
       | and how to deal with that coupling. Duplicated code is coupled
       | without any runtime or compile time checks that it stays in sync
       | (if you forget to update something from example 1 above, it's a
       | bug!). Keeping code consolidated into a common procedure does not
       | remove that coupling, it just changes the nature of the coupling
       | and makes it more explicit. Common code between micro-services
       | couples those micro-services together (and that can be very bad).
       | 
       | Thus, we need to look at a lot of things when applying DRY: we
       | need to consider SRP, whether we are coupling services together,
       | and whether we are simply making non-generic code generic.
        
         | seadan83 wrote:
         | Typo in that updated Example (1), should have been:
         | 
         | int totalBalance = savingsBalance + checkingBalance +
         | brokerageBalance;
         | 
         | It's hard to explain such complicated concepts super concisely.
         | What I'm getting at is that DRY is often equated to merely making
         | code generic and re-used, while the goal of DRY is not at all
         | about re-use. Generic code is more complicated than non-generic
         | code, thus if we make code generic for the sake of making it
         | used in many places - that is likely going to make things more
         | complex. It's a fundamental misunderstanding that DRY is simply
         | the act of using human pattern matching to make all similar
         | looking code generic and re-used. Instead DRY is more about:
         | "are we sanitizing data before sending it to the front-end?
         | Then that should be done in one place." "Where are we
         | configuring database connections?" etc..
         | 
         | Further DRY should not be the only guiding factor, SRP &
         | coupling should always be considered at the same time as DRY.
        
       | preommr wrote:
       | Duplication is cheaper because of how most programmers write code
       | at their job:
       | 
       | - write stuff as fast as possible, without having time to think
       | about overall architecture, especially if it involves having to
       | cooperate with other devs. It's easier to just implement
       | something that's as quick and as simple as possible so that it
       | can be passed off to someone else with minimum effort.
       | 
       | - no need to communicate the abstraction semantics - no need for
       | documentation outlining the abstraction, reasoning, possible
       | expansion, etc.
       | 
       | - it's much easier to make localized changes. A well written
       | abstraction will cover some logic that might be spread across
       | multiple areas. Changing something major to the abstraction
       | requires understanding how the abstraction affects all its
       | applications in the same ticket. Whereas duplicated code can
       | result in a ticket being resolved by just making a change to a
       | specific code block, like a function.
       | 
       | - Things that work well aren't appreciated. If it's easy to
       | update an abstraction to a new feature, it'll be the expected
       | outcome. When a change like the previous point is needed, it's
       | much more memorable because the frustrating experience is
       | likely to be longer and more strenuous. We also tend to remember
       | negative experiences over positive ones.
       | 
       | - Abstractions require reading more code with additional levels
       | of indirection and devs don't like reading other people's code.
       | 
       | - Writing things well requires effort, so bad abstractions are
       | more likely.
       | 
       | - More mature projects tend to have more abstractions because of
       | their additional complexity, so I would guess that there's a
       | strong correlation between difficult projects and frequency of
       | abstractions.
       | 
       | - Some people went absolutely nuts with writing blog posts, and
       | evangelizing certain techniques which were completely unnecessary
       | in an effort to push out content. There's lots of things to write
       | about on implementing abstractions. But little in the other
       | direction other than don't write unnecessary abstractions.
        
         | AnimalMuppet wrote:
         | The flip side is, duplication is bad because when you find a
         | bug and fix it, did you fix it _everywhere_? How many places
         | were there where the bug needed fixed? Are you sure you got
         | them all? It's much easier when there's only one place that
         | you have to fix.
        
           | mfitton wrote:
           | I'm getting shot in the foot by this right now as our team
           | embarks on tackling some long-term tech debt.
           | 
           | The approach we've found that works is health checks and
           | manually looking into cases when we think we've fixed a bug,
           | as it will often point us to a piece of duplicated code we
           | missed that we can wrap into the fold.
        
           | postalrat wrote:
           | Or maybe it's a bug in 80% of the cases and not in the other
           | 20%.
        
           | preommr wrote:
           | Ticket closed, case closed.
           | 
           | And I am only half-joking about that. I don't think that
           | effort is that visible and often goes unrewarded. I feel like
           | a lot of managers don't directly, but indirectly use number
           | of tickets closed as a sign of productivity which affects
           | promotions and compensation.
           | 
           | Obviously YMMV, but teams that care about their code quality
           | to such an extent are less likely than places that act as
           | ticket factories.
        
           | hooverd wrote:
           | Maybe that bug in that abstraction was actually load bearing.
        
           | ozim wrote:
           | Flip side of the flip side is bad because when you fix the
           | bug in "abstraction" or de-duplicated piece of code: how do
           | you know you did not break something you don't know about?
           | 
           | Duplication is easier because once you fix that single place
           | - you are 100% sure you fixed that place and you did not
           | break 10 other places. Maybe you'd know "by writing unit
           | tests", but when do you write unit tests? When you find out
           | you broke something.
           | 
           | Funny story time: we had a combined add/edit popup in a
           | system, because the two looked the same so a dev just made
           | them a "single thing". For something like 3 months: dev1
           | fixed something -> qa2 found bug X -> dev2 fixed something ->
           | qa1 found bug Y -> dev3 fixed something -> qa3 again found
           | bug X. When I got into the code base I noticed that
           | ping-pong, because somehow I was the only person sane enough
           | to check the git history, and I split things up. Something
           | like that has happened multiple times in my career.
        
       | aidenn0 wrote:
       | I clicked on the "more nuanced and comprehensive post" and the
       | real TL;DR is "I define duplication differently than everybody
       | else, and, by that definition, claim that duplication is always
       | bad"
        
         | jasonswett wrote:
         | > Just because a piece of duplication costs something doesn't
         | automatically mean that the de-duplicated version costs less.
         | It doesn't happen very often, but sometimes a de-duplication
         | unavoidably results in code that's so generalized that it's
         | virtually impossible to understand. In these cases the
         | duplicated version may be the lesser of two evils.
        
       | janaagaard wrote:
       | > My understanding of the "duplication is cheaper than the wrong
       | abstraction" idea, based on Sandi Metz's post about it, is as
       | follows. When a programmer refactors a piece of code to be less
       | duplicative, that programmer replaces the duplicative code with a
       | new, non-duplicative abstraction.
       | 
       | I think one of the main takeaways from Sandi Metz's quote is that
       | you should postpone creating the abstraction until after you have
       | the duplicated code. Sometimes you will remove the duplication
       | when you have just two implementations, sometimes you will want
       | many more. Once you have the repeated code it's relatively easy
       | to make the right abstraction.
        
       | olingern wrote:
       | I passionately disagree with this. Abstractions inherently
       | introduce some level of opaqueness and it's only useful in the
       | context of making things more maintainable. Duplicated code is
       | easier to reason about because its intent is closer to the
       | problem it originally solved.
        
       | wellpast wrote:
       | To talk about "duplication is cheaper than the wrong abstraction"
       | without invoking "dependencies" at all (and their costs) means
       | the entire premise has been missed.
       | 
       | Another tell:
       | 
       | > Don't try to make one thing act like two things. Instead,
       | separate it into two things.
       | 
       | If abstractions were so easily split like this, then the advice
       | wouldn't hold. But they never are. Abstractions immediately
       | accumulate dependencies making it near impossible to split them,
       | as we all learn after living in anything other than toy code
       | bases.
       | 
       | The hallmark of a junior (i.e. someone who has not been to
       | battle much) is making de-duplicating code a priority and not
       | understanding the cost of dependencies.
        
       | bheadmaster wrote:
       | It took me a long time and many thousands of lines of code
       | written, read and re-written in order to understand one thing:
       | Code is supposed to reflect the intention.
       | 
       | Good code reflects that intention smoothly, like a well-written
       | paragraph of a book reflects the events that happened in the
       | story.
       | 
       | DRY makes sense semantically, when a piece of code _always_ needs
       | to be the same as another piece of code - that's when you
       | isolate it into a function with a semantically meaningful name
       | and behavior. Applying DRY without understanding and
       | indiscriminately leads only to confusion and needless complexity.
        
       | Ensorceled wrote:
       | I have a current example where this bit my team.
       | 
       | A fairly common pattern that I've seen over and over in multiple
       | domains is this:
       | 
       |     Given a group of "things" with a start and stop date, list
       |     all the things that are "active" during a given date range.
       | 
       | Someone abstracted it because we have several "things" that use
       | this logic.
       | 
       | Then it had a bug because some of the things are inclusive and
       | some are exclusive.
       | 
       | Then it had a bug because some of the things use dates and some
       | timestamps.
       | 
       | Then it had a bug because some of the things are timezone aware
       | and some are not.
       | 
       | So we started down the path of a rather simple query construction
       | becoming a complex thing with flags for inclusive/exclusive for
       | start and end, timezone settings ...
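       | 
       | To make the shape of the problem concrete, a simplified Python
       | sketch (the Thing class, field names and the single flag are
       | hypothetical, not our real code): even one inclusive/exclusive
       | switch changes which rows come back at the boundary.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Thing:
    name: str
    start: date
    stop: date


def active_things(things, range_start, range_end, *, stop_inclusive=True):
    """Return the things whose start..stop span overlaps the given range.

    stop_inclusive decides whether a thing stopping exactly on
    range_start still counts as active.
    """
    result = []
    for thing in things:
        starts_in_time = thing.start <= range_end
        if stop_inclusive:
            still_running = thing.stop >= range_start
        else:
            still_running = thing.stop > range_start
        if starts_in_time and still_running:
            result.append(thing)
    return result
```

       | Each further difference (timestamps, timezones) would add
       | another parameter like stop_inclusive to the same function.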
        
         | IshKebab wrote:
         | That sounds like your code just isn't properly typed.
         | 
         | For example in Rust the first bug would be caught by `Range` vs
         | `RangeInclusive`.
         | 
         | The second bug would trivially be caught because dates and
         | timestamps are different types.
         | 
         | The third is trickier, but (depending on exactly what you mean)
         | that can be caught with static types too.
         | 
         | Pointing your finger in the wrong place IMO. If anything this
         | refactoring highlighted worrying inconsistencies in your code
         | that probably would have cropped up as bugs elsewhere.
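         | 
         | The same idea can be sketched outside Rust. In Python's
         | standard datetime module (a swapped-in illustration, and the
         | two helpers are hypothetical), mixing a date with a datetime,
         | or a naive with a timezone-aware datetime, raises TypeError
         | on ordering. That check happens at run time, not at compile
         | time, so it is weaker than a static type check.

```python
from datetime import date, datetime, timezone


def is_before(a, b) -> bool:
    """Plain ordering comparison; relies on the operands' own type rules."""
    return a < b


def mixes_cleanly(a, b) -> bool:
    """Report whether a and b can be ordered without a TypeError."""
    try:
        is_before(a, b)
    except TypeError:
        return False
    return True
```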
        
         | willio58 wrote:
         | Great example. One way to avoid these problems is having lots
         | of tests written for the various uses of the abstracted thing
         | so you know they're all covered. But also, if all of these
         | things function in different nuanced ways, is it really any
         | benefit to have them all jammed into the same abstraction in
         | the first place? I've found this comes down to personal taste.
         | I prefer a little duplication if it means not having to "own"
         | an abstraction that I'll need to heavily document and hope
         | people read the documentation for in order to not break. But
         | some would rather own the one point of failure.
        
         | williamdclt wrote:
         | > So we started down the path of a rather simple query
         | construction becoming a complex thing with flags for
         | inclusive/exclusive for start and end, timezone settings ...
         | 
         | Forcing the caller to _think_ about inclusivity and timezone
         | awareness is not a bad thing, rather the opposite. These are
         | important decisions to be taken: the abstraction is not trivial
         | because what it abstracts actually does have inherent
         | complexity.
         | 
         | If the abstraction forces you to take the necessary decisions
         | (inclusive? timezone?) without having to think of how to
         | implement them, it doesn't sound like a bad abstraction. Too
         | often these decisions are not thought about, and the expected
         | behaviour is "whatever is implemented".
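
A minimal Python sketch of the point above, assuming a hypothetical
`ActiveWindow` type (the name and fields are illustrative, not from the
thread). Every field is required and naive datetimes are rejected, so a
caller cannot skip the inclusive/timezone decisions, but also never has
to implement the comparison.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ActiveWindow:
    """Hypothetical query window; no field has a default on purpose."""

    start: datetime
    end: datetime
    end_inclusive: bool

    def __post_init__(self) -> None:
        # Reject naive datetimes so the timezone decision cannot be skipped.
        for value in (self.start, self.end):
            if value.tzinfo is None or value.utcoffset() is None:
                raise ValueError("datetimes must be timezone-aware")
        if self.end < self.start:
            raise ValueError("end must not precede start")

    def contains(self, moment: datetime) -> bool:
        # The start is always inclusive in this sketch; only the end varies.
        if moment < self.start:
            return False
        if self.end_inclusive:
            return moment <= self.end
        return moment < self.end


start = datetime(2023, 8, 1, tzinfo=timezone.utc)
end = datetime(2023, 9, 1, tzinfo=timezone.utc)
half_open = ActiveWindow(start, end, end_inclusive=False)
closed = ActiveWindow(start, end, end_inclusive=True)
```

Here `half_open.contains(end)` is false and `closed.contains(end)` is
true; the difference is visible at the call site instead of being
"whatever is implemented".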
        
         | alphanumeric0 wrote:
         | Sounds like each thing should know how to search for active
         | instances of itself, given a date range, which is a common OO
         | abstraction.
        
           | chiefbucket wrote:
           | The point isn't the interface, though; it's the implementation.
           | And if many of those things are implementing the same search
           | functionality slightly differently, you're back to the same
           | spot, except now your bugs are spread across multiple sites,
           | often with duplication.
           | 
           | The underlying issue is just that correctness is hard I
           | think.
        
           | marcosdumay wrote:
           | One should at minimum name the things that behave differently
           | by different names, which is a common practice in data
           | modeling.
           | 
           | I expect all those bugs to return again and again as
           | different people maintain that code. At least with code
           | deduplication they would have a clear alarm telling them
           | their knowledge is wrong and they must pay attention. But
           | with each query doing everything people will just assume they
           | know it all.
        
         | wvenable wrote:
         | Who knows whether some things are inclusive or not? Who knows
         | which use dates and which use timestamps? It seems like this
         | should be abstracted somewhere and this knowledge codified in
         | one single place. It sounds like your abstraction, in this
         | case, isn't very abstract at all.
         | 
         | That is common for bad abstractions -- they add a layer but
         | they don't actually encapsulate any knowledge. To use this
         | abstraction, you shouldn't be passing any flags for
         | inclusive/exclusive, etc -- it should know that for you.
        
       | opportune wrote:
       | When you work in a very large and complex codebase you encounter
       | a few things that this author doesn't seem to consider or thinks
       | are very minor:
       | 
       | 1. Refactoring something introduces non-negligible risk. Consider
       | a class with many fields and multiple mutexes it uses to control
       | concurrent access to those fields. Even just consolidating those
       | mutexes introduces the hard-to-conclusively-find-in-testing risks
       | of introducing deadlocks and livelocks. And that's like the
       | base case of refactoring the class: anything involving splitting
       | the class up, moving data fields up or down the stack, changing
       | the way member functions (which acquire locks) call each other
       | is even more complicated and risky. It is just not worth
       | refactoring this thing unless you have a very very good reason.
       | 
       | 2. A function or object often has a many-to-many relationship in
       | what it touches: it is called or accessed from multiple places
       | and it calls and accesses many things. Non-trivial improvements
       | to abstractions typically involve changes at both ends: which may
       | be "as simple" as updating all the call sites to take a new
       | argument or handle a different kind of error (hopefully all your
       | call sites are structured so error handling is compatible with
       | _their_ abstraction!) or as complex as completely refactoring
       | multiple levels up and down the stack to reflect better-
       | abstracted semantics.
       | 
       | No you shouldn't lazily copy-paste around such problems when they
       | are straightforward enough. But it can be so much less work (and
       | again, less risk of breaking things) to use composition +
       | wrappers, or inheritance, or to copy some little chunk of code
       | than to do things the "right" way.
       | 
       | 3. Let's face it, your cool new abstraction sounds right in your
       | head, but in a complex system it may just be playing abstraction
       | whackamole once all the bugs and edge cases you're not initially
       | considering get addressed. It may be impossible to fully
       | understand the entire system from beginning to end, without which
       | it's hard to be confident you're actually improving things before
       | embarking on your epic partial rewrite, or at the very least know
       | you're not changing semantics around some arbitrarily-drawn box.
       | But if you're not even changing the semantics, see point 1.
        
       | JohnMakin wrote:
       | We need a new saying - "Premature DRY is the root of all evil."
        
       | vemv wrote:
       | I wish more people simply were happy with using _themselves_
       | whatever set of beliefs/techniques they deemed best (abstraction,
       | duplication, whatever), preaching nothing, and arguing less.
       | 
       | Which is to say, there will never be a single truth for these
       | topics. So why not build a mindset that is ready for encountering
       | differing opinions, diverse code?
        
         | hinkley wrote:
         | When your job is mentoring, RCA or cleaning up after other
         | people (hello) then these aren't opinions and aesthetics.
         | They're empirical evidence and/or coping mechanisms.
         | 
         | Invalidating people's coping mechanisms without proposing your
         | own never goes over well. And sometimes even then.
         | 
         | When diagnosing a production issue, we don't have the luxury of
         | entertaining five different ways to solve the same problem. And
         | code smells slow debugging-under-the-gun because most bugs are
         | in code smells, so they draw your attention only to prove to be
         | a false signal.
         | 
         | If you don't do any of these things, then it's challenging to
         | have empathy for or understanding of the people who do. The
         | people keeping the wheels on deserve the benefit of the doubt.
         | In fact anyone who will stand up and fix problems when they
         | arise deserves a bigger vote on how things get done. Everyone
         | else's opinions are theoretical rather than vocational.
        
       | dathinab wrote:
       | I do.
       | 
       | I have seen tons of time wasted due to the wrong abstraction.
       | 
       | Though it's a question of how much and what you duplicate.
       | 
       | Which means I partially agree with the article, which is more
       | nuanced than the title implies.
       | 
       | One of the most common cases of bad de-duplication is doing so
       | with code which happens to be mostly the same, but where nothing
       | from a business logic POV makes it the same.
       | 
       | Or code which differs mainly in points which the language used
       | needs a lot of complexity to abstract over.
       | 
       | In my experience, a more powerful type system, like in Scala,
       | Haskell or Rust, on one side has the benefit of making
       | refactoring much less bug prone, but also makes it easier to
       | drift into "abstraction introduces too much complexity"
       | territory. In the end, using a type system _appropriately_ is a
       | skill, one which some technically very skilled people are
       | missing.
       | 
       | Though what I also realized is that with a strict type system,
       | "top down abstractions" using e.g. custom
       | traits/interfaces/abstract classes tend to be much more likely to
       | cause issues compared to composite bottom-up abstractions using
       | closures to fill in the missing part. Sadly this kind of
       | abstraction, while simple in the simple case, is also prone to
       | needing some limited degree of higher-kinded typing in the less
       | simple case. This puts limits on how much you can practically
       | apply them in many languages (or they accidentally become too
       | complex due to missing intuitive notation for the limited
       | higher-kinded type parts needed).
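
A rough Python sketch of the "bottom-up" style described above: the
shared skeleton is a plain function, and each caller passes a closure
for the part that differs, with no base class involved. The names
`select_active`, `active_on_inclusive` and `active_on_half_open` are
hypothetical and chosen only for illustration.

```python
from datetime import date
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
Period = Tuple[date, date]


def select_active(items: Iterable[T], is_active: Callable[[T], bool]) -> List[T]:
    """Shared skeleton: filtering is common, the rule is supplied by the caller."""
    return [item for item in items if is_active(item)]


def active_on_inclusive(day: date) -> Callable[[Period], bool]:
    """Build a rule where the end date still counts as active."""
    def check(period: Period) -> bool:
        start, end = period
        return start <= day <= end
    return check


def active_on_half_open(day: date) -> Callable[[Period], bool]:
    """Build a rule where the end date is already inactive."""
    def check(period: Period) -> bool:
        start, end = period
        return start <= day < end
    return check


periods = [(date(2023, 8, 1), date(2023, 8, 29))]
last_day = date(2023, 8, 29)
inclusive_hits = select_active(periods, active_on_inclusive(last_day))
half_open_hits = select_active(periods, active_on_half_open(last_day))
```

In a dynamically typed language this stays simple; the comment's caveat
about higher-kinded types applies to statically typed languages, where
the closure's type can become the hard part.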
       | 
       | Though the most important thing for many projects is to make the
       | code easy to change. And by this I mean changing the source
       | code, not having complicated abstractions allowing you to use the
       | same source code in many different ways even though you only
       | use it in one way at any point in time.
        
         | hinkley wrote:
         | There are two situations I've observed where Sunk Cost Fallacy
         | reliably doesn't kick in. One is three line functions and unit
         | tests. The other is duplicated code. It's better to err on the
         | side of mistakes that people don't get precious about fixing
         | later.
         | 
         | A lot of the arguments I have with coworkers end up being about
         | friction and blind spots about friction. "You" think these
         | things don't slow you or others down later, but I have a
         | bibliography of incidents that say you're wrong. Wishful
         | thinking is married to magic thinking, and they have a child
         | named "mortgaging the future".
        
       | stephc_int13 wrote:
       | Software architecture is a domain where hard and fast rules don't
       | work.
       | 
       | This is all about understanding tradeoffs and nuance.
       | 
       | In general, I believe that abstractions should be used with
       | moderation, de-duplication is not always an improvement,
       | especially in the long run.
       | 
       | I've made this mistake a lot as I tend to be quite obsessive with
       | so-called code "cleanliness".
       | 
       | It is good that novice programmers are warned about the dark side
       | of abstractions, but ultimately they'll have to experience it by
       | themselves to fully grasp why and how they can be detrimental.
        
       | jhp123 wrote:
       | For some reason programmers think that an "abstraction" is the
       | same as just naming something. If I take a bunch of code that
       | will only work given specific, concrete conditions and give it a
       | name like "setup()" then I have "abstracted" it.
       | 
       | People who know what abstraction means, and people who use it to
       | mean indirection or naming things, will of course never agree
       | about how useful it is.
        
         | [deleted]
        
       | yowlingcat wrote:
       | I found this blog post low on insight and thoughtfulness. I've
       | worked with engineers in the past who had an inflated esteem not
       | just of their own abilities but of the nature of the business
       | domain they were ostensibly building solutions inside. I have
       | found that in many cases, there's a level of naivete commingled
       | with arrogance that comes from never having worked with an
       | intrinsically complex enough problem to understand the true cost
       | of abstraction, which is always nonzero.
       | 
       | Now, it is the case that there are many cases where the cost of
       | abstraction is low enough to not be ROI negative. But there are
       | many cases otherwise. Other commentators here have done a great
       | job of detailing that space -- that incidental and actual
       | repetition vary, that abstractions should exist to reduce
       | optionality and ease of reasoning rather than simply reducing
       | code, and those are all correct. But at a very basic level, all
       | of those observations reflect the most critical missing factor
       | from this post, which is context.
       | 
       | No software is created or operated in a vacuum. Every piece of
       | software is created by humans to solve problems for themselves or
       | other humans. So every piece is downstream of the working process
       | of those humans. Given that these working processes are subject
       | to change and evolution, changes in requirements aren't edge
       | cases but table stakes. This means that often the cost of an
       | abstraction is not just whether it's the wrong abstraction at a
       | point of time, but also whether it's an abstraction that is
       | likely to erode over time given a particular working process.
       | 
       | With that said, a lot of this post seems like an exposition of
       | this central point:
       | 
       | ```
       | 
       | If I were to see a confusing piece of code littered with
       | conditional logic, I wouldn't see it and think "oh, there's an
       | incorrect abstraction", I would just think, "oh, there's a piece
       | of crappy code". It's neither an abstraction nor wrong, it's just
       | bad code.
       | 
       | ```
       | 
       | I've seen this dismissal from many engineers over my career, and
       | in every case, without fail, it reflected an inability to deeply
       | read and understand the code, its history, and likely its future.
       | To all the engineers out there reading this: thinking like this
       | will prevent you from maturing from a junior engineer to a mid
       | level engineer, never mind a mid level engineer to a senior
       | engineer or engineering leader. You've been forewarned.
        
       | henrydark wrote:
       | I have a hot take on this, which I hope will resonate with at
       | least a few people: duplication, even of blocks of up to a few
       | long statements, rarely bothers me, because I remember all the
       | duplications as a single instance. I have extraordinary memory,
       | and this makes a huge difference in how I think of and write
       | code. Or anything really. I save everything I've ever written,
       | like bash history, but everything, and refer back to it and copy
       | paste somewhere else. I wonder if anyone else has this. This
       | doesn't affect how I think of production code, but it hugely
       | affects my workflow.
        
       | jmull wrote:
       | The key factor when de-duping some code is to know whether the
       | code is the same because they express _the same abstraction_ or
       | due to _coincidence_.
       | 
       | If they are the same abstraction then they should always be the
       | same and you're doing the right thing to de-dupe.
       | 
       | If they are the same due to coincidence de-duping will tie
       | together things that should be independent. As development
       | continues the implementations will need to diverge. That's when
       | you get the rat's nest of conditional logic. It's a lot easier to
       | add a parameter and conditional logic to a function than rip it
       | out.
       | 
       | It's not always easy to tell if two bits of code are the same due
       | to coincidence or not... it might come down to nuances of
       | business considerations that the developer has no idea about (or,
       | since we're talking about predicting the future, no one knows
       | about).
       | 
       | I don't think it can be done perfectly. But it's worth
       | considering why _not_ to de-dupe before you do it.
        
         | aleksiy123 wrote:
         | I think this is the biggest thing people generally
         | misunderstand about "duplication".
         | 
         | It's really about if two concepts need to change together over
         | time. They should be singularly defined.
         | 
         | If they can move independently they should be two definitions.
         | 
         | It's not literally about the code looking the same.
        
         | kevincox wrote:
         | I was looking for this. There are definitely two types of
         | duplication. For example not every use of the number 16 should
         | be replaced with a SIXTEEN constant. However if the maximum
         | allowed password length is 16 you shouldn't be writing 16 all
         | over your code, you should be writing MAX_PASSWORD_CODEPOINTS
         | because your system may depend on that value being consistent.
         | 
         | Although I would disagree that you should never deduplicate
         | things that are coincidentally the same. Sometimes code that is
         | coincidentally the same can have the same bugs and require the
         | same updates over time, so deduplicating them can reduce
         | maintenance cost and remove bugs. However I wouldn't race to
         | deduplicate these things. Just if they become frequent patterns
         | or have remained the same for long enough to justify the effort
         | to unify them.
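
The constant example above, sketched in Python. `MAX_PASSWORD_CODEPOINTS`
and the value 16 come from the comment; the two functions that share it
and the unrelated `parse_hex_byte` helper are hypothetical.

```python
MAX_PASSWORD_CODEPOINTS = 16  # one definition of the business rule


def is_password_length_ok(password: str) -> bool:
    # len() on a Python str counts Unicode code points.
    return len(password) <= MAX_PASSWORD_CODEPOINTS


def password_field_attributes() -> dict:
    # A second consumer of the same rule; it cannot drift from the first.
    return {"type": "password", "maxlength": MAX_PASSWORD_CODEPOINTS}


def parse_hex_byte(text: str) -> int:
    # This 16 is the numeric base. It is unrelated to password length,
    # so it deliberately does not reuse the constant above.
    return int(text, 16)
```

Changing the password limit is then a one-line edit, while the
hexadecimal base stays untouched because it was never the same piece
of knowledge.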
        
       | djha-skin wrote:
       | I'm a DevOps engineer. I totally buy duplication is better than
       | the wrong abstraction but I'd like to nuance it: duplication is
       | better than an abstraction used by two disparate parties (groups
       | of people that don't talk to each other).
       | 
       | This is in agreement with Conway's law, which absolutely governs
       | everything I do. I work on a DevOps team that supports several
       | different development teams all working on different things. The
       | code I write for those teams I often duplicate along team
       | boundary lines. Build scripts, for example, I write and I put
       | them in each team's git repository. These might look very
       | similar. This allows the scripts to grow and change and evolve
       | according to the different teams' needs without the teams needing
       | to talk to each other.
       | 
       | "Proper duplication" goes back to separation of concern. If you
       | have two different concerns (using the lens of Conway's law, two
       | very different teams) using the same code, perhaps they should
       | not be using the same code because that is not a separation of
       | concern. Separate the concerns by separating the code paths both
       | concerns use.
       | 
       | This type of duplication is praised in more depth on wingolog[1].
       | I highly recommend reading it as something every engineer should
       | read.
       | 
       | It's very important to know when to duplicate and when not to do
       | so, because duplicating at the wrong time can lead to pain, but
       | not duplicating can lead to pain also.
       | 
       | 1: https://wingolog.org/archives/2015/11/09/embracing-
       | conways-l...
        
         | waffletower wrote:
         | Very sad hearing this from a DevOps engineer. While the config
         | smear of ops is encouraged by their tools (Terrorform is a
         | fantastic example), a DevOps engineer that does not dedicate
         | themselves to DRY practices will erode the productivity of an
         | organization by default. I remember how Terrable things were
         | before our Ops team developed strong module abstractions for
         | our infrastructure. And get them to talk to each other.
        
         | schnable wrote:
         | 100% agree. Coupling between systems and teams is very
         | expensive and should be done as deliberately as possible.
        
         | dahart wrote:
         | This. I think you're hitting the nail on the head. The question
         | is whether there are multiple dependencies on a given bit of
         | code. When there are multiple dependencies, changing the code
         | because one of them wants something means the code needs to be
         | checked and tested against all the other places the code is
         | being used. And it's really really common to have inadequate
         | understanding and inadequate test coverage, so things break,
         | and hence people develop superstitions about code that
         | shouldn't be touched.
         | 
         | Another way of putting it is that if the code is really truly
         | duplicated, then it doesn't need to change at all. If it has to
         | change, the need for change is there because the multiple
         | parties depending on that code have slightly different needs
         | and slightly different ideas about what they want. Abstracting
         | the code to make deduplication happen is just a way of
         | spackling over those differences, but it can and does often
         | cause trouble down the road, even when it's done well. Once
         | abstracted for two dependencies, a third dependency or more
         | without test coverage can make changes exponentially more
         | dangerous and error prone.
         | 
         | Duplication is good when forking for separate parties (or
         | separate dependencies), each of whom may wish to customize the
         | code, and now they are free to do so without the risk or fear
         | of breaking someone else. I feel like the author of the article
         | didn't understand the benefits of duplication.
        
         | ncruces wrote:
         | This is an example of Go's proverb: A little copying is better
         | than a little dependency.
        
       | yoyohello13 wrote:
       | Here's my hot take: It's not worth thinking about abstraction
       | until you've implemented the same thing 3 times.
        
       | samsquire wrote:
       | I think I would want to look for an accurate "representation" and
       | expression of the right problem, not any particular abstraction
       | technique or mechanical refactoring.
       | 
       | Refactoring code to your understanding helps you understand the
       | code but leaves the code in a different organisation to how it
       | was, adapted for your mental model of the problem.
       | 
       | If programming languages were expressive enough, we could
       | represent things how they are and replicate that base pattern to
       | different cases or scenarios and that would be enough but
       | unfortunately our languages are not expressive of our high level
       | intent and invariants we want to maintain. (Such as extensibility
       | or hookability)
       | 
       | In other words, get the mental model for the problem right and
       | the abstraction will be invisible and the solution shall be
       | obvious.
       | 
       | Abstraction impedance mismatch is when people introduce a design
       | pattern or a strategy that is harder to understand than the
       | problem that was being solved and obfuscates it.
        
       | ckdot2 wrote:
       | Don't re-use via inheritance; use dependency injection. A
       | (well-tested) software component that gets dependency-injected
       | should be considered "final". If it makes sense to adjust it,
       | you can still do it - there's nothing preventing you from
       | this. But you should always be aware that logic relying on the
       | dependency may behave differently in a way you haven't foreseen.
       | If you just want to make a change for a single place in your
       | software, you can easily replace the dependency with another one
       | implementing the same interface. You could even decorate the
       | original dependency if you want to re-use most of its code. What
       | I want to say is that nearly all of the abstraction issues come
       | from inheritance, and in many, many cases there's no need to use
       | inheritance at all.
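
One way this could look in Python, as a sketch rather than a
recommendation. All names here (`ShippingCost`, `FlatShipping`,
`DiscountedShipping`, `Checkout`) are hypothetical. The original
component is left untouched; the one place that needs different
behaviour wraps it with a decorator that implements the same interface.

```python
from typing import Protocol


class ShippingCost(Protocol):
    """The interface that gets injected."""

    def cents(self, weight_grams: int) -> int: ...


class FlatShipping:
    """Well-tested original component, treated as final."""

    def __init__(self, flat_cents: int) -> None:
        self._flat_cents = flat_cents

    def cents(self, weight_grams: int) -> int:
        return self._flat_cents


class DiscountedShipping:
    """Decorator: reuses any ShippingCost and adjusts its result."""

    def __init__(self, inner: ShippingCost, percent_off: int) -> None:
        self._inner = inner
        self._percent_off = percent_off

    def cents(self, weight_grams: int) -> int:
        return self._inner.cents(weight_grams) * (100 - self._percent_off) // 100


class Checkout:
    """Depends only on the interface, never on a concrete class."""

    def __init__(self, shipping: ShippingCost) -> None:
        self._shipping = shipping

    def total_cents(self, items_cents: int, weight_grams: int) -> int:
        return items_cents + self._shipping.cents(weight_grams)


standard = Checkout(FlatShipping(500))
promo = Checkout(DiscountedShipping(FlatShipping(500), percent_off=20))
```

Only the `promo` checkout sees the discount; every other user of
`FlatShipping` keeps the behaviour it was tested with.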
        
       | dgb23 wrote:
       | The most important underlying issue isn't discussed in the
       | article:
       | 
       | DRY must be understood and applied correctly.
       | 
       | "Every piece of knowledge must have a single, unambiguous,
       | authoritative representation within a system"
       | 
       | The keyword here is _knowledge_.
       | 
       | When we see duplication, repetition and so on, then that might be
       | because that piece of code represents:
       | 
       | - data of different entities that have similar structures
       | 
       | - logic that just happens to be similar.
       | 
       | - boilerplate code
       | 
       | None of these things have anything to do with representing the
       | same piece of knowledge in a program. In fact, you can easily get
       | into trouble _especially_ if you think the first two things are
       | violating DRY when they are not.
       | 
       | I wholeheartedly agree with the article, though. If your code or
       | data _model_ is not DRY, you can get into trouble very easily.
       | Very nasty bugs, regressions during maintenance or extension,
       | hours spent in frustration, money lost etc. On top of that: Non-
       | DRY code almost always _proliferates incidental complexity_,
       | because if you don't fix it, then eventually you patch over it.
       | 
       | Here's the best case scenario: Even if you are aware of code not
       | being DRY, do everything right and turn multiple knobs at the
       | same time to change or extend it correctly instead of fixing it,
       | you will do so with much more reluctance and it will be much more
       | mentally taxing.
       | 
       | Non-DRY code is by definition complex: You now have more
       | interconnected parts than you need. So really, if you make your
       | code more DRY, you _simplify_ it.
        
         | dahart wrote:
         | My favorite counter-acronym to DRY is WET: write everything
         | twice (or thrice!). Doing and then redoing it once you
         | understand it better is the best way to learn how to apply DRY
         | correctly.
         | 
         | > Non-DRY code is by definition complex: You now have more
         | interconnected parts than you need. So really, if you make your
         | code more DRY, you _simplify_ it.
         | 
         | It really depends, I think there are some assumptions here that
         | could use clarification. The whole point of choosing
         | duplication is to _disconnect_ parts that shouldn't be
         | connected, so I don't understand what you mean about non-DRY
         | code being more interconnected than duplicated code. Conscious
         | duplication (often called "forking") allows people who depend
         | on a piece of code to change it without breaking anyone else.
         | When you merge two pieces of similar code, they already had two
         | or more separate uses, and you're adding a new connection,
         | tying together the fates of two or more different users. From
         | now on, if they don't have exactly the same agenda, there will
         | be tension and/or bugs.
         | 
         | If deduplication requires adding an abstraction layer, then
         | that absolutely is adding complexity, and it happens because
         | the code being de-duplicated was not _exactly_ the same. Code
         | that's truly duplicated doesn't need to change in order to de-
         | duplicate. So you can delete a copy in that case and centralize
         | the dependencies onto the remaining copy. That eliminates code
         | but doesn't really simplify; it has the potential to simplify
         | future development, but it doesn't simplify the code at the
         | moment of deletion. With modern build systems and project
         | structures, however, it might take a lot of work and it might
         | add complexity to get the DRY code into the right spot where
         | it's visible to everyone who needs it. Another reason for
         | duplication is to avoid having to do backflips to get the code
         | into the right file or scope.
        
           | dgb23 wrote:
           | > Conscious duplication (often called "forking") allows
           | people who depend on a piece of code to change it without
           | breaking anyone else.
           | 
           | Then that code is DRY by definition and simpler.
           | 
           | It can't be Non-DRY because that would imply you'd need to
           | change things in multiple places at once in order to avoid
           | breakage.
           | 
           | If you have two separate parts that can evolve independently,
           | without coordination then those parts don't represent the
           | same piece of knowledge.
        
           | hinkley wrote:
           | For testing I prefer DAMP. Descriptive And Meaningful
           | Prose/Phrases. I've watched otherwise smart people wrestle
           | with testing boilerplate when requirements change and I've
           | had my fill for this lifetime.
           | 
           | Each test is a separate story. At most tests in a suite
           | should share setup code. Anything more than that is coupling
           | of tests, which is a no-no. The distinction between mocks and
           | fakes is the most common place I see this blow up in our
           | faces. Fakes result in coupling of tests. They were difficult
           | to write so they get amortized across ten tests, making new
           | requirements difficult to impossible to add without
           | accidentally removing coverage of other requirements.
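
A tiny Python illustration of the "each test is a separate story" idea,
assuming a hypothetical `apply_discount` function under test. Each test
spells out its own inputs and expectation instead of sharing a fixture
or helper beyond the function itself.

```python
def apply_discount(total_cents: int, percent: int) -> int:
    """Hypothetical function under test."""
    return total_cents * (100 - percent) // 100


def test_ten_percent_off_a_twenty_dollar_order() -> None:
    # All the data this story needs is visible right here.
    order_total_cents = 2000
    assert apply_discount(order_total_cents, percent=10) == 1800


def test_zero_percent_leaves_the_total_unchanged() -> None:
    # Repeating the literal is deliberate: the test reads on its own.
    order_total_cents = 2000
    assert apply_discount(order_total_cents, percent=0) == 2000
```

The duplicated `order_total_cents = 2000` is the price; the benefit is
that changing one requirement means editing one test.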
        
       | Attummm wrote:
       | Until you have to maintain a codebase that has horrible
       | abstractions, and everything is too coupled.
        
       ___________________________________________________________________
       (page generated 2023-08-29 23:01 UTC)