[HN Gopher] I don't buy "duplication is cheaper than the wrong a...
___________________________________________________________________
I don't buy "duplication is cheaper than the wrong abstraction"
(2021)
Author : Akronymus
Score : 127 points
Date : 2023-08-29 12:31 UTC (10 hours ago)
(HTM) web link (www.codewithjason.com)
(TXT) w3m dump (www.codewithjason.com)
| gpderetta wrote:
| I think it is worth distinguishing proper opaque abstractions,
| that are defined by a contract, from convenience macro-like
| "abstractions" that are defined by their implementation.
|
| The former are for abstracting different implementations behind
| an interface and/or decoupling, and require thought, planning,
| and careful consideration for their evolution.
|
| The latter are purely for convenience, to save some typing, some
| mental overhead when understanding code (although they can
| increase it just as well) and to centralize minor bug fixes or
| common features. For long term evolution and divergence, these
| abstractions should simply be macro-expanded instead of trying to
| refit them for the new requirements.
|
| Of course reality is a continuum.
| vendiddy wrote:
| Does reducing duplication make your code easier to maintain? Or
| does it not? Make your decision accordingly.
|
| Start treating duplication as a means to an end. It's
| not an end in itself.
| groby_b wrote:
| Duplication is neither "cheaper than the wrong abstraction" nor
| is it "one of the most dangerous mistakes in coding".
|
| There's a cost to abstraction. There's a cost to duplication. Our
| job, as engineers, is to stop applying blanket statements and
| instead reason about the tradeoffs. And no, they aren't static
| tradeoffs either, because requirements and constraints don't stay
| static.
| waffletower wrote:
| I think the answer here can be different depending upon the
| ecosystem. I confidently believe that abstraction is better
| instrumented and practiced in functional programming languages
| than those of the still-dominant object-oriented paradigm.
| Awkward abstractions are much easier to grow and stumble upon
| when the basic unit (an object) encourages private, greedy,
| encapsulation of data and method implementations. In functional
| languages, living up to DRY (don't repeat yourself) is a much
| more immediate and clear proposition.
| lvncelot wrote:
| > Except in very minor cases, duplication is virtually always
| worth fixing.
|
| I disagree with the severity of this, and would posit that there
| are duplications that can't be "fixed" by an abstraction.
|
| There are many instances I've encountered where two pieces of
| code coincided to look similar _at a certain point in time_. As
| the codebase evolved, so did the two pieces of code, their usage
| and their dependencies, until the similarity was almost gone. An
| early abstraction that would've grouped those coincidentally
| similar pieces of code would then have to stretch to cover both
| evolutions.
|
| A "wrong abstraction" in that case isn't an ill-fitting
| abstraction where a better one was available, it's any (even the
| best possible) abstraction in a situation that _has no_ fitting
| generalization, at all.
| MilStdJunkie wrote:
| I might have to say some unkind things here, but statements
| like:
|
| > instead of "duplication is cheaper than the wrong
| abstraction", I would say "duplication is cheaper than
| confusing code littered with conditional logic".
|
| seems like it's looking at this problem from an extremely
| narrow context.
|
| The truth is that the phrase "wrong abstraction" is (more or
| less) unquantifiable, which makes the original phrase, as
| employed, sort of like a koan. It addresses the very human
| tendency to see patterns in noise, and our ability to
| "transmit" such hallucinations to other humans via natural
| language and other means.
|
| The closest I can get to - given my at-best-apprentice status
| as a formal programmer - is the quantitative test I developed
| for CCS (conditional content systems), where the abstraction
| lies in the SNS[1], and the de-duplication mechanism is
| applicability[2]. Since each applicability statement carries
| its own overhead, there's a limit on how much "abstraction" the
| model can take before it's using quantitatively more keystrokes
| than duplication.
|
| The test goes like this: take the flat text procedures for ALL
| the configurations, and add it together. Now, take the
| conditionalized, applicability-laden procedure that unifies the
| procedure, and measure its file size. If the latter is LARGER
| than the former, then you're using the wrong SNS/applicability
| model for rolling up this content.
|
| Thing is, this is _inevitable_ if you throw enough dissimilar
| configurations at a CCS, because each configuration has its own
| overhead, and eventually that outpaces the content itself.
|
| You can address this in a bunch of ways - like adding a
| containing pseudo-product that has all the configurations
| inside of it - but the actual real Product Management might not
| let you build on the applicability like that, because the
| Product itself isn't sold that way. Any other abstraction isn't
| available to you, because in the end this is natural language,
| which - unlike structured language - resists first order
| abstractions _really well_. This is one of those instances
| where, yes, the abstraction of the SNS/Applicability is
| _worse_ - quantifiably - than duplication. All that complexity
| would be better handled via version control fork/branch
| relationships - _far outside_ of the realm of natural language.
|
| [1] standard numbering system, a sort of numeric designator of
| functional systems, the primary way that content is designated
| as semi-independent modules.
|
| [2] conditional "chunks" that turn on and off depending on the
| applicability statement
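
A rough sketch of the size test described above, in Python. It assumes each configuration's flat procedure and the unified, applicability-laden procedure are available as plain strings; the function name and the sample texts are hypothetical, and UTF-8 byte length stands in for file size.

```python
def unified_is_worth_it(flat_procedures: list[str], unified_procedure: str) -> bool:
    """Return True when the conditionalized procedure is no larger than
    the combined size of the flat per-configuration procedures."""
    flat_total = sum(len(text.encode("utf-8")) for text in flat_procedures)
    unified_size = len(unified_procedure.encode("utf-8"))
    return unified_size <= flat_total


# Hypothetical content: two configurations that share most of their steps.
flat = [
    "Remove panel. Disconnect cable A. Replace filter.",
    "Remove panel. Disconnect cable B. Replace filter.",
]
unified = "Remove panel. Disconnect cable [A: config 1 | B: config 2]. Replace filter."

print(unified_is_worth_it(flat, unified))
```

As more dissimilar configurations are added, the bracketed applicability markup grows faster than the shared text shrinks, which is the point at which the check would start returning False.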
| feoren wrote:
| It'd be wonderful if we could measure the utility of software
| engineering choices by counting keystrokes or measuring file
| sizes or putting them in a turbo encabulator and seeing which
| one has more modial interaction with its magneto-reluctance.
| Unfortunately, reality is just too complicated, with far too
| many tradeoffs to be balanced. I'd recommend deep thought and
| discussion about the domain over looking at a graph of your
| codebase's sinusoidal repleneration.
|
| > All that complexity would be better handled via version
| control fork/branch relationships
|
| Probably not.
| MilStdJunkie wrote:
| Holy smokes, my turbo-sarcasmo detector just broke! But
| yeah, that's more or less the TLDR of my point. The phrase
| "wrong abstraction" does some heavy lifting, but it's not a
| bad concept, even if largely a qualitative one. No one
| should use a single metric to toss ginormous architecture
| decisions - they're tools to inform educated judgement, not
| replace it.
|
| Re: fork/branch shenanigans, no, you're right, that's not
| an optimal way to handle variance . . in a normal
| programming language. In the context of natural language,
| it's not the same kettle of fish, because, well, lots of
| reasons, probably the most prominent being the "messy
| unidirectionality" of NL that's all mish mished with its
| extremely complex grammar vs constructed languages.
| Chopping up giant documents into tiny pieces a la CCS[1]
| systems has made this a stew of problems, but for some
| reason Leadership is fond of the idea. It's not unlikely
| that specialized on-prem LLMs are going to nuke the CCS
| concept from orbit in the next five years, except for those
| cases where the CCS is a contractual requirement for doing
| the work.
|
| [1] component content systems
| rdedev wrote:
| You got a good point about code evolution. Has anyone taken a
| look at it from a biological perspective? Seems like such
| problems can occur in genetics and nature might have come up
| with some tricks we can use
| ilyt wrote:
| > There are many instances I've encountered where two pieces of
| code coincided to look similar at a certain point in time. As
| the codebase evolved, so did the two pieces of code, their
| usage and their dependencies, until the similarity was almost
| gone. An early abstraction that would've grouped those
| coincidentally similar pieces of code would then have to
| stretch to cover both evolutions.
|
| Then you split that abstraction again. It's very cheap and very
| quick.
|
| Many people talk about the issue like it was an absolute in the
| code, but that's the wrong approach. If you end up writing 4
| functions that are the same, by all means, merge it into one.
|
| If then you need to add a parameter only this code path uses
| and the rest doesn't care about, by all means split it back. Moving
| blocks of code around is cheap.
| oxfordmale wrote:
| Splitting the abstraction is never cheap and quick, mostly
| because of politics. With duplicated code you often can
| assign a single responsible owner to each duplication.
|
| However, once abstracted, the code may suddenly be used by a
| number of different teams. You will need to get this work on
| their roadmap, increasing the friction to get this done. In
| many companies, this will also end up in endless discussions
| about the new approach.
| yellowapple wrote:
| Solution there would be to make the abstraction "opt-in",
| such that a team can elect to duplicate or abstract as
| desired. Also helps if the "main" abstraction is itself
| composed from smaller abstractions, from which downstream
| teams could then pick-and-choose rather than having to
| either fully abstract or fully duplicate.
| esafak wrote:
| This is a good point. Following Conway's Law, a team may
| choose to duplicate code or do things theoretically
| sub-optimally simply to avoid having to deal with other teams.
| jrumbut wrote:
| I think the key here is the oft repeated but often poorly
| understood maxim to favor composition ("has a") over
| inheritance ("is a").
|
| If you have a mixin (or other means of composition) that you
| use in several places and one diverges, it's easy to remove
| it. If you use inheritance, it's going to be more painful.
|
| A language that offers OOP via prototypes instead of classes
| like JS can (sometimes) give you the best of both worlds, but
| it will confuse a lot of devs who aren't familiar with that
| kind of OO design.
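
One way to picture the composition point, as a minimal Python sketch. The class names (`CsvExporter`, `InvoiceReport`, `AuditReport`) are hypothetical and not taken from the thread; the assumption is that two report types start out sharing one export behavior and one of them later diverges.

```python
class CsvExporter:
    """Shared behavior, attached by composition ("has a")."""

    def export(self, rows: list[dict]) -> str:
        if not rows:
            return ""
        header = ",".join(rows[0].keys())
        lines = [",".join(str(value) for value in row.values()) for row in rows]
        return "\n".join([header, *lines])


class InvoiceReport:
    def __init__(self) -> None:
        # The shared piece is a replaceable part, not a base class.
        self.exporter = CsvExporter()

    def render(self, rows: list[dict]) -> str:
        return self.exporter.export(rows)


class AuditReport:
    """Diverged: it no longer wants CSV, so it simply stops using the exporter."""

    def render(self, rows: list[dict]) -> str:
        return "\n".join(f"{row['id']}: {row['event']}" for row in rows)
```

Had both reports inherited from a common CSV base class, the divergence in `AuditReport` would mean overriding or reshaping the parent rather than dropping one attribute.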
| BWStearns wrote:
| Agreed. Abstractions also tend to be more resistant to change,
| both from a technical level, and a social level.
|
| At a technical level an abstraction will have more call sites
| to worry about in different contexts, the more wrong the
| initial abstraction the harder it will be to change.
|
| The social level is maybe even more problematic. Abstractions
| seem more important than calling code and will experience more
| friction in code review. This change friction can also increase
| with the "wrongness" of the initial abstraction. The starting
| point makes less sense so a reviewer needs to work more to
| understand the context. If the abstraction is gnarly enough
| then it's possible that the reason for the abstraction is
| almost obscured. Even someone who knows _how_ it works might
| have lost the forest through the trees and push back on changes
| that simplify it or improve it if the change is a sufficiently
| large departure from the initial state. In this case you can
| often see small incremental changes get added easier but this
| just makes the shared code a bit gnarlier for next time.
| tetha wrote:
| > At a technical level an abstraction will have more call
| sites to worry about in different contexts, the more wrong
| the initial abstraction the harder it will be to change.
|
| As I recently called it, infrastructure and systems lose
| agility as they gain dependency and move down the stack.
|
| If you have like 1 customer and they have good retries,
| honestly: fuck everything. Deploy master, in fact, deploy
| every keystroke to prod. It'll be fine.
|
| At the same time, about 30k - 40k FTEs of our B2B customers
| depend on one of my Postgres instances during business hours
| and about twice of that during different holiday seasons.
| Honestly? Nothing touches the system-level settings of these
| database systems unless we have pondered a change for 2
| weeks. And even then we will schedule an approved change over
| 4 weeks across applicable postgres clusters. The carnage a
| bad change at this level can cause is ridiculous enough to
| not be.
| ljm wrote:
| This is my beef with naively applied DDD, separation of
| concerns, and design patterns.
|
| Usually what happens is the 'clean' code ideal comes first,
| and then the implementation is squeezed into it. This then
| informs the organisation (or architecture) of the rest of the
| codebase and your software design has become a matter of
| putting pegs into the right-shaped holes.
|
| I have _never_ found that kind of highly abstracted code
| easier to work with than some simple procedural alternative
| that is easy to delete and easy to refactor, so long as
| effort was put into writing it well.
|
| Of course, the patterns have a purpose and do help when used
| nicely - a lot of code you write will fall into some of those
| patterns even without you explicitly mentioning it. It's
| just...doing it for the sake of it is a problem.
| yellowapple wrote:
| > An early abstraction that would've grouped those
| coincidentally similar pieces of code would then have to
| stretch to cover both evolutions.
|
| In that case, my takeaway would be that it ain't the
| abstraction itself that's wrong, but the unwillingness to get
| rid of it (or decompose it) when it no longer serves its
| purpose.
| bcrosby95 wrote:
| Given a long enough timeline, every abstraction turns wrong.
|
| The answer isn't to not abstract, the answer is to tear it out
| when it turns wrong. That was actually the original point of
| the popular article that streamlined this view - that we
| shouldn't be afraid of tearing them out, not that we shouldn't
| make them in the first place. Most people just read headlines
| though.
| lolinder wrote:
| The resistance to tearing out a bad abstraction isn't just
| cultural: combining two different functions into one is a
| lossy operation, which makes splitting an abstraction harder
| than creating it in the first place.
|
| While the functions are distinct the call sites are self-
| documenting. You know which calls are for which purpose
| because the names are different. After combining them to
| deduplicate the code, you've lost that information, and to
| disentangle the abstraction now requires you to infer and
| reintroduce that lost information.
|
| It's not that it can't be done, but there is real friction
| that doesn't just exist in people's heads.
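
A small Python sketch of the information loss described above. The function names and the 5% figure are hypothetical; the assumption is that two unrelated business rules happen to share the same arithmetic at one point in time.

```python
# Before merging: two functions whose names record intent at each call site.
def shipping_fee(subtotal: float) -> float:
    return round(subtotal * 0.05, 2)


def late_payment_penalty(balance: float) -> float:
    return round(balance * 0.05, 2)


# After merging: one deduplicated helper. The arithmetic is identical,
# but the call sites no longer say which purpose they serve.
def five_percent(amount: float) -> float:
    return round(amount * 0.05, 2)


order_total = 200.0 + five_percent(200.0)   # shipping or penalty? the name no longer says
overdue_total = 80.0 + five_percent(80.0)   # same question here
```

Splitting `five_percent` back apart later means re-deriving, call site by call site, which of the two rules was originally meant.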
| abathur wrote:
| I think the difficulty of making the right decisions
| without this lost information is well-observed.
|
| I wrote a short post in roughly this idea space last year:
| https://t-ravis.com/post/doc/what_functions_and_why_functio
| n...
|
| It feels like the same thread you're describing, but I
| guess it's pulling on the other end of it. It's thinking
| about how to name things in a way that makes it easier to
| see that the implementations might diverge later, and
| simplify actually doing so (by preserving more of this
| intentional context).
| mostlylurks wrote:
| > An early abstraction that would've grouped those
| coincidentally similar pieces of code would then have to
| stretch to cover both evolutions.
|
| This seems to be the underlying assumption behind most uses of
| the "duplication is cheaper than the wrong abstraction" quote,
| but the assumption is simply incorrect. You should almost never
| try to expand abstractions in this manner. If you don't treat
| the abstractions relating to the thing you want to change in
| your codebase as "the" place where you need to make your
| change, and instead eagerly make new abstractions and throw old
| ones away as required, you won't really run into this problem.
|
| In fact, this predominant mindset where creating abstractions
| is strongly discouraged leads to the very problem that mindset
| is based on, as it will simply encourage junior developers and
| the like to modify the existing abstraction, creating the
| aforementioned kind of mess where abstractions become
| complicated through repeated modification, instead of creating
| new abstractions when appropriate, because creating
| abstractions has a stigma attached to it.
|
| Additionally, if someone has made a "wrong" abstraction based
| on something silly like two pieces of code simply being similar
| in terms of their structure and those use cases start to drift
| apart, you should feel eager to simply split apart the
| abstraction, be it into bare implementations or two new
| abstractions, or any other combination. Abstractions are cheap
| as long as you don't give them special significance.
| jameshart wrote:
| When an abstraction evolves to a point where it needs to be
| split into two separate implementations to meet diverging
| needs...
|
| _you will need to replace that abstraction with
| duplication_.
|
| Which is the right thing to do because that duplication is
| cheaper than maintaining the wrong abstraction.
|
| I think this post makes the mistake of thinking that the only
| way in which duplication comes up is that it is discovered in
| the codebase, and we have the choice of abstracting it away
| or keeping it.
|
| On the contrary, duplication can - and should - be
| consciously introduced to fix bad abstractions when we find
| _them_ in the codebase.
| cratermoon wrote:
| > When an abstraction evolves to a point where it needs to
| be split into two separate implementations to meet
| diverging needs... _you will need to replace that
| abstraction with duplication._
|
| Hard disagree. When the formerly common parts of an
| abstraction evolve to no longer be common, then that
| duplication no longer exists. There now exists two
| abstractions, one for each of the diverging needs. There
| may be some leftover commonality that can be abstracted
| out, but it's no longer the original abstraction.
| hannasanarion wrote:
| The point is that they were never actually common in the
| first place, only superficially similar.
|
| You're saying we should look for duplications, abstract
| them, and then every time a change needs to be made to
| the abstraction to suit only one of the use cases,
| refactor the codebase to de-abstract and re-duplicate,
| undoing the work we did in the name of DRY in the first
| place.
|
| That is a lot more work and a lot more confusion and a
| lot more headache for maintainers and reviewers than
| copy-pasting the thing the first time, having realized
| that the duplication was incidental, not structural.
|
| Let's take this line of reasoning to its extreme:
|
| I notice that there's a section of my code that's
| repeated twice where we add one to a value, so I abstract
| it into a function called add1(x:int). Some time later,
| at places where add1 is used we sometimes need to
| actually add a value other than one, so we need to make a
| decision: do we refactor everything and re-duplicate, or
| do we stick to the DRY principle and make our abstraction
| more accommodating? The path of least resistance is to
| stick to DRY because it's a smaller and more
| comprehensible commit, so we add an optional arg, add1(x:
| int, operand?: int). Some time later one of the callers
| to this function needs to pass a vector instead of a
| single value, so we need our add1 function to have
| polymorphism and conditional logic in it now, and
| potentially more arguments. Sooner or later we have a
| frankenfunction that's hundreds of lines long and
| branches a bazillion ways and might as well be a turing
| machine in itself.
|
| Dogmatic adherence to DRY leads to madness.
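
The first three steps of that progression, sketched in Python. The `_v1`/`_v2`/`_v3` suffixes are added here only so the stages can sit side by side; the comment above describes a single `add1` being modified in place.

```python
from typing import Sequence, Union

Number = Union[int, float]


# Stage 1: the original deduplication.
def add1_v1(x: int) -> int:
    return x + 1


# Stage 2: one caller needs a different operand, so an optional argument appears.
def add1_v2(x: int, operand: int = 1) -> int:
    return x + operand


# Stage 3: another caller passes a vector, so type checks and branches appear.
def add1_v3(x: Union[Number, Sequence[Number]], operand: Number = 1):
    if isinstance(x, (list, tuple)):
        return [item + operand for item in x]
    return x + operand
```

By stage 3 the name no longer describes the behavior, and every further caller-specific need adds another parameter or branch to the same function.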
| cratermoon wrote:
| > You're saying ... refactor the codebase to de-abstract
| and re-duplicate, undoing the work we did in the name of
| DRY in the first place.
|
| That's the exact opposite of what I'm advocating for, but
| perhaps I didn't express myself well.
|
| > Sooner or later we have a frankenfunction that's
| hundreds of lines long and branches a bazillion ways and
| might as well be a turing machine in itself.
|
| Yeah, that's not a good abstraction, and not at all what
| I meant.
| seadan83 wrote:
| To some extent I agree, though I don't think DRY means to
| remove all similar looking lines of code and put that
| behind a procedure. Generic code vs abstractions are
| different.
|
| Instead, any given task (which already is an abstraction)
| should exist in only one place. That is DRY, I would
| paraphrase it to mean any given abstraction should be
| done in one place (and combine with SRP to say further
| that one place should only do that one abstraction)
|
| If one place can be updated independently of another, it
| argues it is not the same task to begin with. DRY'ing
| that code is a misnomer IMHO, instead that code is being
| put behind a procedure and is being made generic (and not
| necessarily more abstract. Abstracting hides details,
| putting a block of code behind a procedure with full
| parameterization is not hiding details, it's just a
| procedure [and let us hark back to the days of procedural
| programming and ways that can become mess])
|
| DRY and SRP (single responsibility principle, AKA the DnD
| principle) need to be considered together.
| ajuc wrote:
| In many cases you cannot see the correct abstraction
| without introducing the duplication back. When working
| with particularly messy code I often do sort of
| https://en.wikipedia.org/wiki/Karnaugh_map of important
| variable states to see what actually happens before I can
| refactor it.
|
| This is basically introducing the duplication back.
|
| Whether you keep the duplicated code or refactor it in a
| different way is another question, what matters for the
| "duplication is cheaper than wrong abstraction" to be
| true is just the fact that by introducing abstraction
| early you wasted time refactoring one way and back.
| Refactoring isn't free. So in fact leaving the
| duplication there would have been cheaper - Q.E.D.
|
| It doesn't mean you should never risk it, but it does
| mean you should think hard before you do it.
| sweezyjeezy wrote:
| I think there's a middle ground here. The original quote does
| not mean DRY=bad, abstraction=bad. The point is there is a
| non-zero cost to these things. A bad abstraction can, as you
| say, accumulate to something terrible through inertia or
| inexperience. A bad abstraction, even if caught early, was
| probably not worthwhile - I mean, it took time just to make
| the original one right? This does not mean that we should be
| scared of abstraction in general, but in my opinion
| abstractions that are purely for the sake of reducing
| duplication should be viewed with an extra level of
| apprehension.
| TOGoS wrote:
| > You should almost never try to expand abstractions in this
| manner
|
| But somebody on your team /will/.
|
| > you should feel eager to simply split apart the abstraction
|
| Sure, but it's going to be a lot more work at this point than
| if we had avoided the mess in the first place.
| bheadmaster wrote:
| There's a very good quote on a programming blog [0] that I
| enjoyed reading:
|
| > Repeat yourself to avoid creating dependencies, but don't
| repeat yourself to manage them.
|
| [0] https://programmingisterrible.com/post/139222674273/write-
| co...
| CuriouslyC wrote:
| Duplication can sometimes be useful, for instance if you have
| many small variations on a central process. Trying to make one
| process with all the edge cases baked in leads to overly-
| complex, hard to reason about, expensive software.
|
| In my experience, the right way to handle this sort of
| situation is to create a functional mini-DSL for the process
| that handles all the implementation details, then create a
| "default" process which serves as a template. If a process
| needs slightly different logic, just copy the template, update
| the DSL to support any new logic, and update the template with
| the new DSL statements. This approach lets you give semantic
| meaning to implementation details, and you can see where all
| the different custom logic is at a glance by looking at all the
| template copies. As long as the template is only calling out to
| DSL actions with no internal logic of its own and process flow
| is correctly encapsulated in the DSL, you should never need to
| update templates to change behavior, only update the DSL.
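
A minimal Python sketch of the mini-DSL-plus-template idea described above, under the assumption that a "process" is a record-cleaning pipeline. All names (`strip_whitespace`, `rename`, `to_cents`, `run`, `DEFAULT_IMPORT`, `VENDOR_B_IMPORT`) are hypothetical illustrations, not anything from the comment.

```python
from typing import Callable

Record = dict
Step = Callable[[Record], Record]


# --- The "DSL": small named actions that hold all implementation detail ---
def strip_whitespace(field: str) -> Step:
    def step(record: Record) -> Record:
        return {**record, field: record[field].strip()}
    return step


def rename(old: str, new: str) -> Step:
    def step(record: Record) -> Record:
        out = {key: value for key, value in record.items() if key != old}
        out[new] = record[old]
        return out
    return step


def to_cents(field: str) -> Step:
    def step(record: Record) -> Record:
        return {**record, field: round(float(record[field]) * 100)}
    return step


def run(steps: list[Step], record: Record) -> Record:
    """Process flow lives here, in the DSL, not in the templates."""
    for step in steps:
        record = step(record)
    return record


# --- The "default" template: only DSL calls, no logic of its own ---
DEFAULT_IMPORT = [strip_whitespace("name"), to_cents("amount")]

# --- A copied template for a source that names its field differently ---
VENDOR_B_IMPORT = [rename("customer", "name"), strip_whitespace("name"), to_cents("amount")]
```

The templates are duplicated on purpose; the differences between them are visible by reading the two lists side by side, while a bug in, say, `to_cents` is fixed once.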
| augustk wrote:
| DSL = Domain-specific language (I guess).
|
| Always a good idea to expand an abbreviation the first time
| it's used.
| amalcon wrote:
| This way of doing things (which I agree is often the correct
| way) is the reason for Greenspun's Tenth Rule:
| https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule
|
| Though it's less true today and in languages that are not C
| or Fortran. Even something like C++ or Java has the template
| method pattern, which gets you 80% of the way there. Dynamic
| languages like Python or Ruby tend to have pretty reasonable
| facilities for building DSLs, as do more modern languages
| like Scala and Rust.
| crazygringo wrote:
| > _is to create a functional mini-DSL_
|
| Exactly. Is there a formal term for this?
|
| Instead of one gigantic function with 50 parameters, you have
| 100 "template" functions, that all make use of 60 different
| "helper" functions (what you're calling the DSL).
|
| Instead of castles-of-logic abstraction, it's nuts-and-bolts
| or grass-roots abstractions. I've never come across a name
| for this development style.
|
| But it generally works extremely well when building processes
| for tens/hundreds of data formats or customers or what have
| you.
| fiddlerwoaroof wrote:
| This sounds like what lispers call something like "language
| driven design" or "growing a language"
| consilient wrote:
| > I've never come across a name for this development style.
|
| Libraries designed like this are sometimes called
| "combinator libraries".
| code_biologist wrote:
| Embedded DSLs is the term I've seen in the Haskell, Scheme
| and Ruby communities.
| tracker1 wrote:
| This is generally my approach to data ingress/egress (ETL)...
| I'd rather have a hundred similar, small scripts for each
| data source than try to create one complex (monstrosity)
| application to handle them all.
| codegeek wrote:
| Also, you become a better programmer if you write duplicate
| code and then learn how to abstract it for cases that make
| sense. I also don't believe that dupe code is always a bad
| thing. Like everything else in software engineering, IT
| DEPENDS.
| bena wrote:
| The answer, as always, is "sometimes".
|
| _Sometimes_ duplication is cheaper than the wrong abstraction.
|
| And
|
| _Sometimes_ it's better to abstract away a duplication rather
| than let it lie.
|
| And that's the mark of becoming a master at the craft. Being
| able to recognize all of these various slight permutations of
| state and what to do about them.
| rightbyte wrote:
| Rules of thumb really need to be told like this. Or they will
| be misused, either by newbies who don't know any better
| or by unpleasant programmers who will shove their dogmatic
| beliefs down your throat with the common wisdom as an excuse.
| awkward wrote:
| A good example of this is operations type stuff, like the pile
| of shell scripts or terraform files or whatever that get used
| to deploy your app. These scripts benefit greatly from a one to
| one relationship between the thing you're creating and the
| written text describing it. Not having a situation where
| changing one thing breaks everything else is a huge help there.
| capableweb wrote:
| As a FYI, just as it's OK to abstract away duplication in code,
| it's OK to do the opposite, remove abstraction and add
| duplication.
|
| So in your particular case, it could have been possible to
| abstract away the code _at that point in time_ and once they
| diverge, remove the abstraction and duplicate, then adjust one
| of the duplicates (which no longer is a proper duplicate
| really).
|
| But, might be more work than it's worth. YMMV.
| patrick451 wrote:
| > As a FYI, just as it's OK to abstract away duplication in
| code, it's OK to do the opposite, remove abstraction and add
| duplication.
|
| > So in your particular case, it could have been possible to
| abstract away the code at that point in time and once they
| diverge, remove the abstraction and duplicate, then adjust
| one of the duplicates (which no longer is a proper duplicate
| really).
|
| This sounds nice in theory, but the reality is that the
| effort required to make these two kinds of changes is not
| symmetric. It's about 10 times easier to get a PR approved
| and merged that combines similar looking code into a function
| than vice versa. If you have any suspicion at all that an
| abstraction you're making may need to be removed and
| duplicated in the future, you're better off just never
| abstracting in the first place.
|
| It sucks pushing a change which unwinds an abstraction like
| that through code review. It's usually a lot easier to just
| never abstract it in the first place.
| mdiesel wrote:
| Equality doesn't necessarily mean Equivalence.
| Gibbon1 wrote:
| My problem is often that writing a function to remove
| duplication brings up the question of where to put it. If it's
| only called inside one module, it doesn't really matter. But if
| not, you've created a dependency. Which is bad.
|
| I think how much you hate that may depend on your language and
| the program. Some big enterprise Java monolith is a garbage
| dump of thousands of small files. So who cares. In C without
| name spaces and the need for headers you care more.
| schwartzworld wrote:
| The problem is that "duplication is cheaper than the wrong
| abstraction" is basically an excuse that lazy devs use not to
| engineer their code.
|
| The other one I hear a lot is "it's not realistic to reach 100%
| test coverage / type safety" when submitted code with `any` all
| over it and zero tests.
| __alias wrote:
| I buy into the same belief as you here, but I guess you could
| easily argue that you could create a suitable fitting
| abstraction earlier on with the understanding that you can
| "detach" them once the point that they're fundamentally
| different comes
| consilient wrote:
| The point of abstraction is to reduce the number of concepts
| in play. If you're still tracking which old concept is
| "really" being used every time, you haven't actually
| abstracted over anything, you're just naming things badly.
| capableweb wrote:
| > The point of abstraction is to reduce the number of
| concepts in play.
|
| I'm not sure I agree with this. For me, the point of
| abstraction is to divide the number of concepts between the
| layers you introduce, effectively to hide concepts from the
| layers where you don't want to have to care about them.
| Often times, abstractions add to the total number of concepts
| at play, but hide them beneath/above the layers.
| fluoridation wrote:
| The problem is that there's an impetus to continue working on
| top of established facilities, because it's usually
| incrementally less work than reworking a piece of code into
| something else. Plus it's difficult to recognize ahead of
| time when something is about to become a problem, rather than
| fix something that's already a problem.
| feoren wrote:
| You're absolutely right that it's important to look beyond how
| two modules superficially look right now, and look instead at
| how they _change_. However, if you've always defined your
| abstractions based on what their consumers _need_ rather than
| what their implementations _have_, then you shouldn't ever
| need to stretch them. They're not trying to "cover" both cases,
| they're trying to solve a problem that both cases have. Your
| two cases are not implementations of the abstraction, they are
| consumers of it. If one case grows to not have that problem, it
| just stops asking for that abstraction. If it grows to have
| more problems, it just asks for more abstractions. The original
| abstraction, if based on a common need, doesn't have to change.
|
| That's not to say abstractions never change -- they do. But
| they change because your understanding of the sub-problem
| they're solving has changed, not because their implementations
| or consumers have changed.
| peeters wrote:
| I try to think about whether two concepts are innately similar
| or incidentally similar. Computing compounding interest for a
| home equity loan and a mortgage might be innately similar. A
| desired change to one will probably make a desired change to
| the other. Computing growth of a fruit fly population and
| computing compound interest for a loan might be
| incidentally similar. Until you change your
| "computeExponentialGrowth" function to now handle occasional
| decimations from environmental sources, and anyone looking at
| the code wonders what the heck that looks like for a loan.
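|
| A minimal sketch of the "incidentally similar" case, assuming
| hypothetical function names and yearly compounding. Keeping the two
| calculations separate means the decimation logic never leaks into
| the loan code, even though both started out as the same formula.

```python
def compound_interest_balance(principal: float, annual_rate: float, years: int) -> float:
    """Balance of a loan that compounds once per year (illustrative only)."""
    return principal * (1 + annual_rate) ** years


def fruit_fly_population(
    initial: float,
    growth_rate: float,
    generations: int,
    decimation_every: int = 0,
    survival: float = 1.0,
) -> float:
    """Population after exponential growth with optional periodic die-offs.

    `decimation_every` and `survival` are hypothetical parameters; they have
    no meaning for a loan, which is why this function is not shared.
    """
    population = initial
    for generation in range(1, generations + 1):
        population *= 1 + growth_rate
        if decimation_every and generation % decimation_every == 0:
            population *= survival
    return population
```

| Without die-offs the second function reduces to the same formula as
| the first, which is exactly the coincidence that tempts a merge.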
| HWR_14 wrote:
| As interest rates go back up, paying off (part of) a mortgage
| early might come back.
| adammarples wrote:
| If you've got your abstractions correct, then the exponential
| growth term and the decimation term will be partial
| differentials which will compose together nicely
| AnimalMuppet wrote:
| For a loan, maybe it looks like payments on the principal?
|
| But your overall point is _very_ correct. Don't make an
| abstraction because of coincidence.
| msluyter wrote:
| I think one example where duplication > abstraction is in
| tests. I personally find tests that have a ton of extra helper
| classes/functions to do stuff like set up fixtures or do
| assertions to be painful to deal with. Taken to an extreme you
| end up with a mini test framework that obscures the actual test
| cases and is as hard to understand as the code in question.
|
| I'm not against shared test fixtures or some utility functions,
| but IMHO, it's better to have some duplication but clearer
| tests.
| echelon wrote:
| Fully agree.
|
| I would add that you should duplicate the common, cross-
| cutting setup (eg. faked/mocked dependencies that don't
| matter), but make the test conditions themselves explicit.
|
| You get a feel for the correct granularity the more tests you
| write within the codebase. If you try to be too clever in
| saving boilerplate, you'll cause pain for future
| modifications and maintainers. Sometimes fixing "clever"
| tests takes longer than the code change itself.
| abcdaiojjdfoj wrote:
| I like it when you have nice, composable utility functions.
| Ideally each test contains a short preamble setting up the
| appropriate context for the test to run. The preamble
| elucidates what the tests are actually testing. It can also
| serve as documentation on how to use those functions.
|
| There will probably be _some_ duplication across tests, but
| if the utility functions are idempotent/composable, they're
| usually pretty easy to read/understand and equally mechanical
| to write/update.
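|
| A rough sketch of that style, assuming a made-up `Cart` class and
| helper names (`with_item`, `with_coupon`). Each test spells out its
| own preamble with small helpers instead of relying on one shared
| fixture, so some setup is repeated but every test reads on its own.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Cart:
    """Hypothetical code under test; prices are integer cents."""
    items: Dict[str, Tuple[int, int]] = field(default_factory=dict)
    coupon: Optional[str] = None

    def total(self) -> int:
        subtotal = sum(price * qty for price, qty in self.items.values())
        return subtotal * 90 // 100 if self.coupon == "TEN_OFF" else subtotal


# Small composable helpers: each does one thing and returns the cart.
# Calling one twice with the same arguments leaves the cart unchanged.
def with_item(cart: Cart, name: str, price: int, qty: int = 1) -> Cart:
    cart.items[name] = (price, qty)
    return cart


def with_coupon(cart: Cart, code: str) -> Cart:
    cart.coupon = code
    return cart


def test_total_without_coupon() -> None:
    cart = with_item(with_item(Cart(), "apple", 200, qty=3), "pear", 400)
    assert cart.total() == 1000


def test_coupon_takes_ten_percent_off() -> None:
    cart = with_coupon(with_item(Cart(), "apple", 1000), "TEN_OFF")
    assert cart.total() == 900
```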
| ezekg wrote:
| > I personally find tests that have a ton of extra helper
| classes/functions to do stuff like set up fixtures or do
| assertions to be painful to deal with.
|
| I think it depends on the context. For example, I typically
| agree, but when I was writing authz tests [0], I ended up
| writing a DSL so that 1) I'd be more inclined to write the
| thousands and thousands of tests, and 2) I'd be able to focus
| on the actual authz assertion and not on verbose setup.
|
| I couldn't imagine writing those policy tests without that
| abstraction. I would have lost my mind with all of the
| repetition, and would have almost assuredly made mistakes.
|
| [0]: https://github.com/keygen-sh/keygen-
| api/blob/master/spec/pol...
| HumanOstrich wrote:
| Thank you for the link. This is inspiring. Do you have any
| resources you could link to that would explain some or all
| of the style for these tests?
| crabbone wrote:
| So... you kept modifying the two similar pieces of code until
| they became dissimilar. Why do you think that you wouldn't be
| able to modify the abstraction if you saw that it doesn't fit
| anymore?
| convolvatron wrote:
| I think part of the issue here is that a fair number of
| programmers work in shops where they have very limited
| agency. They are tasked with making the minimum defensible
| change to add a feature or fix a bug. They are not allowed to
| change the tests or suggest refactoring. So those things just
| don't occur.
| AnimalMuppet wrote:
| In that situation, the correct thing to do is, when the two
| pieces drift away from each other, to recognize that they are
| no longer the same abstraction and to break the connection.
| That may be painful - you have to look at everywhere that
| abstraction is used and figure out which thing it really is,
| and change the code to reflect it.
|
| But if that's going to happen, then in the early days, a little
| duplication was probably better.
| [deleted]
| lukeramsden wrote:
| > There are many instances I've encountered where two pieces of
| code coincided to look similar at a certain point in time. As
| the codebase evolved, so did the two pieces of code, their
| usage and their dependencies, until the similarity was almost
| gone
|
| https://connascence.io/
| NewEntryHN wrote:
| The author did not understand the idea.
|
| His description of his understanding does not include any
| reference to the "wrong"-ness of abstractions that shouldn't
| exist. If I read him as-is, I should conclude that the idea is to
| never make any abstraction at all. It obviously cannot be it
| since that would be stupid.
|
| "Wrong" abstractions are already bastardized, from their first
| iteration. Developers decide to code them nonetheless because
| they estimate that their "awkwardness" is worth it in comparison
| to code duplication. What they fail to realize is that, unlike
| code duplication, which just "is there", the
| awkwardness of the abstraction will compound.
|
| Duplication is the last resort, when one has established that he
| couldn't find any non-wrong abstraction.
| gumby wrote:
| An important context is the use case. Grossly speaking, business
| applications tend to have a shorter lifetime and faster cycle
| time than system code like, say, the Linux kernel or gcc. So the
| cost of refactoring in the latter case is amortized over longer
| timescale; when you have rapid business needs it can often be
| better to just make the change in two or three places and move
| on because in a few years the whole thing will be replaced.
|
| We all know of exceptions to those examples (quick-and-dirty code
| that survives decades later) but I think that's the way to think
| about it.
| waffletower wrote:
| I have my umbrella at the ready for a downvote hailstorm: it
| makes perfect sense that the OP is hearing this repeated in the
| Rails community, as they are already enmired in the wrong
| abstraction ¯\_(ツ)_/¯
| cochne wrote:
| > Not every piece of code is an abstraction of course. To me, an
| abstraction is a piece of code that's expressed in high-level
| language so that the distracting details are abstracted away. If
| I were to see a confusing piece of code littered with conditional
| logic, I wouldn't see it and think "oh, there's an incorrect
| abstraction", I would just think, "oh, there's a piece of crappy
| code". It's neither an abstraction nor wrong, it's just bad code.
|
| The wrong abstraction isn't crappy code itself. It is a
| reasonable looking piece of code that will force the next person
| into writing crappy code to accommodate it.
|
| Edit: I think the entire project of TensorFlow is a good example
| of this. They built the library around a "graph" entity, and
| anything you did had to be shoehorned to fit that. That worked OK
| for some straightforward neural networks and situations for a
| while. As the area evolved though, it proved very burdensome.
| They tried to evolve it into TensorFlow 2.0 which was more
| forgiving, but by that point it was too late, the ecosystem
| became a mess. PyTorch stole the thunder because they didn't make
| the wrong abstraction (though I'm not sure if "duplicating" is
| what helped them do that)
| Strilanc wrote:
| One of the major shifts in my coding style over the past ten
| years has been to increase the amount of duplication. My
| threshold for "I should really dedupe that" increased from ~3:7
| lines to ~10:50. Looking back this was driven by two main
| factors: testing and performance optimization.
|
| The testing side is just that tests become awful much faster than
| normal code if you dedupe them. Unit tests are supposed to be
| simple and independent, but deduping makes them correlated and
| complex. You think you'll make things simpler by extracting the
| common setup from twenty tests into one method, but instead
| you've coupled the tests so they can't individually be tweaked
| and laid the seeds for a monster incomprehensible test object to
| grow from.
|
| The performance side is that often improving performance requires
| removing abstraction layers so everything is in one spot,
| allowing irrelevant cases to be removed. Adding the abstraction
| layers ahead of time makes performance worse to start with, from
| all the jumping and "paper over one more difference" flag
| checking, and also makes performance improvements harder later.
|
| If two things are supposed to behave analogously, I'm nowadays
| much more likely to enforce this by testing the analogy rather
| than by sharing the implementation.
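|
| One way to read "testing the analogy", sketched with two made-up
| functions: a list mean and a streaming mean keep separate
| implementations, and a test asserts that they agree on sample
| inputs. The names and sample data are assumptions for illustration.

```python
import math
from typing import Iterable, Sequence


def mean_of_list(values: Sequence[float]) -> float:
    """Straightforward mean over an in-memory sequence."""
    return sum(values) / len(values)


def mean_of_stream(values: Iterable[float]) -> float:
    """Running mean that never materializes the input."""
    count = 0
    mean = 0.0
    for value in values:
        count += 1
        mean += (value - mean) / count
    return mean


def test_means_agree() -> None:
    # The analogy is enforced here, not by a shared implementation.
    samples = [[1.0], [1.0, 2.0, 3.0], [0.5, -0.5, 10.0, 2.25]]
    for sample in samples:
        assert math.isclose(
            mean_of_list(sample),
            mean_of_stream(iter(sample)),
            rel_tol=1e-9,
            abs_tol=1e-12,
        )
```

| Either function can now be tuned independently; the test fails only
| if their observable behavior drifts apart.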
| yafbum wrote:
| Let me give an example bad abstraction that isn't due to littered
| conditionals, but still very bad.
|
| One time company A had a database, and code that loaded persisted
| object state from them. Some of the objects could be soft
| deleted. Rather than check various objects for soft deletion, the
| team decided to check all objects for soft deletion, regardless
| of their type, by querying a table where objects had to be listed
| if they were still live (not soft deleted).
|
| Fast forward a few years, everybody follows this pattern, and
| there is massive hotspotting of that central "object lifetime"
| table that has basically two columns (object_id, is_deleted) that
| becomes a latency bottleneck because absolutely everything is
| joining on it all the time.
|
| Truth is, it made it convenient to code with this, because you
| never had two ways of checking whether an object was live, and by
| construction you could never make the mistake of operating on a
| soft deleted object or forgetting to implement lifecycle
| deletion.
|
| But man was that a poor abstraction. It was probably redundant
| with database functionality. It gave soft deletion capabilities
| even to things that didn't need soft deletion. It had a
| significant latency cost. But everybody adding a new object type
| just picked it because it was the way the company has decided it
| would do soft deletion.
| mannykannot wrote:
| I feel you are describing an implementation that was once fine
| but is no longer satisfactory, rather than an abstraction,
| which perhaps could have been made easier to fix with a bit
| more abstraction: a function to do the soft deletion if
| possible, with a better-performing (albeit probably more
| complex) way of determining whether soft deletion was an
| option.
| harrisonjackson wrote:
| > Fast forward a few years
|
| Sounds like it was just what was needed at the time and worked
| better/longer than most abstractions.
| RandallBrown wrote:
| > Sounds like it was just what was needed at the time
|
| The problem I've seen often in codebases is that as an
| abstraction or pattern grows more unwieldy, they don't take
| the time to update it.
|
| They often don't get revisited until they're so bad that they
| can't be ignored.
|
| Handling something with a switch or if/else is fine if
| there's only 2 or 3 options, but people will often just keep
| piling on. When it's 10 things, changing it becomes much more
| work so people will continue to add to it. Then when it
| breaks at 20 things, someone will come in and say "Why did we
| write it this way in the first place? It doesn't make any
| sense!"
|
| I'm often torn between pragmatically writing the simplest
| code possible and being proactive about abstracting early to
| prevent an eventual breakdown of the pattern.
| edgyquant wrote:
| Why is a switch bad? Python uses a giant switch statement
| to run its opcodes
| Pannoniae wrote:
| How does a switch break at 20 items? Any respectable
| compiler or interpreter should handle that fine. If it was
| 32k cases, I could imagine why it would raise an error. But
| 20? Seriously?
|
| Often, writing more cases into a switch statement is way
| easier and less boiler-plate-y than abstracting it out to
| subclasses or a dictionary or whatever.
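|
| For comparison, a small sketch of the two shapes being discussed,
| using a made-up area calculation: a plain if/elif chain and a
| dictionary of functions. Both behave the same here; which one is
| less boilerplate depends on how the cases grow.

```python
import math


def area_if_chain(shape: str, size: float) -> float:
    """Plain branching: adding a case means adding one more elif."""
    if shape == "square":
        return size * size
    elif shape == "circle":
        return math.pi * size * size
    elif shape == "triangle":
        return math.sqrt(3) / 4 * size * size
    raise ValueError(f"unknown shape: {shape}")


# The same cases as a lookup table of small functions.
AREA_BY_SHAPE = {
    "square": lambda size: size * size,
    "circle": lambda size: math.pi * size * size,
    "triangle": lambda size: math.sqrt(3) / 4 * size * size,
}


def area_table(shape: str, size: float) -> float:
    """Table dispatch: adding a case means adding one more dict entry."""
    try:
        return AREA_BY_SHAPE[shape](size)
    except KeyError:
        raise ValueError(f"unknown shape: {shape}") from None
```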
| arein3 wrote:
| It depends. If it's not some core area of the code, but more like
| a script, some code that lives at the periphery, it might be
| better to "duplicate" almost similar code that is hard to
| abstract.
|
| I saw attempts to remove "duplication" that made the code so
| hairy and hard to read, as opposed to very readable. I put
| duplication in quotes, because code might be similar, but not
| 100%.
|
| Some code is easy to deduplicate.
|
| Some code might be hard, and if the overengineering is done to
| remove 2 occurrences at some code periphery, it is not worth it.
| geoguess wrote:
| I'm a copy+paste programmer, and proud of it. It's quicker,
| easier, and most importantly: someone else's problem to fix, if
| they're the type of developer who disagrees with this coding
| style.
|
| I'll keep churning out duplicated code and you guys can keep
| refactoring against it. We all get paid so what's the problem?
| disintegore wrote:
| Pragmatic genius
| Bjartr wrote:
| For some, there's more to job satisfaction/QoL than just money.
| seattle_spring wrote:
| I have a feeling you and I are not paid the same, so there's
| that.
| saurik wrote:
| As someone who has made a good life over the years by taking
| advantage of the security bugs (either to build my embedded
| empires--aka, jailbreaking--or to directly collect bounties)
| caused by all of the people who hate abstraction so much (or are
| merely so bad at doing it that they don't know how to do it well)
| that they vehemently argue that duplication is not merely a
| temporary pragmatic decision to incur potentially-dangerous
| architectural debt which you intend to come back and fix later
| but is somehow _better_ than even _trying_ to address it, I guess
| I find this discussion thread of people almost 100% tearing into
| this article's fundamental premise... kind of fun? ;P
|
| So, yes, yes: _please_ do continue to ensure you have so much
| boilerplate in your "flat and easy to understand" code that you
| eventually make a fatal mistake (potentially simply while doing a
| merge commit), refuse to factor your safety checks out into
| abstractions that prevent you from making the same mistake twice
| due to your refusal to "obfuscate the underlying API everyone
| knows how to use", and (my true favorite) litter your code with
| multiple implementations of the same algorithms that have _very
| subtle_ differences in them (so called "parser differentials")
| as you insist on every single programming language in use having
| its own copy of the algorithm "for ergonomic reasons, as IPC/FFI
| would be crazy when I can just import a second one off-the-
| shelf".
| rightbyte wrote:
| How do you know the code you reverse engineered was flat and
| simple and not Best Practice with Scrum on top?
| 1vuio0pswjnm7 wrote:
| Abstractions that are software language idioms are more palatable
| than bespoke ones.
| amelius wrote:
| What is wrong with duplication if you can just ask Copilot to
| deduplicate it whenever you want?
| eweise wrote:
| DRY to me means having a single authoritative source. So for
| instance, if I need to define a person data structure then I use
| protobuf. I can add validation rules, and types to it. I can
| generate bindings for java, go, ruby, etc and they can all rely
| on the same person structure, with the same validations. Code is
| technically copied but there is still a single authoritative
| source.
|
| If I need to handle bank transactions, then I will create a single
| "microservice" that knows how to create a transaction and update
| the account balance. I wouldn't want that logic duplicated in
| multiple places.
| bingemaker wrote:
| Wrong abstractions make the abstraction configurable, and it is a
| slippery slope. Keep adding more arguments, configuration
| options, and there is no end to it. Sometimes duplication is
| indeed cheaper
| geophile wrote:
| Don't generalize too soon. But don't wait too long either. If you
| have to choose one, wait too long and then refactor.
| 49531 wrote:
| I think mislabeling something as a duplication is where most of
| these issues stem from.
|
| Humans love to pattern match, we find patterns in things that
| often have no real pattern. It is not uncommon in my experience
| to see patterns in code, label the code as not DRY, and attempt
| to DRY it up. If the "duplication" detected was, in fact, not a
| duplication but rather code that just happens to be similar, the
| abstraction will often go awry.
|
| My rule-of-thumb is to prioritize maintenance over authorship. Am
| I writing this code in a way that makes it easier for future me
| or another programmer to change it, or am I optimizing for a
| sleek diff in my code review? I think our code can look like
| breadboards instead of a bespoke printed circuit board, we have
| compilers for that.
| cm2012 wrote:
| Not coding related, but the worst sins of corporate life (like
| strict procurement teams) stem from efforts at deduplication.
| shadowfoxx wrote:
| In my career as a software dev I've found one thing to be true -
| Every paradigm I ingest that opens new windows of opportunity is
| great at first pass, and the more I learn, the narrower the scope
| in which it can be applied. (This is kinda true in life, too. Like when
| people say, "It's econ 101 or bio 101." etc. What seems like a
| statement about 'common' knowledge is actually an indication of
| how shallow your knowledge is!)
|
| Specifically related to this topic is a talk by Dan Abramov
| called, "The Wet Codebase" - He says it better than I can sum up
| and has visual aids : https://www.youtube.com/watch?v=17KCHwOwgms
|
| Others have pointed out code that is similar in function vs
| similar by coincidence and I think that thought alone is worth
| chewing on.
| linuxftw wrote:
| If working in a solitary codebase, this problem isn't very
| interesting. Do whatever makes your life easier.
|
| If you're working on any kind of code that serves as a library to
| other code, don't mutate the signatures of your public
| methods/functions. Once that signature is released, the only
| changes to its output should be bug fixes. If you have a need
| for two very similar functions, you should use 2 wrapper
| functions with the common code in the 3rd.
| mcqueenjordan wrote:
| The crux of the argument is:
|
| > I think "the wrong abstraction" is a confused way of referring
| to poorly-de-duplicated code.
|
| But I believe this is similar to a no true Scotsman fallacy. "If
| you just make the right abstraction, de-duplicating is fine!"
|
| Yes, if you're good at making the right abstraction, it's not
| worse! Those are the cases when I definitely do the refactoring:
| when I know for sure I know the right abstraction. Otherwise, I
| defer the decision for an older, smarter, wiser me (or future
| maintainer).
| jkubicek wrote:
| It _is_ the "no true Scotsman" fallacy.
| vlunkr wrote:
| > To me, an abstraction is a piece of code that's expressed in
| high-level language so that the distracting details are
| abstracted away
|
| That might be what an abstraction is to the author, but it's not
| a correct definition. Abstraction has nothing at all to do with
| high- or low-level languages.
|
| https://en.wikipedia.org/wiki/Abstraction_(computer_science)
| TheOtherHobbes wrote:
| It's curious there's no formal concept of "unduplication" -
| splitting a single abstraction originally created to avoid
| duplication, now littered with conditionals and spaghetti, into
| separate abstractions that now do something unrelated.
| itsafarqueue wrote:
| This code golfing always boils down to "it depends". Senior
| engineers by definition nod sagely and everyone else looks around
| nervously. It's a tough break. Both approaches are correct and
| wrong. It depends.
| msie wrote:
| When I was younger I was more productive when I didn't
| contemplate such matters. Maybe I wrote a lot of junky code but I
| got a lot of working stuff done. Now my time is wasted reading
| clout chasers and their opinions. Reading about coding is such a
| bad habit when it stops you from coding.
| twodave wrote:
| I have been down both roads. I've seen unwieldy abstractions
| reduce a codebase down to a giant pile of edge cases, and I've
| seen codebases where making a single change to the design has
| required editing dozens of files. Where I've ended up over the
| years is to abstract the "big" things. The types that represent
| your domain. The pieces of the data layer that need to be exactly
| the same every time. After that, solve for large classes of
| problems. This may be an abstraction, a usage pattern, or just a
| function. Transaction management, logging, etc.
|
| Know that if you try to wrap ANYTHING in an adapter "in case we
| want to swap it out later" that this almost never happens, and
| when it does the abstraction you came up with is probably
| inadequate. Transaction handling in one tech is different than
| another. Or logging context is handled via disposable scopes
| instead of as part of the log entry. For those cases, if someone
| isn't already maintaining a good abstraction (like MassTransit)
| then it probably doesn't exist.
| EugeneOZ wrote:
| There is a simple merit: if some code is complicated enough to
| make you think twice before modifying it because you'll need to
| modify all the copies (and you realize that it will be not easy)
| - then it is better to make this code DRY.
|
| There are some simple pieces of code that are cheap to copy and
| modify later. And nothing wrong will happen if you do not apply
| future modifications to every copy. A code like this doesn't have
| to be DRY.
| sozin wrote:
| I think the author gets it wrong.
|
| The cost of DRY (Don't Repeat Yourself)-ing up your code can be
| high, in that it increases the coupling of your code, and
| potentially lowers its cohesion.
|
| Consider function def foo(a: int), called from call sites C1 and
| C2. Eventually C1 wants something out of foo() that it doesn't
| offer, but, critically, something that C2 _doesn't care about or
| need_. The author of foo() adds a new default argument: def
| foo(a: int, b:int = 0), and then there is a conditional block in
| foo() that deals just with this new b argument.
|
| You've now potentially broken call site C2, by exposing it to
| changes that it doesn't care about. Put another way: you
| should only deduplicate the code if _all_ the call sites will
| _always_ change for the same reason. Otherwise, you're lowering
| the quality of the code by increasing coupling and lowering
| cohesion. Copy and pasting the code in this case makes sense,
| because C1 and C2 both have entirely different needs out of
| foo(). Over time, foo() will accumulate more and more default
| arguments as the author stridently attempts to keep everything
| DRY, and the overall code base becomes more and more fragile.
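|
| The scenario above, sketched with a made-up body for foo() (the
| doubling and the addition are placeholders, not anything from the
| article). The shared version carries a branch that only C1 needs;
| the duplicated versions let each call site change on its own.

```python
def foo_shared(a: int, b: int = 0) -> int:
    """One function for both call sites, grown by a default argument."""
    result = a * 2  # placeholder for the logic both call sites share
    if b:  # branch that exists only because call site C1 asked for it
        result += b
    return result


def foo_for_c1(a: int, b: int) -> int:
    """Copy owned by call site C1; free to keep evolving around `b`."""
    return a * 2 + b


def foo_for_c2(a: int) -> int:
    """Copy owned by call site C2; untouched by C1's requirements."""
    return a * 2
```

| Today all three agree, so the duplication looks wasteful. The cost
| being described only shows up as more C1-only parameters arrive.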
| jackcviers3 wrote:
| So you make:
|
|     // new foo block
|     // using b:int stuff,
|     // calling fooInternalsX,
|     // fooInternalsY, etc.
|     // for common functionality
|     foo(a: int, b:int) = ...
|
| and
|
|     // stays the same
|     // except replacing
|     // common functionality with
|     // calls to fooInternalsX,
|     // fooInternalsY, etc.
|     foo(a: int) = ...
|
| and enough
|
|     private fooInternalsX(a:Int) = ...
|     private fooInternalsY(a: Int) = ...
|
| methods to cover the common functionality.
|
| Your code is still DRY, and you are using polymorphism (foos of
| different type signatures) instead of if/else, the _behavior_
| of foo(int) doesn't change, so you don't require additional
| tests for _foo(int)_, the fooInternals<X,Y,Z> aren't public,
| and you have now added tests for foo(int, int). You aren't
| paying any additional costs in terms of maintenance. You aren't
| increasing behavioral risk at C2 for calling foo(int). You are
| _only_ paying more for foo(int, int), and those are costs that
| you would have to pay regardless of if foo(int, int) literally
| duplicated the body of foo(int) for common pieces or refactored
| the common pieces out. You save cost for maintaining both
| foo(int) and foo(int, int) if the common pieces need to change,
| as you are adding tests for the behavioral changes to both
| foo(int) and foo(int,int) tests, but are only making a single
| change in the common code.
|
| Also, when doing this, the abstraction is the original
| foo(int), not the new, additional foo(int, int). Abstraction is
| the assumption of some parameterized behavior via hard-coding.
| Here, the new, additional parameterized behavior introduced by
| the second b:int parameter is abstracted away in the original
| foo(int), not in the new foo(int, int). That doesn't make the
| original foo(int) abstraction _wrong_, because it is used in
| at least one call site (C2).
|
| Only when all call sites must change to accommodate something
| that a new parameter allows through _more than one change-set_
| can you begin to call an abstraction wrong. Otherwise, it is a
| simple bug that was fixed by a single change-set.
| mannykannot wrote:
| _"To me, an abstraction is a piece of code that's expressed in
| high-level language so that the distracting details are
| abstracted away. If I were to see a confusing piece of code
| littered with conditional logic, I wouldn't see it and think "oh,
| there's an incorrect abstraction", I would just think, "oh,
| there's a piece of crappy code". It's neither an abstraction nor
| wrong, it's just bad code."_
|
| Of course, if bad code is not an abstraction, then there can be
| no such thing as a bad abstraction!
|
| More to the point, code littered with conditional logic might
| well be both good code and a good abstraction. There's a somewhat
| well-known article out there claiming that Netscape shot itself
| in the foot by deciding to rewrite the browser from scratch. As
| an example of how that went wrong, the author mentions the
| hapless developer trying to write code to work with some hardware
| component (the great many different dial-up modems that were out
| there at the time, IIRC), discovering that most of them had
| unique quirks that had to be respected, even when they nominally
| conformed to the same spec.
|
| The thing is, you can no more apply abstraction to a program
| until everything is simple than you can apply compression to a
| file until it's down to a byte. What's really at issue here, as
| Fred Brooks noted many years ago, is the difficult problem of
| satisfying the demands of the context's essential complexity
| while keeping a lid on the implementation's accidental
| complexity.
| tikhonj wrote:
| There are a lot of ways for good code to express bad
| abstractions. The abstraction could be inconsistent with other
| parts of the system, inconsistent with the concepts it is meant
| to represent, inconsistent with its own observable behavior,
| inherently complex or hard to reason about, inconvenient to
| actually use, poorly suited to whatever people _actually_ use
| it for...
|
| I've seen a lot of code that is perfectly clean and "well-
| organized" _as code_ but organized into _absolutely awful_
| abstractions.
|
| None of that goes against your core point, I just think that
| seeing the code and its abstractions separately is an important
| perspective for understanding code design.
|
| On the flip side, it's also totally possible to have bad code
| but a good abstraction. Some of the best abstractions I've
| worked with have painful implementations, and it didn't impinge
| on the quality of the abstraction itself! Of course, the bad
| code made life a lot more painful for the people responsible
| for implementing and maintaining the abstraction, and I'm sure
| it required some real skill and experience to keep that from
| manifesting to users of the abstraction, but they managed it.
| joshstrange wrote:
| The whiplash I get from reading this article is massive. One
| second they agree that bad abstraction (filled with conditionals)
| is bad but then say:
|
| > So instead of "duplication is cheaper than the wrong
| abstraction", I would say "duplication is cheaper than confusing
| code littered with conditional logic". But I actually wouldn't
| say that, because I don't believe duplication is cheaper. I think
| it's usually much more expensive.
|
| (emphasis on the last sentence)
|
| I couldn't disagree more. In fact it's an incredibly "junior dev"
| mindset that sees 2 pieces of similar (or _even identical_) code
| and is compelled to abstract it. Unless there are at a _minimum_
| of 3 implementations I think it's always better to duplicate.
| I've watched too many "common" functions grow over time with way
| too many arguments, too many conditionals, and way too confusing
| for anyone to easily follow. The most egregious is different
| return values based on arguments passed in. I'm not talking
| "array of strings" or "null" but "array of strings" or "single
| string" (or worse).
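|
| A small illustration of that last complaint, using a hypothetical
| hashtag helper. The first function changes its return type based on
| a flag; the alternative is two narrow functions with one return
| type each.

```python
from typing import List, Optional, Union


def find_tags(text: str, first_only: bool = False) -> Union[List[str], str, None]:
    """Flag-driven version: callers must know the flag to know the type."""
    tags = [word[1:] for word in text.split() if word.startswith("#") and len(word) > 1]
    if first_only:
        return tags[0] if tags else None
    return tags


def all_tags(text: str) -> List[str]:
    """Always returns a list, possibly empty."""
    return [word[1:] for word in text.split() if word.startswith("#") and len(word) > 1]


def first_tag(text: str) -> Optional[str]:
    """Always returns a single tag or None."""
    tags = all_tags(text)
    return tags[0] if tags else None
```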
|
| Abstraction can be fun to write and it feels like you are doing
| something to help "future proof" (also XKCD 927 [0]) but in
| reality it boxes people in (especially if you try to abstract
| with less than 3 real implementations) and leads to overly
| complicated code, or worse "clever" code.
|
| As I've grown as a dev I'm less and less inclined to write
| "magic" or highly abstracted code and prefer dealing
| with "boilerplate" that I can tweak as needed for the individual
| use-case. Only once I have a clear pattern of code that's been
| deployed and used for a good bit of time do I reach for
| abstraction/reusable code.
|
| [0] https://xkcd.com/927/
| mostlylurks wrote:
| > I've watched too many "common" functions grow over time with
| way too many arguments, too many conditionals, and way too
| confusing for anyone to easily follow.
|
| This is not the fault of the abstraction. This is the fault of
| (especially junior developers) treating abstractions as sacred
| and non-disposable, which is itself the result of a mindset in
| which creating abstractions is discouraged. You should almost
| never modify an abstraction. Don't modify abstractions to cover
| new use cases, and you more or less won't run into any of these
| issues. If you need to, create new abstractions and throw old
| ones away.
|
| > Unless there are at a _minimum_ of 3 implementations I think
| it's always better to duplicate.
|
| This is a silly rule to follow, except for the most
| inexperienced of developers, perhaps. It doesn't take long to
| gather enough experience to be able to recognize in most
| cases whether some instance of duplication is coincidental
| (structurally similar by happenstance, which could be
| "abstracted" in a macro-like manner, resulting in something
| quite fragile to changes) or if you're actually encoding some
| piece of knowledge into an abstraction. Advice like waiting
| until a piece of code repeats three times encourages developers
| to think about abstractions in terms of structural similarity,
| which is exactly the opposite of how abstraction should be
| considered.
| joshstrange wrote:
| > This is a silly rule to follow, except for the most
| inexperienced of developers, perhaps.
|
| Perhaps you'd consider me inexperienced though I don't
| consider myself to be so. I've learned enough times that
| neither I, nor my colleagues, can accurately predict the
| future and every time we think we know the cases that code
| will need to handle in the future we guess wrong more often
| than not.
|
| What I'm trying to say is until you are sure a piece of code
| is literally the same or with tiny differences that you can
| cleanly abstract you shouldn't try to guess how future code
| will use the abstraction. It's the same rule of mine where I
| try to never proactively add functionality to a
| function/piece of code. You think that you are saving your
| future self (or peers) time but too many times I've seen
| people guess wrong at what extra functionality we will need
| and then that code never gets touched and/or gets
| migrated/updated for years before someone realizes there is
| no calling-code that uses that functionality but we have been
| dragging it along this whole time.
|
| Could you check everywhere and make sure it's not being used
| and thus can be removed? Maybe but I understand the desire to
| make as few changes as possible and preserve the
| functionality as it was when you first went to edit the code.
| Overall that's a good idea when making changes and sometimes
| you don't always know what params all the clients are passing
| to an endpoint to be sure of if something is still in use or
| not.
| jamil7 wrote:
| > I couldn't disagree more. In fact it's an incredibly "junior
| dev" mindset that sees 2 pieces of similar (or _even
| identical_) code and is compelled to abstract it. Unless there
| are at a _minimum_ of 3 implementations I think it's always
| better to duplicate. I've watched too many "common" functions
| grow over time with way too many arguments, too many
| conditionals, and way too confusing for anyone to easily
| follow. The most egregious is different return values based on
| arguments passed in. I'm not talking "array of strings" or
| "null" but "array of strings" or "single string" (or worse).
|
| I agree with you here and tend to rather, if possible,
| deduplicate subsystems or sub-functions of similar
| looking/identical code and keep the duplicate public surfaces.
| joshstrange wrote:
| > I agree with you here and tend to rather, if possible,
| deduplicate subsystems or sub-functions of similar
| looking/identical code and keep the duplicate public
| surfaces.
|
| Completely agree, take the small parts that are
| standalone/discrete and abstract them. I greatly prefer
| something like
|
|     function1() {
|         commonCode1();
|         commonCode2();
|         commonCode4();
|     }
|
|     function2() {
|         commonCode2();
|         commonCode3();
|         commonCode4();
|     }
|
| Over something like (assume I've inlined the commonCodeX
| logic):
|
|     functionCommon(branchingParam) {
|         if(branchingParam) {
|             commonCode1();
|         }
|         commonCode2();
|         if(!branchingParam) {
|             commonCode3();
|         }
|         commonCode4();
|     }
| jkubicek wrote:
| > As I've grown as a dev I'm less and less inclined to write
| "magic" or highly abstracted code and prefer dealing
| with "boilerplate" that I can tweak as needed for the individual
| use-case.
|
| This is part of creating abstractions to benefit the reader,
| not the writer of the code.
|
| I'm currently refactoring a python package that was designed to
| make writing ETLs very elegant (it worked!), but as a
| consequence, when something goes wrong, figuring out what
| happened involves poring through 4 different modules, class
| hierarchies and trying to track variables through multiple
| layers of abstraction. It's a nightmare for debugging.
|
| Simple boilerplate is repetitive and boring, but man would it
| be so much easier to read
| joshstrange wrote:
| > Simple boilerplate is repetitive and boring, but man would
| it be so much easier to read
|
| Yep, and I'll fully admit when I first started out I hated
| this idea and wanted everything to be super-DRY but I've
| swung back in the opposite direction (or at least to a good
| mean). I had a developer ask why we had some boilerplate
| semi-recently when the function in question was simply
| calling another function on the parent class, why not just
| call the parent function directly (it was protected, they
| wanted to just make it public). I explained that yes, right
| now we were doing a straight pass-through essentially (this
| was for a CRUD layer) but that we had learned over and over
| that over time we needed to add in things like business
| logic, validation, or data migrations and this way we just
| needed to change our "intermediate" function instead of
| adding one later and having to change all the places that
| were calling the "direct" function. Same idea as with
| getters/setters, yes you don't "need" them always when you
| first write them but having those hooks is invaluable down
| the line.
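|
| A minimal sketch of the pass-through layer described above, in
| Python. The names `BaseRepository`, `UserRepository`, `_save` and
| `create_user` are hypothetical, invented for illustration; the
| comment does not say which language or framework was involved.
|
```python
class BaseRepository:
    """Hypothetical parent class with a protected persistence helper."""

    def __init__(self):
        self._rows = []

    def _save(self, row):
        # Protected by convention: code outside the class hierarchy
        # is expected not to call this directly.
        self._rows.append(dict(row))
        return len(self._rows)  # stands in for the new row id


class UserRepository(BaseRepository):
    def create_user(self, row):
        # Today this is a straight pass-through. Validation, business
        # logic, or a data migration can be added here later without
        # touching any call site.
        return self._save(row)


repo = UserRepository()
user_id = repo.create_user({"name": "Ada"})
```
|
| Callers only ever see `create_user`, so the day a validation step
| is needed it goes into that one method.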
| jongjong wrote:
| I wish people would have this saying in the Node.js and
| JavaScript community. I disagree with OP about this topic.
|
| Abstractions are like the foundations of a building. Imagine that
| you're building an apartment block and your job is to build the
| foundations but you're unsure about how tall the building will
| be.
|
| If you build it on mud, that might be fine for a one story
| construction but once other builders start adding additional
| storeys on top, it will become totally unsuitable and the whole
| thing will have to be rebuilt from scratch. Not only that, the
| costs will begin to materialize immediately because those who
| build on top of your foundation will make all sorts of bad
| decisions because of your poor judgment; they might decide to
| build the walls out of cheap wood instead of bricks simply
| because it's lighter and they don't want the building to tilt and
| sink into the mud... Then because wood was chosen as the
| material, there may be a termite infestation and builders will
| have to apply a special varnish on the entire surface of the
| building... Then the varnish will turn out to be toxic and will
| need to be removed. Every inch of the building will have to be
| polished with sandpaper and painted over... And when the next
| storey will need to be added, they will be forced to make it out
| of cardboard... Then the tenants on the top floor will want their
| money back and the whole building will need to be destroyed
| anyway; all that back and forth will have been nothing but a
| waste of time. You would have saved an entire decade and millions
| of dollars if the foundation had been laid on solid bedrock in
| the first place. Just one small sub-par decision which triggered
| an avalanche of terrible decisions.
|
| I think duplicating code makes sense and can be a wise decision
| early in the project because it's essentially a refusal to lay
| the foundation until there is more clarity about the scope of the
| project. It's a lot easier to refactor and combine duplicated
| code into a new abstraction than it is to refactor one
| abstraction into a different abstraction. Not to mention that
| developers become very attached to abstractions (including
| incorrect abstractions) and it tends to upset people once they're
| invested in it.
| ekidd wrote:
| I dislike code duplication. But do you know what I like even
| less?
|
| Giant functions with 12 keyword arguments passed up and down a
| call stack, because those functions have many callers which want
| _slightly_ different things.
|
| Choosing the wrong abstraction often leads to endless kludges and
| special cases. Two warning signs are functions with 12+ keyword
| arguments, and strange class hierarchies full of callbacks that
| only interact with a few functions.
|
| The problem with all programming advice is that it needs to come
| with George Orwell's classic advice to "Break any of these rules
| sooner than say anything outright barbarous."
|
| If programming advice makes your code look obviously gross,
| ignore the advice.
| CharlieDigital wrote:
| That's not what abstraction is.
|
| Abstraction tends to shift logic from _procedural_ to
| _structural_.
|
| Rather than 12 keyword arguments and 12 branches in 1 big
| function, it should be 12 small classes (in OOP) or 12 small
| functions (in FP) that each handle one of the branches. All
| organized in some way that the logic of executing those parts is
| shared in the structure of the code.
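|
| One way to picture the shift from procedural to structural, as a
| rough Python sketch. The exporter functions and the `EXPORTERS`
| table are hypothetical examples, not anything from the thread: each
| branch becomes a small function, and a lookup table replaces the
| if/elif chain.
|
```python
from typing import Callable, Dict


# Each branch lives in its own small function instead of behind a flag.
def export_csv(rows):
    return "\n".join(",".join(str(v) for v in row) for row in rows)


def export_tsv(rows):
    return "\n".join("\t".join(str(v) for v in row) for row in rows)


# The structure (a lookup table) replaces a chain of if/elif branches.
EXPORTERS: Dict[str, Callable[[list], str]] = {
    "csv": export_csv,
    "tsv": export_tsv,
}


def export(rows, fmt):
    try:
        exporter = EXPORTERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    return exporter(rows)


result = export([[1, 2], [3, 4]], "csv")
```
|
| Adding a format means adding one function and one table entry;
| whether that is an improvement still depends on the branches
| genuinely being variations of the same thing.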
| ekidd wrote:
| > _Rather than 12 keyword arguments and 12 branches in 1 big
| function, it should be 12 small classes (in OOP) or 12 small
| functions (in FP) that each handle one of the branches._
|
| I mean, sure, you could convert your library into 12 little
| classes, or a collection of purely-functional combinators.
| Sometimes that helps. Sometimes it makes the situation even
| worse.
|
| Some of the most terrifyingly inappropriate abstractions I've
| seen in my career involved complex class hierarchies, or
| worse, things like "abstract interpretation over the free
| monad."
|
| There's no substitute for asking, "Are these things I'm
| trying to abstract over actually similar in any fundamental
| way?" And "Is this code actually just horrible?"
| CharlieDigital wrote:
| > "Are these things I'm trying to abstract over actually
| similar in any fundamental way?"
|
| I'm not sure why this is even a point; if there's no
| similarity of the things that are being abstracted, why
| would one even discuss abstraction in the first place? The
| point of abstraction is that there is some fundamental
| similarity that the abstraction addresses. `ICloudStorage`
| abstracts `GoogleCloudStorage` and `AwsS3Storage` because
| at some level, they both have the same abstract operations:
| read, write, delete, etc.
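|
| A sketch of that kind of contract in Python, using
| `typing.Protocol`. `CloudStorage` mirrors the `ICloudStorage` idea
| from the comment; `InMemoryStorage` and `copy_blob` are
| hypothetical stand-ins so the example runs without any real cloud
| SDK.
|
```python
from typing import Dict, Protocol


class CloudStorage(Protocol):
    """Contract shared by every backend: read, write, delete."""

    def read(self, key: str) -> bytes: ...
    def write(self, key: str, data: bytes) -> None: ...
    def delete(self, key: str) -> None: ...


class InMemoryStorage:
    """Stand-in backend used here instead of a real cloud SDK."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def read(self, key: str) -> bytes:
        return self._blobs[key]

    def write(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def delete(self, key: str) -> None:
        del self._blobs[key]


def copy_blob(storage: CloudStorage, src: str, dst: str) -> None:
    # Depends only on the contract, not on which backend is behind it.
    storage.write(dst, storage.read(src))


store = InMemoryStorage()
store.write("a.txt", b"hello")
copy_blob(store, "a.txt", "b.txt")
```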
| overgard wrote:
| Ah, but the solution is to turn that 12 argument function into
| a class that does one thing (runs the function), and dependency
| inject all those arguments. It still totally sucks, but you can
| pretend you're writing "clean" code by obfuscating the
| parameter passing.
| gpderetta wrote:
| Worse, with those 12-parameter functions (invariably booleans or,
| if you are lucky, enums), usually only a small subset of all
| possible flag combinations is tested (or even meaningful). Worst
| part is when the flags are directly or
| indirectly under user (or configuration) control and the
| application can go into uncharted territory.
|
| Worse still, good luck refactoring those functions when you
| have no idea which combinations are actually meant to be
| supported and what their original semantics were.
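|
| To put a number on the combinations: 12 independent boolean flags
| allow 2^12 = 4096 distinct call shapes. The Python sketch below
| counts them and shows one possible alternative, naming only the
| supported behaviours. `SyncMode` and `sync` are hypothetical names
| chosen for illustration.
|
```python
from enum import Enum
from itertools import product

FLAG_COUNT = 12
# Every boolean flag doubles the number of possible call shapes.
total_combinations = 2 ** FLAG_COUNT

# Enumerating a smaller case makes the growth concrete.
three_flag_cases = list(product([False, True], repeat=3))


class SyncMode(Enum):
    """Hypothetical: name only the combinations that are supported."""
    DRY_RUN = "dry_run"
    INCREMENTAL = "incremental"
    FULL = "full"


def sync(mode: SyncMode) -> str:
    # Each supported behaviour is explicit, so each one can be tested.
    if mode is SyncMode.DRY_RUN:
        return "report only"
    if mode is SyncMode.INCREMENTAL:
        return "copy changed files"
    return "copy everything"
```
|
| Three enum members can be covered by three tests; 4096 flag
| combinations realistically cannot.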
| ozim wrote:
| I would rewrite it in a bit different way:
|
| Fixing a bug in one place and being sure only one place was
| affected and being sure that one place was really fixed is
| cheaper
|
| -
|
| than fixing a bug in one place affecting 20 where in 15 places it
| was a proper fix, in 5 places it will break in unforeseen ways
| when user does something different and somehow additional 2
| places totally broken because no one ever knew these were
| affected.
| theknocker wrote:
| [dead]
| sergiotapia wrote:
| I disagree with OP. You cannot abstract after two or three dupes
| because you don't even know what you have or need yet.
|
| Let it breathe, let it stink for a bit - THEN make an informed
| decision about what to refactor and abstract. You're just jerking
| off otherwise, and I hate working with code that's been
| abstracted early for no reason.
| peeters wrote:
| > I don't see how it can be said, without qualification, that
| duplication is cheaper than the wrong abstraction.
|
| I mean, this statement _IS_ qualified. The word "wrong" is doing
| some heavy lifting. Part of what makes an abstraction wrong is
| when it is expensive to use as tiny differences emerge in the
| requirements.
| hinkley wrote:
| It's also wrong when active epics contain a third
| implementation of the "same" pattern.
|
| It's been a while since I've seen as much time wasted as trying
| to abstract the second implementation only to be proven wrong by
| the third. So instead of being for instance 8, 8 and 16 points
| to implement, it ends up barely squeaking by as 8, 16 and then
| 16 again.
|
| It's one thing to fight the Rule of Three for things that might
| happen. It's quite another when it _will_ happen.
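|
| The arithmetic behind that example, as a tiny Python sketch. The
| point values are the "for instance" figures from the comment, not
| measurements.
|
```python
# Story-point estimates from the comment, one entry per implementation.
wait_for_third = [8, 8, 16]       # duplicate once, abstract on the third
abstract_on_second = [8, 16, 16]  # abstract early, then redo it

cost_waiting = sum(wait_for_third)
cost_early = sum(abstract_on_second)
extra = cost_early - cost_waiting
```
|
| Under those figures, abstracting on the second implementation
| costs 40 points against 32, so 8 points go to an abstraction that
| the third case invalidates.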
| lcuff wrote:
| In about 1990 I got tasked with building an installation and
| configuration system for the hardware and software package my
| company built. It was an Ethernet card and a TCP/IP suite being
| added to the PCs of the era (that had an AT/ISA bus where you had
| to find a free address block, then jumper the card to have the
| correct address, lotsa fun.)
|
| I wrote the first system targeted at AT&T's Unix for the 386.
| After it was completely done, I was assigned to do the same for
| Xenix. After that was completely done, I got assigned to do SCO
| (Santa Cruz Operation) Unix. After that, Interactive Systems
| (ISC). Each system had its own architecture for installation and
| configuration. I didn't know in advance anything about the
| different systems, nor any knowledge that the other systems were
| on the horizon. As I was writing the second system, I was
| refactoring like mad to avoid duplicating code, and feeling very
| proud due to previously learning the horrors of duplicate code. I
| can't remember details, but among other things files had to be
| placed in a specific directory hierarchy for each system, and
| various files had to perform certain (different) functions on
| each system. When I turned to the third and fourth target
| systems, the refactoring just became weirder and more
| complicated, but I was determined to avoid duplication.
|
| Historically it turned out we never revised these releases. With
| 20-20 hindsight, it's a case where the refactoring was completely
| pointless, and code duplicated 4 times would have been way faster
| to create, and easier to maintain if we had made new releases. I
| think part of Sandi's point is that YAGNI applies as well ... a
| higher level abstraction may accommodate changes that never
| arrive, or the changes may be so large that NO abstraction will
| cover it.
|
| On the opposite end of the spectrum, in 1980 (yeah, I'm really
| old) as a summer-hire, I'd written in HP-Basic this very funky
| single-purpose very primitive data base system. When I returned 9
| months later after graduating, two full time guys had made small
| changes to the system, but one guy had made a breaking change,
| the other guy got pissed off and duplicated _the entire program_
| (a single file, to be sure) and made one small change. Thereafter
| I had to maintain two versions of the thing. Gaaack. It was the
| ultimate lesson in "don't duplicate code". (It was also in the
| days before we had a version control system or diff, so backing
| out and correcting the change wasn't practical.) Mel, where are
| you now?
| Kapura wrote:
| I've never so viscerally disagreed with a link on this website.
| Particularly this point:
|
| > Duplication is bad. In fact, duplication is one of the most
| dangerous mistakes in coding.
|
| This to me reads insane, fanatical. One of the biggest benefits
| of duplication that the author fails to identify is the locality of
| logic. When, not if, things break, there's a large benefit to
| having all of the logic contained to a few heavy-lifter classes
| that contain bespoke logic and are fit-for-purpose.
|
| "The wrong abstraction" in this case is bending over backward to
| fit your data into another API just to cut down on code
| duplication; it is better to have code with clean, uninterrupted
| data flow than code that frequently needs to re-translate the
| data to be consumed by different APIs, then decode the results
| back to useful logic. The translation/decoding steps are new
| places to introduce bugs, and the more translation or decoding
| required, the more bug-prone the code will be.
|
| A good abstraction to de-duplicate code should not add complexity
| to the existing call sites. If you've squinted and decided that
| two systems are close enough that they can be abstracted
| together, you're likely making one or both of those code paths
| much more treacherous.
| wvenable wrote:
| As a programmer, if you don't create an abstraction you'll
| never be more than a 1x programmer. Abstraction is how to get
| more productive than simply how fast you can type.
|
| Yes, the wrong abstraction is bad. But almost every argument
| whether it's for/against duplication or for/against abstraction
| usually starts with the hidden premise that you're stuck with
| whatever choice you've made and code you've written forever.
| The underlying issue is the fear of change and the sunk cost
| fallacy of already written code. If you have the wrong
| abstraction, you can change it. If you created too much
| duplication, you can remove it.
| AmpsterMan wrote:
| Wrong abstractions start off as right abstractions and slowly
| become wrong abstractions. What's the point in which a right
| abstraction becomes a wrong abstraction? Am I sure I can
| identify that point? Can someone else? Can someone else who
| has no knowledge of the original assumptions that were
| implied during the initial abstraction?
|
| There are two kinds of abstractions: the ones
| everyone complains about and the ones that no one has ever
| seen.
|
| My rule of thumb is thus: Have I repeated myself three times
| doing the EXACT same thing? Then CONSIDER abstracting away.
| Otherwise, make as many implicit dependencies explicit as
| possible and slightly keep repeating yourself until you are
| exactly repeating yourself
| wvenable wrote:
| Your rule of thumb isn't great: If you have some important
| logic duplicated twice but a year from now has a bug -- are
| you going to remember to change it both places? But, lets
| be honest, you probably would not create that code in the
| first place -- you'd have abstracted it automatically
| without even thinking about it.
|
| These conversations generally tend to completely discount
| _experience_. Junior programmers are often terrible at
| abstractions -- they either do way too much or way too
| little. Can I give them a hard and fast rule that they can
| use to never make that mistake? No, I can't. It doesn't
| exist. The only reason I know what's good or bad is because
| I've done it wrong thousands of times.
|
| That's the problem with every single one of these articles
| that prescribe one true solution. It's not at all that
| simple.
| tuyiown wrote:
| I'm sorry you are totally, utterly wrong.
|
| Wrong abstractions percolate through your system; assumptions
| about how your abstraction is supposed to work harden into an
| ossification that hides the concrete implementations and their
| actual, generally simpler, constraints.
|
| Basically your refactoring work now requires you to understand
| all user code that relies on the wrong aspects of your
| abstractions, find a way to correct it if you're lucky, and make
| it work exactly the same way the duplicated code would have.
|
| And I didn't mention implementations that drift in
| incompatible ways with the abstraction, a large source of
| errors and regrets.
|
| The good bet for productivity is recognizable implementation
| patterns and duplication.
|
| In the end refactoring duplicated code that had time to
| settle and drifted in legitimate ways to find your correct
| abstraction is a blast.
| wvenable wrote:
| I disagree and I can provide an example. I'm creating a
| library to interface with a REST API. The creators of this
| REST API obviously didn't do any abstractions and they have
| multiple implementations for the same exact process: paged
| queries of items. There's no reason for them to be
| different -- they're just different because, I assume,
| different developers built them differently at different
| times. Nobody looked at this and said "This is all the same
| so we should have one single common implementation
| abstracting over all the endpoints."
|
| However, as the developer of the interface library, I can
| abstract over all the differences and give my consumers the
| exact same experience regardless of the API endpoint.
| And that's exactly what I did. So now they're all more
| productive because they don't need to know all these
| unimportant details. They don't even need to know that
| there's a REST API. In fact, this REST API replaces a
| previous API implemented with a completely different
| technology and we are swapping the whole thing out with
| minimal changes because I abstracted it years ago.
|
| Not all abstractions are wrong. Not all concrete
| implementations are simpler. My goal with an abstraction is
| to take something else and make it closer to what we need because
| most technology has a wide audience with wide requirements.
| I'm a narrow audience with narrow requirements and so I can
| hide the vast complexity that I simply don't care about.
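|
| A rough Python sketch of abstracting over inconsistent paging. The
| two endpoints, their response shapes, and every name here
| (`fetch_orders`, `fetch_users`, `iter_orders`, `iter_users`) are
| hypothetical and simulated in memory; the comment does not describe
| the real API's pagination schemes.
|
```python
from typing import Dict, Iterator, Optional

# Two hypothetical endpoints that page differently. Both are simulated
# with in-memory data; no real REST API is called.
_ORDERS = ["o1", "o2", "o3"]
_USERS = ["u1", "u2", "u3"]


def fetch_orders(page: int, size: int = 2) -> Dict:
    # Style 1: numbered pages plus a "has_more" flag.
    start = page * size
    return {"items": _ORDERS[start:start + size],
            "has_more": start + size < len(_ORDERS)}


def fetch_users(cursor: Optional[str], size: int = 2) -> Dict:
    # Style 2: opaque cursor; a missing "next" key marks the last page.
    start = int(cursor) if cursor else 0
    body: Dict = {"data": _USERS[start:start + size]}
    if start + size < len(_USERS):
        body["next"] = str(start + size)
    return body


def iter_orders() -> Iterator[str]:
    page = 0
    while True:
        body = fetch_orders(page)
        yield from body["items"]
        if not body["has_more"]:
            return
        page += 1


def iter_users() -> Iterator[str]:
    cursor = None
    while True:
        body = fetch_users(cursor)
        yield from body["data"]
        cursor = body.get("next")
        if cursor is None:
            return


# Callers see the same shape (an iterator of items) for both endpoints.
all_orders = list(iter_orders())
all_users = list(iter_users())
```
|
| The paging differences stay inside the two iterator functions;
| consumers just loop over items.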
| kajecounterhack wrote:
| > The underlying issue is the fear of change and the sunk
| cost fallacy of already written code. If you have the wrong
| abstraction, you can change it.
|
| It's not that trivial. Consider that the wrong abstraction is
| reflected into your API (common), and consider that your API
| has many users. You are stuck with it, or you have to
| convince multiple teams (or, god forbid, external customers)
| to migrate to a better API. This can constitute a humongous
| waste of SWE-hours ($$$$$) and take quarters to accomplish,
| assuming you can get any buy in.
|
| I think it comes down to what your organization looks like
| and how many users are going to be touching your code. If
| your abstraction is just for yourself internally and everyone
| else is not allowed to touch it, then fine. You will own the
| tech debt if the abstraction is wrong. If your abstraction
| has other users at your company, or external customers, it
| had better be the right one or at least an unavoidable
| stepping stone.
|
| > If you created too much duplication, you can remove it.
|
| This is actually true. Refactoring duplicated logic is a lot
| easier than fixing bad abstractions.
| coliveira wrote:
| Migrating to better APIs is done all the time. It is not an
| issue worth discussing about anymore. But even if you have
| to maintain an API this doesn't mean you cannot change the
| underlying implementation.
| simonw wrote:
| It absolutely is an issue worth discussing, any time you
| are maintaining a library with more than a few other
| people using it.
|
| Breaking backwards compatibility in a library with
| hundreds or thousands of users is not something to be
| taken lightly!
| wvenable wrote:
| I'm working with an API right now that is absolutely
| based on duplicated code. They have a system for querying
| items and the API has 3 different ways depending on the
| endpoint. I just found a new one the other day and I
| hated it -- why is this one API unnecessarily different
| from all the rest!
|
| I'm building a library to call this API and I've
| abstracted over all these differences so my callers never
| have to know how messed up the underlying API is -- they
| get a consistent experience regardless.
| wvenable wrote:
| It is that trivial. There is no alternative. You either
| have an API or you don't and you either change it or you
| don't. Hand wringing over the potential of making a mistake
| is a waste of time and effort. You will make a mistake. You
| will never get it perfect. You just have to deal with it.
|
| > Refactoring duplicated logic is a lot easier than fixing
| bad abstractions.
|
| Then you've just created an abstraction with all that
| potential to be bad sometime in the future.
| mcpeepants wrote:
| > Then you've just created an abstraction with all that
| potential to be bad sometime in the future.
|
| I would argue that you've then created an abstraction,
| but with all the hindsight allowing you to create the
| _correct_ abstraction (or at least a much better chance
| at it approaching "correct")
| kajecounterhack wrote:
| > It is that trivial.
|
| > You will make a mistake. You will never get it perfect.
| You just have to deal with it.
|
| These two comments sound at odds. First statement says
| it's easy. Second statement says it's hard.
|
| We can agree that hard things don't get solved without
| iterating. But a productive response to abstraction
| (which is really API design) being hard is not to say
| "stop handwringing, just do it." Instead, you can employ
| various strategies such as preferring experienced people
| to do it, making sure they did a good job of gathering
| requirements and considered the risks of their approach,
| spending time testing customer/developer ergonomics, etc.
| You can also defer producing an abstraction until your
| system is a bit larger and the duplication is becoming
| too much to handle, since you have a larger sample size
| of potential uses for your abstraction to help you
| converge on the correct API.
|
| Good abstractions can be the difference between success
| and failure, between organizational velocity and
| technical debt quagmire. Saying "we should always build
| abstractions" when it's difficult to build them correctly
| in one go sounds totally wrong on its face.
| Kapura wrote:
| 100% agree, especially about refactoring duplicated logic.
| Super-duplicated code begs to be refactored, and having
| many examples of the same functionality helps you build an
| API that is robust without adding "what-if" functionality
| to try to futureproof code (impossible).
| overgard wrote:
| I'm not sure that's true, when I look at the code bases of
| some of the most productive coders I can think of (John
| Carmack and the Doom and Quake code bases for example), they
| tend to be fairly conservative with how much they abstract
| things. There's a lot of very thoughtful data structure usage
| (git is another good example of this) and diligence in
| maintaining a coding standard (which generally has little to
| do with formatting), but most of the code seems to be more
| concrete and task focused rather than abstract.
|
| I think Casey Muratori has a good way of thinking about this,
| with his concept of "semantic compression" (
| https://caseymuratori.com/blog_0015 ). To me that's a lot
| more valuable than the ideas that you get in say something
| like Clean Code (which in my opinion, the popularity of that
| book has been a disaster for the industry)
| wvenable wrote:
| I'm positive they don't have much in the way of duplicated
| code.
|
| Abstraction can be as simple as a function.
|
| It seems odd to give really good examples of abstraction
| and then sort of do a No true Scotsman argument on it. "It
| can't be abstraction because it's not terrible."
| Kapura wrote:
| so your argument is, "you've got to write functions,
| therefore abstraction is always good" ?
|
| yeah man i don't start in main() and never write another
| function ever again. You got me. But I also don't
| aggressively police duplication, accepting that I would
| always rather see what's happening without file-hopping.
| The correct abstractions will simplify code, and they
| will seem obvious (even if just in retrospect).
| Abstractions that force me to continually re-frame the
| problem i am trying to solve in terms of another's use
| case are antithetical to writing the sort of code that I
| do.
| wvenable wrote:
| You don't have to write functions. Have you ever seen a
| 10,000 line code that was just a single function -- I
| have. It also had nested if statements 5 levels deep to
| handle all sorts of logic with lots of duplication. It
| was unmaintainable. Yet it did meaningful work and it
| could have been refactored to a fraction of the size and
| do the same work. But, to be honest, when I had to fix it
| I just went in and fixed it in the 20 places that needed
| changing because it was impossible to follow.
|
| You like good abstractions and you hate bad abstractions.
| I couldn't agree with that more.
| coliveira wrote:
| There are useful abstractions and useless abstractions. A
| lot of the GoF designs are bad abstractions (if used
| indiscriminately) and crutches for bad languages. However,
| using problem-focused abstractions is a big time saving
| strategy.
| jasonswett wrote:
| > One of the biggest benefits of duplication that the author
| fails to identify is the locality of logic
|
| There are a lot of things in the post that I fail to identify.
| There are also some things I wrote there that I no longer
| completely agree with.
|
| Here's a newer post in which I go deeper, and in which I do
| address the locality of logic.
|
| https://www.codewithjason.com/duplication/
| makeitdouble wrote:
| > When, not if, things break, there's a large benefit to having
| all of the logic contained to a few heavy-lifter classes that
| contain bespoke logic and are fit-for-purpose.
|
| Things break in cycles. You'll have worked around a first wave
| and be happy that you didn't abstract your code too much as you
| can just fix one side of your logic very locally. It also means
| you didn't touch the other sides that were not directly
| affected, but probably needed a fix in a slightly different but
| overall similar way.
|
| So you'll see your code instances all break a way or another
| and fix them one by one instead of hardening a central piece
| where you could focus your testing efforts.
|
| Of course it's a topic that needs nuance, but if you identify a
| piece of code as duplicate, there will be no free lunch. Either
| you pay upfront the effort of abstracting, or you pay down the
| line the local fixes, but there's not one approach or the other
| that will be fundamentally wrong, I see it as a bet that pays
| off or not.
| [deleted]
| mrbungie wrote:
| You go this route of thinking, and eventually you're making
| OOP/FP programs out of what could've been simple Bash scripts.
| tracerbulletx wrote:
| Did I miss where he justifies the statement "duplication is one
| of the most dangerous mistakes in coding"? That has not been my
| experience and it's the crux of the value judgement here so I'd
| expect him to explain why it's so bad.
| ae_throw wrote:
| We need to have a kind of a footnote in the pedagogy of software
| engineering, and engineering in general that states to avoid
| advice ("wisdom", lol) expounded by blowhards. You can identify
| it usually by the title - it'll have a Grand Style that betrays
| arrogance.
|
| Lots of people from GoF onwards think they qualify to preach
| bullshit ultimatums, thinking they have it all figured out. I
| don't think any of them have any fucking clue what should
| actually be considered harmful, what should be the two/three
| "hardest things in computer science", and other nonsensical
| bullshit they write. With apologies to Dijkstra who I do find to
| have been one of the shining lights of computer science and
| engineering but is often misquoted/out-contexted for that
| considered harmful thing. His letters do betray a higher plane of
| wisdom.
|
| The more recent "what programmers need to know about {x}" as if
| the author has any clue is just the continuation of "I've learned
| this last week/in my last project and it's the most important
| thing," instead of the trivia that it really is, or shit that's
| abstracted for us nowadays and only serves to make the author
| feel superior. Just fuck off with all of that nonsense.
|
| Coincidentally, I'm going to go and read the Hamming book as it's
| got tangible value having been written by someone who has done
| something worthwhile in their career.
| [deleted]
| efxhoy wrote:
| What's GoF?
| neuromanser wrote:
| Gang of Four: Gamma, Helm, Johnson, Vlissides; authors of
| Design Patterns.
| mathstuf wrote:
| "Gang of Four":
| https://en.wikipedia.org/wiki/Gang_of_Four_(software)
| [deleted]
| Jtsummers wrote:
| "Gang of Four", references
| https://en.wikipedia.org/wiki/Design_Patterns.
| ravenstine wrote:
| A significant amount of things said about computer "science"
| and engineering is opinions, more so than most believe or are
| willing to admit. That doesn't mean it's all wrong, but that
| not everything is universally applicable just because a smart
| person said a thing.
| esafak wrote:
| I think Hamming's book is more appropriate for juniors. I found
| it rambling and obvious. How do others feel?
| coldtea wrote:
| It sounds more like the idea you propose is "just do whatever"
| and that there's absolutely no experience those guys (seasoned
| devs and instructors) have.
|
| There's nothing particularly nonsensical about the "two/three
| hardest things in computer science" (although it was said half
| in jest).
| ilyt wrote:
| The vast majority of advice like that _is_ garbage and is
| trying to borrow the authority of the few good articles that
| come out with similarly structured titles.
|
| Usually by people that mistake "the product is successful"
| for "the product is well engineered". Or mistaking their
| rewrite from "the worst way to solve the problem" to "the
| second worst way to solve the problem" <hyperbole> for "this
| is the best way to solve the problem".
| greggsy wrote:
| I enjoy the attitude, but sometimes people like to read someone
| else's view on something, or to gain insights on something they
| don't know anything about.
|
| Writing authoritatively might be the only way people can get
| people to read some things. I'm ok with that.
| chrisan wrote:
| I still find the rule of 3 to be the most pragmatic balance.
|
| https://en.wikipedia.org/wiki/Rule_of_three_(computer_progra...
| brohee wrote:
| Oh... So that's how you end up with class names ending in
| FactoryFactory... Factorisation at any cost without making sure
| it makes sense and will keep making sense...
| kylecordes wrote:
| Maybe y'all are more talented developers than me;
|
| But I have found repeatedly that building the wrong abstraction
| is on the path toward discovering and building the right
| abstraction.
| hooverd wrote:
| Right up until somebody else uses your wrong abstraction- and
| now it's part of the bedrock.
| CharlieDigital wrote:
| Once you've seen enough code, the right abstraction becomes
| easier to spot.
|
| Applications are more similar than they are different. That's
| why we have the concept of design patterns since these occur
| with enough frequency that we should just give the abstraction
| a name instead of re-inventing it each time.
|
| Problem today -- my observation -- is that many younger devs
| don't ever bother learning design patterns so we end up with
| devs who 1) aren't aware of common, existing patterns codified
| decades ago and then 2) think that the "wrong abstraction" is
| expensive partially because of a lack of knowledge of the
| "right abstraction" to use.
| waysa wrote:
| In a world of changing requirements it can be difficult to know
| what the right abstraction is going to be. I am happy to accept
| some duplication early in the development cycle until the
| requirements have settled. Only then it's possible to go back and
| refactor (which admittedly doesn't always happen in practice).
|
| I believe duplication should raise eyebrows but it can be
| justified.
| DougBTX wrote:
| This article isn't making a distinction between the interface
| provided by an abstraction and the implementation details of that
| abstraction, which I think causes it to come to the wrong
| conclusions.
|
| A bad abstraction is an interface which causes the implementation
| to be more complex than necessary. Uses of the interface might
| still look perfectly simple, but if the abstraction is bad the
| overall complexity could be higher.
| cogman10 wrote:
| This is a very amateurish take. The author very clearly (at least
| at the time of writing this) has not dealt with complex code
| bases.
|
| > If I were to see a confusing piece of code littered with
| conditional logic, I wouldn't see it and think "oh, there's an
| incorrect abstraction", I would just think, "oh, there's a piece
| of crappy code". It's neither an abstraction nor wrong, it's just
| bad code.
|
| This is the primary issue. The author does not recognize that
| poor abstractions can involve more than just a lot of conditional
| logic. Sometimes that conditional logic bubbles up in places
| secondary to where the bad abstraction was made.
|
| A simple (real) example of this. I've seen code where "hey, these
| two objects share a field, let's pull out a base object and have
| them both inherit from it, after all, duplication is bad!". Then
| later on "hey, here's two other objects with the same field, but
| they don't have that old base object's field, duplication is bad,
| so let's make a third base class".
|
| This sort of thinking resulted in a really gnarly object graph.
| But further, downstream code had to do type checks and casting
| to compensate for this bad abstraction.
|
| All because the original dev didn't want to duplicate a field on
| two otherwise unrelated objects.
|
| And worse, you, the dev that works on this code years later, are
| left with the option "keep it as is, or rewrite (and touch 100s of
| files, potentially breaking large amounts of code)."
|
| Oh, not to mention the unit tests that accompanied such code,
| ironically filled to the brim with duplication around this
| hierarchy, making minor changes massive.
|
| On smaller less complex code bases you rarely see this
| comedy/tragedy play out.
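|
| A minimal sketch of the shape described above, assuming
| hypothetical types (HasName, Customer, Warehouse, Supplier) that
| are not from the comment. It contrasts a base class extracted for
| one shared field, which pushes type checks downstream, with simply
| repeating the field on unrelated types:
|

```python
from dataclasses import dataclass


# Option 1: a base class extracted only because two types share a field.
@dataclass
class HasName:
    name: str


@dataclass
class Customer(HasName):
    email: str


@dataclass
class Warehouse(HasName):
    capacity: int


def contact_line(obj: HasName) -> str:
    # Downstream code receives the base type and has to recover the
    # concrete type before it can do anything useful.
    if isinstance(obj, Customer):
        return f"{obj.name} <{obj.email}>"
    raise TypeError(f"{type(obj).__name__} has no contact details")


# Option 2: an unrelated type simply repeats the field.
@dataclass
class Supplier:
    name: str
    email: str


def supplier_contact_line(supplier: Supplier) -> str:
    # No base class, so no type check is needed here.
    return f"{supplier.name} <{supplier.email}>"
```

| In this toy version the duplicated field costs one line, while
| the shared base costs an isinstance check at every use site that
| cares about the concrete type.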
| em-bee wrote:
| i wouldn't create a base class until there are a non-trivial
| amount of common properties shared by several classes and i
| find that i am adding more such common properties. and when a
| class appears that doesn't have one of these common properties,
| then perhaps it makes more sense to move that one no-longer-
| common property out of the base class back into the individual
| classes so that again i can have all classes share the same
| base class.
| ozim wrote:
| I was looking for a word to describe my feeling about the
| article and "amateurish" fits the bill.
|
| What mostly took me down was: (for example, the same several
| lines of code duplicated across distant parts of the codebase
| dozens of times, and with inconsistent names which make the
| duplication hard to notice or track down)
|
| It is a silly example because in such a scenario there is no way
| you can even start writing an abstraction to handle that.
|
| The other part is what cogman10 wrote: a wrong abstraction is not
| "simply a piece of code gathering if statements". A wrong
| abstraction is a piece of code or a whole part of a system where
| you cannot simply add an if statement and get going. A wrong
| abstraction might be something that actively prevents you from
| changing code in a meaningful way.
|
| There is also another comment I would riff off, about DevOps and
| having scripts per team/domain: even if those mostly look the
| same, you never know what the team will require. Nowadays domain
| driven development is in vogue, mostly because it recognizes
| separation of concerns is much more important than DRY.
|
| To finish off, the author also assumes abstractions are born by
| de-duplication of code. Yes, we are discussing "duplication is
| cheaper", but to finish I wanted to rant about something. The
| worst abstractions I saw in practice were born in the heads of
| "Astronaut Architects" who built some system top down, making
| stuff up "because it should be like that". Other bad ones were
| done by junior devs who were high on DRY.
| laserlight wrote:
| > It is a silly example because in such a scenario there is no
| way you can even start writing an abstraction to handle that.
|
| You start writing the abstraction the first time you
| duplicate, so that you don't end up with this mess down the
| road.
| bitblender wrote:
| I think your disagreements are valid, but I don't think it is
| fair to say this is an amateurish take or infer the author's
| level of experience. Your example of unnecessary inheritance
| hierarchies (which I have also faced many times in real world
| scenarios) may even be a symptom of exactly what the author is
| saying: what you might call a "bad fitting abstraction" the
| author would just call "bad code". The implementation details
| of how code gets shared (composition vs inheritance) is a
| subtle but still vital consideration to the cost benefit
| analysis. The author is observing that it might be misleading
| or dangerous advice to urge developers to choose duplication
| just because issues with abstraction have been historically
| observed, which I completely agree with and do not consider
| myself to be an amateur. I also agree with you and other posts
| that the author fails to mention the (exponentially higher)
| costs of abstraction boundaries that also span human
| organizational boundaries.
| feoren wrote:
| Class inheritance is flawed because it tries to be two things
| at once: a shared "surface" (public members, polymorphism,
| etc.) and shared implementation. An abstraction is _only_ a
| surface -- this could be an interface, a function declaration,
| or even a data model. It almost never happens that
| implementation-sharing and surface-sharing completely coincide,
| and this is why class inheritance is falling out of favor and
| something I completely avoid ( _occasionally_ I will use
| abstract classes, but I usually regret it later). This is where
| "favor composition over inheritance" comes from. I'd go so far
| to say that because they cannot be completely divorced from
| implementation details, base classes cannot even be called
| abstractions.
|
| So if "wrong abstractions" includes shoddy base-class
| shenanigans, then the statement becomes almost tautological. Of
| course duplication is better than class inheritance --
| _everything_ is better than class inheritance. So the real
| statement there is "class inheritance is actually awful",
| which is important to understand, but a side point to this
| debate.
|
| If you don't count class inheritance as abstraction, then the
| tradeoff between code duplication vs. abstraction becomes much
| more nuanced, and that's what all this discussion is about. I
| certainly don't agree that ignoring class inheritance is a
| signal that the author is amateurish. Many complex codebases
| have no class inheritance at all.
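|
| The split between a shared "surface" and shared implementation
| can be sketched as follows. This is an illustration only: the
| names Renderable, Transform, Sprite and Label are hypothetical.
| The surface is a typing.Protocol, and the shared implementation
| is reused by composition rather than through a base class:
|

```python
from dataclasses import dataclass, field
from typing import Protocol


class Renderable(Protocol):
    """The shared surface: anything with a render() method qualifies."""

    def render(self) -> str: ...


@dataclass
class Transform:
    """Shared implementation, reused by holding an instance."""

    x: float = 0.0
    y: float = 0.0

    def describe(self) -> str:
        return f"({self.x}, {self.y})"


@dataclass
class Sprite:
    name: str
    transform: Transform = field(default_factory=Transform)

    def render(self) -> str:
        return f"sprite {self.name} at {self.transform.describe()}"


@dataclass
class Label:
    text: str
    transform: Transform = field(default_factory=Transform)

    def render(self) -> str:
        return f"label '{self.text}' at {self.transform.describe()}"


def render_all(items: list[Renderable]) -> list[str]:
    # Depends only on the surface, not on any common base class.
    return [item.render() for item in items]
```

| Sprite and Label never inherit from anything, yet both satisfy
| Renderable structurally and both reuse Transform.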
| jeremyjh wrote:
| Inheritance is very useful in domains like game engines,
| where it is very common to have a base object such as "Node"
| that has some properties that every object in the scene graph
| must have, which all share the same implementation. For
| example they should all have a parent property and a
| collection of children, and ways to modify those properties.
| They'll also share methods such as "render" which probably
| must be overridden in every subclass. It's not impossible to
| solve this with interfaces and composition but those
| solutions are sub-optimal.
|
| An example you might be more familiar with is the DOM of a
| web browser - every element has some basic properties and
| methods that all share an implementation.
| yellowapple wrote:
| > It's not impossible to solve this with interfaces and
| composition but those solutions are sub-optimal.
|
| The growing popularity of ECS and data-oriented design in
| game engines suggests otherwise: keeping components
| separate from entities enables both performance
| enhancements and separations of concerns that are much more
| difficult to achieve with the traditional inheritance-based
| approach. To illustrate a bit:
|
| > it is very common to have a base object such as "Node"
| that has some properties that every object in the scene
| graph must have, which all share the same implementation.
| For example they should all have a parent property and a
| collection of children, and ways to modify those
| properties.
|
| You don't need subclasses for that; you just need a table
| of entity IDs (where both the things to render and the
| scene itself are entities) and parent IDs, which you can
| then recursively walk to get the entities you want to
| render:
|
|     WITH RECURSIVE entity_children AS (
|         SELECT id, parent FROM entities
|         UNION ALL
|         SELECT ec.id, ec.parent
|         FROM entity_children AS ec
|         JOIN entities AS e ON ec.id = e.parent
|     )
|     INSERT INTO scene_entities (scene, entity)
|     SELECT $scene_entity_id, id
|     FROM entity_children
|     WHERE parent = $scene_entity_id;
|
| (Obviously you probably won't actually be running SQL
| queries in a game engine's rendering loop; this is just to
| illustrate the logic.)
|
| Once you've got that list...
|
| > They'll also share methods such as "render" which
| probably must be overridden in every subclass.
|
| You don't need subclasses for that; you just need a table
| of entity IDs and things to render, which you can then
| query and send to the GPU:
|
|     INSERT INTO some_buffer_in_GPU_memory
|         (entity, mesh, texture, position)
|     SELECT se.entity, em.mesh, et.texture, ep.position
|     FROM scene_entities AS se
|     JOIN entity_meshes AS em ON se.entity = em.entity
|     JOIN entity_textures AS et ON se.entity = et.entity
|     JOIN entity_positions AS ep ON se.entity = ep.entity
|     WHERE se.scene = $scene_entity_id;
|
| (Again: you probably ain't actually using SQL for this;
| this is also overly simplified, since most modern game
| engines use all sorts of other stuff besides a mesh,
| texture, and position when rendering something. Note also
| that "em.mesh", "et.texture", and "ep.position" need not be
| actual meshes/textures/positions, but could instead be
| indices into buffers already on the GPU.)
|
| The key advantage in both of these cases is that the
| parent/child data and the render data can live where they
| make the most sense, and can be processed by independently-
| running systems with minimal contention. This is critical
| for processing game logic in parallel - something which the
| game industry is learning the hard way with legacy engines
| that can't fully exploit multicore hardware.
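|
| For readers who want to run the parent-ID table walk, here is a
| small sketch using Python's sqlite3 module. The schema, the sample
| rows and the scene_descendants helper are assumptions made for
| illustration. This version starts the recursion at the scene's
| direct children and joins each newly found entity to its own
| children, so it collects the whole subtree and then stops:
|

```python
import sqlite3

# Sample data (hypothetical): entity 1 is a scene, 2 and 3 are its
# children, 4 is a child of 2. Entity 6 is a second scene with child 5.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entities (id INTEGER PRIMARY KEY, parent INTEGER);
    INSERT INTO entities (id, parent) VALUES
        (1, NULL), (2, 1), (3, 1), (4, 2), (6, NULL), (5, 6);
    """
)


def scene_descendants(conn, scene_id):
    """Return the ids of every entity below scene_id in the parent table."""
    rows = conn.execute(
        """
        WITH RECURSIVE entity_children(id) AS (
            -- Base case, direct children of the scene entity
            SELECT id FROM entities WHERE parent = :scene
            UNION ALL
            -- Recursive step, children of entities already found
            SELECT e.id
            FROM entities AS e
            JOIN entity_children AS ec ON e.parent = ec.id
        )
        SELECT id FROM entity_children ORDER BY id
        """,
        {"scene": scene_id},
    ).fetchall()
    return [row[0] for row in rows]
```

| No subclassing is involved: the hierarchy lives entirely in the
| (id, parent) pairs, which is the point the comment is making.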
| feoren wrote:
| Quite the opposite: game engines are one of the few places
| where the sub-optimality and fundamental problems with
| object inheritance became so overwhelming that people
| started abandoning their deeply ingrained CS 101 models of
| Dog : Animal and invented Entity-Component-System
| architecture, which at its extreme uses no object
| inheritance at all and is a deeply "relational" model. Game
| engines which _don't_ do this were either mostly developed
| before ECS was invented/popularized (Unreal) or are
| specifically targeting beginners who have little more than
| a CS 101 understanding of OO programming (and also
| following Unreal's lead).
|
| DOM elements are a better example, but just because that's
| how they _are_ done doesn't mean that's how they _should_
| be done. Does a <script> element really need a "focus()"
| method? It has one. Does a <br> element need an "innerHTML"
| property? It has one. Does a <head> element need an
| "offsetHeight" property? It has one. If you look at the
| history of the development of HTML and JavaScript as a
| shining ideal of software engineering, you're certainly in
| the minority (this is all before TypeScript, which _is_ a
| shining ideal of type systems!). The HTMLElement class has
| 134 properties, most of which make no sense for most
| elements. It has a long history and a lot of excuses for
| becoming what it is today, but I would not recommend you
| follow that lead in your own designs.
| whstl wrote:
| Not really. Composition has been the preferred technique
| for a long time already.
|
| A lot of Games and GUIs that use inheritance worked _in
| spite_ of that inheritance. In more complex object graphs
| there were always things like _override boolean
| DoNotActuallyRender()_ in one or two children of the
| _RenderableNode_ class to account for special behaviour.
|
| ECS is just the nail in the coffin of inheritance in game
| engines. And it's not even new anymore, it has been
| fashionable for what, almost 15 years now?
| bob1029 wrote:
| Duplication is a superpower if you can put your OCD into a box
| for a little bit and frame it as a temporary stepping stone.
|
| Refactoring nightmare codebases _can_ become trivial if you
| don't mind a few copies of "the same thing" being kept around to
| satisfy serializers and other legacy APIs. Writing mappers
| between nearly-equivalent types sucks really hard but it still
| sucks a lot less than saying things like "let's just rewrite the
| whole product".
| disintegore wrote:
| I resent how much we've trained developers to value concision
| over everything else. I can't tell you how many times I've seen
| people use DRY as a justification to alias stuff that's already
| heavily abstracted by the framework that they use, ending up with
| less useful interfaces. Either that, or they'll explode the
| cognitive load by building crazy type hierarchies and inserting
| opaque anti-patterns like factories and decorators and whatnot.
|
| These are "the wrong abstractions" in the sense that they're not
| actually crappy code full of conditionals and are actually well-
| redacted and not all that hard to decipher. They're "the wrong
| abstractions" in the sense that there's either a way to do it
| that is simpler and makes fewer assumptions, or in the sense that
| they are worse than "no abstraction" which is to say sticking to
| the abstractions that have already been invented for you by
| people whose jobs it is to do that exact work for millions of
| engineers and are therefore probably way better equipped.
| HumblyTossed wrote:
| A little bit of code duplication tends to be much less toxic than
| a lot of discussions on proper coding techniques.
| tester756 wrote:
| ehh, mediocre ideas, tricks, dogmas, religious approaches.
|
| You have to evaluate decisions by the case, with the context.
|
| It is called software *engineering* - you have some goals and
| constraints and you design with those in mind.
|
| Sometimes it is better to duplicate, sometimes it is better to
| have a single source of truth.
| lijok wrote:
| > So far so good, perhaps. But, by creating this new abstraction,
| the programmer signals to posterity that this new abstraction is
| "the way things should be" and that this new abstraction ought
| not to be messed with. As a result, this abstraction gets
| bastardized over time as maintainers of the code need to change
| it yet simultaneously feel compelled to preserve it.
|
| How I've been thinking about abstractions that turn bad over time
| is: It was likely the correct abstraction at the time it was
| made, given the requirements the writer had on hand. Now that the
| abstraction is wrong, don't muck with it. Gather the new
| requirements and write a v2.
|
| I think the vast majority of abstractions will go bad over time.
| To abstract is to generalize, and generalizations become invalid
| over time because the world evolves over time. It's sort of like
| trying to preserve a summary of a book that is continuously
| having new pages added and existing pages replaced.
| tracker1 wrote:
| What seems to serve me best is keep things as simple as possible.
| If you add abstractions, do so to make the rest of the code
| easier and less complex. If you must do something complicated,
| break it apart as pragmatically as possible and do it in the
| simplest way possible. Favor YAGNI (you aren't going to need it)
| over corporate-wide libraries that lock you in.
|
| Keep your codebase discoverable first. Structure by
| feature/function not type. Favor the local developer experience
| first. If you cannot open, follow and run the code easily, your
| developers won't be able to onboard quickly. Someone else will
| have to continue with your mess, make it as orderly as possible.
| I find that docker-compose can help a lot on this front, as can
| developer containers.
| aeturnum wrote:
| I think the core of the difference can be found in the _What
| exactly is meant by "the wrong abstraction"?_ paragraph.
| Admittedly, the quoted article is also a bit confusing here, but
| I think it's easy to resolve.
|
| I think the wisdom of the original saying is hard to understand
| when you just look at any piece of code as it exists. Instead,
| imagine the future. You have two pieces of code that do similar
| things - you can centralize them (with a bunch of conditionals)
| to have a "single" code path, or you can allow them to stay
| separate (perhaps confusing new people). The wisdom of
| "duplication is cheaper" is to observe that it will generally be
| less work to allow the duplication than to maintain the
| circumstantial needs over time. Each time you need to "do the
| same thing again but a little different" you can either add more
| conditionals to a single piece of code, or add another instance
| of 'duplication' which can just deal with the concerns at hand.
| It's not about "crappy code" - it's about the difficulty of
| having one piece of functionality serve many masters over time.
|
| IMO, in general, you will also find that if you have many
| 'duplicated' copies of code, it will often be easier to see the
| truly duplicated sub-sections that you can DRY out into a common
| subroutine. I find that is easier to see with duplicates than
| with a single piece of complex code.
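|
| A rough sketch of the two options, with hypothetical function
| names and document formats chosen only for illustration. Option 1
| centralizes both callers behind a flag. Option 2 keeps two
| near-duplicates and extracts only the sub-step that is truly the
| same in every copy (format_amount):
|

```python
def format_amount(cents: int) -> str:
    # The genuinely shared sub-step, visible in every copy.
    return f"${cents // 100}.{cents % 100:02d}"


# Option 1: one centralized function that serves every caller via a flag.
def render_document(kind: str, cents: int, customer: str) -> str:
    if kind == "invoice":
        header = f"INVOICE for {customer}"
    elif kind == "receipt":
        header = f"Receipt - thank you, {customer}!"
    else:
        raise ValueError(f"unknown kind: {kind}")
    line = f"Total: {format_amount(cents)}"
    if kind == "invoice":
        line += " (due in 30 days)"
    return f"{header}\n{line}"


# Option 2: two near-duplicates, each free to change for its own caller.
def render_invoice(cents: int, customer: str) -> str:
    total = format_amount(cents)
    return f"INVOICE for {customer}\nTotal: {total} (due in 30 days)"


def render_receipt(cents: int, customer: str) -> str:
    total = format_amount(cents)
    return f"Receipt - thank you, {customer}!\nTotal: {total}"
```

| Each new "same thing but a little different" request adds another
| branch to render_document, whereas in option 2 it adds one more
| small function that only has to satisfy its own caller.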
| coldtea wrote:
| > _It seems to me that what's meant by "the wrong abstraction" is
| "a confusing piece of code littered with conditional logic". I
| don't really see how it makes sense to call that an abstraction
| at all, let alone the wrong abstraction._
|
| No, it means the wrong abstraction. Like forcing a one-size-fits-
| all abstraction on a few pieces of duplicated code, and not
| waiting for them to grow to enough cases to hint at what is the
| best pattern/abstraction/architecture to handle them (perhaps
| more than one, for different classes that somebody might just
| shove in a single abstraction prematurely).
| overgard wrote:
| Every time you create an abstraction to remove duplication,
| you're tying two pieces of code together and creating a common
| dependency. The more dependencies you have, the harder it is to
| change code, because a change in one place reverberates in many
| places.
|
| To me, that's the cost. You gain a decrease in code size and
| verbosity at a cost of making localized changes more difficult.
| LeifCarrotson wrote:
| I call this a distinction between "inherent sameness" and
| "incidental sameness".
|
| Yes, right now, those two servers have the same number of
| processor cores. But who's to say that after a hardware update
| that will still be true?
|
| Conversely, the fact that every processor has a certain number
| of cores is inherent to the way we represent a processor.
|
| In my line of industrial automation, it's almost always cheaper
| to pay the cost of complexity up front, and assume that every
| conveyor VFD might get replaced with a different model, or with
| a contactor, somewhere down the line. That duplication is cheap
| when the line is on the integrator's shop floor. Any downtime
| later on, when enormous dependencies have come to rely on that
| line, is more costly.
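|
| The server example can be put into a few lines of code. This is
| only a sketch: Processor, the model strings and worker_count are
| hypothetical names. The cores field is inherent to what a
| processor is, while the two servers having equal values is
| incidental and so is deliberately not shared:
|

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    # Inherent sameness: every processor has a core count, so the
    # field belongs in the shared type.
    model: str
    cores: int


# Incidental sameness: both servers happen to have 16 cores today.
# Keeping two separate values (rather than one shared constant) means
# a hardware update to one server does not touch the other.
web_server = Processor(model="hypothetical-cpu-a", cores=16)
db_server = Processor(model="hypothetical-cpu-a", cores=16)


def worker_count(cpu: Processor) -> int:
    """Leave one core free for the OS, but never return less than 1."""
    return max(1, cpu.cores - 1)
```

| Replacing web_server with a 32-core model later is a one-line
| change that leaves db_server alone.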
| tflinton wrote:
| > Duplication is bad. In fact, duplication is one of the most
| dangerous mistakes in coding.
|
| I have to disagree with this; the article feels lofty in its
| assumption that when you start to program you know what to
| abstract. More often people begin abstracting due to misguided
| axioms like "DRY" rather than to solve a problem with a real cost
| benefit trade off. DRY as a goal in itself is fairly dangerous.
|
| I can't count how many convoluted and confusing frameworks people
| have put together under this misguided perspective. It's not
| atypical for an abstraction born of "DRY" motivations to be more
| code and more brittle than just copying and pasting 2 lines in 15
| places.
|
| That's not to say abstractions are inherently bad, but abstracting
| purely for the sake of DRY is a mistake.
| lolinder wrote:
| The problem is with being either dogmatic or thoughtless in
| either direction. I've seen what you're talking about: people
| combine code religiously because of DRY, leading to insane
| pyramids of abstractions that are impossible to modify.
|
| However, I've also seen people copy and paste everything they
| ever need. When that happens, those offshoots gradually evolve
| independently from one another, and introducing a proper
| abstraction becomes a huge slog. I've spent hours reading
| through git blame trying to piece together a phylogenetic tree
| of the various copies of the same code so we can ensure that
| the new abstraction contains all relevant features and bug
| fixes. I wish those developers had thought more carefully about
| DRY.
|
| I think the best balance is to use these catch phrases as
| principles to guide your decision making, while being willing
| to make exceptions when they don't apply. If DRY makes you
| think for a second before copying a piece of code, it's done
| its job, even if you decide that this situation really does
| call for a copy.
| briantakita wrote:
| "Duplication is cheaper than the wrong abstraction" makes sense
| coming from the Rails community. Between the meta-programming,
| lack of static types, large amount of unit tests, etc. Rails has
| a tendency to lock a project into an abstraction choice that is
| very expensive to change. The pain is particularly intense during
| major version Rails upgrades. From my experience, the Rails
| framework got in the way & bogged down project velocity. It was
| difficult to move away up to ~2010 as many of the jobs were
| locked into Rails. There were many frustrated Rails programmers
| around that time. When node.js, Go, & other languages/platforms
| came out, there were finally full stack libraries that did not
| lock in abstractions as heavily as Rails. Nowadays, I use
| astro.js, solid.js, & target isomorphic libraries. The
| flexibility of Javascript with the static types of Typescript
| make changing abstractions significantly easier. The Javascript
| ecosystem spent far too long focusing on SPAs when the isomorphic
| MPA was low hanging fruit.
| emtel wrote:
| I think it comes down to this:
|
| - If your code has a bug, you will be better off without
| duplication, so that the bug must only be fixed once.
|
| - If you will have to change the behavior of your code for
| product reasons, duplication is often better, because user needs
| are idiosyncratic. If the code is fully factored, you may have to
| pass in flags to indicate which behavior should be used in which
| case.
|
| Learning to anticipate which of these two cases you might find
| yourself in in the future comes with experience.
| BurningFrog wrote:
| The question I ask myself is "If X changes, how many places in
| the code need to change?", for any reasonable value of X.
|
| If the answer is 2 or more, you probably want to deduplicate
| something.
|
| If not, it doesn't really matter if different code looks similar.
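|
| One way to picture that question, using made-up names and values
| (VAT_PERCENT, the length limits) purely as an illustration. The
| first pair of functions must both follow one decision, so the
| value lives in one place. The second pair merely look alike and
| answer to different decisions:
|

```python
# X = "the VAT rate changes". Both functions below must follow that
# single decision, so the rate is defined exactly once.
VAT_PERCENT = 20  # hypothetical value


def price_with_vat(net_cents: int) -> int:
    return net_cents + net_cents * VAT_PERCENT // 100


def vat_portion(net_cents: int) -> int:
    return net_cents * VAT_PERCENT // 100


# These two look alike but answer to different values of X (password
# policy vs. username policy). Changing one never forces a change in
# the other, so merging them would only couple unrelated decisions.
def valid_password(value: str) -> bool:
    return 8 <= len(value) <= 64


def valid_username(value: str) -> bool:
    return 8 <= len(value) <= 64
```

| If the VAT rate changes, one line changes. If the username policy
| changes, one function changes and password handling is untouched.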
| nraf wrote:
| > for example, the same several lines of code duplicated across
| distant parts of the codebase dozens of times, and with
| inconsistent names which make the duplication hard to notice or
| track down
|
| While I think there's merit in deduplicating these situations,
| one pitfall is introducing coupling and tangled dependencies when
| DRYing.
|
| There are ways around this of course, but I've come across a
| number of instances where deduplication has led to unnecessary
| coupling between modules.
| andrewprock wrote:
| The underlying problem is that the "don't repeat yourself"
| principle is often in conflict with the "single responsibility
| principle". Structurally, this comes down to the problem of
| managing dependencies. Over the years, the problem of dependency
| management has become bigger and more difficult to tame.
|
| The same problem holds for internal code as well as external
| code. Duplicating code creates one kind of dependency problem
| (feature drift). Shared code creates another kind of dependency
| problem (increased coupling). Broadly speaking, solutions which
| reduce coupling are going to be cheaper to maintain.
|
| Ideally, there would be clear, well defined layers with narrow
| communication protocols.
| tabtab wrote:
| I find it's better to keep abstractions small and independent, so
| you can mix and match. Too big, and they risk not fitting future
| change well. Even if the smaller ones create a bit more work or
| "mini duplication", it's worth it to have that flexibility.
| andybak wrote:
| Typing on phone so I'll be brief. The key concept I find missing
| from this piece is "locality".
|
| When dealing with a complex and/or unfamiliar codebase, locality
| (by which I mean "I can understand this thing here without
| jumping around the codebase") can make up for a lot of other
| deficiencies.
|
| And imho, dedupped code with an excess of if statements is
| actually one of the least worst things to encounter.
| rightbyte wrote:
| Many programmers believe that the more complex the better.
|
| In my experience good code looks silly simple, such that you
| might think the problem was easy. And thus underrate the author
| ...
|
| I have never read someone else's code at work and complained that
| a function is too big with too many if clauses.
|
| However, deep call trees are really hard to comprehend.
| Especially if some function is called multiple times in the
| same call stack (unless the algorithm is recursive in a good
| way).
| julienreszka wrote:
| Says the amateur. There is never enough granularity. DRY is only
| useful after a significant threshold of repetition
| aranchelk wrote:
| The elephant in the room is without strong static typing and a
| good type checker changing abstractions is somewhere between a
| significant pain in the ass and downright perilous.
|
| In my experience when you have those things, whether you make
| significant changes to your API or decide to dedupe old divergent
| copy pastes, it's largely just busy work -- very little thought
| involved. The type checker says change line 135 in file foo.
| Okay, next.
| luckycharms810 wrote:
| Whenever this conversation is had - it seems to completely
| dismiss the idea of domain. Duplication doesn't happen in a
| vacuum - it happens within a certain context. Some acceptable
| conditions for duplication include:
|
| * If two things are semantically different within the context of
| a domain but require similar functionality.
|
| * Code paths with different risk profiles.
|
| * When new functionality is evolving with domain learnings.
| MathMonkeyMan wrote:
| I read a blog post somewhere (don't remember where) that
| describes the process of unfactoring (multiplying?) code as an
| exercise. Copy/paste the code until there's one straight code
| path per use case. Then examine the similarities and factor the
| code again. What you end up with will often be different from
| what you started with, and probably simpler, especially if the
| code had begun to drift from its original author's design.
|
| So, "unfactor" the code and then factor it again. Let's call
| it... "refactoring."
|
| My $0.02, then, is that "the wrong abstraction" assumes that you
| are unwilling to change it. What if we were comfortable tearing
| down our classes all willy nilly and replacing them with some
| other thing? Is it too risky? Does it hurt too many feelings?
|
| Maybe the problem lies there, instead of in duplicate vs.
| abstract.
| maximinus_thrax wrote:
| [dead]
| gorjusborg wrote:
| The saying 'duplication is cheaper than the wrong abstraction' is
| a gem of a saying, but like many pieces of wisdom, takes
| experience to fully understand.
|
| I first saw the saying when DRY was being applied without any
| nuance. If a piece of code appeared in two places, it was
| obvious, and important, to factor it out, because that was 'good
| coding practice'.
|
| The saying being discussed was pushback against that kneejerk,
| thoughtless application of DRY. The 'cheaper than the wrong
| abstraction' is pointing out that DRY isn't a 'no tradeoff'
| policy. By factoring out any duplication, many uses pass through
| the same code. If the uses don't quite match, there is a tendency
| for the code to get modified to fit them anyway. This, over time,
| makes the shared code simultaneously unfit for use, and widely
| used. A recipe for poor code quality and system health.
| Ironically, this is the outcome that DRY was called in to
| address.
| devjab wrote:
| I think it's down to the systems, and I think the people who
| favour abstraction often forget who needs to write it.
| Duplication isn't just cheaper than the wrong abstraction, it's
| cheaper than almost any abstraction. Not because it should be,
| mind you, but because duplication works for a tired Thursday
| afternoon programmer and abstraction doesn't. Maybe it's
| because I spent some time in management, but a key concept I
| worked with when I did that was how we have two modes of mental
| capacity. One where we have the energy and wit to do the right
| thing, and one where we haven't slept for a week, and, well...
| it's Thursday afternoon after a day of too many useless
| meetings.
|
| I think the best way I saw it put was for a Theme-park to coin
| a slogan that any employee would be able to find inspiration in
| when dealing with a customer on that Thursday afternoon. To me
| most abstractions are similar to having a slogan along the
| lines of "Think Different", which is an absolutely useless
| concept when you're tired and dealing with an angry customer in
| your summer job about an hour before you clock out.
|
| I obviously don't think you should avoid all abstraction. The
| author of the article is right, theoretically at least, it's
| just that this way of thinking rarely works out. Similar to
| you, my experience is that it tends to fail after a few years
| of changing needs.
|
| These days I favour abstraction only when its use is never
| altered in the slightest. For everything else duplication is so
| much easier to handle over 5+ year periods. Of course there are
| many ways to deal with this. Small single purpose functions are
| abstractions as well, just don't build big OOP hierarchies.
| Because they just don't work for those Thursday afternoons.
| wvenable wrote:
| The most important thing to note about DRY is that it's not
| about code -- it's about knowledge. You should not repeat
| knowledge -> logic, constants, etc. If the temperature is 87
| and the price of the widget is 87 that is coincidence and not
| repetition.
|
| There should just be one source of truth for any logic or
| process. If you duplicate that then bad things will eventually
| happen.
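|
| A minimal sketch of that distinction (every name and number
| below is made up for illustration): the two 87s stay separate
| because they record unrelated facts, while the tax rule is
| knowledge and lives in exactly one function.
|
```python
# Hypothetical values: equal today, but unrelated facts (coincidence).
TEMPERATURE_F = 87
WIDGET_PRICE_CENTS = 87

# Knowledge: the tax rule gets a single source of truth.
SALES_TAX_PERCENT = 8  # assumed rate, purely illustrative


def with_tax(price_cents: int) -> int:
    """Return the price including tax, rounded down to whole cents."""
    return price_cents + price_cents * SALES_TAX_PERCENT // 100


def invoice_total(prices_cents: list[int]) -> int:
    # Reuses the rule instead of restating the 8% figure here.
    return sum(with_tax(p) for p in prices_cents)
```
| Merging the two constants into one would couple the weather to
| the price list; copying the 8% into invoice_total would
| duplicate knowledge.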
| jasonswett wrote:
| Totally agree, although I'd maybe replace "knowledge" with
| "knowledge or behavior".
| starbugs wrote:
| Anything pushed to the extreme will result in its opposite.
|
| There's an interesting Wikipedia page about a concept from
| philosophy called "Unity of opposites":
| https://en.wikipedia.org/wiki/Unity_of_opposites
|
| Worth a read IMHO.
| zogrodea wrote:
| Thanks for the link. This sounds more useful to refer to
| (because more general) than horseshoe theory.
| serial_dev wrote:
| > If the uses don't quite match, there is a tendency for the
| code to get modified to fit them anyway.
|
| And this is essential, this is how you'll end up with 5 arguments
| and 6 further bool flags to a 7-line function.
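|
| A small sketch of that drift, with invented names: format_name
| picked up one boolean per caller that didn't quite fit, while the
| two functions under it cover the only combinations that (by
| assumption) anyone actually uses.
|
```python
# Hypothetical shared helper after several callers bent it to fit.
def format_name(first, last, upper=False, last_first=False, initial_only=False):
    if initial_only:
        first = first[0] + "."
    name = f"{last}, {first}" if last_first else f"{first} {last}"
    return name.upper() if upper else name


# The same behavior as two small functions, one per actual use.
def display_name(first, last):
    return f"{first} {last}"


def index_entry(first, last):
    return f"{last}, {first[0]}.".upper()
```
| Three flags already allow eight combinations, most of which no
| caller asked for.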
| ilyt wrote:
| And one of them, for some reason, is read from an environment
| variable...
| _the_inflator wrote:
| DRY should be an option.
|
| Results may vary and depend on the code in question as well as
| the language you are using.
|
| We - a former team a couple of years ago using Java - started
| to duplicate code in Java, because we were totally tired of
| interface'ing and class'ing everything away that was not DRY.
| It became too tedious to bloat code with them as well as
| understanding whole classes when all you got was references to
| other interfaces etc.
|
| If there is a small service architecture like in Angular with
| TypeScript, abstracting away becomes fun and useful.
|
| It all depends. But what I really do not miss is the pile of
| interfaces in Java and C#. These became so tough to grasp and
| entangled, that we DRY'ed this cesspool. DRY on DRY so to say.
| switchbak wrote:
| So your issue was with the nature of the language and the
| size of the project more than the application of DRY?
|
| I think I see what you're getting at, but I've certainly also
| seen very large Java projects that are simple at a high level
| and composed in such a way that they're still legible without
| a ton of duplication. These might be somewhat orthogonal
| concepts.
| thecodrr wrote:
| The author is going into technicalities without much actual
| substance, ending with: it depends.
|
| I think whenever we, as programmers, try to pin down a certain
| principle, it bites us. Hard. DRY was cool as an observation but
| when it got turned into a law we saw the spaghetti code.
|
| Duplication, on the other hand, is detested almost as much as the
| goto statement. Let me tell you, it's not that bad. Duplicate
| code makes everything more flexible. It helps you to NOT bend
| over backwards in order to change a line of code. It allows you
| to NOT touch anyone else's code.
|
| So many good things. Of course, I agree with the author's summary
| of the bad things that can happen with duplicated code. But
| there's a litmus test for that:
|
| If you have to make changes in multiple blocks of duplicated code
| in order to change the behavior of something, there's a problem.
| DRY out the code so you only have to touch 1 place.
|
| If, however, 2 blocks of code LOOK similar but aren't actually
| the same, and changing one block doesn't make the other block
| outdated and stale, you are good to go.
|
| Judge and decide. It's just 2 approaches that when taken to an
| extreme can cause a lot of pain, but if used with common sense,
| nothing is simpler.
| overgard wrote:
| > Duplication, on the other hand, is detested almost as much as
| the goto statement.
|
| Honestly, even the goto statement isn't that bad. It's pretty
| useful in C code. I'm not saying anyone should put it in a new
| language, but the amount of hate it gets is really just related
| to BASIC monstrosities from the 1970s, not any real-world
| applications of it.
| ants_everywhere wrote:
| The wrong abstraction can completely destroy a startup. I've
| never seen duplicate code with that ability to cause damage.
|
| Or consider the centuries humans spent trying to make geocentric
| astronomical models work.
| edgyquant wrote:
| You're lucky then. I joined a company that had a team of
| inexperienced engineers where every form or details page was a
| separate program and the render functions were several hundred
| lines long by themselves. When I joined they had a dozen pages
| that were each so buggy adding new ones was nearly infeasible
| and fixing bugs took most of the dev time. Duplicate code can
| certainly slow down the dev process and kill a startup.
| waffletower wrote:
| I have seen much damage from duplicate code at multiple
| organizations. I have seen thoughtful abstractions work
| successfully to mitigate it, and rarely encountered the
| opposite. I have encountered multiple pejoratives: copypasta
| coders, couch developers et al.
| DustinBrett wrote:
| Whatever is faster is cheaper because everything needs to
| constantly justify its existence with new features.
| gosukiwi wrote:
| I like Dan Abramov's "The Wet Codebase"
| (https://www.youtube.com/watch?v=17KCHwOwgms) -- I've been guilty
| of doing just what he says in his talk at first, removing all
| duplications and making the codebase DRY. But then I came to like
| "prefer duplication over the wrong abstraction", as Sandi Metz
| puts it.
|
| Sometimes it's good to wait to have more data to make an easier
| and more informed decision.
| kristov wrote:
| Abstraction is not just about hiding code - it's about reducing
| options. You purposefully reduce options to make the system
| easier to reason about. A "function" in a programming language is
| an abstraction over machine code. It looks like variables have
| scope in an isolated environment, and it looks like the braces
| mean something, but it's compiled down to machine instructions
| that have no such concept. Goto considered harmful, but compiled
| machine code is littered with jump instructions (of course). You
| can do a lot of funky tricks with machine code that the higher
| abstraction of a programming language doesn't let you do. When
| you create an abstraction you reduce options for the user of that
| abstraction. So abstractions tend to gather cruft over time
| because users want those restrictions relaxed to do their special
| thing.
| feoren wrote:
| Absolutely right. One of the most important questions to ask an
| abstraction is: what can I _not_ do with this? If the answer is
| "nothing -- you can do everything you could before", then the
| abstraction is an inner platform. The entire power that
| abstraction brings is in "focusing" on the problems we care
| about solving; it must make other problems impossible (ideally
| ones we don't care about). It follows from the No Free Lunch
| theorem.
|
| One way to make sure your abstractions are focused on solving
| the right problems is to always define them based on what you
| _need_, not based on what you _have_. The root of the
| abstraction vs. duplication debate comes down to this. Indeed
| it's unhelpful to look at two pieces of code and say "these
| look the same; I will abstract them!". Instead you say "wow
| these have really similar needs; I will define exactly what
| that need is and they'll both ask for it."
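|
| One way to sketch "define the need and let both ask for it" in
| Python is a typing.Protocol. The classes and method names here
| are hypothetical; the point is only that total_cents states what
| it needs and cannot do anything else with what it is given.
|
```python
from typing import Iterable, Protocol


class HasBalance(Protocol):
    """The need, stated once: anything that can report a balance in cents."""

    def balance_cents(self) -> int: ...


class SavingsAccount:
    def __init__(self, cents: int) -> None:
        self._cents = cents

    def balance_cents(self) -> int:
        return self._cents


class GiftCard:
    def __init__(self, remaining_cents: int) -> None:
        self._remaining = remaining_cents

    def balance_cents(self) -> int:
        return self._remaining


def total_cents(holders: Iterable[HasBalance]) -> int:
    # Asks only for the stated need; nothing else about the objects is used.
    return sum(h.balance_cents() for h in holders)
```
| Neither class inherits from the other or from a shared base;
| they just both satisfy the same need.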
| lamontcg wrote:
| This whole article is based on a bad reading of the problem.
|
| The problem that happens when code is first duplicated is that
| the correct abstraction is a fundamental UNKNOWN.
|
| If you knew the right way to de-duplicate it, you would of course
| always construct that abstraction, because that would always be
| better.
|
| What happens in practice is that the wrong abstraction is usually
| chosen.
|
| Then that incorrect abstraction isn't usually held around because
| of "[feeling] honor-bound to retain the existing abstraction" (if
| that's a direct quote from Sandi then I disagree with the quote
| and feel it has entirely the wrong emphasis). The problem is that
| it is always easier to add a new knob to the bad abstraction than
| it is to go back and de-dup the whole code and fix the
| abstraction. So the bad abstraction tends to accrete more bad
| abstractions on top of it until it becomes a mess because of
| doing the cheap, easy thing.
|
| We should not do that. But the realities of software development
| are that when you are dealing with an orthogonal problem, you
| WILL wind up adding a knob to something that can be done in a
| day, rather than taking 2 weeks to refactor a different subsystem
| that your original problem only barely touches and isn't the
| primary concern of whatever business objective you are trying to
| deliver.
|
| So the advice is to let it sit for awhile. Let the code accrete a
| few more requirements over the weeks or months ahead, and when
| you find yourself doing a double edit to both sides of the code
| and the right abstraction is clear to you then go ahead and de-
| duplicate it.
|
| Note that if the problem is TRIVIAL then go ahead and de-dup it
| right from the start. This isn't advice for junior programmers
| who are faced with something as simple as dropping two hash keys
| into an array and then iterating over it so that it makes it easy
| to add a third key. This is more about having two classes which
| are fairly similar and extracting a whole base class and jamming
| all the shared code into the base with a tightly-coupled poorly-
| thought-out "wide" interface (using inheritance as a hammer to
| de-dup code). And the whole problem becomes even worse if someone
| external might come along and pick up that base class and start
| using it with the existing API and you might be locking yourself
| into a shitty API that you can't change without breaking
| backwards compatibility.
|
| And even if you're in a "non-OO" language like Go you can still
| make this mistake by designing bad interfaces, it is the exact
| same thing.
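|
| For the TRIVIAL case mentioned above, a sketch of what de-duping
| right away might look like (the record shape and key names are
| assumptions for illustration):
|
```python
# Before: the same check written out once per key.
def missing_fields_duplicated(record: dict) -> list[str]:
    missing = []
    if "name" not in record:
        missing.append("name")
    if "email" not in record:
        missing.append("email")
    return missing


# After: the keys sit in one list, so a third key is a one-line change.
REQUIRED_KEYS = ["name", "email"]


def missing_fields(record: dict) -> list[str]:
    return [key for key in REQUIRED_KEYS if key not in record]
```
| No new interface and no base class, so there is little to get
| wrong and nothing to lock in.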
| highwind wrote:
| The article seems to be arguing against conditionals not
| duplication.
| danbruc wrote:
| That something is the wrong abstraction is something you can only
| know after the fact, at the time you build the abstraction it is
| - or at least it should be - a reasonable choice. And later on as
| the code evolves there are two possible outcomes, the abstraction
| remains a good choice or the abstraction stops being a good
| choice and you have to change it. Maybe it can be saved with some
| refactoring, maybe it has to go completely.
|
| But at the very least you had a working abstraction for some time
| and you can easily figure out all the places where this
| functionality is used and you have a single place to make changes
| when you have to make them instead of having to hunt down all the
| different places with slightly different implementations. Even if
| an abstraction breaks completely down and has to be split up into
| several implementations, each of those will usually have several
| usages which would all still be repetitions without the
| abstraction.
| karmakaze wrote:
| There's an aspect of "not seeing what others are seeing" here.
|
| > I think "the wrong abstraction" is a confused way of referring
| to poorly-de-duplicated code. Here's why. [...]
|
| > So instead of "duplication is cheaper than the wrong
| abstraction", I would say "duplication is cheaper than confusing
| code littered with conditional logic". But I actually wouldn't
| say that, because I don't believe duplication is cheaper. I think
| it's usually much more expensive.
|
| It seems the author is considering 'cost' to be the mechanical
| effort of managing the sync/desync of the DRY code. What it's not
| considering is that _distinct intents_ can _incidentally_ use the
| same implementation at the moment. This is when it's not a good
| idea to DRY because they are not _meant_ to stay in sync.
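|
| A tiny sketch of incidental sameness, with hypothetical rules and
| limits: the bodies match today, but each answers to a different
| requirement, so nothing says they must change together.
|
```python
# Two rules that happen to share an implementation at the moment.

def is_valid_username(text: str) -> bool:
    # Intent (assumed): usernames must fit the login form.
    return 3 <= len(text) <= 20


def is_valid_coupon_code(text: str) -> bool:
    # Intent (assumed): coupon codes must fit the printed voucher.
    return 3 <= len(text) <= 20
```
| If the voucher layout changes, only is_valid_coupon_code should
| move; a shared helper would drag usernames along with it.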
| seadan83 wrote:
| I think a key distinction often lost here is that generic code
| and abstract code are different. Abstract code hides details,
| generic code allows its use in more places. When hiding details,
| often it also becomes more generic. Making code generic does not
| necessarily hide details, it can very well often expose
| additional details.
|
| Also seemingly not mentioned - SRP (single responsibility
| principle). SRP & DRY should be considered together. If a person
| DRY'ies up code without regard to SRP, they're making any code
| that can be generic, generic. A rule of thumb is generic code is
| 3x more expensive than non-generic code.
|
| ==============
|
| To illustrate, here is an example (and pretend that these
| examples are duplicated in 20 different places that all need the
| account balance sum):
|
| --------------
|
| Example (1) - non-generic, non-abstract
|
| ```
|
| int savingsBalance = 1;
|
| int checkingBalance = 1;
|
| int totalBalance = savingsBalance + checkingBalance;
|
| ```
|
| --------------
|
| Example (2) - generic, minimally abstract
|
| ```
|
| int savingsBalance = 1;
|
| int checkingBalance = 1;
|
| int totalBalance = addBalances(savingsBalance, checkingBalance);
|
| ```
|
| --------------
|
| Example (3) - abstract, potentially generic:
|
| ```
|
| int totalBalance = addBalances();
|
| ```
|
| ===============
|
| Now consider what happens if we need to add a 'brokerage account
| balance' to the mix (and let's say we get that value via an API
| call). These examples change in the following ways:
|
| Example (1), updated:
|
| ```
|
| int savingsBalance = 1;
|
| int checkingBalance = 1;
|
| int brokerageBalance = fetchBrokerageBalance();
|
| int totalBalance = addBalances(savingsBalance, checkingBalance,
| brokerageBalance);
|
| ```
|
| Example (2), updated:
|
| ```
|
| int savingsBalance = 1;
|
| int checkingBalance = 1;
|
| int brokerageBalance = fetchBrokerageBalance();
|
| int totalBalance = addBalances(savingsBalance, checkingBalance,
| brokerageBalance);
|
| ```
|
| Example (3), updated & unchanged:
|
| ```
|
| int totalBalance = addBalances();
|
| ```
|
| Example (1) & Example (2) have similar scaling behavior here
| (scaling relative to complexity). This illustrates a very key
| difference between abstract and generic code.
|
| Now, let's say on another hand that whether we should include
| brokerage balance is conditional. In example 1, we have the same
| logic to be applied in 20 different places. We can mutate example
| 3 to be more generic (EG: pass in flag -
| `addBalances(Flags.includeBrokerageAccount)`). At this point we
| can say that the abstraction is wrong and needs to be split into
| different methods (which is fine!). Making example 3 more generic
| is more complex, we incur the penalty of having generic code.
| Example 1 is arguably the worst to have since we will get subtle
| errors if we fail to update everything. In part these design
| principles are also there to help protect updates and make them
| safe (very similar to the ACID guarantees of a database that help
| make it so you can update data without breaking the overall
| database).
|
| Another mention, which I won't go into detail over, boiler plate
| code has yet different characteristics.
|
| In sum, it's largely a question of what kind of coupling is best
| and how to deal with that coupling. Duplicated code is coupled
| without any runtime or compile time checks that it stays in sync
| (if you forget to update something from example 1 above, it's a
| bug!). Keeping code consolidated into a common procedure does not
| remove that coupling, it just changes the nature of the coupling
| and makes it more explicit. Common code between micro-services
| couples those micro-services together (and that can be very bad).
|
| Thus, we need to look at a lot of things when applying DRY: we
| need to consider SRP, whether we are coupling services together,
| and whether or not we are simply making non-generic code
| generic.
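|
| A runnable sketch of the two options for the conditional
| brokerage balance, written in Python rather than the pseudo-code
| above. The fetch_* helpers and their return values are stand-ins
| invented for the example.
|
```python
from enum import Flag


# Hypothetical stand-ins for wherever the balances really come from.
def fetch_savings_balance() -> int:
    return 1


def fetch_checking_balance() -> int:
    return 1


def fetch_brokerage_balance() -> int:
    return 5


class Flags(Flag):
    NONE = 0
    INCLUDE_BROKERAGE = 1


# Option A: keep one function and make it more generic with a flag.
def add_balances(flags: Flags = Flags.NONE) -> int:
    total = fetch_savings_balance() + fetch_checking_balance()
    if Flags.INCLUDE_BROKERAGE in flags:
        total += fetch_brokerage_balance()
    return total


# Option B: split the abstraction into two named methods.
def add_bank_balances() -> int:
    return fetch_savings_balance() + fetch_checking_balance()


def add_all_balances() -> int:
    return add_bank_balances() + fetch_brokerage_balance()
```
| Both options keep the rule in one place; B just names the two
| cases instead of making every caller carry a flag.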
| seadan83 wrote:
| Typo in that updated Example (1), should have been:
|
| int totalBalance = savingsBalance + checkingBalance +
| brokerageBalance;
|
| It's hard to explain such complicated concepts super concisely.
| What I'm getting at is that DRY is often equated to merely making
| code generic and re-used, while the goal of DRY is not at all
| about re-use. Generic code is more complicated than non-generic
| code, thus if we make code generic for the sake of making it
| used in many places - that is likely going to make things more
| complex. It's a fundamental misunderstanding that DRY is simply
| the act of using human pattern matching to make all similar
| looking code generic and re-used. Instead DRY is more about:
| "are we sanitizing data before sending it to the front-end?
| Then that should be done in one place." "Where are we
| configuring database connections?" etc..
|
| Further DRY should not be the only guiding factor, SRP &
| coupling should always be considered at the same time as DRY.
| preommr wrote:
| Duplication is cheaper because of how most programmers write code
| at their job:
|
| - write stuff as fast as possible, without having time to think
| about overall architecture, especially if it involves having to
| cooperate with other devs. It's easier to just implement
| something that's as quick and as simple as possible so that it
| can be passed off to someone else with minimum effort.
|
| - no need to communicate the abstraction semantics - no need for
| documentation outlining the abstraction, reasoning, possible
| expansion, etc.
|
| - it's much easier to make localized changes. A well written
| abstraction will cover some logic that might be spread across
| multiple areas. Changing something major to the abstraction
| requires understanding how the abstraction affects all its
| applications in the same ticket. Whereas duplicated code can
| result in a ticket being resolved by just making a change to a
| specific code block, like a function.
|
| - Things that work well aren't appreciated. If it's easy to
| update an abstraction to a new feature, it'll be the expected
| outcome. When a change like the previous point is needed, it's
| much more memorable because the frustrating experience is
| likely to be longer and more strenuous. We also tend to remember
| negative experiences over positive ones.
|
| - Abstractions require reading more code with additional levels
| of indirection and devs don't like reading other people's code.
|
| - Writing things well requires effort, so bad abstractions are
| more likely.
|
| - More mature projects tend to have more abstractions because of
| their additional complexity, so I would guess that there's a
| strong correlation between difficult projects and frequency of
| abstractions.
|
| - Some people went absolutely nuts with writing blog posts, and
| evangelizing certain techniques which were completely unnecessary
| in an effort to push out content. There's lots of things to write
| about on implementing abstractions. But little in the other
| direction other than don't write unnecessary abstractions.
| AnimalMuppet wrote:
| The flip side is, duplication is bad because when you find a
| bug and fix it, did you fix it _everywhere_? How many places
| were there where the bug needed fixed? Are you sure you got
| them all? It's much easier when there's only one place that
| you have to fix.
| mfitton wrote:
| I'm getting shot in the foot by this right now as our team
| embarks on tackling some long-term tech debt.
|
| The approach we've found that works is health checks and
| manually looking into cases when we think we've fixed a bug,
| as it will often point us to a piece of duplicated code we
| missed that we can wrap into the fold.
| postalrat wrote:
| Or maybe it's a bug in 80% of the cases and not in the other
| 20%.
| preommr wrote:
| Ticket closed, case closed.
|
| And I am only half-joking about that. I don't think that
| effort is that visible and often goes unrewarded. I feel like
| a lot of managers don't directly, but indirectly use number
| of tickets closed as a sign of productivity which affects
| promotions and compensation.
|
| Obviously YMMV, but teams that care about their code quality
| to such an extent are less likely than places that act as
| ticket factories.
| hooverd wrote:
| Maybe that bug in that abstraction was actually load bearing.
| ozim wrote:
| Flip side of the flip side is bad because when you fix the
| bug in "abstraction" or de-duplicated piece of code: how do
| you know you did not break something you don't know.
|
| Duplication is easier because once you fix that single place
| - you are 100% sure you fixed that place and you did not
| break 10 other places. Maybe you know "by writing unit
| tests", but when do you write unit tests? When you find out you
| broke something.
|
| Funny story time: had an add/edit popup in system because
| they looked the same so dev just made it "single thing".
| For something like 3 months: dev1 fixed something -> qa2 found a
| bug X -> dev2 fixed something -> qa1 found a bug Y -> dev3
| fixed something -> qa3 again found bug X. When I got into the
| code base I noticed that ping-pong because somehow I was the only
| sane person to check git history, and I split things up.
| Something like that happened multiple times in my career.
| aidenn0 wrote:
| I clicked on the "more nuanced and comprehensive post" and the
| real TL;DR is "I define duplication differently than everybody
| else, and, by that definition, claim that duplication is always
| bad"
| jasonswett wrote:
| > Just because a piece of duplication costs something doesn't
| automatically mean that the de-duplicated version costs less.
| It doesn't happen very often, but sometimes a de-duplication
| unavoidably results in code that's so generalized that it's
| virtually impossible to understand. In these cases the
| duplicated version may be the lesser of two evils.
| janaagaard wrote:
| > My understanding of the "duplication is cheaper than the wrong
| abstraction" idea, based on Sandi Metz's post about it, is as
| follows. When a programmer refactors a piece of code to be less
| duplicative, that programmer replaces the duplicative code with a
| new, non-duplicative abstraction.
|
| I think one of the main takeaways from Sandi Metz's quote is that
| you should postpone creating the abstraction until after you have
| the duplicated code. Sometimes you will remove the duplication
| when you have just two implementations, sometimes you will want
| many more. Once you have the repeated code it's relatively easy
| to make the right abstraction.
| olingern wrote:
| I passionately disagree with this. Abstractions inherently
| introduce some level of opaqueness and it's only useful in the
| context of making things more maintainable. Duplicated code is
| easier to reason about because its intent is closer to the
| problem it originally solved.
| wellpast wrote:
| To talk about "duplication is cheaper than the wrong abstraction"
| without invoking "dependencies" at all (and their costs) means
| the entire premise has been missed.
|
| Another tell:
|
| > Don't try to make one thing act like two things. Instead,
| separate it into two things.
|
| If abstractions were so easily split like this, then the advice
| wouldn't hold. But they never are. Abstractions immediately
| accumulate dependencies making it near impossible to split them,
| as we all learn after living in anything other than toy code
| bases.
|
| The hallmark of a junior (i.e. someone who has not been to
| battle much) is making de-duplicating code a priority and not
| understanding the cost of dependencies.
| bheadmaster wrote:
| It took me a long time and many thousands of lines of code
| written, read and re-written in order to understand one thing:
| Code is supposed to reflect the intention.
|
| Good code reflects that intention smoothly, like a well-written
| paragraph of a book reflects the events that happened in the
| story.
|
| DRY makes sense semantically, when a piece of code _always_ needs
| to be the same as another piece of code - that's when you
| isolate it into a function with a semantically meaningful name
| and behavior. Applying DRY without understanding and
| indiscriminately leads only to confusion and needless complexity.
| Ensorceled wrote:
| I have a current example where this bit my team.
|
| A fairly common pattern that I've seen over and over in multiple
| domains is this: Given a group of "things"
| with a start and stop date, list all the things that are
| "active" during a given date range.
|
| Someone abstracted it because we have several "things" that use
| this logic.
|
| Then it had a bug because some of the things are inclusive and
| some are exclusive.
|
| Then it had a bug because some of the things use dates and some
| timestamps.
|
| Then it had a bug because some of the things are timezone aware
| and some are not.
|
| So we started down the path of a rather simple query construction
| becoming a complex thing with flags for inclusive/exclusive for
| start and end, timezone settings ...
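|
| A rough sketch of where that path starts. Thing, active_during
| and stop_inclusive are invented for illustration and are not the
| team's actual code; it only shows the shared query growing its
| first flag.
|
```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Thing:
    name: str
    start: date
    stop: date


def active_during(things, range_start, range_end, *, stop_inclusive=True):
    """Return the things whose start..stop span overlaps the given range.

    stop_inclusive is the first flag: when False, a thing that stops
    exactly on range_start no longer counts as active.
    """
    result = []
    for t in things:
        ends_late_enough = (
            t.stop >= range_start if stop_inclusive else t.stop > range_start
        )
        if t.start <= range_end and ends_late_enough:
            result.append(t)
    return result
```
| The dates-vs-timestamps and timezone bugs would each add another
| parameter like this one.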
| IshKebab wrote:
| That sounds like your code just isn't properly typed.
|
| For example in Rust the first bug would be caught by `Range` vs
| `RangeInclusive`.
|
| The second bug would trivially be caught because dates and
| timestamps are different types.
|
| The third is trickier, but (depending on exactly what you mean)
| that can be caught with static types too.
|
| Pointing your finger in the wrong place IMO. If anything this
| refactoring highlighted worrying inconsistencies in your code
| that probably would have cropped up as bugs elsewhere.
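|
| The same idea can be approximated outside Rust. A sketch in
| Python, assuming you run a static checker such as mypy (Python
| itself won't reject a mix-up at compile time); the class names
| are made up and only mirror `Range` and `RangeInclusive`.
|
```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class HalfOpenRange:
    """start <= d < end, in the spirit of Rust's Range."""

    start: date
    end: date

    def contains(self, d: date) -> bool:
        return self.start <= d < self.end


@dataclass(frozen=True)
class ClosedRange:
    """start <= d <= end, in the spirit of Rust's RangeInclusive."""

    start: date
    end: date

    def contains(self, d: date) -> bool:
        return self.start <= d <= self.end
```
| A function annotated to take a ClosedRange then documents the
| boundary rule in its signature instead of in a flag.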
| willio58 wrote:
| Great example. One way to avoid these problems is having lots
| of tests written for the various uses of the abstracted thing
| so you know they're all covered. But also, if all of these
| things function in different nuanced ways, is it really any
| benefit to have them all jammed into the same abstraction in
| the first place? I've found this comes down to personal taste.
| I prefer a little duplication if it means not having to "own"
| an abstraction that I'll need to heavily document and hope
| people read the documentation for in order to not break. But
| some would rather own the one point of failure.
| williamdclt wrote:
| > So we started down the path of a rather simple query
| construction becoming a complex thing with flags for
| inclusive/exclusive for start and end, timezone settings ...
|
| Forcing the caller to _think_ about inclusivity and timezone
| awareness is not a bad thing, rather the opposite. These are
| important decisions to be taken: the abstraction is not trivial
| because what it abstracts actually does have inherent
| complexity.
|
| If the abstraction forces you to take the necessary decisions
| (inclusive? timezone?) without having to think of how to
| implement them, it doesn't sound like a bad abstraction. Too
| often these decisions are not thought about, and the expected
| behaviour is "whatever is implemented".
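|
| A sketch of an interface that forces those decisions, under the
| assumption that timestamps are Python datetimes; is_active and
| its parameters are hypothetical. There is no default for the
| boundary rule, and naive datetimes are refused.
|
```python
from datetime import datetime, timezone


def is_active(
    start: datetime, stop: datetime, at: datetime, *, stop_inclusive: bool
) -> bool:
    """Say whether `at` falls in the span; the caller picks the boundary rule."""
    for value in (start, stop, at):
        if value.utcoffset() is None:
            # Refuse naive datetimes so the timezone question can't be skipped.
            raise ValueError("timezone-aware datetimes required")
    if stop_inclusive:
        return start <= at <= stop
    return start <= at < stop
```
| Leaving out stop_inclusive is a TypeError at the call site, so
| "whatever is implemented" stops being an available answer.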
| alphanumeric0 wrote:
| Sounds like each thing should know how to search for active
| instances of itself, given a date range, which is a common OO
| abstraction.
| chiefbucket wrote:
| The point isn't the interface though it's the implementation.
| And if many of those things are implementing the same search
| functionality slightly differently, you're back to the same
| spot, except now your bugs are spread across multiple sites,
| often with duplication.
|
| The underlying issue is just that correctness is hard I
| think.
| marcosdumay wrote:
| One should at minimum name the things that behave differently
| by different names, what is a common practice in data
| modeling.
|
| I expect all those bugs to return again and again as
| different people maintain that code. At least with code
| deduplication they would have a clear alarm telling them
| their knowledge is wrong and they must pay attention. But
| with each query doing everything people will just assume they
| know it all.
| wvenable wrote:
| Who knows whether some things are inclusive or not? Who knows
| what use dates and timestamps? It seems like this should be
| abstracted somewhere and this knowledge codified in one single
| place. It sounds like your abstraction, in this case, isn't
| very abstract at all.
|
| That is common for bad abstractions -- they add a layer but
| they don't actually encapsulate any knowledge. To use this
| abstraction, you shouldn't be passing any flags for
| inclusive/exclusive, etc -- it should know that for you.
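|
| A minimal sketch of what "it should know that for you" could
| look like. The kinds and their rules are made up; the idea is
| just that the boundary knowledge sits in one table and callers
| pass no flags.
|
```python
from datetime import date

# Hypothetical kinds; the one place that records each boundary rule.
STOP_IS_INCLUSIVE = {
    "subscription": True,
    "promotion": False,
}


def is_active_on(kind: str, start: date, stop: date, on: date) -> bool:
    # The rule is looked up from the kind, not supplied by the caller.
    if STOP_IS_INCLUSIVE[kind]:
        return start <= on <= stop
    return start <= on < stop
```
| An unknown kind raises KeyError here, which at least fails loudly
| instead of silently picking a rule.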
| opportune wrote:
| When you work in a very large and complex codebase you encounter
| a few things that this author doesn't seem to consider or thinks
| are very minor:
|
| 1. Refactoring something introduces non-negligible risk. Consider
| a class with many fields and multiple mutexes it uses to control
| concurrent access to those fields. Even just consolidating those
| mutexes introduces the hard-to-conclusively-find-in-testing risks
| of introducing deadlocks and livelocks. And that's like the
| base case of refactoring the class: anything involving splitting
| the class up, moving data fields up or down the stack, changing
| the way member functions (which acquire locks) call each other
| is even more complicated and risky. It is just not worth
| refactoring this thing unless you have a very very good reason.
|
| 2. A function or object often has a many-to-many relationship in
| what it touches: it is called or accessed from multiple places
| and it calls and accesses many things. Non-trivial improvements
| to abstractions typically involve changes at both ends: which may
| be "as simple" as updating all the call sites to take a new
| argument or handle a different kind of error (hopefully all your
| call sites are structured so error handling is compatible with
| _their_ abstraction!) or as complex as completely refactoring
| multiple levels up and down the stack to reflect better-
| abstracted semantics.
|
| No, you shouldn't lazily copy-paste around such problems when they
| are straightforward enough. But it can be so much less work (and
| again, less risk of breaking things) to use composition +
| wrappers, or inheritance, or to copy some little chunk of code
| than to do things the "right" way.
|
| 3. Let's face it, your cool new abstraction sounds right in your
| head, but in a complex system it may just be playing abstraction
| whackamole once all the bugs and edge cases you're not initially
| considering get addressed. It may be impossible to fully
| understand the entire system from beginning to end, without which
| it's hard to be confident you're actually improving things before
| embarking on your epic partial rewrite, or at the very least know
| you're not changing semantics around some arbitrarily-drawn box.
| But if you're not even changing the semantics, see point 1.
| JohnMakin wrote:
| We need a new saying - "Premature DRY is the root of all evil."
| vemv wrote:
| I wish more people simply were happy with using _themselves_
| whatever set of beliefs/techniques they deemed best (abstraction,
| duplication, whatever), preaching nothing, and arguing less.
|
| Which is to say, there will never be a single truth for these
| topics. So why not build a mindset that is ready for encountering
| differing opinions, diverse code?
| hinkley wrote:
| When your job is mentoring, RCA or cleaning up after other
| people (hello) then these aren't opinions and aesthetics.
| They're empirical evidence and/or coping mechanisms.
|
| Invalidating people's coping mechanisms without proposing your
| own never goes over well. And sometimes even then.
|
| When diagnosing a production issue, we don't have the luxury of
| entertaining five different ways to solve the same problem. And
| code smells slow debugging-under-the-gun because most bugs are
| in code smells, so they draw your attention only to prove to be
| a false signal.
|
| If you don't do any of these things, then it's challenging to
| have empathy for or understanding of the people who do. The
| people keeping the wheels on deserve the benefit of the doubt.
| In fact anyone who will stand up and fix problems when they
| arise deserves a bigger vote on how things get done. Everyone
| else's opinions are theoretical rather than vocational.
| dathinab wrote:
| I do.
|
| I have seen a ton of time wasted due to the wrong abstraction.
|
| Though it's a question of how much and what you duplicate.
|
| Which means I partially agree with the article, which is more
| nuanced than the title implies.
|
| One of the most common cases of bad de-duplication is doing so
| with code which happens to be mostly the same, but where nothing
| from a business logic POV makes it the same.
|
| Or code which differs mainly in points which the language used
| needs a lot of complexity to abstract over.
|
| In my experience, a more powerful type system, like in Scala,
| Haskell or Rust, on one side has the benefit of making
| refactoring much less bug prone, but it also makes it easier to
| go into the "abstraction introduces too much complexity"
| territory. In the end, using a type system _appropriately_ is a
| skill, one which some technically very skilled people are
| missing.
|
| Though what I also realized is that with a strict type system,
| "top down abstractions" using e.g. custom
| traits/interfaces/abstract classes tend to be much more likely to
| cause issues compared to composite bottom-up abstractions using
| closures to fill in the missing part. Sadly, this kind of
| abstraction, while simple in the simple case, is also prone to
| needing some limited degree of higher-kinded typing in the less
| simple case. This puts limits on how much you can practically
| apply them in many languages (or they accidentally become too
| complex due to missing intuitive notation for the limited
| higher-kinded type parts needed).
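|
| As a rough illustration of the bottom-up style in the simple
| case, here is a Python sketch. The collect() helper and the
| sample data are hypothetical; Python's optional typing also hides
| the higher-kinded-type friction mentioned above, so this shows
| only the shape of the idea:

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def collect(
    items: Iterable[T],
    keep: Callable[[T], bool],
    convert: Callable[[T], R],
) -> List[R]:
    """Shared skeleton: filter, then convert. Closures supply the rest."""
    return [convert(item) for item in items if keep(item)]


# Two call sites fill in different parts without subclassing anything.
orders = [{"id": 1, "total": 40}, {"id": 2, "total": 250}]
large_order_ids = collect(orders, lambda o: o["total"] >= 100, lambda o: o["id"])
long_words = collect(["a", "bb", "ccc"], lambda w: len(w) > 1, str.upper)
```

| Each caller passes only the pieces that differ, and no
| interface or base class has to anticipate them in advance.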
|
| Though the most important thing for many projects is to make the
| code easy to change. And by this I mean changing the source
| code, not having complicated abstractions allowing you to use the
| same source code in many different ways even though you only
| use it in one way at any point in time.
| hinkley wrote:
| There are two situations I've observed where Sunk Cost Fallacy
| reliably doesn't kick in. One is three line functions and unit
| tests. The other is duplicated code. It's better to err on the
| side of mistakes that people don't get precious about fixing
| later.
|
| A lot of the arguments I have with coworkers end up being about
| friction and blind spots about friction. "You" think these
| things don't slow you or others down later, but I have a
| bibliography of incidents that say you're wrong. Wishful
| thinking is married to magic thinking, and they have a child
| named "mortgaging the future".
| stephc_int13 wrote:
| Software architecture is a domain where hard and fast rules don't
| work.
|
| This is all about understanding tradeoffs and nuance.
|
| In general, I believe that abstractions should be used with
| moderation, de-duplication is not always an improvement,
| especially in the long run.
|
| I've made this mistake a lot as I tend to be quite obsessive with
| so-called code "cleanliness".
|
| It is good that novice programmers are warned about the dark side
| of abstractions, but ultimately they'll have to experience it by
| themselves to fully grasp why and how they can be detrimental.
| jhp123 wrote:
| For some reason programmers think that an "abstraction" is the
| same as just naming something. If I take a bunch of code that
| will only work given specific, concrete conditions and give it a
| name like "setup()" then I have "abstracted" it.
|
| People who know what abstraction means, and people who use it to
| mean indirection or naming things, will of course never agree
| about how useful it is.
| [deleted]
| yowlingcat wrote:
| I found this blog post low on insight and thoughtfulness. I've
| worked with engineers in the past who had an inflated esteem not
| just of their own abilities but of the nature of the business
| domain they were ostensibly building solutions inside. I have
| found that in many cases, there's a level of naivete commingled
| with arrogance that comes from never having worked with an
| intrinsically complex enough problem to understand the true cost
| of abstraction, which is always nonzero.
|
| Now, it is the case that there are many cases where the cost of
| abstraction is low enough to not be ROI negative. But there are
| many cases otherwise. Other commentators here have done a great
| job of detailing that space -- that incidental and actual
| repetition vary, that abstractions should exist to reduce
| optionality and ease of reasoning rather than simply reducing
| code, and those are all correct. But at a very basic level, all
| of those observations reflect the most critical missing factor
| from this post, which is context.
|
| No software is created or operated in a vacuum. Every piece of
| software is created by humans to solve problems for themselves or
| other humans. So every piece of software is downstream of the
| working process of those humans. Given that these working
| processes are subject to change and evolution, changes in
| requirements aren't edge cases but table stakes. This means that
| often the cost of an abstraction is not just whether it's the
| wrong abstraction at a point in time, but also whether it's an
| abstraction that is likely to erode over time given a particular
| working process.
|
| With that said, a lot of this post seems like an exposition of
| this central point:
|
| ```
|
| If I were to see a confusing piece of code littered with
| conditional logic, I wouldn't see it and think "oh, there's an
| incorrect abstraction", I would just think, "oh, there's a piece
| of crappy code". It's neither an abstraction nor wrong, it's just
| bad code.
|
| ```
|
| I've seen this dismissal from many engineers over my career, and
| in every case, without fail, it reflected an inability to deeply
| read and understand the code, its history, and likely its future.
| To all the engineers out there reading this: thinking like this
| will prevent you from maturing from a junior engineer to a mid
| level engineer, never mind a mid level engineer to a senior
| engineer or engineering leader. You've been forewarned.
| henrydark wrote:
| I have a hot take on this, which I hope will resonate with at
| least a few people: duplication, even of blocks of up to a few
| long statements, rarely bothers me, because I remember all the
| duplications as a single instance. I have extraordinary memory,
| and this makes a huge difference in how I think of and write
| code. Or anything really. I save everything I've ever written,
| like bash history, but everything, and refer back to it and copy
| paste somewhere else. I wonder if anyone else has this. This
| doesn't affect how I think of production code, but it hugely
| affects my workflow.
| jmull wrote:
| The key factor when de-duping some code is to know whether the
| code is the same because they express _the same abstraction_ or
| due to _coincidence_.
|
| If they are the same abstraction then they should always be the
| same and you're doing the right thing to de-dupe.
|
| If they are the same due to coincidence, de-duping will tie
| together things that should be independent. As development
| continues the implementations will need to diverge. That's when
| you get the rat's nest of conditional logic. It's a lot easier to
| add a parameter and conditional logic to a function than rip it
| out.
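|
| A small Python sketch of that progression. The fee rules, names
| and numbers are all made up for illustration; amounts are integer
| cents to keep the arithmetic exact:

```python
from typing import Optional


# Two rules that merely happen to look identical today.
def shipping_fee_cents(order_total_cents: int) -> int:
    return max(500, order_total_cents // 10)


def late_fee_cents(balance_cents: int) -> int:
    return max(500, balance_cents // 10)


# What a merged version tends to turn into once the two rules diverge:
# every new requirement becomes another parameter and another branch.
def fee_cents(
    amount_cents: int,
    *,
    is_late_fee: bool = False,
    free_over_cents: Optional[int] = None,
) -> int:
    if not is_late_fee and free_over_cents is not None:
        if amount_cents >= free_over_cents:
            return 0  # free shipping, meaningless for late fees
    rate_percent = 15 if is_late_fee else 10
    return max(500, amount_cents * rate_percent // 100)
```

| With the two separate functions, a free-shipping threshold or a
| higher late-fee rate would each touch exactly one of them.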
|
| It's not always easy to tell if two bits of code are the same due
| to coincidence or not... it might come down to nuances of
| business considerations that the developer has no idea about (or,
| since we're talking about predicting the future, no one knows
| about).
|
| I don't think it can be done perfectly. But it's worth
| considering why _not_ to de-dupe before you do it.
| aleksiy123 wrote:
| I think this is the biggest thing people generally
| misunderstand about "duplication".
|
| It's really about whether two concepts need to change together
| over time. If they do, they should be singularly defined.
|
| If they can move independently they should be two definitions.
|
| It's not literally about the code looking the same.
| kevincox wrote:
| I was looking for this. There are definitely two types of
| duplication. For example not every use of the number 16 should
| be replaced with a SIXTEEN constant. However if the maximum
| allowed password length is 16 you shouldn't be writing 16 all
| over your code, you should be writing MAX_PASSWORD_CODEPOINTS
| because your system may depend on that value being consistent.
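|
| A quick Python sketch of the distinction. Only the constant name
| comes from the comment above; the helper functions are
| hypothetical:

```python
MAX_PASSWORD_CODEPOINTS = 16


def is_valid_password_length(password: str) -> bool:
    # len() on a Python str counts code points.
    return 0 < len(password) <= MAX_PASSWORD_CODEPOINTS


def password_length_error() -> str:
    # Same piece of knowledge, so it reads the same constant.
    return f"Password must be at most {MAX_PASSWORD_CODEPOINTS} characters."


def parse_hex_byte(text: str) -> int:
    # An unrelated 16: the base of hexadecimal. It stays a literal,
    # because it never changes together with the password limit.
    return int(text, 16)
```

| Changing the password limit now means editing one line, and the
| hex parser is unaffected by it.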
|
| Although I would disagree that you should never deduplicate
| things that are coincidentally the same. Sometimes code that is
| coincidentally the same can have the same bugs and require the
| same updates over time, so deduplicating them can reduce
| maintenance cost and remove bugs. However I wouldn't race to
| deduplicate these things. Just if they become frequent patterns
| or have remained the same for long enough to justify the effort
| to unify them.
| djha-skin wrote:
| I'm a DevOps engineer. I totally buy duplication is better than
| the wrong abstraction but I'd like to nuance it: duplication is
| better than an abstraction used by two disparate parties (groups
| of people that don't talk to each other).
|
| This is in agreement with Conway's law, which absolutely governs
| everything I do. I work on a DevOps team that supports several
| different development teams all working on different things. The
| code I write for those teams I often duplicate along team
| boundary lines. Build scripts, for example, I write and I put
| them in each team's git repository. These might look very
| similar. This allows the scripts to grow and change and evolve
| according to the different teams' needs without the teams needing
| to talk to each other.
|
| "Proper duplication" goes back to separation of concern. If you
| have two different concerns (using the lens of Conway's law, two
| very different teams) using the same code, perhaps they should
| not be using the same code because that is not a separation of
| concern. Separate the concerns by separating the code paths both
| concerns use.
|
| This type of duplication is praised in more depth on wingolog[1].
| I highly recommend reading it as something every engineer should
| read.
|
| It's very important to know when to duplicate and when not to do
| so, because duplicating at the wrong time can lead to pain, but
| not duplicating can lead to pain also.
|
| 1: https://wingolog.org/archives/2015/11/09/embracing-
| conways-l...
| waffletower wrote:
| Very sad hearing this from a DevOps engineer. While the config
| smear of ops is encouraged by their tools (Terrorform is a
| fantastic example), a DevOps engineer that does not dedicate
| themselves to DRY practices will erode the productivity of an
| organization by default. I remember how Terrable things were
| before our Ops team developed strong module abstractions for
| our infrastructure. And get them to talk to each other.
| schnable wrote:
| 100% agree. Coupling between systems and teams is very
| expensive and should be done as deliberately as possible.
| dahart wrote:
| This. I think you're hitting the nail on the head. The question
| is whether there are multiple dependencies on a given bit of
| code. When there are multiple dependencies, changing the code
| because one of them wants something means the code needs to be
| checked and tested against all the other places the code is
| being used. And it's really really common to have inadequate
| understanding and inadequate test coverage, so things break,
| and hence people develop superstitions about code that
| shouldn't be touched.
|
| Another way of putting it is that if the code is really truly
| duplicated, then it doesn't need to change at all. If it has to
| change, the need for change is there because the multiple
| parties depending on that code have slightly different needs
| and slightly different ideas about what they want. Abstracting
| the code to make deduplication happen is just a way of
| spackling over those differences, but it can and does often
| cause trouble down the road, even when it's done well. Once
| abstracted for two dependencies, a third dependency or more
| without test coverage can make changes exponentially more
| dangerous and error prone.
|
| Duplication is good when forking for separate parties (or
| separate dependencies), each of whom may wish to customize the
| code, and now they are free to do so without the risk or fear
| of breaking someone else. I feel like the author of the article
| didn't understand the benefits of duplication.
| ncruces wrote:
| This is an example of Go's proverb: A little copying is better
| than a little dependency.
| yoyohello13 wrote:
| Here's my hot take: It's not worth thinking about abstraction
| until you've implemented the same thing 3 times.
| samsquire wrote:
| I think I would want to look for an accurate "representation" and
| expression of the right problem, not any particular abstraction
| technique or mechanical refactoring.
|
| Refactoring code to your understanding helps you understand the
| code but leaves the code in a different organisation to how it
| was, adapted for your mental model of the problem.
|
| If programming languages were expressive enough, we could
| represent things how they are and replicate that base pattern to
| different cases or scenarios and that would be enough but
| unfortunately our languages are not expressive of our high level
| intent and invariants we want to maintain. (Such as extensibility
| or hookability)
|
| In other words, get the mental model for the problem right and
| the abstraction will be invisible and the solution shall be
| obvious.
|
| Abstraction impedance mismatch is when people introduce a design
| pattern or a strategy that is harder to understand than the
| problem that was being solved and obfuscates it.
| ckdot2 wrote:
| Don't re-use via inheritance, but via dependency injection. A
| (well-tested) software component that gets dependency-injected
| should be considered "final". If it makes sense to adjust it,
| you can still do it - there's nothing preventing you from
| this. But you should always be aware that logic relying on the
| dependency may behave differently in a way you haven't foreseen.
| If you just want to make a change for a single place in your
| software, you can easily replace the dependency with another one
| implementing the same interface. You could even decorate the
| original dependency if you want to re-use most of its code. What
| I want to say: nearly all of the abstraction issues come from
| inheritance, and in many, many cases there's no need to use
| inheritance at all.
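|
| A minimal Python sketch of injecting a dependency and decorating
| it instead of subclassing. All class names here are hypothetical
| and exist only to show the shape:

```python
from typing import Protocol


class Notifier(Protocol):
    def send(self, message: str) -> str: ...


class EmailNotifier:
    """The well-tested component, treated as final."""

    def send(self, message: str) -> str:
        return f"email: {message}"


class UppercaseNotifier:
    """Decorator: same interface, wraps and reuses another Notifier."""

    def __init__(self, inner: Notifier) -> None:
        self._inner = inner

    def send(self, message: str) -> str:
        return self._inner.send(message.upper())


class SignupService:
    def __init__(self, notifier: Notifier) -> None:
        self._notifier = notifier  # injected, never constructed here

    def register(self, name: str) -> str:
        return self._notifier.send(f"welcome {name}")


default_signup = SignupService(EmailNotifier())
loud_signup = SignupService(UppercaseNotifier(EmailNotifier()))
```

| Only the one place that wants different behaviour is wired with
| the decorated dependency; EmailNotifier itself is untouched.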
| dgb23 wrote:
| The most important underlying issue isn't discussed in the
| article:
|
| DRY must be understood and applied correctly.
|
| "Every piece of knowledge must have a single, unambiguous,
| authoritative representation within a system"
|
| The keyword here is _knowledge_.
|
| When we see duplication, repetition and so on, then that might be
| because that piece of code represents:
|
| - data of different entities that have similar structures
|
| - logic that just happens to be similar.
|
| - boilerplate code
|
| None of these things have anything to do with representing the
| same piece of knowledge in a program. In fact, you can easily get
| into trouble _especially_ if you think the first two things are
| violating DRY when they are not.
|
| I agree with the article wholeheartedly, though. If your code or
| data _model_ is not DRY, you can get into trouble very easily.
| Very nasty bugs, regressions during maintenance or extension,
| hours spent in frustration, money lost etc. On top of that: Non-
| DRY code almost always _proliferates incidental complexity_,
| because if you don't fix it, then eventually you patch over it.
|
| Here's the best case scenario: Even if you are aware of code not
| being DRY, do everything right and turn multiple knobs at the
| same time to change or extend it correctly instead of fixing it,
| you will do so with much more reluctance and it will be much more
| mentally taxing.
|
| Non-DRY code is by definition complex: You now have more
| interconnected parts than you need. So really, if you make your
| code more DRY, you _simplify_ it.
| dahart wrote:
| My favorite counter-acronym to DRY is WET: write everything
| twice (or thrice!). Doing and then redoing it once you
| understand it better is the best way to learn how to apply DRY
| correctly.
|
| > Non-DRY code is by definition complex: You now have more
| interconnected parts than you need. So really, if you make your
| code more DRY, you _simplify_ it.
|
| It really depends, I think there are some assumptions here that
| could use clarification. The whole point of choosing
| duplication is to _disconnect_ parts that shouldn't be
| connected, so I don't understand what you mean about non-DRY
| code being more interconnected than duplicated code. Conscious
| duplication (often called "forking") allows people who depend
| on a piece of code to change it without breaking anyone else.
| When you merge two pieces of similar code, they already had two
| or more separate uses, and you're adding a new connection,
| tying together the fates of two or more different users. From
| now on, if they don't have exactly the same agenda, there will
| be tension and/or bugs.
|
| If deduplication requires adding an abstraction layer, then
| that absolutely is adding complexity, and it happens because
| the code being de-duplicated was not _exactly_ the same. Code
| that's truly duplicated doesn't need to change in order to de-
| duplicate. So you can delete a copy in that case and centralize
| the dependencies onto the remaining copy. That eliminates code
| but doesn't really simplify; it has the potential to simplify
| future development, but it doesn't simplify the code at the
| moment of deletion. With modern build systems and project
| structures, however, it might take a lot of work and it might
| add complexity to get the DRY code into the right spot where
| it's visible to everyone who needs it. Another reason for
| duplication is to avoid having to do backflips to get the code
| into the right file or scope.
| dgb23 wrote:
| > Conscious duplication (often called "forking") allows
| people who depend on a piece of code to change it without
| breaking anyone else.
|
| Then that code is DRY by definition and simpler.
|
| It can't be Non-DRY because that would imply you'd need to
| change things in multiple places at once in order to avoid
| breakage.
|
| If you have two separate parts that can evolve independently,
| without coordination then those parts don't represent the
| same piece of knowledge.
| hinkley wrote:
| For testing I prefer DAMP. Descriptive And Meaningful
| Prose/Phrases. I've watched otherwise smart people wrestle
| with testing boilerplate when requirements change and I've
| had my fill for this lifetime.
|
| Each test is a separate story. At most tests in a suite
| should share setup code. Anything more than that is coupling
| of tests, which is a no-no. The distinction between mocks and
| fakes is the most common place I see this blow up in our
| faces. Fakes result in coupling of tests. They were difficult
| to write so they get amortized across ten tests, making new
| requirements difficult to impossible to add without
| accidentally removing coverage of other requirements.
| Attummm wrote:
| Until you have to maintain a codebase that has horrible
| abstractions, and everything is too coupled.
___________________________________________________________________
(page generated 2023-08-29 23:01 UTC)