[HN Gopher] Don't DRY Your Code Prematurely
___________________________________________________________________
Don't DRY Your Code Prematurely
Author : thunderbong
Score : 394 points
Date : 2024-05-30 15:45 UTC (7 hours ago)
(HTM) web link (testing.googleblog.com)
(TXT) w3m dump (testing.googleblog.com)
| PaulHoule wrote:
| Not a conclusive example.
|
| In the industry code that isn't DRY is a much bigger problem than
| code that is too DRY.
| znkr wrote:
| I have been in the industry for over 10 years now. Whenever I have to
| work with a project where someone used DRY consciously, I know
| I am in for a world of pain. Consolidating code is easy,
| pulling it apart is a lot harder.
| actionfromafar wrote:
| Can concur. Mostly it was I causing the pain, earlier.
| lpapez wrote:
| > Consolidating code is easy, pulling it apart is a lot
| harder.
|
| I absolutely agree with this, and the only thing I would add
| is that this difference is even more pronounced in codebases
| using a dynamic language.
|
| Sure it's not easy to navigate a bowl of duplicated
| spaghetti, but navigating opaque DRY service classes without
| explicit types is a _nightmare_.
|
| Luckily as an industry we've realized the benefits of static
| typing, but your point still holds true there.
| mytailorisrich wrote:
| How do you consolidate code?
|
| Good way to go at it is to isolate the functionality that is
| used many times and to pull it aside in its own function (or
| similar). That's just good code practice and also makes it
| easy to refactor and modify as needed.
| znkr wrote:
| It's not about being used many times, but about the
| necessity to evolve in the same direction. When that
| happens, it usually manifests as toil for the team.
| Consolidating code means to change the structure of the
| code so that only one piece needs to be modified in the
| future. That can take many forms, but it usually involves
| creating a new shareable component.
|
| Shareable components are more effort to maintain, so just
| creating them because they consolidate code is not always a
| good idea. You really want to have positive ROI here and
| you only get that if you actually reduce maintenance
| burden. For raw code duplication that doesn't have a
| maintenance issue on its own, the bar is a lot higher than
| most people think.
| PaulHoule wrote:
| Well, this morning I just fixed a case where somebody had
| used btoa to base64 encode something in Javascript and
| used methods from Buffer somewhere else because they'd
| been intimidated away from using btoa. (Ok, it is dirty
| to use UTF-8 codepoints if it is byte values, you can
| write btoa("A") but btoa("中") is a crash.)
|
| It would have been OK if they'd used the right methods on
| Buffer but they didn't.
|
| These encoding/decoding methods are a very good example
| of code that should be centralized, not least so you can
| write tests for them. (It is a favorable case for testing
| because the inputs and outputs are well defined and there
| are no questions of whether execution is done like you
| might encounter testing a React component) It is so easy
| to screw this kind of thing up in a gross way or in a
| subtle way (I'm pretty sure btoa's weirdness doesn't
| affect my application because codepoints > 255 never show
| up... I think)
|
| There's the meme that you should wait until something is
| used 3 times before you copy it but here is a case where
| two repetitions were too many and it had a clear impact
| on customers.
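|
| A minimal sketch of what "centralized, so you can write tests"
| could look like. The case above is JavaScript; this is a Python
| analogue for illustration only, and the helper names encode_text
| and decode_text are hypothetical. Every caller goes through one
| pair of functions that always encodes the UTF-8 bytes of the
| text, so there is a single place to test and to fix:

```python
import base64


def encode_text(text: str) -> str:
    """Encode text as Base64 over its UTF-8 bytes."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_text(encoded: str) -> str:
    """Inverse of encode_text; raises on malformed Base64 or invalid UTF-8."""
    return base64.b64decode(encoded, validate=True).decode("utf-8")
```

| Because inputs and outputs are plain strings, a round-trip check
| with non-ASCII text is enough to catch the gross mistakes.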
| mytailorisrich wrote:
| Raw code duplication is always a maintenance issue, given that
| centralising it when you notice the duplication (instead
| of continuing to copy-paste it) costs nothing.
| EugeneOZ wrote:
| I have 20 years in the industry and one of the rules I
| learned is: Articles justifying laziness are ALWAYS warmly
| welcomed and praised.
|
| To get internet points easily, write something like this:
|
| "Clean code is overrated"
|
| "SOLID is holding you back"
|
| "Tests are less important than profits"
|
| "KISS is the only important principle"
|
| "Declarative programming is only suitable for pet projects"
|
| "Borrow checker is the plague of Rust"
|
| and so on.
| 12_throw_away wrote:
| I basically agree, but doesn't this just mean, if I'm
| consolidating non-DRY code, that I'm now the one using DRY
| consciously, and the next dev will be cursed with all of my
| newly introduced DRY abstractions?
| a1369209993 wrote:
| > Whenever I have to work with a project where someone used
| DRY [ _]consciously[_ ], I know I am in for a world of pain.
|
| Huh. When you put it that way, that's actually a good point.
| In my experience, competent programmers will try to
| consolidate repeated code, and then cite "because DRY" if
| asked why, but I can't think of any case where I or anyone
| else competent _started_ with "needs more DRY" as the
| original motivation (as opposed to "this is an
| incomprehensibly verbose mess" or the like).
|
| Conversely, _starting_ with "don't repeat yourself [and
| don't let anything else repeat itself]" as a design goal does
| seem to correlate well with cases where someone temporarily
| (newbie) or permanently (moron/ideologue) incompetent
| followed that design principle off a cliff.
| rqtwteye wrote:
| "Consolidating code is easy, pulling it apart is a lot
| harder."
|
| My experience is the opposite. The less code, the better. I
| just spent a week on refactoring UI automation test code
| where they had copied the same 30 lines of code into almost
| 100 places. Every time with an ID changed and some slightly
| different formatting. It took me a few days to figure out
| that these sections do the same thing so I decided to
| introduce a function with ID as parameter. It was a lot of
| work to identify all sections and then to make sure they are
| really equivalent.
|
| Saved us 3000 lines of code and now we can be sure that
| timeouts and other stuff are handled correctly everywhere. And
| we can respond to changes quickly.
|
| That's DRY to me. Don't copy/paste code. Introduce functions.
| Ideally in the simplest way. When you have functions, you
| declare the same behavior everywhere.
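|
| A minimal sketch of that refactoring, under loud assumptions:
| FakePage is a hypothetical stand-in for whatever UI automation
| driver the tests use, and click_when_visible is an invented
| name. The point is only that the copied wait-then-act block
| becomes one function with the element ID as a parameter, so
| timeout handling lives in one place:

```python
import time


class FakePage:
    """Stand-in for a UI automation driver (hypothetical, for illustration)."""

    def __init__(self, visible_after):
        # Maps element ID -> number of polls before the element "appears".
        self._visible_after = dict(visible_after)
        self.clicked = []

    def is_visible(self, element_id):
        remaining = self._visible_after.get(element_id)
        if remaining is None:
            return False  # unknown element never appears
        if remaining > 0:
            self._visible_after[element_id] = remaining - 1
            return False
        return True

    def click(self, element_id):
        self.clicked.append(element_id)


def click_when_visible(page, element_id, timeout_s=1.0, poll_s=0.01):
    """Poll until the element is visible, then click it; raise on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if page.is_visible(element_id):
            page.click(element_id)
            return
        time.sleep(poll_s)
    raise TimeoutError(f"element {element_id!r} not visible after {timeout_s}s")
```

| Each former 30-line copy then shrinks to a single call such as
| click_when_visible(page, "save").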
| nkozyra wrote:
| > In the industry code that isn't DRY is a much bigger problem
| than code that is too DRY.
|
| As with anything dogmatic, it truly depends. There are times
| when the abstraction cost isn't worth it for a few semi-
| duplicate implementations you want to combine into a single
| every-edge-case function/method.
| PaulHoule wrote:
| There's a certain psychological attraction to messy and
| confused situations which people are just too comfortable
| with but it explains why things like GraphQL (didn't have a
| definition for how it worked for years because "Facebook is
| going to return whatever it wants to return") inevitably win
| out over SPARQL (which has a well-defined algebra).
|
| One of my biggest gripes (related to the post) is the data
| structure
|
|     create table student (
|         ...
|         applied_date datetime,
|         transcript_received datetime,
|         recommendation_letter1_received datetime,
|         recommendation_letter2_received datetime,
|         rejected_date datetime,
|         accepted_date datetime,
|         started_classes_date datetime,
|         suspended_date datetime,
|         leave_of_absence_start_date datetime,
|         leave_of_absence_end_date datetime,
|         ...
|         graduated_date datetime,
|         ...
|         gave_money_date datetime,
|         died_date datetime
|     )
|
| which is of course an academic example but one that I've seen in
| many kinds of e-business applications. Nobody ever seems to
| think of it until later but two obvious requirements are: (1)
| query to see what state a user was in at a given time, (2)
| show the history of a given user. The code to do that in the
| above is highly complex and will change every time a new
| state gets added. The customer also has experiences like "we
| had a student who took two leaves of absence" or "some
| students apply, get rejected, apply again, then get accepted"
| When you find data designs like this you also tend to find
| some of the records are corrupted and when you are recovering
| the history of users there will be some you'll never get
| right.
|
| If you think before you code you might settle on this design
|
|     create table history (
|         student_id integer primary key,
|         status integer not null,
|         begin_date datetime not null,
|         end_date datetime
|     )
|
| which solves the above problems and many others in most
| situations. (For one thing the obvious queries are trivial
| and even complex queries about times and events can be
| written with the better schema.) I can't decide if the thing
| I hate the most about being a programmer is having to clean
| up messes like the above or having to argue with other
| developers about why the first example is wrong.
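|
| A sketch of the two "obvious" queries against a history table,
| using Python's sqlite3 with an in-memory database. Several
| details are assumptions made for the example and not part of
| the design above: student_id is not a primary key here (one
| student needs several rows), status is stored as text for
| readability, dates are ISO 8601 strings, a NULL end_date means
| "current state", and intervals are half-open:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table history (
        student_id integer not null,
        status     text    not null,
        begin_date text    not null,  -- ISO 8601 dates compare correctly as text
        end_date   text               -- NULL means "still in this state"
    )
""")
conn.executemany(
    "insert into history values (?, ?, ?, ?)",
    [
        # A student who applies, is rejected, applies again, is accepted.
        (1, "applied",  "2020-01-10", "2020-03-01"),
        (1, "rejected", "2020-03-01", "2021-01-15"),
        (1, "applied",  "2021-01-15", "2021-03-20"),
        (1, "accepted", "2021-03-20", None),
    ],
)


def status_at(student_id, day):
    """Requirement (1): which state was the student in on a given day?"""
    row = conn.execute(
        """select status from history
           where student_id = ? and begin_date <= ?
             and (end_date is null or ? < end_date)""",
        (student_id, day, day),
    ).fetchone()
    return row[0] if row else None


def history_of(student_id):
    """Requirement (2): the full history of one student, oldest first."""
    rows = conn.execute(
        "select status, begin_date from history"
        " where student_id = ? order by begin_date",
        (student_id,),
    )
    return rows.fetchall()
```

| Neither query changes when a new status is introduced, and the
| repeated application is just two more rows.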
|
| If "No code" is to really be revolutionary it's going to have
| to have built-in ontologies so that programmers get correct
| data structures for situations like the above that show up
| every day in everyday bizapps where there is a clear right
| answer but it is usually ignored.
| gls2ro wrote:
| Two points here just for fine grain discussion:
|
| 1. The first table structure is a flat, non-normalized table
| structure that trades normalization for easy querying and
| selection of computed properties.
|
| 2. The second structure is a normalized table structure that
| pays for that normalization with joins.
| PaulHoule wrote:
| Either one is normalized so far as I know.
|
| It is easy to write a query for the first that gets a
| list of students' names and the dates they applied. That
| query is harder for the second one. On the other hand
| figuring out what state a user was in at time _t_ could
| be a very hard problem with the first table.
|
| My experience with the first is that you find corrupted
| data records, one cause of that will be that people will
| cut and paste the SQL queries so maybe 10% of the time
| they wind up updating the wrong date. Systems like that
| also seem to have problems with data entry mistakes.
|
| The biggest advantage of #2 is ontological and not
| operational, which is that in a business process an item
| is usually in exactly one state out of a certain set of
| possible states. It turns out that this invariant shapes
| the set of reasonable requirements that people could
| write and the subconscious expectations of users, and it
| needs to be implicitly followed by an application, etc.
|
| Granted some of the dates I listed up there don't quite
| correspond to a state change, for instance the system
| needs to keep track of when a student started an
| application and when the last document (transcripts,
| letters, etc.) has been received. With 5 documents you
| would have 32 possible states of received or not and
| that's unreasonable, particularly considering that a
| student with just one letter and a very strong
| application in every other way might get accepted despite
| that. It's fair to say the student can have an "open
| application" and a "complete application". Similarly you
| could say the construction of an airplane or a nuclear
| power plant can be defined by several major phases but
| that these systems have many parts installed so if the
| left engine is installed but the right engine is not
| installed these are properties of the left and right
| engine as opposed to the plane.
| danielmarkbruce wrote:
| In your part of the industry, perhaps. My experience has been
| the opposite.
| ravenstine wrote:
| Same. From what I've seen, most code is written with
| abstractions and DRY as a high priority rather than writing
| code that is performant and doesn't take jumping between 5
| different files to make sense of it.
| danielmarkbruce wrote:
| I started writing Go around 2012 or so because of the file
| jumping thing. Drove me nuts. I'm sure there were many
| folks doing the same thing.
| jacknews wrote:
| "In the industry code that isn't DRY is a much bigger problem
| than code that is too DRY."
|
| which industry is that?
|
| in general programming, absolute nope
|
| not-DRY code can be weaseled out with a good ide
|
| badly abstracted code, not so much
|
| in fact in a way, DRY is the responsibility of the IDE not the
| programmer - an advanced IDE would be able to sync all the
| disparate code segments, and even DRY them if necessary
|
| but when I read DRYed code, the abstraction better be a
| complete and meaningful summary, like 'make a sandwich', and
| without many parameters (and no special cases), or else I'd
| rather read the actual code
|
| i understand the impulse to try to factorize everything but it
| just doesn't work beyond a certain point in the real world;
| it's too difficult to read, and there's always an 'oh, can you
| just' requirement that upends the entire abstract tower.
| goatlover wrote:
| You didn't provide any evidence for this, you just stated
| your coding preference. Which is usually the case in these
| discussions. Some anecdotes, and then people making grand
| claims based on personal preference. Obviously, some
| programmers have thought the opposite and have their own
| anecdotes.
| ldjkfkdsjnv wrote:
| Abstraction too early is usually a mistake, no one is smart
| enough to predict all the possible edge cases. Repeated code
| allows someone to go in there and add an edge case easily. It's
| a more foolproof way of programming.
| swatcoder wrote:
| Having specialized in project rescue, touring all over "the
| industry", you can't possibly make that generalization.
|
| For every purported best practice, there are teams/orgs that
| painted themselves into a corner by getting carried away and
| others that really would have benefited from applying it more
| than they did.
|
| In the case of DRY, it's an especially accessible best practice
| for inexperienced developers and the project leads many of them
| become. Many many teams do get carried away, mistaking "these
| two blocks of code have the same characters in the same
| sequence" for "these two delicate blocks of code are doing the
| same thing and will likely continue to do so"
|
| Having advice articles floating around on both sides of
| practices like this helps developers and teams find the
| guidance that will get them from where they are to where they
| need to be.
|
| Context, nuance, etc, etc
| PaulHoule wrote:
| If that's what they wanted to prove they should have shown a
| better example.
| swatcoder wrote:
| That's fair. I think the insight/concept behind the essay
| is sound, but I agree that the example (and writing) could
| be a lot better.
| AnimalMuppet wrote:
| In Zion National Park, there's a hike called Angel's Landing.
| For part of the hike, you go along this ridge, where on one
| side you have a cliff of 500 feet straight down, and on the
| other side, you have a cliff of 1000 feet straight down. And
| in places, the ridge is only a couple of feet wide.
|
| Best practices can be like that. "Here's something to avoid!"
| "OK, I'll back far away from that." Yeah, but there's another
| cliff behind you, of the opposite error that is also waiting
| to wreck your code base.
|
| Listen to best practices. Don't apply them dogmatically, or
| without good judgment.
| jeltz wrote:
| Not from my experience. Unnecessarily duplicated code, even
| when there are small differences which are likely accidental,
| is usually much easier to fix than too DRY code. Pulling apart
| false sharing can be really hard.
| barryrandall wrote:
| The number of person-hours wasted on over-engineered products
| that never even made it to release could have: solved the
| halting problem, delivered AGI v2.0, made C memory-safe without
| compromising backward-compatibility, or made it easy to adjust
| mouse pointer speed on Linux.
| idontwantthis wrote:
| Your code should be WET before it's DRY (Write Everything Twice).
| thefaux wrote:
| Yes, why?
| idontwantthis wrote:
| Because you're unlikely to write a good abstraction until you
| need it more than twice.
|
| And if you only need the code twice, you very likely wasted
| time writing the abstraction because copying updates between
| the two locations is not hard.
|
| This is a rule of thumb, I'm not trying to tell anyone how to
| do their job.
| zendist wrote:
| The rule of three[1] also comes to mind and is a hard learned
| lesson.
|
| My brain has a tendency to desire refactoring when I see two
| similar functions, I want to refactor--it's almost always a bad
| idea. More often than not, I later find out that the premature
| refactoring would've forced me to split the functions again.
|
| 1:
| https://en.m.wikipedia.org/wiki/Rule_of_three_(computer_prog...
| swader999 wrote:
| Nice, I advocate for this but never knew it was a more formal
| thing.
| fellowniusmonk wrote:
| Or alternatively, Write Everything Today.
|
| DRY, when it's wielded as a premature optimization (like all
| other premature optimization), prevents working code that is
| tailored to solving a problem from shipping quickly.
| cjbgkagh wrote:
| I thought it was Don't Repeat Yourself more than three times.
| S0y wrote:
| "premature optimization is the root of all evil"
| blowski wrote:
| I find the term DRY to be pretty vague.
|
| Let's say you have a business rule that you can never have more
| than 5 widgets. You can make this assumption in multiple places,
| even with totally different code, and that's a damaging DRY violation
| when the rule changes to allow 6. On the other hand, having a bit
| of duplicated HTML can help, as they may only be the same by
| accident.
| mytailorisrich wrote:
| In this case, '5 widgets max' would be a parameter that should
| be defined as a global constant instead of having hard-coded
| 5s all over the place or, worse, pieces of code copy-pasted 5
| times... That's a standard good coding practice.
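|
| A minimal sketch of that practice. MAX_WIDGETS and both helper
| functions are hypothetical names invented for the example. Two
| otherwise unrelated pieces of code share the rule by reading
| the same constant, so moving from 5 to 6 is a one-line change:

```python
MAX_WIDGETS = 5  # the single place the business rule lives


def can_add_widget(current_count):
    """Validation path: may one more widget be created?"""
    return current_count < MAX_WIDGETS


def remaining_slots(current_count):
    """Display path: how many more widgets could still be added?"""
    return max(0, MAX_WIDGETS - current_count)
```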
| softwaredoug wrote:
| Generality can really hurt performance. Duplicating specialized
| code to handle different cases can really help optimize specific
| code hot spots for certain data patterns or use cases.
|
| So DRY isn't an obvious default for me.
| marcandre wrote:
| I'd love examples where DRY can really hurt performance.
| Typically what matters most in terms of performance is the
| algorithm used, and that won't change.
|
| More importantly, cleverer people than me said "premature
| optimization is the root of all evil"
| mikepurvis wrote:
| IMO it hurts developer productivity more than performance,
| because it introduces indirection and potentially unhelpful
| abstractions that can obscure what is actually going on and
| make it harder to understand the code.
|
| In raw performance this could manifest as issues with data
| duplication bloating structures and resulting in cache
| misses, generic structures expressed in JSON being slower
| than a purpose-built struct, chasing pointers because of
| functions buried in polymorphic hierarchies. But I doubt that
| any of this would really matter in 99% of applications.
| rhdunn wrote:
| Premature optimization is about not making a micro-
| implementation change (e.g. `++i` vs `i++`) for the sake of
| perceived performance. You should always measure to identify
| slow points in expected workloads, profile to identify the
| actual slow areas, make high-level changes (data structure,
| algorithm) first, then make more targeted optimizations if
| needed.
|
| In some cases it makes sense, like writing SIMD/etc. specific
| assembly for compression/decompression or video/audio codecs,
| but more often than not the readable version is just as good
| -- especially when compilers can do the optimizations for
| you.
|
| A lot of times I've found performance increases have come
| from not duplicating work -- e.g. not fetching the same data
| each time within a loop if it is fixed.
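|
| A small sketch of that last point. The names are hypothetical,
| and CountingRate merely stands in for an expensive lookup
| (database, network, ...) whose result does not change during
| the loop. Hoisting the fetch out of the loop gives the same
| total with one call instead of one call per item:

```python
class CountingRate:
    """Stand-in for an expensive, loop-invariant lookup; counts its calls."""

    def __init__(self, rate):
        self.rate = rate
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return self.rate


def total_refetching(prices, fetch_rate):
    # Fetches the same value again for every element.
    return sum(price * fetch_rate() for price in prices)


def total_hoisted(prices, fetch_rate):
    # Fetches once, then reuses the value inside the loop.
    rate = fetch_rate()
    return sum(price * rate for price in prices)
```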
| nsguy wrote:
| Not really. Knuth was talking about putting effort to make
| a non-critical portion of the software more optimized. He's
| saying put effort into the smaller parts where performance
| is critical and don't worry about the rest. It's not about
| `++i` vs. `i++` (which is semantically different but
| otherwise in modern compilers not an optimization anyways
| but I digress).
| ummonk wrote:
| The optimizations he was talking about were things like
| writing in assembly or hand-unrolling loops. It was
| assumed that you've already picked a performant
| algorithm / architecture and are writing in a performant
| low level language like C.
|
| Also, your digression about modern compilers is
| irrelevant to the context of the quote, since Knuth
| talked about premature optimization at a time when
| compilers were much simpler than today.
| rhdunn wrote:
| That was my point, though. Don't worry about minor
| possible changes to the code where the performance
| doesn't matter. For example, if the ++i/i++ is only ever
| executed at most 10 times in a loop, is on an integer
| (where the compiler can elide the semantic difference)
| and the body of the loop is 100x slower than that.
|
| If you measure the code's performance and see the ++i/i++
| is consuming a lot of the CPU time then by all means
| change it, but 99% of the time don't worry about it. Even
| better, create a benchmark to test the code performance
| and choose the best variant.
| nsguy wrote:
| That's not my interpretation. If you're profiling and
| benchmarking you're already engaging in (premature)
| optimization. This process you're describing of finding
| out whether `i++` is taking a lot of CPU time and then
| changing it is exactly what Knuth is saying not to worry
| about for 97% of your code. Knuth is saying it doesn't
| matter if `i++` is slow if it's in a non-performance
| critical part of your code. Any large piece of software
| has many parts where it doesn't matter for any practical
| purpose how fast they run and certainly one loop in that
| piece of software doesn't matter. For example, the
| software I'm working on these days has some fast C code
| and then a pile of slow Python code. In your analogy all
| the Python code is known to be much slower than the C
| code, we don't need a profiler or benchmarks to tell
| that, but it also doesn't matter because the core
| performant functionality is in that C code.
| randomdata wrote:
| Knuth says forget about _small_ efficiencies in 97% of
| your code. Indeed, the `i++` optimization isn't apt to
| make more than a small difference, even with the most
| naive compiler, but other decisions could lead to larger
| chasms. It seems he is still in favour of optimizing for
| the big wins across the entire codebase, even if it
| doesn't really matter in practice.
|
| But it's your life to live. Who cares what someone else
| thinks?
| candiddevmike wrote:
| In an effort to DRY, you add a bunch of if statements to
| handle every use case.
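|
| A hypothetical illustration of that failure mode (all names
| are invented for the example). The shared helper has grown one
| flag per caller, and each caller has to know which combination
| of flags reproduces its old behaviour, while the two
| specialized functions say what they do directly:

```python
def format_name(first, last, *, for_invoice=False, uppercase=False,
                last_first=False):
    # One shared helper that has accumulated a flag per caller.
    if last_first or for_invoice:
        name = f"{last}, {first}"
    else:
        name = f"{first} {last}"
    if uppercase or for_invoice:
        name = name.upper()
    return name


def display_name(first, last):
    # Specialized version for the UI.
    return f"{first} {last}"


def invoice_name(first, last):
    # Specialized version for invoices.
    return f"{last}, {first}".upper()
```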
| swatcoder wrote:
| In the general case, it usually depends on the latency of
| what you'd DRY your code to vs the latency of keeping the
| implementation local and specialized.
|
| If you're talking about consolidating some code from one in-
| process place to another in the same language, you're mostly
| right: there's only going to be an optimization/performance
| concern when you have a very specific hotspot -- at which
| point you can selectively break the rule, following the
| guidance you quoted. This need for rule-breaking can turn out
| to be common in high-performance projects like audio,
| graphics, etc but is probably not what the GP had in mind.
|
| In many environments, though, DRY'ing can mean moving some
| implementation to some out-of-language/runtime, out-of-
| process, or even out-of-instance service.
|
| For many workloads, the overhead of making a bridged, IPC, or
| network call swamps your algorithm choice and this is often
| apparent immediately during design/development time. It's not
| premature optimization to say "we'll do a lot better to
| process these records locally using this contextually tuned
| approach than we will calling that service way out over
| there, even if the service can handle large/different loads
| more efficiently". It's just common sense. This happens _a
| lot_ in some teams/organizations/projects.
| laborcontract wrote:
| Langchain. Helps on the initial productivity, is a nightmare
| on the debugging and performance improvement end.
| eyelidlessness wrote:
| > I'd love examples where DRY can really hurt performance.
|
| A really common example is overhead of polymorphism, although
| that overhead can vary a lot between stacks. Another is just
| _the effect_ caused by the common complaint about premature
| abstraction: proliferation of options/special cases, which
| add overhead to every case even when they don't apply.
| nsguy wrote:
| This quote is often taken out of context, here's the full
| quote: "Programmers waste enormous amounts of time thinking
| about, or worrying about, the speed of noncritical parts of
| their programs, and these attempts at efficiency actually
| have a strong negative impact when debugging and maintenance
| are considered. We should forget about small efficiencies,
| say about 97% of the time: premature optimization is the root
| of all evil. Yet we should not pass up our opportunities in
| that critical 3%."
|
| If you want a specific example look at something that needs
| to be performant, i.e. in those 3%, let's say OpenSSL's AES
| implementation for x86, or some optimized LLM code, you'll
| see the critical performance sections include things that
| could be reused, but they're not.
|
| Also the point Knuth is making is don't waste time on things
| that don't matter. Overuse of DRY falls squarely into that
| camp as well. It takes more work and doesn't really help. I
| like Go's proverb there of "A little copying is better than a
| little dependency."
| rgrmrts wrote:
| Knuth was talking about a very specific thing, and the
| generalization of that quote is a misunderstanding of his
| point.
|
| Source: Donald Knuth on the Lex Fridman podcast, when Lex
| asks him about that phrase
| nsguy wrote:
| I wasn't aware this was discussed, thanks for the
| pointer! I'm curious now what _he_ says he was talking
| about ;)
| xiasongh wrote:
| This might not be a perfect example, but there's a paper by
| Michael Stonebraker, "One Size Fits All": An Idea Whose Time
| Has Come and Gone.
|
| It might not specifically be about DRY, but it is still related
| to generic vs specialized code/systems.
|
| https://ieeexplore.ieee.org/document/1410100
| mikepurvis wrote:
| I think it really depends and it's a case where a lot of
| engineering judgment and taste comes to bear. For example right
| now I'm maintaining a Jenkins system that has two large and
| complicated pipelines that are about 90% overlapping but for
| wretched historical reasons were implemented separately and the
| implementations have diverged over the years in subtle ways
| that now make it challenging to re-unify them.
|
| There is no question in my mind that this should always have
| been built as either a single pipeline with some parameters to
| cover the two use-cases, or perhaps as a toolbox of reusable
| components that are then used for the overlapping area. But I
| expect the mentality at the time the second one was being stood
| up was that it would be less disruptive to just build the new
| stuff as a parallel implementation and figure out later how to
| avoid the duplication.
| kccqzy wrote:
| You are describing technical debt, not conscious decisions to
| be DRY or not DRY.
| mikepurvis wrote:
| Hmm. Certainly there's no doubt that there's technical debt
| ("do it this way for now, we'll clean it up later") here
| too, but I think there was also a conscious decision to
| build something parallel _rather_ than generalizing the
| thing that already existed to accommodate expanding
| requirements.
| sys_64738 wrote:
| > Generality can really hurt performance.
|
| Only in critical regions of code though.
| jprete wrote:
| I agree for very specific situations, but compilers tend to get
| better at optimization over time, and it can be better to
| express plain intent in the code and leave low-level
| optimization to the compiler, rather than optimizing in code
| and leaving future hardware/compiler improvements on the table.
| DanielHB wrote:
| Boilerplate that you can't get wrong is better than DRY in most
| cases
|
| by "get wrong" I mean through static analysis (linters or type
| checkers) or if it is plainly obvious by running it.
| pydry wrote:
| I don't know about anyone else, but I've been _deeply_
| unimpressed with the output of the google testing blog.
|
| This example is not wrong, but it's not particularly insightful
| either. Sandi Metz said it better here, 8 years ago
| https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction
|
| The testing pyramid nonsense is probably the worst one though.
| Instead of trying to find a sensible way to match the test type
| to the code, they pulled some "one size fits all" shit while
| advertising that they aren't that bothered about fixing their
| flaky tests.
| sitkack wrote:
| Google doesn't test! That is what production and SREs and users
| are for.
| nrook wrote:
| I think you're holding some of these to too high of a bar. This
| is a one-page article intended to be posted in company
| bathrooms. Of course it's less comprehensive than a longer blog
| post.
| pydry wrote:
| It's not like the subtitle says "not to be taken seriously"
| and they are representing a brand that is supposed to stand
| for engineering excellence.
| bitcharmer wrote:
| Most seasoned software engineers stopped following google in
| that respect a long time ago. They are not a tech shop any
| more; it's just an ad business now with lots of SRE work.
| localfirst wrote:
| also never write tests for code that doesn't exist because you
| gradually slow down learning to a crawl and you are no longer
| writing features but tests and mockups that offer nothing to the
| end user.
| hardwaregeek wrote:
| It's important to remember that all best practices are not
| created equal. I'd prioritize readability over DRY. I'd
| prioritize cohesion over extensibility. When people talk about
| best practices, they don't talk about how a lot of them are
| incompatible, or at least at odds with each other. Writing code
| is about choosing the best practices you want to prioritize as
| much as it's about avoiding bad practices.
| unnouinceput wrote:
| Maintenance is 90% of a project's lifetime. Sometimes those "best
| practices", rigidly implemented, mean the project won't live to
| see even its 1st birthday.
| 0xbadcafebee wrote:
| Readability doesn't matter much when you have 10,000+ lines of
| code. You aren't going to read all that code, and new code
| introduced by other people continuously isn't something you can
| keep track of, so even if you understand one tiny bit of code,
| you won't know about the rest. You need a system of code
| management (documentation, diagram, IDE, tests, etc), to
| explain in a human-friendly way what the hell is going on.
| Small chunks of code will be readable enough, and the code
| management systems will help you understand how it relates to
| other code.
| ugh123 wrote:
| > You need a system of code management (documentation,
| diagram, IDE, tests, etc), to explain in a human-friendly way
| what the hell is going on
|
| I think this is where AI could be helpful in explaining and
| inspecting large codebases, as an assist to a developer.
| rohansingh wrote:
| Maybe but hallucinations become a real problem here. Even
| with publicly available APIs that are just slightly off
| the beaten path, I've gotten full-on hallucinations that
| have derailed me and wasted time.
| dieortin wrote:
| Even if you're not going to read 10,000+ lines, if the few
| you read are easy to understand you're still going to have a
| much better time maintaining the codebase.
| chipdart wrote:
| > Readability doesn't matter much when you have 10,000+ lines
| of code. You aren't going to read all that code (...)
|
| You got it entirely backwards. Readability becomes far more
| important with the size of your project.
|
| When you get a bug report or a feature request, you need to
| dive into the code and update the relevant bits. With big
| projects, odds are you will need to change bits of the code
| you never knew existed. The only way that's possible is
| if the code is clear and it's easy to sift through,
| understand, and follow.
|
| > You need a system of code management (documentation,
| diagram, IDE, tests, etc), to explain in a human-friendly way
| what the hell is going on.
|
| That system of code management is the code itself. Any IDE
| supports searching for references, jump to definitions, see
| inheritance chains, etc. Readable code is code that is easy
| to navigate and whose changes are obvious.
| foresto wrote:
| > Readability doesn't matter much when you have 10,000+ lines
| of code. You aren't going to read all that code,
|
| As someone who has read 10,000+ lines in order to track down
| surprising behavior in other people's code, I can say without
| a doubt that readability still matters at that scale.
|
| Code management systems can sometimes be helpful, but they
| are no substitute.
| fmbb wrote:
| > Small chunks of code will be readable enough
|
| Ravioli code is a real problem though. Saying small chunks
| are readable is not enough. The blast radius of a five byte
| change can be fifteen code paths and five million requests
| per hour.
| HumblyTossed wrote:
| 10KLoC is a very small app. Ours isn't that big and it's
| 140KLoC and I have read almost all of it.
| drojas wrote:
| I agree and would add that one of the goals for technical
| design or architecture work is to choose the architecture that
| minimizes the friction between best practices. For example, if
| your architecture makes cohesion decrease readability too much,
| then perhaps there is a better architecture. I see this
| tradeoff pop up from time to time at my work for example when
| we deal with features that support multiple "flavors" of the
| same data model, then we have either a bunch of functions for
| each providing extensibility or a messy root function that
| provides cohesion. At the end both best practices can be
| supported by using an interface (or similar construct depending
| on the language) in which cohesion is provided by logic that
| only cares about the interface and extensibility is provided by
| having the right interface (offload details to the specific
| implementations)
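|
| A minimal sketch of the interface idea above, in Python. The names
| `Flavor`, `InvoiceV1`, `InvoiceV2`, and `describe` are hypothetical
| and only stand in for the "flavors" of one data model; nothing here
| comes from the commenter's codebase.
|
```python
from typing import Protocol


class Flavor(Protocol):
    """What the shared logic needs from any flavor of the data model."""

    def total(self) -> float: ...


class InvoiceV1:
    """Hypothetical flavor: amounts stored as a flat list."""

    def __init__(self, amounts: list[float]) -> None:
        self.amounts = amounts

    def total(self) -> float:
        return sum(self.amounts)


class InvoiceV2:
    """Hypothetical flavor: (price, quantity) pairs per line item."""

    def __init__(self, items: list[tuple[float, int]]) -> None:
        self.items = items

    def total(self) -> float:
        return sum(price * qty for price, qty in self.items)


def describe(record: Flavor) -> str:
    """Cohesive logic: depends only on the interface, not on a flavor."""
    return f"total={record.total():.2f}"
```
|
| `describe` stays cohesive because it only knows about `total()`,
| and a new flavor is added by writing one more class.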
| englishspot wrote:
| I have a pessimistic view that ultimately the only best
| practices that matter are the ones your boss or your tech lead
| likes.
| jkaptur wrote:
| What about when you are the boss or tech lead?
| znkr wrote:
| Then the only best practices that matter are the ones that
| your team believes are correct
| rvnx wrote:
| The best practices are the ones that allow you to do
| business and where the maintenance work is relatively not
| too painful considering the budgeted development time.
|
| Your task is to deliver a good product, not necessarily
| good code.
| englishspot wrote:
| > The best practices are the ones that allow you to do
| business and where the maintenance work is relatively not
| too painful considering the budgeted development time.
|
| the problem is even that in concrete terms can be
| controversial. everyone wants to minimize maintenance
| work; not everyone agrees on what kind of code will
| achieve that.
| delichon wrote:
| "Read" isn't quite the right word for code. "Decode" is better.
| We have to read to decode, but decoding is far less linear than
| reading narrative text. Being DRY usually makes decoding
| easier, not harder, because it makes the logic more cohesive.
| If I know you only fromajulate blivers in one place I don't
| have to decode elsewhere.
| hathawsh wrote:
| Well, "read" is still the verb we use most often to describe
| a human interpreting code. Also, many information-dense books
| are not intended to be read linearly, yet we still say we're
| "reading" (or "studying") the book.
| wwfn wrote:
| I was just mulling this over today. DRY = easier-to-decode is
| probably true if you're working on grokking the system at
| large. If you just want to peek in at something specific
| quickly, DRY code can be painful.
|
| I wanted to see what compile flags were used by guix when
| compiling emacs. `guix edit emacs-next` brings up a file with
| nested definitions on top of the base package. I had to trust
| my working memory to unnest the definitions and track which
| compile flags are being added or removed.
| https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages...
|
| It'd be more error prone to have each package using redundant
| base information, but I would have decoded what I was after a
| lot faster.
|
| Separately, there was a bug in some software aggregating
| cifti file values into tab separated values. But because any
| cifti->tsv conversion was generalized, it was too opaque for
| me to identify and patch myself as a drive-by contributor.
| https://github.com/PennLINC/xcp_d/issues/1170 to
| https://github.com/PennLINC/xcp_d/pull/1175/files#diff-76920...
| emidln wrote:
| Bazel solves this exact problem (coming from its
| macrosystem) by allowing you to ask for what I term the
| "macroexpanded" BUILD definition using `bazel query
| --output=build //some/pkg/or:target`. When bazel does this,
| it also comments the file, macro, and line number the
| expanded content came from for each block.
|
| This gives us reuse without obscuring the real definition.
|
| I automated this in my emacs to be able to "macroexpand"
| the current build file in a new buffer. It saves me a lot of
| time.
| withinboredom wrote:
| > Being DRY usually makes decoding easier, not harder
|
| "Usually" being the keyword and what the article is all about
| IMHO. I work in a codebase so DRY that it takes digging
| through dozens of files to figure out what one constant
| string will be composed as. It would have been simpler to
| simply write it out; ain't nobody going to figure out what
| OCM_CON_PACK + OCM_WK_MAN means at a glance.
| djeastm wrote:
| >I work in a codebase so DRY that it takes digging through
| dozens of files to figure out what one constant string will
| be composed as.
|
| I don't know the codebase, but to my mind that level of
| abstraction means it's a system-critical string that
| justifies the work it takes to find.
| RussianCow wrote:
| Sorry, but this doesn't make sense. Why should system
| critical things be more difficult to understand? Surely
| you want to _reduce_ room for error, not increase it?
| withinboredom wrote:
| I mean, sure, I guess API urls could be system-critical.
| But generally, I prefer to grep a codebase for a url
| pattern and find the controller immediately. Instead, you
| have to dig through layers of strings composed of other
| strings and figure it out. Then at the end, you're
| probably wrong.
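|
| A small illustration of the grep problem, assuming a hypothetical
| orders endpoint. The constant names are made up for this sketch and
| are not the OCM_* constants mentioned earlier.
|
```python
# Composed from fragments: searching the codebase for the text
# "/api/v1/orders" does not land on any of these lines.
API_ROOT = "/api"
API_VERSION = "v1"
ORDERS_SEGMENT = "orders"
ORDERS_URL_COMPOSED = "/".join([API_ROOT, API_VERSION, ORDERS_SEGMENT])

# Written out: the same search lands directly on this line.
ORDERS_URL_LITERAL = "/api/v1/orders"
```
|
| Both constants hold the same value; only the literal one can be
| found by searching for the URL as it appears in a request log.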
| jbverschoor wrote:
| Function calls, the essence of DRY, are only readable if it
| is well known and well understood what they do.
|
| When code is serial, with comment blocks to point out
| different sections, it is much easier to read, follow, and
| debug.
|
| This is also a little bit of a tooling problem
| dsego wrote:
| Visually parse.
| Nevolihs wrote:
| Does it? Every time I see DRY'd code, it usually makes the
| project it's in more difficult to understand. It's harder to
| understand where values come from, where values are changed,
| what parts of the codebase affect what. And that's before
| trying to figure out where to change something in the right
| place, because it's often unclear what other parts of the
| code are coupled to it through all the abstractions.
|
| At a high level, at first glance, the code might look good
| and it "makes sense". But once you want to understand what's
| happening and why, you're jumping through five different
| classes, two dozen methods and you still don't know for sure
| until you run a test request against the API and see what
| shows up where in the debugger. And you realize your initial
| glimpse of understanding was just window dressing and
| actually nothing makes sense unless you understand every
| level of the abstractions being used.
|
| It's suddenly a puzzle to understand another software
| developer instead of software engineering.
| cloverich wrote:
| One area I find DRY particularly annoying is when people
| overly abstract TypeScript types. Rather than a plain
| interface with a few properties, you end up with a bunch of
| mushed together props like { thing: boolean } &
| Pick<MyOtherObj, 'bar' | 'baz'> & Omit<BaseObj, 'stuff'>
| when a few duplicated but easily readable interfaces would do:
|
| interface MyProps {
|   thing: boolean;
|   bar: string;
|   baz: string;
|   stuff: string;
| }
| jddj wrote:
| Am I crazy for almost exclusively just using _type_ and sum
| types and no generics or interfaces and somehow being able
| to express everything I need to express?
|
| Kind of wondering what I'm missing now.
| Aeolun wrote:
| Hmm, you can do pretty nice things with generics to make
| some things impossible (or at least fail on compile), but
| I agree it's hardly readable. In some cases you need that
| though.
| blackoil wrote:
| I am of the opinion that code should be written to be readable.
| The rest of the desirable properties are just side effects.
| gavmor wrote:
| I think it's fair to say that between behavior and
| maintainability, one is inflexible and the other hangs from
| it in tension.
| jay-barronville wrote:
| Fully agree. I think this is something that takes some
| time/experience to appreciate though. Junior engineers will
| spend countless hours writing pages of code that align with
| the "design patterns" or "best practices" of the day when
| there's a simpler implementation of the code they're writing.
| (I'm not saying this condescendingly--I was once a junior
| engineer who did that too!)
| marcosdumay wrote:
| Most commonly, code should be optimized into being easy to
| change.
|
| That's almost entirely coincidental with being easy to read.
| But even easiness to read is a side effect.
| drewcoo wrote:
| "Side effects" are not the same as "less important traits."
|
| Side effects are usually unrelated or unwanted.
| jay-barronville wrote:
| Readability is almost always (almost only because there are
| some rare exceptions) the most important thing to me, even for
| low-level systems software. I always ask myself, "If I don't
| touch this code for a year and then come back to it, how long
| will it take me to understand it again? How long will it take
| someone who's never been exposed to this code to understand
| it?"
|
| Luckily, our compilers and interpreters have gotten so good and
| advanced that, in 95%+ of cases, we need not make premature
| "optimizations" (or introduce hierarchies of "design patterns")
| that sacrifice readability for speed or code size.
| arp242 wrote:
| Was reading 1978 Elements of Programming Style a while ago.
| It's mostly Fortran and PL/I. Some of it is outdated, but a
| lot applies today as well. See e.g.
| https://en.wikipedia.org/wiki/The_Elements_of_Programming_St...
|
| They actually have a Fortran example of "optimized" code
| that's quite difficult to follow, but allegedly faster
| according to the comments. But they rewrote it to be more
| readable and ... turns out that's actually faster!
|
| So this already applied even on 197something hardware. Also
| reminds me about this quote about early development of Unix
| and C:
|
| _"Dennis Ritchie encouraged modularity by telling all and
| sundry that function calls were really, really cheap in C.
| Everybody started writing small functions and modularizing.
| Years later we found out that function calls were still
| expensive on the PDP-11, and VAX code was often spending 50%
| of its time in the CALLS instruction. Dennis had lied to us!
| But it was too late; we were all hooked..."_
|
| And Knuth's "premature optimisation is the root of all evil"
| quote is also decades old by now.
|
| Kind of interesting we've been fighting this battle for over
| 50 years now :-/
|
| (It should go without saying there are exceptions, and cases
| where you _do_ need to optimize the shit out of things, after
| having proven that performance may be an issue. Also at scale
| "5% faster" can mean "need 5% fewer servers", which can
| translate to millions of dollars saved per year - "programmers
| are more expensive than computers" is another maxim that
| doesn't always hold true).
| semi-extrinsic wrote:
| > Dennis Ritchie encouraged modularity by telling all and
| sundry that function calls were really, really cheap in C.
|
| The old salty professor who taught numerical physics at my
| uni insisted that function calls were slow and that it was
| better to write everything in main. He gave all his
| examples in Fortran 77. This was in the 2010s...
| coliveira wrote:
| In fact he is right. The advantage of writing modular
| code, however, is that we can test the locations where
| performance is needed and optimize later. With a big main
| it becomes very hard to do anything complex.
| Archelaos wrote:
| This is why I liked it when the language I was coding in
| supported inline expansion: I could keep my code modular
| but nevertheless avoid the penalty of function calls in
| performance critical functions in the compiled code.
| jghn wrote:
| The one gotcha with optimizing for "readability" is that at
| least to some extent it's a metric that is in the eye of the
| beholder. Over the years I've seen far too many wars over
| readability during code review when really people were
| arguing about what seemed readable *to them*
| smrq wrote:
| This is the reason I refuse to use the word "clean" to
| describe code anymore. It's completely subjective, and far
| too many times I've seen two people claim that their
| preferred way of doing things is better because it's
| "clean", and the other's way is worse because it's "less
| clean", no further justification added. It's absolutely
| pointless.
| zooq_ai wrote:
| aka "Engineering is about trade-offs"
| Guvante wrote:
| DRY is IMHO a maintenance thing.
|
| If "I don't want to maintain three copies of this" is your
| reaction unifying likely makes sense.
|
| But that assumes the maintenance would be similar which is
| obviously a big assumption.
| coffeebeqn wrote:
| DRY often gives you the wrong or a leaky abstraction and
| creates dependencies between sometimes unrelated pieces of
| code. It's got tradeoffs rather than being a silver bullet
| for improving codebases.
|
| Having 0% DRY is probably bad, having 100% DRY is probably
| unhinged
| insane_dreamer wrote:
| > I'd prioritize readability over DRY.
|
| Yes. Especially at the beginning when it's critical to ensure
| that the logic is correct.
|
| You can then go back and DRY it up while making sure your unit
| tests (you did write those, right?) still pass.
|
| PS: same applies to "fancy" snippets that save you a few lines;
| write it the "long way" first and then make it fancy once
| you're sure it runs the way it's supposed to
| hot_gril wrote:
| I place copy-pastability somewhere into those priorities too :)
| renegade-otter wrote:
| I think the only hard and fast rule is to DRY the code that will
| introduce a bug if you change it in one place and not the other.
| And if it will, _at least_ do a fat comment in both places for
| posterity.
|
| Whenever I have to have a "mental model" of the code, I know I
| screwed up.
| senkora wrote:
| +1. If I go with the comment option, then I'll sometimes write
| a comment like "If you change this here, then you must change
| it everywhere with this tag: UNIQUE-TAG".
|
| This way, the reader can just do a global grep to find all the
| places to change, and you don't have to list them in each place
| and keep them in sync.
| randomdata wrote:
| A comment is a nice addition, but the very least is to ensure
| that your test suite properly covers cases where changing one
| and not the other will introduce a problem. This not only
| ensures that both are changed, but that both are changed in the
| way they need to be. A comment alone may prompt you to change
| both (if you ever read it - I bet a lot of developers don't),
| but you may not notice when you fail to change them in the same
| way, which is no better than not changing one.
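|
| One way to read the suggestion above, as a rough sketch: keep the
| two copies, tag them, and let a test pin them together. The function
| names and the `SYNC: DEADLINE-RULE` tag are hypothetical; the rule
| itself is a stand-in for whatever logic is duplicated.
|
```python
def task_deadline_ok(days_ahead: int) -> bool:
    # Copy 1 of the rule. SYNC: DEADLINE-RULE (grep for this tag).
    return days_ahead > 0


def payment_deadline_ok(days_ahead: int) -> bool:
    # Copy 2 of the rule. SYNC: DEADLINE-RULE (grep for this tag).
    return days_ahead > 0


def copies_agree(samples: list[int]) -> bool:
    """Return True when both copies give the same answer on every sample."""
    return all(task_deadline_ok(d) == payment_deadline_ok(d) for d in samples)
```
|
| A test that calls `copies_agree` with boundary values fails as soon
| as one copy changes without the other, or when they change in
| different ways, which a comment alone cannot detect.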
| haswell wrote:
| One of the #1 issues I've seen with DRY over the years seems to
| stem from a misunderstanding of what it means.
|
| DRY is not just about _code_ duplication, it's about
| _information_ / _knowledge_ duplication, and code happens to be
| one representation of information.
|
| Hyper focusing on code duplication quickly gets into premature
| optimization territory, and can result in DRYing things that
| don't make sense. Focusing on information duplication leaves some
| leeway for the code and helps identify which parts of the code
| actually need DRY.
|
| The difference is important, and later editions of the Pragmatic
| Programmer call this out specifically. But the concept of DRY
| often gets a bit twisted in my experience.
| HanClinto wrote:
| I feel like I hear de-duplication of information / knowledge
| often referred to as "Single Source of Truth"
| derefr wrote:
| I think the best way to understand DRY is by thinking about
| the practical problem it solves: you don't want footguns in
| the codebase where you could change something in one place,
| but forget to change the same thing in other places (or
| forget to change the complementary logic/data in other
| components.)
|
| The goal of DRY as a refactoring, is first-and-foremost to
| obviate such developer errors.
|
| And therefore -- if you want to be conservative about
| applying this "best practice" -- then you could do that by
| just never thinking "DRY" until a developer _does in fact_
| trip over some particular duplication in your codebase and
| causes a problem.
| rezonant wrote:
| > you don't want footguns in the codebase where you could
| change something in one place, but forget to change the
| same thing in other places
|
| This. Ironically the example on TFA is vulnerable to this
| issue. Each of the deadline setting methods has a copy of
| the validation ensuring that the date is in the future. If
| it's discovered that we need to ensure deadlines are set no
| later than the project deadline (since that wouldn't
| generally make sense), it's awfully easy to only update one
| and miss the others, especially after code has been added
| and these implementations are no longer visually near each
| other. I'm not saying that this means the code must be
| DRY'ed, but it is a risk from the beginning of the project,
| so one that needs to be weighed during initial
| implementation.
| hot_gril wrote:
| The Two Generals Problem is mentioned a lot in databases and
| networking, and you can take some liberties to extend it to
| human orgs.
| schneems wrote:
| Someone "yes and"-ed a comment of mine awhile ago to teach me
| DRY SPOT. Don't repeat yourself - Single Point of Truth.
|
| I.e. what you said. Couple logic that needs to be coupled.
| Decouple logic that shouldn't be coupled.
| joe_fishfish wrote:
| This is a more insightful comment than the comment at the top,
| and also more useful than the blog post.
| tetha wrote:
| This is why some advice from Sandy Metz really stuck with me.
|
| It is not a problem to /have/ the same code 2, 3 or even 4
| times in a code base. In fact, sometimes just straight up copy-
| paste driven development can be a valid development technique.
| Initially that statement horrified me, but by now I understand
| that just straight up copy-pasting some existing code can be
| one of these techniques that require some discipline to not
| overdo, but it's legit.
|
| And in quite a few cases, these same pieces of code just start
| developing in different directions and then they aren't the
| same code anymore.
|
| However, if you have to /change/ the same code in the same way
| in multiple places, then you have a problem. If you have to fix
| the same bug in multiple places in similar or same ways, or
| have to introduce a feature in multiple places in a similar way -
| then you have a problem.
|
| Once that happens, you should try to extract the common thing
| into a central thing and fix that central thing once.
|
| It feels weird to work like that at first, but I've found that
| often it results in simpler code and pretty effective
| abstractions, because it reacts to the actual change a code
| base experiences.
| nostrademons wrote:
| The challenge is that if you're not careful, you can end up
| copy-pasting the same bit of code hundreds of times before
| realizing it has to be changed.
|
| I once worked in a year-old startup of ~5 developers that
| found it had written the same line of code (not even copy-
| pasted, it was only one line of code so the devs had just
| written it out) 110 times. A bug was then discovered in that
| line of code, and it had to be fixed in 110 places, with no
| guarantee that we'd even found all of them. This was a very
| non-obvious instance of DRY, too, because it was only one
| line of code and the devs believed it was so simple that it
| couldn't possibly be wrong. But that's why you sometimes need
| to be aware of what you're writing even on the token level.
|
| That's why we have principles like "3 strikes and then you
| refactor". 3 times fixing a bug isn't too onerous; even 4-6
| is pretty manageable. Once you get to 20+, there starts to be
| a strong disincentive to fixing the bug, and even if you want
| to, you aren't sure you got every instance.
| tetha wrote:
| Oh yeah we've had those as well. I kinda feel two things
| about these at the same time.
|
| At a practical level, these situations sucked. Someone had
| to search for the common expression, look at each instance,
| decide to change it to the central place or not. They spent
| 2-3 days on that. And then you realize that some people
| were smart and employed DRY - if they needed that one
| expression 2-3 times, they'd extracted one sub-expression
| into a variable and suddenly there was no pattern to find
| those anymore. Those were 2-4 fun weeks for the whole team.
|
| But at the same time, I think people learned an important
| concept there: To see if you are writing the same code, or
| if you're referring to the same concept and need the same
| source of truth, like the GP comment says. I'm pretty happy
| with that development. Which is also why my described way
| is just one tool in the toolbox.
|
| Like, one of our code bases is an orchestration system and
| it defines the name of oidc-clients used in the
| infrastructure. These need to be the same across the
| endpoints for the authentication provider, as well as the
| endpoints consumed by the clients of the oidc provider -
| the oauth flows won't work otherwise.
|
| And suddenly it clicked for a bunch of the dudes on the
| team why we should put the pedestrian act of jamming some
| strings together to get that client-id into some function.
| That way, we can refer to the concept or naming pattern and
| ensure the client will be identical across all necessary
| endpoints, over hoping that a million different string
| joins all over the place result in the same string.
|
| In such a case, early or eager DRY is the correct choice,
| because this needs to be defined once and exactly once.
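|
| A minimal sketch of that idea, assuming a made-up naming pattern.
| `oidc_client_id`, `provider_config`, `consumer_config`, and the
| "<environment>-<service>-client" format are all hypothetical; the
| comment does not say what the real pattern looks like.
|
```python
def oidc_client_id(environment: str, service: str) -> str:
    """Single place that knows the (hypothetical) client naming pattern."""
    return f"{environment}-{service}-client"


def provider_config(environment: str, service: str) -> dict:
    """Registration rendered for the authentication provider."""
    return {"client_id": oidc_client_id(environment, service)}


def consumer_config(environment: str, service: str) -> dict:
    """Settings rendered for the service that logs in via the provider."""
    return {"client_id": oidc_client_id(environment, service)}
```
|
| Both sides refer to the naming concept instead of re-joining the
| strings, so the two configs cannot drift apart.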
| spion wrote:
| This really makes me think we should be focusing on
| cost/benefit, risk/reward, pros/cons at all times. If we
| have a bug in these 5 copies, will it be too hard to fix in
| all of them? No? What about these 10 copies? If that sounds
| like it's starting to get difficult, maybe now is the time.
| danielmarkbruce wrote:
| That means you have to think. Most people hate thinking.
| Seriously.
| laserlight wrote:
| > If we have a bug in these 5 copies, will it be too hard
| to fix in all of them?
|
| Yes, it will be, because copy-pasted code is never the
| same verbatim. First and foremost, name changes make it
| almost impossible to identify different copies. Then,
| there are different tweaks for each copy to make it
| suitable for the context. I always DRY early, because
| it's always free to copy-paste later.
| withinboredom wrote:
| This is why you shouldn't write one line of code, ever
| again. /s
|
| We've all been there though, at some point in our careers.
| Possibly multiple times (try changing thousands of
| "echo" statements to call a logger because it was initially
| meant to be a simple script that just kept growing).
|
| It sucks but I've also been on the other side, where it was
| DRY but 20% of the calls to the function now needed
| different behavior. Finding all of those usages was just as
| hard.
| sodapopcan wrote:
| Metz says she adds TODOs and comments that it has been
| duped. It's one of those things that requires thought, and
| she even says it's an advanced technique. How many times is
| too many? I'm not sure, but I can safely say over 100 is
| WAY too many. Probably 10 is too many. Heck, if you find
| yourself updating the same code in four different places
| over and over and over, it's time to abstract. The idea is
| to let the code sit and let the abstraction reveal itself
| _if there isn't already an OBVIOUS one_. As mentioned by
| the parent poster, you're looking out for these copies to
| diverge. If four or five copied codepaths haven't diverged
| after some time, there's a good chance that just from
| working on it every day you will have realized the proper
| way to abstract it.
|
| You absolutely do have to be careful. But even so, it's
| arguable that having to update something in 100 different
| places is better than updating in one place and having it
| affect 100 different paths where you only want 99 of them
| (this is some hyperbole, of course).
| gmueckl wrote:
| Conversely, trying too hard to DRY when requirements at call
| sites start to diverge can lead to an unnecessarily complex
| single implementation of something where there could be two
| very similar but still straightforward pieces of code.
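|
| A contrived sketch of that trade-off. All names are hypothetical:
| `format_name_shared` is a single helper that grew one flag per
| diverging call site, while `badge_name` and `index_name` are the
| two straightforward near-duplicates it replaced.
|
```python
def format_name_shared(first: str, last: str, *, last_first: bool = False,
                       upper: bool = False, initial_only: bool = False) -> str:
    """One shared helper; every new requirement became another flag."""
    if initial_only:
        first = first[0] + "."
    name = f"{last}, {first}" if last_first else f"{first} {last}"
    return name.upper() if upper else name


def badge_name(first: str, last: str) -> str:
    """Call site 1 written out directly."""
    return f"{first} {last}".upper()


def index_name(first: str, last: str) -> str:
    """Call site 2 written out directly."""
    return f"{last}, {first[0]}."
```
|
| The shared version has eight flag combinations to reason about,
| although only two are ever used in this sketch.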
| wging wrote:
| You're thinking of Sandi Metz:
| https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction
| bojanz wrote:
| Couldn't agree more. There's a great decade-old blog post by
| Mathias Verraes which illustrates this well, I keep coming back
| to it: https://verraes.net/2014/08/dry-is-about-knowledge/
| pineapple_sauce wrote:
| How is applying DRY entering premature optimization territory
| (maybe relative to LOC?)? I argue it is instead: premature
| abstraction.
|
| Optimization is specialization (which is the opposite of DRY):
| to enable DRY you likely need to generalize the problem (i.e.
| abstract) such that you remove duplication.
| haswell wrote:
| I've always seen "Premature Optimization" as an umbrella that
| covers a variety of cross-cutting concerns, ranging from:
|
| - Performance
| - Code structure / abstraction
| - Data structure
| - Team organization / org structure
|
| I'd argue that DRY (and a focus on abstractions more
| generally) are optimizations of the codebase. Not all
| optimizations are optimizing the same thing.
| hot_gril wrote:
| Yeah, it's like reminding people that code is mutable, so
| it's ok to have known flaws day 1. Something forgotten too
| often.
|
| One thing that really goes against the usual programming
| grain is DBMSes. You're taught to always decouple/abstract
| things, but I'm convinced that it's impossible to abstract
| away your DBMS in most applications. It's just too big of
| an interface, and performance considerations leak right
| through it. It's always one of the selling points of an
| ORM, "you can switch databases later," and then you never
| actually switch.
| BeetleB wrote:
| Indeed - the acronym comes from _The Pragmatic Programmer_,
| and the authors defined it in this way. Every blog post I've
| read criticizing/cautioning against DRY was not doing DRY as
| originally defined.
|
| DRY is almost always a good thing to do. Coupling superficially
| similar code is definitely not a good thing to do.
| haswell wrote:
| Yeah, here's the quote from the later editions addressing
| this:
|
| > _Let's get something out of the way up-front. In the first
| edition of this book we did a poor job of explaining just
| what we meant by Don't Repeat Yourself. Many people took it
| to refer to code only: they thought that DRY means "don't
| copy-and-paste lines of source." That is part of DRY, but
| it's a tiny and fairly trivial part._
|
| > _DRY is about the duplication of knowledge, of intent. It's
| about expressing the same thing in two different places,
| possibly in two totally different ways._
|
| > _Here's the acid test: when some single facet of the code
| has to change, do you find yourself making that change in
| multiple places, and in multiple different formats? Do you
| have to change code and documentation, or a database schema
| and a structure that holds it, or...? If so, your code isn't
| DRY._
| dllthomas wrote:
| > Coupling superficially similar code is definitely not a
| good thing to do.
|
| I've taken to calling that activity (removing syntactic
| redundancy that is only coincidental) "Huffman coding".
| drewcoo wrote:
| > misunderstanding of what it means
|
| And in response, people will complain that they're being
| dismissed with "you're doing it wrong!" Because that happens
| with everything in programmer-land.
| haswell wrote:
| The easy response to someone feeling this way is to point
| them to the origin of DRY: _The Pragmatic Programmer_.
|
| In the book, the authors explicitly call out that many people
| took the wrong idea from the original writing. They clarify
| that DRY is not about code, it's about what they call
| "knowledge", and that code is just one expression of it.
|
| People can still disagree, but the original intent behind DRY
| is very well articulated.
| hot_gril wrote:
| Yeah, applies to databases and documentation especially.
| Databases have the ol' 3NF, you also want to avoid copying data
| from one source of truth to another in a multi-service
| environment, and sometimes I intentionally avoid writing docs
| because I want the code or API spec (with its comments) to be
| the only documentation.
| causal wrote:
| Yeah, premature DRY is a pet peeve of mine. Especially since the
| "size" of the code necessary to trigger DRY is totally
| subjective: some people apply DRY when they see similar blocks of
| code, others are so averse to repetition they start abstracting
| out native syntax.
| pphysch wrote:
| > Is the duplication truly redundant or will the functionality
| need to evolve independently over time?
|
| "Looks the same right now" != "Is the same all the time"
|
| Bad abstraction is worse than no abstraction
| rmnclmnt wrote:
| It is especially hurtful when people apply DRY immediately on
| some spaghetti code already mixing abstractions.
|
| Then you find yourself untangling intertwined factorized code on
| top of leaky abstractions, losing hours/days and pulling your
| hair out... (I'm bald already but I'm pretty sure I'm still
| losing hair in these situations)
| sharbloop wrote:
| My rule: repeat yourself 3 times. On the 4th, re-factor.
| seattle_spring wrote:
| A good alternative to DRY is WET, or "Write Everything Twice."
| Or, in your case, "Write Everything Thrice". Both better
| alternatives than automatic, dogmatic DRY.
| wzdd wrote:
| Seems like a strawman. The thing being repeated here is something
| which raises if the datetime isn't in the future. So abstract
| that out and you then get both methods calling
| raiseIfDateTimeNotInFuture() which then also serves as
| documentation.
|
| (But yes, if the actual code is as simple as this example, you
| may as well just repeat it.)
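|
| A rough Python rendering of that suggestion. The helper is the
| snake_case equivalent of raiseIfDateTimeNotInFuture(); the two
| setter functions, the dict-based records, and the optional `now`
| parameter are assumptions made for this sketch, not code from the
| article.
|
```python
from datetime import datetime, timedelta
from typing import Optional


def raise_if_datetime_not_in_future(when: datetime,
                                    now: Optional[datetime] = None) -> None:
    """Raise ValueError unless `when` is strictly later than `now`."""
    current = now if now is not None else datetime.now()
    if when <= current:
        raise ValueError("datetime must be in the future")


def set_task_deadline(task: dict, deadline: datetime) -> None:
    # The helper's name documents the rule at the call site.
    raise_if_datetime_not_in_future(deadline)
    task["deadline"] = deadline


def set_payment_deadline(payment: dict, deadline: datetime) -> None:
    raise_if_datetime_not_in_future(deadline)
    payment["deadline"] = deadline


# Example use: a deadline one day from now is accepted.
example_task: dict = {}
set_task_deadline(example_task, datetime.now() + timedelta(days=1))
```
|
| Only the check is shared here; each setter stays free to grow its
| own extra rules later.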
| michaelcampbell wrote:
| I mean, sure. I'm generally more WET than most of my colleagues,
| but this...
|
| > Applying DRY principles too rigidly leads to premature
| abstractions that make future changes more complex than
| necessary.
|
| ... is just one of those things that sounds wise, but is just
| basically a tautology. Use the best tool for the job, etc. No
| kidding? Would never have thought of that on my own, thanks
| sensei.
|
| Seriously, the issue with the quoted statement is not that it's
| new to anyone, it's that no one thinks they ARE applying DRY
| principles "too rigidly". This is just chin beard stroking advice
| for "everyone else".
| AnimalMuppet wrote:
| Well, then, here's some advice:
|
| Learn when to DRY, and when not.
|
| No, you probably don't know as well as you think you do. No,
| you're not going to get there by grinding leetcode. No, you
| aren't going to get there quickly, or without a lot of
| interaction with more-experienced peers, or without being told
| that your judgment is bad a few times. (And if you don't listen
| - really listen - then you don't learn.)
|
| Good judgment in these things takes time and experience. If you
| have a year of experience and think you know, you're probably
| wrong.
| protomolecule wrote:
| >takes time and experience
|
| Or maybe asking yourself what end goal you're trying to
| achieve by building an abstraction.
| protomolecule wrote:
| >it's that no one thinks they ARE applying DRY principles "too
| rigidly"
|
| But they should think twice if they are building abstractions
| _only_ for the sake of DRY.
| 0xbadcafebee wrote:
| Like the article ends with, DRY goes hand in hand with YAGNI. The
| point isn't to build a million abstractions; it's to find the
| places where you have duplication and de-duplicate it, or where
| you know there'll be duplication and abstract it, or to simply
| rearchitect/redesign to avoid complexity and duplication. This
| applies to code, data models, interfaces, etc.
|
| The duplication is typically bad because it leads to
| inconsistency which leads to bugs. If your code is highly
| cohesive and loosely coupled, this is less likely [across
| independent components].
|
| And on this:
|
| > When designing abstractions, do not prematurely couple
| behaviors
|
| Don't _ever_ couple behaviors, unless it 's within the same
| component. Keep your code highly cohesive and loosely coupled.
| Once it's complete, wall it off from the other components with a
| loosely-coupled interface. Even if that means repeating yourself.
| But don't let anyone make the mistake of thinking they both work
| the same because they have similar-looking interfaces or
| behaviors, or you will be stuck again in the morass of low
| cohesion. This is probably one of the 3 biggest problems in
| software design.
|
| Libraries are a great help here, but libraries _must_ be both
| backwards compatible, and not tightly coupled. Lack of backwards
| compatibility is probably the 4th biggest problem...
| pavlov wrote:
| This DRY-sceptical viewpoint is a bit similar to database
| denormalization.
|
| Sure, in theory you want to store every bit of information only
| once. But in practice it can make a real difference in smoothing
| out the access pattern if you don't follow this normalization
| religiously.
|
| The same applies to code. If you have to jump through hoops to
| avoid repeating yourself, it will also make it harder for someone
| else reading the code to understand what's going on. A bit of
| "code denormalization" can help the reader get to the point more
| quickly.
| chacham15 wrote:
| I think it depends on _how_ you deduplicate your code. Creating a
| DeadlineSetter as illustrated is definitely too much, but
| creating a function:
|
|     def assert_datetime_in_future(datetime):
|         if datetime <= datetime.now():
|             raise ValueError("Date must be in the future")
|
| and then calling that from both places seems fairly reasonable to
| me.
| 12_throw_away wrote:
| Right? Creating a noun instead of a verb is the real anti-
| pattern I see here. (Once you have a DeadlineSetter, it's a
| slippery slope down to ClassInstanceFactoryConfigProxyManager,
| etc.)
| thrdbndndn wrote:
| To the author: please do not use non-ascii quotes ("") in code.
| theshrike79 wrote:
| Some blog engines try to be too fancy and do it automatically
| lcnPylGDnU4H9OF wrote:
| A practical rule for the presented problem is "wait until you
| have 3". The number being in reference to the amount of different
| cases which need to be handled. You're not likely to catch
| everything that will come up but you'll get enough to think of an
| extensible abstraction if you don't realize that you already have
| a workable one.
| EugeneOZ wrote:
| It really depends on what this code is doing. If it is dialog
| window rendering - yes, not so important. If it's complicated
| data validation - you better make it reusable and pure from the
| beginning.
| lcnPylGDnU4H9OF wrote:
| I agree. The data validation example doesn't seem contrary to
| the advice; you would generally have at least three data
| types you need to handle in such a case.
|
| The difference with the dialog window is that (presumably)
| you don't know the different flavors of window you'll need to
| render so adding an abstraction on top of the existing
| rendering abstraction fits squarely in "premature
| optimization".
| dec0dedab0de wrote:
| But don't stop telling new developers to be DRY, it's really just
| a way to remind them they're allowed to make functions.
|
| Just step in when they go too far
| watters wrote:
| This reads like a paraphrase of this widely circulated post from
| 8 years ago...
|
| https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction
| dmeijboom wrote:
| Thanks! I was looking for this blog post for a while now
| dvh wrote:
| DRY code (usually with a lot of IF blocks to handle special cases,
| or various OOP lasagna) eventually turns into an unmaintainable
| nightmare where every trivial new feature can take hours to
| implement and is a very difficult, full of cussing, hair-pulling
| kind of programming where every 5 minutes you think "we need to
| rewrite everything from scratch, the system wasn't designed for
| this". Every change breaks a million different unrelated things
| because of the complexity of extremely DRY functions.
|
| In WET code (write everything twice) everything looks primitive,
| as if it was written by complete newbie, and every change needs
| to be added at multiple places, but each change is trivial and
| time to finish is predictable. I would go as far as calling the
| code boring. The most difficult thing is to resist the temptation
| to remove the duplicity.
| ollien wrote:
| > In WET code (write everything twice) everything looks
| primitive, as if it was written by complete newbie, and every
| change needs to be added at multiple places, but each change is
| trivial and time to finish is predictable. I would go as far as
| calling the code boring. The most difficult thing is to resist
| the temptation to remove the duplicity.
|
| This only scales so far. After some point, it's very easy to
| run into cases where you meant to change something everywhere
| but forgot/didn't know about others. Not to say everything
| should be so compartmentalized as to restrict change, but there
| is a balance to be had.
| thfuran wrote:
| Yes, what actually happens is that many code changes are
| released half-baked because logic only got updated in 1 (or
| 13) of the 14 places that needed to be updated, and the
| cussing and hair pulling just starts later.
| rrr_oh_man wrote:
| Tests, baby
| gary_0 wrote:
| Which is why you need a balance between WET and DRY. DAMP =
| Don't Alter in Many Places.
| kag0 wrote:
| I've never heard this one before, but I love it.
| Unfortunately we've also got "Don't Abstract Methods
| Prematurely" and "Descriptive And Meaningful Phrases".
| yakshaving_jgt wrote:
| Or, use a sufficiently well designed type-checking compiler,
| like GHC.
| DSMan195276 wrote:
| I've seen code like this, what eventually happens is that all
| your 'copies' drift to be slightly different. Fixes get applied
| to some but not all of them, people copy from old code vs. new
| code, etc. And whenever you need to apply a fix you spend hours
| trying to figure out where each copy is, what it is supposed to
| be doing (since they're all different), and how the fix can be
| applied to it. You inevitably don't find them all and repeat
| the cycle.
| rjurney wrote:
| This is especially true for a data scientist, where most code is
| throwaway. If you make it all spectacular, you aren't getting
| anything done. Data scientists' code should be "eventually good,"
| that is to say it gets refactored as it approaches a production
| environment. I talk about this in my last book, Agile Data
| Science 2.0 (Amazon 4.1 stars 7 years after publishing).
|
| https://www.amazon.com/Agile-Data-Science-2-0-Applications/d...
|
| I will say that after 20 years of working as a software engineer,
| data engineer, data scientist and ML engineer, I can write pretty
| clean Python all the time but this isn't common.
| stephc_int13 wrote:
| The problem with all the "best practices" is that quite often
| they are sensible within some context but can be a detrimental
| tradeoff in a different one.
|
| "it depends" is almost always the correct answer.
|
| But what we see is young, inexperienced and zealous coders trying
| too hard and implementing the so-called best practices before they
| understand them.
|
| And I don't think there are too many shortcuts to replace
| experience.
|
| My advice for beginners and intermediate is to first stick to the
| simplest solution that works, and don't be afraid to rewrite.
| PaulStatezny wrote:
| > And I don't think there are too many shortcuts to replace
| experience.
|
| But that's the point of these blogs: helping those without
| experience. Should we leave them to flounder on their own until
| they "figure it out" instead of trying to pass along wisdom?
|
| There's evidence that the best approach is, yes, experience -
| but _with Expert Feedback_. In practice, this looks like
| pairing and informal apprenticeship with competent, seasoned
| engineers.
|
| I can confirm from my own experience how much you can learn
| from working with engineers "further down the road".
| hcarvalhoalves wrote:
| The article builds a straw man though. The "bad example" is bad
| because it introduces OOP for no reason at all.
|
| What's wrong with:
|
|     def set_deadline(deadline):
|         if deadline <= datetime.now():
|             raise ValueError("Date must be in the future")
|
|     set_task_deadline = set_deadline
|     set_payment_deadline = set_deadline
|
| You don't need code duplication to avoid bad abstractions.
| laserlight wrote:
| Exactly. I was happy to see a code example, but facepalmed when
| I actually read it.
| spion wrote:
| It's not about OOP but the probability that those two functions
| will diverge. Linked elsewhere in the comments too, this
| article (https://sandimetz.com/blog/2016/1/20/the-wrong-
| abstraction) is probably better at articulating the point.
| zer00eyz wrote:
| People talk about DRY and then happily type pip/gem/npm into
| their terminal, and never look at 99 percent of what they just
| downloaded...
|
| Did we all forget leftpad? https://qz.com/646467/how-one-
| programmer-broke-the-internet-...
| Leherenn wrote:
| Isn't leftpad the natural conclusion of DRY? Everything is a
| unique, small, contained and tested library that other code can
| depend on instead of reimplementing it? The ultimate one source
| of truth, where if it breaks half the internet breaks.
| zer00eyz wrote:
| > Isn't leftpad the natural conclusion of DRY? Everything is
| a unique, small, contained and tested library that other code
| can depend on instead of reimplementing it?
|
| There is nothing in dry that says "util" or "frameworks" or
| "toolchains" are bad.
|
| > The ultimate one source of truth, where if it breaks half
| the internet breaks.
|
| Dry says nothing about versioning, or vendoring or deleting
| your code from the internet...
|
| The reality is that leftpad wasn't used by that many things.
| It's just that the things that did use it were all over the
| dependency graph...
| kelseydh wrote:
| Fixing duplication is far easier than the wrong abstraction.
| ziml77 wrote:
| Yes. If you abstract without a specific need, you are likely to
| end up with an abstraction that either wastes time because it's
| never used or that you will later need to fight against because
| the changes you need to make don't mesh well with it. At that
| point you have to choose between a lengthy rework of the code
| or awful hacks to bypass the abstraction.
| wg0 wrote:
| This DRY principle has ruined so many code bases merging so many
| facets into one giant monster of complexity that then later has
| to be specialised with flags and enums that I can't count how
| many times I have seen such clever PRs.
|
| IMHO - some of the clean code books have ruined the industry as
| much as the virtues of microservice preachers have.
|
| The second number goes to the Javascript tooling.
| dailykoder wrote:
| Just don't prematurely anything and write code that works. If you
| know how it works you automatically get an intuition what can be
| made better and where bottlenecks might be. Then you refactor it
| or just do a plain rewrite.
|
| It's really that simple. (There are always exceptions obviously)
| globular-toast wrote:
| Like any rule this can be taken too far. It happens all the time.
| People like simple rules. They want everything to be like
| assembling IKEA furniture: no thought required, just follow the
| instructions. We all like it because it frees up the mind to
| think about other things.
|
| There are rules like "don't stick your fingers in the plug
| socket". But, if you're an electrician, you can stick your
| fingers in the plug socket because you've isolated that circuit.
| DRY is similar. As a programmer, you can repeat yourself, but you
| should be aware that it's thoroughly unwise unless you know you
| have other protections in place, because you know _why_ such a
| rule exists.
| epr wrote:
| The example is hilariously terrible. Firstly, this is the
| currently required code:
|
|     def set_deadline(deadline):
|         if deadline <= datetime.now():
|             raise ValueError("Date must be in the future")
|
|     set_deadline(datetime(2024, 3, 12))
|     set_deadline(datetime(2024, 3, 18))
|
| There simply is no trade-off to be made at this point. Perhaps
| there will be eventually, but right now, there is one function
| needed in two places. Turning two functions that already could be
| one into a class is absurd.
|
| Now, as far as teaching best practices goes, I also dislike this
| post because it doesn't explicitly explain the pros and cons of
| refactoring vs not refactoring in any detail. There is no
| guidance whatsoever (ie: Martin Fowler's Rule of Three). This is
| Google we're talking about, and newer developers could easily be
| led astray by nonsense like this. Addressing the two extremes,
| and getting into how solving this problem requires some nuance
| and practical experience is much more productive.
| alex_smart wrote:
| Almost all programming tutorials and even books to a certain
| extent suffer from the problem of terrible examples. Properly
| motivating most design patterns requires context of a
| sufficiently complex codebase that tutorials and books simply
| do not have the space to get into. This particular case is
| especially bad, probably because they had the goal of having
| the whole article fit in one page. ("You can download a
| printer-friendly version to display in your office.")
|
| > There is no guidance whatsoever (ie: Martin Fowler's Rule of
| Three).
|
| That is completely unfair imo. Although not properly motivated,
| the advice is all there. "When designing abstractions, do not
| prematurely couple behaviors that may evolve separately in the
| longer term." "When in doubt, keep behaviors separate until
| enough common patterns emerge over time that justify the
| coupling."
|
| Simplified maxims like "Rule of Three" do more harm than good.
| Don't couple unrelated concerns is a much higher programming
| virtue than DRY.
| LouisSayers wrote:
| > Properly motivating most design patterns requires context
| of a sufficiently complex codebase
|
| As someone that's made a best selling technical course, I
| strongly disagree.
|
| It's 100% laziness and/or disregard for the reader.
|
| The reason examples are as bad as they are is that people
| rush to get something published rather than put themselves in
| the audience's position and make sure it's concise and makes
| sense.
|
| It's not like webpage space is expensive. There's plenty of
| room to walk through a good example, it just requires a
| little effort.
| Bjartr wrote:
| What does sales have to do with what you're claiming?
| Please share the course and or examples of it being done
| well without requiring that excessive context, so that
| there's something to support your claim.
| LouisSayers wrote:
| Well if my course and teaching was crap I wouldn't get
| good reviews and therefore many sales. I've spent $0 on
| marketing.
|
| https://www.udemy.com/neo4j-foundations/
|
| There are many people who do teach and explain topics
| well. Richard Feynman comes to mind.
|
| I've found Abdul Bari on YouTube to also be an excellent
| teacher around technical topics.
| alex_smart wrote:
| >It's not like webpage space is expensive.
|
| It is not the webpage space. It is people's limited
| attention spans and ability to focus. A complex example is
| needed to properly motivate certain concepts, but too
| complex an example also contains too many other details
| that the reader gets bogged down/distracted from the main
| concept being discussed.
|
| At least that is my hypothesis for why almost all
| programming books and tutorials have terrible examples. I
| am happy to be proven wrong.
|
| Coming back to the article, I looked at some of the
| previous articles from the same series, and to me it feels
| like a very conscious decision to only include 3-4 line
| code examples.
| re-framer wrote:
| Your example, deduplicating the two functions into one,
| illustrates an interesting point, although I'd prefer still
| having the two specialized functions there:
|
|     def set_deadline(deadline):
|         if deadline <= datetime.now():
|             raise ValueError("Date must be in the future")
|
|     def set_task_deadline(task_deadline):
|         set_deadline(task_deadline)
|
|     def set_payment_deadline(payment_deadline):
|         set_deadline(payment_deadline)
|
|     set_task_deadline(datetime(2024, 3, 12))
|     set_payment_deadline(datetime(2024, 3, 18))
|
| You lose absolutely nothing. If you later want to handle the
| two cases differently, most IDEs allow you to inline the
| set_deadline method in a single key stroke.
|
| So the argument from the article...
|
| > Applying DRY principles too rigidly leads to premature
| abstractions that make future changes more complex than
| necessary.
|
| ...does not apply to this example.
|
| There clearly _are_ kinds of DRY code that are less easy to
| reverse. Maybe we should strive for DRY code that can be easily
| transformed into WET (Write Everything Twice) code.
|
| (Although I haven't worked with LISPs, macros seem to provide a
| means of abstraction that can be easily undone without risk:
| just macro-expand them)
|
| In my experience, it can be much harder to transform WET code
| into DRY code because you need to resolve all those little
| inconsistencies between once-perfect copies.
| fiddlerwoaroof wrote:
| I've always found that duplicating and editing over-DRY code is
| easier than fixing code that's under-DRY. I strongly prefer
| working with people that care about DRY code and accidentally go
| too far than the reverse. Additionally, the worst problems I've
| had in inherited code have been due to duplication and
| insufficient abstractions leading to logical inconsistency.
| jandrewrogers wrote:
| The term "DRY" as commonly used conflates distinct situations and
| objectives that should be handled differently in most cases.
|
| There is the "single source of truth" problem where you need to
| compute something exactly the same way at all points in the
| software that need to compute that thing. In these cases you
| really want a single library implementation so that the meaning
| of that computation does not accidentally diverge over time since
| there is a single implementation to maintain.
|
| There is the "reuse behaviors in unrelated contexts" problem
| where you want to create implementations of common useful
| behaviors that can be abstracted over many use cases, often in
| the context of data structures and algorithms. In these cases you
| really want generics and metaprogramming to codegen a context-
| specific implementation rather than sharing a single
| implementation with a spaghetti mess of conditionals.
|
| DRY works best when it fits neatly and exclusively into one of
| these two categories. Cases that fit in neither category, such as
| the practice of decomposing every non-trivial function into a
| bunch of micro-functions that call each other, are virtually
| always a maintenance nightmare for no obvious benefit. Cases that
| fit into both categories, such as expansive metaprogramming
| libraries, become difficult to maintain by virtue of the
| combinatorial explosion of possible implementations that might be
| generated across the allowable parameter space -- the cognitive
| overhead grows exponentially for what is often a linear increase
| in value.
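|
| A rough illustration of the second category, assuming Python, where
| a function parameterized over types and a key function stands in for
| real generics or codegen. The name dedupe_by is invented for this
| sketch and does not come from the comment above. The point is only
| that callers supply the context-specific part, so the shared code
| needs no per-caller conditionals.

```python
from typing import Callable, Hashable, Iterable, List, TypeVar

T = TypeVar("T")


def dedupe_by(items: Iterable[T], key: Callable[[T], Hashable]) -> List[T]:
    """Keep the first item seen for each key, preserving input order.

    The behavior that varies per use case (what counts as "the same")
    is passed in as `key` instead of being selected by flags or
    if/else branches inside the shared implementation.
    """
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            result.append(item)
    return result


# Two unrelated contexts reuse the same behavior without touching it.
unique_names = dedupe_by(["a", "A", "b"], key=str.lower)
unique_ids = dedupe_by([(1, "x"), (1, "y"), (2, "z")], key=lambda p: p[0])
```

| Adding a third use case here means writing a new key function, not
| adding another branch to dedupe_by.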
| a1369209993 wrote:
| > such as the practice of decomposing every non-trivial
| function into a bunch of micro-functions that call each other,
|
| That has approximately nothing to do with DRY. At best it might
| technically be a violation of DRY if (and only if) some of
| those micro-functions are identical, but the correct way to fix
| that is to recompose them into the non-trivial (and non-
| repeated) functions they're more usefully expressed as. And
| more often it's just a totally independent refucktoring that
| makes the codebase worse entirely orthogonally to 'DRY-ness'.
| skydhash wrote:
| Common Lisp works well for the second case. But it seems that
| some programmers are uncomfortable with the notion of code
| generating code. And as you said, it does require discipline as
| you need to focus the language's power. Other languages don't
| let you solve the boilerplate problem so readily. Instead you
| have a mess of utility functions or a huge class tree.
| icoder wrote:
| Reading a lot of this discussion I'm thinking whether DRY itself
| is the problem or it's more about mixing different (but perhaps
| comparable) things into one function (be it for the sake of
| appearing DRY or otherwise).
| Symmetry wrote:
| I have a sticker on my laptop of a yin-yang with DRY and YAGNI
| instead of the dots.
| liampulles wrote:
| My maxim: is "it" intrinsically the same, or coincidentally the
| same?
|
| Intrinsically the same means a rule, and so there should be 1
| source of truth for it. Coincidentally the same means it has the
| same shape but this just happens to be the case, and they should
| be left separate to evolve independently.
|
| Ultimately, it boils down to really thinking about the domain.
| srvaroa wrote:
| "Anytime you apply a rule too universally, it turns into an anti-
| pattern".
|
| Quote from Will Larson found in another HN post
| (https://review.firstround.com/unexpected-anti-patterns-for-e...)
| right after checking out this one.
| halfcat wrote:
| The visualizations in Dan Abramov's talk "The wet codebase" [1]
| really burned this concept in for me.
|
| Seeing what a premature, wrong abstraction looks like visually
| was eye opening.
|
| [1] https://youtu.be/17KCHwOwgms
| ofrzeta wrote:
| I know it's supposed to be catchy but "Don't Repeat Yourself" is
| quite too dogmatic. A little redundancy can absolutely help
| readability. Obviously you don't want to repeat complicated code
| blocks that you have to maintain twice.
| LouisSayers wrote:
| Can someone also write an article on how not to write code like
| is in this article?
|
| `DeadlineSetter` should not be a class, and besides that the
| implementation makes zero sense. The whole thing should probably
| just be a single if statement.
| kag0 wrote:
| In my experience DRY and many (any?) other coding principles are
| only problematic when misused. They're typically misused because
| the user doesn't understand the motivation or underlying value of
| the principle in the first place.
|
| I think the example in the article does a bit of that as well.
| The example sets a deadline on a thing (a task or payment) by
| validating the deadline against the current time, and then
| presumably doing something else that isn't shown. The article
| argues that in the future a task might have different validation
| requirements than a payment, and they're only coincidentally the
| same today; so it would be foolish to abstract the deadline
| setting logic today. BUT, the reality is that the real
| coincidence is that payments and tasks have the same set of
| validations, not that the logic to validate a deadline is
| coincidentally the same. In my opinion "good" code would be fine
| to have separate set_task_deadline and set_payment_deadline
| methods, but only one validate_deadline_is_in_future (or
| whatever) method, alongside other validation methods which can be
| called as appropriate by each set_x_deadline implementation.
|
| Disclaimer: the code is so short and trivial that it doesn't
| matter, I think we can all assume that this concept is
| extrapolated onto a bigger problem.
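|
| A minimal sketch of that layout, assuming Python and plain dicts
| for the task and payment. The 90-day rule, validate_deadline_within
| and the injectable `now` argument are all made up for illustration;
| only the future-date check comes from the article.

```python
from datetime import datetime, timedelta


def validate_deadline_is_in_future(deadline, now=None):
    """Shared rule: a deadline must lie strictly in the future."""
    now = now or datetime.now()
    if deadline <= now:
        raise ValueError("Date must be in the future")


def validate_deadline_within(deadline, max_days, now=None):
    """Hypothetical extra rule, invented here for illustration only."""
    now = now or datetime.now()
    if deadline > now + timedelta(days=max_days):
        raise ValueError(f"Date must be within {max_days} days")


def set_task_deadline(task, deadline, now=None):
    # Tasks only need the shared rule today.
    validate_deadline_is_in_future(deadline, now)
    task["deadline"] = deadline


def set_payment_deadline(payment, deadline, now=None):
    # Payments compose the shared rule with one of their own,
    # without changing anything about how tasks behave.
    validate_deadline_is_in_future(deadline, now)
    validate_deadline_within(deadline, max_days=90, now=now)
    payment["deadline"] = deadline
```

| The two setters stay free to diverge, while the "in the future"
| rule still has a single definition.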
| vi2837 wrote:
| I wonder why nobody mentioned it - there is one more advanced
| principle AHA: (Avoid Hasty Abstractions)
| https://kentcdodds.com/blog/aha-programming#aha- Overusing the
| DRY principle can make software almost unsupportable.
| wseqyrku wrote:
| Related: I do believe starting off at the "second" level of
| abstraction (as opposed to implementing the direct surface area
| of the service) is not premature as it helps to better understand
| the problem space, and on the implementation side, as soon as you
| identify the building blocks, the rest would be really just
| boilerplate. if you got time, rinse and repeat.
| visil wrote:
| Reminds me of this[1] great blogpost: "This abstraction adds
| overhead. "Abstracting" the common operation has made it more
| difficult to read, not less difficult to read. People who
| consider meta-programming some sort of Black Magic often make
| this exact point: The mechanism for removing duplication adds
| complexity itself. One view is that the overall effect is only a
| win if the complexity added is small compared to the duplication
| removed."
|
| [1]: http://weblog.raganwald.com/2007/12/golf-is-good-program-
| spo...
| EasyMark wrote:
| used once? don't worry, don't think about "what about the
| possibility it's repeated in the future?"
|
| used twice? okay, maybe I will, maybe I won't
|
| used three+? "don't be lazy ya bum"
|
| I like simple rules, and I don't care if someone wants to turn it
| into a philosophical debate, I probably won't participate :)
| asdfman123 wrote:
| Upper management trying to drive software results by metrics is
| like trying to win a war with metrics unrelated to battle
| outcomes.
|
| You must produce X number of tanks, your forces must fire Y
| bullets, you should minimize the number of retreats.
|
| If you try to manage with no understanding of what's happening on
| the front lines (and upper management generally can't understand
| the front lines unless they've worked there recently), you're not
| going to win the war.
| swiftcoder wrote:
| This is a poorly-selected example, as the real problem here is
| not the DRY validation, it's that the programmer is abstracting
| the wrong thing.
|
| Ending up with an awkward class name like `DeadlineSetter` is a
| dead giveaway that your abstraction boundaries don't make sense -
| if instead you abstract `Deadline`, and put the invariant check
| in the constructor thereof, you solve both problems.
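|
| One way this could look, as a sketch assuming Python dataclasses.
| The field name `when`, the optional `now` argument (there only so
| the check can be tested deterministically) and the dict-based
| setters are assumptions, not anything from the article.

```python
from dataclasses import InitVar, dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Deadline:
    """A moment that was in the future when this object was created."""

    when: datetime
    # Init-only argument: used for validation, not stored on the instance.
    now: InitVar[Optional[datetime]] = None

    def __post_init__(self, now):
        # The invariant lives in exactly one place: construction.
        if self.when <= (now or datetime.now()):
            raise ValueError("Date must be in the future")


def set_task_deadline(task: dict, deadline: Deadline) -> None:
    # No validation here: a Deadline cannot exist in an invalid state.
    task["deadline"] = deadline.when


def set_payment_deadline(payment: dict, deadline: Deadline) -> None:
    payment["deadline"] = deadline.when
```

| Both setters remain separate functions, and neither repeats the
| check, because holding a Deadline already implies it passed.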
| nevinera wrote:
| DRY is _not a best practice_. Repetition is a "code smell" - it
| often suggests a missing abstraction that would allow for code
| reuse (what sort of abstraction depends on the language and
| context), but "blindly-drying" is in my experience the _single
| most frequent mistake_ made by mid-to-senior engineers.
|
| My experience is mostly in Ruby though, so I'm not sure how well
| it generalizes here :-)
| hfe wrote:
| My experiences are the same in C++ and Python. C++ in
| particular can get way out of hand in service of DRY.
| mmcnl wrote:
| Premature DRY can lead to the wrong abstractions. Sometimes
| code looks similar but actually isn't.
| samtho wrote:
| At my first big corporate job, I got to work on a codebase
| that was nothing but premature DRY'd code, but I didn't know
| it at the time. As someone who was self taught, and suffered
| from imposter syndrome as many of us do/did in that
| situation, I thought I was missing something huge until I was
| talking to a senior developer and these strange design
| decisions came up, to which he said something like
|
| > Yeah, that was written by <ex-engineer> and he couldn't
| abstract his way out of a paper bag
|
| I guess the real lessons were the crappy decisions that
| someone else made along the way.
| colechristensen wrote:
| Yeah I've had so many problems with understanding and working
| with other people's code bases when the person was obsessed
| with DRY.
|
| You wrote that code 4 years ago with tons of abstractions
| designed for some day someone not having to repeat
| themselves... but it's been years and they've never been
| useful. However I've had to dig through a dozen files to make
| the change I needed to make which by all rights should have
| been entirely contained in a few lines.
|
| My most common reaction to a new codebase is "where the hell
| does anything actually get done" because of silly over-
| abstraction which aspires to, one day, save a developer five
| minutes or three lines of copied code.
| hughesjj wrote:
| FWIW I completely agree in python, Java, typescript, and
| golang. I've seen people just parrot dogma about DRY and SOLID
| principles where their DRY'd code is completely not open to
| extension etc
|
| Premature dry'ing is the same as premature engineering. And
| lest someone go 'oh so YAGNI is all you need'... no, sometimes
| you are going to need it and it's better to at least make your
| code easily moldable to 'it' now instead of later. Future
| potential needs can absolutely drive design decisions
|
| My whole point is that dogma is dumb. If we had steadfast easy
| rules that applied in literally every situation, we could just
| hand off our work to some mechanical turks and the role of
| software engineer would be redundant. Today, that's not the
| case, and it's literally our job to balance our wisdom and
| experience against the current situation. And yes, we will
| absolutely get it wrong from time to time, just hopefully a
| lower percentage of occasions as we gain experience.
|
| The only dogma I live by for code is 'boring is usually
| better', and the only reason I stick by that is because it
| implicitly calls out that it's not a real dogma in that it
| doesn't apply in all cases.
|
| (Okay, I definitely follow more principles than that, but don't
| want to distract from the topic at hand)
| adrianmonk wrote:
| > _" blindly-drying"_
|
| Right. It's not an optimization problem!
|
| Remember in school when you learned to turn a truth table into
| a Karnaugh map and then use it to find the smallest equivalent
| logic expression? Well, your code is not a Karnaugh map, is it?
| philipwhiuk wrote:
| I tend to follow "1, 2, many"
|
| Duplication in one place, I'm often fine with, because you don't
| yet know the level of abstraction needed with only two examples.
|
| More than twice however and you should be able to see common
| patterns across all three implementations and be able to isolate
| it.
| gsuuon wrote:
| My rule of thumb is the third time I rewrite some code, I DRY it.
| zamalek wrote:
| There is an unpraised advantage of keeping code, uh wet?, for as
| long as possible. When you do decide that a refactor is required,
| you have real use cases to test your abstraction against. While I
| am broadly in agreement with the article because of that, people
| designing public binary APIs don't have the luxury of delaying
| these choices.
| BurningFrog wrote:
| I think this is the best way to think about it:
|
| Ask yourself, if some fact or functionality changes, in how many
| places would the code have to change?
|
| If it's more than 1, you have a design problem. Of course, the
| solution does not at all have to be about DRYing.
| Stratoscope wrote:
| Sometimes it's best to be DRY right from the start.
|
| Several years ago, I did some contract work for a company that
| needed importers for airspace data and various other kinds of
| data relevant to flying.
|
| In the US, the Federal Aviation Administration (FAA) publishes
| datasets for several kinds of airspace data. Two of them are
| called "Class Airspace" and "Special Use Airspace".
|
| The guy who wrote the original importers for these treated them
| as completely separate and unrelated data. He used an internal
| generic tool to convert the FAA data for each kind of airspace
| into a format used within the company, and then wrote separate
| C++ code, thousands of lines of code each.
|
| Thing is, the data for these two kinds of airspace is mostly
| identical. You could process it all with one common codebase,
| with separate code for only the 10% of the data that is different
| between the two formats.
|
| When I asked him about this, he said, "I have this philosophy
| that says if you only have two similar things, it's best to write
| separate code for each. Once you get to a third, then you can
| think about refactoring and making some common code."
|
| That is a good philosophy! I have often followed it myself.
|
| But in this case, it was obvious that the two data formats were
| mostly the same, and there was never going to be a _third_ kind
| of almost-identical airspace, only the two. So we had twice the
| code we needed.
| wnevets wrote:
| > So we had twice the code we needed.
|
| Was that necessarily a bad thing and something that must be
| corrected for that code base?
|
| I usually follow the same rule of thumb until I find myself
| repeatedly updating both at the same time. If I can't update
| one without updating the other then they must be the same thing
| and it's time to DRY.
|
| Don't Repeat Yourself when updating code.
| Stratoscope wrote:
| Good points, thanks for bringing them up.
|
| Yes, there was some ongoing maintenance of this code where
| both versions had to be updated. The original author was not
| a pilot and was unfamiliar with some of the nuances of FAA
| airspace. One of the reasons they brought me in was that I am
| a pilot and knew how the FAA's data should be interpreted.
|
| In the end, not a huge deal, but it was annoying when I had
| to make the same changes in two places.
| thdc wrote:
| Knowing to DRY there depended on business knowledge that
| the original author did not have.
|
| While they were wrong in this case, I would say it was a
| reasonable move to not DRY based on the code pattern itself
| at the time. And that's the big difference imo - DRYing
| based strictly on the structure of code vs business
| processes.
| ebolyen wrote:
| I don't know, that sounds like a complex kind of ingest which
| could be arbitrarily subtle and diverge over time for legal and
| bureaucratic reasons.
|
| I would kind of appreciate having two formats, since what are
| the odds they would change together? While there may never be a
| 3rd format, a DRY importer would imply that the source
| generating the data is also DRY.
| TeMPOraL wrote:
| In such case I think I'd go for an internal-DRYing + copy-on-
| write approach. That is, two identical classes or entry
| points, one for each format; internally, they'd share all the
| common code. Over time, if something changes in one format
| but not the other, that piece of code gets duplicated and
| then changed, so the other format retains the original code,
| which it now owns.
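
A minimal Python sketch of the "internal-DRYing + copy-on-write" idea above. Every name here (`parse_boundary`, `ClassAirspaceImporter`, `SpecialUseImporter`) and the row layout are hypothetical, invented for illustration; none of it comes from the FAA data or the importers discussed in the thread.

```python
from dataclasses import dataclass


def parse_boundary(row: dict) -> list:
    """Shared helper (hypothetical): turn 'lat,lon;lat,lon' into points."""
    points = []
    for pair in row["boundary"].split(";"):
        lat, lon = pair.split(",")
        points.append((float(lat), float(lon)))
    return points


@dataclass
class Airspace:
    name: str
    kind: str
    boundary: list


class ClassAirspaceImporter:
    """Entry point for one format; delegates to the shared helper."""

    def import_row(self, row: dict) -> Airspace:
        return Airspace(row["name"], "class", parse_boundary(row))


class SpecialUseImporter:
    """Separate entry point for the other format.

    If this format's boundary encoding ever diverges, copy
    parse_boundary into this class and change only the copy;
    ClassAirspaceImporter keeps using the original.
    """

    def import_row(self, row: dict) -> Airspace:
        return Airspace(row["name"], "special_use", parse_boundary(row))


row = {"name": "DEMO", "boundary": "37.0,-122.0;37.5,-122.5"}
class_result = ClassAirspaceImporter().import_row(row)
special_result = SpecialUseImporter().import_row(row)
```

The two entry points stay separate from day one, so a later divergence only requires duplicating the one helper that changed, not untangling a shared class.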
| ebolyen wrote:
| I like that approach.
| Spivak wrote:
| I've had the mantra "inheritance is only for code reuse"
| and it's never steered me wrong.
| jameshart wrote:
| Inheritance is only good for code reuse, and it's a trick
| you only get to use once for each piece of code, so if
| you use it you need to be absolutely certain that the
| taxonomy you're using it to leverage code across is the
| right one.
|
| All 'is-a so it gets this code' models can be trivially
| modeled as 'has-a so it gets this code' patterns, which
| _don't_ have that single-use constraint... so the
| corollary to this rule tends towards 'never use
| inheritance'.
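
One way to picture the 'has-a so it gets this code' pattern from the comment above, as a rough Python sketch. The classes `CsvReader`, `Deduplicator`, and `AirportImporter` are hypothetical names made up for this example.

```python
class CsvReader:
    """Reusable behaviour, packaged as a component rather than a base class."""

    def read(self, text: str) -> list:
        return [line.split(",") for line in text.strip().splitlines()]


class Deduplicator:
    """A second reusable behaviour, composed alongside the first."""

    def unique(self, rows: list) -> list:
        seen = set()
        result = []
        for row in rows:
            key = tuple(row)
            if key not in seen:
                seen.add(key)
                result.append(row)
        return result


class AirportImporter:
    """Has a reader and has a deduplicator; each is reused independently."""

    def __init__(self) -> None:
        self.reader = CsvReader()
        self.dedup = Deduplicator()

    def load(self, text: str) -> list:
        return self.dedup.unique(self.reader.read(text))


rows = AirportImporter().load(
    "KSFO,San Francisco\nKSFO,San Francisco\nKOAK,Oakland\n"
)
```

Because the importer holds its collaborators instead of inheriting from them, either one can be swapped or dropped without committing to a class taxonomy.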
| jeremyjh wrote:
| This is really good advice and a great way to think about
| it.
| Stratoscope wrote:
| Good point. This may be a case where domain knowledge is
| helpful.
|
| One of the reasons they brought me in on this project is that
| besides knowing how to wrangle data, I'm also an experienced
| pilot. So I had a good intuitive sense of the meaning and
| purpose of the data.
|
| The part of the data that was identical is the description of
| the airspace boundaries. Pilots will recognize this as the
| famous "upside down wedding cake". But it's not just simple
| circles like a wedding cake. There are all kinds of cutouts
| and special cases.
|
| Stuff like "From point A, draw an arc to point B with its
| center at point C. Then track the centerline of the San
| Seriffe River using the following list of points. Finally,
| from point D draw a straight line back to point A."
|
| The FAA would be very reluctant to change this, for at least
| two reasons:
|
| 1. Who will provide us the budget to make these changes?
|
| 2. Who will take the heat when we break every client of this
| data?
| ebolyen wrote:
| I see, so it's a procedural language that is well
| understood by those who fly (not just some semi-structured
| data or ontology). This is a great example of the advantage
| of domain experience. Thanks for sharing!
| Stratoscope wrote:
| > _a procedural language that is well understood by those
| who fly_
|
| That is a great way to describe it!
|
| Of course it is all just rows in a CSV file, but yes, it
| is a set of instructions for how to generate a map.
|
| In fact the pilot's maps were being drawn long before the
| computer era. Apparently the first FAA sectional chart
| was published in 1930! So the data format was derived
| from what must have been human-readable descriptions of
| what to plot on the map using a compass and straightedge.
|
| I just remembered a quirk of the Australian airspace
| data. Sometimes they want you to draw a direct line from
| point F to point G, but there were two different kinds of
| straight lines. They may ask for a great circle, a
| straight path on the surface of the Earth. Or a rhumb
| line, which looks straight on a Mercator projection but
| is a curved path on the Earth.
|
| You would often have some of each in the very same
| boundary description!
|
| For anyone curious about this stuff, I recommend a visit
| to your local municipal airport and stop by the pilot
| shop to buy a sectional chart of your area.
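
To make the great circle versus rhumb line distinction above concrete, here is a small Python sketch comparing the two path lengths. It assumes a perfectly spherical Earth with a mean radius of 6371 km, which is a simplification; real aeronautical software would likely use an ellipsoidal model. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Shortest path on the sphere (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def rhumb_line_km(lat1, lon1, lat2, lon2):
    """Constant-bearing path: a straight line on a Mercator chart."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    if abs(dlam) > math.pi:  # take the shorter way around the globe
        dlam -= math.copysign(2 * math.pi, dlam)
    # Difference in Mercator "stretched" latitude between the two points.
    dpsi = math.log(
        math.tan(math.pi / 4 + phi2 / 2) / math.tan(math.pi / 4 + phi1 / 2)
    )
    # On an east-west course dpsi is zero, so fall back to cos(latitude).
    q = dphi / dpsi if abs(dpsi) > 1e-12 else math.cos(phi1)
    return EARTH_RADIUS_KM * math.hypot(dphi, q * dlam)


# Two points on the 60th parallel, 90 degrees of longitude apart.
gc = great_circle_km(60.0, 0.0, 60.0, 90.0)
rh = rhumb_line_km(60.0, 0.0, 60.0, 90.0)
# The rhumb line follows the parallel, so it is longer than the great circle.
```

Along a meridian or along the equator the two paths coincide, which is one reason a boundary description has to say which kind of "straight line" it means.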
| tass wrote:
| Paper charts are great (they're fairly cheap and printed
| quite nicely in the USA at least) but you can get a good
| look at these boundaries through online charts.
|
| https://skyvector.com is a good way to view these.
| bcrosby95 wrote:
| I don't know. I've seen this approach go bad on projects
| before - people didn't want to DRY because they might diverge.
| Except they never did. Our 3rd+ scenarios we abstracted.
|
| But what basically ended up happening was we had 2 codebases:
| 1 for that non-DRY version, and then 1 for everything else.
| The non-DRY version limped along and no one ever wanted to
| work on it. The ways it did things were never updated. It was
| rarely improved. It was kinda left to rot.
| jononor wrote:
| Why wasn't the original implementation swapped for the new
| one? The unwillingness/inability to do that seems to be
| most likely the core of the issues here?
| bcrosby95 wrote:
| The majority of our business was through the 1st
| implementation. Because of that it was the base we used
| to refactor into a more abstract solution for further
| scenarios. It was never deemed "worth it" to transition
| the 2nd non-DRY version. Why refactor an existing
| implementation if it's working well enough and we could
| expand to new markets instead?
| steve_adams_86 wrote:
| I find this is a case where different pipelines utilizing
| common functions in different compositions can be a great
| strategy. If something diverges and a function no longer makes
| sense in a pipeline, that's not a big deal. Just pull it out
| and replace it with something bespoke that does the right
| thing.
|
| I've had a lot of success with this in embedded settings where
| data is piped into storage or OTA, and I want to format and
| pack it/send it up consistently but I might want to treat the
| data itself slightly differently.
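
The pipeline idea above might look roughly like this in Python. The step names (`drop_negative`, `scale_to_millivolts`, `clamp_to_limit`, `pack`) and the sample data are hypothetical; the sketch only shows how two pipelines can share steps while one of them adds a bespoke step.

```python
from functools import reduce


def compose(*steps):
    """Build a pipeline that applies each step in order."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


# Shared, individually named steps (all hypothetical).
def drop_negative(samples):
    return [s for s in samples if s >= 0]


def scale_to_millivolts(samples):
    return [s * 1000 for s in samples]


def pack(samples):
    return ",".join(str(s) for s in samples)


# A bespoke step used by only one pipeline.
def clamp_to_limit(samples, limit=3000):
    return [min(s, limit) for s in samples]


storage_pipeline = compose(drop_negative, scale_to_millivolts, pack)
uplink_pipeline = compose(
    drop_negative, scale_to_millivolts, clamp_to_limit, pack
)

readings = [1, -2, 5]
stored = storage_pipeline(readings)
sent = uplink_pipeline(readings)
```

If a shared step stops fitting one pipeline, that pipeline's `compose(...)` call is the only place that changes; the other pipeline and the step itself are untouched.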
| hansvm wrote:
| A related concept that IMO still aligns with DRY is that you
| should only avoid seeming code duplication when things are
| _semantically_ the same. No matter the mechanism (codegen,
| generics, macros, inheritance, ...), if you can't give a
| concept a meaningful [0] name then you usually shouldn't DRY
| it up with any mechanism. Your example is a technique I also
| use a lot, but the critical point is that you're choosing to
| break out functionality which _is_ easy to name.
|
| [0a] More generally, I like a concept of "total" functions --
| those which have sensible outputs for all their inputs. It's
| a bit of a tomayto/tomahto situation defining "all their
| inputs" (e.g., I'm personally okay using a function name like
| `unsafe_foo` and expecting a person to read the docs, and on
| the other extreme some people want sensible answers to
| anything the type system allows you to input), but the
| desired end-state is that when the project's requirements
| change you don't muck around with the ABI and implementation
| of `count_or_maybe_sort_for_these_three_special_customers_or_
| else_hit_the_db(...)`, or whatever much more generic and very
| wrong name the method actually has; the individual components
| are already correct, so you make the changes at the few
| methods which are actually wrong given the new requirements.
|
| [0b] Another way of thinking about it is whether the two
| things should always change in tandem. For two largely
| overlapping bureaucratic data formats? Maybe; there's a comment
| somewhere in this chain suggesting that they'll never go out
| of sync, but I'm a bit paranoid of that sort of thing. For
| the particular data structures that are currently shared by
| those formats? Absolutely not; if one diverges then you can
| build the new structure and link it in. The old structure is
| still valid in its own right.
| magicalhippo wrote:
| We've got a large number of customer-specific file
| integrations, and a lot of them are indeed very similar as the
| customers have the same system on the other side. However
| almost all the time there's some tweaking needed. Customer A
| used field X for this but customer B used the field for that.
|
| So if a new customer comes and needs an integration to a system
| we already support, even if we think they'll start out being
| identical, we just copy the code.
|
| Thing is, these things evolve. Suddenly we have to patch over
| some process-related issues in the other system for customer A,
| while customer B does not have that issue. Now we can fix A's
| integration without worrying at all about affecting B.
|
| Of course we write library and helper functions, and use those
| actively throughout, so we only repeat the "top level" stuff.
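
A rough Python illustration of the split described above: the low-level helper is shared, while the "top level" integration is copied per customer. The function names, field widths, and sample line are all invented for this sketch.

```python
def parse_fixed_width(line, widths):
    """Shared library helper: slice a line into stripped fields."""
    fields, pos = [], 0
    for width in widths:
        fields.append(line[pos:pos + width].strip())
        pos += width
    return fields


def import_customer_a(line):
    # Customer A puts the order reference in the third field.
    sku, qty, ref = parse_fixed_width(line, [6, 3, 8])
    return {"sku": sku, "quantity": int(qty), "order_ref": ref}


def import_customer_b(line):
    # Started as a copy of import_customer_a; customer B uses the
    # third field for a warehouse code, so only this copy changed.
    sku, qty, code = parse_fixed_width(line, [6, 3, 8])
    return {"sku": sku, "quantity": int(qty), "warehouse": code}


line = "AB1234 12REF-0001"
record_a = import_customer_a(line)
record_b = import_customer_b(line)
```

A fix to customer A's integration touches `import_customer_a` only, so there is no need to reason about customer B at all.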
| hprotagonist wrote:
| https://grugbrain.dev/#grug-on-dry
|
| _grug begin feel repeat/copy paste code with small variation is
| better than many callback/closures passed arguments or elaborate
| object model: too hard complex for too little benefit at times
|
| hard balance here, repeat code always still make grug stare and
| say "mmm" often, but experience show repeat code sometimes often
| better than complex DRY solution_
|
| something i have learned the hard way is that DRYing out too fast
| paints you into architectural corners you don't even know are
| there yet.
| localfirst wrote:
| read the whole article and wow!
|
| grug tell facts
| ReleaseCandidat wrote:
| That's an example for the wrong abstraction, not an example for
| "no DRY".
|
| Checking if a date is in the future does actually make sense, I
| would not do it like that (that's more of a
| `raise_if_not_in_future`), but whatever:
|
|     from datetime import datetime
|
|     def check_if_in_future(date):
|         if date <= datetime.now():
|             raise ValueError("Date must be in the future")
|
|     def set_task_deadline(task_deadline):
|         check_if_in_future(task_deadline)
|
|     def set_payment_deadline(payment_deadline):
|         check_if_in_future(payment_deadline)
| resters wrote:
| So refreshing to see this kind of wisdom in a concise blog post!
|
| My take:
|
| In beginners, over-emphasis on DRY is a mistake made because they
| don't yet understand why DRY is considered a best practice.
|
| In more senior developers, over-emphasis on DRY comes from a few
| psychological desires... 1) to mitigate the uneasy feeling of not
| knowing what direction the product will take, and 2) the warm
| feeling that comes from finding a refactor that makes the code
| more DRY.
|
| What is overlooked is the cognitive overhead required to un-DRY
| pieces of code when requirements change. Often the result is a
| DRY but convoluted series of refactors that obscure the intention
| of the code and (often) obscure system design intention that
| would otherwise have been quite clear.
|
| Sadly, many otherwise talented software engineers have the kind
| of minds that prefer micro-level problem solving and are
| challenged at big-picture reasoning. There is often actual
| discomfort when too much big-picture reasoning or synthesis is
| involved. I view this as more of an emotional than a cognitive
| limitation, and something that is amplified by the conformist
| culture found in most large organizations (and which many small
| ones believe it is best to emulate).
|
| Conformity with best practices is valued above real problem
| solving. Worse still, there are often elaborate discussions of
| PRs relating to minutia associated with DRYing up code for which
| it wasn't necessary in the first place.
|
| Sure, as a system matures there are opportunities to remove cruft
| and DRY code where it is obviously helpful, but it is silly to
| waste too much time on it until the true requirements of the
| system are well understood.
| bbwbsb wrote:
| I prefer: do the thing when doing so reduces the expected cost of
| (time-discounted) future outcomes by more than the expected
| utility of the next best thing you can do now.
|
| The problem with DRY occurs when it contravenes this principle -
| when deduplication is too expensive and/or unlikely to decrease
| the cost of future mutations enough to be worth it.
|
| The proposed problem isn't a binary - that you should or
| shouldn't make the assumption yet - but rather that the
| assumption has a cost based on what you believe is likely to
| occur in the future and the value produced by making the
| assumption now needs to outweigh the cost.
| dkarl wrote:
| Reminds me of a conversation I had with a project manager. To
| match the example, I'll recast it in terms of deadlines.
|
| Project manager: Sam is working on a deadline validator. You made
| a deadline validator last sprint right? Could Sam use yours?
|
| Me: No, unfortunately not. My deadline validator enforces that
| deadlines are in the future and are aligned with midnight UTC, to
| ensure correct date calculations in the database. The deadline
| validator Sam is working on does not enforce those restrictions.
| Sam's deadline validator will be applied to user input for an
| entirely different field, where deadlines don't have to be at
| midnight and are just as often in the past as in the future. In
| fact, Sam's validator only checks that a deadline string has the
| expected format and is within twenty years of the present day. My
| validator operates on timestamps sent as integers from another
| service, not string values uploaded by users.
|
| Project manager: So your deadline validator is not reusable at
| all? That's unfortunate. Is there something we could have done
| differently to avoid this redundant work?
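
The two "deadline validators" in the conversation above share a name and little else. A hedged Python sketch of what they might look like follows; the function names, the `YYYY-MM-DD HH:MM` input format, and the approximation of twenty years as 20 x 365 days are all assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta, timezone


def validate_db_deadline(epoch_seconds: int, now: datetime) -> datetime:
    """Integer timestamp from another service: future, midnight UTC."""
    deadline = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    if deadline <= now:
        raise ValueError("deadline must be in the future")
    if (deadline.hour, deadline.minute, deadline.second) != (0, 0, 0):
        raise ValueError("deadline must be aligned with midnight UTC")
    return deadline


def validate_user_deadline(text: str, now: datetime) -> datetime:
    """User-entered string: expected format, within twenty years of now."""
    # Assumed input format; the thread does not say what it is.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}", text):
        raise ValueError("expected YYYY-MM-DD HH:MM")
    deadline = datetime.strptime(text, "%Y-%m-%d %H:%M").replace(
        tzinfo=timezone.utc
    )
    # Twenty years approximated as 20 * 365 days.
    if abs(deadline - now) > timedelta(days=20 * 365):
        raise ValueError("deadline must be within twenty years of today")
    return deadline


now = datetime(2024, 5, 30, 15, 45, tzinfo=timezone.utc)
db_deadline = validate_db_deadline(1717200000, now)  # 2024-06-01 00:00 UTC
user_deadline = validate_user_deadline("2019-03-01 09:30", now)
```

The inputs, the rules, and the failure cases all differ, so merging the two behind one function would mostly produce a pile of flags.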
| Spivak wrote:
| Developer, next time: No, I made a real-time database
| constraint policy enforcement engine. Totally different thing.
| Aeolun wrote:
| > In fact, Sam's validator only checks that a deadline string
| has the expected format and is within twenty years of the
| present day.
|
| You two have been talking about this longer than it took Sam to
| implement by now ;)
| TedDallas wrote:
| DRY is more about support and maintenance than anything else.
|
| I see a lot of attacks on DRY these days, and it boggles my mind.
| Maybe it is being conflated with over-
| engineering/parameterization/architecting. I don't know.
|
| But I do know that having to fix the same bug twice in the same
| code base is not a good look.
| actionfromafar wrote:
| It's not that. It's when you need to change how the function
| behaves but for only one of the callers.
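
A small Python sketch of the situation described above, where one caller needs a shared function to behave differently. The names (`format_price`, `format_cart_price`, `format_invoice_price`) and the "USD" suffix are hypothetical.

```python
def format_price(amount_cents: int, for_invoice: bool = False) -> str:
    """Shared helper that grew a flag because one caller needed a change."""
    text = f"{amount_cents / 100:.2f}"
    if for_invoice:
        # Only the invoice caller wants a currency code appended.
        text += " USD"
    return text


# The alternative: two small functions, each free to change on its own.
def format_cart_price(amount_cents: int) -> str:
    return f"{amount_cents / 100:.2f}"


def format_invoice_price(amount_cents: int) -> str:
    return f"{amount_cents / 100:.2f} USD"


shared_cart = format_price(1999)
shared_invoice = format_price(1999, for_invoice=True)
```

One flag is harmless; the trouble starts when each new caller adds another, and the shared function ends up encoding every caller's special case.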
| ftlio wrote:
| I suck in the kitchen. If you asked me to make you a sandwich, I
| would have to go to the cupboard or refrigerator a few times to
| end up with all the right ingredients. Then I could at least
| competently assemble the sandwich. My family also loves antipasto
| salads, which are basically just like a sandwich without bread.
|
| If you asked me to assemble 10 different sandwiches, and 1
| antipasto salad, some of which I'm seeing for the first time, I
| would attempt to gather all the ingredients, but ultimately end
| up going back and forth between the cupboard and refrigerator
| still. I might even think, on one of those trips, hey, I don't
| need the mayo anymore, so I can put it away, only to have to go
| back and get it again for a later sandwich. The end result would
| probably be all the ingredients for every sandwich on the counter
| at the same time, as I should have done.
|
| I'm pretty smart though. I'm good at Abstraction. So, I assume
| I'm going to get another order from the family for a sizable
| amount of sandwiches and some more antipasto salad. I name each
| sandwich and salad type and then write down a list of ingredients
| for each sandwich so I can cross-reference it to assemble a
| master list of all required ingredients per sandwich when the
| next order comes in. I can then go to the cupboard and
| refrigerator once.
|
| I then order each sandwich type by their shared ingredients, so
| that I can apply ingredients only once until I'm done with that
| ingredient (and then I could put it away, but I'm not a premature
| optimizer). The only issue is that some ingredients require
| slicing, like tomatoes, and tomatoes aren't sliced in the same
| manner for the salad as the sandwiches, and my daughter can't
| stand when the tomatoes and lettuce touch on her sandwich, and my
| other daughter wants the cheese and the meat separate. I don't
| want to overcomplicate the problem, but I don't want to Repeat
| Myself either, since I know I can grab the tomatoes and slice
| them all up in the same step, so I need to remember when I
| assemble my list of ingredients per sandwich and salad that some
| are exempt from the ordered application of ingredients and must
| be handled by a single, separate script for assembly.
|
| I run this process a few times, and it works, but I learn that it
| takes me 35 minutes to do, and that there's now a hard
| requirement on a frozen item involved with one of the sandwiches
| that it not be out for more than 10 minutes, so now this
| ingredient itself must be exempted from the step where I grab all
| ingredients and my assembly instructions for the one sandwich
| that involves this ingredient must be very clear that I will
| still also need to grab that ingredient.
|
| Then I learn/realize:
|
| In fact, 90% of the time I make a sandwich, or salad, I only make
| one at a time.
|
| OR
|
| Nobody wants to order sandwiches by name, they just want to give
| me a list of ingredients in the right order
|
| OR
|
| I am gradually making so many more sandwiches every day that my
| kitchen counterspace cannot support getting all the ingredients
| at once
|
| OR
|
| I only make the same sandwiches + one salad every day to the
| exact same specification
| booleandilemma wrote:
| Software development is so varied that blanket statements like
| this never work.
|
| _Never_.
| redbell wrote:
| On an unrelated note, this "_Google Blog_" thing appears to
| have [at least] three different domain names that redirect to the
| same url: https://blog.google.com, https://blog.google and
| https://googleblog.com, why is that?!
| wnevets wrote:
| History? The Google TLD is relatively new and would have been
| created last
___________________________________________________________________
(page generated 2024-05-30 23:00 UTC)