[HN Gopher] The Wrong Abstraction (2016)
___________________________________________________________________
The Wrong Abstraction (2016)
Author : signa11
Score : 124 points
Date : 2023-05-13 10:40 UTC (12 hours ago)
(HTM) web link (sandimetz.com)
(TXT) w3m dump (sandimetz.com)
| samsquire wrote:
| I feel there's a beautiful representation of (many) problems that
| is waiting to be found that makes particular software problems
| easy to read and understand. When the mental model of the
| software just makes sense. I don't want to reject other people's
| models of how programs work but I want to understand them.
|
| Unfortunately, I feel the original problem can be obfuscated by
| adding ideas to the existing problem: readers now need to
| understand your ideas (or mental model) of the problem to
| understand your code. I need to understand how you think to read
| your code. And if your way of thinking is more advanced than mine
| or incomplete or not great, then my work is harder.
|
| Mental models are how I understand software; there is what I call
| a "critical insight" that makes the code obvious and easy to
| understand. I don't want to be deciphering and spend days
| investigating code to understand how to change it or build upon
| it or use it. I want the APIs to reflect their expected usage and
| behaviours.
|
| My perspective is that computers are adding and arrangement
| machines - they add and do operations on numbers and move things
| around to different locations. My mental model of computers is
| that it comes down to LOGISTICS and arrangement/ordering
| problems. Unfortunately, APIs and data structures are nested and
| ordered and obfuscate the underlying movement of things between
| places or addition to different things.
|
| Everything that obfuscates the rules of the computation means
| getting the behaviour you want from the code is harder.
|
| I've often thought about "commutative computation" where we
| specify what we want to be true to the computer and the computer
| works out how to arrange all its existing computations to satisfy
| that additional invariant. I often think of software as a series
| of behaviours rather than functional or imperative.
|
| Think of a materialised view: we have an existing behaviour of
| the computer and we want to customise the behaviour. You could
| work out where you need to insert your code snippet, but
| that's really hard. Or you could add an invariant to the system
| that the system now satisfies.
| kqr wrote:
| > Unfortunately, I feel the original problem can be obfuscated
| by adding ideas to the existing problem: readers now need to
| understand your ideas (or mental model) of the problem to
| understand your code.
|
| What do you think are the steps someone goes through when they
| obfuscate the problem by adding ideas? Like why do you think
| they do it?
| gbear0 wrote:
| Things get obfuscated because someone's viewing the problems
| from a different abstraction lens, and they're building a
| system onto that lens.
|
| Eg. Iterate through an array:
|
|     const arr = [1, 2, 3];
|     for (let i = 0, l = arr.length; i < l; ++i) {
|       console.log(arr[i]);
|     }
|
| Let's model it differently using an iterator:
|
|     const arr = [1, 2, 3];
|     const arrIter = arr[Symbol.iterator]();
|     let i = arrIter.next();
|     while (!i.done) {
|       console.log(i.value);
|       i = arrIter.next();
|     }
|
| At this level it's still pretty obvious what's going on, but
| you can still see that there's a level of abstraction between
| an array access vs calling 'next/value', and that obfuscates
| what is actually happening at the computation/instruction
| level.
|
| If I extend this another level then I'm going to start
| modelling problems using an iterable and not an array/index.
| New requirements come in and we extend to use an async
| iterable. Everything still works nicely, but in some
| scenarios where the actual iterable is just an array, now
| there's a lot of extra overhead to just do an index lookup.
|
| Using the iterator allows the code to be reused in more
| scenarios, but there's usually a cost to switching the lens
| of abstraction so that it fits a problem modeled
| differently.
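The reuse side of that trade-off can be sketched as follows. This is only an illustration, not code from the thread: `sumAll` and `countTo` are hypothetical names. The function is written against the iterable protocol, so it accepts arrays, Sets, and generators, at the price of going through `next()` calls even when the input is a plain array.

```javascript
// Hypothetical helper: written against the iterable protocol rather than
// array indexing, so it accepts any iterable.
function sumAll(iterable) {
  let total = 0;
  for (const value of iterable) {
    total += value;
  }
  return total;
}

// A generator is iterable but has no length and no index access.
function* countTo(n) {
  for (let i = 1; i <= n; i++) {
    yield i;
  }
}

const fromArray = sumAll([1, 2, 3]);
const fromSet = sumAll(new Set([1, 2, 3]));
const fromGenerator = sumAll(countTo(3));
console.log(fromArray, fromSet, fromGenerator); // prints 6 6 6
```

An index-based loop could not consume the Set or the generator without first copying them into an array.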
| samsquire wrote:
| I can tell you what I want to believe of myself that I think
| I do.
|
| I try to think of the simplest, most elegant, beautiful
| solution to the problem, one that needs minimal code,
| cleverness and complexity and solves the problem with a
| trivial loop, map, hash lookup, traversal or association.
|
| That usually comes with trying to see the problem in a
| different light, to reframe the problem as a different kind
| of problem, which can obfuscate the original problem.
| veyh wrote:
| (2016)
| pachico wrote:
| I agree 100%, but here's a plot twist:
|
| - Don't fall into the trap of early abstraction...
|
| - and abstracting one single use case is very hard, if not
| impossible...
|
| - but you are writing your app in Go and interfaces are the only
| way to properly test your stuff...
|
| - The end.
| kubanczyk wrote:
| > in Go and interfaces are the only way to properly test
|
| Adding tests using interfaces is solely adding more code (and
| the method signatures are simply duplicated) so it's the exact
| opposite of OP's problem.
| dustingetz wrote:
| Agree with the big idea, but the problem is - If you are not a
| computer scientist and current with the latest papers, you
| certainly have the "wrong" abstraction
|
| So I'd collapse this whole article down to 1 bit - "software
| development is hard"
| jsnell wrote:
| Significant past discussions:
|
| https://news.ycombinator.com/item?id=11032296
|
| https://news.ycombinator.com/item?id=12061453
|
| https://news.ycombinator.com/item?id=17578714
|
| https://news.ycombinator.com/item?id=23739596
|
| https://news.ycombinator.com/item?id=27095503
| revskill wrote:
| I think what people want is a Design Pattern, not an abstraction
| "implementation".
|
| A design pattern is real abstraction, because it's about thinking
| and designing. Abstraction is not tied to a specific
| implementation.
|
| So, duplication is fine until you figure out the real Design
| Pattern to be used.
| seo-speedwagon wrote:
| > Existing code exerts a powerful influence. Its very presence
| argues that it is both correct and necessary.
|
| I've probably paraphrased this line to every junior engineer I've
| mentored. It's such a succinct and pithy insight.
| scrubs wrote:
| It's what I call small 'c' culture, which regrettably, gets
| confused with 'C' culture. In software, 'C'ulture is:
|
| - know your customer, and their use cases
|
| - continuous improvement
|
| - specialists have got to know the big picture and engage in it
|
| - good cross functional coordination
|
| - CS fundamentals: DBs, algorithms, UNIX, functional
| programming, C/C++, CI/CD etc.
|
| That never ages away.
|
| However, stuff like "we use and have always used Kafka (read the
| code!) for messaging, so we're not doing kernel bypass to move
| data now" is small 'c' culture.
|
| Small 'c' culture is the kind of stuff that, if you abrogate
| it, a small army of people will come out of the woodwork and
| browbeat you for it. Browbeating to keep you in line is not
| engineering. It's nagging.
|
| Tradition, when it's small 'c', is stifling. Don't fall for it.
| kgeist wrote:
| In our team we have this rule that you shouldn't even think about
| introducing an abstraction unless there are at least 3 real use
| cases to consider. You're most likely to create a wrong
| abstraction if there's only 1 use case; 2 cases may be just a
| coincidence (2 business rules look similar on the surface but
| have nothing to do with each other really). 3 is a heuristic but
| it saves us from investing too much time on most likely useless
| abstractions which only get in your way.
| sodapopcan wrote:
| You are describing the Rule of Three :)
|
| https://en.m.wikipedia.org/wiki/Rule_of_three_(computer_prog...
| dllthomas wrote:
| I think the rule of 3 tends to lead to reasonable results, but
| asking "is this really the same piece of knowledge I'm encoding
| in N places" (as the original formulation of DRY suggests) is
| going to be a little better. Sometimes it's two places but it's
| really clear they'll always change together, sometimes it's ten
| places but each is going to evolve independently (which, to be
| fair, might well be the determination made when "think[ing]
| about introducing an abstraction" in your formulation).
|
| To push back a bit on naive misapplication of DRY I've been
| saying we should call collapsing things that are just
| coincidentally similar (and likely to change independently)
| "Huffman coding".
| Tade0 wrote:
| More often than not, when I tried to employ the strategy
| explained in the post, the sunk-cost people would try to shut me
| down.
|
| Fortunately my current project is different, because the team is
| very small and we have silos of responsibility, so we don't
| really get in each other's way that much.
|
| It appears that the largest obstacle here is not the lack of
| ability, but agency.
| croes wrote:
| >duplication is far cheaper than the wrong abstraction
|
| Isn't that a pretty useless sentence? Of course duplication is
| cheaper because aren't the higher costs one of the reasons why
| it's a wrong abstraction?
|
| Reminds me of this sketch from Fry and Laurie
|
| https://youtu.be/XewVicFzRxw
|
| >Hugh: Yes but too much is bad for you.
|
| > Stephen: Well of course too much is bad for you, that's what
| "too much" means you blithering twat. If you had too much water
| it would be bad for you, wouldn't it? "Too much" precisely means
| that quantity which is excessive, that's what it means. Could you
| ever say "too much water is good for you"? I mean if it's too
| much it's too much. Too much of anything is too much. Obviously.
| Jesus.
| jakelazaroff wrote:
| Duplication and abstraction aren't the same, though.
| Abstraction is a tool for reducing duplication. The point of
| the post is that if the abstraction is wrong, it's worse than
| just leaving the seemingly duplicated code.
| yxhuvud wrote:
| I'd disagree. An abstraction is a way to reason about a
| problem. Often that reduces duplication, but it is a side
| effect from a better understanding of the problem.
| croes wrote:
| Of course because the higher costs are what make the
| abstraction wrong in the first place.
|
| That's like saying don't do the wrong thing.
| jakelazaroff wrote:
| Oh I see what you're saying. No, "the wrong abstraction"
| doesn't intrinsically mean it's more costly than duplicated
| code. A lot of people argue that the wrong abstraction is
| still better than having duplicated code. She's saying
| that's not the case.
| williamcotton wrote:
| Exactly, the Church of DRY. Heathens! The truth is found
| in the Church of the Rule of Three. ;)
| gpderetta wrote:
| We
| scrubs wrote:
| To add to the OP's post:
|
| 1. Organizations must value continuous improvement. We want to
| avoid two extreme behaviors that sour individuals. First, the
| lethargic in-bred sterility of: hey, it worked before you got
| here, and it's fine now. Play-it-off is not wisdom. On the other
| extreme is frustration gone wrong. Sure, you can see a problem
| AND be right about it. But whining and constant criticism sours.
| Everybody's problem is there are 10,000 things that could be
| worked on, and resources only for 1000. You better make sure
| you're customer driven so you pick the right 1000.
|
| 2. Duplication is better for the medium term ... if you stay with
| the problem for a while, you are better able to distill the big
| picture into a more coherent new abstraction. Here you can cite a
| problem, cite a solution, and stick to your guns. You are better
| positioned to effect change without being a whiner. Now, problems
| are working for you, not against.
| bcrosby95 wrote:
| Whenever this article is posted it amazes me. People seem to only
| reply to the title, and ignore the substance of it. The point is
| not to "not abstract" or "rule of 3". The point is requirements
| change, features are added, and _when_ an abstraction becomes
| wrong, tear it out.
| carlivar wrote:
| Yes, and then potentially rebuild it based on what you know
| now! No one reads that part.
| morrvs wrote:
| > The point is requirements change, features are added, and
| when an abstraction becomes wrong, tear it out.
|
| I like this phrasing a lot, thanks for this!
|
| I'm still wondering if there's also potential in _avoiding_ the
| wrong abstractions in the first place. For that we'd need a
| "cheap" way to decide whether an abstraction is
| good/bad/something else.
|
| Are there generally applicable, widely accepted principles or
| research around this? A quick search only revealed random blog
| posts; nothing I'd consider widely accepted.
| krona wrote:
| > Are there generally applicable, widely accepted principles
| or research around this
|
| J. Ousterhout is gaining traction, at least in my corner of
| the industry.
| https://web.stanford.edu/~ouster/cgi-bin/cs190-winter18/lect...
| hamdouni wrote:
| Nowhere in the article is there a reference to the business those
| abstractions or duplications are made for. I mean the way to
| decide if it is a "good duplication" is to ask ourselves if it is
| a coincidence that it is the same code: 2 business rules having
| the same implementation does not mean it is a duplication.
| kristiandupont wrote:
| This is part of the reason why I like Tailwind-style utility
| classes and Typescript union/intersection types. The simple fact
| that I am (often) spared the intellectual effort of coming up
| with a name. I wrote this:
| https://itnext.io/and-naming-things-tailwind-css-typescript-...
| layer8 wrote:
| For limited occurrences that's correct, but if you find
| yourself having the same `foo | bar | baz` all over the code,
| you're going to want to introduce a shortcut term for it. Even
| just to be able to efficiently talk about it.
|
| The other thing is that unions/intersections are not an
| abstraction, because they don't hide any details. The purpose
| of an abstraction is to separate essential properties of
| whatever is being modeled (the interface) from current details
| that may change later, or that client code shouldn't depend on
| (the implementation).
| ivalm wrote:
| Yes, one of the reasons I like using mypy for python and
| typescript for frontend is that it forces me to recognise
| opportunities for abstractions. If some input/return type is
| getting really complicated or reappears in many places in the
| code then likely it's a good candidate for an abstraction.
| kristiandupont wrote:
| In case it's unclear, I agree with you completely.
| Introducing the right abstraction into a code base can feel
| like someone switching on the lights. Far more benefit than
| just DRY.
|
| Conversely, I am currently working with a frontend code base
| that is using "classic CSS", and it's striking to me how
| frustrating it is to have to think up what the "semantics" of
| this and that particular <div> can be said to be, when there
| very often aren't any.
| pkolaczk wrote:
| While I agree with the statements made in the original post, I'm
| afraid this thinking can be used as an excuse for avoiding any
| attempts at finding proper abstractions. Similarly to how the
| term "premature optimisation" is so frequently used by people
| unable to write efficient code to excuse their lack of skill
| or laziness, even though the context and the times in which those
| words were first used were vastly different and the author meant
| something else.
|
| IMHO abstraction should not be guided by the desire to remove
| duplication. Duplication is not even the only (and far from the
| worst) result of insufficient abstraction.
|
| Insufficient abstraction leads to increased complexity, not just
| duplication.
|
| Example: just this week I've been working on some code that has
| to deal with arbitrary ranges of ordered values. Typically when
| you think of a range, you think of a pair of bounds - the lower
| and the upper bound. However, the input is allowed to have only
| half-ranges so that one of the ends might be unbounded. So in the
| code I inherited there are 3 cases: a range with both lower and
| upper bounds defined, a range with only a lower bound, and a
| range with only an upper bound. All code processing those ranges
| has to deal with that optionality of either end, thus making it
| way more complex than needed - lots of if ladders or switch
| statements. And it multiplies very quickly when you deal with
| more than one range at a time. It is insufficiently abstract,
| even though it doesn't have any obvious duplication. The proper
| abstraction would be to transform the half-ranges to full ranges
| by introducing special open-end items (always smaller or greater
| than every possible value) which would allow one simple type of
| range to cover all possible cases.
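The idea of open-end items can be sketched like this. It is one possible reading of the comment, not the commenter's actual code: `BELOW_ALL`, `ABOVE_ALL`, `compareBounds`, `makeRange`, `contains`, and `overlaps` are all hypothetical names, and bounds are assumed to be inclusive.

```javascript
// Hypothetical sentinels: BELOW_ALL sorts before every value,
// ABOVE_ALL sorts after every value.
const BELOW_ALL = Symbol("belowAll");
const ABOVE_ALL = Symbol("aboveAll");

// Compare two bounds. Each is either a sentinel or an ordinary value
// that supports < and > (numbers, strings, ...).
function compareBounds(a, b) {
  if (a === b) return 0;
  if (a === BELOW_ALL || b === ABOVE_ALL) return -1;
  if (a === ABOVE_ALL || b === BELOW_ALL) return 1;
  return a < b ? -1 : a > b ? 1 : 0;
}

// Normalise the three input shapes (both bounds, lower only, upper only)
// into one full range. A missing bound is passed as undefined.
function makeRange(lower, upper) {
  return {
    lower: lower === undefined ? BELOW_ALL : lower,
    upper: upper === undefined ? ABOVE_ALL : upper,
  };
}

// One code path for every kind of range; no per-case branching.
function contains(range, value) {
  return (
    compareBounds(range.lower, value) <= 0 &&
    compareBounds(value, range.upper) <= 0
  );
}

function overlaps(r1, r2) {
  return (
    compareBounds(r1.lower, r2.upper) <= 0 &&
    compareBounds(r2.lower, r1.upper) <= 0
  );
}

console.log(contains(makeRange(5, undefined), 1000)); // prints true
console.log(overlaps(makeRange(undefined, 3), makeRange(4, undefined))); // prints false
```

The decision about how an open end compares lives only in `compareBounds`; operations on two ranges, such as `overlaps`, do not multiply into nine cases.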
| auggierose wrote:
| I'd say this can serve as an example where triplication is
| better than your abstraction. What are these special open-ended
| items? How do you need to extend comparison to account for
| them? Etc. Whereas the three cases are perfectly clear and easy
| to understand.
| JadeNB wrote:
| > How do you need to extend comparison to account for them?
|
| The post already says: -∞ < x < ∞ for all numerical x.
| (And the mathematician in me clarifies that that's all _real_
| numerical x.)
| pkolaczk wrote:
| I don't remember who said that, but mathematics is all
| about building abstractions, not about computation. So many
| times mathematics helped me make code simpler.
| auggierose wrote:
| The post I am replying to made no assumptions about the
| domain the order is defined on. If it is over the reals,
| sure, you can use -∞ and ∞. If it is over the integers,
| you can use MIN_VALUE and MAX_VALUE, sacrificing some of
| your domain (which might be a problem, depending on the
| context), or you can use Option[Int], which comes with
| performance issues.
|
| Or, you can use a range which is a sum of three/four cases,
| and not worry about any of that.
| Rumudiez wrote:
| > What are these special open-ended items?
|
| > If it is over the reals, sure, you can use -∞ and
| ∞. If it is over the integers, you can use MIN_VALUE
| and MAX_VALUE
|
| If you already knew the answer, why did you ask?
| pkolaczk wrote:
| The three cases force _every code_ using ranges to deal with
| them. Apply this way of thinking many times for multiple
| concepts and you end up with a spaghetti of multiply nested
| if statements that's near impossible to analyze for
| correctness. Because now you have to read all code instead of
| just a tiny subset.
|
| > What are these special open-ended items? How do you need to
| extend comparison to account for them?
|
| The whole point of abstraction is to make those decisions
| once and isolate the complexity in one place instead of
| having it spread over N places in the code, forcing everybody
| to solve the same problems again and again.
| kqr wrote:
| Do ranges not support a limited set of operations through
| which the rest of the code can interact with them, instead
| of manipulating the endpoints directly?
|
| I would think of the range itself as the abstraction, and
| then it matters less how it's implemented since any
| potential problems are local to the implementation and
| cheap-ish to fix.
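A sketch of the range-as-abstraction view, under the assumption that callers only need a membership test. `ValueRange` and its factory methods are hypothetical names. The class deliberately keeps the "three cases" representation internally (a missing bound is `undefined`) to show that the choice stays local when the endpoints are not exposed.

```javascript
// Hypothetical range type that hides its endpoints behind operations.
class ValueRange {
  // Private fields: callers cannot read or branch on the bounds.
  #lower;
  #upper;

  constructor({ lower, upper } = {}) {
    this.#lower = lower;
    this.#upper = upper;
  }

  static atLeast(lower) {
    return new ValueRange({ lower });
  }

  static atMost(upper) {
    return new ValueRange({ upper });
  }

  static between(lower, upper) {
    return new ValueRange({ lower, upper });
  }

  // The only place that knows a bound may be missing (inclusive bounds).
  contains(value) {
    const aboveLower = this.#lower === undefined || this.#lower <= value;
    const belowUpper = this.#upper === undefined || value <= this.#upper;
    return aboveLower && belowUpper;
  }
}

console.log(ValueRange.atLeast(10).contains(11)); // prints true
console.log(ValueRange.between(1, 5).contains(6)); // prints false
```

Swapping the internals for sentinel endpoints later would not touch any caller, which is the sense in which problems stay local to the implementation.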
| pkolaczk wrote:
| Sure, you could probably do that too, but this is not how
| the code was originally written. The original ranges
| present the bounds in their public API, and most code
| just operated on them.
| kqr wrote:
| Then I would say that's the problem. Whatever
| implementation you leak, the problem is the lack of
| implementation hiding, not that the implementation looks
| this way or that way.
| pkolaczk wrote:
| But that's still insufficient abstraction and my general
| point holds.
| auggierose wrote:
| Yes, this is correct. The range is the abstraction, and
| then you can choose how to represent it. Not much
| difference between the three range cases, and the single
| case with special endpoints, except that the three cases
| are more general, as no special values are needed.
| travisjungroth wrote:
| When your three cases run into someone else's two you have
| six, and it can even get much worse than that. Or an object
| running into itself and getting nine cases.
|
| Intervals are nice for representing acceptable ranges. Half
| intervals mean greater/less than. If you stick infinities on
| the ends, everything likely works. You then expose methods or
| functions for all your operations. From the outside, you
| don't have to care if it's a half interval or not (unless
| that is what you're particularly checking). On the inside you
| don't really, either.
|
| If you're messing with intervals in a business setting, it's
| worth considering if you need multi intervals, non continuous
| regions.
|
| These are all great for handling uncertainty. Like if you add
| two weights that have +/- values, you can have the sum have
| those and be correct. The math is all well defined and rather
| easy. Wikipedia has good pages on it.
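The uncertainty case can be sketched with plain interval addition. The helper names `fromTolerance` and `addIntervals` are made up for this example; the rule itself (add the lower ends, add the upper ends) is standard interval arithmetic.

```javascript
// Build an interval [lo, hi] from a measured value and its +/- tolerance.
function fromTolerance(value, tolerance) {
  return { lo: value - tolerance, hi: value + tolerance };
}

// Interval addition: the sum can be no smaller than lo + lo
// and no larger than hi + hi.
function addIntervals(a, b) {
  return { lo: a.lo + b.lo, hi: a.hi + b.hi };
}

// 10 +/- 1 plus 20 +/- 2 gives 30 +/- 3, i.e. the interval [27, 33].
const total = addIntervals(fromTolerance(10, 1), fromTolerance(20, 2));
console.log(total.lo, total.hi); // prints 27 33

// A half interval ("at least 5") works with the same code when the open
// end is represented as Infinity.
const atLeastFive = { lo: 5, hi: Infinity };
const shifted = addIntervals(atLeastFive, fromTolerance(1, 0));
console.log(shifted.lo, shifted.hi); // prints 6 Infinity
```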
| JadeNB wrote:
| > The proper abstraction would be to transform the half-ranges
| to full ranges by introducing special open-end items (always
| smaller or greater than every possible value) which would allow
| one simple type of range to cover all possible cases.
|
| You wouldn't even need to create anything new--both math and C
| already provide this abstraction in the form of -∞/-inf and
| ∞/inf.
| pkolaczk wrote:
| I wasn't talking specifically about real (float) numbers, but
| yes - this is that abstraction. And it generalizes to any
| type with ordering (can work with integral types as well).
| hakunin wrote:
| Problems like these often come from the pressure to ship fast,
| and not writing code 2-3 times to find a good way to express
| something. If you're going to rush through abstracting things
| away, I'd rather you duplicated. If you will take time to
| express it well, then I'd prefer a good abstraction.
| dgb23 wrote:
| Very good example of how even a small scale piece of logic can
| have very messy effects down the line.
|
| Both insufficient and wrong abstractions are viral. They infect
| everything they touch, which can snowball into large parts
| being more complex, harder to understand and debug and often
| also slower.
|
| The wrong abstraction is wrong, insufficient abstraction is
| wrong.
|
| Really the only weapons against complexity we have as
| programmers are decomposition and abstraction. We have to take
| things apart, like in your example it would be the meaning of
| each parameter, and then we put them together in such a way that
| the details below our abstraction can mostly be ignored.
|
| I say that all with a caveat: I tend to prefer less,
| insufficient or no abstraction over the wrong one. The former
| few options can lead to code that is hard to understand as a
| whole and can be brittle, but the latter drives you into a
| corner: The only way out is either trying to patch over it or
| starting from scratch - choose your poison...
| marcosdumay wrote:
| > the latter drives you into a corner: The only way out is
| either trying to patch over it or starting from scratch
|
| Often enough, the way out is going back. Why are developers
| collectively so reluctant to go back? (Myself included.)
| epiccoleman wrote:
| I'm encountering a lot of these types of small abstraction
| projects in a React project I'm working on. It's a music theory
| "explorer" app and, maybe unsurprisingly if you know any music
| theory, getting a good abstraction that doesn't fall victim to
| lots of weird little edge cases is tricky.
|
| I'm using Tonal which makes it easier, because I can mostly
| push weirdness into wrappers for individual Tonal calls. It's
| honestly been a great little challenge because the scope is so
| small that it doesn't take all that much analysis or thought to
| see where abstractions break down. Fun little exercise in code
| design.
| _a_a_a_ wrote:
| IDK anything about music theory but I wonder, if you're
| having trouble finding a good abstraction to express
| theory, perhaps the theory is at fault.
|
| I mean, at a high level, theory _is_ the abstraction, isn't
| it?
| pm wrote:
| The time dimension is often forgotten when applying these
| maxims. When we see code, we often fail to consider the
| journey it's taken to arrive at that point in time, and where
| it might be headed in future.
|
| In the example you set, it's the right time to apply an
| abstraction, so it's no longer premature. Perhaps the maxim
| should be labelled as "premature abstraction", rather than
| "premature optimisation".
| nathias wrote:
| Wrong abstraction is a type of premature optimization, it's an
| anti-pattern that's very common among the senior and supersenior
| coders that 'already know everything in advance' and that
| knowledge turns out to be false.
| BulgarianIdiot wrote:
| Why is this article recognizing only two extremes? Either
| everyone uses the same instance, or there's NO SHARING AT ALL:
|
| "Re-introduce duplication by inlining the abstracted code back
| into every caller."
|
| Or maybe if there are, say, 10 places dependent on the shared
| code, we can make them 5 places dependent on one version and 5
| places dependent on another?
|
| Forking and merging is part of business as usual in programming
| and we should be used to it. We should not be shocked that
| sometimes you have to fork a function because adding more
| parameters is not feasible, but nor should we declare sharing is
| therefore wrong or harmful.
|
| Also, how you design parameters is extremely important. One
| callback parameter may be worth a hundred "normal" ones.
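One way to read the remark about callback parameters, as a hedged sketch: the user records and both function names below are invented. The first version grows a flag per requirement; the second keeps the shared part (filtering and joining) and lets each caller pass in the part that varies.

```javascript
// Flag-driven version: each new requirement adds a parameter and a branch.
function listActiveUsersWithFlags(users, uppercase, includeEmail) {
  return users
    .filter((user) => user.active)
    .map((user) => {
      let text = uppercase ? user.name.toUpperCase() : user.name;
      if (includeEmail) text += ` <${user.email}>`;
      return text;
    })
    .join(", ");
}

// Callback version: one parameter covers every formatting variation.
function listActiveUsers(users, formatUser) {
  return users
    .filter((user) => user.active)
    .map((user) => formatUser(user))
    .join(", ");
}

// Made-up sample data.
const users = [
  { name: "Ada", email: "ada@example.com", active: true },
  { name: "Bob", email: "bob@example.com", active: false },
  { name: "Cy", email: "cy@example.com", active: true },
];

const shouting = listActiveUsers(users, (user) => user.name.toUpperCase());
const withEmail = listActiveUsers(users, (user) => `${user.name} <${user.email}>`);
console.log(shouting); // prints ADA, CY
console.log(withEmail); // prints Ada <ada@example.com>, Cy <cy@example.com>
```

A new output format needs no change to `listActiveUsers`, whereas the flag version would need another parameter and another branch.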
| inimino wrote:
| The point she is making is about choosing to go back, towards
| less abstraction, rather than forward. So I expect the answer
| to your question is that two endpoints are enough to establish
| both directions and make the intended point.
|
| If midpoints are introduced then comments like yours "but what
| about..." can always be made until the entire abstraction tower
| is fully described, and that's not the blog post (or book) the
| author wanted to write.
| BulgarianIdiot wrote:
| I wish every time someone established two extreme points,
| everyone were like you, automatically interpolating an entire
| space of endless possibilities, countless shades of gray. But
| this is decidedly NOT how we think, because dichotomies are
| simpler to mentally process, and in fact ultimatums or
| "single right solution" situations are easiest to process.
|
| Have you ever seen an online argument? If someone is right,
| and someone is not AS right as they are, they are a "left
| shill" and vice versa. If you promote solution A, and someone
| promotes solution B, then they're "wrong". Not establishing
| points, just "wrong".
|
| So I think establishing two directions is best accomplished
| not by marking up two extreme points and leaving the rest to
| the imagination as our imagination is apparently quite poor.
|
| It's more correct to describe the next _step_ in a direction,
| and let us take things step by step and know that nuance is
| inherent to our success, not optional.
___________________________________________________________________
(page generated 2023-05-13 23:01 UTC)