[HN Gopher] Push ifs up and fors down
___________________________________________________________________
Push ifs up and fors down
Author : celeritascelery
Score : 623 points
Date : 2023-11-15 21:41 UTC (2 days ago)
(HTM) web link (matklad.github.io)
(TXT) w3m dump (matklad.github.io)
| Waterluvian wrote:
| This kind of rule of thumb usually contains some mote of wisdom,
| but generally just creates the kind of thing I have to de-
| dogmatize from newer programmers.
|
| There's just always going to be a ton of cases where trying to
| adhere to this too rigidly is worse. And "just know when not to
| listen to this advice" is basically the core complexity here.
| actionfromafar wrote:
| I think this article could be useful as a koan in a larger
| compilation.
|
| Some of the koans should contradict each other.
| gavmor wrote:
| De-dogmatizing needs to happen, so what? I think these kinds of
| rules are helpful to play with; adopt them, ride them as far as
| they go, invert them for a day or a year, see where that takes
| you. You learn their limits, so what? More grist for the
| palimpsest.
| flashback2199 wrote:
| Don't compilers, cpu branch prediction, etc all fix the
| performance issues behind the scenes for the most part?
| Thaxll wrote:
| Well, compilers are good and dumb at the same time.
| bee_rider wrote:
| Some of the moves seemed to change what an individual function
| might do. For example they suggested pulling an if from a
| function to the calling function.
|
| Could the compiler figure it out? My gut says maybe; maybe if
| it started by inlining the callee? But inlining happens based
| on some heuristics usually, this seems like an unreliable
| strategy if it would even work at all.
| malux85 wrote:
| No, compilers (correctly) prefer correctness over speed. They
| can optimise "obvious" things, but they cannot account for
| domain knowledge, inefficiencies spread further apart, or
| anything that "might" alter some global state. They can only
| make optimisations where they can be very sure there are no
| side effects, because they have to err on the side of caution.
|
| They will only give you micro-optimisations, which can
| sometimes add up, but the burden of holistic program
| efficiency is still very much on the programmer.
|
| If you're emptying the swimming pool using only a glass, the
| compiler will optimise the glass size and your arm movements,
| but it won't ask "are you emptying the correct pool?" or
| "should you be using a pump instead?" - a correct answer to
| the latter two could be 100,000 times more efficient than the
| former two, which are all a compiler can answer.
| jchw wrote:
| The short answer is absolutely not, even when you are sure that
| it should. Even something as simple as a naive byteswap
| function might wind up generating surprisingly suboptimal code
| depending on the compiler. If you really want to be sure,
| you're just going to have to check. (And if you want to check,
| a good tool is, of course, Compiler Explorer.)
| roywashere wrote:
| Function call overhead can be a real issue in languages like
| Python and JavaScript. But you can, and should, measure when
| in doubt!
| hansvm wrote:
| It's a real issue in most compiled languages too if you're
| not careful (also a sort of opposite issue; too few functions
| causing unnecessary bloat and also killing performance).
| ezekiel68 wrote:
| This is just a myth promoted by Big Compiler designed to sell
| you more compiler.
| Izkata wrote:
| There's a pretty famous StackOverflow question/answer about
| branch prediction failure:
| https://stackoverflow.com/questions/11227809/why-is-processi...
| jmull wrote:
| The rule of thumb is: put ifs and fors where they belong --
| no higher or lower. And if you're not sure, think about it a
| little more.
|
| I don't think these rules are really that useful. I think this is
| a better variation: as you write ifs, fors and other control flow
| logic, consider why you're putting it where you are and whether
| you should move it to a higher or lower level. You want to think
| about the levels in terms of the responsibility each has. If you
| can't think of what the demarcations of responsibility are, or
| they are tangled, then think about it some more and see if you
| can clarify, simplify, or organize it better.
|
| OK, that's not a simple rule of thumb, but at least you'll be
| writing code with some thought behind it.
| benatkin wrote:
| If you know where they belong, this post isn't for you.
| crazygringo wrote:
| Exactly -- write code that matches clear, intuitive, logical,
| coherent organization.
|
| Because easy counterexamples to both of these rules are:
|
| 1) I'd much rather have a function check a condition in a
| single place, than have 20 places in the code which check the
| same condition before calling it -- the whole _point_ of
| functions is to encapsulate repeated code to reduce bugs
|
| 2) I'd often much rather leave the loop to the calling code
| rather than put it inside a function, because in different
| parts of the code I'll want to loop over the items only to a
| certain point, or show a progress bar, or start from the
| middle, or whatever
|
| Both of the "rules of thumb" in the article seem to be
| motivated by increasing performance by removing the overhead
| associated with calling a function. But one of the _top_ "rules
| of thumb" in coding is to _not prematurely optimize_.
|
| If you need to squeeze every bit of speed out of your code,
| then these might be good techniques to apply where needed (it
| especially depends on the language and interpreted vs.
| compiled). But these are _not at all_ rules of thumb in
| general.
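| As a rough sketch of point 2 in Python (the names and numbers
| are made up), keeping the per-item function small lets each
| call site own its loop:

```python
def process_item(item: int) -> int:
    """Process a single item; callers decide how to iterate."""
    return item * 2

items = [1, 2, 3, 4, 5]

# One caller processes everything.
all_results = [process_item(x) for x in items]

# Another caller stops at the first item over a threshold.
partial_results = []
for x in items:
    if x > 3:
        break
    partial_results.append(process_item(x))

print(all_results)      # [2, 4, 6, 8, 10]
print(partial_results)  # [2, 4, 6]
```

| The caller that needs an early stop, a progress bar, or a
| different starting point just writes its own loop around the
| same single-item function.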
| pests wrote:
| I think a key thing software engineers have to deal with, as
| opposed to physical engineers, is an ever-changing set of
| requirements.
|
| Because of this we optimize for different trade-offs in our
| codebase. Some projects need it, and you see them dropping
| down to handwritten SIMD assembly for example.
|
| But for the most of us the major concern is making changes,
| updates, and new features. Being able to come back and make
| changes again later for those ever changing requirements.
|
| A bridge engineer is never going to build abstractions and
| redundancies into a bridge "just in case gravity changes in
| the future". They "drop down to assembly" for this and make
| assumptions that _would_ cause major problems later if things
| do change (they won't).
|
| I guess my point is: optimizing code can mean multiple
| things. Some people want to carve out of marble - it lasts
| longer, but is harder to work with. Some people want to carve
| out of clay - it's easier to change, but it's not as durable.
| couchand wrote:
| I've been impressed watching the crews meticulously replace
| each cable assembly of the George Washington Bridge over
| the past year or so. All the work that doesn't require
| disconnected cables is done in parallel, so you can get a
| sense of the whole process just driving across once
| (they've finished the north side so you want to drive into
| Manhattan for the current view).
|
| It's more or less the same as code migrations we're doing
| on a regular basis, done far more diligently.
| chiefalchemist wrote:
| Whether marble or clay, both ideally take into consideration
| the fact that whoever writes it today may not be the one who
| maintains it tomorrow.
|
| When stuck choosing between longevity and malleability,
| maintainability should be the deciding factor.
| pests wrote:
| Part of my point though was that the bridge builder of
| today does not need to take into consideration that the
| person maintaining it 20 years from now will have to deal
| with gravity changing. So they can make certain
| assumptions that will be impossible for future
| maintainers to ever undo.
|
| Software doesn't have these set-in-stone never-changing
| requirements. I think we are making similar points.
| foota wrote:
| I think the argument here could be stated sort of as push
| "type" ifs up, and "state" ifs down. If you're in rust you
| can do this more by representing state in the type
| (additionally helping to make incorrect states
| unrepresentable) and then storing your objects by type.
|
| I have a feeling this guide is written with high performance
| in mind. While it's true that premature optimization is the
| devil, I think following this sort of advice can prevent you
| from suffering death by a thousand cuts.
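| As a loose Python sketch of "push type ifs up" (Walrus and
| Seal are made-up names): branch on type once, store the
| objects grouped by type, and the loops below run branch-free:

```python
class Walrus:
    def frobnicate(self) -> str:
        return "frobnicated"

class Seal:
    def transmogrify(self) -> str:
        return "transmogrified"

animals = [Walrus(), Seal(), Walrus()]

# The "type if" happens once, up front...
walruses = [a for a in animals if isinstance(a, Walrus)]
seals = [a for a in animals if isinstance(a, Seal)]

# ...so the loops below contain no per-item type checks.
results = [w.frobnicate() for w in walruses] + \
          [s.transmogrify() for s in seals]
print(results)  # ['frobnicated', 'frobnicated', 'transmogrified']
```

| In Rust the grouping would live in the type system itself;
| here it's just two lists, but the shape of the code is the
| same.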
| stouset wrote:
| > 1) I'd much rather have a function check a condition in a
| > single place, than have 20 places in the code which check
| > the same condition before calling it -- the whole point of
| > functions is to encapsulate repeated code to reduce bugs
|
| That's fine, but it's often a good idea to separate "do some
| work on a thing" from "maybe do work if we have a thing".
| Using the example in the article, it is _sometimes_ useful to
| have multiple functions for those cases:
|
|     fn frobnicate(walrus: Walrus) { ... }
|
|     fn maybe_frobnicate(walrus: Option<Walrus>) {
|         walrus.map(frobnicate);
|     }
|
| But also... in languages like Rust, most of the time that
| second one isn't needed because of things like Option::map.
| calvinmorrison wrote:
| A good rule of thumb is validate early and return early.
| Prevents endless if else nesting
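| As a minimal Python sketch (the order-validation rules are
| invented for illustration):

```python
def validate_order(order: dict) -> str:
    # Guard clauses: validate early, return early.
    if not order.get("items"):
        return "error: empty order"
    if order.get("total", 0) <= 0:
        return "error: invalid total"
    if not order.get("address"):
        return "error: missing address"
    # The happy path runs un-nested at the bottom.
    return "ok"

print(validate_order({"items": [1], "total": 10, "address": "x"}))  # ok
print(validate_order({"items": [], "total": 10, "address": "x"}))   # error: empty order
```

| Each failed precondition exits immediately, so the happy path
| never gains another level of nesting.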
| usrusr wrote:
| Conditions inside the function are also in line with Postel's
| law, if we drag it from networking to API design. And in
| large parts of programming the entire "enforce it with types"
| (say null check without saying null check) isn't a thing at
| all. It only gets worse with languages where api evolution
| and compatibility is done by type-sniffing arguments. Those
| will just laugh at the idea of pushing an if up.
|
| But it's an interesting discussion nonetheless. What I picked
| up, even if it wasn't directly mentioned (or I might have
| missed it?), is that a simple check on the caller side can be
| nice for the reader. Almost zero cost reading at the call
| site because the branch is a short one, and chances are the
| check provides some context that helps to understand what the
| call is all about:
|
|     if (x is Walrus)
|         frobnicate(x);
|
| is not just control flow, it doubles as a friendly reminder
| that frobnication is that thing you do with Walruses. So my
| takeaway is the check stays in the function (I also don't
| agree with the article), but make it a part of the naming
| consideration. Perhaps "frobnicateIfWalrus" wouldn't be so
| bad at all! I already do that occasionally, but perhaps it
| could happen more often?
| metadat wrote:
| Yes, this advice has the scent of premature optimization with
| the tradeoff sacrifice being readability/traceability.
| demondemidi wrote:
| You also want to avoid branches in loops for faster code. But
| there is a tradeoff between readability and optimization that
| needs to be understood.
| lazypenguin wrote:
| This reply is akin to "just think about it and do it perfectly
| dummy" which might sound "obvious" and "smart" but is really
| just unhelpful commentary. The ideas provided in this blog are
| actually a good heuristic for improving code in general, even
| if they don't apply in all cases. Simple and practical rules of
| thumb can go a long way for those without enough experience yet
| to have internalized these kinds of rules.
| collingreen wrote:
| I don't agree it is unhelpful commentary. I think it makes an
| important point that just following some rules you found
| won't be enough because the right answer isn't a result of
| something arbitrary/universal but is a side effect of a
| better understanding of the responsibility of each piece of
| code you're writing.
| Aeolun wrote:
| I think you'll have better code on average if you always
| apply these rules than if you don't.
|
| It's not always helpful to give people the option to do it
| as they think is best.
| zeven7 wrote:
| > than if you don't
|
| If by "don't" you mean "do the opposite" then I agree.
| The third option is "don't follow the rule blindly but
| think about the situation", and for that case it depends
| entirely on the person and how well they think.
| darkerside wrote:
| I do think it is shortsighted, and equally helpful and
| harmful. Helpful in that it reminds the reader this is only
| one of many considerations that should be taken into
| account. Harmful because it borders on implying that this
| shouldn't even be a consideration taken into account
| because it is irrelevant.
| guntherhermann wrote:
| Our job is to think about problems and solve them. I'm in
| complete agreement with GP.
| flir wrote:
| If you were going to generalize your hard-won knowledge and
| pass it on to a junior, what would you say?
| ryandrake wrote:
| "Draw the rest of the fucking owl"[1]
|
| 1: https://knowyourmeme.com/memes/how-to-draw-an-owl
| mjevans wrote:
| There isn't an easy shortcut. Experts exist for reasons.
| Just like Doctors and other specialists, find a good
| expert you trust and evaluate their evaluation of the
| topic.
| llimllib wrote:
| Doctors rely on an enormous number of heuristics and
| shortcuts exactly like this.
|
| My wife teaches doctors, and so much of what she does is
| giving them rules of thumb much like this one.
|
| edit: I want to note that I'm pretty ambivalent on the
| actual advice of the article, just commenting that
| doctors in my experience have a truly astounding
| collection of rules of thumb
| mjevans wrote:
| An expert in a field tends to be an expert of the rules
| of thumb, the references to consult when the rule of
| thumb doesn't deliver the desired results, and the
| nuances and exceptions that are the art part of 'science
| and useful arts' for their field.
| guntherhermann wrote:
| "Keep at it, it can be frustrating at times but if you
| keep studying and practicing how to become a better
| developer, you will become one"
|
| There are no shortcuts in life. You need to work at it.
| Cognizant practice got me there, and I think it will get
| you there, too.
|
| I get people asking me how to become a programmer _all
| the time_ -- and I always say the same thing: you can't
| talk about it, you need to do it. You need to write
| programs, writing software will make you a better
| programmer.
|
| Have opinions on how to write good software, be open
| about these opinions, but also be aware that they could
| be bad opinions, misinformed, or lacking greater context.
| makeitdouble wrote:
| Trying to help too much and giving shortcuts to people who
| cannot evaluate when those shortcuts shouldn't be applied
| has ill effects.
|
| On the same note, I find the reasoning behind the advice far
| more interesting than the advice itself. It gives a good
| framing of the different issues to consider when writing
| conditional and iterative code.
|
| Basically it should help newcomers to identify the issues
| they're facing and build their own personal heuristics.
| civilized wrote:
| A lot of replies in this tree read to me as "don't
| concisely express a general heuristic principle, because
| someone might read it and apply it without nuance".
|
| This way of thinking is infantilizing. It is also self-
| defeating, because it is itself a failure to apply with
| nuance a general heuristic principle (don't oversimplify).
| It's doing what it tells you not to do.
|
| Heuristics are only heuristics, but you have to articulate
| _what they are_ before you can append "and this is only a
| heuristic, don't apply it blindly and universally".
|
| Appending "everything depends on context" is the trivial
| part. The heuristics are the substantial contribution.
| makeitdouble wrote:
| I see your side, but take it the other way: the basic
| message is really "you'll have to think for yourself
| anyway, heuristics from random strangers won't help",
| which I don't see as infantilizing.
|
| I see post mortems and issue discussions on public
| projects as a better resource and contribution than
| sharing personal heuristics.
| darkerside wrote:
| Better phrasing might be, this is a really good point,
| and while you can't rely on it alone, the more of these
| you learn, the better of a programmer you'll be.
| louthy wrote:
| In reality these 'rules of thumb' become dogma and people
| forget why the rules of thumb existed in the first place, or
| make up other bullshit rules to justify it.
|
| This industry is full of dogma and self-appointed experts and
| it's utterly tedious.
| chiefalchemist wrote:
| > OK, that's not a simple rule of thumb, but at least you'll be
| writing code with some thought behind it.
|
| This reflects one of my answers to the question: What separates
| an engineer from a developer?
|
| Engineers are intentional, or at least hope to be as often as
| possible. On the other hand, developers may arrive at similar
| or same ends but they're generally more reactive and less
| intentional.
| nerdponx wrote:
| Pushing "ifs" up has the downside that the preconditions and
| postconditions are no longer directly visible in the definition
| of a function, and must then be checked at each call site. In
| bigger projects with multiple contributors, such functions could
| end up getting reused outside their intended context. The result
| is bugs.
|
| One solution is some kind of contract framework, but then you end
| up rewriting the conditions twice, once in the contract and once
| in the code. The same is true with dependent types.
|
| One idea I haven't seen before is the idea of tagging regions of
| code as being part of some particular context, and defining
| functions that can only be called from that context.
|
| Hypothetically in Python you could write:
|
|     @requires_context("VALIDATED_XY")
|     def do_something(x, y):
|         ...
|
|     @contextmanager
|     def validated_xy(x, y):
|         if abs(x) < 1 and abs(y) < 1:
|             with context("VALIDATED_XY"):
|                 yield x, y
|         else:
|             raise ValueError("out of bounds")
|
|     with validated_xy(0.5, 0.5) as (x_safe, y_safe):
|         do_something(x_safe, y_safe)
|
|     do_something(0.5, 0.5)  # Error!
|
| The language runtime has no knowledge of what the context
| actually means, but with appropriate tools (and testing), we
| could design our programs to only establish the desired context
| when a certain condition is met.
|
| You could enforce this at the type level in a language like
| Haskell using something like the identity monad.
|
| But even if it's not enforced at the type level, it could be an
| interesting way to protect "unsafe" regions of code.
| timeon wrote:
| > In bigger projects with multiple contributors, such functions
| could end up getting reused outside their intended context. The
| result is bugs.
|
| Yes, but in this particular example, `fn frobnicate(walrus:
| Walrus)`, if you pass anything other than an owned Walrus the
| program will not compile. Even if it were something generic,
| the argument would have to satisfy trait bounds, and the
| compiler would require those bounds in the function
| definition based on how the argument is used inside the
| function.
| imron wrote:
| > Pushing "ifs" up has the downside that the preconditions and
| postconditions are no longer directly visible in the definition
| of a function
|
| You're missing the second part of the author's argument:
|
| "or it could push the task of precondition checking to its
| caller, _and enforce via types_ "
|
| The precondition is therefore still directly visible in the
| function definition - just as part of the type signature rather
| than in an if statement.
|
| The "enforce preconditions via types" is a common pattern in
| Rust (the language used in the article), and unlike checking
| with if statements, it's a strict precondition that is checked
| at compile time rather than at runtime and you won't even be
| able to compile your program if you don't meet the pre-
| condition.
| nerdponx wrote:
| Definitely, that's a pattern in many programming languages
| (including, increasingly, in Python). And in dependently-
| typed languages, you actually can obtain a proof of validity
| that gets erased at runtime, leaving nothing but basic
| unboxed data to be executed.
|
| However that's not always an option in all languages or
| situations, and it's hard to encode conditions like "must
| only be called while an event loop is running" that way.
|
| I'm envisioning this partly as a kind of catch-all/fallback
| mechanism that can be bolted onto a language that doesn't
| have these kinds of features. It's not always the case that
| you can effectively communicate contextual requirements
| through the types of function parameters, and this would
| cover that case as well.
|
| If you want to reduce this to anything, it's a workaround for
| not having monads in your language, with some tradeoffs
| around composability and dynamic extent.
| scns wrote:
| > The "enforce preconditions via types" is a common pattern
| in Rust
|
| Called the Type State Pattern. Here are two links from my
| bookmarks:
|
| https://cliffle.com/blog/rust-typestate/
|
| https://yoric.github.io/post/rust-typestate/
| funcDropShadow wrote:
| > "or it could push the task of precondition checking to its
| caller, and enforce via types"
|
| Or the enforcement could be done via a design-by-contract
| discipline. Either using classical assertions, or something
| like reusable specifications in Clojure's spec library.
| tengbretson wrote:
| Branded types in typescript are an interesting way to do this.
| aeonik wrote:
| Isn't the intention of public vs private supposed to cover the
| "tagging of code for a particular context"?
|
| Or maybe we need more specific semantics around this that can
| cross cut the domain that public and private (and protected in
| .NET ecosystems) cover?
| nerdponx wrote:
| Sure, this could definitely be a generalization of "public"
| and "private".
| jon-wood wrote:
| You can use types for this in Python:
|
|     def do_something(position: ValidatedPosition): ...
|
|     position = Position(x, y)
|     if abs(position.x) > 1 or abs(position.y) > 1:
|         raise ValueError("out of bounds")
|     position = ValidatedPosition(position.x, position.y)
|     do_something(position)
|
| In practice I'd probably have ValidatedPosition encapsulate
| that validation in its constructor, but you get the idea. If
| you attempt to pass a Position to do_something then mypy is
| going to shout at you for passing an object of the wrong type
| to it.
|
| Python's type checking definitely isn't as comprehensive as
| something like Rust, but I have found it increasingly useful
| for situations like this where you want to ensure the data
| being passed in has been appropriately handled.
| nerdponx wrote:
| Sure, but now you have to rewrite your code to accept a
| ValidatedPosition as input and not two floats. In Rust that
| might be "zero cost", but in Python it certainly is not, even
| if you implement all this in Cython. And even in the context
| of a Cython/Rust/whatever extension, what if this is calling
| a BLAS function or some other external procedure that has
| specific preconditions?
|
| It's usually fine to use the type state pattern as described
| by sibling commenters, but sometimes it's expensive or
| infeasible.
|
| Another situation is less related to performance and more
| related to managing dynamic state. Consider the following
| annoyance that arises when we try to attach static types to
| an old-school Python idiom:
|
|     class AppClient:
|         def __init__(self, url: str,
|                      http_client: httpx.Client | None = None) -> None:
|             self.url = url
|             self.http_client = (httpx.Client() if http_client is None
|                                 else http_client)
|             self._auth_token: str | None = None
|
|         def login(self) -> None:
|             self._auth_token = ...
|
|         def _post(self, body: dict[str, Any]) -> None:
|             self.http_client.post(
|                 self.url,
|                 headers={"Authorization": f"Bearer {self._auth_token}"},
|                 json=body,
|             )
|
|         def submit_foo(self, foo: Foo) -> None:
|             self._post(foo.to_dict())
|
| Users will end up with a difficult-to-debug auth failure if
| they forget to call `login` first. Also, Mypy won't catch
| this without contorting your code.
|
| We naturally would want to add a test case like this:
|
|     def test_error_if_not_logged_in():
|         app_client = AppClient(url="https://api.example.net/v1")
|         with pytest.raises(RuntimeError):
|             app_client._post({})
|
| Thus we'll need to guard _every single usage_ of
| `self._auth_token` with:
|
|     if self._auth_token is None:
|         raise RuntimeError(...)
|
| And there are three usual approaches for handling this:
|
| 1. Disregard the article's advice and push the "if" _down_,
| into the _post method, to ensure that we only actually need
| to write that guard condition once.
|
| 2. Write a _check_logged_in method that encapsulates the
| if/raise logic, use that to at least make it easier to apply
| the guard where needed.
|
| 3. Refactor the internals to use something akin to the type
| state pattern, which is a funky and cool idea that I support.
| But it could easily end up turning into an "enterprise
| fizzbuzz" exercise where your little client library expands
| into a type-driven onion that obfuscates the business logic.
|
| I'm proposing a 4th way:
|
| 4. Declare that _post may only be called from inside a
| specific dynamic context, and ensure that that dynamic
| context is only established by login().
|
| Is it better? I'm not sure. But I've never seen it before,
| and it's kind of appealing as I continue to think about it.
| tetha wrote:
| Something I've seen a few times in Go and java: Make
| login() a constructor for the AppClient, or declare what's
| essentially a factory called "AppClientConfig" with a
| login() function to create the AppClient.
|
| I've only recently started to think about using factories
| like that, so I'm very sure there are patterns of dynamic
| state I wouldn't have a recipe for, but it's a pretty
| appealing idea to me.
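| A hypothetical Python sketch of that factory shape, with the
| names following the comment rather than any real library:

```python
class AppClient:
    # Only constructible with a token, so _auth_token is never None
    # and no runtime guard is needed.
    def __init__(self, url: str, auth_token: str) -> None:
        self.url = url
        self._auth_token = auth_token

    def post_headers(self) -> dict:
        return {"Authorization": f"Bearer {self._auth_token}"}

class AppClientConfig:
    def __init__(self, url: str) -> None:
        self.url = url

    def login(self) -> AppClient:
        # A real implementation would call the auth endpoint here;
        # the token below is a stand-in.
        token = "fake-token"
        return AppClient(self.url, token)

client = AppClientConfig("https://api.example.net/v1").login()
print(client.post_headers()["Authorization"])  # Bearer fake-token
```

| The "must be logged in" precondition becomes structural: you
| simply can't hold an AppClient that hasn't logged in.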
| nerdponx wrote:
| That's another good way to do it for sure, and IMO badly
| underused within the Python ecosystem specifically.
| paholg wrote:
| That's called RAII, and I'm a big fan (not of the name,
| though. It's a terrible name).
|
| https://en.m.wikipedia.org/wiki/Resource_acquisition_is_i
| nit...
| garethrowlands wrote:
| The "push ifs up" advice is not about checking preconditions,
| it's about selecting the right code path. If you have a
| function with a precondition, by all means assert it at the
| beginning of that function. For example, the Java type system
| allows nulls, so a function that requires an object should
| throw an exception if a null is passed.
|
| Each call site is absolutely responsible for ensuring that it
| only calls a function if its precondition is true. This follows
| from the definition of precondition. Nothing can change that.
|
| Calling a function in violation of its precondition is a bug
| (in the caller). And, sure, functions often need code that
| checks that (to prevent undefined behaviour). But we should
| distinguish these assertions from our program's actual control
| flow. The article is about the latter.
| bee_rider wrote:
| It seems like a decent general guideline.
|
| It has made me wonder, though--do there exist compilers nowadays
| that will turn if's inside inner loops into masked vector
| instructions somehow?
| p4bl0 wrote:
| I'm not convinced that such general rules can really apply to
| real-world code. I often see these kinds of rules as ill-
| placed dogmas, because sadly, even if this particular blog
| post starts by saying these are _rules of thumb_, they're not
| always taken this way by young programmers. A few weeks ago
| YouTube kept pushing a video at me called "I'm a _never_
| -nester", apparently of someone arguing that one should
| _never_ nest ifs, which is, well, kind of ridiculous. Anyway,
| back to the specific advice from this post: take this code
| from the article:
|     // GOOD
|     if condition {
|         for walrus in walruses {
|             walrus.frobnicate()
|         }
|     } else {
|         for walrus in walruses {
|             walrus.transmogrify()
|         }
|     }
|
|     // BAD
|     for walrus in walruses {
|         if condition {
|             walrus.frobnicate()
|         } else {
|             walrus.transmogrify()
|         }
|     }
|
| In most cases where code is written in the "BAD"-labeled way,
| the `condition` part will depend on `walrus`, and thus the
| `if` cannot actually be pushed up. If it can, then it is
| quite obvious to anyone that you are re-evaluating the same
| expression -- the condition -- over and over in the loop, and
| programmers have a natural tendency to avoid that. But junior
| programmers or students reading dogmatic-sounding rules may
| produce worse code in order to strictly follow this kind of
| advice.
| hollerith wrote:
| Agree. Also, most of the time, the form that is _easier to
| modify_ is preferred, and even if `condition` does not
| _currently_ depend on `walrus`, it is preferable for it to be
| easy to make it depend on `walrus` in the future.
| gsuuon wrote:
| The GOOD refactor would only work if the condition didn't
| depend on `walrus` and helps to make that fact explicit. If you
| apply "push fors down" again you end up with:
| if condition { frobnicate_batch(walruses) } else
| { transmogrify_batch(walruses) }
| ToValueFunfetti wrote:
| Re: 'never-nesting', I'm not especially dogmatic, but I've
| never empirically seen a situation where this:
|
|     match (condition_a, condition_b) {
|         (true, true) => fn_a(),
|         (true, false) => fn_b(),
|         (false, true) => fn_c(),
|         (false, false) => fn_d(),
|     }
|
| isn't preferable to this:
|
|     if condition_a {
|         if condition_b {
|             fn_a()
|         } else {
|             fn_b()
|         }
|     } else if condition_b {
|         fn_c()
|     } else {
|         fn_d()
|     }
|
| (Assuming the syntax is available)
| makeitdouble wrote:
| I think the lines get blurred when doing early exits and
| guard conditions.
|
| For instance:
|
|     if !condition_a && !condition_b {
|         return fn_d() // probably an error condition?
|     }
|     if condition_a && condition_b {
|         return fn_a()
|     }
|     if condition_a {
|         fn_b()
|     } else {
|         fn_c()
|     }
| 1letterunixname wrote:
| These 3 examples optimize to the same thing because they
| generate identical instructions:
|
| https://godbolt.org/z/a1Yq9rceE
|
| Note: Inverting and rearranging conditions changes what
| LLVM decides to output, sometimes for the worse. --opt-
| level=s is counterproductive here.
| titzer wrote:
| I think I would like the former syntax. Witness one of the
| several nested matching situations I run into:
|
| https://github.com/titzer/wizard-
| engine/blob/master/src/engi...
|
| This would be much, much better if Virgil had pattern
| matching on tuples of ADTs.
| agumonkey wrote:
| I really prefer the flat variant cause it helps me ensure
| exhaustiveness. In the end people probably trained their
| brain to read the nested variant to the point it's
| transparent to their neurons.
| cratermoon wrote:
| Oh no, hard disagree. The match implementation is far easier
| to reason about. I can see at a glance that if both
| condition_a and condition_b are true, call fn_a(). For the
| nested if version I have to trace each expression by hand.
| Izkata wrote:
| They said "isn't", not "is". Double negative. You two
| agree.
| cratermoon wrote:
| Ugh. I missed that. No wonder my grade school teachers
| taught us "don't use no double negatives".
| whartung wrote:
| > saying these are rules of thumb, they're not always taken
| this way by young programmers.
|
| The important thing to learn about all "rules of thumb" and
| "best practices" is the WHY behind them. Programmers especially
| should not copy/paste these things and apply them by rote.
|
| RoT and BP blindly applied may well not be a good idea. As with
| everything "it depends".
| keithnz wrote:
| Given it looks state-mutating, I'd write this something like:
|
|     walruses.apply(condition ? frobnicate : transmogrify)
|
| where apply does the looping.
| tacitusarc wrote:
| I think this is actually a great example of why `if` should
| be "pushed up." The objective of the code is to perform a
| particular operation on the walrus, given a condition. This
| is actually ifs being used instead of polymorphism and a type
| system. Why does the walrus have these two functions, which
| must be called in different scenarios? Why not have a single
| function, and two types, and the correct type is passed in?
| Even given the current structure, we could accomplish this:
| // There are better ways to accomplish this, depending on the
| language func frobnicate(walrus) { return
| walrus.frobnicate(); } func
| transmogrify(walrus) { return
| walrus.transmogrify(); } // Somewhere high
| in the code if condition {
| processing_function = frobnicate } else {
| processing_function = transmogrify }
| // Somewhere else in the code for walrus in walruses {
| processing_function() }
|
| If the decisions are made as early as possible, they do not
| need to be littered throughout the code. The guts of the code
| can run without branches, performing the same operations every
| time, the output only modified through the construction graph.
|
| Of course, this is not a new idea:
| https://www.youtube.com/watch?v=4F72VULWFvc
|
| It was old 15 years ago.
| hedora wrote:
| That pattern is not zero-cost: It breaks inlining and other
| compile-time optimizations. It also can hurt maintainability
| and readability. If the type of processing_function() is
| significantly generic, figuring out what is being called in
| the loop can be hard (as in undecidable; it is equivalent to
| the halting problem).
|
| In an extreme case that I've seen in real code,
| processing_function() is called send(t: Object) -> Object or
| recv(u: Object), and all non-trivial control flow in the
| system is laundered through those two function names.
|
| So, you have 100's or 1000's of send, recv pairs, and grep
| (or even a fancy IDE) can't match them up.
| tacitusarc wrote:
| For a non-trivial codebase, this style is incredibly
| valuable for readability. With it, decisions about
| execution are made in a single place, rather than
| distributed throughout the code. If the conditionals are
| embedded deeply in the call graph, sure a reader may be
| able to understand a single function when looking at it,
| but they won't know the implications of that function or be
| able to reason about the program as a whole. I've seen a
| huge class of bugs that are eliminated by properly using
| polymorphism rather than embedded conditionals.
|
| That being said, yeah, it's possible to just layer
| unnecessary abstraction and way overcomplicate things,
| making the layers utterly incomprehensible. But that seems
| to be the result of a misunderstanding- the point here
| isn't to _abstract_, it's to properly model polymorphic
| behavior. Any abstraction that takes place to do that is
| suspicious at best.
|
| I think the performance aspect is negligible for 98% of
| programs. The compiler may or may not be able to inline the
| underlying function, depends on the compiler. Depends on
| the optimizations. And in almost all cases, if you're
| relying on compiler optimizations to fix your performance
| problems, you're looking in the wrong place. Not to say
| such cases don't exist, and there's a slew of people who
| work in those areas, but I trust they know their business
| well enough to evaluate the trade offs here and make the
| right decisions.
| hedora wrote:
| If you're using a language with proper static typing and
| generics support (so C++, Rust, Swift, Go, etc),
| disabling the optimizer is typically a 10x hit. Most of
| what the optimizer does relies on statically analyzing
| call graphs, which it can't do if you're slinging
| function pointers around all over the place.
|
| Godbolt.org has a great example for Swift. Switch the
| editor to swift mode, load the "Sum over Array" example,
| and compile it with defaults. It produces reams and reams
| of assembly code. Now, pass -O in as a compiler flag to
| enable optimization, and it produces extremely tight
| code. (Swift 5.0 even noticed the two functions are
| semantically identical, and deduped them. 5.9 doesn't,
| but it also produces more compact code.)
|
| It's a similar story for rust, since the example is
| written in a functional style with an iterator. The C and
| C++ versions only have an imperative for loop variant, so
| the optimizer doesn't matter as much for them. (Though
| you could easily produce a C++ version of the code that
| takes a sum() function, and has similar bloat until
| optimization is enabled.)
| tacitusarc wrote:
| It seems necessary to point out that moving the goalposts
| during a discussion isn't terribly productive.
| swsieber wrote:
| > But junior programmers or students reading dogmatic-like
| wise-sounding rules may produce worse code to strictly follow
| these kind of advices.
|
| Yes.
|
| I for one am happy such articles exist. This one articulated
| something I've run into quite a few times without being able to
| fully articulate the issues. It seems like a nice mental model
| to have in my back pocket.
|
| That said, I appreciate comments like yours because I hope the
| dogmatic junior dev comes across it and hopefully becomes a
| little more nuanced.
| matklad wrote:
| Thanks! You are exactly the target audience for this article!
| It really is only useful when:
|
| * you already have the intuitive understanding of problems
| being described
|
| * but you haven't quite yet verbalized the problem and
| connected the dots
|
| This clicked for me yesterday, happy that it helped you to
| articulate what you already knew today!
| Olreich wrote:
| I'm close to giving up on dogmatic junior devs. They don't
| ever seem to give up their dogma, even in the face of direct
| damage caused by their dogma. The closest I've seen is a
| dogmatic junior dev saying they understand and will relax on
| the dogma, but turning around and enforcing it when they
| think you aren't looking.
|
| I've seen dogmatic junior devs turn into dogmatic senior
| devs, but I've never seen a dogmatic junior dev turn into a
| pragmatic senior dev.
| tenahu wrote:
| That was my thought too. This example would only work in
| specific circumstances, so why present it as a rule?
| I wouldn't quite say this is _bad_ advice, but it isn't
| necessarily good advice either.
|
| I think it's somewhat telling that the chosen language is Rust.
| The strong type system prevents a lot of defensive programming
| required in other languages. A C programmer who doesn't check the
| validity of pointers passed to functions and subsequently causes
| a NULL dereference is not a C programmer I want on my team. So at
| least some `if`s should definitely be down (preferably in a way
| where errors bubble up well).
|
| I feel less strongly about `for`s, but the fact that array
| arguments decay to pointers in C also makes me think that
| iteration should be up, not down. I can reliably know the length
| of an array in its originating function, but not in a function to
| which I pass it as an argument.
| lytigas wrote:
| > A C programmer who doesn't check the validity of pointers
| passed to functions and subsequently causes a NULL dereference
| is not a C programmer I want on my team.
|
| I disagree. Interfaces in C need to carefully document their
| expectations and do exactly that amount of checking, not more.
| Documentation should replace a strong type system, not runtime
| checks. Code filled with NULL checks and other defensive
| maneuvers is far less readable. You could argue for more
| defensive checking at a library boundary, and this is exactly
| what the article pushes for: push these checks up.
|
| Security-critical code may be different, but in most cases an
| accidental NULL dereference is fine and will be caught by
| tests, sanitizers, or fuzzing.
| jrockway wrote:
| I agree with that. If a function "can't" be called with a
| null pointer, but is, that's a very interesting bug that
| should expose itself as quickly as possible. It is likely
| hiding a different and more difficult to detect bug.
|
| Checking for null in every function is a pattern you get into
| when the codebase violates so many internal invariants so
| regularly that it can't function without the null checks. But
| this is hiding careless design and implementation, which is
| going to be an even bigger problem to grapple with than
| random crashes as the codebase evolves.
|
| Ultimately, if your problem today is that your program
| crashes, your problem tomorrow will be that it returns
| incorrect results. What's easier for your monitoring system
| to detect, a crashed program, or days of returning the wrong
| answer 1% of the time? The latter is really scary, depending
| on the program is supposed to do. Charge the wrong credit
| card, grant access when something should be private, etc.
| Those have much worse consequences than downtime. (Of course,
| crashing on user data is a denial of service attack, so you
| can't really do either. To really win the programming game,
| you have to return correct results AND not crash all the
| time.)
| wavemode wrote:
| Yeah but, not checking for null in C can cause undefined
| behavior. One possible outcome of undefined behavior is
| that your program doesn't even crash, but rather continues
| running in a weird state. So such a bug doesn't always
| "expose itself".
|
| If we accept that bugs are inevitable, and that
| accidentally passing a null pointer to a function is a
| possible bug, then we also conclude that your code really
| should include non-null assertions that intentionally
| abort() the program. (Which run in debug/staging mode but
| can be disabled in release/production mode.)
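Python's built-in `assert` works exactly this way: the check runs in normal mode but is stripped when the interpreter is launched with `-O`. A minimal sketch (the function name is invented for illustration):

```python
def send_message(payload):
    # Debug-time precondition: this line is compiled out entirely
    # when Python runs with the -O flag, mirroring NDEBUG in C.
    assert payload is not None, "payload must not be None"
    return len(payload)
```

Running `python script.py` aborts with an AssertionError on a None payload; `python -O script.py` skips the check entirely.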
| jrockway wrote:
| Very good point. For C, I like the idea of sticking an
| assertion in there.
| LegionMammal978 wrote:
| Indeed, Rust's own standard library uses this method.
| There are lots of public-facing unsafe functions that can
| result in undefined behavior if called incorrectly. But
| if the standard library is compiled in debug mode (which
| currently requires the unstable flag -Zbuild-std), then
| it will activate assertions on many of these unsafe
| functions, so that they will print a message and abort
| the program if they detect invalid input.
| afdbcreid wrote:
| The Rust compiler has even recently started to put extra
| checks on unsafe code in codegen, e.g. on raw pointer
| dereference to check that it is aligned.
| johncowan wrote:
| That raises a more general point. When you can't or don't
| have compile-time checks, removing run-time checks in
| production amounts to wearing your seat belt only when
| driving around a parking lot and then unbuckling when you
| get on the highway. It's very much the Wrong Thing.
| wavemode wrote:
| I wouldn't really characterize it that way. You (ideally)
| shouldn't be hitting code paths in production that you
| didn't ever hit in testing.
|
| But, in any case, if you are fine with the slight
| performance hit (though many C/C++ projects are not), you
| can always just keep assertions enabled in production.
| gpderetta wrote:
| assert_always(ptr != nullptr);
|
| (custom assert_always macro, so it doesn't get compiled out
| in release builds)
| branko_d wrote:
| My rule of thumb is: if the type system doesn't prevent an
| invalid value, it's your responsibility to prevent it at run-
| time.
|
| I've been writing a lot of T-SQL lately, which doesn't let you
| declare a parameter or variable as NOT NULL. So it's a good
| idea to check for NULLs as early as reasonable - usually at the
| top of the stored procedure (for parameters). Otherwise, a NULL
| might propagate unexpectedly deep into the call hierarchy and
| cause less-than-obvious problems.
|
| Fortunately, the data in the table can be declared as NOT NULL,
| so these kinds of bugs will usually not corrupt the data, but
| catching them as early as possible makes life easier. However,
| if there is a piece of logic that writes something to the
| database depending on the value of some parameter, and that
| parameter is unexpectedly NULL, that might lead to a wrong
| thing being written, or a necessary thing not being written at
| all, effectively corrupting the data.
|
| So, defensive programming all the way, baby!
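The same early-check pattern, sketched in Python rather than T-SQL (names invented for illustration): reject NULL-ish parameters at the top, before any logic runs.

```python
def record_payment(account_id, amount):
    # Guard parameters up front, so a None can't propagate deep
    # into the call hierarchy and cause less-than-obvious problems.
    if account_id is None:
        raise ValueError("account_id must not be NULL")
    if amount is None:
        raise ValueError("amount must not be NULL")
    # ... the actual write happens only after the guards pass
    return {"account_id": account_id, "amount": amount}
```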
| bregma wrote:
| In C a NULL is often a valid non-dereferencable pointer value.
| If you're checking for invalid pointer values you need to check
| for any of the 0xfffffffffffffffe possible invalid values. If
| all you're doing is checking for NULL you're not checking for
| invalid values at all.
|
| If the precondition of your function is "parameter p can not be
| NULL" that's fine, go ahead and check for it. If the
| precondition is "parameter p must be a valid pointer" well
| then, good luck to you finding the appropriate assertion
| condition.
| ryanjshaw wrote:
| I wrote some batch (list) oriented code for a static analyzer
| recently.
|
| It was great until I decided to change my AST representation from
| a tuple + discriminated union to a generic type with a corresponding
| interface i.e. the interface handled the first member of the
| tuple (graph data) and the generic type the second member (node
| data).
|
| This solved a bunch of annoying problems with the tuple
| representation but all list-oriented code broke because the
| functions operating on a list of generics types couldn't play
| nice with the functions operating on lists of interfaces.
|
| I ended up switching to scalar functions pipelined between list
| functions because the generic type was more convenient to me than
| the list-oriented code. The reality is you often need to play
| with all the options until you find the "right" one for your use
| case, experience level and style.
| aeonik wrote:
| I'm curious, why couldn't the list of generic types play nice
| with functions operating on lists of interfaces?
| ryanjshaw wrote:
| Have a look here: https://onecompiler.com/fsharp/3ztmx2uhr
|
| Basically we want a "Yes" or a "No" when the family has
| children:
|
|     let kidsYN = family |> numberOfChildren |> yesOrNo
|
| But we get:
|
|     error FS0001: Type mismatch.
|     Expecting a INode list but given a Node<Person> list
|     The type 'INode' does not match the type 'Node<Person>'
|
| Forcing us to do:
|
|     let familyAsINode = family |> List.map (fun n -> n :> INode)
|
| Sure you can wrap this up in a function, but it's ugly and
| annoying to have to use it everywhere, and it takes away from
| your logic. It ends up being better to split your "batch" and
| "scalar" operations and compose them, e.g. by introducing a
| "mapsum" function:
|
|     let kidsYN2 = family |> mapsum numberOfChildrenV2 |> yesOrNo
| smokel wrote:
| Without a proper context, this is fairly strange, and possibly
| even bad advice.
|
| For loops and if statements are both control flow operations, so
| some of the arguments in the article make little sense. The
| strongest argument seems to be about performance, but that should
| typically be one of the latest concerns, especially for rule-of-
| thumb advice.
|
| Unfortunately, the author has managed to create a catchphrase out
| of it. Let's hope that doesn't catch on.
| actionfromafar wrote:
| try let's hope catch not on
| koonsolo wrote:
| > Unfortunately, the author has managed to create a catchphrase
| out of it. Let's hope that doesn't catch on.
|
| In your next pull request: "Hey can you push this if up?" :D.
| aktenlage wrote:
| > The strongest argument seems to be about performance
|
| It may be an argument, but it's not a strong one. If the
| improved code can be written like the author puts it in their
| example (see below), the condition is constant over the runtime
| of the loop. So unless you evaluate an expensive condition
| every time, you are good. Branch prediction will have your
| back. If the condition is just a boolean expression using const
| values, I'd even guess the compiler will figure it out.
|
|     if condition {
|         for walrus in walruses {
|             walrus.frobnicate()
|         }
|     } else {
|         for walrus in walruses {
|             walrus.transmogrify()
|         }
|     }
| wyager wrote:
| Modern compilers and branch predictors mean this doesn't matter
| 99.9% of the time.
|
| If you take the same branch every time 100 times in a row, the
| processor will optimize the cost of the branch away almost
| entirely.
|
| If the branch condition is not volatile, compilers will usually
| lift it.
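What that "lifting" looks like when done by hand: the two functions below (invented for illustration) compute the same thing, but the second evaluates the loop-invariant condition once instead of on every iteration.

```python
def process_branchy(items, flag):
    # the branch is re-tested on every iteration
    out = []
    for x in items:
        if flag:
            out.append(x * 2)
        else:
            out.append(x + 1)
    return out

def process_hoisted(items, flag):
    # loop-invariant condition evaluated once, outside the loop
    op = (lambda x: x * 2) if flag else (lambda x: x + 1)
    return [op(x) for x in items]
```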
| bhuber wrote:
| This is mostly true, but sometimes the cost of evaluating the
| condition itself is non-trivial. For example, if a and b are
| complex objects, even something as trivial as `if (a.equals(b))
| ...` might take a relatively long time if the compiler/runtime
| can't prove a and b won't be modified between calls. In the
| worst case, a and b only differ in the last field checked by
| the equality method, and contain giant collections of some sort
| that must be iterated recursively to check equality.
| wyager wrote:
| "If the branch condition is not volatile, compilers will
| usually lift it"
|
| Usually in any program with well-defined semantics (e.g. not
| using janky multithreaded mutability), this will be true
| titzer wrote:
| It's not just the direct cost of a branch, it's the downstream
| costs as well. Removing (or automatically folding branches in a
| compiler) can lead to optimizations after the branch.
|
| E.g.
|
|     if (foo > 0) {
|         x = 3;
|     } else {
|         x = 7;
|     }
|     return x * 9;
|
| If the compiler (or programmer) knows foo is greater than zero
| (even if we don't know what it actually is), then the whole
| thing folds into:
|
|     return 27;
|
| That also means that foo is not even used, so it might get
| dead-code eliminated.
|
| If that gets inlined, then the optimizations just keep stacking
| up.
|
| (not that the article was about that, it's just one implication
| of removing branches: downstream code can be optimized knowing
| more).
|
| So, in summary, compilers matter.
| RenThraysk wrote:
| That code shouldn't generate any branching (unless on an odd
| cpu). A serious compiler should recognise it can be done with
| a conditional move.
| titzer wrote:
| A conditional move is not guaranteed to be faster than a
| predictable branch. It varies _a lot_ by microarchitecture.
| It was just an example. Imagine if you had a store in the
| else branch. Now optimizing away a store means that load
| elimination may kick in. Folding branches in the compiler
| unlocks tons of downstream benefits.
| wyager wrote:
| "If the branch condition is not volatile, compilers will
| usually lift it."
| titzer wrote:
| Lift it where?
| layer8 wrote:
| One variation on this theme is to use subclass or function
| polymorphism. This lets you decouple (in time) (a) the code that
| decides what to do based on the condition from (b) the code that
| actually does what was decided. In TFA's enum example, instead of
| the enum values, you could pass the foo/bar functions around as
| values (or instances of different subclasses implementing a
| common interface method as either foo or bar), and the place
| where the operation finally needs to be performed would invoke
| the passed-around function (or object). I.e., f() would return
| foo or bar as a function value, and g() would be passed the
| function value and simply invoke it, instead of doing the `match`
| case distinction.
|
| The drawback is that it's then less clear in some parts of the
| code which implementation will be invoked. But the alternative is
| having to perform the specific operation (directly or indirectly)
| immediately when the condition is determined. It's a trade-off
| that depends on the situation.
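A minimal sketch of this in Python (the `f`/`g` names follow TFA's example; the `foo`/`bar` bodies are placeholders):

```python
def foo():
    return "foo ran"

def bar():
    return "bar ran"

def f(condition):
    # the decision is made here, once...
    return foo if condition else bar

def g(operation):
    # ...and here we just invoke whatever was decided,
    # with no `match` case distinction at the call site
    return operation()
```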
| jampekka wrote:
| Fors down sounds like bad advice, and the rationale for it
| seems to be The Root of all Evil.
|
| "Fors up" allows for composition, e.g. map. Fors down makes it
| clunky at best.
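A quick Python illustration of the composability point (trivial functions, invented for illustration): the scalar version slots straight into map/filter, while the batch version bakes its own loop in.

```python
def double(x):            # scalar: composes with map, filter, sorted, ...
    return x * 2

def double_all(xs):       # batch ("for pushed down"): the loop is baked in
    return [x * 2 for x in xs]

# the scalar function composes freely:
composed = list(map(double, filter(lambda x: x > 1, [1, 2, 3])))
```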
| clausecker wrote:
| This reads like the author is very close to rediscovering array
| oriented programming.
| ezekiel68 wrote:
| In his (for he is a 'he') defense, I believe much of the whole
| industry has been doing so over the past five years as well.
| Everything old is new again!
| PaulDavisThe1st wrote:
| Contrarily:
|
| Push ifs down:
|
|     BAD:  if (ptr) delete ptr;
|     GOOD: delete ptr;
|
| Polymorphize your fors:
|
|     frobnicate(walrus);
|     frobnicate(walruses) { for walrus in walruses frobnicate(walrus); }
| BenFrantzDale wrote:
| I agree with them and with you. It looks like they work in some
| poor language that doesn't allow overloading. Their example of
| `frobnicate`ing an optional being bad made me think: why not
| both?
|
|     void frobnicate(Foo&);
|     void frobnicate(std::optional<Foo>& foo) {
|         if (foo.has_value()) { frobnicate(*foo); }
|     }
|
| Now you can frobnicate `Foo`s and optional ones!
| leduyquang753 wrote:
| Indeed they are working in a poor language that doesn't allow
| overloading. It's Rust. :-)
| myaccountonhn wrote:
| This is a variation of the "golden rule of software quality"
|
| https://www.haskellforall.com/2020/07/the-golden-rule-of-sof...
| andyferris wrote:
| This is kinda "just" predicate push-downs, for imperative code.
| Makes sense that the author is thinking about it given he is
| working on databases (tigerbeetle) and part of the motivation is
| performance.
|
| Interesting that we push the ifs up but we push the predicates
| down! (And a "predicate pushup" sounds like you are adding some
| randomness to your exercise routine - one, two, skipping this
| one, four, ...).
| 1letterunixname wrote:
| _condition_ is an invariant. Unless using cranelift or gcc, it's
| going to get optimized away by LLVM unless rustc is giving it
| some non-invariant constraints to solve for. Most compilers,
| JITs, and interpreters can and do do invariant optimization.
|
| Another way to look at the class of problem: if you're using too
| many conditionals too similarly in many places, you may have
| created a god type or god function with insufficient abstraction
| and too much shared state that should be separated.
|
| ---
|
| _Prime directive 0._ Write working code.
|
| _Prime directive 1._ Choose appropriate algorithms and data
| structures suitable for the task being mindful of the approximate
| big O CPU and memory impact.
|
| _Prime directive 2._ Write maintainable, tested code. This
| includes being unsurprisingly straightforward.
|
| _Prime directive 3._ Exceed nonfunctional requirements: Write
| code that is economically viable. If it's too slow, it's
| unusable. If it's somewhat too slow, it could be very expensive
| to run or will cost N people M time.
|
| _Prime directive 4._ If it becomes too slow, profile and
| optimize based on comparing real benchmark data rather than
| guessing.
|
| _Prime directive 5._ Violate any rule for pragmatic avoidance of
| absurdity.
| assbuttbuttass wrote:
| I love this advice, moving if statements "up" is something I've
| observed makes a big difference between code that's fun to work
| with and easy to maintain, and code that quickly gets
| unmaintainable. I'm sure everyone's familiar with the function
| that takes N different boolean flags to control different parts
| of the behavior.
|
| I think it really comes down to: functions that have fewer if
| statements tend to be doing less, and are therefore more
| reusable.
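A small sketch of the flag-soup problem and the pushed-up alternative (function names invented for illustration):

```python
# One function, N boolean flags: every caller pays for every branch.
def render(text, uppercase=False, reverse=False):
    if uppercase:
        text = text.upper()
    if reverse:
        text = text[::-1]
    return text

# Pushed up: small single-purpose functions the caller composes.
def render_upper(text):
    return text.upper()

def render_reversed(text):
    return text[::-1]
```

A caller that needs both simply writes `render_reversed(render_upper(text))`, and the ifs disappear from the callee entirely.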
| jackblemming wrote:
| "Put ifs where they minimize the net total cyclomatic complexity."
|
| This is exactly what the factory design pattern is trying to
| achieve. Figure out the type of object to create and then use it
| everywhere vs a million different switch statements scattered
| around.
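A bare-bones factory sketch in Python (class and format names invented for illustration): the type switch happens exactly once, at construction time.

```python
class CsvRenderer:
    def render(self, values):
        return ",".join(str(v) for v in values)

class TsvRenderer:
    def render(self, values):
        return "\t".join(str(v) for v in values)

def make_renderer(fmt):
    # the only switch on `fmt` in the whole program lives here;
    # every other call site just uses the object it was handed
    renderers = {"csv": CsvRenderer, "tsv": TsvRenderer}
    return renderers[fmt]()
```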
|
| Also don't create batch functions unless you need to. Functions
| that work on a single item compose better with map-reduce.
| bcrosby95 wrote:
| The first example is bad for reasons not related to ifs and fors.
| In general, if you have a "container" for something, you should
| write functions on the contained, domain-level "Thing" rather
| than on the container holding the domain-level thing.
|
| As an example I work with - Clojure. Sometimes I use agents. I
| don't write functions for agents, I write functions for things
| agents might contain.
|
| Similar rules for Elixir. My primary domain level functions don't
| work off a PID. They work off the underlying, domain-level data
| structures. GenServer calls delegate to that where necessary.
|
| This makes them more flexible, and tends to keep a cleaner
| distinction between a core domain (frobnicate the Walrus) and
| broader application concerns (maybe the Walrus is there, maybe
| not... oh yeah, also frobnicate it).
| crdrost wrote:
| Yeah. I think the given advice probably takes validation logic
| and floats it too high. It is of course nice to have early
| validation logic, but it is also nice when your functions don't
| mysteriously crap out with some weird error but instead shout a
| validation error at you.
|
| Haskell solves this with newtypes, "here is a transparent
| container that certifies that you did the appropriate
| validation already," that helps for this.
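The newtype idea translated loosely into Python (a sketch; `ValidatedEmail` is invented for illustration): the wrapper can only be constructed through validation, so downstream code treats its existence as proof.

```python
class ValidatedEmail:
    def __init__(self, raw):
        # the only way to obtain a ValidatedEmail is through this check
        if "@" not in raw:
            raise ValueError(f"not an email address: {raw!r}")
        self.value = raw

def send_welcome(email):
    # accepts a ValidatedEmail; no re-validation, the type is the proof
    return f"welcome mail sent to {email.value}"
```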
|
| The advice that I really want to hammer into people's heads is,
| prefer "sad ifs." That is, I will almost always find this
|     if (something_is_wrong_in_way_1) {
|         // fix it or abort
|     }
|     if (something_is_wrong_in_way_2) {
|         // fix it or abort
|     }
|     if (something_is_wrong_in_way_3) {
|         // fix it or abort
|     }
|
| more readable and maintainable than this
|
|     if (things_are_ok_in_way_1) {
|         if (things_are_ok_in_way_2) {
|             if (things_are_ok_in_way_3) {
|                 // do the happy path!
|             } else {
|                 // fix or abort thing 3
|                 // if fixed, do the happy path
|             }
|         } else {
|             // fix or abort thing 2
|             // if fixed, test way 3 again
|             // if way 3 is good do the happy path, else fix it
|             // ...
|         }
|     } else {
|         // ...
|     }
|
| I feel like it's in human nature to focus on the expected case,
| I want everyone whose code I meet to do the exact opposite,
| focus primarily on the unexpected. Every "if" imposes a mental
| burden that I am keeping track of, and if you have to go to an
| external system to fetch that information, or you need to exit
| early with an error, I can immediately discharge that mental
| burden the moment I know about it, if the handling and the
| detection are right next to each other.
| quickthrower2 wrote:
| In short: avoid `else`.
|
| Both what you mention, and also the case with:
|
|     if (...) {
|         ...
|     } else {
|         return 0;
|     }
|
| Which can become
|
|     if (...) {
|         ...
|     }
|     return 0;
| hedora wrote:
| Even better, if you're using a language with pattern
| matching (rust, any ML, haskell), just put it in a single
| match with the happy case on top, and the error cases
| below.
|
| To do this, you need to encode the distinction between
| happy and sad path in your types. That's good for all sorts
| of other reasons, so it's not really a drawback.
| KineticLensman wrote:
| Yes, pattern matching really helps with clarity,
| especially when there are multiple happy paths, like in a
| parser
| theteapot wrote:
| What about `else if`? Not a fan?
| quickthrower2 wrote:
| Occasional vice for me.
| stouset wrote:
| Personally I only either use if statements that wrap a
| single line of logic, or a switch/case statement where
| each branch is a single line of logic.
|
| Most situations are handled by guard clauses, exhaustive
| type switching, or switching on logical cases. _Very_
| occasionally do I have to break this practice for a
| legitimate reason, but this keeps functions simple to
| read and to reason about.
|
| I occasionally am unable to avoid an else, but I will
| always avoid else if. It puts too much reliance on
| remembering how this condition interacts with the one on
| the if statement above it, which can often be many lines
| away. I'd infinitely rather use a case statement where
| all the alternatives are in an easily-understood table
| with every condition aligned on successive lines.
| Xeamek wrote:
| In functional languages like elixir there is no return
| though, and you are forced to use if-else structure
| jstanley wrote:
| I tend to prefer a style more like:
|
|     if (!...) return 0;
|     ...
|
| Because otherwise you can end up with lots of nested
| conditions which are very confusing to follow, even though
| they boil down to "do the thing if these are true,
| otherwise don't do the thing".
| at_a_remove wrote:
| I usually move this up in an ETL process into a function I
| usually call "Pre_Flight_Checklist." Over the years this has
| gotten its own helper functions. For example, if I am going
| to use a file, I have function that not only checks for the
| existence of the file, but that it _is_ a file. This function
| can also be fed an expected size range and date if
| "freshness" is expected. If something is out of whack, an
| error message will mention that a file was expected at
| /Adirectory/Bdirectory/Cdirectory/filename.ext, but only
| /Adirectory/Bdirectory was found, nothing under it. I believe
| in overly-explanatory error messages.
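A sketch of such a pre-flight helper in Python (the API here is invented for illustration, not the commenter's actual code); note how the error message walks up the path to report which ancestor actually exists:

```python
from pathlib import Path

def check_input_file(path, min_size=0):
    p = Path(path)
    if not p.exists():
        # walk upward to report the deepest directory that does exist
        existing = p.parent
        while not existing.exists() and existing != existing.parent:
            existing = existing.parent
        raise FileNotFoundError(
            f"expected a file at {p}, but only {existing} was found")
    if not p.is_file():
        raise ValueError(f"{p} exists but is not a regular file")
    if p.stat().st_size < min_size:
        raise ValueError(f"{p} is smaller than expected ({min_size} bytes)")
    return p
```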
|
| Now, I _do_ run the risk of having things change out from
| under me, but putting all of the defensiveness up front
| allows the happy path and some alternatives to be more
| succinct in the code.
|
| I keep finding new things to be paranoid about, though!
| blandflakes wrote:
| I at first had the opposite reaction, which is that I spend a
| ton of time fighting inverted ifs! But my reaction is not
| incompatible with yours, just related: I HATE reading:
|
|     if (!someCondition) { // short amount of code } else { // longer happy path }
|
| The contextual overhead of having to invert a negative (i.e.
| !!someCondition) is just annoying.
|
| I do agree that
|
|     if (happy) { /* tons of code */ } else { /* I forgot how we got here */ }
|
| can also be an issue.
| Izkata wrote:
| The key to the idea above is that you don't have any "else"
| clause. It's an early return, raise an exception, or modify
| the data so it's no longer bad.
| blandflakes wrote:
| Yes, that's why I said I realized they weren't quite the
| same situation, while adding another example of a
| situation where people create hard to follow structures.
| Izkata wrote:
| > The advice that I really want to hammer into people's heads
| is, prefer "sad ifs."
|
| I think this is the first time I've heard that name for this
| idea. You'll find more under the term "guard clauses" - it
| even has its own Wikipedia page:
| https://en.wikipedia.org/wiki/Guard_(computer_science)
| softfalcon wrote:
| Yeah, same, also know this as guard clauses. But "sad if's"
| is kind of adorable sounding, so I might start using it.
| philbo wrote:
| pessimifs
| 867-5309 wrote:
| depressifs?
| softfalcon wrote:
| TIL: another word for "guard clauses" is "sad if's".
| btown wrote:
| I once had a CS teacher who hated early returns and break
| statements; they would always prefer the "nested ifs"
| approach. And they're far from being alone - many others
| believe that path is cleaner.
|
| I'm with the parent poster, though: if you can use control
| flow to keep the happy path the unindented path, and you have
| syntax highlighting to help you show that each guard clause
| actually does interrupt the control flow, it can often be
| significantly cleaner.
|
| I also like to think of it in terms of "railway oriented
| programming" (https://fsharpforfunandprofit.com/rop/ which is
| a must-read for anyone trying to wade into monads and
| functional programming IMO) - the "happy path" should be the
| path that it's easiest for the train to go down, namely a
| straight line down the page. Only give yourself the headache
| of a sharp turn when you're interrupting that path!
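A tiny railway sketch in Python (tuples of `(value, error)` standing in for F#'s Result type; the step functions are invented for illustration):

```python
def bind(result, fn):
    value, error = result
    if error is not None:
        return result            # already on the failure track: pass through
    return fn(value)             # still on the happy track: run the next step

def parse(s):
    return (int(s), None) if s.isdigit() else (None, f"not a number: {s}")

def reciprocal(n):
    return (1 / n, None) if n != 0 else (None, "division by zero")

def pipeline(s):
    # the happy path reads as a straight line of binds
    return bind(parse(s), reciprocal)
```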
| jaza wrote:
| I don't mind early returns, but I'm infamously anti "break"
| and "continue" statements (and all my colleagues to date
| have disagreed with me and have found my stance hilarious
| and/or infuriating). My rationale is that they turn a loop
| into a mutant abomination that's a loop and conditionals
| all mushed up together. I also can't shake this gut feeling
| that they're crude and antiquated control flow mechanisms,
| only one step away from the fiery inferno of "goto". So
| instead of this:
|
|     for thing in things:
|         if thing.is_yellow():
|             continue
|         if thing.is_rainbow():
|             break
|         thing.thingify()
|
| I'll do:
|
|     things_iter = iter(things)
|     thing = next(things_iter, None)
|     is_rainbow_thing_found = (
|         thing.is_rainbow() if thing is not None else False
|     )
|     while thing is not None and not is_rainbow_thing_found:
|         if not thing.is_yellow():
|             thing.thingify()
|         thing = next(things_iter, None)
|         is_rainbow_thing_found = (
|             thing.is_rainbow() if thing is not None else False
|         )
| KineticLensman wrote:
| I'm with your colleagues I'm afraid. The intent of the
| all-mushed-up-together version is really clear. I had to
| spend a long time looking at the longer version to figure
| out what was going on. I know which I'd prefer to
| maintain if I had to.
| indigo945 wrote:
| Considering break and continue to be too powerful as
| control flow statements is a somewhat common opinion, in
| my experience. (I still use them, but only sparingly, and
| where I am quite sure that they're not confusing.)
|
| That said, your second code block is unreadable to me --
| even knowing what it does, it's just confusing. No way
| that that's more readable than making use of break here.
| If you don't want so many conditionals in the loop body,
| you can factor the first check out:
|     no_yellow_things = (t for t in things if not t.is_yellow())
|     for t in no_yellow_things:
|         if t.is_rainbow():
|             break
|         t.thingify()
| sethammons wrote:
| I almost never say this: you are wrong. Code needs to be
| maintained. Everyone can understand the first example.
| Everyone has to slowdown and reason about your second
| example. It is objectively worse. And your argument is
| flawed: break and continue do not have the goto-like behavior
| that goto is criticized for, as they are limited to the scope
| of that block of code.
|
| The problem traditionally cited with goto is it can go
| anywhere (even into the body of another function where
| expectations on which variables are set may not hold).
|
| If you don't like break and continue, how do you justify
| exceptions? The handling code for those may be in a
| different file or nowhere. Much closer to goto.
| jaza wrote:
| As others have said here, if you want to do it nicely and
| avoid break / continue, then do it in a proper FP way
| with chained limit / filter calls, rather than doing it
| per my example.
|
| I accept your criticism, that my example is much harder
| to understand, and I admit that it's probably not the way
| to go in terms of maintainability. However, I'd still
| argue that my example is closer to an FP approach (even
| though it still uses a procedural loop), and that it
| communicates the logic in the same way that an FP
| approach does. In my example, it's clear that we're
| filtering out yellow (whereas with "continue", the logic
| is written in an inverted way); and it's clear just from
| reading the loop's header that we stop looping when
| rainbow is found (whereas with "break", logic to stop
| looping is being added at any arbitrary spot inside the
| loop's body).
|
| Re: exceptions. I know that many more people object to
| them, than to break / continue (and whole languages make
| a point of not using them, most notably Golang). The pros
| and cons of exceptions are a whole topic in itself that
| I'd rather not digress into right here. Anyway,
| personally I'm ok with them. Firstly, they're for errors,
| and I feel that having weird control flow is sort-of
| justified for errors, whereas it's not justified for
| expected behaviour. And secondly, they're at least
| consistent in that they can occur anywhere, whereas break
| / continue can only occur inside loops (in Python, at
| least - in other C-family languages, "break" has
| different behaviour in different contexts, which is IMHO
| even worse).
| indiv0 wrote:
| Minimizing control flow in loops is a good idea, but if
| you can FP-ize the loop entirely that keeps it pretty
| readable IMO.
|
|     things
|         .iter()
|         .filter(|t| !t.is_yellow())
|         .take_while(|t| !t.is_rainbow())
|         .for_each(|t| t.thingify());
| Tainnor wrote:
| That's the one I would prefer. If you're at all used to
| FP, this signals intent very clearly.
| jaza wrote:
| That's the nicest counter-example to my example, thanks
| for that! I wasn't familiar with take_while() (looks like
| that's Rust, and looks like Python has a similar
| itertools.takewhile() ), TIL a neat FP version of break.
| My example was quite trivial, it's not always so obvious
| how to break up a convoluted loop, but it should always
| be possible.
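For anyone curious, here is a rough Python analogue of that Rust chain using `itertools.takewhile`, as mentioned above. The dict-based `things` and the returned name list (standing in for `thingify()`) are illustrative assumptions, not anything from the thread's codebases.

```python
from itertools import takewhile

def thingify_all(things):
    """FP-style rewrite of the break/continue loop:
    the filter plays the role of 'continue' (skip yellow things),
    takewhile plays the role of 'break' (stop at the first rainbow)."""
    non_yellow = (t for t in things if not t["yellow"])
    return [
        t["name"]  # stands in for t.thingify()
        for t in takewhile(lambda t: not t["rainbow"], non_yellow)
    ]
```

Note the order matters: filtering before `takewhile` matches the original loop, where a yellow thing hits `continue` before the rainbow check is ever reached.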
| reroute22 wrote:
| I'm afraid I'm with your colleagues.
|
| I value being able to grasp what the code does at a
| glance while scrolling through. Your first snippet enters my
| brain as if without even reading it, all at once and
| instantly.
|
| Your second example absolutely does not and I'm sure you
| know why.
| toxik wrote:
| Surely you are joking? Your version is four times longer
| and repeats a large section of itself.
|
| The way to avoid control flow here is to simply find the
| rainbow first, take the sequence up until then, filter by
| yellow, and thingify the remaining things.
|     rainbow_idx = next(
|         i for i, t in enumerate(things) if t.is_rainbow()
|     )
|     non_yellows = (t for t in things[:rainbow_idx] if not t.is_yellow())
|     for t in non_yellows:
|         t.thingify()
|
| Excuse formatting, cellphone.
|
| Not sure this is any better than the obvious solution
| either.
| jaza wrote:
| Yes, I agree, that's the way to do it. Although it isn't
| always so obvious how to break it up into limit / filter
| steps, my example was quite trivial. But there should
| always be a way somehow.
| WesolyKubeczek wrote:
| Do they come from the times when structured programming was
| hot and new, SESE had been the buzzword of the year, and
| that doodad called "transistor" just had been invented?
| mcronce wrote:
| The author called out the newtype solution specifically in
| the first paragraph:
|
| > it could push the task of precondition checking to its
| caller, and _enforce via types_ (or an assert) that the
| precondition holds
|
| (Emphasis mine)
| slowmovintarget wrote:
| I recall avoiding multiple method exits (in Java) like the
| plague. They were hell to get through in a debugging session.
| But I guess we don't use debuggers much any more.
|
| When I began writing C# I recall my other team members
| "refactoring" my code to be shot through with multiple exits
| instead of my collecting variable, because "it looked
| nicer." When they asked me why I wrote it the way I did, I
| mentioned the Java debugger thing, and they kind of looked at
| me blankly and said, "Huh, never thought of that."
|
| Times change.
| _gabe_ wrote:
| Why are multiple method exits hell to get through with a
| debugger? I use a debugger for my C++ code all the time and
| I've never run into an issue because of multiple returns.
| slowmovintarget wrote:
| Setting breakpoints to inspect local state means you have
| to catch every single exit. Back then, if you had a
| thirty second to one minute round trip to get back to the
| method you were inspecting, it became... frustrating. You
| probably didn't have to deal with those kinds of start-up
| round trips in C++. With C++ you'd pay on the compile
| time.
|
| If you wrote the code to have a single exit, the
| cognitive load was also typically lower, as the control
| flow could be thought of as one way in, and two ways out
| (return or exception). Good discipline also followed "the
| jolly good idea of Demeter" or whatever it came to be
| called; that being to only operate on the state (or
| preferably, values) passed in, where possible.
| sachahjkl wrote:
| Great insight. Many programmers fail to think of their errors
| as valid values to get as an output. Golang got this right.
| GuestHNUser wrote:
| Correct me if I am misunderstanding, you are saying the first
| example would be better if it was `walrus.frobnicate()`? Isn't
| that a syntax preference more than an issue with the point the
| author is trying to make?
| runeks wrote:
| > As an example I work with - Clojure. Sometimes I use agents.
| I don't write functions for agents, I write functions for
| things agents might contain.
|
| I'm not following you. Do you have a better example?
| bcrosby95 wrote:
| Similar to the example of the OP:
|
|     (def walrus (agent {}))
|
|     ; good
|     (defn frobnicate [w]
|       (assoc w :frobnicated true))
|     (send walrus frobnicate)
|
|     ; bad
|     (defn frobnicate [w]
|       (send walrus #(assoc % :frobnicated true)))
|     (frobnicate walrus)
|
| It might make sense to _also_ have a function that executes
| something like (send walrus frobnicate). But IMHO code should
| basically never look like the bad example.
| chatmasta wrote:
| tl;dr Branch as early as possible (move ifs up) and as
| infrequently as possible (move loops down, to minimize the number
| of loops calling something that branches)
|
| It probably actually is a good rule of thumb, in that it will
| naively force you into some local maxima of simplified logic. But
| eventually it becomes equivalent to saying "all of programming is
| loops and branches," which is not a very useful statement once
| you need to decide where to put them...
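The tl;dr above can be made concrete with a short Python sketch. The batch functions and their string outputs are hypothetical stand-ins for the article's `frobnicate`/`transmogrify`: the point is only the shape, one branch at the top, loops below it.

```python
def frobnicate_batch(walruses):
    """Hypothetical batch operation: the 'for' lives down here."""
    return [f"frob({w})" for w in walruses]

def transmogrify_batch(walruses):
    """Hypothetical batch operation: the 'for' lives down here."""
    return [f"trans({w})" for w in walruses]

def process(walruses, condition):
    """'Ifs up, fors down': branch exactly once, up here,
    then hand the whole batch to a loop below the branch."""
    if condition:
        return frobnicate_batch(walruses)
    return transmogrify_batch(walruses)
```

Compare with the per-item version, where the same `condition` would be re-tested inside the loop on every iteration.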
| jasonjmcghee wrote:
| I really thought the whole article was building up to a code
| example like:
|
|     [fwalrus, twalrus] = split(walruses, condition)
|     frobnicate_batch(fwalrus)
|     transmogrify_batch(twalrus)
|
| And instead went for:
|
|     if condition {
|         for walrus in walruses {
|             walrus.frobnicate()
|         }
|     } else {
|         for walrus in walruses {
|             walrus.transmogrify()
|         }
|     }
| crabmusket wrote:
| Sort the data! Sort it! [1]
|
| [1]: https://macton.smugmug.com/Other/2008-07-15-by-Eye-
| Fi/n-xmKD...
| anyonecancode wrote:
| A good example of this I see a lot in a code base I'm currently
| working in is React components that conditionally render or not.
| I really can't stand this pattern, and whenever I can I refactor
| that into having the component ALWAYS render, but have the caller
| decide whether or not to call the component.
| theteapot wrote:
| Interesting that the dependency inversion principle [1] is like an
| extreme case of pushing ifs up, by encoding the if into the type
| interface implementation. Ultimately what you get with DI is
| pushing an if up to... somewhere.
|
|     # Somewhere:
|     walruses = [new TransWalrus(), new FrobWalrus(), ...]
|     ...
|     for (walrus in walruses) {
|         walrus.transfrobnicaterify()
|     }
|
| [1] https://en.wikipedia.org/wiki/Dependency_inversion_principle
| tubthumper8 wrote:
| This only works when `frobnicate` and `transmogrify` have the
| same argument and return types
| bobmaxup wrote:
| As with much programming advice, this is language dependent.
|
| You might want branching structures when you have no overloading.
| You might want guards and other structures when your type
| checking is dynamic.
| norir wrote:
| These heuristics feel to me like a corollary to a simpler, more
| general rule: avoid redundant work.
| gumby wrote:
| The idea of hoisting precondition ifs into the caller is terrible!
| Sure there are special cases where it's a good idea (if nothing
| else it skips a function call) but in the common case you don't
| want to this.
|
| In a library you want to check preconditions at the external
| boundary so the actual implementation can proceed knowing there
| are no dangling pointers, or negative numbers, or whatever the
| internal assumptions may be. Depending on the caller to do the
| check defeats the purpose.
|
| Also in many cases you would need to violate
| encapsulation/abstraction. Consider a stupid case: `bool
| cache_this (T obj)`. Let the cache manager itself check to see if
| the object is already there as it can probably touch the object
| fewer times.
|
| I agree on the `for` case but it's so trivial the article barely
| talks about it. Basically it's the same as the encapsulation case
| above.
| quickthrower2 wrote:
| I think there is a spirit to this. I.e. is it an "if" in
| spirit.
|
| Preconditions are not really branches. They are usually `if
| (bad) throw;` type of thing. They could be replaced with
| `assert(!bad);`.
|
| A branch would be a function like
| add_todo_or_calendaritem(is_todo: bool, ...) which would need
| to branch on is_todo.
| LegionMammal978 wrote:
| > In a library you want to check preconditions at the external
| boundary so the actual implementation can proceed knowing there
| are no dangling pointers, or negative numbers, or whatever the
| internal assumptions may be. Depending on the caller to do the
| check defeats the purpose.
|
| I think the idea is to instead address this with a type-safe
| interface, designed so that the external boundary physically
| cannot receive invalid input. The caller would then be
| responsible for its own if statements when constructing the
| input types from possibly-invalid raw values.
|
| > Also in many cases you would need to violate
| encapsulation/abstraction. Consider a stupid case: `bool
| cache_this (T obj)`. Let the cache manager itself check to see
| if the object is already there as it can probably touch the
| object fewer times.
|
| I don't see the suggestion as encouraging such a thing: the
| "cache_this" check should only ever be performed when it's
| known for certain that the user wants to access the cached
| object, so the entry point of the cache abstraction acts as a
| kind of boundary that the if statement depends on. And the if
| statement clearly shouldn't be pushed above its own dependency.
| hedora wrote:
| In cases where that doesn't make sense:
|
|     let f = Some(get_a_u16());
|     foo(f);
|     ...
|     fn foo(f: Option<u16>) -> u16 {
|         match f {
|             None => 0,
|             Some(f) => f * 1234,
|         }
|     }
|
| I'd expect any reasonable compiler to include enough link
| time optimization and dead code elimination to compile the
| whole mess down to a single multiply instruction.
| LegionMammal978 wrote:
| I don't see what you're trying to show with your example.
| The typical case is, you start with a raw type in the
| caller (u16), and then you create a derived type
| (Option<MultipliedU16>) as a result of validation. Also,
| you'd ideally want to handle overflow errors in the
| multiplication, especially with something as small as a
| u16.
|
| (And in case it helps anyone, for a trivial function to be
| inlined across crates in Rust, either the function must be
| marked with #[inline], or a special LTO option must be
| explicitly set in the Cargo profile. So often it's a good
| idea to mark all public trivial utility functions in a
| crate with #[inline].)
| hedora wrote:
| In my example, passing in none returns a value, not an
| error, so the intent is that it makes sense for that API
| to handle the None case in normal operation. However, in
| situations where the compiler can infer the argument is
| never None, it can remove the branch.
|
| (Yes, that function should probably handle overflow
| explicitly, but that wasn't the point.)
| gumby wrote:
| > I think the idea is to instead address this with a type-
| safe interface, designed so that the external boundary
| physically cannot receive invalid input.
|
| Ah, an idea that people have broken their picks on for 50
| years.
|
| We will pay out bonuses to employees who worked at least 9
| months last year. Actually current employees and former
| employees. I should create a special type for this case and
| then depend on the caller to construct such types only for
| qualifying recipients? That's just "moving the if" back into
| the caller. Any responsible developer would validate those
| entries at the interface...and is it complete?
| LegionMammal978 wrote:
| The idea is, we wouldn't be "depending on the caller": the
| constructor itself would indicate failure instead of
| producing an object if the data is invalid, so the caller
| can't mess up without completely violating the abstraction.
|
| So we would still be doing the validation "at the
| interface". It's just that the interface would be split
| into two separate functions, one checking for qualification
| and another operating on qualified employees.
|
| Also, an alternative would be to push the for loop down
| into the interface, so that it enumerates each employee,
| immediately filters them for qualification, and then
| performs whatever other work is needed for paying out. (It
| could even include a caller hook to perform additional
| filtering before paying out.)
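A rough Python sketch of that split interface, using the bonus example from upthread. The field names and the returned name list (standing in for the actual payout) are invented for illustration.

```python
def qualifies_for_bonus(employee):
    """The qualification 'if' lives in exactly one place, at the
    interface: at least 9 months worked last year, regardless of
    whether the employee is current or former."""
    return employee["months_worked_last_year"] >= 9

def pay_bonuses(employees):
    """The 'for' is pushed down into the interface: enumerate,
    filter for qualification, then operate only on qualified
    employees. Returns who was paid (stand-in for real payout)."""
    qualified = [e for e in employees if qualifies_for_bonus(e)]
    return [e["name"] for e in qualified]
```

Callers never construct a "qualified employee" themselves, so they can't mess up the rule without going around the interface entirely.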
| behnamoh wrote:
| A similar example of this is OpenAI's API calls which don't do
| response validation when you do function calling. Essentially,
| validating the response against the given function(s) is left
| to the user, leading to various implementations that just make
| the code noisy.
|
| As an alternative, OpenAI could just make sure the true
| function call is run (first, validate that the response is
| JSON, then make sure it's valid against the provided JSON
| schema) within n tries, after which the API raises an error or
| returns None.
| stephc_int13 wrote:
| A beneficial side effect of this strategy (operating in batches
| with control logic out of the loop) is that you can also
| relatively easily distribute the work on many worker threads
| without touching the interface or the code structure.
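That side effect can be sketched in Python with the standard library's `concurrent.futures`. The per-item `frobnicate` and the worker count are illustrative assumptions; the point is that a batch interface parallelizes without any change to its signature.

```python
from concurrent.futures import ThreadPoolExecutor

def frobnicate(x):
    """Hypothetical per-item work (no control logic in here)."""
    return x * 2

def frobnicate_batch(items, workers=4):
    """Because the loop owns the whole batch and contains no
    branching, fanning the work out across worker threads
    changes nothing about the interface or code structure."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(frobnicate, items))
```

Had the branch lived inside the loop body, each worker would re-evaluate it per item, and restructuring for parallelism would be harder.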
| wg0 wrote:
| This advice falls flat in case of validations. If a function is
| given some input and is supposed to work with it, how can it
| avoid if/else and how can we move this logic one level up to the
| caller to ask the caller to verify every parameter before calling
| the function?
|
| And if we keep pushing (thus deferring the decision making) up,
| wouldn't the topmost function become a lot more complicated
| having a lot more logic pushed up from far down below?
|
| That's bad and impractical advice but now will pollute many pull
| requests with needless arguments.
| LegionMammal978 wrote:
| > If a function is given some input and is supposed to work
| with it, how can it avoid if/else and how can we move this
| logic one level up to the caller to ask the caller to verify
| every parameter before calling the function?
|
| The usual way in idiomatic Rust would be to use type safety for
| this purpose: have the function accept special types for its
| input, and provide the caller secondary interfaces to construct
| these types. The constructors would then be responsible for
| inspecting and rejecting invalid input. This way, the caller
| can continue pushing the construction, and thus the if/else
| statements for validation errors, upward to the ultimate source
| of the possibly-invalid values.
|
| (This is also possible in C/C++/Java/C#/..., if not so
| idiomatic.)
| garethrowlands wrote:
| No, the advice is good. If the wrong function was called, or a
| function is called with the wrong input, it's just a bug and no
| amount of 'validations' can fix it. The article is written in
| the context of Rust, which has a type system, so the compiler
| can help check.
|
| In cases where a function's precondition can't be expressed in
| the type system, the function should check at the start and
| bail. For example, it's reasonable in Java to check if
| parameters are null (since the compiler cannot do this).
|
| For more on this, Google "parse, don't validate".
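The "parse, don't validate" idea referenced above can be sketched in Python. This is a minimal, assumed example (the `NonEmptyName` type and `greet` function are invented): the constructor is the only place the validation `if` lives, and everything downstream accepts only the already-parsed type.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class NonEmptyName:
    """A name checked once, at construction. Code receiving a
    NonEmptyName needs no further validation ifs."""
    value: str

    @staticmethod
    def parse(raw: Optional[str]) -> "Optional[NonEmptyName]":
        # The one place the validation 'if' lives: the boundary.
        if not raw or not raw.strip():
            return None
        return NonEmptyName(raw.strip())

def greet(name: NonEmptyName) -> str:
    # No precondition check needed: the type guarantees validity.
    return f"hello, {name.value}"
```

The caller handles the `None` case at the source of the possibly-invalid value; `greet` itself never branches on validity.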
| vladf wrote:
| And yet, a Rust Option (or really any option) can just be viewed
| as a list of one or zero elements. https://rust-
| unofficial.github.io/patterns/idioms/option-ite...
|
| In fact, in Haskell, operating on an option conditionally has the
| exact same functor as a list: `map`.
|
| So what am I to do, with an iterator? It's conflicting advice! An
| if is a for for an option.
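That "an if is a for for an option" observation can be shown in a few lines of Python. This is an illustrative sketch (the helper name is invented): treating an option as a zero-or-one-element list lets the same `map` shape serve both, and the explicit `if x is not None` disappears into the iteration.

```python
def map_option(f, opt):
    """Apply f to an optional value by viewing it as a list of
    zero or one elements: the conditional becomes a loop."""
    return [f(x) for x in ([] if opt is None else [opt])]
```

So the two rules of thumb really do meet in the middle here: the option's `if` and the list's `for` are the same functor, as the Haskell point above notes.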
| stevage wrote:
| > // GOOD
| > if condition {
| >     for walrus in walruses {
| >         walrus.frobnicate()
| >     }
| > } else {
| >     for walrus in walruses {
| >         walrus.transmogrify()
| >     }
| > }
|
| What? They literally just said in the previous paragraph that the
| `for` should be pushed down into a batch function.
| nialv7 wrote:
| I am just really tired of seeing articles like this. Sure, you
| find some rules that are helpful in some specific cases, but
| those rules almost never generalize (yeah the irony of me making
| a generalizing statement here is not lost on me, but I did say
| "almost").
|
| Imagine you got a corporate programming job, and your manager
| comes to you and says "here, in this company, we keep _all_ the if
| statements in one function, and no ifs are allowed anywhere
| else". I would just walk out on the spot.
|
| Just stop, stop writing these articles and please stop upvoting
| them.
| dg44768 wrote:
| Thanks for the article. Maybe I'm confused, but in the
| section near the end about how the two recommendations go
| together, why is the code this:
|
|     if condition {
|         for walrus in walruses {
|             walrus.frobnicate()
|         }
|     } else {
|         for walrus in walruses {
|             walrus.transmogrify()
|         }
|     }
|
| and not this?
|
|     if condition {
|         frobnicate_batch(walruses)
|     } else {
|         transmogrify_batch(walruses)
|     }
| BenFrantzDale wrote:
| I thought that too. I _think_ the point there was that you
| don't have to push this notion as far as in/out of functions:
| that just flipping them within a function can be beneficial.
| nurettin wrote:
| I like my ifs down. In fact, after decades of forcing myself to
| use parameter checklists and avoiding any nesting, I started to
| appreciate code that is nested just a couple of times
| intentionally to get rid of a bunch of conditionals that become
| implied as a result. It all depends on what feels natural and
| easy to read at the given stage of your life.
| pipeline_peak wrote:
| Please stop using Rust in coding style examples as if it's
| something people understand as widely as C.
|
| I don't know your egg head => symbols and idc.
| ilitirit wrote:
| I'm glad many people have identified why "pushing ifs up" is
| often bad advice. This article should give examples of when and
| why you should use _either_ approach. Furthermore, I would argue
| that there's far too little information and context presented
| here to even make a decision like that. What do `frobnicate` and
| `transmogrify` do? How many callers would need to perform these
| conditional checks? Do these if statements convey domain logic
| that should actually belong in the walrus class? If these checks
| need to be made often, would it make better sense to capture the
| call as a lambda and then only perform the check once instead of
| having a conditional for loop? Etc etc.
| Tainnor wrote:
| The more experience I get, the more I feel that too many
| programmers worry about making things pretty "in the small", but
| not enough care about proper design of an entire codebase.
|
| I'll admit that I like a concise and well-crafted function just
| as much as many others, but articles like this one are probably
| the things that can lead to the kind of unproductive bikeshedding
| that is sometimes experienced in PRs and other discussions. I
| don't care that much about whether your function is messy - or
| about where you put your ifs and fors (or if you use maps and
| filters instead), as long as the function is properly named, has
| a good interface (including expressive types), a clear purpose,
| is properly documented, doesn't make excessive use of side
| effects etc.
| tangjurine wrote:
| Like there's the stuff that can be refactored because the
| changes are local, or the places to change are clear.
|
| And then the stuff that is a lot harder to refactor because you
| would need to change code in a lot of places, and the places
| that need to be changed aren't clear. That's probably more the
| proper design of a codebase stuff.
| Leo_Germond wrote:
| The advice about if up is not bikeshedding though, it is the
| exact kind of architectural choice you're saying one should
| decide on. Don't believe me ? Well imagine you have inputs,
| where should you validate them ? According to this rule of
| thumb it's at the topmost level, when they are received. Well
| that seems super sensible, and it's typically something that
| helps with understanding the code (rather than checking them at
| the very last moment). Also for proofs that's technically
| necessary to allow the preconditions to "percolate up", which
| has the same effect of moving the if up.
|
| So the first advice is definitely not bike shedding, the second
| one I'm not so clear though ;)
| Tainnor wrote:
| But the article wasn't about "validate inputs at the
| boundaries" (which is a semantic concern), it was about "push
| your ifs up" (which is a syntactic concern).
|
| I agree that in the provided example, those two seem to
| somewhat coincide (although it's hard to say, given that the
| author makes an effort to give their functions names such as
| "frobnicate" that don't indicate their purpose), but in the
| general case that doesn't have to be true.
| mike_hock wrote:
| Are we reading the same article? There is zero concern for
| syntax.
|
| The first point is literally what you said. Write functions
| with clear interfaces. If your function doesn't handle
| None, make its argument type reflect that (Walrus instead
| of Option<Walrus>).
|
| The second point is about performance. Hoist branches out
| of loops, and process batches so any constant cost is
| amortized over the batch. Is that even controversial?
|
| > the author makes an effort to give their functions names
| such as "frobnicate"
|
| Yes, and that's good, because it keeps the focus on his
| actual points rather than the minutiae of a contrived
| example.
| Tainnor wrote:
| > Are we reading the same article? There is zero concern
| for syntax.
|
| By "syntactical" I mean "concerned with the surface level
| appearance of low-level code structure".
|
| > Yes, and that's good, because it keeps the focus on his
| actual points rather than the minutiae of a contrived
| example.
|
| Only if you think that guidelines about how to write code
| should be devoid of the context of the problem you're
| trying to solve. I happen to think that this is mostly
| wrong.
| mike_hock wrote:
| Yes, I think it's possible to have discussions on an
| abstract level and not everything has to be a case study.
| pjc50 wrote:
| "If" is semantic!
| Tainnor wrote:
| Its location in the codebase isn't.
| patmorgan23 wrote:
| If it changes the behavior of the program it is.
| Tainnor wrote:
| Well, let's consider the three examples the author gives
| for "pushing ifs up":
|
| The first one isn't even a proper example because they
| write "push the if to the caller", but doesn't actually
| show the caller. So the good and bad examples aren't even
| straightforwardly comparable. For the second example,
| calling g() has exactly the same behaviour as calling
| f(), and for the third example, the behaviour also
| remains identical with or without the enum.
|
| I don't see how this is any more than just pushing code
| around _without having more context about these methods
| and who calls them_.
| mpweiher wrote:
| > it is the exact kind of architectural choice you're saying
| one should decide on.
|
| While I can't speak to what the OP had in mind, architectural
| concerns are definitely not inside a function. Even
| connecting individual functions/procedures barely registers
| at the smallest, most granular scale of architecture.
|
| Of course, our programming languages for the most part don't
| let us express architectural concerns...so here we are.
| Tainnor wrote:
| Yes, this is what I was talking about. Debating which
| function should include an "if" condition seems rather
| immaterial to me.
| kubb wrote:
| Programming language design also tends to focus on the small
| (e.g. optimizing the number of language keywords).
| pjmlp wrote:
| To go along the same line of thought, too many think about the
| beauty of the flower, instead of the whole forest.
|
| The amount of hours occasionally wasted discussing this in PRs
| adds zero value to what the customer actually cares about,
| e.g. whether pressing the button prints their monthly invoice.
|
| Sure there is a balance to be had between completely
| unmaintainable code, and a piece of art to expose in the
| Louvre, however too many focus on the latter, instead of what
| is the customer getting out of their beautiful flower.
| valand wrote:
| Often the people who overfocus on the beauty of the flower
| are those detached too far from the customers, too many
| indirections that hide failures and pressure
| mpweiher wrote:
| > ... but not enough care about proper design of an entire
| codebase.
|
| One reason is that we don't have ways of meaningfully
| expressing components much larger than functions or that
| connect differently from functions in our programming
| languages. At most we can aggregate functions together. After
| that we're on our own.
| teo_zero wrote:
| Poor choice of words in TFA and a cursory read by a reader
| generated this misunderstanding. "Up" means "to the caller" for
| the author, but was probably understood as "towards the
| beginning (of the function or file)".
| Woshiwuja wrote:
| From the title i would have guessed "Use more ifs and less
| fors" and it didnt make sense at the beginning
| Tainnor wrote:
| No, I did understand the point the author was making. I just
| think it's a weak one.
| joelthelion wrote:
| > The more experience I get, the more I feel that too many
| programmers worry about making things pretty "in the small",
| but not enough care about proper design of an entire codebase.
|
| I've seen the reverse a lot, too. People who enjoy designing
| over-engineered cathedrals and can't be bothered to think about
| the low-level efficiency of their algorithms.
| maeln wrote:
| In general, I feel like too many developers get caught up
| in dogma, "best-practices" that are effectively often
| unrealistic, and tend to adopt extremist ways of thinking
| about the big and / or the small picture. A lot of the "dev
| culture" feel like it lacks a sense of pragmatism.
| antupis wrote:
| I think this tells more about company culture than the people
| themselves, some companies like small things, and others like
| cathedrals.
| berkes wrote:
| > People who enjoy designing over-engineered cathedrals and
| can't be bothered to think about the low-level efficiency of
| their algorithms.
|
| When that's deliberate, I'd say it's good.
|
| Quite often we need to optimize for maintenance. Or for high
| churn in developers. For readability. Adaptability. And so
| on. Each optimization is a trade-off. It's not always
| "performance vs readability", but quite often it is.
|
| If we spend time and effort on the low-level efficiency of
| algorithms, AND that comes at the cost of readability,
| adaptability, maintainability or any other -ility, we should
| very deliberate make a choice: does performance matter more
| than X. The answer is suprisingly often "no". And it's one of
| the many reasons why today's software is slow, bloated,
| battery-eating and unstable.
|
| Well, the times when that choice is not made deliberately are
| probably more to blame.
| Tainnor wrote:
| "Over-engineered cathedrals" aren't what I mean by "good,
| proper design".
|
| I hate huge, messy meta-frameworks where everything happens
| implicitly and you can debug nothing. But I hate piles of ad-
| hoc code where everything calls everything and you can't
| change a single thing without breaking 2000 other things just
| as well. You usually don't need complicated code constructs
| to organise your code well - but you do need to _think_.
|
| Designing maintainable code is a difficult skill and one that
| involves careful thinking about tradeoffs. Unfortunately,
| mostly we as an industry never seem to be able to have time
| for it.
| valand wrote:
| > but articles like this one are probably the things that can
| lead to the kind of unproductive bikeshedding that is sometimes
| experienced in PRs and other discussions
|
| I'd counterargue that if these trivial principles are
| established beforehand, unproductive bikeshedding of this type
| will happen less on PRs and other discussions.
|
| > as long as the function is properly named, has a good
| interface (including expressive types), a clear purpose, is
| properly documented, doesn't make excessive use of side effects
| etc.
|
| In performance-sensitive software where DOD thrives, more care
| needs to be put into these "in the small" things because that
| is how compiler optimization works. Then, it changes the rule
| of the game: statement semantics become crucial, self-
| documenting code makes more sense, in-code comments are for
| reasoning exclusively, and "documentation" becomes
| "specification and user manual".
| t8sr wrote:
| I work in security, and get to interact with a lot of area tech
| lead and architect types (L7-L9 in FAANG terms). I have found
| that the people who most concern themselves with "proper
| design" are generally about 5-10 years into their career and
| trying to prevent juniors from shooting themselves in the foot,
| but don't yet have a proper appreciation for the cost of
| complexity and how code changes over the long run. Conversely,
| after you've been doing this for 20+ years you learn to value
| simplicity over most other technical factors.
|
| Questions like "how early do we branch" and "what is the actual
| job this has to do" end up being the ones whose answers tend to
| have the most long-term value. Questions about abstractions and
| encapsulation tend to lead to the type of argument you
| describe. And the "the big picture" people are the ones with
| the most security issues in their code, because they don't
| really understand what the properly architected codebase is
| really doing.
| Tainnor wrote:
| > Questions like [...] "what is the actual job this has to
| do" end up being the ones whose answers tend to have the most
| long-term value.
|
| How is that not "big picture", proper design and
| abstraction/encapsulation?
|
| You seem to be arguing against a strawman. I didn't write
| "complexity". But if you want to fix security issues in your
| database access patterns, you better not have SQL distributed
| across your whole codebase.
| t8sr wrote:
| I don't think I am. You seem to disagree that things like
| where to branch are worth talking about, but you haven't
| given examples of what you mean by the big picture that is
| worth talking about. In this thread, you've told a few
| people they don't understand what you're saying - so what
| are you saying?
|
| Re: SQL. Maybe? Maybe not? It's a lot easier to do bad
| things to a database if you let an ORM generate the query.
| Writing SQL directly is not an anti-pattern, as long as you
| parametrize the query. But then again, maybe an ORM is not
| what you're suggesting - once again, you're saying what you
| don't like, and others are left to draw conclusions about
| what you think is good style.
| Tainnor wrote:
| 1. Yes, I happen to think that the difference between
| this:
|
|     fn f() {
|         if foo && bar {
|             if foo { } else { }
|         }
|     }
|
| and this:
|
|     fn g() {
|         if foo && bar { h() }
|     }
|
|     fn h() {
|         if foo { } else { }
|     }
|
| is not worth talking about.
|
| 2. If we're talking about things like _validate user
| input at the boundaries_ (and if you're a library
| developer, "user input" might refer to what the library
| users will pass in to your code), then I think that _is_
| a valid and useful guideline. If we talk about "don't
| spend time executing a bunch of code when you can exit
| early", then I'll also agree, although in most contexts,
| this doesn't really matter at the granularity of
| functions - but it does matter that you e.g. reject
| invalid input early on in your request handling (assuming
| a web app here) and not start processing it before
| deciding that you don't need to (with the notable
| exception of password validation, which should always be
| constant time).
|
| 3. What I mean by "big picture" is things like clearly
| delineating which parts of the codebase are responsible
| for what - for example, which parts of the code contain
| logic related to the database, which parts are related to
| business logic, which parts to interfacing with external
| APIs, which parts about handling responses. It's things
| such as "functional core, imperative shell" (Gary
| Bernhardt) or other means to try to manage side effects.
| It's deciding whether to write blocking or non-blocking
| code, and in the latter case, how exactly to go about it
| (instead of making a total mess by mixing both styles
| with a lot of back and forth). It's about deciding on how
| you want to structure your test suite so it gives you the
| maximum amount of confidence.
|
| 4. I think ORMs provide mostly false confidence and
| outside of prototype scenarios, I wouldn't personally use
| them anymore - especially not ones like Hibernate and
| ActiveRecord - but, alas, it's not always my decision.
| Their biggest flaw is IMHO that they pollute your domain
| model (and sometimes even your presentation logic -
| looking at you, ActiveRecord) with database logic, just
| for the convenience of being able to call "save()"
| directly on an object. I don't terribly mind writing raw
| SQL (as long as you properly parameterize inputs),
| although I think that generally, type-safe query builders
| with a 1:1 relationship to SQL (such as jOOQ) are
| preferable. I _do_ think that all of your SQL-related
| logic in your application shouldn't be scattered
| randomly across 10,000 files - it should be in a
| dedicated place (a number of files grouped in a common
| folder/package/module, maybe) so that if you need to
| review certain aspects of it (related to security or
| performance, e.g. avoiding N+1 queries), you don't have
| to hunt that logic down throughout your whole
| application.
| xtracto wrote:
| This resonates with my experience. Apparently now the
| "inversion of control", "dependency injection" patterns are
| in vogue. So my Sr. Developers are building new code with all
| that crap. So much that when you want to make a simple
| change, you have to hunt down 20 files of `xxxCreator`s,
| `xxxHandler`s and `xxxProvider`s that are glorified functions
| invoked by `.execute` or `.call`.
|
| And this is in TypeScript ... we were laughing about that
| kind of complexity in Java 20 years ago... in Slashdot.
| valand wrote:
| I always deplore this shallow influencing of design
| patterns without proper examination of whether said design
| pattern fits or is even correctly described by the
| popularizer. People fail to realize that inversion of
| control is necessary ONLY if some controls need to be
| inverted.
|
| To be fair, "inversion of control" is a great concept, but
| unfortunately told in a very specific way across time so
| the meaning now has changed into an inferior form of the
| original.
| Tainnor wrote:
| I do personally think it's preferable for components to
| have their dependencies injected in some way, and that
| it's better to have initialisation logic (especially
| related to heavy weight components) in one place.
|
| That doesn't mean that you need some huge ass meta-
| framework for it though. Just have your main function
| assemble the components together, done.
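What "just have your main function assemble the components together" can look like, as a minimal framework-free sketch (all class and function names here are hypothetical):

```python
# Components receive their dependencies through constructors; nothing
# reaches out to a global registry or a DI framework.
class Database:
    def fetch_greeting(self) -> str:
        return "hello"

class GreetingService:
    def __init__(self, db: Database):
        self.db = db

    def greet(self, name: str) -> str:
        return f"{self.db.fetch_greeting()}, {name}"

def main() -> str:
    # The one place where the object graph is wired together.
    db = Database()
    service = GreetingService(db)
    return service.greet("world")
```

Tests can inject a fake `Database` the same way `main` injects the real one, which is most of what a heavyweight container buys you.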
| shanghaikid wrote:
| No one likes 'if/else'. Moving 'if/else' inside or outside a
| function is not a solution. If you are writing business logic
| or UI logic, you should try to avoid it as much as possible,
| except when you are writing a complex algorithm where the
| computational complexity is required.
| crabmusket wrote:
| I was initially surprised by the pushback this article is
| getting.
|
| Then I remembered that this is data-oriented design advice, and I
| imagine most people on this forum (myself included most of the
| time) are writing line-of-business web apps where this advice
| seems like nonsense. I had already internalised the context, and
| wasn't planning to go apply this to my Laravel backend code.
|
| A heuristic: if in your usual daily work you don't need to think
| about the instruction cache, then you should probably ignore this
| advice.
|
| If you haven't yet and want to get a taste of when this advice
| matters, go find Mike Acton's "Typical C++ Bullshit" and decipher
| the cryptic notes. This article is like an understandable
| distillation of that.
|
| Despite what Casey Muratori is trying to argue (and I'm largely
| sympathetic to his efforts) most line-of-business software needs
| to optimise for changeability and correctness ("programming over
| time") not performance.
| GuestHNUser wrote:
| Yeah, data-oriented design (DOD) always seems to get people
| riled up. I think it largely gets this type of reaction b/c DOD
| implies that many aspects of the dominant object-oriented
| approach are wrong.
|
| > most line-of-business software needs to optimise for
| changeability and correctness, not performance.
|
| It's a shame that so many see changeability and performance in
| opposition with each other. I've yet to find compelling
| evidence that such is the case.
| crabmusket wrote:
| Well it's hard to argue about that tradeoff _in general_,
| but I think the existence of languages like Python, Ruby and
| PHP is compelling. Though I'd accept the argument that they
| help optimise for _neither_ performance nor changeability!
|
| My perspective is necessarily limited, but I often see
| optimisation as a case of "vertical integration" and
| changeability as a case of "horizontal integration".
|
| To make something fast, you can dig all the way down through
| all the layers and do the exact piece of work that is
| required with the minimum of useless faffing about for the
| CPU[0]. But to make something robust, you might want to e.g.
| validate all your inputs at each layer since you don't know
| who's going to call your method or service.
|
| Regarding the DOD/OOP wars, I really love this article, which
| argues that OOP doesn't have to be bad[1]. I also think that
| when performance is a requirement, you just have to get more
| particular about your use of OOP. For example, the difference
| between Mesh and InstancedMesh[2] in THREE.js. Both are OOP,
| but have very different performance implications.
|
| [0] Casey Muratori's "simple code, high performance" video is
| an epic example of this. When the work he needed to do was
| "this many specific floating point operations", it was so
| cool to see him strip away all the useless layers and do
| almost exactly and only those ops.
|
| [1] https://www.gamedev.net/blogs/entry/2265481-oop-is-dead-
| long...
|
| [2] https://threejs.org/docs/index.html?q=instanc#api/en/obje
| cts...
| GuestHNUser wrote:
| Thanks for the thoughtful reply.
|
| > Well it's hard to argue about that tradeoff in general,
| but I think the existence of languages like Python, Ruby
| and PHP is compelling. Though I'd accept the argument that
| they help optimise for neither performance nor
| changeability!
|
| I see the point you're making, and I don't disagree with
| it. But I should be a bit more clear in what I really meant
| in my parent comment:
|
| Given whatever language a developer is working in, whether
| it is fast like C++ or a slow language like Python (or
| perhaps even Minecraft Redstone), I think the programmer
| that takes a data oriented approach (meaning they write
| their code thinking about the kinds of data the program can
| receive and the kind it will likely receive, along with
| what operations will be the most expensive) will have
| better code than a programmer that makes a nice object
| model following all the SOLID principles. The majority of
| the OOP code I've worked with spends so much time caring
| about abstractions and encapsulation that performance is
| lost and the code is no better to work with.
|
| > Regarding the DOD/OOP wars, I really love this article,
| which argues that OOP doesn't have to be bad[1]. I also
| think that when performance is a requirement, you just have
| to get more particular about your use of OOP. For example,
| the difference between Mesh and InstancedMesh[2] in
| THREE.js. Both are OOP, but have very different performance
| implications.
|
| Absolutely agree here. Classic gamedev.net article. ECS !=
| DOD and I think the next parts of the article illustrate
| how DOD isn't necessarily in opposition with programming
| paradigms like OOP and FP.
|
| With that said, I think it can be argued that common
| patterns within both OOP and FP circles are a hurdle at
| times to utilizing hardware to its fullest. Here's Casey
| Muratori's argument against SOLID principles[0] for
| instance.
|
| ---------------------
|
| I think the point still stands: performance isn't in
| opposition to making a maintainable/changeable program.
|
| [0] https://youtu.be/tD5NrevFtbU?si=sED7befZRnVnQIxQ
| Jach wrote:
| A missing element to the conversation is another
| interpretation of DOD, that is, domain-oriented design.
| My favorite writing on the matter is "Programming as if
| the Domain (and Performance) Mattered"
| (https://drive.google.com/file/d/0B59Tysg-
| nEQZSXRqVjJmQjZyVXc...)
|
| When OOP has bad performance, or is otherwise another
| instance of ball of mud architecture, it often stems from
| not modeling the domain correctly. Only using or being
| forced to use an inferior conception of OOP (i.e. OOP as
| merely fancy structs + function pointers) doesn't help
| either, and ends up encouraging the sorts of patterns
| found in GOF even preemptively before they can maybe earn
| their keep. And what's especially insidious about Kingdom
| of Nouns style OOP is that just because you have named
| something, or created a type hierarchy, does not actually
| make it a particularly good model of anything. If you
| interview enough people, you'll find some thinking it's
| entirely reasonable to do the equivalent of making a Car
| a subclass of a Garage just because garages contain cars.
| When bad modeling infects production code, it's difficult
| to remove, especially when it's so overgrown that a bunch
| of other code is created not to fix the modeling but to
| try and wrangle a bit of sanity into localized corners
| (often at the further expense of performance --
| frequently from a lot of data copying and revalidating).
|
| On the other hand, modeling things too close to the
| hardware, or (as the linked paper goes through) too close
| to one ideal of functional programming purity (where you
| want to just express everything with map and reduce on
| sequences of numbers if you can, in theory giving high
| performance too), can severely get in the way of changing
| things later, because again you're not actually modeling
| the domain very correctly, just implementing one clever
| and specific mapping to raw numbers. When the domain
| changes, if your mapping was too coupled to the hardware,
| or too coupled to a nice map/reduce scheme, the change
| throws a big wrench in things. Sometimes, of course, such
| direct hardware and code exploitation is necessary, but
| we live in a fat world of excess and most business
| software isn't so constrained. An interesting example
| from the past could be Wolfenstein 3D, where its doors
| and especially its secret push walls are handled as
| special cases by the raycaster. Carmack initially refused
| to add push wall functionality to his engine, it seemed
| like it would be too big of a hack and possibly hinder
| performance too much. Big special cases as ifs down in
| the fors usually suck. After much needling and thinking
| though, and realizing the game really did need such a
| feature, he eventually got it done, surprising everyone
| when at a final attempt at convincing him to add them, he
| revealed he already did it.
| Gravityloss wrote:
| Yeah. Is there an article showing how a clever one-liner
| that is hard to read and an "inefficient-looking" but easy-
| to-understand multi-line version with proper variable names
| etc. will result in the same compiled code?
|
| I would assume compilers would be sufficiently advanced
| nowadays...
| rkangel wrote:
| > It's a shame that so many see changeability and performance
| in opposition with each other. I've yet to find compelling
| evidence that such is the case.
|
| I think agree with some of the sibling comments that you can
| write in a default style that is both maintainable and
| performant. Regularly though, when profiling and optimising,
| the improvement can reduce maintainability.
|
| Two examples from the last 6 months:
|
| A key routine is now implemented in assembly for our embedded
| target because we couldn't reliably make the compiler
| generate the right code. We now have to maintain that in
| parallel with the C code implementation we use on every other
| target.
|
| We had a clean layering between two components. We have had
| to "dirty" that layering a bit to give the lower layer the
| information needed to run fast in all cases.
|
| With thought you can often minimise or eliminate the impact,
| but sometimes you take the trade-off.
| wizzwizz4 wrote:
| If you had good tooling, you could annotate the assembly
| routine as being equivalent to the C implementation; then
| maintenance would be a lot easier, since the compiler would
| yell at you if they ever diverged.
|
| Depending on how you've dirtied your abstractions, language
| tooling _might_ help with that (e.g. if you could decouple
| the abstraction from the implementation, and map multiple
| abstractions to the same data), but I don't know whether
| that could work, or if it'd even be an improvement if it
| did.
| Lvl999Noob wrote:
| > If you had good tooling, you could annotate the
| assembly routine as being equivalent to the C
| implementation; then maintenance would be a lot easier,
| since the compiler would yell at you if they ever
| diverged.
|
| Is that something actual compilers can do right now? I
| don't think I have heard of something like that, though
| I don't work that closely with compilers. Furthermore,
| if the compiler can recognise that the assembly routine
| is equivalent to the C implementation, wouldn't it also
| be able to generate the same routine?
| PhilipRoman wrote:
| >Is that something actual compilers can do right now?
|
| it is not (outside of research)
|
| >if the compiler can recognise that the assembly routine
| is equivalent to the C implementation, wouldn't it also
| be able to generate the same routine?
|
| If you're willing to provide the equivalence proof
| yourself, it is much easier to verify such a proof than
| to produce one. The more work you're willing to put in
| when writing it, the simpler the proof verifier becomes.
| You could probably write a basic school arithmetic proof
| verifier within an hour or so (but it would not be
| pleasant to use).
|
| I've thought about this idea a lot myself and I think it
| should be feasible, (maybe not with assembly right away).
| You could write an easily readable function in C for code
| review/verification purposes and then specify a list of
| transformations. Each transformation must preserve the
| semantics of the original function. For example "unroll
| this loop by a factor of 4", "Fuse these 4 additions in a
| single vector instruction". It would be a pain to write
| such a transformation list (AI assistance maybe?) but
| once written you can rely on the compiler to make sure
| there are no mistakes.
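Short of a real proof verifier, the spirit of this can be approximated today with property testing: check that a hand-transformed variant (here, a loop unrolled by a factor of 4) agrees with the readable reference on many inputs. This is only a sketch of the idea, not the formal verification described above:

```python
import random

# Readable reference implementation.
def total(xs: list[int]) -> int:
    acc = 0
    for x in xs:
        acc += x
    return acc

# Transformed variant: loop unrolled by a factor of 4, with a
# scalar tail loop for the leftover elements.
def total_unrolled(xs: list[int]) -> int:
    acc = 0
    i = 0
    n = len(xs) - len(xs) % 4
    while i < n:
        acc += xs[i] + xs[i + 1] + xs[i + 2] + xs[i + 3]
        i += 4
    for x in xs[n:]:
        acc += x
    return acc

# Property check: both versions agree on random inputs of varying
# lengths, including the empty list and non-multiple-of-4 sizes.
def check_equivalence(trials: int = 1000) -> bool:
    rng = random.Random(0)
    for _ in range(trials):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        if total(xs) != total_unrolled(xs):
            return False
    return True
```

A proof-carrying transformation list would give certainty instead of confidence, but the testing version is cheap and catches most transcription mistakes.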
| rcxdude wrote:
| They don't need to be in opposition: it's enough that the
| changeability, correctness, and performance of solutions is
| uncorrelated for you to frequently need to tradeoff between
| them, especially given the inevitable tradeoff of all of
| these against time to write.
| gpderetta wrote:
| There is a point where performance optimizations get in the
| way of clarity, but that's after a long plateau where simple
| software == fast software. And simple software is the most
| amenable to changeability. It might not be the fastest way to
| initially write the software though, as leveraging existing
| large complex frameworks can give a head start to someone
| familiar with it.
| gnuvince wrote:
| Somewhere, somewhen, we, as software developers, started
| thinking that other programmers would rather extend code
| rather than modify it. This has led us to write code that
| tries to predict the use cases of future programmers and to
| pre-emptively include mechanisms for them to use or extend
| our code. And because it has seeped so deeply into our
| culture, if we don't do this -- engage in this theater --
| we get called out for not being good software engineers.
|
| Of course, the extra hooks we put in to allow re-use and
| extensibility usually results in code that is slower and
| more complex than the simple thing. Worse, very often, when
| a customer needs a new feature, the current extension hooks
| did not predict this use case and are useless, and so the
| code has to be modified anyway, but now it's made 10x more
| difficult because of the extra complexity and because we
| feel that we have to respect the original design and not
| rip out all the complexity.
|
| I like Knuth's quote [1] on this subject:
|
| > I also must confess to a strong bias against the fashion
| for reusable code. To me, "re-editable code" is much, much
| better than an untouchable black box or toolkit. I could go
| on and on about this. If you're totally convinced that
| reusable code is wonderful, I probably won't be able to
| sway you anyway, but you'll never convince me that reusable
| code isn't mostly a menace.
|
| [1] https://blog.jj5.net/blog/2021/05/21/knuth-on-reusable-
| code/
| mattacular wrote:
| Writing generally "reusable code", aka a library,
| warrants a different approach to software development
| than application code in many areas.
|
| 1. Application code = Fast-changing, poorly specified
| code. You need to have a rapid development cycle that
| supports "discovering" what the customer wants along the
| way. Your #1 job is pleasing the customer, as quickly,
| and as reliably, as possible.
|
| 2. Library code = Slow-changing, highly specified code.
| You have a long, conservative development cycle. Your #1
| job is supporting application programmers (the customers
| of your library).
| mschuster91 wrote:
| > Somewhere, somewhen, we, as software developers,
| started thinking that other programmers would rather
| extend code rather than modify it.
|
| That was when stuff like "proper testing" was deemed to
| be too expensive. It's unlikely to break existing
| workflows with extending something, but very easy to do
| so during a modification.
|
| Companies used to have hordes of manual testers/QA staff,
| that all got replaced by automated tools of questionable
| utility and capability.
| wizzwizz4 wrote:
| The tools are very useful, and they have well-known
| capability. That capability is strictly less than the
| capability of most manual testers / QA staff, but it's a
| lot faster at it, and gets much closer to being
| exhaustive.
|
| Automation should mean you can do a better job, more
| efficiently, more easily. Unfortunately, ever since the
| Industrial Revolution, it seems to mean you can do a
| quicker job with less money spent on labour costs.
| mschuster91 wrote:
| > That capability is strictly less than the capability of
| most manual testers / QA staff, but it's a lot faster at
| it, and gets much closer to being exhaustive.
|
| That's if you put the effort in to write good tests. When
| I look at the state of gaming in general, it's ... pretty
| obvious that this hasn't worked out. Or the GTA Online
| JSON debacle - I'm dead sure that this was known
| internally for a long time, but no one dared to modify
| it.
|
| And even then: an automated test can't spot _other_
| issues unrelated to the test that a human would spot
| immediately. Say, a CSS bug causes the logo to be
| displayed in grayscale. The developer who has
| accidentally placed the filter on all img elements writes
| a testcase that checks if an img element in content is
| rendered in greyscale, the tests pass, the branch gets
| merged without further human review... and boom.
| wizzwizz4 wrote:
| We do actually have automated testing tools for that.
| https://bbc.github.io/wraith/ is one: see the write-up
| https://responsivenews.co.uk/post/56884056177/wraith or a
| contemporary interview
| https://source.opennews.org/articles/responsive-css-
| testing-... .
|
| I get your overall point, though.
| brabel wrote:
| Simplest is definitely not more amenable to changes. We've
| just implemented a large feature where some devs tried to
| "hardcode" all the logic to apply a kind of rules-engine. I
| was horrified because the whole thing was coupled to the
| rules we currently needed, but we all know this is just the
| beginning and we plan to add more rules, and even allow
| custom rules to be defined by our customers. So, even
| though what they were trying to do is often lauded on HN
| and other forums because it's applying KISS and YAGNI, in
| this case adding a new rule would mean, basically, changing
| everything - the engine, the data that the engine persists,
| potentially the end result... everything! Now, perhaps this
| was indeed simpler though. However, it's the opposite of
| modifiable (and by the way, implementing it with abstract
| rules which store their own data, which the engine needs
| not know about, is actually much cleaner and the de-
| coupling just comes for free, almost).
| funcDropShadow wrote:
| This doesn't sound like a simple solution from your
| fellow devs. It appears to have been an easy solution. If
| the distinction isn't familiar to you, there is a great
| talk of Rich Hickey [0], that explains that distinction
| and some more. The talk is definitely not only applicable
| to Clojure.
|
| YAGNI is a great slogan, but must not become dogma. If you
| already know that you are going to need custom rules, then
| prepare for it. But if --- for example --- the current
| rules are almost trivial and you don't know yet which rule
| engine will be the right fit later on, then it might be a
| good idea to postpone the decision about the rule engine
| until you know more. In the meantime a hard-coded solution
| could be enough.
|
| [0]: https://www.youtube.com/watch?v=LKtk3HCgTa8
| brabel wrote:
| I know that talk very well. And I still disagree with you
| that hardcoding the rules is not simple, just easy. It's
| always going to be simpler to implement code that's more
| specific (i.e. has less room for the unknown) than less.
| Or do you think there is anything Rich said that shows
| that not to be always true?
| funcDropShadow wrote:
| Rich didn't say it, if I remember correctly. But there
| are problems where a more general algorithm is simpler
| than a more specific straight-forward algorithm. Usually
| because you change the way you model the problem.
|
| Otherwise, I have to take your word for it, because I
| cannot see your specific example.
| bluefirebrand wrote:
| YAGNI definitely doesn't apply in cases where you do
| actually know that you are gonna need it.
|
| If the task is "build a rules engine with X, Y, and Z
| rules, that can add A, B, and C rules next year" then
| delivering hardcoded XYZ rules engine is an absolute
| failure to engineer and a very braindead use of "YAGNI"
| RHSeeger wrote:
| > It's a shame that so many see changeability and performance
| in opposition with each other. I've yet to find compelling
| evidence that such is the case.
|
| In the article, when I saw this
|
| > For f, it's much easier to notice a dead branch than for a
| combination of g and h!
|
| My first thought was "yes, but now if anyone _else_ calls h
| or g, the checks never happen (because they live in f). I'd
| much rather have h and g check what _they_ need in order to
| run correctly. That way, if another call to one of them is
| added, we no longer need to rely on _that_ call correctly
| checking the conditions. Plus it avoids duplication."
|
| But... and this goes back to the original point from your
| post... this is a matter of code being correct over time;
| changeability. If you're worried about performance, then
| having the same check in 2 different places is a problem. If
| you're not (less) worried, then having the code less likely
| to break later as changes are made is helpful.
| Gadiguibou wrote:
| You can also often encode those checks in the type system
| like in the Option example near the start of the article.
| RHSeeger wrote:
| Assuming your type system is robust enough to support
| such things. That's not the case for many mainstream
| languages.
| sfink wrote:
| In that case, use DEBUG-only asserts. They're not
| foolproof (they're only as good as your test suite), but
| they're pretty damn good.
| bryanrasmussen wrote:
| maybe - although most of the time, especially last few years, I
| write web apps and I push ifs up all the time because it allows
| for early return like the article says.
|
| Determining that you need to do nothing should be done as soon
| as possible, especially in any system where performance is
| essential and web apps make more money the better their
| performance as a general rule.
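A minimal sketch of that early-return style, in a hypothetical Python request handler (the field names and error messages are illustrative): invalid input is rejected with guard clauses before any real work starts.

```python
# Hypothetical handler: each guard clause determines "we need to do
# nothing" as early as possible and returns immediately.
def handle_transfer(request: dict) -> tuple[int, str]:
    amount = request.get("amount")
    if not isinstance(amount, int) or amount <= 0:
        return 400, "amount must be a positive integer"
    if "account_id" not in request:
        return 400, "account_id is required"
    # Only now does the (comparatively expensive) real work run.
    return 200, f"transferred {amount} from {request['account_id']}"
```

The happy path reads straight down with no nesting, and bad requests never reach the expensive part of the handler.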
| valand wrote:
| Not even strictly DOD, this trivial principle produces better
| semantics and drives better code separation down the line.
|
| Several years ago I was exposed to DOD (and then this
| principle) when working on complex JS/TS-based long-running
| systems. It results in better code navigability, accurate
| semantic synthesis, and easier subsequent refactors.
|
| The side effect: some people remarked that the code looked like
| C.
| gnuvince wrote:
| > The side effect: some people remarked that the code looked
| like C
|
| Did you understand this comment to be positive or negative?
| valand wrote:
| It's neutral. However, a brief negative implication: it's
| less "idiomatic", as in the pattern that's commonly
| advertised as best practices
| simiones wrote:
| Honestly, both of these I think are pretty applicable to line-
| of-business apps as well.
|
| The "push loops down" advice most especially: for any CRUD app,
| handling creation and updates in bulk when possible will
| typically save huge amounts of time, much more than in CPU
| bound use cases. The difference between doing
| `items.map(insertToDb/postToServer)` vs doing
| `insertToDb/postToServer(items)` is going to be orders of
| magnitude in almost all cases.
|
| I have seen optimizations of this kind take operations from
| seconds or minutes down to milliseconds. And often the APIs end
| up cleaner and logs are much easier to read.
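The two shapes can be sketched with Python's stdlib sqlite3 (the table and function names are illustrative): the per-item version pays one statement per row, while the bulk version pushes the loop down into a single call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")

# Loop pushed up: one INSERT statement per item. Against a real
# database server this means one network round trip per row.
def insert_each(items: list[str]) -> None:
    for item in items:
        conn.execute("INSERT INTO items (name) VALUES (?)", (item,))

# Loop pushed down: one call handles the whole batch, letting the
# driver/database amortise parsing and round trips.
def insert_bulk(items: list[str]) -> None:
    conn.executemany(
        "INSERT INTO items (name) VALUES (?)", [(i,) for i in items]
    )
```

The batch API is also a nicer interface: callers hand over the whole collection, and logging one bulk operation beats logging a thousand identical single-row inserts.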
| atherton33 wrote:
| Great point. The n+1 sql queries antipattern comes to mind as
| a very common "push loops down" application in line of
| business stuff. Let the db do the work/join.
| crabmusket wrote:
| Oh definitely. I've seen the same kind of benefits. Avoiding
| 1+n queries is a similar change - push the loop "down" into
| the database.
| berniedurfee wrote:
| I think this is still very much applicable in OOP.
|
| Developers tend to break complex business logic within classes
| down into smaller private methods to keep the code DRY. The
| "push ifs up" mantra is really useful here to ensure the
| branching doesn't end up distributed amongst those methods.
|
| The "push fors down" is also very relevant when most call
| threads end up at an expensive database query. It's common to
| see upstream for loops that end up making many DB calls
| downstream somewhere when the looping could have been replaced
| by a "where" clause or "join" in the SQL.
|
| In fact, the "push fors down" mantra is useful even at the
| architecture level, as it's usually better to push looping
| logic for doing aggregations or filtering into a DAO where it
| can be optimized close to the DB, rather than pulling a bunch
| of objects out of the database and looping through them.
|
| I love simple and clear design principles like this!
|
| Though, as with all design principles, one needs to consider it
| deliberately vs applying it as dogma!
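A sketch of that N+1 pattern and its pushed-down replacement, using stdlib sqlite3 with an illustrative two-table schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE lines (order_id INTEGER, amount INTEGER);
    INSERT INTO orders VALUES (1, 'a'), (2, 'b');
    INSERT INTO lines VALUES (1, 10), (1, 20), (2, 5);
""")

# Upstream for loop: one query to list orders, then one query per
# order -- the classic N+1 pattern.
def totals_n_plus_one() -> dict[int, int]:
    totals = {}
    for (order_id,) in conn.execute("SELECT id FROM orders"):
        (s,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM lines WHERE order_id = ?",
            (order_id,),
        ).fetchone()
        totals[order_id] = s
    return totals

# Loop pushed down into the database: one query, and the join plus
# aggregation happen where they can be optimised.
def totals_joined() -> dict[int, int]:
    rows = conn.execute("""
        SELECT o.id, COALESCE(SUM(l.amount), 0)
        FROM orders o LEFT JOIN lines l ON l.order_id = o.id
        GROUP BY o.id
    """)
    return dict(rows)
```

With N orders the first version issues N+1 statements; the second always issues one, regardless of N.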
| tetha wrote:
| I've been experimenting with a similar idea to "pushing ifs
| up" in OOP, too: Question ifs in the code of a class and just
| use more classes.
|
| For example - python terminology, as it came from there -
| don't have a class "BackupStatus", and have the BackupStatus
| use 2-3 Booleans to figure out how to compute a duration, a
| status message and such. Instead, have a protocol
| "BackupStatus" to declare what a BackupStatus has to do, and
| figure these out for a FailedBackup, a PartialBackup, a
| SuccessfulBackup and so on.
|
| It's different from what I did for a long time, and some
| people are weirded out by it at first. But at the same time,
| it allows me to create a bunch of small, complete and
| "obviously correct" classes. Or at the least, very concrete
| cases to reason about with somewhat technical stakeholders.
| And once your backup status classes / enum members / whatever
| are just correct, then you can go and discuss when to create
| and use which of these states.
|
| It's not for everything naturally, but it quickly gives a
| solid foundation to build more complex things upon.
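A rough Python sketch of that idea, using a Protocol in place of a flag-driven class. The `message()` method and the concrete fields are invented for illustration; only the class names come from the comment:

```python
from typing import Protocol


class BackupStatus(Protocol):
    """What any backup status has to be able to do."""
    def message(self) -> str: ...


# Instead of one BackupStatus with 2-3 booleans and ifs to combine
# them, each concrete state knows its own answer.
class SuccessfulBackup:
    def message(self) -> str:
        return "backup completed"


class PartialBackup:
    def __init__(self, missing: int) -> None:
        self.missing = missing

    def message(self) -> str:
        return f"backup completed with {self.missing} files missing"


class FailedBackup:
    def message(self) -> str:
        return "backup failed"


statuses: list[BackupStatus] = [SuccessfulBackup(), PartialBackup(3), FailedBackup()]
messages = [s.message() for s in statuses]
```

The branching that would otherwise be an if-ladder over booleans is resolved once, at the point where the concrete status object is created.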
| thesnide wrote:
| There are indeed 2 religions. Data oriented or Code oriented.
|
| Both are making very different choices as they have diverging
| trade offs.
|
| I wrote something along those lines. Not the same, but very
| similar given enough abstraction:
|
| https://blog.pwkf.org/2022/08/07/two-programming-religions-c...
| GuB-42 wrote:
| > most line-of-business software needs to optimise for
| changeability and correctness ("programming over time") not
| performance
|
| These are not mutually exclusive, in fact, more often than not,
| they are correlated.
|
| Maybe the most important aspect of performance is to make
| things small. Small code, small data structures, small number
| of executed instructions. Writing small code is what "thinking
| about the instruction cache" essentially is, btw.
|
| And as it turns out, the smaller the code, the less room there
| is for bugs, the more you can understand at once, and the
| easier it is to get good coverage, good for correctness. As for
| changeability, the smaller the code, the smaller the changes.
| The same applies to data.
|
| Now, some optimization techniques can make the code more
| complicated, for example parallelization, caching, some low
| level optimization, etc... but these only represent a fraction
| of what optimizing for performance is. And no serious
| performance conscious programmer will do that without proper
| profiling/analysis.
|
| Then there are things that make the code faster with limited
| impact (positive and negative), and this is what the article is
| about. Functionally, if/for is not really different from
| for/if, but one is faster than the other, so why pick the slow
| one? And even if the compiler optimizes that for you, why rely
| on the compiler if you can do it properly at no cost. Just like
| looping over 2D arrays, it is good to know that there are two
| ways of doing it, and while they look equivalent, one is fast
| and one is slow, so that you don't pick the slow one by
| accident.
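The for/if versus if/for transformation in miniature. All names here are made up, and the rewrite is only valid when the condition is loop-invariant:

```python
walruses = list(range(10))
condition = True

# for/if: the branch is re-evaluated on every iteration.
frobnicated_slow = []
for w in walruses:
    if condition:
        frobnicated_slow.append(w * 2)
    else:
        frobnicated_slow.append(w)

# if/for: the branch is checked once, and each loop body is
# branch-free (and, in compiled code, easier to vectorize).
if condition:
    frobnicated_fast = [w * 2 for w in walruses]
else:
    frobnicated_fast = list(walruses)
```

Functionally identical, as the comment says; the second form just never pays for the branch inside the hot loop.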
| Obscurity4340 wrote:
| Is for/if faster since loops get started right away where ifs
| need to check conditionals constantly on top of whatever
| action they are supposed to execute?
| tremon wrote:
| Loops are fastest when they fit in the processor's
| instruction cache (and preferably, only touch data that
| fits in the data cache). Similarly, code is fastest when it
| has to execute the least amount of instructions. In the
| first example, the walrus(Option) function is designed to
| be executed unconditionally, only to return early when
| there is no walrus. That's an unnecessary function call
| that can be removed by changing the method signature (in
| Rust, because it has non-nullable types. In other languages
| you would need to do the non-null check anyway for safety
| reasons).
| mdavidn wrote:
| I have practiced this advice in line-of-business software to
| great effect. Popular ORM libraries operate on individual
| records, but that is quite slow when software needs to import a
| large number of objects. Restructuring the code to operate on
| batches--push fors down--results in significantly fewer
| database queries and network round-trips. The batched routines
| can resolve lookups of foreign keys in batches, for example, or
| insert multiple records at once.
|
| Granted, that is more difficult to maintain. I only batched the
| import routines. The rest still uses the more maintainable ORM.
| loup-vaillant wrote:
| > _Despite what Casey Muratori is trying to argue (and I 'm
| largely sympathetic to his efforts) most line-of-business
| software needs to optimise for changeability and correctness
| ("programming over time") not performance._
|
| Then consider listening to John Ousterhout instead: turns out
| that in practice, changeability and correctness are not
| _nearly_ as at odds with performance as we might initially
| think. The reason being, simpler programs also tend to run
| faster on less memory. Because in practice, simpler programs
| have shorter call stacks and avoid convoluted (and often
| costly) abstractions. Sure, top notch performance will
| complicate your program. But true simplicity _will_ deliver
| reasonable performance most of the time.
|
| By the way, while pushing fors down is mostly a data oriented
| advice, pushing ifs up is more about making your program
| simpler. Or more precisely, _increasing its source code
| locality_ : https://loup-vaillant.fr/articles/source-of-readability
| That's what concentrating all the branching logic in one
| place is all about.
| fl0ki wrote:
| I almost agree except that I may have in mind a different kind
| of engineer than you do: people who don't think about the
| instruction cache even though they _are_ writing performance-
| critical code. This post won 't change that, but I hope that
| people writing about performance know that audience exists, and
| I dearly hope that people in this situation recognize it and
| strive to learn more.
|
| At a large enough scale, even line of business code can become
| a bottleneck for real-world business activity. And
| unsurprisingly, engineers who don't think about performance
| have a way of making small scales feel like large ones because
| they use architectures, algorithms, and data structures that
| don't scale.
|
| This happens even in the FAANG companies famed for their
| interviewing and engineering. I've seen outages last for hours
| longer than they should have because some critical algorithm
| took hours to run after a change in inputs, all the while
| costing millions because one or more user-facing revenue-
| critical services couldn't run until the algorithm finished
| (global control, security, quota, etc. systems can all work
| like this by design and if you're lucky the tenth postmortem
| will acknowledge this isn't ideal).
|
| I've had to inherit and rework enough of these projects that I
| can definitively say the original developers weren't thinking
| about performance even though they knew their scale. And when I
| did inherit them and have to understand them well enough to
| make them as fast as they needed to be, some were at least
| written clearly enough that it was a joy, and others were a
| tangled mess (ironically, in some cases, "for performance" but
| ineffectively).
|
| See also: the evergreen myth that "you don't need to learn
| algorithms and data structures, just learn to find them online"
| resulting in bullshit like a correct but exponential algorithm
| being put into production when a simple linear one would have
| worked if the engineer knew more about algorithms.
|
| There's so much wrong with how many people do performance
| engineering that I don't think pushing fors down is even in the
| top 10 tips I would give, I just think that folks posting and
| commenting in this space recognize how large and impactful this
| section of the audience is.
| throwaway2037 wrote:
| Despite what Casey Muratori is trying to argue (and I'm largely
| sympathetic to his efforts) most line-of-business software
| needs to optimise for changeability and correctness
| ("programming over time") not performance.
|
| In investment banking (markets / trading), much of the most
| economically valuable software could easily run on a Raspberry
| Pi. I always call it "10,000 if-thens" (all the business rules
| that build up over 10+ years). Excel with VBA can still do a
| lot. A very tiny fraction of markets / trading software needs
| to be performant, yet, those tiny parts dominate most of the
| conversations (here and at work). It is tiring. Keep writing
| simple if-thens with a few for loops. Keep making money...
| sdfghswe wrote:
| I work in a domain where performance is very important, and I'm
| surprised that such a simple heuristic can be so good.
| vjk800 wrote:
| I'm surprised by how often programmers coming from software
| engineering background do this wrong. I started programming in
| science and there it's absolutely necessary to think about this
| stuff. Doing for loops in a wrong order can be the difference
| between running your simulation in one hour instead of one week.
|
| With this background, I instinctively do small-time optimization
| to all my code by ordering for's and if's appropriately. Code
| that doesn't do this right just _looks_ wrong to me.
| crabmusket wrote:
| Watch out for a visit from the premature optimisation police!
| topaz0 wrote:
| I try to avoid the premature optimisation police just as much
| as the regular police. Neither of them tend to be serving the
| public interest, whatever they may say.
| cubefox wrote:
| The pushing-ifs-up assumes you are using a language where objects
| aren't nullable. This doesn't apply to most languages. Otherwise
| we would just get a potential null pointer exception when we
| don't check for null inside the function.
| theK wrote:
| > If there's an if condition inside a function, consider if it
| could be moved to the caller instead
|
| Haha, it is important to have logic where it is relevant. If
| performance is more relevant than semantics or maintainability do
| that. In all other cases favor locality, filter early and fscking
| kiss. Why is this news?
| blumomo wrote:
| I wrongly assumed this article was about readability and
| maintainability, two high-cost factors if done poorly.
|
| But I was wrong, here's the motivation on "push fors up":
|
| > The primary benefit here is performance. Plenty of performance,
| in extreme cases.
| amelius wrote:
| > While performance is perhaps the primary motivation for the for
| advice, sometimes it helps with expressiveness as well.
|
| Compilers can do this. And I don't think the second part of that
| sentence is applicable very often.
| cbogie wrote:
| i first interpreted this on terms of organization policies, and
| then clicked the link.
|
| functions, they're just like us.
| nivertech wrote:
| The real answer is "it depends", but in the vast majority of
| cases it is better to "push down" all control flow structures:
| both conditional branching and loops. This will make it easier to
| write the code as a pipeline, and therefore more readable.
|
| Some argue that higher-order functions (HoFs) should be "pushed
| up" even if they make the code more verbose (that's a standard
| code style in Elixir), while the alternative is more readable
| - it creates lots of microfunctions, which also clutter the
| code.
|
| For programming languages with either succinct control structures
| (APL family), or with syntactic sugar for them, I see no problem
| with "pushing them up".
|
| Of course there are exceptions, e.g. in case the code needs to be
| optimized for speed, memory usage, security, or formal
| correctness.
| Vt71fcAqt7 wrote:
| The way he explains "pushing `for`s down" is obvious. Restating
| it would be: only do something you need in a for loop and not
| something you could do once. But that has nothing to do with what
| he means by "pushing `if`s up" which is about making `if`s appear
| as early as possible in the control flow. It doesn't matter if
| `for`s are early or late in the control flow. What matters is
| that only things that need to be done n times with n being the
| loop count of the for loop are in the for loop and everything
| else is not. And I disagree with "push ifs up" stated
| unqualified.
| sharas- wrote:
| In other words: write small reusable functions, plus an
| orchestrator "script" function to glue them all together. The
| orchestrator has all the "ifs" that are domain/application-
| specific.
| btbuildem wrote:
| This usually culminates with the patient trying to cram all
| if/else logic into a type system, which effort eventually leads
| them to have an aneurysm and they have to be carried away on a
| stretcher.
| sharno wrote:
| What I noticed too for pushing the for loops down is that more
| functions start taking list of things instead of one thing. This
| makes it easier down the road when a requirement changes and we
| need to handle multiple things instead of one thing with the same
| code logic. It decreases overall change in the codebase.
| kazinator wrote:
| > _The GOOD version is good, because it avoids repeatedly re-
| evaluating condition, removes a branch from the hot loop, and
| potentially unlocks vectorization._
|
| Compilers can do that: notice that _condition_ is not affected
| within the loop and hoist it outside.
|
| Speaking of which, if the condition _is_ affected by the loop,
| then the transformation is not correct. E.g. maybe the walruses
| appear in the list in a specific order, and an earlier walrus can
| determine a condition affecting the treatment of a later walrus.
|
| Don't go into your code randomly swapping fors and ifs.
| 6510 wrote:
| In my ignorance I would add that empty functions are really fast.
|
| if (condition) { carbonate = function () {} }
|
| for (dragon in dragons) { carbonate(dragon); /* do something
| else */ }
| Snetry wrote:
| I find this to be really interesting advice, but I wonder when
| this is right and when this would be wrong.
|
| Especially in C, where a lot of data is passed through pointers,
| you'd want to make sure the data you are given isn't pointing to
| nothing, and can't always rely on the caller doing it for you.
___________________________________________________________________
(page generated 2023-11-17 23:02 UTC)