[HN Gopher] Push ifs up and fors down
___________________________________________________________________
Push ifs up and fors down
Author : celeritascelery
Score : 60 points
Date : 2023-11-15 21:41 UTC (1 hours ago)
(HTM) web link (matklad.github.io)
(TXT) w3m dump (matklad.github.io)
| Waterluvian wrote:
| This kind of rule of thumb usually contains some mote of wisdom,
| but generally just creates the kind of thing I have to de-
| dogmatize from newer programmers.
|
| There's just always going to be a ton of cases where trying to
| adhere to this too rigidly is worse. And "just know when not to
| listen to this advice" is basically the core complexity here.
| flashback2199 wrote:
| Don't compilers, cpu branch prediction, etc all fix the
| performance issues behind the scenes for the most part?
| Thaxll wrote:
| Well compilers are good and dumb at the same time.
| bee_rider wrote:
| Some of the moves seemed to change what an individual function
| might do. For example they suggested pulling an if from a
| function to the calling function.
|
| Could the compiler figure it out? My gut says maybe; maybe if
| it started by inlining the callee? But inlining happens based
| on some heuristics usually, this seems like an unreliable
| strategy if it would even work at all.
| malux85 wrote:
| No, compilers (correctly) prefer correctness over speed, so
| they can optimise "obvious" things, but they cannot account for
| domain knowledge or inefficiencies farther apart, or that
| "might" alter some global state, so they can only make
| optimisations where they can be very sure there's no side
| effects, because they have to err on the side of caution.
|
| They will only give you micro-optimisations, which can add up
| to a speedup sometimes, but the burden of holistic program
| efficiency is still very much on the programmer.
|
| If you're emptying the swimming pool using only a glass, the
| compiler will optimise the glass size, and your arm movements,
| but it won't optimise "if you're emptying the correct pool" or
| "if you should be using a pump instead" - a correct answer to
| the latter two could be 100,000 times more efficient than the
| former two, which are the ones a compiler could answer.
| jchw wrote:
| The short answer is absolutely not, even when you are sure that
| it should. Even something as simple as a naive byteswap
| function might wind up generating surprisingly suboptimal code
| depending on the compiler. If you really want to be sure,
| you're just going to have to check. (And if you want to check,
| a good tool is, of course, Compiler Explorer.)
| roywashere wrote:
| Function call overhead can be a real issue in languages like
| Python and JavaScript. But you can or should measure when in
| doubt!
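|
| A minimal sketch of such a measurement in Python, using the
| standard `timeit` module. The functions `frobnicate`,
| `frobnicate_batch`, `per_item` and `measure` are hypothetical
| names made up for illustration; the numbers depend entirely on
| the interpreter and machine, so none are claimed here.

```python
import timeit


def frobnicate(x):
    # Scalar version: the caller pays one Python-level call per item.
    return x * 2 + 1


def frobnicate_batch(xs):
    # Batch version: the loop lives inside, so there is a single call in total.
    return [x * 2 + 1 for x in xs]


def per_item(xs):
    # The loop stays in the caller and calls the scalar function each time.
    return [frobnicate(x) for x in xs]


def measure(n=100_000, repeat=5):
    # Take the best of several runs to reduce noise from the rest of the system.
    xs = list(range(n))
    t_scalar = min(timeit.repeat(lambda: per_item(xs), number=1, repeat=repeat))
    t_batch = min(timeit.repeat(lambda: frobnicate_batch(xs), number=1, repeat=repeat))
    return t_scalar, t_batch


if __name__ == "__main__":
    t_scalar, t_batch = measure()
    print(f"per-item calls: {t_scalar:.4f}s  batch call: {t_batch:.4f}s")
```

| Both variants return the same list, so any difference in the
| timings comes from where the loop sits, not from the work done.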
| jmull wrote:
| The rule of thumb is put ifs and fors where they belong -- no
| higher or lower. And if you're not sure, think about it a little
| more.
|
| I don't think these rules are really that useful. I think this is
| a better variation: as you write ifs, fors and other control flow
| logic, consider why you're putting it where you are and whether
| you should move it to a higher or lower level. You want to think
| about the levels in terms of the responsibility each has. If you
| can't think of what the demarcations of responsibility are, or
| they are tangled, then think about it some more and see if you
| can clarify, simplify, or organize it better.
|
| OK, that's not a simple rule of thumb, but at least you'll be
| writing code with some thought behind it.
| benatkin wrote:
| If you know where they belong, this post isn't for you.
| crazygringo wrote:
| Exactly -- write code that matches clear, intuitive, logical,
| coherent organization.
|
| Because easy counterexamples to both of these rules are:
|
| 1) I'd much rather have a function check a condition in a
| single place, than have 20 places in the code which check the
| same condition before calling it -- the whole _point_ of
| functions is to encapsulate repeated code to reduce bugs
|
| 2) I'd often much rather leave the loop to the calling code
| rather than put it inside a function, because in different
| parts of the code I'll want to loop over the items only to a
| certain point, or show a progress bar, or start from the
| middle, or whatever
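|
| The two counterexamples above can be sketched in Python. This
| is an illustration under assumptions, not code from the
| article: `checked_isqrt` and `first_n_roots` are hypothetical
| names, and the negative-number guard stands in for whatever
| condition the 20 call sites would otherwise repeat.

```python
import math
from typing import Iterable, List, Optional


def checked_isqrt(value: int) -> Optional[int]:
    # Counterexample 1: the guard lives in one place, inside the function,
    # so no call site can forget it.
    if value < 0:
        return None
    return math.isqrt(value)


def first_n_roots(values: Iterable[int], limit: int) -> List[Optional[int]]:
    # Counterexample 2: the caller owns the loop, so it can stop early
    # (or show progress, or start from the middle) without changing the callee.
    out = []
    for index, value in enumerate(values):
        if index >= limit:  # loop "only to a certain point"
            break
        out.append(checked_isqrt(value))
    return out
```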
|
| Both of the "rules of thumb" in the article seem to be
| motivated by increasing performance by removing the overhead
| associated with calling a function. But one of the _top_ "rules
| of thumb" in coding is to _not prematurely optimize_.
|
| If you need to squeeze every bit of speed out of your code,
| then these might be good techniques to apply where needed (it
| especially depends on the language and interpreted vs.
| compiled). But these are _not at all_ rules of thumb in
| general.
| pests wrote:
| I think a key thing software engineers have to deal with, as
| opposed to physical engineers, is an ever-changing set of
| requirements.
|
| Because of this we optimize for different trade-offs in our
| codebase. Some projects need it, and you see them dropping
| down to handwritten SIMD assembly for example.
|
| But for most of us the major concern is making changes,
| updates, and new features. Being able to come back and make
| changes again later for those ever changing requirements.
|
| A bridge engineer is never going to build abstractions and
| redundancies on a bridge "just in case gravity changes in the
| future". They "drop down to assembly" for this and make
| assumptions that _would_ cause major problems later if things
| do change (they won't).
| foota wrote:
| I think the argument here could be stated sort of as push
| "type" ifs up, and "state" ifs down. If you're in rust you
| can do this more by representing state in the type
| (additionally helping to make incorrect states
| unrepresentable) and then storing your objects by type.
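|
| A rough Python sketch of "storing your objects by type", under
| the assumption that the "type" if is a dispatch on the concrete
| class. The comment is about Rust, where the compiler enforces
| this; Python only approximates it at runtime. `Circle`,
| `Square`, `group_by_type` and `total_area` are hypothetical
| names.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Circle:
    radius: float


@dataclass
class Square:
    side: float


def group_by_type(shapes):
    # The single "type" decision: bucket each object by its class, once, up front.
    groups = defaultdict(list)
    for shape in shapes:
        groups[type(shape)].append(shape)
    return groups


def total_area(shapes):
    groups = group_by_type(shapes)
    # Each loop below handles one type only, so it needs no per-item type check.
    circles = sum(math.pi * c.radius ** 2 for c in groups[Circle])
    squares = sum(s.side ** 2 for s in groups[Square])
    return circles + squares
```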
|
| I have a feeling this guide is written for high performance.
| While it's true that premature optimization is the devil, I
| think following this sort of advice can prevent you from
| suffering a death by a thousand cuts.
| metadat wrote:
| Yes, this advice has the scent of premature optimization with
| the tradeoff sacrifice being readability/traceability.
| demondemidi wrote:
| You also want to avoid branches in loops for faster code. But
| there is a tradeoff between readability and optimization that
| needs to be understood.
| nerdponx wrote:
| Pushing "ifs" up has the downside that the preconditions and
| postconditions are no longer directly visible in the definition
| of a function, and must then be checked at each call site. In
| bigger projects with multiple contributors, such functions could
| end up getting reused outside their intended context. The result
| is bugs.
|
| One solution is some kind of contract framework, but then you end
| up rewriting the conditions twice, once in the contract and once
| in the code. The same is true with dependent types.
|
| One idea I haven't seen before is the idea of tagging regions of
| code as being part of some particular context, and defining
| functions that can only be called from that context.
|
| Hypothetically in Python you could write:
|   @requires_context("VALIDATED_XY")
|   def do_something(x, y):
|       ...
|
|   @contextmanager
|   def validated_xy(x, y):
|       if abs(x) < 1 and abs(y) < 1:
|           with context("VALIDATED_XY"):
|               yield x, y
|       else:
|           raise ValueError("out of bounds")
|
|   with validated_xy(0.5, 0.5) as (x_safe, y_safe):
|       do_something(x_safe, y_safe)
|   # Error!
|   do_something(0.5, 0.5)
|
| The language runtime has no knowledge of what the context
| actually means, but with appropriate tools (and testing), we
| could design our programs to only establish the desired context
| when a certain condition is met.
|
| You could enforce this at the type level in a language like
| Haskell using something like the identity monad.
|
| But even if it's not enforced at the type level, it could be an
| interesting way to protect "unsafe" regions of code.
| bee_rider wrote:
| It seems like a decent general guideline.
|
| It has made me wonder, though--do there exist compilers nowadays
| that will turn if's inside inner loops into masked vector
| instructions somehow?
| p4bl0 wrote:
| I'm not convinced that such general rules can really apply to
| real-world code. I often see these kinds of rules as ill-placed
| dogmas, because sadly, even if this particular blog post starts by
| saying these are _rules of thumb_, they're not always taken this
| way by young programmers. A few weeks ago YouTube was constantly
| pushing to me a video called "I'm a _never_-nester", apparently
| of someone arguing that one should _never_ nest ifs, which is,
| well, kind of ridiculous. Anyway, back to the specific advice
| from this post, for example, take this code from the article:
|   // GOOD
|   if condition {
|     for walrus in walruses {
|       walrus.frobnicate()
|     }
|   } else {
|     for walrus in walruses {
|       walrus.transmogrify()
|     }
|   }
|
|   // BAD
|   for walrus in walruses {
|     if condition {
|       walrus.frobnicate()
|     } else {
|       walrus.transmogrify()
|     }
|   }
|
| In most cases where code is written in the "BAD"-labeled way, the
| `condition` part will depend on `walrus` and thus the `if` cannot
| actually be pushed up because if it can then it is quite obvious
| to anyone that you will be re-evaluating the same expression --
| the condition -- over and over in the loop, and programmers have
| a natural tendency to avoid that. But junior programmers or
| students reading dogmatic, wise-sounding rules may produce
| worse code by strictly following this kind of advice.
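|
| The distinction can be sketched in Python. The function names
| are hypothetical, and `upper`/`lower` merely stand in for
| `frobnicate`/`transmogrify`: the `if` can move above the loop
| only when the condition does not depend on the loop variable.

```python
def process_branch_inside(walruses, condition):
    # "BAD" shape: the same loop-invariant condition is tested on every iteration.
    out = []
    for walrus in walruses:
        if condition:
            out.append(walrus.upper())
        else:
            out.append(walrus.lower())
    return out


def process_branch_hoisted(walruses, condition):
    # "GOOD" shape: valid only because `condition` does not depend on `walrus`.
    if condition:
        return [walrus.upper() for walrus in walruses]
    return [walrus.lower() for walrus in walruses]


def process_per_item(walruses):
    # Here the condition depends on each walrus, so the `if` has to stay in the loop.
    return [w.upper() if w.startswith("t") else w.lower() for w in walruses]
```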
| hollerith wrote:
| Agree. Also, most of the time, the form that is _easier to
| modify_ is preferred, and even if `condition` does not
| _currently_ depend on `walrus`, it is preferable for it to be
| easy to make it depend on `walrus` in the future.
| torstenvl wrote:
| I wouldn't quite say this is _bad_ advice, but it isn't
| necessarily good advice either.
|
| I think it's somewhat telling that the chosen language is Rust.
| The strong type system prevents a lot of defensive programming
| required in other languages. A C programmer who doesn't check the
| validity of pointers passed to functions and subsequently causes
| a NULL dereference is not a C programmer I want on my team. So at
| least some `if`s should definitely be down (preferably in a way
| where errors bubble up well).
|
| I feel less strongly about `for`s, but the fact that array
| arguments decay to pointers in C also makes me think that
| iteration should be up, not down. I can reliably know the length
| of an array in its originating function, but not in a function to
| which I pass it as an argument.
| ryanjshaw wrote:
| I wrote some batch (list) oriented code for a static analyzer
| recently.
|
| It was great until I decided to change my AST representation from
| a tuple+discriminated union to a generic type with a corresponding
| interface i.e. the interface handled the first member of the
| tuple (graph data) and the generic type the second member (node
| data).
|
| This solved a bunch of annoying problems with the tuple
| representation but all list-oriented code broke because the
| functions operating on a list of generic types couldn't play
| nice with the functions operating on lists of interfaces.
|
| I ended up switching to scalar functions pipelined between list
| functions because the generic type was more convenient to me than
| the list-oriented code. The reality is you often need to play
| with all the options until you find the "right" one for your use
| case, experience level and style.
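|
| A loose Python sketch of "scalar functions pipelined between
| list functions". The original code was a static analyzer whose
| language and types are not shown, so every name here
| (`tokenize_all`, `count_tokens`, `total`, `pipeline`) is a
| made-up stand-in for illustration.

```python
from typing import List


def tokenize_all(sources: List[str]) -> List[List[str]]:
    # List-oriented stage: consumes and produces a whole batch.
    return [source.split() for source in sources]


def count_tokens(tokens: List[str]) -> int:
    # Scalar stage: sees one item only, so it does not care how the batch is stored.
    return len(tokens)


def total(counts: List[int]) -> int:
    # List-oriented stage again: reduces the batch to a single result.
    return sum(counts)


def pipeline(sources: List[str]) -> int:
    # The scalar function is applied item by item between the two list stages.
    return total([count_tokens(tokens) for tokens in tokenize_all(sources)])
```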
| smokel wrote:
| Without a proper context, this is fairly strange, and possibly
| even bad advice.
|
| For loops and if statements are both control flow operations, so
| some of the arguments in the article make little sense. The
| strongest argument seems to be about performance, but that should
| typically be one of the last concerns, especially for rule-of-
| thumb advice.
|
| Unfortunately, the author has managed to create a catchphrase out
| of it. Let's hope that doesn't catch on.
| actionfromafar wrote:
| try let's hope catch not on
___________________________________________________________________
(page generated 2023-11-15 23:00 UTC)