hngopher.com

       [HN Gopher] Push ifs up and fors down
       ___________________________________________________________________
        
       Push ifs up and fors down
        
       Author : celeritascelery
       Score  : 60 points
       Date   : 2023-11-15 21:41 UTC (1 hour ago)
        
 (HTM) web link (matklad.github.io)
 (TXT) w3m dump (matklad.github.io)
        
       | Waterluvian wrote:
       | This kind of rule of thumb usually contains some mote of wisdom,
       | but generally just creates the kind of thing I have to
       | de-dogmatize from newer programmers.
       | 
       | There's just always going to be a ton of cases where trying to
       | adhere to this too rigidly is worse. And "just know when not to
       | listen to this advice" is basically the core complexity here.
        
       | flashback2199 wrote:
       | Don't compilers, cpu branch prediction, etc all fix the
       | performance issues behind the scenes for the most part?
        
         | Thaxll wrote:
         | Well compilers are good and dumb at the same time.
        
         | bee_rider wrote:
         | Some of the moves seemed to change what an individual function
         | might do. For example they suggested pulling an if from a
         | function to the calling function.
         | 
         | Could the compiler figure it out? My gut says maybe; maybe if
         | it started by inlining the callee? But inlining happens based
         | on some heuristics usually, this seems like an unreliable
         | strategy if it would even work at all.
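
To make the move under discussion concrete, here is a minimal Rust sketch of an `if` living in the callee versus in the caller. `Walrus`, `frobnicate_checked`, and `caller` are hypothetical names chosen for illustration; whether a given compiler turns one form into the other depends on inlining decisions this sketch says nothing about.

```rust
/// Hypothetical type standing in for the article's walrus.
struct Walrus {
    name: String,
}

/// "If pushed down": the callee itself checks whether there is a walrus.
fn frobnicate_checked(walrus: Option<&Walrus>) -> Option<String> {
    let walrus = walrus?; // early return when there is nothing to do
    Some(frobnicate(walrus))
}

/// "If pushed up": the signature demands a walrus, so the caller decides.
fn frobnicate(walrus: &Walrus) -> String {
    format!("frobnicated {}", walrus.name)
}

/// A caller that owns the branch and handles the missing case explicitly.
fn caller(maybe_walrus: Option<&Walrus>) -> String {
    match maybe_walrus {
        Some(walrus) => frobnicate(walrus),
        None => String::from("no walrus"),
    }
}
```

In the second form the function's behavior changes (it can no longer be called without a walrus), which is why this is a source-level design choice rather than something an optimizer can be relied on to do.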
        
         | malux85 wrote:
         | No, compilers (correctly) prefer correctness over speed. They
         | can optimise "obvious" things, but they cannot account for
         | domain knowledge, for inefficiencies that are far apart in the
         | code, or for anything that "might" alter some global state.
         | They can only make optimisations where they can be very sure
         | there are no side effects, because they have to err on the
         | side of caution.
         |
         | They will only give you micro-optimisations, which can
         | sometimes add up to a speedup, but the burden of holistic
         | program efficiency is still very much on the programmer.
         | 
         | If you're emptying the swimming pool using only a glass, the
         | compiler will optimise the glass size and your arm movements,
         | but it won't ask whether you're emptying the correct pool or
         | whether you should be using a pump instead. A correct answer to
         | the latter two could be 100,000 times more efficient than the
         | former two, which are the only ones a compiler can address.
        
         | jchw wrote:
         | The short answer is absolutely not, even when you are sure that
         | it should. Even something as simple as a naive byteswap
         | function might wind up generating surprisingly suboptimal code
         | depending on the compiler. If you really want to be sure,
         | you're just going to have to check. (And if you want to check,
         | a good tool is, of course, Compiler Explorer.)
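
For readers who want something to paste into Compiler Explorer, here is a sketch of what a "naive byteswap" might look like in Rust, next to the standard library method. The function names are made up for this example, and no claim is made about what code any particular compiler emits for either version; that is exactly the thing to check.

```rust
/// Naive byte swap: move each byte into its mirrored position by hand.
fn naive_byteswap(x: u32) -> u32 {
    ((x & 0x0000_00FF) << 24)
        | ((x & 0x0000_FF00) << 8)
        | ((x & 0x00FF_0000) >> 8)
        | ((x & 0xFF00_0000) >> 24)
}

/// Same result via the standard library's `u32::swap_bytes`.
fn builtin_byteswap(x: u32) -> u32 {
    x.swap_bytes()
}
```

Both functions return the same value for every input, so any difference in the generated assembly is purely a matter of what the optimizer recognizes.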
        
         | roywashere wrote:
         | Function call overhead can be a real issue in languages like
         | Python and JavaScript. But you can, and should, measure when
         | in doubt!
        
       | jmull wrote:
       | The rule of thumb is put ifs and fors where they belong -- no
       | higher or lower. And if you're not sure, think about it a
       | little more.
       | 
       | I don't think these rules are really that useful. I think this is
       | a better variation: as you write ifs, fors and other control flow
       | logic, consider why you're putting it where you are and whether
       | you should move it to a higher or lower level. You want to think
       | about the levels in terms of the responsibility each has. If you
       | can't think of what the demarcations of responsibility are, or
       | they are tangled, then think about it some more and see if you
       | can clarify, simplify, or organize it better.
       | 
       | OK, that's not a simple rule of thumb, but at least you'll be
       | writing code with some thought behind it.
        
         | benatkin wrote:
         | If you know where they belong, this post isn't for you.
        
         | crazygringo wrote:
         | Exactly -- write code that matches clear, intuitive, logical,
         | coherent organization.
         | 
         | Because easy counterexamples to both of these rules are:
         | 
         | 1) I'd much rather have a function check a condition in a
         | single place, than have 20 places in the code which check the
         | same condition before calling it -- the whole _point_ of
         | functions is to encapsulate repeated code to reduce bugs
         | 
         | 2) I'd often much rather leave the loop to the calling code
         | rather than put it inside a function, because in different
         | parts of the code I'll want to loop over the items only to a
         | certain point, or show a progress bar, or start from the
         | middle, or whatever
         | 
         | Both of the "rules of thumb" in the article seem to be
         | motivated by increasing performance by removing the overhead
         | associated with calling a function. But one of the _top_ "rules
         | of thumb" in coding is to _not prematurely optimize_.
         | 
         | If you need to squeeze every bit of speed out of your code,
         | then these might be good techniques to apply where needed (it
         | especially depends on the language and interpreted vs.
         | compiled). But these are _not at all_ rules of thumb in
         | general.
        
           | pests wrote:
           | I think a key thing software engineers have to deal with,
           | as opposed to physical engineers, is an ever-changing set
           | of requirements.
           | 
           | Because of this we optimize for different trade-offs in our
           | codebase. Some projects need it, and you see them dropping
           | down to handwritten SIMD assembly for example.
           | 
           | But for most of us the major concern is making changes,
           | updates, and new features, and being able to come back and
           | make changes again later for those ever-changing
           | requirements.
           | 
           | A bridge engineer is never going to build abstractions and
           | redundancies on a bridge "just in case gravity changes in the
           | future". They "drop down to assembly" for this and make
           | assumptions that _would_ cause major problems later if things
           | did change (they won't).
        
           | foota wrote:
           | I think the argument here could be stated sort of as push
           | "type" ifs up, and "state" ifs down. If you're in Rust you
           | can do this more by representing state in the type
           | (additionally helping to make incorrect states
           | unrepresentable) and then storing your objects by type.
           | 
           | I have a feeling this guide is written for high performance.
           | While it's true that premature optimization is the devil, I
           | think following this sort of advice can prevent you from
           | suffering death by a thousand cuts.
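
One possible reading of "push type ifs up and store your objects by type", sketched in Rust. The shapes and function names here are hypothetical and do not come from the thread or the article: the `match` on the kind of object happens once at ingestion, and later loops run over homogeneous collections with no per-element type check.

```rust
/// Hypothetical mixed input: the "type" of each item is a runtime tag.
enum Shape {
    Circle { radius: f64 },
    Square { side: f64 },
}

/// Hypothetical per-type representations, stored separately.
struct Circle {
    radius: f64,
}

struct Square {
    side: f64,
}

/// The "type" branch happens once, when the data is ingested.
fn split_by_type(shapes: Vec<Shape>) -> (Vec<Circle>, Vec<Square>) {
    let mut circles = Vec::new();
    let mut squares = Vec::new();
    for shape in shapes {
        match shape {
            Shape::Circle { radius } => circles.push(Circle { radius }),
            Shape::Square { side } => squares.push(Square { side }),
        }
    }
    (circles, squares)
}

/// Later loops run over homogeneous slices with no per-item type check.
fn total_area(circles: &[Circle], squares: &[Square]) -> f64 {
    let circle_area: f64 = circles
        .iter()
        .map(|c| std::f64::consts::PI * c.radius * c.radius)
        .sum();
    let square_area: f64 = squares.iter().map(|s| s.side * s.side).sum();
    circle_area + square_area
}
```

The cost is that ordering between circles and squares is lost, so this layout only fits code that does not care about the original interleaving.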
        
         | metadat wrote:
         | Yes, this advice has the scent of premature optimization with
         | the tradeoff sacrifice being readability/traceability.
        
         | demondemidi wrote:
         | You also want to avoid branches in loops for faster code. But
         | there is a tradeoff between readability and optimization that
         | needs to be understood.
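
A small Rust sketch of the trade-off, using made-up function names. Both functions compute the same result; the first tests a loop-invariant flag on every iteration, the second decides once and then runs a straight-line loop. Whether the second is measurably faster depends on the compiler and the data, so this is an illustration of the shape of the code, not a benchmark.

```rust
/// Branch inside the loop: `negate` is re-tested for every element.
fn sum_branch_inside(values: &[i64], negate: bool) -> i64 {
    let mut total = 0;
    for &v in values {
        if negate {
            total -= v;
        } else {
            total += v;
        }
    }
    total
}

/// Branch hoisted: the decision is made once, each loop is branch-free.
fn sum_branch_hoisted(values: &[i64], negate: bool) -> i64 {
    if negate {
        values.iter().map(|&v| -v).sum()
    } else {
        values.iter().sum()
    }
}
```

The hoisted version duplicates the loop, which is the readability cost the comment refers to.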
        
       | nerdponx wrote:
       | Pushing "ifs" up has the downside that the preconditions and
       | postconditions are no longer directly visible in the definition
       | of a function, and must then be checked at each call site. In
       | bigger projects with multiple contributors, such functions could
       | end up getting reused outside their intended context. The result
       | is bugs.
       | 
       | One solution is some kind of contract framework, but then you end
       | up writing the conditions twice, once in the contract and once
       | in the code. The same is true with dependent types.
       | 
       | One idea I haven't seen before is the idea of tagging regions of
       | code as being part of some particular context, and defining
       | functions that can only be called from that context.
       | 
       | Hypothetically in Python you could write:
       |     from contextlib import contextmanager
       |
       |     # requires_context and context are hypothetical helpers
       |     @requires_context("VALIDATED_XY")
       |     def do_something(x, y):
       |         ...
       |
       |     @contextmanager
       |     def validated_xy(x, y):
       |         if abs(x) < 1 and abs(y) < 1:
       |             with context("VALIDATED_XY"):
       |                 yield x, y
       |         else:
       |             raise ValueError("out of bounds")
       |
       |     with validated_xy(0.5, 0.5) as (x_safe, y_safe):
       |         do_something(x_safe, y_safe)
       |
       |     # Error!
       |     do_something(0.5, 0.5)
       | 
       | The language runtime has no knowledge of what the context
       | actually means, but with appropriate tools (and testing), we
       | could design our programs to only establish the desired context
       | when a certain condition is met.
       | 
       | You could enforce this at the type level in a language like
       | Haskell using something like the identity monad.
       | 
       | But even if it's not enforced at the type level, it could be an
       | interesting way to protect "unsafe" regions of code.
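
As a rough illustration of the type-level variant mentioned above, here is a Rust sketch using a wrapper type with private fields. This is a different mechanism from the tagged-context idea in the comment, and `ValidatedXY` and its methods are hypothetical names: the bounds check is written once in the constructor, and `do_something` can only be called with a value that passed it.

```rust
mod validated {
    /// A point known to satisfy |x| < 1 and |y| < 1.
    /// The fields are private, so `new` is the only way to build one.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct ValidatedXY {
        x: f64,
        y: f64,
    }

    impl ValidatedXY {
        /// The single place where the precondition is checked.
        pub fn new(x: f64, y: f64) -> Result<Self, String> {
            if x.abs() < 1.0 && y.abs() < 1.0 {
                Ok(Self { x, y })
            } else {
                Err(format!("out of bounds: ({}, {})", x, y))
            }
        }

        pub fn x(&self) -> f64 {
            self.x
        }

        pub fn y(&self) -> f64 {
            self.y
        }
    }
}

use validated::ValidatedXY;

/// Accepts only validated input; passing raw floats does not compile.
fn do_something(p: ValidatedXY) -> f64 {
    p.x() + p.y()
}
```

A call such as `do_something(ValidatedXY::new(0.5, 0.5)?)` keeps the check at the boundary, while the precondition stays visible in the function's signature.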
        
       | bee_rider wrote:
       | It seems like a decent general guideline.
       | 
       | It has made me wonder, though--do there exist compilers nowadays
       | that will turn if's inside inner loops into masked vector
       | instructions somehow?
        
       | p4bl0 wrote:
       | I'm not convinced that such general rules can really apply to
       | real-world code. I often see these kinds of rules as ill-placed
       | dogmas, because sadly, even if this particular blog post starts
       | by saying these are _rules of thumb_, they're not always taken
       | that way by young programmers. A few weeks ago YouTube was
       | constantly pushing to me a video called "I'm a _never_-nester",
       | apparently of someone arguing that one should _never_ nest ifs,
       | which is, well, kind of ridiculous. Anyway, back to the specific
       | advice from this post. For example, take this code from the
       | article:
       |     // GOOD
       |     if condition {
       |       for walrus in walruses {
       |         walrus.frobnicate()
       |       }
       |     } else {
       |       for walrus in walruses {
       |         walrus.transmogrify()
       |       }
       |     }
       |
       |     // BAD
       |     for walrus in walruses {
       |       if condition {
       |         walrus.frobnicate()
       |       } else {
       |         walrus.transmogrify()
       |       }
       |     }
       | 
       | In most cases where code is written in the "BAD"-labeled way, the
       | `condition` part will depend on `walrus`, and thus the `if` cannot
       | actually be pushed up. If it could be, it would be quite obvious
       | to anyone that the same expression -- the condition -- is being
       | re-evaluated over and over in the loop, and programmers have
       | a natural tendency to avoid that. But junior programmers or
       | students reading dogmatic, wise-sounding rules may produce
       | worse code in order to strictly follow this kind of advice.
        
         | hollerith wrote:
         | Agree. Also, most of the time, the form that is _easier to
         | modify_ is preferred, and even if `condition` does not
         | _currently_ depend on `walrus`, it is preferable for it to be
         | easy to make it depend on `walrus` in the future.
        
       | torstenvl wrote:
       | I wouldn't quite say this is _bad_ advice, but it isn't
       | necessarily good advice either.
       | 
       | I think it's somewhat telling that the chosen language is Rust.
       | The strong type system prevents a lot of defensive programming
       | required in other languages. A C programmer who doesn't check the
       | validity of pointers passed to functions and subsequently causes
       | a NULL dereference is not a C programmer I want on my team. So at
       | least some `if`s should definitely be down (preferably in a way
       | where errors bubble up well).
       | 
       | I feel less strongly about `for`s, but the fact that array
       | arguments decay to pointers in C also makes me think that
       | iteration should be up, not down. I can reliably know the length
       | of an array in its originating function, but not in a function to
       | which I pass it as an argument.
        
       | ryanjshaw wrote:
       | I wrote some batch (list) oriented code for a static analyzer
       | recently.
       | 
       | It was great until I decided to change my AST representation from
       | a tuple+discriminated union to a generic type with a corresponding
       | interface i.e. the interface handled the first member of the
       | tuple (graph data) and the generic type the second member (node
       | data).
       | 
       | This solved a bunch of annoying problems with the tuple
       | representation but all list-oriented code broke because the
       | functions operating on a list of generic types couldn't play
       | nice with the functions operating on lists of interfaces.
       | 
       | I ended up switching to scalar functions pipelined between list
       | functions because the generic type was more convenient to me than
       | the list-oriented code. The reality is you often need to play
       | with all the options until you find the "right" one for your use
       | case, experience level and style.
        
       | smokel wrote:
       | Without a proper context, this is fairly strange, and possibly
       | even bad advice.
       | 
       | For loops and if statements are both control flow operations, so
       | some of the arguments in the article make little sense. The
       | strongest argument seems to be about performance, but that should
       | typically be one of the last concerns, especially for
       | rule-of-thumb advice.
       | 
       | Unfortunately, the author has managed to create a catchphrase out
       | of it. Let's hope that doesn't catch on.
        
         | actionfromafar wrote:
         |     try
         |         let's hope
         |     catch
         |         not on
        
       ___________________________________________________________________
       (page generated 2023-11-15 23:00 UTC)