[HN Gopher] Four Kinds of Optimisation
       ___________________________________________________________________
        
       Four Kinds of Optimisation
        
       Author : ltratt
       Score  : 32 points
       Date   : 2023-11-14 12:04 UTC (10 hours ago)
        
 (HTM) web link (tratt.net)
 (TXT) w3m dump (tratt.net)
        
       | gavinhoward wrote:
       | This is a great article, and I have some notes.
       | 
       | I initially disagreed with the author when he said, "we tend to
       | overestimate how much we know about the software we're working
       | on."
       | 
       | I disagreed because that is _not_ my experience; I do know a lot
       | about my software.
       | 
       | However, then he said, "We overemphasise the parts of the system
       | we've personally worked on, particularly those we've most
       | recently worked on. We downplay other parts of the system,
       | including the impact of dependencies (e.g. libraries)."
       | 
       | Oh. Um, well, this is _also_ not my experience.
       | 
       | I work on software alone, and I do not have any dependencies.
       | 
       | So the author is right; I just have weird experience. :)
       | 
       | I agree with the four kinds of optimization, especially since he
       | focuses on optimization humans do. I think you could even
       | shoehorn compiler optimizations into those four. For example,
       | strength reduction could be considered a "better algorithm."
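        | 
        | (A minimal sketch of that idea, with hypothetical function
        | names: doing strength reduction by hand means trading a
        | multiply per iteration for an add.)
        | 
        |   # strength reduction by hand: replace i * step with a
        |   # running sum; compilers do this to loops automatically
        |   def scaled_mul(n, step):
        |       return [i * step for i in range(n)]
        | 
        |   def scaled_add(n, step):
        |       out, acc = [], 0
        |       for _ in range(n):
        |           out.append(acc)
        |           acc += step
        |       return out
        | 
        |   assert scaled_mul(10, 3) == scaled_add(10, 3)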
       | 
        | His observations about the differences between _best_, _average_,
        | and _worst_ cases are the things programmers need to learn most.
        | At some point, our code will _always_ hit bad cases, and we need
        | to know whether it will hit them _often_.
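        | 
        | (A toy way to see the gap: a naive quicksort that always picks
        | the first element as pivot is fine on random input but
        | quadratic on already-sorted input.)
        | 
        |   import random, sys, timeit
        | 
        |   sys.setrecursionlimit(10000)  # sorted input recurses ~n deep
        | 
        |   def quicksort(xs):
        |       # naive pivot choice: first element
        |       if len(xs) <= 1:
        |           return xs
        |       pivot, rest = xs[0], xs[1:]
        |       return (quicksort([x for x in rest if x < pivot]) + [pivot]
        |               + quicksort([x for x in rest if x >= pivot]))
        | 
        |   n = 2000
        |   rand = [random.random() for _ in range(n)]
        |   print(timeit.timeit(lambda: quicksort(rand), number=5))
        |   print(timeit.timeit(lambda: quicksort(sorted(rand)), number=5))
        |   # the second (worst-case) timing is far slower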
       | 
       | I love his example of Timsort because I've analyzed and
        | implemented it myself. It ended up being more comments than code.
        | (Of course, that includes two license header comments, so
        | slightly fewer comments than code, actually.)
       | 
       | Timsort is pretty complicated, and Tim isn't the best at naming
       | things, so it took a bit for me to understand what was going on.
       | 
        | (The most prominent example is the galloping functions: he named
        | them "gallop_left" and "gallop_right," but they both gallop left
        | _and_ right. I renamed them "gallop2Leftmost" and
        | "gallop2Rightmost," which is _actually_ what they do.)
       | 
        | The author says he biases towards writing algorithms and
        | adopting pre-written data structures. This is a good thing.
        | However, after
       | having written data structures (because I work in C, which
       | doesn't have them), I would encourage every programmer to write
       | some. Best, average, and worst cases are easier to learn on data
       | structures than algorithms. Still use pre-written ones though;
       | they are usually better optimized.
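        | 
        | (For instance, a toy growable array makes amortized analysis
        | concrete: append is O(n) in the worst case, when it resizes,
        | but O(1) amortized because the capacity doubles. A sketch:)
        | 
        |   class DynArray:
        |       # worst-case O(n) append (on a resize), amortized O(1)
        |       def __init__(self):
        |           self.cap, self.n = 1, 0
        |           self.buf = [None]
        | 
        |       def append(self, x):
        |           if self.n == self.cap:  # full: double and copy
        |               self.cap *= 2
        |               self.buf.extend([None] * (self.cap - self.n))
        |           self.buf[self.n] = x
        |           self.n += 1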
       | 
       | The trick about reducing memory is _crucial_. I recently spent an
       | entire refactor just reducing the size of some of the most-used
       | structs. I cut the size of one struct in _half_. `pahole` is
       | great for this.
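        | 
        | (You can demo the same field-ordering effect from Python with
        | ctypes; the sizes below assume a typical x86-64 ABI.)
        | 
        |   import ctypes
        | 
        |   class Padded(ctypes.Structure):  # char, int, char
        |       _fields_ = [("a", ctypes.c_char), ("b", ctypes.c_int),
        |                   ("c", ctypes.c_char)]
        | 
        |   class Reordered(ctypes.Structure):  # int, char, char
        |       _fields_ = [("b", ctypes.c_int), ("a", ctypes.c_char),
        |                   ("c", ctypes.c_char)]
        | 
        |   print(ctypes.sizeof(Padded))     # typically 12
        |   print(ctypes.sizeof(Reordered))  # typically 8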
       | 
        | The part about PyPy is a great example too. A great book to read
        | is _Is Parallel Programming Hard, And, If So, What Can You Do
        | About It?_ [1] The entire point of that book is to make you
        | think, "Okay, I need some concurrency or parallelism; what is
        | the easiest way to get it?" The PyPy example is the same
        | question applied to optimization: what's the easiest way to get
        | the performance you need?
       | 
        | Every time you run into an optimization problem, ask that
        | question first; it will get faster to answer every time. The
        | author's summary reinforces this.
       | 
       | [1]:
       | https://mirrors.edge.kernel.org/pub/linux/kernel/people/paul...
        
         | dj_mc_merlin wrote:
         | > However, then he said, "We overemphasise the parts of the
         | system we've personally worked on, particularly those we've
         | most recently worked on. We downplay other parts of the system,
         | including the impact of dependencies (e.g. libraries)."
         | 
         | > Oh. Um, well, this is also not my experience.
         | 
         | > I work on software alone, and I do not have any dependencies.
         | 
          | If you work in a company of 100+ engineers with multiple
          | specialties, it's impossible to keep up with everything. At
          | <50 engineers, if you collaborate and/or switch teams, it's
          | possible to know almost everything, but that's the most I
          | think the majority of us have the internal storage/RAM for.
        
           | gavinhoward wrote:
           | Yes, I said as much in my comment.
        
       | llamajams wrote:
        | There need to be more articles like this; far too often the
        | answers on SO and the like are canned and dismissive: "you don't
        | need to optimize because who cares, you're writing a CRUD app
        | anyway."
        
         | armchairhacker wrote:
          | The very first CS class at my undergraduate institution, where
          | many students are exposed to programming for the very first
          | time and solve the simplest problems in functional style (e.g.
          | list map, fold, Fibonacci), introduces problems that require
          | optimization in its later part.
         | 
          | No matter how powerful your computer or how small your
          | program, you have to optimize at least the exponential-time
          | algorithms, and sometimes the high-degree-polynomial ones.
          | When writing a real-time or production app, you have to
          | optimize many polynomial or even linear-time algorithms.
         | 
          | Micro-optimizations like allocation and boxed numbers?
          | Unnecessary unless you need the performance; they only apply
          | a constant multiplier to your program's speed. But macro-
          | optimizations that affect time complexity can't be ignored.
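          | 
          | (The classic first encounter, as a minimal sketch: naive
          | recursive Fibonacci is exponential, and one line of
          | memoization makes it linear.)
          | 
          |   from functools import lru_cache
          | 
          |   def fib_naive(n):
          |       # exponential: re-solves the same subproblems
          |       return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)
          | 
          |   @lru_cache(maxsize=None)
          |   def fib_memo(n):
          |       # linear: each n is computed exactly once
          |       return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)
          | 
          |   print(fib_naive(32))  # already noticeably slow
          |   print(fib_memo(300))  # instant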
        
       | nerpderp82 wrote:
        |   import random
        | 
        |   v = [random.randrange(0, 100) for _ in range(1000)]
        |   %timeit sorted(v)  # include a free extra allocation
        |   71.2 us +- 1.99 us per loop (mean +- std. dev. of 7 runs,
        |   10,000 loops each)
        
       | morelisp wrote:
       | > it's difficult to measure the indirect impact of things like
       | memory locality -- I have heard such factors blamed for poor
       | performance much more often than I have seen such factors proven
       | as responsible for poor performance. In general, I only look to
       | such factors when I'm getting desperate.
       | 
       | No thanks.
       | 
       | Cache-related perf counters are easily measurable, but the
       | impacts are so big you rarely need them.
        
       | webnrrd2k wrote:
       | Its a good post, but I'd add that there isvalso a different kind
       | of optimization where you optimize by minimizing th overall
       | cognitive load of any given code.
       | 
        | For example, it's important to develop a feel for data
        | structures. Often, if you choose the right data structure, the
        | algorithm is fairly easy. In other words, when faced with a
        | choice between a complex algorithm and a complex data structure,
        | it's generally easier to work with the complex data structure
        | and use a simpler algorithm.
       | 
        | Data is fairly static and therefore easier to reason about than
        | a complex, "dynamic" algorithm. Optimizing for ease of reading
        | and maintenance is also really important.
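        | 
        | (A small illustration of my point, as a sketch: pick a
        | hash-based counter and the word-frequency "algorithm" all but
        | disappears.)
        | 
        |   from collections import Counter
        | 
        |   def top_words(text, k):
        |       # the right structure leaves almost no algorithm to write
        |       return Counter(text.lower().split()).most_common(k)
        | 
        |   print(top_words("the cat sat on the mat the end", 2))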
        
         | hinkley wrote:
          | A great example of this is Norvig's sudoku solver. Arranging
          | the data the right way makes solving the problem a mere matter
          | of correct bookkeeping.
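          | 
          | (Concretely -- sketching from memory of the essay -- the
          | layout is a dict from square name to the string of digits
          | still possible there; solving is constraint propagation over
          | that dict.)
          | 
          |   rows, cols = "ABCDEFGHI", "123456789"
          |   squares = [r + c for r in rows for c in cols]
          |   # every square starts with all nine digits as candidates
          |   values = {s: cols for s in squares}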
        
       ___________________________________________________________________
       (page generated 2023-11-14 23:00 UTC)