[HN Gopher] Four Kinds of Optimisation
___________________________________________________________________
Four Kinds of Optimisation
Author : ltratt
Score : 32 points
Date : 2023-11-14 12:04 UTC (10 hours ago)
(HTM) web link (tratt.net)
(TXT) w3m dump (tratt.net)
| gavinhoward wrote:
| This is a great article, and I have some notes.
|
| I initially disagreed with the author when he said, "we tend to
| overestimate how much we know about the software we're working
| on."
|
| I disagreed because that is _not_ my experience; I do know a lot
| about my software.
|
| However, then he said, "We overemphasise the parts of the system
| we've personally worked on, particularly those we've most
| recently worked on. We downplay other parts of the system,
| including the impact of dependencies (e.g. libraries)."
|
| Oh. Um, well, this is _also_ not my experience.
|
| I work on software alone, and I do not have any dependencies.
|
| So the author is right; I just have weird experience. :)
|
| I agree with the four kinds of optimization, especially since he
| focuses on optimization humans do. I think you could even
| shoehorn compiler optimizations into those four. For example,
| strength reduction could be considered a "better algorithm."
|
| His observations about the differences between _best_ , _average_
| , and _worst_ cases are the things programmers need to learn
| most. At some point, our code will _always_ hit bad cases, and we
| need to know if it will hit it _often_.
|
| I love his example of Timsort because I've analyzed and
| implemented it myself. It ended being more comments than code.
| (Of course, that includes two license header comments, so
| slightly less comments than code actually.)
|
| Timsort is pretty complicated, and Tim isn't the best at naming
| things, so it took a bit for me to understand what was going on.
|
| (The most prominent example is the galloping functions. He named
| them "gallop_left" and "gallop_right," but they both gallop left
| _and_ right. I renamed them to "gallop2Leftmost" and
| "gallop2Rightmost," which is _actually_ what they do.)
|
| The author says he biases to writing algorithms and adopting pre-
| written data structures. This is a good thing. However, after
| having written data structures (because I work in C, which
| doesn't have them), I would encourage every programmer to write
| some. Best, average, and worst cases are easier to learn on data
| structures than algorithms. Still use pre-written ones though;
| they are usually better optimized.
|
| The trick about reducing memory is _crucial_. I recently spent an
| entire refactor just reducing the size of some of the most-used
| structs. I cut the size of one struct in _half_. `pahole` is
| great for this.
|
| The part of PyPy is a great example too. A great book to read is
| _Is Parallel Programming Hard, And, If So, What Can You Do About
| It?_ The entire point of the book is to make you think, "Okay, I
| need some concurrency or parallelism; what is the easiest way to
| get it?" The PyPy example is that same thing with optimization;
| what's the easiest way to get the performance you need?
|
| Every time you run into an optimization problem, ask that
| question first, and it will get faster to answer every time. And
| the author's summary reinforces this.
|
| [1]:
| https://mirrors.edge.kernel.org/pub/linux/kernel/people/paul...
| dj_mc_merlin wrote:
| > However, then he said, "We overemphasise the parts of the
| system we've personally worked on, particularly those we've
| most recently worked on. We downplay other parts of the system,
| including the impact of dependencies (e.g. libraries)."
|
| > Oh. Um, well, this is also not my experience.
|
| > I work on software alone, and I do not have any dependencies.
|
| If you work in a company of 100+ engineers with multiple
| specialties it's impossible to keep up with everything. At <50
| engineers if you collaborate and/or switch teams it's possible
| to know almost everything, but that's the most I think the
| majority of us have the internal storage/RAM for.
| gavinhoward wrote:
| Yes, I said as much in my comment.
| llamajams wrote:
| There needs to be more articles like this, far too often the
| answers on SO and the like is canned and dissimissive; "you don't
| need to optimize because who cares youre writing a crud app
| anyway".
| armchairhacker wrote:
| The very first CS class taught at my undergraduate, where many
| students get exposed to programming the very first time and
| solve the simplest problems in functional style (e.g. list map,
| fold, fibonacci), introduces problems which require
| optimizations in the later part.
|
| No matter how powerful your computer and how small your
| computer program is, you have to optimize at least the
| exponential-time algorithms, and sometimes the high-degree-
| polynomial ones. When writing a real-time or production app you
| have to optimize many polynomial or even linear-time
| algorithms.
|
| Micro-optimizations like allocation and boxed numbers?
| Unnecessary unless you need performance, only apply a constant
| multiplier to your program's speed. But macro-optimizations
| which affect time complexity can't be ignored.
| nerpderp82 wrote:
| v = [ random.randrange(0, 100) for _ in range(1000) ]
| %timeit sorted(v) ## include a free extra allocation 71.2
| us +- 1.99 us per loop (mean +- std. dev. of 7 runs, 10,000 loops
| each)
| morelisp wrote:
| > it's difficult to measure the indirect impact of things like
| memory locality -- I have heard such factors blamed for poor
| performance much more often than I have seen such factors proven
| as responsible for poor performance. In general, I only look to
| such factors when I'm getting desperate.
|
| No thanks.
|
| Cache-related perf counters are easily measurable, but the
| impacts are so big you rarely need them.
| webnrrd2k wrote:
| Its a good post, but I'd add that there isvalso a different kind
| of optimization where you optimize by minimizing th overall
| cognitive load of any given code.
|
| For example, it's important to develop a feel for data
| structures. Often, if you choose the right data structure, the
| algorithm is fairly easy. In other words, when faced with a
| choice of complex algorithm vs complex data structures, it's
| generally easier to work with a complex data structure and use a
| simpler algorithm.
|
| Data is fairly static, and therefore easier to reason about, vs a
| complex and "dynamic" algorithm. Optimization for ease of reading
| and maintenance is also really important.
| hinkley wrote:
| A great example of this is Norvig's sudoku solver. Arranging
| the data the right way makes solving the problem a mere matter
| of correct bookkeeping .
___________________________________________________________________
(page generated 2023-11-14 23:00 UTC)