[HN Gopher] Goodhart's Law
       ___________________________________________________________________
        
       Goodhart's Law
        
       Author : rfreytag
       Score  : 132 points
       Date   : 2021-09-17 13:42 UTC (9 hours ago)
        
 (HTM) web link (en.wikipedia.org)
 (TXT) w3m dump (en.wikipedia.org)
        
       | baron_harkonnen wrote:
       | I once mentioned Goodhart's Law to a data scientist at a company
       | and they immediately rejected it based on the unironic assertion
       | that
       | 
       | "that would mean that KPIs shouldn't be the sole measure of our
       | performance that that doesn't make sense!"
       | 
       | My experience in the field has been that an astounding number of
       | products have been destroyed and users harmed by failing to heed
       | Goodhart's Law.
        
         | dang wrote:
         | Is Goodhart's Law a sort of upper bound on the usefulness of
         | data for decision-making in general?
        
           | baron_harkonnen wrote:
           | Rather than implying a limit to the usefulness of data, I
           | find it speaks more to the folly of substituting data for
           | critical thinking.
           | 
            | You can reduce a fever by treating the underlying infection,
            | soaking in ice water, or taking acetaminophen. No good doctor
            | would judge a patient's health solely by whether a single
            | metric, namely body temperature, could be moved into an
            | acceptable range. That doesn't mean temperature isn't
            | extremely valuable data, essential to decision making even;
            | it just cannot be a substitute for understanding and solving
            | the real problem.
           | 
            | I once knew of a SaaS company that had perpetually growing
            | MRR (Monthly Recurring Revenue). Great, right? Except churn
            | was also growing: the increase in MRR was achieved by
            | upselling a perpetually shrinking group of core customers (a
            | toy numeric sketch of this dynamic follows below). The core
            | KPI of this company was MRR, and, unsurprisingly, the company
            | does not exist anymore. Again, here is a case where all that
            | other data (churn, upselling) is very useful, as is the KPI.
            | But the key to success or failure is whether you are willing
            | to expend the effort to understand the problem, or just chase
            | a KPI.
           | 
            | KPIs are seductive because they make managing a team's
            | performance seem much easier: just get this number higher and
            | you're doing well, let it slip and you're doing badly. But
           | that's like playing a game of chess where each piece is
           | controlled by a different person, and that person is judged
           | solely on how many times they can get the king in check.
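            | 
            | A toy numeric sketch of that MRR dynamic, with made-up churn
            | and upsell rates rather than the actual company's numbers:
            | with 8% monthly churn and a 12% upsell on the survivors, the
            | headline KPI keeps climbing while the customer base erodes.
            | 
            |     # All numbers hypothetical: 8% monthly churn, 12% upsell.
            |     customers = 1000                 # paying customers
            |     arpu = 100.0                     # avg revenue per user
            |     churn, upsell = 0.08, 0.12
            |     for month in range(1, 13):
            |         customers = int(customers * (1 - churn))
            |         arpu *= 1 + upsell           # survivors get upsold
            |         print(f"month {month:2d}: "
            |               f"customers={customers:4d} "
            |               f"MRR=${customers * arpu:,.0f}")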
        
             | cratermoon wrote:
             | > No good doctor would judge a patient's health solely
             | because a single metric, namely body temperature
             | 
              | That's a good example because, as Strathern's formulation
              | notes, the problem lies in making the metric the target. It
              | would be folly to think that reducing a patient's body
              | temperature to the normal range is sufficient for curing
              | the illness. GE's Jack Welch famously focused solely on
              | stock performance as a measure of success. It worked: by
              | that measure GE was wildly successful. By almost any other
              | measure, Welch destroyed GE.
             | https://www.bnnbloomberg.ca/jack-welch-inflicted-great-
             | damag...
        
         | some_furry wrote:
         | I wonder if measuring managers' understanding of Goodhart's Law
         | would result in better management.
         | 
         | /s
        
         | dhosek wrote:
         | Are you sure it was unironic? Because I sure can't imagine
         | anyone saying that unironically.
        
           | marcosdumay wrote:
           | I've heard plenty of "what else do I have to work with?",
            | which has about the same meaning.
        
             | rhizome wrote:
             | The politician's syllogism comes to mind: "Something must
             | be done, this is something, so we must do this."
             | 
             | https://en.wikipedia.org/wiki/Politician%27s_syllogism
        
             | gipp wrote:
              | That at least acknowledges that it's an unsatisfactory
              | situation; OP's conversation didn't even have that level of
              | awareness.
        
           | dboreham wrote:
           | Presumably someone, somewhere, thinks KPIs are a good idea.
           | 
           | Edit: but that person is unlikely to be subject to KPIs
           | themselves.
        
             | AnimalMuppet wrote:
             | Re your edit: Not necessarily. They may be a winner under
             | the KPI regime, and may feel that they are less likely to
             | be so under a saner regime.
             | 
              | For example, if I'm a manager that can make my people hit
              | their KPIs, and _my_ KPIs are about getting my people to
              | hit theirs, then I'm subject to KPIs, _and I like it_. It's
              | easier than making my people succeed at what really
              | matters, and it makes me look good.
        
             | hpoe wrote:
              | To push against this point: I prefer KPIs, or something
              | objective that I can be measured against. That doesn't mean
              | I like bad KPIs, but the fact of the matter is there are
              | always going to be KPIs; the only question is how explicit
              | or implicit they are.
              | 
              | When KPIs are explicit, everyone knows what they are and
              | can modify their behavior to optimize for them. When all
              | measurement goes away, the new KPI is the arbitrary one
              | held in the decision maker's head. Instead of an explicit
              | bar that can be objectively used to make decisions, the
              | entire system falls apart into politics and emphasizing
              | appearances over work, because the only thing that matters
              | with implicit KPIs is what everyone else thinks of you,
              | which is much easier to manipulate than the amount of cash
              | you brought in.
        
       | cs702 wrote:
       | The same phenomenon has many different names. From the OP:
       | 
       | > See also:
       | 
       | > Campbell's law - "The more any quantitative social indicator is
       | used for social decision-making, the more subject it will be to
       | corruption pressures"
       | https://en.wikipedia.org/wiki/Campbell%27s_law
       | 
       | > Cobra effect - when incentives designed to solve a problem end
       | up rewarding people for making it worse
       | https://en.wikipedia.org/wiki/Cobra_effect
       | 
       | > Gaming the system
       | https://en.wikipedia.org/wiki/Gaming_the_system
       | 
       | > Lucas critique - it is naive to try to predict the effects of a
       | change in economic policy entirely on the basis of relationships
       | observed in historical data
       | https://en.wikipedia.org/wiki/Lucas_critique
       | 
       | > McNamara fallacy - involves making a decision based solely on
       | quantitative observations (or metrics) and ignoring all others
       | https://en.wikipedia.org/wiki/McNamara_fallacy
       | 
       | > Overfitting https://en.wikipedia.org/wiki/Overfitting
       | 
       | > Reflexivity (social theory)
       | https://en.wikipedia.org/wiki/Reflexivity_(social_theory)
       | 
       | > Reification (fallacy)
       | https://en.wikipedia.org/wiki/Reification_(fallacy)
       | 
       | > San Francisco Declaration on Research Assessment - 2012
       | manifesto against using the journal impact factor to assess a
       | scientist's work
       | https://en.wikipedia.org/wiki/San_Francisco_Declaration_on_R...
       | 
       | > Volkswagen emissions scandal - 2010s diesel emissions scandal
       | involving Volkswagen
       | https://en.wikipedia.org/wiki/Volkswagen_emissions_scandal
       | 
       | Source: https://en.wikipedia.org/wiki/Goodhart%27s_law#See_also
        
       | Beldin wrote:
        | Inspired by a scandalous fraud case in scientific publishing, I
       | wrote a paper applying Goodhart's Law to scientific publishing
       | ("A Much-needed Security Perspective on Publication Metrics",
       | published at the Security Protocols Workshop 2017). That was a
       | really fun paper to write! Basically, how can you systematically
       | start cheating at publishing - and how could you catch that?
       | 
       | The most fun was challenging the audience - security researchers
       | all - to think even more outside the box than usual for them.
       | 
       | I'm still (slowly) forging ahead on ideas spawned by this paper.
       | Bringing the ideas of catching crooks to reality was not as
       | straightforward as hoped. Then again, when has any project ever
       | gone as planned?
        
       | MattGaiser wrote:
       | The ignoring of this law in software development is referred to
       | as Scrum.
        
       | azhenley wrote:
       | I wrote a blog post a few months ago about Goodhart's Law in my
       | life, titled "Gamification, life, and the pursuit of a gold
       | badge".
       | 
       | https://web.eecs.utk.edu/~azh/blog/gamification.html
       | 
       | The Tyranny of Metrics is a good book that covers real-world
       | cases of metrics gone wrong.
        
       | paulpauper wrote:
       | This is why efforts at raising school test scores have not
       | improved actual achievement
        
         | wyager wrote:
         | And why sending everyone to college hasn't achieved anything
         | good.
        
         | dnautics wrote:
         | This is why efforts at alleviating poverty have not improved
         | actual poverty...
        
           | umvi wrote:
           | Poverty is a special case though because it's relative. You
           | could be a millionaire with a yacht on earth but be below the
           | poverty line if the middle and upper classes live in space
           | stations or on other planets with higher standards of living.
           | 
           | An impoverished person in the US is rich compared to an
           | impoverished person in India or Africa.
        
       | nicodds wrote:
       | It is like quantum mechanics: the measurement process produces a
       | perturbation of the physical system
        
       | rpdillon wrote:
       | A closely-related effect I often cite is the McNamara fallacy[0],
       | which is essentially about the tendency to focus on aspects of a
       | system that are easily measurable, often at the expense of
       | aspects that are not. I see it as one of the weaknesses of the
       | data-driven decision-making movement, since many interpret "data"
       | to mean "numbers". I think this fallacy can partly explain why
       | Goodhart's Law holds: it's the non-measurable (or difficult-to-
       | measure) aspects that suffer most when a metric becomes a target,
       | since measurable aspects could be (and often are) integrated into
       | the target metric.
       | 
       | [0]: https://en.wikipedia.org/wiki/McNamara_fallacy
        
         | serverholic wrote:
         | Another problem is that overly focusing on metrics means that
         | people with good instincts are brought down to the same level
         | as those with poor instincts.
         | 
         | As long as metrics are increasing someone can write shit code
         | and design a bloated product.
        
         | burnafter182 wrote:
          | As Mandelbrot tells it, once the January effect was identified,
          | it was rather quickly exploited to the point where it no longer
          | existed: the very act of exploiting it exerted market pressures
          | that counteracted it.
         | 
         | "Consider three cases. First, suppose a clever chart-reader
         | thinks he has spotted a pattern in the old price records--say,
         | every January, stock prices tend to rise. Can he get rich on
         | that information, by buying in December and selling in January?
         | Answer: No. If the market is big and efficient then others will
         | spot the trend, too, or at least spot his trading on it. Soon,
         | as more traders anticipate the January rally, more people are
         | buying in December--and then, to beat the trend for a December
         | rally, in November. Eventually, the whole phenomenon is spread
         | out over so many months that it ceases to be noticeable. The
         | trend has vanished, killed by its very discovery. In fact, in
         | 1976 some economists spotted just such a pattern of regular
         | January rallies in the stocks of small companies. Many
         | investors close their losing positions towards the end of the
         | year so they can book the loss as a tax deduction--and the
         | market rebounds when they reinvest early in the new tax year.
         | The effect is most pronounced on small stocks, which are more
         | sensitive to small money movements. Alas, before you rush out
         | to trade on this trend, you should know that its discovery
         | seems to have killed it. After all the academic hoopla over it,
         | it no longer shows up as clearly in price charts."
         | 
         | -Benoit Mandelbrot, The Misbehavior of Markets
         | 
         | https://en.wikipedia.org/wiki/January_effect
        
           | Stratoscope wrote:
           | This reminds me so much of a Christmas present hack my sister
           | and I invented when we were kids. We used to open all our
           | presents on Christmas morning.
           | 
           | Then we begged "Can't we open a couple of presents on
           | Christmas Eve?" So we got to open a few that night.
           | 
           | Next year was "Well, how about Christmas Eve Morning? Maybe
           | just one or two?"
           | 
           | And the next year was "The 23rd is practically Christmas Eve,
           | isn't it? It's just a few hours apart. Can't we open all our
           | presents on the evening of the 23rd?" And we did!
           | 
           | We didn't push it past that: we were already so happy that we
           | got our presents a day and a half before all our friends!
        
           | kbelder wrote:
           | "First get rich; then, publish".
           | 
           | If that's not a law, it should at least be a rule-of-thumb.
        
           | miki123211 wrote:
            | As far as I'm aware, this is why markets are considered a
            | second-order chaotic system. In those systems, measurements
            | of how the system performs can actually influence what
            | happens next. This is in contrast to first-order systems,
            | e.g. the weather, which are hard to simulate but whose
            | behavior is not changed by the predictions made about them.
        
         | rossdavidh wrote:
         | In the context of manufacturing, W.E.Deming said something
         | similar: "that which gets measured, gets improved". His
         | conclusion from this was a little different than McNamara's,
         | though. Since you will inevitably want to track your progress,
         | make sure you track as many things as possible, because
         | anything which is not tracked will get sacrificed to that which
         | is. Up to a point, it's true.
         | 
         | One issue is that some things, like vulnerability to supply
         | chain disruptions, are intrinsically harder to track because
         | they are based on rare occurrences. Thus, they will tend to get
         | sacrificed in favor of measures which are more frequent,
         | leading to an emphasis on short-term strategies.
        
           | CalChris wrote:
            | Actually, Deming said much the opposite:
            | 
            |   It is wrong to suppose that if you can't measure it, you
            |   can't manage it - a costly myth.
            | 
            | (p. 26 of The New Economics for Industry, Government,
            | Education)
           | 
           | It wasn't Drucker either.
           | 
           | https://medium.com/centre-for-public-impact/what-gets-
           | measur...
           | 
           | This whole mindless _must measure, must measure_ mentality
           | has been criticized since the 50s. Measurement is a tool.
           | There are many tools.
        
             | rossdavidh wrote:
             | In my experience, if it isn't measured, then it is assumed
             | that the policy (whatever it is) is working. If you don't
             | measure, you can't be surprised by "wow, it didn't work
             | like we thought". Therefore, mistakes don't get recognized
             | or corrected. There are many tools, but measurement is one
             | of the only ones that brings unexpected bad news to the
             | user, and that is invaluable.
        
             | mumblemumble wrote:
             | I would argue that it's worthwhile to measure as much as
             | you can, insofar as it facilitates orderly decisionmaking
             | processes.
             | 
             | The problem is that people tend to think that all
             | measurement is necessarily quantitative. I think that this
             | might be a version of the streetlight effect? Quantitative
             | measurements tend to be much easier to collect and analyze
             | than qualitative measurements. Oftentimes you can let it
             | all run on autopilot, whereas doing good qualitative work
             | always requires concentration, effort, and expertise.
        
           | mhink wrote:
           | > One issue is that some things, like vulnerability to supply
           | chain disruptions, are intrinsically harder to track because
           | they are based on rare occurrences. Thus, they will tend to
           | get sacrificed in favor of measures which are more frequent,
           | leading to an emphasis on short-term strategies.
           | 
           | I suppose this is part of why "chaos engineering" has gained
           | popularity- introducing artificial disruptions at a known
           | rate makes it easier to quantify the impact of otherwise-
           | unusual events.
        
             | rossdavidh wrote:
             | Ooh, good point! Another example is auditing, where you
             | substitute regular, frequent disruptions (being audited is
             | disruptive to normal operations) for infrequent, less
             | predictable, bigger disruptions.
        
         | cryptica wrote:
         | To make matters worse, it's also a vicious cycle. For example,
         | if everyone is focused only on specific kinds of data and
         | ignores observable reality, the trend in the data will become
         | self-fulfilling until a point when the perceived inconsistency
         | between the data and observable reality has grown so large that
         | it becomes impossible to ignore.
         | 
         | To understand what the big problems are today, you just have to
         | think about the kinds of data which people in government (and
         | the public) haven't been thinking about or aiming for. For
         | example: Happiness, honesty, altruism, sanity... These are not
         | measured and not targeted so they got completely crushed.
         | 
         | In the past, large, powerful religious groups would target
         | these characteristics but nowadays, society is more secular so
         | these aspects of our lives have suffered significantly.
        
           | serverholic wrote:
            | That's one of the things Andrew Yang talks about. GDP isn't
           | the only thing that matters, nor is it the best metric.
        
         | miki123211 wrote:
         | I believe this effect greatly contributes to why government IT
         | systems are so bad.
         | 
         | When you're writing a procurement contract, it's relatively
         | easy to describe what the requested system must do, but almost
         | impossible to enforce a great UI design, as great UI design
         | isn't objectively measurable.
         | 
         | As a contractor, you're optimizing for minimum money spent, so
         | if good design is not required, good design gets sacrificed
         | first.
         | 
         | One solution to this specific problem would be to conduct user
         | surveys on how pleasant the system is to use, requiring a
         | specific score before the contract is deemed completed.
         | 
         | This trend manifests more generally in bigger organizations.
         | Smaller orgs let people judge things subjectively, so all
         | possible aspects are taken into account, making those things
         | relatively good; this is why startups succeed. In a bigger org,
         | there are often objective judgement measures to prevent the
         | influence of personal biases, politics or even bribes. However,
         | those measures poorly reflect how good the thing in question
         | actually is. This is why a big corp might produce worse
         | software, even when competing against a small and underfunded
         | startup.
         | 
         | As an example, Apple exempted the first iPhone crew from most
         | internal company procedures, creating a quasi-startup inside
         | Apple. Steve Jobs always had the final say, and his opinions
         | were based on what he thought personally, not on how many
         | points in a requirements specification were satisfied. I
         | believe this was one of the reasons for the iPhone's success.
        
           | cratermoon wrote:
           | > government IT systems
           | 
           | This is not limited to governments. Although it's a common
           | naive bias to assert that governments are worse and less
           | efficient than private industry, what is really happening is
           | that government budgets and projects are open to the public,
           | done in the open. For every failed Healthcare.gov, there are
           | dozens of private industry failures that don't make the news
           | because the operations are not subject to the public
           | disclosure rules.
        
         | ghaff wrote:
         | It's a genuinely hard problem because the quantifiable output
         | metrics are easy to measure and an individual often does have
         | some level of direct control over them. So we convince
         | ourselves that they're a reasonable proxy for something we care
         | about but aren't sure how to measure and that we have control
         | systems in place, whether management or individual
         | responsibility, that largely prevent e.g. quality being thrown
         | away in pursuit of quantity.
         | 
         | And we're often not entirely wrong if we do pick reasonable
         | proxies and have reasonable control systems in place. Because
         | throwing up our hands and saying metrics are useless is usually
         | not the answer either.
        
           | serverholic wrote:
           | The better option is to put someone in charge with a vision.
           | However that is risky.
        
           | TeMPOraL wrote:
           | A flavor of this problem is what could be called "diffusion
           | of responsibility", for a lack of better term. An individual
           | who defined the measure and then optimizes for it will
           | quickly figure out when their measure stops being a good
           | proxy. But in organizations there usually isn't a single
           | person who both understands what the measures are proxying,
           | and has the power to remove a metric once it is used up, or
           | get people to stop overfitting it.
        
             | Jtsummers wrote:
             | Another thing I've observed is when you have two measures
             | that operate at different time scales. Both may even be
             | valid measures, but the one measured (and responded to)
             | more often has a stronger impact, and can negatively impact
             | the less frequently measured metric when there's a conflict
             | between them.
             | 
             | A particular instance for this has been (to keep it simple)
             | quantity (speed) and quality in production environments
             | (factories and the like). Daily throughput measures paired
             | with less frequent quality measures. The desire is to keep
             | throughput high, and quality ends up suffering as a result.
             | By integrating quality measures into the process you make
             | the two measures compete on more equal footing, forcing a
             | balance. At least one factory I worked in (well, adjacent
             | to, I was in the software portion not the assembly line)
             | massively reduced their quality problems by integrating
             | quality checks between each station. This contrasted with
             | the prior years where throughput, being measured and
             | reacted to daily, drove them to make things so fast that
             | they had piles of rework at the end. Integrating the
             | quality measures between stations slowed them down, but
             | their rework numbers turned into a rounding error (over a
             | decade ago so I've forgotten the exact numbers, but they
             | went from having items needing rework nearly every day to
             | maybe one or two a month). As a result their real
             | (deliverable to customers) production increased and their
             | cost per unit dropped.
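              | 
              | A rough simulation of that tradeoff, with invented defect
              | rates and timings rather than the factory's real numbers:
              | end-of-line inspection keeps raw throughput high but piles
              | up rework, while per-station checks give up a little speed
              | and ship more good units.
              | 
              |     import random
              | 
              |     random.seed(0)
              |     STATIONS, DEFECT_P = 5, 0.03   # invented rates
              |     UNITS_PER_DAY = 200            # raw throughput
              |     CHECK_SLOWDOWN = 0.90          # cost of checks
              | 
              |     def end_of_line():
              |         # Inspect finished units only; any defect
              |         # means the whole unit goes to rework.
              |         good = sum(
              |             all(random.random() > DEFECT_P
              |                 for _ in range(STATIONS))
              |             for _ in range(UNITS_PER_DAY))
              |         return good, UNITS_PER_DAY - good
              | 
              |     def per_station():
              |         # Catch defects at each station: slower,
              |         # but essentially no rework pile.
              |         return int(UNITS_PER_DAY * CHECK_SLOWDOWN), 0
              | 
              |     print("end-of-line (shipped, rework):", end_of_line())
              |     print("per-station (shipped, rework):", per_station())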
        
               | AceyMan wrote:
               | I see this as the root cause of the recently announced
               | class action suit against LADWP over their implementation
               | of tiered electricity pricing.
               | 
                | The tiers (kWh rates) are in chunks of the day, measured
                | in hours.
               | 
               | But the _reporting_ is only available to the consumer in
               | the form of a monthly bill, so by the time you discover
               | you were eating pixies in the Peak Cost hours the heat
               | wave is over and your bill is already through the roof.
               | 
               | (Any local SoCal residents please feel free to pick my
               | analysis apart, but that was my first take when I heard
               | about the legal action.)
        
         | cratermoon wrote:
         | Also https://en.wikipedia.org/wiki/Campbell%27s_law
        
       | dang wrote:
       | Past related threads. In this case a few of the 1-or-2 comment
       | threads have particularly good posts:
       | 
        |  _Goodhart's Law_ -
       | https://news.ycombinator.com/item?id=26839177 - April 2021 (2
       | comments)
       | 
       |  _Goodhart's Law Rules the Modern World. Here Are Nine Examples_
       | - https://news.ycombinator.com/item?id=26604130 - March 2021 (3
       | comments)
       | 
        |  _Goodhart's Law and how systems are shaped by the metrics you
       | chase_ - https://news.ycombinator.com/item?id=23762526 - July
       | 2020 (58 comments)
       | 
       |  _When Goodharting Is Optimal_ -
       | https://news.ycombinator.com/item?id=22054359 - Jan 2020 (3
       | comments)
       | 
       |  _Goodhart's Law: Are Academic Metrics Being Gamed?_ -
       | https://news.ycombinator.com/item?id=21065507 - Sept 2019 (27
       | comments)
       | 
       |  _Goodhart's Law: Are Academic Metrics Being Gamed?_ -
       | https://news.ycombinator.com/item?id=20076485 - June 2019 (2
       | comments)
       | 
       |  _When targets and metrics are bad for business_ -
       | https://news.ycombinator.com/item?id=19135694 - Feb 2019 (6
       | comments)
       | 
        |  _Goodhart's Law: When a measure becomes a target, it ceases to
       | be a good measure_ -
       | https://news.ycombinator.com/item?id=17320640 - June 2018 (134
       | comments)
       | 
        |  _Goodhart's Law_ -
       | https://news.ycombinator.com/item?id=10075780 - Aug 2015 (1
       | comment)
       | 
        |  _Goodhart's law_ - https://news.ycombinator.com/item?id=1368745
       | - May 2010 (1 comment)
        
       | rfreytag wrote:
       | Earlier posts here (134 comments):
       | https://news.ycombinator.com/item?id=17320640
       | 
       | and here (58 comments):
       | https://news.ycombinator.com/item?id=23762526
       | 
        | NPR's Planet Money (audio) also covered this, interviewing
        | Goodhart himself:
       | https://www.npr.org/sections/money/2018/11/19/669395064/epis...
        
       | lisper wrote:
       | It is possible to turn this effect to your advantage. I wrote a
       | spam filter that takes advantage of signals that would be easy
       | for spammers to spoof (like the list-id header), but no one
       | spoofs them because no one but me uses this approach to spam
       | filtering. So, ironically, if more people used my spam filter, it
       | would probably stop working as well as it does now.
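        | 
        | A minimal sketch of the general idea (not the actual filter; the
        | allowlisted list-ids below are hypothetical): trust an easily
        | spoofed header that spammers never bother to spoof, and hand
        | everything else off to the rest of the pipeline.
        | 
        |     import email
        | 
        |     # Hypothetical allowlist of subscribed mailing lists.
        |     TRUSTED_LIST_IDS = {"python-dev.python.org",
        |                         "nanog.nanog.org"}
        | 
        |     def classify(raw_message: str) -> str:
        |         # A forged List-Id would sail through, but in
        |         # practice nobody forges it.
        |         msg = email.message_from_string(raw_message)
        |         list_id = msg.get("List-Id", "")
        |         if any(lid in list_id for lid in TRUSTED_LIST_IDS):
        |             return "ham"
        |         return "needs-other-checks"
        | 
        |     print(classify(
        |         "List-Id: Python Dev <python-dev.python.org>\n\nhi"))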
        
       | paulorlando wrote:
       | Good that this topic gets some attention. I see Goodhart's Law
       | again and again in metrics. I wrote about this a while back,
       | including why we use the misleading name.
       | https://unintendedconsequenc.es/new-morality-of-attainment-g...
        
       | bedhead wrote:
        | I do investment stuff for a living and this was one of the most
        | important things I ever learned.
        
       | pikwip wrote:
        | Here's an interesting paper I found that attempts to categorize
       | the mechanisms by which Goodhart's Law operates in the real
       | world. The variants are separated into Causal and Non-causal
       | mechanisms.
       | 
       | https://arxiv.org/abs/1803.04585
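        | 
        | One of the variants the paper describes, regressional Goodhart
        | (optimizing a noisy proxy), is easy to see in a few lines of
        | NumPy; the setup below is an illustration, not taken from the
        | paper. Selecting the top scorers by proxy makes the proxy look
        | roughly twice as good as the underlying quality it tracks:
        | 
        |     import numpy as np
        | 
        |     rng = np.random.default_rng(0)
        |     true_q = rng.normal(size=100_000)  # what we really want
        |     proxy = true_q + rng.normal(size=100_000)  # noisy measure
        | 
        |     top = proxy > np.quantile(proxy, 0.99)  # select top 1%
        |     print("selected, mean proxy:", proxy[top].mean())
        |     print("selected, mean true :", true_q[top].mean())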
        
         | derbOac wrote:
         | Thanks for posting that. I was going to say -- it's interesting
         | to think about the reasons why Goodhart's Law might hold if it
         | does.
         | 
         | I've always assumed the problem is that the metric is always
         | influenced by other, nontarget variables that become more
         | causally important when the metric becomes a proxy target. So,
         | for example, "gaming the metric" becomes important (in a
         | percent variance sense) after the metric becomes a target. I
         | think the paper's adversarial scenario is closest to this
         | maybe.
         | 
         | They discuss some other factors that seem more relevant to
         | individual cases at any moment in time than an explanation for
         | why a metric's utility might decline over time. In that sense
          | the paper seems to be more about Goodhart-like phenomena in
         | general.
         | 
          | It would be interesting to demonstrate Goodhart's law
         | conclusively with real data in some domains.
        
       | vagab0nd wrote:
       | Here's another interesting one that's somewhat related:
       | 
       | https://en.wikipedia.org/wiki/Decline_effect
        
         | rfreytag wrote:
         | Could be...yes.
         | 
         | "Decline effect" could also be mostly due to the
         | https://en.wikipedia.org/wiki/Replication_crisis
         | 
          | Under Goodhart's Law, a measure starts out effective, and the
          | social system it purports to measure adapts (some might say
          | 'distorts') until the measure no longer serves its original
          | purpose.
        
       | brightball wrote:
       | One of the most important lessons to promote.
        
       | gftsantana wrote:
       | My first job was at the anti-fraud department at a telecom
        | company in the early 00s. Our job was basically to determine
        | whether a new landline or mobile contract was fraudulent or not.
        | Some requests would be flagged by an automated piece of software
        | that was a black box to most employees, myself included. We
        | would look at the documents sent by the clients and, sometimes,
        | ask a few questions via phone.
       | 
       | I was very young at the time, but I remember basically deriving
       | Goodhart's law after a few months in the job. I don't remember
       | clearly most of the things that led me to that conclusion, but I
       | do remember the most extreme: at some point, management started
       | requiring us to block clearly non-fraudulent phones because the
       | directors decided to increase the blocked installations target.
       | It would include even old contracts by good paying customers that
       | happened to be flagged.
       | 
        | I remember trying to talk to people about this, but the idea that
        | reaching a target by any means necessary is usually not a good
        | idea was incomprehensible to most people. Years later, I
       | realized that others knew exactly what was going on; they just
       | didn't care, and I was naive for not seeing that.
       | 
       | It was a few years later when I learned about perverse
       | incentives, Goodhart's law, the cobra effect, etc., and it
       | allowed me to have more productive conversations with people
       | about targets and incentives.
        
       | cratermoon wrote:
       | When I explain the concept to people I usually call it the
       | Goodhart-Strathern principle, to recognize the generalization she
       | contributed and acknowledge the author of the most-commonly
       | quoted form of the law: "When a measure becomes a target, it
       | ceases to be a good measure"
        
       | colechristensen wrote:
        | The only systems that escape Goodhart's or similar laws are those
        | that have either A) people who genuinely care about quality and
        | have the judgment for it, or B) a well-understood mechanism for
        | how the control variable affects the system (i.e. no black
        | boxes).
        
         | leephillips wrote:
         | I think the only way to apply a metric while avoiding the
         | consequences of the Law is to keep the metric secret. As soon
         | as the subjects are aware of the metric, they will try to
         | optimize it, rather than the real performance that the metric
         | is supposed to indicate.
         | 
         | There is a close connection with security, say screening at
         | airports. If you fail to keep your screening criteria secret,
         | the terrorists will simply ensure that they do not match the
         | criteria. One way to thwart this is through randomness. There
         | is a long public debate between Sam Harris and Bruce Schneier
         | where the latter tried in vain to explain this to the former,
         | who insisted that it was a waste of resources to search little
          | old ladies. If one of your metrics is "don't search little old
          | ladies", the terrorists will discover this over time, by
          | observation. The next bomb will be carried by a little old
          | lady.
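          | 
          | A minimal sketch of the randomness idea (the 10% rate and the
          | risk threshold are arbitrary): if a fixed fraction of searches
          | ignores the profile entirely, then matching the "safe" profile
          | never guarantees a free pass.
          | 
          |     import random
          | 
          |     def select_for_search(risk_score, rng):
          |         # 10% of searches are uniformly random, so no
          |         # observable profile is a guaranteed pass.
          |         if rng.random() < 0.10:
          |             return True
          |         return risk_score > 0.8   # arbitrary cutoff
          | 
          |     rng = random.Random(0)
          |     searched = sum(select_for_search(0.0, rng)
          |                    for _ in range(10_000))
          |     print("lowest-risk profile searched:", searched)  # ~1,000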
        
       | _greim_ wrote:
       | What's the takeaway? How can statistics inform action, if such
       | action invalidates those statistics?
        
         | ItsMonkk wrote:
         | You need to measure the KPIs.
         | 
         | The reason we use metrics is because things have scaled out of
         | control, and using a "real" judgment system is no longer
         | possible. Not for everyone, anyway.
         | 
          | Take hiring as an example, though this should work for nearly
          | all metrics-driven workloads. Hire a small subset of your staff
          | using a real judgment system, hire most of your staff using
          | metrics, then take a sample of those hired using metrics and
          | compare them, with real scrutiny, against those hired using
          | real judgment.
          | 
          | If they are reasonably close, the KPIs are working. If they are
          | alien to each other, you need to either stop using KPIs or
          | alter them significantly to fit what the more effective people
          | are doing.
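          | 
          | A minimal sketch of that comparison step (the ratings and the
          | 0.05 cutoff are hypothetical): pool later performance ratings
          | from both groups and run a crude permutation test on the gap
          | between their means.
          | 
          |     import random
          | 
          |     def gap(a, b):
          |         return abs(sum(a) / len(a) - sum(b) / len(b))
          | 
          |     def permutation_test(judgment, metric, trials=10_000):
          |         # How often would a random split of the pooled
          |         # scores show a gap as large as the observed one?
          |         rng = random.Random(0)
          |         observed = gap(judgment, metric)
          |         pooled, n = judgment + metric, len(judgment)
          |         hits = 0
          |         for _ in range(trials):
          |             rng.shuffle(pooled)
          |             if gap(pooled[:n], pooled[n:]) >= observed:
          |                 hits += 1
          |         return hits / trials
          | 
          |     # Hypothetical 1-5 performance ratings a year later.
          |     judgment_hires = [4.5, 4.0, 4.2, 3.8, 4.6]
          |     metric_hires = [3.1, 3.4, 2.9, 3.6, 3.2, 3.0, 3.5, 3.3]
          |     p = permutation_test(judgment_hires, metric_hires)
          |     print("p ~", p, "- revisit the KPIs" if p < 0.05
          |           else "- KPIs track judgment")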
        
           | rhizome wrote:
           | > _You need to measure the KPIs._
           | 
           | What KPIs do you use when the statistics are being used to
           | measure whether the KPIs are the right ones?
        
         | cryptica wrote:
          | The takeaway is that governments and large institutions which
          | impact large numbers of people should never attempt to set, or
          | to meet, quantifiable targets.
        
         | Jtsummers wrote:
         | That we can't stop thinking. It becomes too easy to stop using
         | our own judgement when we have numbers to back up our
          | decisions. We can point to the analysis and say, "Look, we've
         | improved <metric>!". That itself becomes the justification for
         | the actions taken, regardless of their actual sensibility.
         | 
         | This is an area where discipline remains essential, and
         | maintaining discipline is a constant battle.
        
         | castlecrasher2 wrote:
          | That statistics should inform action and measure efficacy, but
          | not drive them.
          | 
          | A dev manager running on numbers alone will inevitably chase
          | empty, meaningless values such as tickets resolved or lines of
          | code written, while the polar opposite manager will run on
          | intuition alone. I imagine most would agree neither type is
          | generally effective and a balance should be struck. Goodhart's
          | Law means you should know what's important and pay attention
          | to it, but not make it your sole focus. And for God's sake,
          | don't make a public dashboard for it.
        
       | starnger wrote:
       | I could not help but relate it to Heisenberg uncertainty
       | principle.
        
         | thekhatribharat wrote:
         | me too :)
        
       ___________________________________________________________________
       (page generated 2021-09-17 23:01 UTC)