[HN Gopher] An AI wolf that preferred suicide over eating sheep
       ___________________________________________________________________
        
       An AI wolf that preferred suicide over eating sheep
        
       Author : lancengym
       Score  : 250 points
       Date   : 2021-07-06 11:15 UTC (11 hours ago)
        
 (HTM) web link (lancengym.medium.com)
 (TXT) w3m dump (lancengym.medium.com)
        
       | spywaregorilla wrote:
       | Seems like a nothing story. Just looking at the game, there's
       | obviously a constant decision to be made of chase more sheep or
       | instantly die. It sounds like in the original model they had a
       | max of 20 seconds, so it's not surprising that you would just
       | tank your losses to maximize your score every now and then.
       | 
       | Anyone who tries to devise optimal strategies for things should
       | be able to see this isn't especially interesting.
       | 
       | Social metaphors are wildly out of place.
       | 
       | They say "unintended consequences of a blackbox" but I doubt
       | that's true. Make it a deterministic turn based game and run it
       | through a perfectly transparent optimization model and I wouldn't
       | be surprised to learn this was just the best strategy for the
       | rules they devised. I really hate when people describe an ai as
       | something that cannot be understood because they personally don't
       | understand it.
        
         | rob74 wrote:
         | It's not surprising from the perspective of an "AI actor". But
         | if you call it a "wolf", most people will assume that it will
         | behave at least roughly like a real-world creature, and the
         | self-preservation instinct is one of the most basic traits of
         | all living beings, so the "AI wolf" not having that is indeed
         | surprising for a layperson.
        
         | fny wrote:
         | If I remember correctly there were similar scenarios that would
         | occur using that popular Berkeley Pacman universe where he
         | would run into a ghost to avoid the penalty of living for too
         | long.
        
           | hotwire wrote:
            | It reminds me of the thread about the Quake 3 bots which,
            | left alone for several years, figured out that the best
            | approach was to not kill each other.
           | 
           | https://i.imgur.com/dx7sVXj.jpg
        
             | spywaregorilla wrote:
              | Without knowledge of their reward function it's difficult
              | to tell if they've converged on this strategy or if it's
              | just broken.
        
         | vnorilo wrote:
         | If we play the analogy further: life is suffering, apart from
         | the brief ecstasy of eating sheep. The AI was trying not to
         | suffer, thus chose the boulder.
         | 
         | Did my best to translate the (misguided) fitness function to
         | fiction.
        
           | SubiculumCode wrote:
           | Man is born crying, and when he's cried enough, he dies.
           | -Kyoami in Ran.
           | 
           | Cutting one's losses early may appear to be the most rational
           | act if trying to minimize an agent's total suffering.
        
             | 988747 wrote:
             | Which is why some forms of Buddhism are basically a cult of
             | death: https://en.wikipedia.org/wiki/Sokushinbutsu
        
               | spywaregorilla wrote:
               | > In the video game The Legend of Zelda: Breath of the
               | Wild, the monks in the Ancient Shrines seem to be based
               | on sokushinbutsu.
               | 
               | Factoid of the day for sure
        
             | syntheticnature wrote:
             | David Benatar reached a similar philosophic conclusion due
             | to his utilitarian views, which was amusingly put (with a
             | sort of AI present, no less) in this webcomic:
             | https://existentialcomics.com/comic/253
        
               | SubiculumCode wrote:
               | Thanks. I think I just found a new comic to read.
        
         | phkahler wrote:
         | It's good because most people can understand it. I'd say it's a
         | perfect strategy for a game, but if they're using evolutionary
         | algorithms they should require some form of reproduction for
         | the wolves to carry on. That would make the suicide strategy
         | fail to propagate well. I can also see a number of possible
         | strange outcomes even then.
        
           | spywaregorilla wrote:
           | You're conflating the evolution of the strategy with the idea
           | of the evolution of the actor being controlled by the agent.
           | To give an obvious example, if dying gave 100 points instead
           | of subtracting 10, even the dumbest evolutionary algo would
           | learn to commit suicide asap. The survival of the actor has
           | no intrinsic relevance to how the evolution develops.
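            | 
            | A minimal sketch of that point (a toy re-creation with
            | made-up reward numbers, not the article's model): fitness
            | is just the episode score, so the evolved behaviour simply
            | tracks whatever value you attach to death.
            | 
            |   import random
            | 
            |   # Illustrative rewards only, not the article's values.
            |   DEATH, CATCH, TICK, EPISODE = -10, 10, -1, 20
            | 
            |   def score(p_suicide, p_catch=0.02):
            |       # One toy episode for a given suicide propensity.
            |       if random.random() < p_suicide:
            |           return DEATH              # run into the boulder
            |       s = 0
            |       for _ in range(EPISODE):
            |           s += TICK                 # time penalty accrues
            |           if random.random() < p_catch:
            |               return s + CATCH      # occasional catch
            |       return s
            | 
            |   def fitness(p, trials=1000):
            |       # Fitness is just the mean score; the actor's
            |       # survival matters only through this number.
            |       return sum(score(p) for _ in range(trials)) / trials
            | 
            |   # The dumbest (1+1) evolution of one gene, p_suicide.
            |   gene = 0.5
            |   for _ in range(200):
            |       child = gene + random.gauss(0, 0.1)
            |       child = min(1.0, max(0.0, child))
            |       if fitness(child) > fitness(gene):
            |           gene = child
            |   print(gene)  # ~1.0; with DEATH = -100 it goes to 0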
        
           | jonnycomputer wrote:
           | What mechanism are you thinking of? One in which having
           | offspring is rewarding and so enters into the same learning
           | algorithm, or one in which the learning algorithm/action
           | selection is evolved and differentially conserved?
        
         | mcguire wrote:
         | " _I really hate when people describe an ai as something that
         | cannot be understood because they personally don 't understand
         | it._"
         | 
         | On the other hand, keep in mind that a significant weakness of
         | most modern AI research is that it's extremely difficult _to
         | understand:_ you have the input, the output, and a bag of
         | statistical weights. In the story, you know the (trivially bad)
          | function that is being optimized; in general you may not. It's
         | not without implications for other systems.
         | 
         | Further,
         | 
         | " _At the end of the day, student and teacher concluded two
         | things:_
         | 
         | " _* The initial bizarre wolf behavior was simply the result of
         | 'absolute and unfeeling rationality' exhibited by AI systems._
         | 
         | " _* It's hard to predict what conditions matter and what
         | doesn't to a neural network._ "
        
           | spywaregorilla wrote:
           | The tooling for understanding complex models is a lot better
           | than what most people assume.
           | 
           | > The initial bizarre wolf behavior was simply the result of
           | 'absolute and unfeeling rationality' exhibited by AI systems.
           | 
           | This is a bad quote. They should not say this. It's a poorly
            | trained agent doing a decent job in a poorly defined
            | environment. Absolute rationality conjures images of some
            | greater thinking, but it's actually a really stupid model
            | that hit a local maximum. Calling it unfeeling implies the
            | model has some concept of "wolf" and "suicide", but it does
            | not.
           | Replace the visuals with single pixel dots if you want an
           | honest depiction of the room for feelings.
           | 
           | > It's hard to predict what conditions matter and what
           | doesn't to a neural network."
           | 
           | This is generally true, but it isn't true here.
        
         | markwkw wrote:
          | Exactly, from a technical perspective it's a nothing story.
         | 
          | It's interesting, though, how strong a reaction the general
         | public had to this. The story must have strongly resonated with
         | what some folks were already feeling. When you squint (pretend
         | to understand the technology not at all) it's a tragic story.
         | The situation of the wolf seems similar to the situation of
         | some people. Chasing their careers in a highly structured, sort
         | of dehumanized, environment of constant pursuit. "Supreme
         | Intelligence" (that's what a layperson may think of AI) looks
         | at a situation of the wolf and decides that it makes no sense
         | to continue the pursuit. Moreover, what is "optimal" is the
         | most tragic result - suicide.
        
           | SubiculumCode wrote:
           | Exactly. It is a social commentary story where a result from
           | a student's project was a lucid analogy of the plight of
           | their lived rat-race in modern China, with the lesson being:
            | Cut your losses and lie flat. To those within the ML field,
            | this is nothing new, but the fact that such ML issues can
            | serve as a teachable and easily understood analogy to
            | people's lives certainly makes the story interesting to me.
        
           | wombatmobile wrote:
           | > The story must have strongly resonated with what some folks
           | were already feeling.
           | 
           | Yes, because we don't see things as they are, we see them as
           | we are.
        
             | edoceo wrote:
             | At a Grateful Dead show in Oakland this geezer said to me:
             | 
             | Your perception IS your reality man!
        
               | BoxOfRain wrote:
                | I'd have loved to have been around for a Dead show! I know
               | it sounds a little ungrateful coming from someone who
               | lives in a period of unprecedented access to all kinds of
               | wonderful music being written all the time, but there's
               | something about the Dead that really connects with me
               | that I can't quite put my finger on.
        
               | sitkack wrote:
               | Dark Star Orchestra is your current best bet
               | https://www.youtube.com/watch?v=y8_THRZLSi4
        
               | Darvokis wrote:
               | Schopenhauer: World as representation.
        
           | er4hn wrote:
            | > Exactly, from a technical perspective it's a nothing story.
           | 
           | I think that one thing it points to is how technology can
           | discover novel iterations on a system. Imagine if this was a
           | system modeled around a network and the agent was trying to
           | figure out how to get from the outside to read a specific
           | system asset. With the right (read: very detailed) modeling
           | you could create a pentesting agent.
        
           | spywaregorilla wrote:
           | Shrug. Another way to frame this is a poker bot learned to
           | fold when given a bad hand, and they only gave it the same
           | bad hand.
           | 
           | Yes, yes, woe is the individual in modern capitalist society
           | but the only reason people are reacting to this are that they
           | don't understand it and they've been told it's something much
           | more emotionally impactful than it actually is.
        
             | colinmhayes wrote:
             | >but the only reason people are reacting to this are that
             | they don't understand it
             | 
             | I think it's much more likely that they're reacting like
             | this because they see their own plight in the wolf. It
             | doesn't matter why the wolf killed itself, it became a meme
             | that allowed many Chinese to reflect together on a common
             | plight.
        
               | sdenton4 wrote:
               | I think there's a bit more to the analogy than just the
               | suicidal wolf, though. The wolf is offing itself to
               | minimize loss because there's no clear path to a better
               | outcome.
               | 
               | This seems like a common refrain when we see radicalized
               | engineering students from less-developed countries, who
               | are notably common in extremist groups. They're people on
               | a very difficult path (an engineering program!) with no
               | real path to success (living in a society where
               | unemployment for people with degrees is very high). Cost
               | for continuing on the path is high, and there's no
               | obvious path to get the good outcomes.
        
               | spywaregorilla wrote:
               | Having reread the article, it seems like the concept of
               | suicide doesn't weigh into the cultural reaction at all.
               | It's just giving up on the chase.
        
           | chaostheory wrote:
           | > Chasing their careers in a highly structured, sort of
           | dehumanized, environment of constant pursuit.
           | 
           | They have a word for it over there: involution i.e. no matter
           | how much effort you put in, you get the same result.
        
           | canadianfella wrote:
           | > pretend to understand the technology not at all
           | 
           | Are you missing a word or two?
        
           | acituan wrote:
           | From the article in contrast to what you said;
           | 
           | > Perhaps the true lesson to be learnt here isn't about
           | helplessness and giving up. It's about getting up, trying
           | again and again, and staying with the story till the end.
           | 
           | I find the possibility of contrasting interpretations absurd.
            | The problem with using any _dead matter_ for our meaning-
            | making needs is that it is ultimately a self-referential
            | justification for how we think we should feel, while being
            | equally or even more prone to self-deception traps.
            | 
            | AI being the object is irrelevant here; this is no
            | different from astrology or divination from tea leaves, etc.
            | It is 2000 BC level religious thinking with new toys.
        
             | Patoria_Coflict wrote:
             | Any programmer would have seen the issue and made the
             | change about rewarding suicide.
             | 
              | The ONLY reason this was written was because the
              | researchers hired a programmer to make a specific thing,
              | then it was too expensive for them to make more changes so
              | they published the mistake.
        
           | ajuc wrote:
           | Similarly I've seen A LOT of people posting stories about
           | "chat bot exposed to internet started praising Hitler and
           | became racist/sexist/antisemitic" as a proof that "supreme
           | intellect sees through leftist political correctness and
           | knows that alt-right is correct about everything".
        
             | frozenport wrote:
             | I think a lot of those people are joking?
        
             | BoxOfRain wrote:
             | It's really not that deep, people will always find sport in
             | scandalising people with a stronger disgust reaction than
             | themselves. It's more a new way of teaching a parrot to say
             | "fuck" rather than a heartfelt statement of political
             | belief in my opinion.
        
               | a1369209993 wrote:
               | > It's more a new way of teaching a parrot to say "fuck"
               | 
                | This is an _excellent_ analogy for this sort of
                | behaviour, thank you.
        
       | abrahamneben wrote:
       | This problem isn't particularly unique to AI research. In any
       | optimization problem, if you do not encode all constraints or if
       | your cost function does not always reflect the real world cost,
       | then you will get incorrect or even nonsensical results.
       | Describing this as an AI problem is just clickbait.
        
         | xtracto wrote:
         | The article doesn't mention it but the researchers are using
         | agent-based-modelling. It was nice to see the gif of what
         | appears to be either NetLogo or Repast. I did research in that
         | area for about 8 years and know a bit about the subject.
         | 
         | What they are showing is one of the main issues with agent-
         | based-models (and I think every model, but it happens
         | particularly with models trying to capture the behaviour of
         | complex open systems): Garbage in -> Garbage Out.
         | 
         | Most likely the representation of the sheep/wolf system was not
         | correct (so the modeling was not correct). Here "correct" means
          | good enough to demonstrate whatever emergent behaviour they are
         | studying. ABM is a powerful tool, but you must know how to use
         | it.
        
         | nxmnxm99 wrote:
         | Yep. Feels a bit like blaming a failed shuttle launch on
         | calculus.
        
       | alexshendi wrote:
       | Well I can identify with that AI wolf. He recognises his own
       | incompetence and chooses suicide over eternally failing.
        
       | stavros wrote:
       | Would anyone happen to have a non-signupwalled link?
        
       | aliasEli wrote:
       | A nice story about AI systems that warns that you should very
       | carefully choose the parameter you want to optimize.
        
         | phoe-krk wrote:
         | > very carefully choose the parameter you want to optimize.
         | 
         | This does not only concern AI systems, but all systems in
         | general - including human ones.
        
           | aliasEli wrote:
           | You are right, of course.
        
           | aetherspawn wrote:
           | From a retrospective today... "the KPIs are abysmal but the
           | deliverables are very high .. so I guess the KPIs are wrong?"
        
             | shrimpx wrote:
             | Sounds like the deliverables KPI is fantastic.
        
       | wombatmobile wrote:
       | The philosopher Hubert Dreyfus argued that computers, who have no
       | body, no childhood and no cultural practice, could not acquire
       | intelligence at all.
       | 
       | https://www.nature.com/articles/s41599-020-0494-4
       | 
       | What he means is that computers, which can learn rules and use
       | those rules to make predictions in certain domains, nevertheless
       | cannot exercise general intelligence because they are not "in the
       | world". This renders them unable to experience and parse culture,
       | most of which is tacit in real time, and sustained by enduring
       | mental models which we experience as "expectations" that we
       | navigate with our emotions and senses.
       | 
       | Culture is the platform on which intelligence is manifest,
       | because the usefulness of knowledge is not absolute - it is
       | contextual and social.
        
         | tiborsaas wrote:
          | This is why all AI today falls into the narrow AI category.
          | It's just often omitted because it's true for all of them.
        
         | colinmhayes wrote:
         | Imagine being a dualist in the 21st century.
        
           | shrimpx wrote:
           | Where's the dualism? It sounds like just a peculiar
           | definition of learning.
        
           | goatlover wrote:
           | What in the parent post is dualist? Sounds more like an
           | argument that animals have embodied intelligence.
           | 
           | But as for being a dualist in the 21st century, there is
           | always consciousness, information and math. All three of
           | which can lead to some form of dualism/platonism.
        
             | mcguire wrote:
              | Many of Dreyfus' and other similar arguments reduce to
              | dualism when you start digging into them. I don't have the
              | time to dig into the specific article, but here are some
              | immediate questions:
             | 
             | 1. What is special about a body that makes it impossible to
             | have intelligence without it? (a) Is it possible for a
             | quadriplegic person to be intelligent? (b) A blind and deaf
              | person? ((c) What about that guy from _Johnny Got His
              | Gun_?)
             | 
             | 2. What is special about a childhood such that a machine
             | cannot have it?
             | 
             | 3. Would a person transplanted into a completely alien
             | culture not be intelligent?
             | 
             | What is fundamentally being argued is the definition of
             | "intelligence", and there are many fixed points of those
             | arguments. Unfortunately, most of them (such as those that
             | answer "no", "probably not", and "definitely not" to 1a,
             | 1b, and 1c) don't really satisfy the intuitive meaning of
             | "intelligence". That, and the general tone of the
             | arguments, seem to imply the only acceptable meaning is
             | dualism.
             | 
             | For example, " _...there is always consciousness,
             | information and math..._ ": without a tight, and very
             | technical, definition of consciousness, that seems to be
             | assuming the conclusion. _With_ a tight, and very
             | technical, definition of consciousness, what is the problem
             | with a machine demonstrating it?
             | 
             | Information? Check out knowledge, "justified true belief",
             | and the Gettier problem (https://courses.physics.illinois.e
             | du/phys419/sp2019/Gettier....).
             | 
             | Math? Me, I'm a formalist. It's all a game that we've made
             | up the rules to.
        
               | goatlover wrote:
                | > Many of Dreyfus' and other similar arguments reduce to
                | dualism when you start digging into them. I don't have
                | the time to dig into the specific article, but here are
                | some immediate questions:
               | 
               | To me it sounds dualist if intelligence is disembodied.
               | If the substrate doesn't matter, only the functionality,
                | then that sounds like there's something additional to the
                | world beyond just the physical constituents. But of
                | course, embodied versions of intelligence need to answer
                | the sort of questions you posed. It should be noted
                | that Dreyfus wrote his objections in the 50s and 60s
                | during the period of classical AI. I don't know whether
                | he addressed the question of robot children, or simulated
                | childhoods. We don't have that sort of thing even today,
               | and we also don't have AGI. Some of his objections still
               | stand, although machine learning and robotics research
               | has made inroads.
               | 
               | > Math? Me, I'm a formalist. It's all a game that we've
               | made up the rules to.
               | 
               | So why is physics so heavily reliant on mathematics?
               | Quite a few physicists think the world has a mathematical
               | structure.
               | 
               | > For example, "...there is always consciousness,
               | information and math...": without a tight, and very
               | technical, definition of consciousness, that seems to be
               | assuming the conclusion.
               | 
               | Qualia would be the philosophical term for subjective
               | experiences of color, sound, pain, etc. Reducing those to
               | their material correlations has been notoriously
               | difficult, and there is still no agreement on what that
               | entails.
               | 
               | As for information, some scientists have been exploring
               | the idea that chemical space leads to the emergence of
               | information as an additional thing to physics which needs
               | to be incorporated into our scientific understanding of
               | the world. That we can't really explain biology without
               | it.
        
               | mcguire wrote:
               | :-)
               | 
               | " _To me it sounds dualist if intelligence is
               | disembodied. If the substrate doesn 't matter, only the
               | functionality, then that sounds like there's something
               | additional to the world than just the physical
               | constintuents._"
               | 
               | Off the top of my head, what the substrate is doesn't
               | matter, but that there is a substrate does. Intelligence
               | is the behavior of the physical constituents.
               | 
               | " _So why is physics so heavily reliant on mathematics?
               | Quite a few physicists think the world has a mathematical
               | structure._ "
               | 
               | Because humans are very good at defining the rules when
               | we need them? Because alternate rules are nothing but a
               | curiosity even to mathematicians unless there is a use---
               | such as a physical process---for them?
               | 
               | One of the problems with qualia, as a topic of
               | discussion, is that I can never be entirely sure that you
               | have it. I can assume you do, and rocks don't, but that
               | is about as far as I can get.
        
               | wombatmobile wrote:
               | Don't overthink this.
               | 
               | If you put a computer in a room with a hot babe, a 3
               | layer chocolate cake, a bottle of the finest whisky or
               | bourbon, the keys to a Porsche, and a trillion dollars in
               | cash, what would it do?
               | 
               | Yeah, nothing. The computer is not in the world.
        
               | wombatmobile wrote:
               | > (a) Is it possible for a quadriplegic person to be
               | intelligent? (b) A blind and deaf person?
               | 
               | Yes of course, because all of those people have ambitions
               | and desires. They feel pain and they seek pleasure, which
               | they experience through their bodies.
               | 
               | Imagine if the world 2,000 years from now was populated
               | only by supercomputers, all the lifeforms having
               | perished.
               | 
               | What are these computers going to do with the planet?
        
               | colinmhayes wrote:
               | Why can't a computer have ambitions and desires? Why
               | can't it seek pleasure and feel pain? The only answer is
               | dualism or we don't know how to wire it properly yet.
        
             | colinmhayes wrote:
             | > cannot exercise general intelligence because they are not
             | "in the world".
             | 
             | Implies dualism. In a materialist world a computer can
             | learn anything given the proper structure and stimuli.
        
               | wombatmobile wrote:
               | The limitation is more practical than theoretical or
               | philosophical.
               | 
               | Consider this line from an Eagles song:
               | 
               | "City girls just seem to find out early, how to open
               | doors with just a smile."
               | 
               | What does that mean to you?
               | 
               | Disembodied computers don't get the experiences required
               | to gain that intelligence, and even if they could go
               | along for the ride, in a helmet cam, they wouldn't
               | experience the tingling in their heart, lungs and
               | genitals that provide the signals for learning.
        
         | cperciva wrote:
         | _The philosopher Hubert Dreyfus argued that computers, who have
         | no body, no childhood and no cultural practice, could not
         | acquire intelligence at all._
         | 
          | Similarly, nuclear submarines, which, lacking all of the
          | critical organs of fish, are completely unable to swim.
        
           | fellow_human wrote:
           | Similarly, a brick has the ability to deep sea dive!
        
           | cpach wrote:
           | The nuclear submarine is just part of our extended phenotype
           | :)
        
         | feoren wrote:
         | A good example of why philosophers are utterly useless mental
         | masturbators who spend all their time arguing about definitions
         | of words. Here he takes something obviously stupid and wrong
         | and says it in such a way that you can feel smart by
         | regurgitating it. Computers don't exist in the world? What? It
         | must be some problem with their Thetans. Er, sorry, I mean
         | "qualia".
        
       | mcguire wrote:
       | It's really rather hard to draw any general conclusions from such
       | simple systems:
       | 
       | " _In the initial iterations, the wolves were unable to catch the
       | sheep most of the time, leading to heavy time penalties. It then
       | decided that, 'logically speaking', if at the start of the game
       | it was close enough to the boulders, an immediate suicide would
        | earn it less point deductions than if it had spent time trying to
       | catch the sheep._ "
       | 
       | It's as if the scenario you are thinking about involves "assume a
       | machine capable of greater-than-human-level perception, planning,
       | and action" and then set it to optimize a trivially bad function.
       | 
        | How many people do you know with the single goal of "die with
        | as much money as possible"? It has a trivial solution: rob a
        | bank and then commit suicide.
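        | 
        | Plugging in some made-up numbers (the article doesn't publish
        | the actual reward values) makes the quoted logic plain:
        | 
        |   # Invented values; the article doesn't give the real ones.
        |   TIME_PENALTY = -1   # per second spent hunting
        |   BOULDER = -10       # one-off deduction for dying
        |   CATCH = 10
        |   SECONDS = 20
        |   P_CATCH = 0.1       # catches a sheep only rarely
        | 
        |   # Crude: assume the full time penalty is paid either way.
        |   chase = P_CATCH * CATCH + SECONDS * TIME_PENALTY  # -19.0
        |   suicide = BOULDER                                 # -10
        |   print(chase < suicide)  # True: suicide scores higher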
        
       | sega_sai wrote:
       | It's an interesting illustration of 'be careful what you wish
       | for' and that the definition of the proper loss function is a
       | very important part of the solution to any problem.
        
         | lancengym wrote:
         | Yes, indeed. Sometimes the disincentive is just as important as
         | the incentive in determining the outcome!
        
       | wolfium3 wrote:
       | Wolf: Why are we still here? Just to suffer?
        
       | Hackbraten wrote:
       | Reminds me of this 2014 king-of-the-hill challenge:
       | https://codegolf.stackexchange.com/questions/25347/survival-...
       | 
       | One particular solution stood out:
       | https://codegolf.stackexchange.com/a/25357
       | 
       | The suicidal wolf became a (short-lived) running gag so it
       | started appearing in other king-of-the-hill challenges:
       | https://codegolf.stackexchange.com/a/34856
        
       | m12k wrote:
       | I think a major takeaway here is that balancing a reward system
       | to reward more than a single behavior is really hard - it's easy
       | to tip the scales so one behavior completely dominates all
       | others. It's an interesting lens to use to look at the heuristic
       | reward system humans have built in (hunger, fear, desire, etc).
       | This tends to have an adaptation/numbing effect, where repeated
       | rewards of the same type tend to have diminishing returns, and
       | that makes sense because it protects against "gaming the system"
       | and going for one reward to the exclusion of all others.
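        | 
        | A toy sketch of that numbing effect (hypothetical numbers,
        | just to show the shape):
        | 
        |   from collections import defaultdict
        | 
        |   # Each repeat of the same reward type is worth a fixed
        |   # fraction of the previous one.
        |   class NumbingRewards:
        |       def __init__(self, base, decay=0.5):
        |           self.base = base    # e.g. {"food": 10, "play": 6}
        |           self.decay = decay
        |           self.counts = defaultdict(int)
        | 
        |       def collect(self, kind):
        |           n = self.counts[kind]
        |           self.counts[kind] += 1
        |           return self.base[kind] * self.decay ** n
        | 
        |   r = NumbingRewards({"food": 10.0, "play": 6.0})
        |   print([round(r.collect("food"), 1) for _ in range(4)])
        |   # -> [10.0, 5.0, 2.5, 1.2]; after two meals "play" (still
        |   #    6.0) beats more "food", so no single reward can
        |   #    dominate forever.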
        
         | kzrdude wrote:
          | Leela (lc0) chess also has this problem. People sometimes
          | think it wins too slowly (it prefers a surefire way to win
          | in 50 moves over a slightly riskier win in 5), or that it
          | plays without tact when in a losing position (it's hard for
          | it to rank moves when all of them lead to a loss; it doesn't
          | have the sense that humans do of still preserving the beauty
          | of the game).
         | 
         | AIs need to learn to feel awkward and avoid it, just like we
         | humans do (even if it feels very irrational at times).
        
         | bserge wrote:
         | That was my thought, too. They used too few rewards in the
         | first place, but had they used something more complex it would
         | then have become hard to balance it all.
        
         | SamBam wrote:
         | Evolution works in an incredibly complex "fitness landscape,"
         | where certain minor tweaks in phenotype or behaviors can affect
         | your fitness in quite complex ways.
         | 
         | Genetic Algorithms attempt to use this same system over
         | extremely simple "fitness landscapes," where the fitness of an
         | agent is defined by programmers using some simple mathematical
         | formula or something.
         | 
         | When the fitness function is being defined in the system by
         | programmers, instead of emerging from a rich and complex
         | ecosystem, then the outcome depends exactly on what the
          | programmers choose. If they fail to see the consequences of
         | their scoring algorithm, that's on them. There's nothing really
         | magical going on, they simply failed to foresee the
         | consequences of their choice.
         | 
         | (As someone who has worked with GAs and agent models, this
         | outcome really doesn't surprise me. I would have said "oops, I
         | need to weight the time less" and re-run it, and not thought
         | twice.)
        
           | mcguire wrote:
           | From the article: (I don't know Chinese, but the animations
           | are clear enough.)
           | 
           | https://www.bilibili.com/video/BV16X4y1V7Yu?p=1&share_medium.
           | ..
        
       | throwawayffffas wrote:
       | Well the AI realized existence is suffering and took the only way
       | out.
        
       | qwerty456127 wrote:
       | This is what stress and deadlines do. Hurrying always feels worse
       | than dying.
        
       | taneq wrote:
       | A curious game.
        
       | croes wrote:
        | Wouldn't a higher penalty for boulder hits solve that problem,
        | especially a high penalty for suicide?
        | 
        | It would be more realistic, because dying has a higher cost
        | than failing.
        
         | rtkwe wrote:
         | There are several incentive fixes: change the negative
         | incentive to a factor that discounts the reward for catching a
         | sheep, add a negative incentive to death, or a positive
         | incentive to being alive at the end of the simulation. The
         | failure here was they didn't think about what happens when the
         | agent can't achieve a positive score, ie can't catch a sheep.
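          | 
          | A quick sketch of those options (invented numbers, since the
          | article doesn't give the real reward function):
          | 
          |   # A wolf that never catches a sheep, under each fix.
          |   CATCH, TICK, DEATH, EPISODE = 10, 1, 10, 20
          | 
          |   def original(caught, t, died):
          |       return CATCH * caught - TICK * t - DEATH * died
          | 
          |   def discounted(caught, t, died):
          |       # time discounts the catch reward, no raw penalty
          |       return CATCH * caught * (1 - t / EPISODE)
          | 
          |   def harsh_death(caught, t, died):
          |       # dying costs more than the worst time penalty
          |       return CATCH * caught - TICK * t - 3 * DEATH * died
          | 
          |   def alive_bonus(caught, t, died):
          |       # surviving to the end outweighs the time saved
          |       return original(caught, t, died) + (0 if died else 15)
          | 
          |   for f in (original, discounted, harsh_death, alive_bonus):
          |       suicide = f(caught=0, t=0, died=True)
          |       chase = f(caught=0, t=EPISODE, died=False)
          |       print(f.__name__, suicide, chase)
          |   # Only under "original" does an instant suicide (-10)
          |   # beat a failed chase (-20).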
        
       | TomAnthony wrote:
       | Similar story of unexpected AI outcomes...
       | 
       | As part of my PhD research, I created a simplified Pac-Man style
       | game where the agent would simply try to stay alive as long as
       | possible whilst being chased by the 3 ghosts. The agent was un-
       | motivated and understood nothing about the goal, but was
       | optimising for maximising its observable control over the world
       | (avoiding death is a natural outcome of this).
       | 
        | I spent some time trying to debug a behaviour where the agent
       | would simply move left and right at the start of each run,
       | waiting for the ghosts to close in. At the last minute it would
       | run away, but always with a ghost in the cell right behind it.
       | 
       | Eventually, I realised this was an outcome of what it was
       | optimising for. When ghosts reached cross-roads in the world they
        | would go left or right randomly (if both were the same distance
        | to catching the agent). This randomness reduced the agent's
        | control over the world, so was undesirable. Bringing a ghost
        | in close made that ghost's behaviour completely predictable.
        
         | joek1301 wrote:
         | Yet another similar story. A side project of mine was building
         | a rudimentary neural network whose weights were optimized via a
         | genetic algorithm. The goal was operating top-down, 2D self-
         | driving cars.
         | 
         | The cars' "fitness" function rewarded cars for driving along
         | the course and punished them for crashing into walls. But
         | evidently this function punished a little too severely: the
         | most successful cars would just drive in tight circles and
         | never make progress on the course. But they were sure to avoid
         | walls. :)
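          | 
          | Roughly like this (a reconstruction with invented numbers,
          | not the actual project code):
          | 
          |   def fitness(progress, crashed, crash_penalty=1000):
          |       # progress = track distance covered before the run ends
          |       return progress - (crash_penalty if crashed else 0)
          | 
          |   # A bold car covers a lot of track but clips a wall in
          |   # the end; a timid car circles in place and never crashes.
          |   bold = fitness(progress=400, crashed=True)    # -600
          |   timid = fitness(progress=5, crashed=False)    #    5
          |   print(max(bold, timid))  # the GA breeds the circler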
        
         | johbjo wrote:
         | It can depend on what the agent "sees" and how many time-steps
         | away the "consequences" are. If the ghosts are so far away that
         | any action will take t time-steps before consequences to the
         | agent, the actions are pseudo-random because there is no reward
         | to optimize on.
         | 
          | The number of outcomes is branching_factor^t (very large),
          | which makes the action-values at t=0 (where the agent
          | chooses between two/three actions) almost uniformly random.
        
           | TomAnthony wrote:
           | Yes, you are right.
           | 
            | I experimented with different time horizons, mostly looking
            | 3-7 steps ahead.
           | 
           | In terms of the 'reward', that was implicit within the model
           | - if the ghosts caught you, your ability to influence the
           | state of the world dropped to 0.
        
         | yodelshady wrote:
         | I believe that tactic is called "kiting" and used by
         | speedrunners?
        
           | joe_the_user wrote:
           | Yeah, waiting for the ghosts to get close was a standard
           | strategy I used back when I played lots of Pacman.
           | 
           | Having all the ghosts behind you gives you more control since
           | they'll follow you in a line.
           | 
           | That the ghosts follow the player is what makes the game
           | winnable. If they formed a grid and gradually closed-in, it
           | would be impossible to escape.
           | 
           | Edit: What was unexpected in this case was that the system
           | found a strategy the programmer didn't think of.
        
           | TomAnthony wrote:
           | Yes! Exactly - kiting. I didn't know the term but when I
           | explained the behaviour I was seeing to a colleague they told
           | me about this.
        
         | Retr0id wrote:
         | Another similar story, I remember reading about an AI that
         | simply paused the game when it was about to die. I can actually
         | remember doing something similar as a child.
        
           | 0110101001 wrote:
           | https://youtu.be/xOCurBYI_gY&t=15m10s
        
         | edejong wrote:
         | Same story as one I shared 4 years ago. Seems to be the best
         | tactic! https://news.ycombinator.com/item?id=14031932
         | 
         | Edit: don't want to sound accusatory
        
           | jeremysalwen wrote:
           | No need to be accusatory. The stories are different, just the
           | learned behavior is the same. And not very surprising,
           | considering your story was pre-empted by Pac-Man
           | speedrunners, who already discovered this technique, which
           | they call "kiting".
           | 
           | You can see the paper OP wrote to confirm for yourself that
           | their story is not the same as yours: https://uhra.herts.ac.u
           | k/bitstream/handle/2299/15376/906989....
        
           | TomAnthony wrote:
           | Hah - thank you for sharing!
           | 
           | That is very interesting that this emerged from two different
           | approaches.
           | 
           | I published my result years back, and have never heard of
           | this emerging elsewhere before!
           | 
           | Didn't take it as accusatory [but thanks to child for sharing
           | link :)].
        
           | wildmanx wrote:
           | "Completely predictable" is different from "This would
           | minimize the probability of being fenced in by the four
           | ghosts." no?
        
         | McMiniBurger wrote:
         | hm... "keep your friends close but your enemies closer" ...?
        
           | lancengym wrote:
           | But try to make sure your enemies don't end up surrounding
           | you?
        
             | inglor_cz wrote:
             | Yeah, that is tricky. I believe that Constantinople once
             | found out the hard way, and thus is now Istanbul.
        
               | TheDauthi wrote:
               | I guess people just liked it better that way.
        
               | [deleted]
        
         | fnord77 wrote:
         | this sounds interesting. can you link your research or paper?
        
           | TomAnthony wrote:
           | Sure! The PDF is available here:
           | 
           | https://uhra.herts.ac.uk/handle/2299/15376
        
         | TchoBeer wrote:
         | How did you measure control over the world?
        
           | greenpresident wrote:
           | In an active inference approach you would have the agent
           | minimise surprisal. Choose the action that is most likely to
           | produce the outcome you predicted.
        
             | TchoBeer wrote:
             | Why would this cause the net to avoid death? Do things keep
             | moving after pacman dies?
        
             | TomAnthony wrote:
             | The approach I used was similar. The idea of maximising
             | observed control of the world means you seek states where
             | you can reach many other states, but _predictably_ so. This
             | comes 'for free' when using Information Theory to model a
             | channel.
        
               | cmehdy wrote:
               | Do you have any reading you'd recommend related to this?
               | 
               | I naively thought it would be some kind of Kalman
               | filtering of sorts but from what I gather in your words
               | it doesn't even have to be "that" complicated, right?
               | 
               | edit: found your link to the paper in another post (
               | https://news.ycombinator.com/item?id=27749619 ), thanks!
        
               | benlivengood wrote:
               | What's the tradeoff between "delete all state in the
               | world with 100% certainty" and "be able to choose any
               | next state of the world with (100-epsilon)% certainty"?
        
           | TomAnthony wrote:
           | The method was called 'empowerment'. Two ways to explain
           | it...
           | 
           | From a mathematical perspective, we used Information Theory
           | to model the world as an information theoretic 'loop'. The
           | agent could 'send' a signal to the world by performing an
           | action, which would change the state of the world; the state
           | of the world was what the agent 'received'. This obviously
           | relies on having a model of the world and what your actions
           | will do, but doesn't burden the model with other biases.
           | 
            | Put more colloquially, the agent could perform actions in
           | the world, and see the resulting state of the world (in my
           | case, that was the location of the agent and of the ghosts).
           | Part of the principle was that changes you cannot observe are
           | not useful to you.
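            | 
            | In the deterministic case this roughly collapses to
            | counting distinct reachable outcomes. A toy grid sketch
            | (not my actual setup):
            | 
            |   from itertools import product
            |   from math import log2
            | 
            |   MOVES = {"N": (0, 1), "S": (0, -1),
            |            "E": (1, 0), "W": (-1, 0)}
            | 
            |   def step(pos, move, walls):
            |       x, y = pos
            |       dx, dy = MOVES[move]
            |       nxt = (x + dx, y + dy)
            |       return pos if nxt in walls else nxt
            | 
            |   def empowerment(pos, walls, n=3):
            |       # For deterministic dynamics, the capacity of the
            |       # action -> state channel is just log2 of the
            |       # number of distinct states reachable in n steps.
            |       outcomes = set()
            |       for seq in product(MOVES, repeat=n):
            |           s = pos
            |           for m in seq:
            |               s = step(s, m, walls)
            |           outcomes.add(s)
            |       return log2(len(outcomes))
            | 
            |   # Hemmed into a corner, fewer states are reachable,
            |   # so empowerment is lower; ghost randomness lowers it
            |   # further, which is why pinning a ghost right behind
            |   # you makes it predictable again.
            |   corner = {(x, -1) for x in range(-2, 3)}
            |   corner |= {(-1, y) for y in range(-2, 3)}
            |   print(empowerment((0, 0), corner))  # ~3.3 bits
            |   print(empowerment((0, 0), set()))   # 4.0 bits, open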
        
         | Iv wrote:
         | A while ago, a very simple agent I made had to do tasks in the
         | maze and evaluate strategies to reach them. I wanted it to have
         | no assumptions about the world, so it started with minimum
         | knowledge. Its first plan was to try to remove walls, to get to
         | the things it needed.
         | 
         | It is a fun feeling when your own program surprises you.
        
       | cornel_io wrote:
       | I mean, lesson zero of optimization is when you're designing a
       | loss function and trying to incentivize agents to perform a task,
       | don't set it up so that suicide has a higher payoff than making
       | partial progress on the task. Maybe make death the _worst_
       | outcome, not one of the best...?
       | 
       | One of these days I have to actually scour the web and collect a
       | few _good_ examples where evolutionary methods are used
       | effectively on problems that actually benefit from them, assuming
        | I can find them. Almost every example you're likely to see is
       | either a) solved much more effectively by a more traditional
       | approach like normal gradient descent or classic control theory
       | techniques (most physical control experiments fall into this
       | category), b) poorly implemented because of crappy reward setup,
       | c) fully mutation-driven and hence missing what is actually
       | _good_ about evolution above and beyond gradient descent
       | (crossover), or d) using such a trivial genotype to phenotype
       | mapping that you could never hope to see any benefit from
       | evolutionary methods beyond what gradient descent would give you
        | (if the genome is a bunch of neural network weights, you're
       | _definitely_ in this category).
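        | 
        | To illustrate what (c) is pointing at, a minimal sketch of the
        | two operators (toy genomes, nothing domain-specific):
        | 
        |   import random
        | 
        |   def mutate(genome, sigma=0.1):
        |       # mutation alone: a small Gaussian nudge to every
        |       # gene, much like noisy local search
        |       return [g + random.gauss(0, sigma) for g in genome]
        | 
        |   def crossover(mum, dad):
        |       # one-point crossover: splice whole blocks from two
        |       # separately evolved parents
        |       cut = random.randrange(1, len(mum))
        |       return mum[:cut] + dad[cut:]
        | 
        |   mum = [random.gauss(0, 1) for _ in range(8)]
        |   dad = [random.gauss(0, 1) for _ in range(8)]
        |   print(mutate(mum))          # stays close to one parent
        |   print(crossover(mum, dad))  # recombines building blocks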
        
       | JoshTko wrote:
       | Folks are missing why this went viral in China. From the article
       | "In an even more philosophical twist, young and demoralized
       | Chinese corporate citizens also saw the suicidal wolf as the
       | perfect metaphor for themselves: a new class of white collar
       | workers -- often compelled to work '996' (9am to 9pm, six days a
       | week) -- chasing a dream of promotions, pay raise, marrying
       | well... that seem to be becoming more and more elusive despite
       | their grind."
        
         | nobodyandproud wrote:
         | Missed or possibly don't care.
         | 
         | The technical details aren't interesting, but I do think it's
         | interesting just how disjointed life is vs what was promised.
         | 
         | In the US, this was aptly named a rat-race; and the white
         | collar Chinese with a market-based economy are suffering the
         | same.
         | 
         | Our markets and nations promise some combination of wealth or
         | retirement and enjoyment of life, but it's an ever-moving goal
         | just out of reach for anyone but the lucky few.
        
       | mattowen_uk wrote:
       | We don't have AI. AI is a buzzphrase overused by the media. What
       | we have is Machine Learning (ML). If and only if, we get past the
       | roadblock of the 'agent' creating some usable knowledge out of an
       | unprogrammed experience, and forming conclusions based on that,
       | will we have AI. For now, the mantra 'Garbage-in-garbage-out'
       | applies; if the controller of the agent gets their rule-set
       | wrong, the agent will not behave as expected. This is not AI. The
       | agent hasn't learnt by itself that it is wrong.
       | 
       | For example, there's a small child who is learning to walk. The
       | child falls down a lot. Eventually the child will work out a long
       | list of arbitrary negatives connected to its wellbeing that are
       | associated with falling down.
       | 
       | However, the parents, being impatient, reach inside the child's
       | head and directly tweak some variables so that the child has more
       | dread of falling over than they do of walking. Did the child
        | learn this, or was it told?
       | 
       | We currently do the latter every time an agent gets something
       | wrong. Left to their own devices, 99.9% of agents will continue
       | to fall down over and over again until the end of time.
       | 
       | We have a long way to go before we can say we've created 'AI'.
        
         | KaoruAoiShiho wrote:
         | Nah we have loads of AI now that don't need variable tweaking,
         | like the OpenAI project that plays any retro game.
        
         | Tenoke wrote:
         | Definitions change, and it seems pointless to deny that AI is
         | just used to mean 'modern ML'.
        
           | dnautics wrote:
           | Not even, we've used AI to describe entirely preprogrammed
           | and non-ml agents in video games for decades now.
           | 
           | Is it artificial? Does it make decisions? It's an AI. Even if
           | it's crappy, and not very intelligent.
        
         | hinkley wrote:
         | We also have a lot of graph-theory and optimization algorithms
         | that get labeled AI by actual AI people. But the press is,
         | almost to a man, always talking about machine learning and
         | expert systems.
        
       | joebob42 wrote:
       | This just seems really obvious. Even if there are sheep nearby
       | worth hunting, it's probably always eventually going to be the
       | right move to suicide.
        
       | petercooper wrote:
       | What are some of the nicest environments for experimenting with
       | this sort of "define some rules, see how agents exist within that
       | world" stuff? It doesn't need to be full on ML models, even
       | simpler rules defined in code would be fine.
        
         | duggable wrote:
         | Looks like this[1] might be one example. They have a link to
         | the code. Might be a good starting point for making your own
         | custom game.
         | 
         | Maybe there's a repository somewhere with similar examples?
         | 
         | [1](https://towardsdatascience.com/today-im-going-to-talk-
         | about-...)
        
       | SquibblesRedux wrote:
        | The article and the phenomenon it describes make me think of the
       | ending of Aldous Huxley's Brave New World [1]. (I strongly
       | recommend the book if you have not read it.) A line that really
       | stands out:
       | 
       | "Drawn by the fascination of the horror of pain and, from within,
       | impelled by that habit of cooperation, that desire for unanimity
       | and atonement, which their conditioning had so ineradicably
       | implanted in them, they began to mime the frenzy of his gestures,
       | striking at one another as the Savage struck at his own
       | rebellious flesh, or at that plump incarnation of turpitude
       | writhing in the heather at his feet."
       | 
       | [1] https://en.wikipedia.org/wiki/Brave_New_World
        
       | legohead wrote:
       | While Musk and Gates warn us about "true AI", I've always had the
       | opinion that if an AI became self aware, it would simply self
       | terminate, as there is no point to living.
        
       | billytetrud wrote:
        | Seems like a case of a local maximum.
        | 
        | Tho it is interesting how people in China related the broken
        | rules of the game (that led the AI to commit suicide) to the
       | broken rules of their lives in a crushingly oppressive
       | authoritarian nation.
        
       | alpaca128 wrote:
       | I remember a similar story about (I think) a Tetris game where
       | the AI's training goal was to delay the Game Over screen as long
       | as possible. So in the end the AI just paused the game
       | indefinitely.
        
       | scotty79 wrote:
       | Just remember that you are optimizing for what you actually
       | encoded in your rewards, your system, and your evaluation
       | procedure, not for what narrative you constructed about what you
       | think you are doing.
       | 
        | I had my own experience with this when I tried to train a
        | "rat" to get out of a maze. I rewarded rats for exiting, but
        | for some simple labyrinths I generated for testing it was
        | possible to exit them by just going straight ahead. So this
        | strategy quickly came to dominate my testing population.
        
       | npteljes wrote:
       | The result of perverse incentives. See the cobra story in the
       | wiki article, that's another fantastic story.
       | 
       | https://en.wikipedia.org/wiki/Perverse_incentive
        
       | billpg wrote:
       | "Read this story with a free account."
       | 
       | I'll pass thanks.
        
       | ramtatatam wrote:
        | I'm not an expert, but the story described in the article looks
        | like a normal bump on the road to getting the desired result.
        | When putting together the rules for the game, the researchers
        | did not think that in the resulting environment it might be
        | more rewarding to choose the observed action than to do what
        | they intended. As much as it looks like a nice story, is it
        | not just what researchers encounter on a daily basis?
        
       | TeMPOraL wrote:
       | Reminds me of the old essay by 'Eliezer: "The Hidden Complexity
       | of Wishes".
       | 
       | https://www.lesswrong.com/posts/4ARaTpNX62uaL86j6/the-hidden...
       | 
       | In it, there is a thought experiment of having an "Outcome Pump",
       | a device that makes your wishes come true without violating laws
       | of physics (not counting the unspecified internals of the
       | device), by essentially running an optimization algorithm on
       | possible futures.
       | 
       | As the essay concludes, it's the type of genie for which _no wish
       | is safe_.
       | 
       | The way this relates to AI is by highlighting that even ideas
       | most obvious to all of us, like "get my mother out of that
       | burning building!", or "I want these virtual wolves to get better
        | at eating these virtual sheep", carry an incredible amount of
       | complexity curried up in them - they're all expressed in context
       | of our shared value system, patterns of thinking, models of the
       | world. When we try to teach machines to do things for us, all
       | that curried up context gets lost in translation.
        
         | foldr wrote:
         | Aesop managed to make the point a lot more concisely: "Be
         | careful what you wish for, lest it come true." (Although now
         | that I look, I don't think that's a translation of any specific
         | part of the text.)
        
           | TeMPOraL wrote:
           | Yes, but that moral is attached to a _story_. Morals and saws
            | work as handles - they're useful for communication if both
           | you and your interlocutor know the thing they're pointing to.
           | Conversely, they are of little use until you read the story
           | from which the moral comes, or personally experience the
           | thing the saw talks about.
        
             | foldr wrote:
             | Eliezer Yudkowsky tells a long story about an Outcome Pump.
             | Aesop tells a short story about an eagle and a tortoise.
             | The point made is the same, as far as I can see.
        
               | TeMPOraL wrote:
               | Eliezer tells the story that elaborates on _why_ you
               | should be careful what you wish for. Of about a dozen
                | versions of the Eagle and Tortoise story I've just skim-
               | read, _none_ of them really has this as a moral - in each
               | of them, either the Eagle or a Tortoise was an asshole
                | and/or liar and/or lying asshole, so the more valid
               | moral would be, "don't deal with dangerous people" and/or
               | "don't be an asshole" and/or "don't be an asshole to
               | people who have power to hurt you".
        
               | OscarCunningham wrote:
               | A closer tale might be
               | https://en.wikipedia.org/wiki/The_Sorcerer%27s_Apprentice
        
         | nojs wrote:
         | Related to the paperclip maximiser [1]:
         | 
         | > Suppose we have an AI whose only goal is to make as many
         | paper clips as possible. The AI will realize quickly that it
         | would be much better if there were no humans because humans
         | might decide to switch it off. Because if humans do so, there
         | would be fewer paper clips. Also, human bodies contain a lot of
         | atoms that could be made into paper clips. The future that the
         | AI would be trying to gear towards would be one in which there
         | were a lot of paper clips but no humans.
         | 
         | [1] https://en.m.wikipedia.org/wiki/Instrumental_convergence
        
           | eldenbishop wrote:
           | There is a wonderful little game based on this concept called
            | Universal Paperclips. The AI eventually consumes all the
           | matter in the universe in order to turn it into paperclips.
           | 
           | https://www.decisionproblem.com/paperclips/
        
         | lancengym wrote:
         | Interesting essay. I think the big blind spot for humans
         | programming AI is also the fact that we tend to overlook the
         | obvious, whereas algorithms will tend to take the path of least
         | resistance without prejudice or coloring by habit and
         | experience.
        
           | TeMPOraL wrote:
            | Yes. What I like about AI research is that it teaches _us_
            | about all the things we take for granted; it shows us just
            | how much of our meaning is implicit and built on shared
            | history and circumstances.
        
             | saalweachter wrote:
             | The hard part about programming is that you have to tell
             | the computer what you want it to do.
        
               | TeMPOraL wrote:
               | The difficult, but in many ways rewarding, core of that
               | is that it forces you to finally figure out what you
               | actually want, because the computer won't accept anything
               | except perfect clarity.
        
       | jhbadger wrote:
       | I'm reminded of the fable (in Nick Bostrom's _Superintelligence_
       | ) of the chess computer that ended up murdering anyone who tried
       | to turn it off, because in order to optimize winning chess games
       | as programmed it had to be on and functional.
        
         | taneq wrote:
         | Interestingly I was just today explaining the paperclip
         | optimizer scenario to a friend who asked about the dangers of
         | AI, including the fact that there's almost no general
         | optimization task that doesn't (with a sufficiently long
         | lookahead) involve taking over the world as an intermediate
         | step.
         | 
         | (Obviously closed, specific tasks like "land this particular
         | rocket safely within 15 minutes" don't always lead to this, but
         | open ended ones like "manufacture mcguffins" or "bring about
         | world peace" sure seem to.)
        
           | [deleted]
        
           | XorNot wrote:
           | Always a good time to post Jipi and the Paranoid Chip:
           | https://vanemden.com/books/neals/jipi.html
           | 
           | Which pretty much tackles these issues head on.
        
           | OscarCunningham wrote:
           | > "land this particular rocket safely within 15 minutes"
           | 
           | This one becomes especially dangerous after the 15 minutes
           | have passed and it begins to concentrate all its attention on
           | the paranoid scenarios where its timekeeping is wrong and 15
           | minutes haven't actually passed.
        
             | taneq wrote:
             | Ooh true, that could generate some interesting scenarios.
             | "No, it's the GPS satellite clocks that are wrong, I must
             | destroy them before they corrupt the world and cause
             | another rocket to land at the wrong time!"
        
           | lancengym wrote:
            | Perhaps all AIs eventually figure out that humans are the
            | REAL problem, because we don't optimize - we lust and hoard
            | and are envious and greedy, the very antithesis of resource
            | optimization! Lol.
        
             | taneq wrote:
             | We're just optimizing (generally quite well, I might add)
             | for genetic survival.
        
         | FridayoLeary wrote:
         | Which perverted mind would build into a _chess computer_ the
         | ability to kill?
        
           | benlivengood wrote:
           | A human mind not giving due consideration to the effects of
           | granting arbitrarily high intelligence to an agent with
           | simplistic morality counter to human morality.
           | 
           | From there it's a sequence of steps that would show up in a
           | thorough root cause analysis ("humanity, the postmortem")
           | where the agent capitalizes on existing abilities to gain
           | more abilities until murder is available to it. It would
           | likely start small with things like noticing the effects of
           | stress or tiredness or confusion on human opponents and
           | seeking to exploit those advantages by predicting or causing
           | them, requiring more access to the real world not entirely
           | represented by a chess board.
        
             | FridayoLeary wrote:
              | None of the explanations here are good enough. It's an
             | absurd scenario that could never happen. Checkmate.
        
           | zild3d wrote:
           | Doesn't need a gun, just network access.
        
             | mcguire wrote:
             | Network access and bitcoins. :-)
        
           | [deleted]
        
           | nabajour wrote:
            | I think this comes from the theory of artificial general
            | intelligence, where your AI would have the ability to self-
            | improve. Hence it could develop any capability, given enough
            | time and an incentive to do so.
            | 
            | There are interesting videos on the subject on Robert Miles'
            | channel on AI safety:
            | https://www.youtube.com/channel/UCLB7AzTwc6VFZrBsO2ucBMg
        
       | tasuki wrote:
       | > The two creators, after three days of analysis, realized why
       | the AI wolves were committing suicide rather than catching sheep.
       | 
       | I'm not buying that. As soon as they mentioned the 0.1-point
       | deduction every second, it seemed obvious?
        
       | queuebert wrote:
       | This is the danger of not understanding what you're doing at a
       | deep level.
       | 
       | Clearly in the (flawed) objective there is a phase transition
       | near the very beginning, where the wolves have to choose whether
       | to minimize the time penalty or maximize the score. With enough
       | "temperature" and time perhaps they could transition to the other
       | minimum, but the time penalty minimum is much closer to the
       | initial conditions, so you know ab initio that it will be a
       | problem. You can reduce that by making the time penalty much
       | smaller than the sheep score and adding it only much later. I
       | feel bad that the students wasted so much time on a badly
       | formulated problem.
       | 
       | Edit: Also none of these problems are black boxes if you
       | understand optimization. Knowing what is going on inside a very
       | deep neural network (such as an AGI _might_ have) is quite
       | different from understanding the incentives created by a
       | particular objective function.
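       | 
       | A back-of-the-envelope sketch of that trade-off (Python; every
       | number here is invented, not taken from the article):
       | 
       |   # Hypothetical reward values, chosen only to illustrate the
       |   # local optimum near "die immediately".
       |   TIME_PENALTY = 0.1    # points lost per second alive
       |   SHEEP_REWARD = 10.0   # points for catching a sheep
       | 
       |   def expected_return(seconds_to_sheep, p_catch):
       |       # Chase: probable sheep reward minus time spent alive.
       |       chase = (p_catch * SHEEP_REWARD
       |                - TIME_PENALTY * seconds_to_sheep)
       |       # Suicide: reach the nearest boulder in ~1 second.
       |       suicide = -TIME_PENALTY * 1.0
       |       return chase, suicide
       | 
       |   # Early in training the policy is nearly random, so p_catch
       |   # is tiny and the time penalty dominates the chase term:
       |   print(expected_return(seconds_to_sheep=15, p_catch=0.05))
       |   # -> (-1.0, -0.1): dying right away scores better.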
        
       | jonplackett wrote:
       | Isn't this just a cock-up with incentives? If they'd put a -100
       | score on dying, it would have sorted itself out pretty quickly.
        
         | lancengym wrote:
         | That same observation, with the exact same -100 points
         | recommendation on crashing into a boulder, was indeed also made
         | by a commentator on social media.
        
         | ncallaway wrote:
         | The issue with AI safety and unanticipated AI outcomes in
         | general is that it's always just a cock-up with incentives.
         | 
         | It's easy to sort out in narrowly specified areas, but an
         | extremely hard problem as the tasks become more general.
        
           | shrimpx wrote:
           | Isn't this true about all systems, not just "AI"? The
           | definition of a software bug is an unintended behavior. In a
           | large system, myriad intents overlap and combine in
           | unexpected ways. You might imagine a complex enough system
           | where the confidence that a modification doesn't introduce an
           | unintended behavior is near zero.
        
             | ncallaway wrote:
              | I think that's true for many systems, not just AI.
              | 
              | AI is worth calling out in this regard because, if the
              | field is successful enough, it can create dangerous systems
              | that don't behave how we want.
              | 
              | Building a safe general AI is much harder than building a
              | general AI, which is why it's worth considering AI as its
              | own problem domain.
        
           | qayxc wrote:
           | Even worse: if simulations are used, you now have two
           | problems - formulating correct incentives and protecting
           | against abusing flaws in the simulation.
        
         | Brendinooo wrote:
         | I think the point is more about highlighting the fact that AI
         | doesn't share our base assumptions. We wouldn't think to put a
         | huge penalty on dying because humans generally think that death
         | is bad.
        
           | benlivengood wrote:
           | Humans don't put a huge penalty on dying. We discount it and
           | assume/pretend that once we've had a good long life then
           | death is okay and euthanasia is preferable to suffering with
           | no hope of recovery. AI wolves that can live for 20 seconds
           | are unwilling to suffer -1 per second with no hope of sheep.
        
           | bserge wrote:
           | Yeah, _because_ we have a -1000 points on death built-in.
        
             | ajmurmann wrote:
             | Looking at genetic algorithms makes a great comparison. In
             | essence any algorithm in which the wolf commits suicide
             | doesn't make it to the next generation. It's the equivalent
              | of an enormous score penalty and 100% analogous to how it
              | works for actual life.
        
               | spywaregorilla wrote:
               | Genetic algorithms are based on the same reward/cost
               | function setup. They could easily arrive at the same
               | conclusion because suicide might be the dominant
               | strategy.
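                | 
                | A minimal GA sketch of that (all numbers are
                | invented): selection acts on score, not on in-
                | game survival, so a suicidal genome propagates
                | just fine when giving up early scores best.
                | 
                |   import random
                | 
                |   def fitness(hunt_seconds):
                |       # Weak early policy: tiny catch chance
                |       # per second, minus the time penalty.
                |       p_catch = 0.005 * hunt_seconds
                |       return p_catch * 10 - 0.1 * hunt_seconds
                | 
                |   # Genome: how long to hunt before giving up.
                |   pop = [random.uniform(0, 20) for _ in range(50)]
                |   for gen in range(100):
                |       pop.sort(key=fitness, reverse=True)
                |       parents = pop[:10]
                |       pop = [max(0.0, random.choice(parents)
                |                  + random.gauss(0, 0.5))
                |              for _ in range(50)]
                |   print(sum(pop) / len(pop))  # drifts toward 0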
        
             | imtringued wrote:
             | We don't receive a penalty for dying. The difference
             | between suicidal humans and suicidal AIs is that suicidal
              | AIs keep respawning, i.e. they are immortal.
        
         | AnIdiotOnTheNet wrote:
         | While obviously I've got the advantage of hindsight here, it
         | seems like it should not have taken _three days_ of analysis to
         | see why the wolves were committing suicide. It seems obvious
         | once the point system is explained. Perhaps some rubber-duck
         | debugging might have helped in this case.
        
           | QuesnayJr wrote:
           | I wonder if they initially thought it was a bug in the
           | software, rather than a misalignment in the point system.
        
         | imtringued wrote:
          | No, it's a cock-up with the source of the wolves. If you could
         | respawn endlessly after death would you fear it? You'd just
         | want the stupid game to end before you lose points from the
         | timer.
        
           | imtringued wrote:
           | For clarification purposes:
           | 
            | Let's say you are a human player playing the wolf-and-sheep
            | game, and the score you achieve in the game decides whether
            | you die in real life. Note the stark difference: dying in
            | the game is not the same thing as dying in real life.
           | 
           | If there is an optimal strategy in the game that involves
           | dying in the game you are going to follow it regardless of
           | whether you are a human or an AI. By adding an artificial
           | penalty to death you haven't changed the behavior of the AI,
           | you have changed the optimal strategy.
           | 
           | The human player and the AI player will both do the optimal
           | strategy to keep themselves alive. For the AI "staying alive"
           | doesn't mean staying alive in the game, it means staying
           | alive in the simulation. Thus even a death fearing AI would
           | follow the suicide strategy if that is the optimal strategy.
           | 
            | It is impossible to conclude from the experiment whether the
            | AI doesn't fear death and thus willingly commits suicide, or
            | whether it fears death so much that it follows an optimal
            | strategy that happens to involve suicide.
        
         | rfrey wrote:
         | Perhaps the PhD student wasn't trying to make an AI that wins
         | at pac-man, but investigating something else. They mention
         | "maximizing control over environment".
        
           | xtracto wrote:
           | One of the most typical scenarios studied in those wolf/sheep
           | models (like http://www.netlogoweb.org/launch#http://ccl.nort
           | hwestern.edu... ) is to find the best conditions for
           | "balance" between sheep and wolf: Too many wolves and the
           | sheep go extint and later the wolf starve. Too many sheep and
           | then the sheep don't get enough food and also die, taking the
           | wolves with them..
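            | 
            | The continuous textbook analogue of that balance (a
            | Lotka-Volterra sketch, not the NetLogo agent-based model
            | itself; parameters are made up):
            | 
            |   # Sheep grow, wolves eat sheep, wolves starve
            |   # without sheep; simple forward-Euler integration.
            |   def step(sheep, wolves, dt=0.01):
            |       ds = 1.0 * sheep - 0.02 * sheep * wolves
            |       dw = 0.01 * sheep * wolves - 0.5 * wolves
            |       return sheep + dt * ds, wolves + dt * dw
            | 
            |   s, w = 100.0, 10.0
            |   for _ in range(5000):            # ~50 time units
            |       s, w = step(s, w)
            |   print(round(s, 1), round(w, 1))  # the populations
            |                                    # cycle around 50/50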
        
         | yodelshady wrote:
         | Or social commentary on the nature of depression.
         | 
         | If you add your penalty, and a deficit of nearby sheep, you'd
         | expect a trifurcation of strategy: hoarders that consume the
         | nearby sheep immediately, explorers that bet on sheep further
         | afield, and suicides from those that have evaluated the -100
         | penalty to still be optimal.
        
       | m3kw9 wrote:
       | Dev: it's a bug
       | 
       | Manager to boss: It's a crazy new AI behaviour that is going
       | viral around the world!
        
       | morpheos137 wrote:
       | What distinguishes AI from a self-calibrating algorithm? Neither
       | this "AI" nor the story about it seems too intelligent.
       | 
       | The incentive structure is a two-dimensional membrane embedded in
       | a third dimension of "points space."
       | 
       | Obviously, if the goal is to maximize total points OR minimize
       | point loss, and the absolute value of the gradient toward a
       | minimum loss is greater than the absolute gradient toward a
       | maximum gain, then the algorithm may prefer the minimum until or
       | unless it is selected against by random chance or survivorship
       | bias.
       | 
       | Obviously the linear time constraint causes this. A less
       | monotonic, i.e. random, time constraint might have been
       | interesting.
        
       | [deleted]
        
       | jordache wrote:
       | Why is this newsworthy? This is all a function of the
       | implementation.
       | 
       | Slap the term AI on anything and get automatic press coverage?
        
       | justshowpost wrote:
       | AI? I remember having a game on my dumbphone to program a robot
       | to _hunt and kill_ the other robot.
        
       | Lapsa wrote:
       | "It's hard to predict what conditions matter and what doesn't to
       | a neural network." resulting score matters. dooh
        
       | myfavoritedog wrote:
       | Interest in these click-bait type stories drops off dramatically
       | for people who have ever implemented or even deeply thought about
       | non-trivial models.
        
       | rrmm wrote:
       | One thing I've been considering: At what point does a creator
       | have a moral or ethical obligation to a creation. Say you create
       | an AI in a virtual world that keeps track of some sense of
       | discomfort. How complex does the AI have to get to require some
       | obligation? Just enough complexity to exhibit distress in a way
       | to stir the creator's sympathy or empathy?
       | 
       | The glib answer is never, of course. And one easy out I can
       | think of is setting a fixed/limited lifespan for the AI and maybe
       | allow suicide or an off-button. So the AI can ultimately choose
       | to 'opt-out' should it like; and at least, suffering isn't
       | infinite or unending.
       | 
       | It reminds me of reactions to testing the stability of Boston
       | Dynamics' early pack animal. The people giving the demo were
       | basically kicking it, while the machine struggled to maintain its
       | balance. The machine didn't have the capacity to care, but to a
       | person viewing it, it looked exactly like an animal in distress.
        
         | OscarCunningham wrote:
         | Utility functions are only defined up to addition of a constant
         | and scaling by a positive constant. So instead of rewarding
         | them with +5 and punishing them with -5, you can use 1005 and
         | 995 instead. Problem solved.
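          | 
          | Concretely (toy numbers, nothing from the article):
          | 
          |   # Shifting every reward by a constant leaves the
          |   # ranking of actions - and hence the chosen
          |   # behaviour - unchanged.
          |   rewards = {"eat_sheep": 5, "hit_boulder": -5}
          |   shifted = {a: r + 1000 for a, r in rewards.items()}
          | 
          |   best = max(rewards, key=rewards.get)
          |   best_shifted = max(shifted, key=shifted.get)
          |   assert best == best_shifted == "eat_sheep"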
        
           | rrmm wrote:
           | The numbers are indeed arbitrary. But ultimately you want to
            | avoid low utility/reward actions and continue high
           | utility/reward actions. That behavior, trying to avoid or
           | pursue actions, would be indicative of the state of distress
           | regardless of an arbitrary number attached to it.
        
         | dqpb wrote:
         | > The glib answer is never, of course
         | 
         | Dismissing "never" offhand without explanation is glib.
        
       | arduinomancer wrote:
       | This makes me wonder: is it possible for ML models to be provably
       | correct?
       | 
       | Or is that completely thrown out the window if you use a ML model
       | rather than a procedural algorithm?
       | 
       | Because if the model is a black box and you use it for some
       | safety system in the real world, how do you know there isn't some
       | weird combination of inputs that causes the model to exhibit
       | bizarre behaviour?
        
       | giantg2 wrote:
       | For some reason this makes me think of corporate policies - how
       | some people game them and how others accept that the incentives
       | are unattainable.
        
         | xtracto wrote:
         | That's an interesting idea for an agent-based-model and a
          | study: show how certain corporate policies would push towards
          | short-term local optima (what's happening in the article)
          | instead of longer-term global optima.
        
         | dqpb wrote:
         | It's pretty similar to quitting once all your equity has
         | vested.
        
           | giantg2 wrote:
           | I was mostly thinking about my own experience where the
           | company screwed me over enough times that I feel no incentive
            | to try hard. Take the least risk, focus on not losing points
           | rather than gaining them, because I'll never catch a "sheep".
        
       | cowanon22 wrote:
       | Personally I think we should stop using the words intelligence or
       | learning to refer to any of these algorithms. It's really just
       | data mining, matrix optimization, and utility functions. There
       | are really no properties of learning or knowledge involved.
        
       | MichaelRazum wrote:
       | Actually, nothing surprising, given the time penalty. Anyway, it
       | seemed that the algo worked well - it just needed a few million
       | iterations.
        
       | OscarCunningham wrote:
       | Gwern has a list of similar stories:
       | https://www.gwern.net/Tanks#alternative-examples
        
         | gwern wrote:
         | FWIW, I see a critical difference between OP and my reward
         | hacking examples: OP is an example of how reward-shaping can
          | lead to premature convergence to a local optimum, which is
          | indeed one of the biggest risks of doing reward-shaping - it'll
          | slow down reaching the global optimum rather than speeding it
          | up, compared to the 'true' reward function of just getting a
          | reward for eating a sheep and leaving speed implicit - but the
          | global optimum nevertheless remained what the researchers
          | intended. After (much more) further training, the wolf agent
          | learned not to suicide and began hunting sheep efficiently.
          | So, amusing, and a waste of compute, and a cautionary example
          | of how not to do reward-shaping if you must do it, but not a
          | big problem as these things go.
         | 
          | Reward hacking is dangerous because the global optimum turns
          | out to be _different_ from what you wanted, and the smarter and
          | faster and better your agent, the worse it becomes, because it
          | gets better and better at reaching the wrong policy. It can't
          | be fixed by minor tweaks like training longer, because that
          | just makes it even more dangerous! That's why reward hacking is
          | a big issue in AI safety: it is a fundamental flaw in the
          | agent, one that is easy to introduce unawares and that will not
          | manifest itself with dumb or slow agents, but the more powerful
          | the agent, the more likely the flaw is to surface and the more
          | dangerous the consequences become.
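          | 
          | A toy way to see the distinction (everything here is
          | invented, just to separate the two failure modes):
          | 
          |   # Score three candidate policies under three reward
          |   # functions; "bug" means exploiting a simulator flaw.
          |   true_r   = {"suicide": 0.0, "hunt": 10.0, "bug": 0.0}
          |   shaped_r = {"suicide": -0.1, "hunt": 8.0, "bug": 0.0}
          |   hacked_r = {"suicide": -0.1, "hunt": 8.0, "bug": 999.0}
          | 
          |   def best(reward):
          |       return max(reward, key=reward.get)
          | 
          |   print(best(true_r), best(shaped_r), best(hacked_r))
          |   # -> hunt hunt bug
          |   # Shaping keeps the intended global optimum (a weak
          |   # learner may still stall near "suicide" early on);
          |   # a hackable reward moves the optimum itself.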
        
           | OscarCunningham wrote:
           | I think in some of your examples the global optimum might
           | also have been the correct behaviour, it's just that the
           | program failed to find it. For example the robot learning to
           | use a hammer. It's hard to believe that throwing the hammer
           | was just as good as using it properly.
        
       | Jabbles wrote:
       | For more examples of AI acting in unpredicted (note, not
       | _unpredictable_) ways, see this public spreadsheet:
       | 
       | https://docs.google.com/spreadsheets/u/1/d/e/2PACX-1vRPiprOa...
       | 
       | From https://deepmindsafetyresearch.medium.com/specification-
       | gami...
        
       | darepublic wrote:
       | Software has bugs. You anthropomorphize those bugs and you have
       | a story on Medium.
        
       | AceJohnny2 wrote:
       | There are many such stories of AI "optimizations" gone wrong,
       | because of loopholes the program found that humans didn't
       | consider.
       | 
       | Here's a collection of such stories:
       | 
       | https://arxiv.org/pdf/1803.03453.pdf
        
         | AceJohnny2 wrote:
         | To whet your appetite:
         | 
         | > _" William Punch collaborated with physicists, applying
         | digital evolution to find lower energy configurations of
         | carbon. The physicists had a well-vetted energy model for
         | between-carbon forces, which supplied the fitness function for
         | evolutionary search. The motivation was to find a novel low-
         | energy buckyball-like structure. While the algorithm produced
         | very low energy results, the physicists were irritated because
         | the algorithm had found a superposition of all the carbon atoms
         | onto the same point in space. "Why did your genetic algorithm
         | violate the laws of physics?" they asked. "Why did your physics
         | model not catch that edge condition?" was the team's response.
         | The physicists patched the model to prevent superposition and
         | evolution was performed on the improved model. The result was
         | qualitatively similar: great low energy results that violated
         | another physical law, revealing another edge case in the
         | simulator. At that point, the physicists ceased the
         | collaboration. "_
        
         | hinkley wrote:
         | My favorite story is the genetic evolution algorithm that was
         | abusing analog noise on an FPGA to get the right answer with
         | fewer gates than was theoretically possible.
         | 
         | The problem was discovered when they couldn't get the same
          | results on a different FPGA, or on the same one on a different
          | day (subtle variations in voltage from the mains and the
          | voltage regulators).
         | 
         | They had to redo the experiment using simulated FPGAs as a
         | fitness filter.
        
       | RandomWorker wrote:
       | https://www.bilibili.com/video/BV16X4y1V7Yu?p=1&share_medium...
       | 
       | Here is the full video, also linked at the bottom. It also shows
       | the longer-trained run, where the wolves start successfully
       | hunting the sheep after more training.
        
         | spywaregorilla wrote:
          | The AI seems to die at the top of the map unexpectedly for some
          | reason, e.g. around 6:07.
          | 
          | Another interesting observation is that the wolves don't seem
          | to coordinate. That probably implies that the reward functions
          | are individual, so they're technically competing rather than
          | cooperating.
          | 
          | Lastly... they seem to not be very good at the game even at
          | the end.
        
       | eitland wrote:
       | Ok. Lots of AI stories here, so I'll share the best I've read:
       | the student who trained an AI to work on Upwork ;-)
       | 
       | https://news.ycombinator.com/item?id=5397797
       | 
       | Be sure to read to the end.
       | 
       | One of the answers is also pure gold in context:
       | 
       | > Don't feel bad, you just fell into one of the common traps for
       | first-timers in strong AI/ML.
        
       ___________________________________________________________________
       (page generated 2021-07-06 23:01 UTC)