[HN Gopher] Reward Is Unnecessary [pdf]
       ___________________________________________________________________
        
       Reward Is Unnecessary [pdf]
        
       Author : optimalsolver
       Score  : 58 points
       Date   : 2021-06-28 09:01 UTC (2 days ago)
        
 (HTM) web link (tilde.town)
 (TXT) w3m dump (tilde.town)
        
       | mrkramer wrote:
        | I think intelligence is the ability to solve problems. There are
        | many types of intelligence and many types of problems. For
        | example, there is the hardcoded, primitive intelligence of
        | calculators, which enables them to solve math problems, and
        | there is the high intelligence of humans, an emergent, naturally
        | occurring product of evolution, which enables them (us) to solve
        | very complex problems of all sorts.
        
         | grey-area wrote:
         | But to solve a problem any more complex than 2 + 2 you need to
         | have a model of the world, however simplistic.
         | 
         | For example to count how many bananas are in a box properly,
         | without weird edge cases, you need to know what a banana is and
         | what it looks like and what a box is and what it looks like.
        
           | mrkramer wrote:
            | As I said when speaking of Alien Life [0], when talking
            | about naturally occurring intelligence it all comes down
            | to chemistry, biology, and evolution.
            | 
            | When speaking about AI, as far as I understand it should
            | be human-like in order to be called artificial
            | intelligence. If it merely mimics human intelligence and
            | solves trivial problems, then it is nothing more than the
            | calculator with primitive intelligence that I mentioned
            | before.
            | 
            | The thing you are talking about is evolution: you acquire
            | a model of the world step by step over millions of years
            | of evolution. But today computer scientists are trying to
            | hardcode human-like intelligence into computers, which is
            | extremely hard considering how long it took for humans to
            | become highly intelligent. I think a better approach
            | would be to write evolutionary algorithms, which perhaps
            | could yield human-like intelligence.
            | 
            | There is some secret ingredient in human evolution which
            | made us the most intelligent species on Earth; no species
            | is even remotely close to being as intelligent as we are.
            | I think nobody really knows what that secret ingredient
            | is.
           | 
           | [0] https://news.ycombinator.com/item?id=27574320
        
             | mannykannot wrote:
             | I certainly do not know what the secret ingredient is, but
             | it is quite plausible that a number of extinct species
             | (Neanderthals, Homo erectus, maybe Homo habilis) also had
             | it to some degree.
        
               | mrkramer wrote:
                | We made it because we were, and are, more cooperative
                | and more social; in other words, we gather together in
                | order to survive and work together. But other species
                | such as bees or ants are also very cooperative and
                | social, yet they are not highly intelligent, although
                | they have collective intelligence.
                | 
                | I think human anatomy played a crucial role in our
                | evolution. Hands are powerful tools which enabled us
                | to create and develop other tools and technologies.
        
       | wombatmobile wrote:
       | The limitation of world models, in the context of achieving
       | artificial general intelligence, is not the validity or
       | granularity or faithfulness of the model, but rather the
       | limitation argued by Hubert Dreyfus, that computers are not in
       | the world.
       | 
       | Whilst physics can be modelled, and hence kinematics and dynamics
       | can be modelled, intelligence, in the human sense, is different.
       | Intelligence, for humans, is sociological, and driven by biology.
       | 
       | Computers cannot parse culture because culture is comprised of
       | arbitrary, contradictory paradigms. Cultural values can only be
       | resolved in an individual context as an integration of a lifetime
       | of experiences.
       | 
       | Computers cannot do this because they cannot feel pleasure or
       | pain, fear or optimism, joy or sorrow, opportunity, disinterest
       | or attraction. They cannot grow older, give birth, or die. As a
       | consequence, they lack the evaluative tools of emotion and
       | experience that humans use to participate in culture.
       | 
       | But wait, you may protest. Computers don't need to feel emotions
       | since these can be modelled. A computer can recognise a man
       | pointing a gun at it demanding money, which is as good as the
       | ability to feel fear, right?
       | 
        | A computer can recognise faces, so surely it's only a small step
       | further to recognise beauty, which is enough to simulate the
       | feeling of attraction, right?
       | 
       | A computer won't feel sorrow, but it can know that the death of a
       | loved one or the loss of money are appropriate cues, so that is
       | as good as the feeling of sorrow, right?
       | 
        | The limitation of this substitution of modelling for emotions is
        | that the modelling, and remodelling, has to take place externally
        | to the computer. In biological organisms that are in the world,
       | each experience yields an emotional response that is incorporated
       | into the organism. The organism is the sum of its experiences,
       | mediated by its biology.
       | 
       | Consider this question: In a room full of people, who should you
       | talk to? What should you say to them? What should you not say?
       | 
       | A computer can only be programmed to operate in that environment
       | with respect to some externally programmed objective. e.g. if the
       | computer were programmed to maximise its chances of being offered
       | a ride home from a party, it might select which person to talk to
       | based on factors such as sobriety, and an observation of factors
       | indicating who had driven to the party in their own vehicle.
       | 
       | But without the externally programmed objective, how is the
       | computer, or AGI agent to navigate the questions?
       | 
        | Humans, of course, have those questions built into the fabric of
       | their thoughts, which spring from their biological desires, and
       | the answers come from their cumulative experiences in the world.
        
         | candiodari wrote:
          | There are in fact artificial general intelligence, emotions,
          | desires, etc., in computer worlds:
         | 
         | 1) multiplayer games with "bots"
         | 
         | 2) all these things, on a lower level, serve to make groups of
         | entities communicate and cooperate. Even in solitary animals
          | like cats, these emotions serve to facilitate cooperation, to
         | produce offspring or share territory optimally. There is no
         | problem with creating that artificially: just have multiplayer
         | environments with multiple artificial entities cooperating,
         | resource constraints, "happiness", "pain" and "sorrow".
         | 
         | It's going to take a while before we see these entities compose
         | poetry when their mate dies, but it'll go in the same
         | direction.
        
         | eru wrote:
         | > Computers cannot parse culture because culture is comprised
         | of arbitrary, contradictory paradigms. Cultural values can only
         | be resolved in an individual context as an integration of a
         | lifetime of experiences.
         | 
         | Eh, GPT-3 is pretty good at imitating (some parts of) culture
         | already.
         | 
         | > A computer can only be programmed to operate in that
         | environment with respect to some externally programmed
         | objective. e.g. if the computer were programmed to maximise its
         | chances of being offered a ride home from a party, it might
         | select which person to talk to based on factors such as
         | sobriety, and an observation of factors indicating who had
         | driven to the party in their own vehicle.
         | 
         | Have you ever tried debugging a program? Programs can do stuff
         | that's really hard to predict, even if you wrote them
          | specifically to be easy to predict (i.e. understandable).
        
         | eli_gottlieb wrote:
         | I don't see why you can't just give your "AGIs" actual
         | emotions. The argument you're making doesn't make sense for
         | emotions as the outcomes of our neurobiology, only for emotions
         | as a kind of immaterial spirit inexplicably housed in meat-
         | shells that need no explanation.
        
       | canjobear wrote:
       | How will you train your world model? Cross-entropy loss? Oh, it's
       | reward again.
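        | 
        | A minimal sketch of that point (toy tabular setup, numpy only,
        | everything invented for illustration): even a "pure prediction"
        | world model is fitted by pushing down a scalar cross-entropy
        | loss, and that scalar plays the same structural role a reward
        | does.
        | 
        |     import numpy as np
        | 
        |     n_states = 4
        |     rng = np.random.default_rng(0)
        | 
        |     # Hypothetical deterministic dynamics: state s always
        |     # steps to true_next[s].
        |     true_next = np.array([1, 2, 3, 0])
        | 
        |     # World model: a row of logits per state over next states.
        |     logits = np.zeros((n_states, n_states))
        | 
        |     def softmax(x):
        |         z = np.exp(x - x.max())
        |         return z / z.sum()
        | 
        |     lr = 0.5
        |     for step in range(500):
        |         s = rng.integers(n_states)    # observe a state
        |         s_next = true_next[s]         # observe its successor
        |         p = softmax(logits[s])        # model's prediction
        |         loss = -np.log(p[s_next])     # cross-entropy loss
        |         grad = p.copy()
        |         grad[s_next] -= 1.0           # gradient of that loss
        |         logits[s] -= lr * grad        # descend on the loss
        | 
        |     # The learned distribution for state 0 is now sharply
        |     # peaked on its true successor, state 1.
        |     print(np.round(softmax(logits[0]), 2))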
        
       | shenberg wrote:
        | The origin of the "a perfect world model equals AI" approach, at
        | least as I encountered it, is Marcus Hutter. He proved that
        | perfect compression (= prediction of what will happen next) is
        | the ideal way to drive an agent in reinforcement learning [1]
        | (the choice of action being the likeliest high-reward action).
       | 
       | So a perfect world model is enough to win at reinforcement
       | learning. Can you show that if you maximized reward in some RL
       | problem, it means you necessarily built a perfect world-model?
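        | 
        | For concreteness, the action rule from [1] (the AIXI model,
        | written here from memory and up to notation) is an expectimax
        | over every environment program q that is consistent with the
        | history, weighted by program length:
        | 
        |     a_k = \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m}
        |           \sum_{o_m r_m} (r_k + \cdots + r_m)
        |           \sum_{q : U(q, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m}
        |           2^{-\ell(q)}
        | 
        | The inner sum is the Solomonoff-style prior over world models
        | (shorter programs get more weight), which is where "perfect
        | compression" comes in; the outer expectimax is plain reward
        | maximisation up to the horizon m.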
       | 
        | [1] A Theory of Universal Artificial Intelligence based on
        | Algorithmic Complexity, 2000 -
        | https://arxiv.org/pdf/cs/0004001.pdf
        | 
        | Side note: since building a perfect model is uncomputable, this
        | is all a theoretical discussion. The paper also discusses time-
        | bounded computation and has some interesting things to say about
        | optimality in this case.
        
         | abeppu wrote:
         | > So a perfect world model is enough to win at reinforcement
         | learning. Can you show that if you maximized reward in some RL
         | problem, it means you necessarily built a perfect world-model?
         | 
         | No? Because maximizing reward for a specific problem may mean
         | avoiding some states entirely, so your model has no need of
         | understanding transitions out of those states, only how to
         | avoid landing in them.
         | 
         | E.g. if you have a medical application about interventions to
         | improve outcomes for patients with disease X, it's unnecessary
         | to refine the part of your model which would predict how fast a
         | patient would die after you administer a toxic dose of A
          | followed by a toxic dose of B. Your model only needs to know
         | that administering a toxic dose of A always leads to lower
         | value states than some other action.
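          | 
          | A toy sketch of that asymmetry (a made-up 3-state MDP, numpy
          | only): a policy can be reward-optimal while the transition
          | model induced by its own experience knows nothing at all
          | about the states it avoids.
          | 
          |     import numpy as np
          | 
          |     # Hypothetical MDP: state 0 = patient, 1 = safe outcome,
          |     # 2 = toxic outcome.  Action 0 (safe dose): 0 -> 1,
          |     # reward +1.  Action 1 (toxic dose): 0 -> 2, reward -10.
          |     # The episode ends after one step either way.
          |     P = {(0, 0): 1, (0, 1): 2}
          | 
          |     # The reward-maximising policy never takes the toxic dose.
          |     def optimal_action(state):
          |         return 0
          | 
          |     # Fit a count-based transition model from on-policy data.
          |     counts = np.zeros((3, 2, 3))
          |     for episode in range(1000):
          |         s = 0
          |         a = optimal_action(s)
          |         counts[s, a, P[(s, a)]] += 1
          | 
          |     # Exact where the optimal policy actually goes...
          |     print(counts[0, 0] / counts[0, 0].sum())   # [0. 1. 0.]
          |     # ...and completely blank on the branch it avoids.
          |     print(counts[0, 1].sum())                  # 0.0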
         | 
         | I think a "perfect" world model is required by a "universal" AI
         | in the sense that the range of problems it can handle must be
         | solved by optimal policies which together "cover" all state
         | transitions (in some universe of states).
        
       | southerntofu wrote:
       | I'm not really into AI, but i love that this person is posting
        | their blogpost as a LaTeX-formatted PDF on their personal page on
       | a tilde server.
       | 
       | For those who don't know, a tilde server is a community-operated
       | server distributing shell accounts (via SSH) to its members, and
       | sometimes other services. See tildeverse.org for a small
       | federation of such operators.
        
         | rhn_mk1 wrote:
         | I like people trying out things, but at the same time I can't
         | help but be disappointed that PDF was chosen as the
         | presentation format.
         | 
         | I like my text to fit my screen/window rather than an arbitrary
         | piece of paper.
        
           | southerntofu wrote:
           | I personally also despise PDF as a medium. I'm just really
           | happy somebody is daring to defy established norms because
           | they feel like it.
           | 
           | In an era of ultra-conformity on the web i find it refreshing
           | to see that some people still use HTTP as a means to share
           | documents of their choice, not just single-page applications.
        
         | Zababa wrote:
          | I assumed that the LaTeX-formatted PDF was to give the
          | impression that it's a paper, since the document also follows
          | the norms of how papers are written (abstract, "we",
          | references, etc.).
        
       | grey-area wrote:
       | This is a really interesting definition of intelligence as
       | building a model of the world:
       | 
       | Solving intelligence is a highly complex problem, in part because
       | it is nearly impossible to get any significant number of people
       | to agree about what intelligence actually means. We eliminate
       | this dilemma by choosing to ignore any kind of consensus, instead
       | defining it as "the ability to predict unknown information given
       | known information".
       | 
        | To put it more simply, we define intelligence as a model of the
        | world.
        
         | _0ffh wrote:
         | Uh, came here to make essentially the same remark, but as
         | criticism. A mere world model, however perfect, is a really
         | hollow definition of intelligence IMHO. It's the definition of
         | a tool, at best. It's only with the introduction of goals that
         | we get to things like taking action, planning, etc., which
         | bring the whole thing to life.
         | 
         | So while I'd agree that a world model is necessary, I seriously
         | doubt it's sufficient for anything that I'd call intelligence.
        
           | beardedetim wrote:
           | I think the world model is one step towards intelligence or
           | one half of it. I've come to believe that the ability to
           | _change_ your world model as new information comes into play
           | is the other half.
        
           | bjornsing wrote:
           | Well, among tools conditional probability is kind of the one
           | ring to rule them all. Just give me some samples from P(plan
           | | goal) and "things like taking action, planning, etc" are
           | trivial (or really, part of the sampling process).
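            | 
            | A tiny sketch of "samples from P(plan | goal)" (toy 1-D
            | world, numpy, crude rejection sampling; all of it invented
            | for illustration): once you can sample plans conditioned
            | on the goal, acting is just executing the first step of a
            | sample.
            | 
            |     import numpy as np
            | 
            |     rng = np.random.default_rng(0)
            | 
            |     # Deterministic world model: start at 0, actions are
            |     # steps of -1 or +1; the goal is to sit at +3 after
            |     # 5 steps.
            |     def simulate(plan, start=0):
            |         return start + int(sum(plan))
            | 
            |     goal, horizon = 3, 5
            | 
            |     # Sample plans from a uniform prior and keep the ones
            |     # whose simulated outcome matches the goal: these are
            |     # (approximate) draws from P(plan | goal).
            |     plans = []
            |     while len(plans) < 10:
            |         plan = rng.choice([-1, +1], size=horizon)
            |         if simulate(plan) == goal:
            |             plans.append(plan)
            | 
            |     print(plans[0])   # e.g. [ 1  1 -1  1  1]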
        
             | davidhunter wrote:
             | You're trivialising planning given a model
        
               | bjornsing wrote:
               | Nope. In this definition of intelligence planning is part
               | of the model. The model includes a probability
               | distribution over all possible plans, just like GPT-3
               | includes a probability distribution over all possible
               | news articles.
        
               | _0ffh wrote:
               | >planning is part of the model
               | 
                | Even if it were (which I doubt, except implicitly, which
               | is not the same thing), there can be no planning without
               | a goal.
               | 
               | It's a kind of mirrored Chinese Room fallacy: In that
               | case, the complaint is that the performance of the system
               | cannot be ascribed to any distinct part of the whole,
               | concluding that the whole cannot perform. In this case,
               | the performance of the system is falsely ascribed to one
               | distinct part, ignoring the contribution of the other.
        
           | neatze wrote:
           | Not sure what you mean by goals, because to a degree you
            | don't need dynamic goals (e.g. goals that change throughout
            | the lifetime of a system) for reactive behavior.
        
           | grey-area wrote:
            | It may be necessary but not sufficient; that's somewhere to
            | start, at least.
        
       ___________________________________________________________________
       (page generated 2021-06-30 23:03 UTC)