[HN Gopher] Reward Is Unnecessary [pdf]
___________________________________________________________________
Reward Is Unnecessary [pdf]
Author : optimalsolver
Score : 58 points
Date : 2021-06-28 09:01 UTC (2 days ago)
(HTM) web link (tilde.town)
(TXT) w3m dump (tilde.town)
| mrkramer wrote:
| I think intelligence is the ability to solve problems. There are
| many types of intelligence and many types of problems. For
| example, there is the hardcoded, primitive intelligence of
| calculators, which enables them to solve math problems, and there
| is the high intelligence of humans, an emergent, naturally
| occurring product of evolution, which enables us to solve very
| complex problems of all sorts.
| grey-area wrote:
| But to solve a problem any more complex than 2 + 2 you need to
| have a model of the world, however simplistic.
|
| For example, to count how many bananas are in a box properly,
| without weird edge cases, you need to know what a banana is and
| what it looks like, and what a box is and what it looks like.
| mrkramer wrote:
| As I said when speaking of Alien Life [0], when it comes to
| naturally occurring intelligence it all comes down to chemistry,
| biology, and evolution.
|
| When speaking about AI, as far as I understand it should be
| human-like in order to be called artificial intelligence. If it
| merely mimics human intelligence and solves trivial problems,
| then it is nothing more than the calculator with primitive
| intelligence that I mentioned before.
|
| The thing you are talking about is evolution: you acquire a model
| of the world step by step over millions of years of evolution.
| But today computer scientists are trying to hardcode human-like
| intelligence into computers, which is extremely hard considering
| how long it took for humans to become highly intelligent. I think
| a better approach would be to write evolutionary algorithms,
| which perhaps can yield human-like intelligence.
|
| There is some secret ingredient in human evolution which made us
| the most intelligent species on Earth; no species is even
| remotely close to being as intelligent as we are. I think nobody
| really knows what that secret ingredient is.
|
| [0] https://news.ycombinator.com/item?id=27574320
| mannykannot wrote:
| I certainly do not know what the secret ingredient is, but
| it is quite plausible that a number of extinct species
| (Neanderthals, Homo erectus, maybe Homo habilis) also had
| it to some degree.
| mrkramer wrote:
| We made it because we were, and are, more cooperative
| and more social; in other words, we gather together in
| order to survive and work together. But other species
| such as bees or ants are also very cooperative and
| social, yet they are not highly intelligent, although
| they do have collective intelligence.
|
| I think human anatomy played a crucial role in our
| evolution. Hands are powerful tools which enabled us to
| create and develop other tools and technologies.
| wombatmobile wrote:
| The limitation of world models, in the context of achieving
| artificial general intelligence, is not the validity or
| granularity or faithfulness of the model, but rather the
| limitation argued by Hubert Dreyfus, that computers are not in
| the world.
|
| Whilst physics can be modelled, and hence kinematics and dynamics
| can be modelled, intelligence, in the human sense, is different.
| Intelligence, for humans, is sociological, and driven by biology.
|
| Computers cannot parse culture because culture is comprised of
| arbitrary, contradictory paradigms. Cultural values can only be
| resolved in an individual context as an integration of a lifetime
| of experiences.
|
| Computers cannot do this because they cannot feel pleasure or
| pain, fear or optimism, joy or sorrow, opportunity, disinterest
| or attraction. They cannot grow older, give birth, or die. As a
| consequence, they lack the evaluative tools of emotion and
| experience that humans use to participate in culture.
|
| But wait, you may protest. Computers don't need to feel emotions
| since these can be modelled. A computer can recognise a man
| pointing a gun at it demanding money, which is as good as the
| ability to feel fear, right?
|
| A computer can recognise faces, so surely it's only a small step
| further to recognise beauty, which is enough to simulate the
| feeling of attraction, right?
|
| A computer won't feel sorrow, but it can know that the death of a
| loved one or the loss of money are appropriate cues, so that is
| as good as the feeling of sorrow, right?
|
| The limitation of this substitution of emotions with modelling is
| that the modelling, and remodelling, has to take place externally
| to the computer. In biological organisms that are in the world,
| each experience yields an emotional response that is incorporated
| into the organism. The organism is the sum of its experiences,
| mediated by its biology.
|
| Consider this question: In a room full of people, who should you
| talk to? What should you say to them? What should you not say?
|
| A computer can only be programmed to operate in that environment
| with respect to some externally programmed objective. e.g. if the
| computer were programmed to maximise its chances of being offered
| a ride home from a party, it might select which person to talk to
| based on factors such as sobriety, and an observation of factors
| indicating who had driven to the party in their own vehicle.
|
| But without the externally programmed objective, how is the
| computer, or AGI agent to navigate the questions?
|
| Humans, of course, have those questions built into the fabric of
| their thoughts, which spring from their biological desires, and
| the answers come from their cumulative experiences in the world.
| candiodari wrote:
| There are in fact artificial general intelligence and emotions,
| desires, etc., in computer worlds:
|
| 1) multiplayer games with "bots"
|
| 2) all these things, on a lower level, serve to make groups of
| entities communicate and cooperate. Even in solitary animals
| like cats these emotions serve to facilitate cooperation, to
| produce offspring or share territory optimally. There is no
| problem with creating that artificially: just have multiplayer
| environments with multiple artificial entities cooperating,
| resource constraints, "happiness", "pain" and "sorrow".
|
| It's going to take a while before we see these entities compose
| poetry when their mate dies, but it'll go in the same
| direction.
| eru wrote:
| > Computers cannot parse culture because culture is comprised
| of arbitrary, contradictory paradigms. Cultural values can only
| be resolved in an individual context as an integration of a
| lifetime of experiences.
|
| Eh, GPT-3 is pretty good at imitating (some parts of) culture
| already.
|
| > A computer can only be programmed to operate in that
| environment with respect to some externally programmed
| objective. e.g. if the computer were programmed to maximise its
| chances of being offered a ride home from a party, it might
| select which person to talk to based on factors such as
| sobriety, and an observation of factors indicating who had
| driven to the party in their own vehicle.
|
| Have you ever tried debugging a program? Programs can do stuff
| that's really hard to predict, even if you wrote them
| specifically to be easy to predict (i.e. understandable).
| eli_gottlieb wrote:
| I don't see why you can't just give your "AGIs" actual
| emotions. The argument you're making doesn't make sense for
| emotions as the outcomes of our neurobiology, only for emotions
| as a kind of immaterial spirit inexplicably housed in meat-
| shells that need no explanation.
| canjobear wrote:
| How will you train your world model? Cross-entropy loss? Oh, it's
| reward again.
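|
| To make the point concrete, here is a minimal sketch of what
| "train a world model with cross-entropy" could look like in
| PyTorch (all names, sizes and data below are placeholders, not
| taken from the paper):
|
|     import torch
|     import torch.nn as nn
|
|     obs_dim, n_classes = 16, 10   # placeholder sizes
|     # predict a discretized next observation from the current one
|     model = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
|                           nn.Linear(64, n_classes))
|     opt = torch.optim.Adam(model.parameters(), lr=1e-3)
|     loss_fn = nn.CrossEntropyLoss()
|
|     obs = torch.randn(32, obs_dim)        # batch of current obs
|     next_obs = torch.randint(0, n_classes, (32,))  # next-obs ids
|
|     loss = loss_fn(model(obs), next_obs)
|     loss.backward()
|     opt.step()
|
| Whatever scalar the model is trained to minimize plays the same
| role a reward signal plays elsewhere: it is the quantity the
| learner is driven to optimize.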
| shenberg wrote:
| The origin of the idea that a perfect world model equals AI, at
| least as I encountered it, is Marcus Hutter. He proved that
| perfect compression (= prediction of what will happen next) is
| the ideal way to drive an agent in reinforcement learning [1]
| (the chosen action being the one most likely to yield high
| reward).
|
| So a perfect world model is enough to win at reinforcement
| learning. Can you show that if you maximized reward in some RL
| problem, it means you necessarily built a perfect world-model?
|
| [1] A Theory of Universal Artificial Intelligence based on
| Algorithmic Complexity, 2000 -
| https://arxiv.org/pdf/cs/0004001.pdf
|
| Side note: since building a perfect model is uncomputable, this
| is all a theoretical discussion. The paper also discusses time-
| bounded computation and has some interesting things to say about
| optimality in this case.
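|
| In code terms the claim reads roughly like the sketch below (my
| own illustration, not Hutter's construction; `model` stands in
| for a perfect predictor of expected return, which is exactly the
| uncomputable part):
|
|     import itertools
|
|     def best_action(model, history, actions, horizon=3):
|         # model(history, plan) -> expected total reward of
|         # executing `plan` after `history`; assumed perfect
|         plans = itertools.product(actions, repeat=horizon)
|         best = max(plans, key=lambda plan: model(history, plan))
|         return best[0]   # act, observe the outcome, then re-plan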
| abeppu wrote:
| > So a perfect world model is enough to win at reinforcement
| learning. Can you show that if you maximized reward in some RL
| problem, it means you necessarily built a perfect world-model?
|
| No? Because maximizing reward for a specific problem may mean
| avoiding some states entirely, so your model has no need of
| understanding transitions out of those states, only how to
| avoid landing in them.
|
| E.g. if you have a medical application about interventions to
| improve outcomes for patients with disease X, it's unnecessary
| to refine the part of your model which would predict how fast a
| patient would die after you administer a toxic dose of A
| followed by a toxic dose of B. Your model only needs to know
| that administering a toxic dose of A always leads to lower
| value states than some other action.
|
| I think a "perfect" world model is required by a "universal" AI
| in the sense that the range of problems it can handle must be
| solved by optimal policies which together "cover" all state
| transitions (in some universe of states).
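|
| A toy illustration of that point (mine, not from the article):
| two models that disagree only about what happens after the toxic
| dose still yield the same optimal policy, because that policy
| never administers the dose.
|
|     import numpy as np
|
|     def greedy_policy(P, R, gamma=0.9, iters=500):
|         # P[a, s] = next state (deterministic), R[a, s] = reward
|         V = np.zeros(R.shape[1])
|         for _ in range(iters):
|             Q = R + gamma * V[P]     # Q[a, s]
|             V = Q.max(axis=0)
|         return Q.argmax(axis=0)      # best action in each state
|
|     # states: 0 = stable, 1 = after a toxic dose
|     # actions: 0 = standard care, 1 = toxic dose of A
|     R = np.array([[1.0, -1.0], [-10.0, -10.0]])
|     P1 = np.array([[0, 0], [1, 1]])
|     P2 = np.array([[0, 0], [1, 0]])  # differs only post-dose
|
|     print(greedy_policy(P1, R))      # [0 0]
|     print(greedy_policy(P2, R))      # [0 0], same optimal policy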
| southerntofu wrote:
| I'm not really into AI, but I love that this person is posting
| their blog post as a LaTeX-formatted PDF on their personal page
| on a tilde server.
|
| For those who don't know, a tilde server is a community-operated
| server distributing shell accounts (via SSH) to its members, and
| sometimes other services. See tildeverse.org for a small
| federation of such operators.
| rhn_mk1 wrote:
| I like people trying out things, but at the same time I can't
| help but be disappointed that PDF was chosen as the
| presentation format.
|
| I like my text to fit my screen/window rather than an arbitrary
| piece of paper.
| southerntofu wrote:
| I personally also despise PDF as a medium. I'm just really
| happy somebody is daring to defy established norms because
| they feel like it.
|
| In an era of ultra-conformity on the web, I find it refreshing
| to see that some people still use HTTP as a means to share
| documents of their choice, not just single-page applications.
| Zababa wrote:
| I assumed that the LaTeX-formatted PDF was to give the
| impression that it's a paper, since the document also follows
| the conventions of how papers are written (abstract, "we",
| references, etc.).
| grey-area wrote:
| This is a really interesting definition of intelligence as
| building a model of the world:
|
| Solving intelligence is a highly complex problem, in part because
| it is nearly impossible to get any significant number of people
| to agree about what intelligence actually means. We eliminate
| this dilemma by choosing to ignore any kind of consensus, instead
| defining it as "the ability to predict unknown information given
| known information".
|
| To put it more simply, we define intelligence as a model of the
| world.
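|
| (Read formally, that amounts to scoring an agent by the quality
| of its predictive distribution p(unknown | known), e.g. by its
| log-likelihood on held-out observations; that reading is my
| paraphrase, not notation taken from the paper.)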
| _0ffh wrote:
| Uh, came here to make essentially the same remark, but as
| criticism. A mere world model, however perfect, is a really
| hollow definition of intelligence IMHO. It's the definition of
| a tool, at best. It's only with the introduction of goals that
| we get to things like taking action, planning, etc., which
| bring the whole thing to life.
|
| So while I'd agree that a world model is necessary, I seriously
| doubt it's sufficient for anything that I'd call intelligence.
| beardedetim wrote:
| I think the world model is one step towards intelligence or
| one half of it. I've come to believe that the ability to
| _change_ your world model as new information comes into play
| is the other half.
| bjornsing wrote:
| Well, among tools conditional probability is kind of the one
| ring to rule them all. Just give me some samples from P(plan
| | goal) and "things like taking action, planning, etc" are
| trivial (or really, part of the sampling process).
| davidhunter wrote:
| You're trivialising planning given a model
| bjornsing wrote:
| Nope. In this definition of intelligence planning is part
| of the model. The model includes a probability
| distribution over all possible plans, just like GPT-3
| includes a probability distribution over all possible
| news articles.
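|
| As a sketch of what that would mean in practice
| (the model object and its methods here are
| hypothetical placeholders, not a real API):
|
|     def plan(world_model, goal, n=100):
|         # draw candidate plans conditioned on goal
|         cands = [world_model.sample_plan(goal)
|                  for _ in range(n)]
|         # keep the one the model rates most likely
|         score = lambda p: world_model.log_prob(p, goal)
|         return max(cands, key=score)
|
| All of the difficulty is, of course, hidden
| inside the sampler.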
| _0ffh wrote:
| > planning is part of the model
|
| Even if it was (which I doubt, except implicitly, which
| is not the same thing), there can be no planning without
| a goal.
|
| It's a kind of mirrored Chinese Room fallacy: In that
| case, the complaint is that the performance of the system
| cannot be ascribed to any distinct part of the whole,
| concluding that the whole cannot perform. In this case,
| the performance of the system is falsely ascribed to one
| distinct part, ignoring the contribution of the other.
| neatze wrote:
| Not sure what you mean by goals, because to a degree you
| don't need dynamic goals (e.g. goals that change throughout
| the lifetime of a system) for reactive behavior.
| grey-area wrote:
| It may be necessary but not sufficient; that's somewhere to
| start, at least.
___________________________________________________________________
(page generated 2021-06-30 23:03 UTC)