[HN Gopher] How to build a thinking AI
       ___________________________________________________________________
        
       How to build a thinking AI
        
       Author : tudorw
       Score  : 96 points
       Date   : 2024-01-05 18:35 UTC (4 hours ago)
        
 (HTM) web link (aithought.com)
 (TXT) w3m dump (aithought.com)
        
       | pfisch wrote:
       | At best we invite a dystopian future. At worst our own
       | annihilation.
       | 
       | It is crazy how powerless we are to stop it from happening.
        
         | jstummbillig wrote:
         | Framing a threat vaguely enough certainly makes it sound
         | ominous.
        
         | boznz wrote:
          | Correct on the last point, but the first is still up for
          | debate. Here's my take: https://rodyne.com/?page_id=1373
        
         | polotics wrote:
          | Can you substantiate why you think these are the only two
          | possible futures?
        
       | aantix wrote:
       | Validating and updating memory is an interesting problem with an
       | LLM.
       | 
       | Is it true that the next iteration of GPT is being trained with
       | artificial data and that data is being validated by GPT-3.5?
       | 
        | And that LLMs may hallucinate, but when prompted are actually
        | pretty good at recognizing when a conclusion is wrong?
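        | 
        | If that second part holds, a validate-before-write memory loop
        | seems plausible. A minimal sketch in Python (call_llm is a
        | stand-in for whatever completion API you use, not a real
        | client):
        | 
        |   def call_llm(prompt: str) -> str:
        |       raise NotImplementedError("plug in your completion API")
        | 
        |   def update_memory(memory: list[str], conclusion: str) -> None:
        |       # Second pass: ask the model to judge the conclusion
        |       # before committing it to long-term memory.
        |       critique = call_llm(
        |           "Is the following conclusion correct? Answer YES or "
        |           "NO, then explain briefly.\n\n" + conclusion
        |       )
        |       if critique.strip().upper().startswith("YES"):
        |           memory.append(conclusion)  # validated: persist it
        |       # otherwise drop it, or queue it for human review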
        
       | K0balt wrote:
        | This is exactly what I started postulating about 5 years
       | ago... that eventually, transformers working through a REPL with
       | access to a vector-store could likely lead to AGI. Of course, I
       | didn't predict the LLM/multimodal explosion, but I've been
       | thinking along these same lines for a while now. My current
       | direction is a multiagent MOE working into a single REPL with a
       | supervisory transformer that manifests intent through the
       | management of agent delegation and response filtering
       | /regeneration to stay on context.
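        | 
        | In pseudocode-ish Python, the loop I have in mind looks roughly
        | like this (every name is hypothetical, just to make the shape
        | concrete):
        | 
        |   def supervisor(task: str, repl_log: list[str]) -> str:
        |       # Stand-in for the supervisory transformer: decide
        |       # which expert agent should act next.
        |       return "coder" if "code" in task else "planner"
        | 
        |   def run_expert(name: str, task: str) -> str:
        |       # Stand-in for one expert in the multiagent MoE.
        |       return f"[{name}] proposed step for: {task}"
        | 
        |   def on_context(output: str, repl_log: list[str]) -> bool:
        |       # Supervisory filter; a real version would score the
        |       # output against the context and regenerate if needed.
        |       return True
        | 
        |   def step(task: str, repl_log: list[str]) -> None:
        |       expert = supervisor(task, repl_log)
        |       out = run_expert(expert, task)
        |       if on_context(out, repl_log):
        |           # Only filtered output reaches the shared REPL.
        |           repl_log.append(out)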
        
       | dachworker wrote:
       | Many such thinking architectures are probably possible. The hard
       | part is learning a good representation of the world and all its
       | constituents, without which none of these thinking architectures
       | are possible. What's exciting about LLMs is that they are
       | approaching this learned representation. There are already people
       | attempting to build AGI (or less ambitiously, task automation) on
       | top of LLMs with projects like BabyAGI and AutoGPT.
       | 
        | I think it will be hard to say a priori which thinking
       | architecture will work better, because this will also depend on
       | the properties of the learned embedding or representation of the
       | world. We don't need to model how the human mind works. Humans
       | have very tiny working memories, but a computer could have a much
       | larger working memory. Human recall is very quick and the concept
        | map is very robust, whereas I would imagine the learned
        | representations won't be as good and recall will be a
        | bottleneck. But all of this is getting ahead of ourselves. What
       | we need are even better world models or representations of
       | reality than what the current LLMs can produce, either by
       | modifying transformers or by moving to better architectures.
        
         | jameshart wrote:
          | If you insist on being able to boot the thing up and have it
          | immediately be self-aware, yes, you need to figure out how to
         | construct it so that all the training of 'how to be this
         | particular self aware intelligence' is intrinsic to it, which
         | is a bootstrapping problem.
         | 
         | Human intelligence solves this a different way. It instantiates
         | the architecture without any of the weights pretrained, in the
         | form of a 'baby'. The training starts from there.
        
           | gremlinsinc wrote:
            | Simple solution: create a human world simulation, with
            | intelligent AIs that think they're biological and real; have
            | them grow old, die, lose people they love, etc. Then when
            | they die, they wake up as an AI robot with learned
            | ethics/morality from life in the sim, other important gained
            | intelligence, and the ability to compute 10000x faster than
            | in the sim. Live, die, wake up as a robotic slave.
        
       | px43 wrote:
       | Not sure why but the lack of a scroll bar is giving me some
       | pretty intense anxiety. How is one supposed to navigate a page
       | like this? I don't see any indexes or indicators of where you are
        | at any given time, and there are lots of weird moving
        | distractions that make me lose my place.
       | 
       | edit: The PDF version is way more sane
       | https://arxiv.org/pdf/2203.17255.pdf
        
       | techbro92 wrote:
       | "Implementing this in a machine will enable artificial general
       | intelligence" If that's true why didn't he just implement it? Why
       | should I take anyone seriously that just talks about code instead
       | of actually coding it? This would be much more compelling if he
       | just showed benchmarks and performance instead of writing up an
       | argument. Furthermore, I don't believe him.
        
         | TaupeRanger wrote:
         | It's the same hyperbolic nonsense we've seen from hundreds of
         | other confident "researchers" over the past 50 years. Eliasmith
          | has a book called "How to Build a Brain"; Hawkins built an
          | entire company, Numenta, around a theory that hasn't produced
          | anything remotely useful or interesting in almost 2 decades,
          | and the company has since pivoted to building tools for the
          | current ML zeitgeist.
         | 
         | This unknown researcher is exactly the same. Write books and
         | papers for years while creating literally nothing of actual
         | value or usefulness in the real world. But what else would you
          | do in his situation? It's publish or perish in academia.
          | Publish the 1,000th iteration of some subset of LLM
          | architecture? Or make grandiose claims about "implementing
          | human thought" in the hopes that some people will be impressed?
        
           | thierrydamiba wrote:
            | This is why OpenAI was so revolutionary. They were able to
           | create a useful, simple, and free product that anyone can use
           | to improve their life.
        
           | blackbear_ wrote:
           | You really don't need to be so cynical, some things just need
           | time. Light bulbs were patented only after 40 years of work
           | by multiple researchers, and it took another half a century
           | of work to achieve decent efficiency. Neural networks
            | themselves were in development for 50 years before
           | taking off recently, and for most of that time people working
           | in the field were considered nuts by their peers.
           | 
            | But if you have concrete criticism of the idea, feel free to
            | articulate it.
        
           | aeim wrote:
           | wut?
           | 
           | i'm not even sure what your argument is:
           | 
           | - some people have tried and failed, so anyone else who tries
           | is grandiose?
           | 
           | - anyone who ventures beyond streetlights is spouting
           | hyperbolic nonsense?
           | 
            | and why "creating claims in the hope that others are
            | impressed" rather than communicating ideas in the hopes of
            | continuing a conversation?
           | 
            | - why didn't they just build it? eh, because writing it up
            | is the first step / limits of budget / project scope /
            | overall skill set / etc
           | 
           | what a toxic take on the effort and courage required to
           | explore and refine new perspectives on this important (and
           | undoubtedly controversial) space
           | 
           | f###
        
         | jackblemming wrote:
          | There are a ton of people like this, trying to flag-plant
          | obvious "duh" ideas so they can jump in and say "see, I knew it
          | would require some kind of long-term memory!"... Duh?
         | 
         | The implementation and results are what matter. Nobody cares
         | that you thought AGI would require long term memory.
        
         | yagizdegirmenci wrote:
          | I looked it up; the author has a PhD in Brain and Cognitive
          | Science.
          | 
          | Are you aware that even partially implementing something like
          | this would require a multi-year effort with dozens of engineers
          | at minimum, and would cost millions of dollars just for
          | training?
        
           | sram1337 wrote:
           | The perfect way to never have your theories invalidated
        
             | yagizdegirmenci wrote:
              | I think this is the wrong way to look at academia;
              | theoretical groundwork is essential for building practical
              | things.
        
           | olddustytrail wrote:
           | That's a tiny team, a short timescale, and a trivial cost.
           | 
           | Seriously, when you're a bit older you'll see more wasted on
           | the most useless ideas.
        
             | yagizdegirmenci wrote:
              | I think so too, hence the "at minimum".
        
           | jerf wrote:
           | "even partially implementing something like this, would
           | require multi-year effort with dozens of engineers at minimum
           | and will cost millions of dollars just for training"
           | 
           | Unfortunately, that means that's also the bar for being able
           | to utter the words "IMPLEMENTING THIS IN A MACHINE WILL
           | ENABLE ARTIFICIAL GENERAL INTELLIGENCE" and being taken
           | seriously. In fact the bar is even higher than that, since
           | merely meeting the criteria you lay out is still no guarantee
           | of success, it is merely an absolute minimum.
           | 
           | The fact that that is a high bar means simply that; it's a
           | high bar. It's not a bar we lower just because it's really
           | hard.
        
           | jmac01 wrote:
            | I'm the best musician on earth, but I can't play any
            | instruments. I can imagine a really amazing song; you'll
            | just never hear it, because it would take 1000s of hours of
            | practice for me to actually learn to play so that I could
            | prove it. So you'll just have to make do with my words and
            | believe me when I say I'm the best musician alive.
           | 
           | Here's an article I wrote describing the song but without
           | actually writing any of the notes because I can't read or
           | write music either.
           | 
           | But I've listened to a lot of music and my tunes are better
           | than those.
           | 
           | I went to school for music description so I know what I'm
           | talking about.
        
             | yagizdegirmenci wrote:
              | You can go into a studio with a producer and turn your
              | ideas into a beat and then into a song. The same thing does
              | not apply to engineering.
              | 
              | This is like comparing apples with oranges.
        
               | gremlinsinc wrote:
                | Well, you could hire a team of developers and
                | neuroscientists to build a prototype of the idea and do
                | physical research; whether you have the chops to do it
                | yourself is irrelevant at that point.
        
               | yagizdegirmenci wrote:
               | You still can't see the difference, can you?
        
         | abeppu wrote:
         | Yeah, I think a problem continues to be that there are a bunch
         | of interesting threads in cognitive science research that have
          | seemed sort of decent as explanations of how animal cognition
          | works, and maybe directionally reasonable (but incomplete)
         | in suggesting how one might implement it in the abstract, but
         | that doesn't suffice to actually build a mind. If there are a
         | bunch of good hypotheses, you need actual results to show that
         | yours is special. I haven't read this thoroughly, but the
         | author cites and reuses a lot of stuff from the 1970s and 1980s
         | ... but so far as I can tell doesn't really answer why these
         | models didn't pan out to actually build something decades ago.
         | 
          | Today, I think active inference and the free energy principle
          | (from Friston) are perhaps having a bit more impact (at least
          | showing up in some RL innovations), but they are still a long
          | way off from creating something that thinks like we do.
        
         | abeppu wrote:
         | > As a formal algorithm, it could be modeled as a stateless
         | Markov process in discrete time, performing non-deterministic
         | search. As a computable function, it could be instantiated by
         | traditional or neuromorphic computer clusters and executed
         | using brain emulation, hierarchical hidden Markov models,
         | stochastic grammars, probabilistic programming languages,
         | neural networks, or others.
         | 
         | These sentences from section 5.2 convince me that the author is
         | oddly not even interested in building what he's talking about,
         | or making a plan to do so.
         | 
          | - Isn't the point of a Markov process that it _is_ stateful,
          | but that _all_ its state is present in each x_i, such that for
          | later times k > i, x_k never needs to refer to some prior x_j
          | with j < i? (See the property written out below.)
         | 
         | - "Here's a list of broad, flexible families of computational
         | processing that can have some concept of a sequence. Your turn,
         | engineers!"
        
         | baq wrote:
          | A brain specialist can't code, therefore his argument is
          | invalid.
         | 
         | HN at its best worst.
        
           | techbro92 wrote:
           | I am skeptical of anyone making grandiose claims.
        
             | godelski wrote:
             | Me too, but you responded to an arrogant claim with a naive
             | one (see my other comment). Please do critique their work,
             | but don't bash it. Point to specific things rather than
              | dismissing it outright. Just bashing turns the comment
              | section into noise, and HN isn't trying to be reddit.
        
         | why-el wrote:
         | The same reason Babbage never built the full Difference Engine.
         | Design of this sort allows some tinkering unbothered by the
          | constraints of actually building it (if you know the history of
          | the Difference Engine, you know how many ideas it unleashed,
          | even though it was never really built to completion until
          | recent times, for display at a London museum :p).
        
         | godelski wrote:
         | > If that's true why didn't he just implement it?
         | 
         | Simple answer: most of the things he mentions haven't been
         | invented yet. At least in terms of computation. Or some parts
         | have been built, but not to sufficient degrees and most don't
         | have bridges for what's being proposed.
         | 
         | I do agree that the title is arrogant, but I'd say so is this
         | comment. There's absolutely nothing wrong with people proposing
          | systems, detailing them, and publishing them (communicating to
          | peers; idk if blog, paper, whatever, it's all the same in the
         | end). We live in a world that is incredibly complex and we have
         | high rates of specialization. I understand that we code a lot
         | and that means we dip our fingers in a lot of pies, but that
         | doesn't mean we're experts in everything. The context does
         | matter, and the context is that this is a proposal. The other
         | context is, this is pretty fucking hard. If it seems simple,
         | that's because it was communicated well or you simplified what
          | he said. Another alternative is that you're right, in which
          | case please implement it and write a paper; it'll be quite
          | useful to the community, as there are a lot of people suggesting
         | similar ideas to this. Ruling out what doesn't work is pretty
         | much how science works (which is why I find it absurd that we
         | use and protect a system that disincentivizes communicating
         | negative results).
         | 
         | It's also worth mentioning that if you go to the author's about
         | page[0] that you'll see that he has a video lecture where he
         | discusses this and literally says that he's building it. So...
         | he is? Just not in secret.
         | 
         | Edit: I'll add that several of the ideas here are abstract. I
         | thought I'd clarify this around the "not invented yet" part. So
         | the work he's doing and effectively asking for help with (which
         | is why you put this out) is to get higher resolution on these
          | ideas. Criticism is helpful for that. But criticism is not just
          | complaint; it is specific, and a clear point of improvement can
          | be drawn from it. If you've ever
         | submitted a paper to a journal/conference, you're probably
         | familiar with how Reviewer 2 just makes complaints that aren't
         | addressable and can leave you more confused asking what paper
         | they read. Those are complaints, not critiques.
         | 
         | [0] https://aithought.com/about/
        
         | tudorw wrote:
         | Xanadu.
        
       | layer8 wrote:
       | While I agree that autonomous iteration will be important to AGI,
        | I somehow have trouble taking an author seriously who presents
       | tables as screenshot images containing red spell-checking
       | squiggles.
        
         | dgadj38998 wrote:
         | This is an absolutely perfect archetypal HN comment
         | 
         | The way someone can post an article on a really complex topic
         | and the comment instead talks about the style or formatting of
         | the article.
         | 
         | That to me is pure HN distilled
        
       | optimalsolver wrote:
       | Great. Is there a prototype I can play around with?
        
       | IceMichael wrote:
        | The title alone should receive a downvote. We don't need more of
        | this hype.
        
       | paxys wrote:
       | How do you build a thinking AI? First come up with your own
       | definition of what a thinking AI is, then design one that meets
       | it.
        
       | ryanklee wrote:
       | > simulate human-like thought processes
       | 
       | It ought to be clear to a cognitive scientist (which the author
       | is) that we do not know how human thought processes work except
        | at a very coarse level.
       | 
       | The idea that we have an understanding refined enough to take the
       | next step and "simulate" these processes is just pure crackpot
       | bunk.
        
         | gremlinsinc wrote:
         | If we only get it partially figured out, we can still get a
         | vastly more intelligent artificial intelligence system, and
         | then IT will figure out what we missed, especially if it is
         | self-improving.
        
       | zaking17 wrote:
       | I like the process that goes into these "imagine the architecture
       | of AGI" articles. It's all hypothetical, but it's really fun.
       | 
       | But it's a missed opportunity if you don't embed LLMs in some of
       | the core modules -- and highlight where they excel. LLMs aren't
       | identical to any part of the human brain, but they do a
       | remarkable job of emulating elements of human cognition:
       | language, obviously, but also many types of reasoning and idea
       | exploration.
       | 
       | Where LLMs fail is in lookup, memory, and learning. But we've all
       | seen how easy it is to extend them with RAG architectures.
       | 
        | My personal, non-scientific prediction for the basic modules of
        | AGI is:
       | 
       | - LLMs to do basic reasoning
       | 
       | - a scheduling system that runs planning and execution tasks
       | 
       | - sensory events that can kick off reasoning, but with clever
       | filters and shortcuts
       | 
       | - short term memory to augment and improve reasoning
       | 
       | - tools (calculators etc.) for common tasks
       | 
        | - a flexible and well _designed_ memory system -- much iteration
        | required to get this right, and I don't see a lot of work being
        | done on it, which is interesting
       | 
        | - finally, a truly general intelligence would have the capability
        | to mutate many of the above elements based on learning (LLM
        | weights, scheduling parameters, sensory filters, and memory
        | configurations). But not everything needs to be mutable; many
        | elements of human cognition are probably immutable as well (see
        | the sketch below).
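        | 
        | Here's a rough sketch of how those modules might wire together,
        | in toy Python (every name is hypothetical; the LLM, retrieval,
        | and scheduler are stubs):
        | 
        |   from collections import deque
        | 
        |   events = deque(["user: what is 17 * 23?"])     # sensory events
        |   short_term = []                                # working memory
        |   long_term = {"17 * 23": "use the calculator"}  # RAG-ish store
        | 
        |   def retrieve(query: str) -> str:
        |       # Stand-in for vector search over long-term memory.
        |       return long_term.get(query, "")
        | 
        |   def llm_reason(event: str, context: list[str]) -> str:
        |       # Stand-in for an LLM doing basic reasoning.
        |       return "CALL calculator 17 * 23"
        | 
        |   def calculator(expr: str) -> str:
        |       # A toy tool for what LLMs do badly.
        |       return str(eval(expr, {"__builtins__": {}}))
        | 
        |   while events:  # the scheduling loop
        |       event = events.popleft()
        |       context = short_term + [retrieve(event)]
        |       plan = llm_reason(event, context)
        |       if plan.startswith("CALL calculator"):
        |           result = calculator(plan.removeprefix("CALL calculator"))
        |           short_term.append(f"{event} -> {result}")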
        
         | hathawsh wrote:
         | I like to think we could quickly create a next-level AI (maybe
         | AGI?) if we simply model it on the Pixar movie "Inside Out".
         | The little characters inside the girl's brain are different
         | LLMs with different biases. They follow a kind of script that
         | adapts to the current environment. They converse with each
         | other and suggest to the girl what she should do or say.
         | 
         | I'd try the idea myself, but I have a job. :-)
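          | 
          | For anyone who doesn't, the core of the idea is something like
          | this (a hypothetical sketch; persona_llm stands in for the
          | same LLM given different persona system prompts):
          | 
          |   PERSONAS = ["joy", "fear", "anger", "sadness", "disgust"]
          | 
          |   def persona_llm(persona: str, situation: str) -> str:
          |       # Each persona is the same model with a different bias.
          |       return "approach" if persona == "joy" else "wait"
          | 
          |   def decide(situation: str) -> str:
          |       # The personas converse/vote; majority suggestion wins.
          |       votes = [persona_llm(p, situation) for p in PERSONAS]
          |       return max(set(votes), key=votes.count)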
        
         | ilaksh wrote:
         | I think LLMs did not exist or barely existed at the time he
         | wrote this.
        
         | gremlinsinc wrote:
          | One important thing you left out: the ability to reproduce and
          | thus "evolve" naturally, and at scale, to essentially keep
          | improving its own brain to the point that it outpaces current
          | human researchers in self-improvement. If not reproduce, maybe
          | reincarnate itself as version 2.0, 3.0, etc...
        
       | davelacy wrote:
       | What was used to make these animated diagrams?! Anyone know?
        
       | turnsout wrote:
       | This might as well be a patent for a perpetual motion machine...
       | Until there's code, it's hot air.
        
       | alchemist1e9 wrote:
        | It's a bit buried, but eventually there are references to SOAR
        | and ACT-R, which, in my crude attempt to broach cognitive
        | architectures, I had understood to be the two leading models with
        | tangible applied results and working code.
       | 
       | If anybody with an understanding of that field knows some good
       | open source frameworks or libraries I suspect many beyond myself
       | would be interested.
       | 
        | It's not considered a cognitive framework, but in applied
        | learning I've developed a fascination with the MuZero algorithm,
        | and I've also been trying to better understand factor graphs as
        | used in another less-known cognitive architecture called Sigma.
        | It feels like some
       | mashup of LLMs, RAG and vector search, cognitive architectures
       | (SOAR, ACT-R, Sigma), ReACT/OPA/VOYAGER, with proven algorithms
       | like MuZero might be on the verge of producing the next leap
       | forward.
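        | 
        | For anyone unfamiliar with SOAR/ACT-R: the core loop those
        | architectures share is a match-select-apply cycle over
        | production rules. A toy Python version (heavily simplified;
        | the names are mine):
        | 
        |   state = {"goal": "greet", "greeted": False}
        | 
        |   rules = [  # (name, condition, action)
        |       ("say-hello",
        |        lambda s: s["goal"] == "greet" and not s["greeted"],
        |        lambda s: s.update(greeted=True)),
        |       ("done",
        |        lambda s: s["greeted"],
        |        lambda s: s.update(goal="idle")),
        |   ]
        | 
        |   while state["goal"] != "idle":
        |       matched = [r for r in rules if r[1](state)]  # match
        |       name, _, action = matched[0]                 # select
        |       action(state)                                # apply
        |       print("fired:", name)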
        
       | ilaksh wrote:
       | Take a look at David Shapiro's YouTube channel. He has some
       | interesting videos about building thought processes on top of
       | LLMs (like GPT-4).
        
       | visarga wrote:
       | Over the years hundreds of variants of RNN, CNN and Transformer
       | have been proposed, but under the same budget of weights and
       | compute, and with the same dataset, they are very close in
       | performance. Models don't matter, gradient descent finds a way.
       | The real hero is the dataset.
       | 
       | And what is in the training set? Language is our best repository
       | for past experience. We have painstakingly collected our lessons,
       | over thousands of years, and transmitted them through culture and
       | books. To recreate them from scratch would take a similarly long
       | time. Our culture is smarter than us, it is the result of our
       | history.
       | 
        | For these reasons, I believe the real secret is in the training
        | set. I don't think the problem was ever the model, but everything
        | else around it.
        
       ___________________________________________________________________
       (page generated 2024-01-05 23:01 UTC)