[HN Gopher] How to build a thinking AI
___________________________________________________________________
How to build a thinking AI
Author : tudorw
Score : 96 points
Date : 2024-01-05 18:35 UTC (4 hours ago)
(HTM) web link (aithought.com)
(TXT) w3m dump (aithought.com)
| pfisch wrote:
| At best we invite a dystopian future. At worst our own
| annihilation.
|
| It is crazy how powerless we are to stop it from happening.
| jstummbillig wrote:
| Framing a threat vaguely enough certainly makes it sound
| ominous.
| boznz wrote:
| Correct on the last point but the first is still up for debate.
| Here's my take => https://rodyne.com/?page_id=1373
| polotics wrote:
| Can you substantiate why you think these are the only two
| possible futures?
| aantix wrote:
| Validating and updating memory is an interesting problem with an
| LLM.
|
| Is it true that the next iteration of GPT is being trained with
| artificial data and that data is being validated by GPT-3.5?
|
| And that LLMs may hallucinate, but when prompted are actually
| pretty good at recognizing when a conclusion is wrong?
| K0balt wrote:
| This is exactly what I started postulating about 5 years
| ago... that eventually, transformers working through a REPL with
| access to a vector store could likely lead to AGI. Of course, I
| didn't predict the LLM/multimodal explosion, but I've been
| thinking along these same lines for a while now. My current
| direction is a multiagent MOE working into a single REPL with a
| supervisory transformer that manifests intent through the
| management of agent delegation and response filtering/
| regeneration to stay on context.
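|
| Roughly the shape I have in mind, as a toy sketch (the llm()
| calls are stubs and the routing/filtering logic is just a
| placeholder for real supervisory models):
|
|   def llm(prompt: str) -> str:
|       """Stub standing in for any real transformer call."""
|       return f"[model output for: {prompt[:40]}...]"
|
|   EXPERTS = {
|       "code": "You are a coding expert.",
|       "math": "You are a math expert.",
|       "plan": "You are a planning expert.",
|   }
|
|   def route(task: str) -> str:
|       """Supervisor picks which expert agent gets the task."""
|       choice = llm(f"Pick one of {list(EXPERTS)} for: {task}")
|       return choice if choice in EXPERTS else "plan"  # fallback
|
|   def on_context(task: str, reply: str) -> bool:
|       """Supervisory filter: accept or reject an agent reply."""
|       verdict = llm(f"Does this address '{task}'? yes/no: {reply}")
|       return "no" not in verdict.lower()
|
|   def repl(task: str, max_tries: int = 3) -> str:
|       """Shared loop: delegate, filter, regenerate if rejected."""
|       expert = route(task)
|       for _ in range(max_tries):
|           reply = llm(EXPERTS[expert] + " Task: " + task)
|           if on_context(task, reply):
|               return reply  # accepted: stays on context
|       return "escalate"     # regeneration budget exhausted
|
|   print(repl("query the vector store and summarize the hits"))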
| dachworker wrote:
| Many such thinking architectures are probably possible. The hard
| part is learning a good representation of the world and all its
| constituents, without which none of these thinking architectures
| are possible. What's exciting about LLMs is that they are
| approaching this learned representation. There are already people
| attempting to build AGI (or less ambitiously, task automation) on
| top of LLMs with projects like BabyAGI and AutoGPT.
|
| I think it will be hard to say a priori which thinking
| architecture will work better, because this will also depend on
| the properties of the learned embedding or representation of the
| world. We don't need to model how the human mind works. Humans
| have very tiny working memories, but a computer could have a much
| larger working memory. Human recall is very quick and the concept
| map is very robust, whereas I would imagine the learned
| representations won't be as good and recall will be a
| bottleneck. But all of this is getting ahead of ourselves. What
| we need are even better world models or representations of
| reality than what the current LLMs can produce, either by
| modifying transformers or by moving to better architectures.
| jameshart wrote:
| If you insist on being able to boot the thing up and
| immediately be self aware, yes, you need to figure out how to
| construct it so that all the training of 'how to be this
| particular self aware intelligence' is intrinsic to it, which
| is a bootstrapping problem.
|
| Human intelligence solves this a different way. It instantiates
| the architecture without any of the weights pretrained, in the
| form of a 'baby'. The training starts from there.
| gremlinsinc wrote:
| Simple solution: create a human-world simulation with
| intelligent AIs that think they're biological and real; have
| them grow old, die, lose people they love, etc. Then when they
| die they wake up as AI robots with ethics/morality learned from
| life in the sim, other important gained intelligence, and the
| ability to compute 10,000x faster than in the sim. Live, die,
| wake up as a robotic slave.
| px43 wrote:
| Not sure why but the lack of a scroll bar is giving me some
| pretty intense anxiety. How is one supposed to navigate a page
| like this? I don't see any indexes or indicators of where you are
| at any given time, and lots of weird moving distractions that
| make me lose my place.
|
| edit: The PDF version is way more sane
| https://arxiv.org/pdf/2203.17255.pdf
| techbro92 wrote:
| "Implementing this in a machine will enable artificial general
| intelligence" If that's true why didn't he just implement it? Why
| should I take anyone seriously that just talks about code instead
| of actually coding it? This would be much more compelling if he
| just showed benchmarks and performance instead of writing up an
| argument. Furthermore, I don't believe him.
| TaupeRanger wrote:
| It's the same hyperbolic nonsense we've seen from hundreds of
| other confident "researchers" over the past 50 years. Eliasmith
| has a book called "How to Build a Brain", Hawkins built an
| entire company, Numenta, around a theory that hasn't created
| anything remotely useful or interesting in almost 2 decades and
| has pivoted to creating tools for current ML zeitgeist methods.
|
| This unknown researcher is exactly the same. Write books and
| papers for years while creating literally nothing of actual
| value or usefulness in the real world. But what else would you
| do in his situation? You have to publish or die in academia.
| Publish the 1,000th iteration of some subset of LLM
| architecture? Or create grandiose claims about "implementing
| human thought" in the hopes that some people will be impressed?
| thierrydamiba wrote:
| This is why OpenAI was so revolutionary. They were able to
| create a useful, simple, and free product that anyone can use
| to improve their life.
| blackbear_ wrote:
| You really don't need to be so cynical; some things just need
| time. Light bulbs were patented only after 40 years of work
| by multiple researchers, and it took another half century
| of work to achieve decent efficiency. Neural networks
| themselves were in development for 50 years before taking
| off recently, and for most of that time people working in
| the field were considered nuts by their peers.
|
| But if you have concrete criticism of the idea, feel free to
| articulate it.
| aeim wrote:
| wut?
|
| i'm not even sure what your argument is:
|
| - some people have tried and failed, so anyone else who tries
| is grandiose?
|
| - anyone who ventures beyond streetlights is spouting
| hyperbolic nonsense?
|
| and why: "creating claims in the hope that others are
| impressed"; rather than communicating ideas in the hopes of
| continuing a conversation?
|
| - why didn't they just build it? eh, cause writing it up is
| the first step/ limit of budget/ project scope/ overall skill
| set/ etc
|
| what a toxic take on the effort and courage required to
| explore and refine new perspectives on this important (and
| undoubtedly controversial) space
|
| f###
| jackblemming wrote:
| There are a ton of people like this, trying to flag-plant
| obvious "duh" ideas so they can jump in later and say "see, I
| knew it would require some kind of long-term memory!"... Duh?
|
| The implementation and results are what matter. Nobody cares
| that you thought AGI would require long-term memory.
| yagizdegirmenci wrote:
| I looked it up, the author has PhD in Brain and Cognitive
| Science.
|
| Are you aware that even partially implementing something like
| this would require a multi-year effort with dozens of engineers
| at minimum, and would cost millions of dollars just for
| training?
| sram1337 wrote:
| The perfect way to never have your theories invalidated
| yagizdegirmenci wrote:
| I think this is the wrong way to look at academia; theoretical
| groundwork is essential for building practical things.
| olddustytrail wrote:
| That's a tiny team, a short timescale, and a trivial cost.
|
| Seriously, when you're a bit older you'll see more wasted on
| the most useless ideas.
| yagizdegirmenci wrote:
| I agree, hence the "at minimum".
| jerf wrote:
| "even partially implementing something like this, would
| require multi-year effort with dozens of engineers at minimum
| and will cost millions of dollars just for training"
|
| Unfortunately, that means that's also the bar for being able
| to utter the words "IMPLEMENTING THIS IN A MACHINE WILL
| ENABLE ARTIFICIAL GENERAL INTELLIGENCE" and being taken
| seriously. In fact the bar is even higher than that, since
| merely meeting the criteria you lay out is still no guarantee
| of success; it is merely an absolute minimum.
|
| The fact that that is a high bar means simply that; it's a
| high bar. It's not a bar we lower just because it's really
| hard.
| jmac01 wrote:
| I'm the best musician on earth, but I can't play any
| instruments. I can imagine a really amazing song; you'll
| just never hear it, because it would take thousands of hours of
| practice for me to actually learn to play well enough to prove
| it. So you'll just have to make do with my words and believe
| me when I say I'm the best musician alive.
|
| Here's an article I wrote describing the song but without
| actually writing any of the notes because I can't read or
| write music either.
|
| But I've listened to a lot of music and my tunes are better
| than those.
|
| I went to school for music description so I know what I'm
| talking about.
| yagizdegirmenci wrote:
| You can go into a studio with a producer and turn your ideas
| into a beat and then into a song. The same thing does not
| apply to engineering.
|
| This is comparing apples with oranges.
| gremlinsinc wrote:
| Well, you could hire a team of developers and neuroscientists
| to build a prototype of the idea and do physical research;
| whether you have the chops to do it yourself is irrelevant at
| that point.
| yagizdegirmenci wrote:
| You still can't see the difference, can you?
| abeppu wrote:
| Yeah, I think a problem continues to be that there are a bunch
| of interesting threads in cognitive science research that have
| seemed sort of decent as explanations of how animal cognition
| works, and maybe directionally reasonable (but incomplete)
| in suggesting how one might implement it in the abstract, but
| that doesn't suffice to actually build a mind. If there are a
| bunch of good hypotheses, you need actual results to show that
| yours is special. I haven't read this thoroughly, but the
| author cites and reuses a lot of work from the 1970s and 1980s
| ... but so far as I can tell doesn't really answer why those
| models didn't pan out into something actually built decades ago.
|
| Today, I think active inference and the free energy principle
| (from Friston) are perhaps having a bit more impact (at least
| showing up in some RL innovations), but are still a long way
| from creating something that thinks like we do.
| abeppu wrote:
| > As a formal algorithm, it could be modeled as a stateless
| Markov process in discrete time, performing non-deterministic
| search. As a computable function, it could be instantiated by
| traditional or neuromorphic computer clusters and executed
| using brain emulation, hierarchical hidden Markov models,
| stochastic grammars, probabilistic programming languages,
| neural networks, or others.
|
| These sentences from section 5.2 convince me that the author is
| oddly not even interested in building what he's talking about,
| or making a plan to do so.
|
| - Isn't the point of a Markov process that it _is_ stateful,
| but that _all_ of its state is present in each x_i, such that
| for later times k > i, x_k never needs to refer to any prior
| x_j with j < i? (The usual statement is spelled out below.)
|
| - "Here's a list of broad, flexible families of computational
| processing that can have some concept of a sequence. Your turn,
| engineers!"
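|
| For reference, the usual statement of the Markov property, which
| is what I mean by "all state present in x_i":
|
|   P(X_{k+1} \mid X_k, X_{k-1}, \dots, X_0) = P(X_{k+1} \mid X_k)
|
| i.e. the process is memoryless conditional on the present, not
| stateless: the current X_k carries all the state forward.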
| baq wrote:
| A brain specialist can't code, therefore his argument is
| invalid.
|
| HN at its best worst.
| techbro92 wrote:
| I am skeptical of anyone making grandiose claims.
| godelski wrote:
| Me too, but you responded to an arrogant claim with a naive
| one (see my other comment). Please do critique their work,
| but don't bash it; point to specific things rather than
| dismissing it outright. Just bashing turns the comment
| section into noise, and HN isn't trying to be reddit.
| why-el wrote:
| The same reason Babbage never built the full Difference Engine.
| A design of this sort allows some tinkering unbothered by the
| constraints of actually building it (if you know the history of
| the Difference Engine, you know how many ideas it unleashed even
| though it was never really built to completion, except in recent
| times for display at the Science Museum in London :p).
| godelski wrote:
| > If that's true why didn't he just implement it?
|
| Simple answer: most of the things he mentions haven't been
| invented yet. At least in terms of computation. Or some parts
| have been built, but not to sufficient degrees and most don't
| have bridges for what's being proposed.
|
| I do agree that the title is arrogant, but I'd say so is this
| comment. There's absolutely nothing wrong with people proposing
| systems, detailing them out, and publishing (communicating to
| peers. Idk if blog, paper, whatever, it's all the same in the
| end). We live in a world that is incredibly complex and we have
| high rates of specialization. I understand that we code a lot
| and that means we dip our fingers in a lot of pies, but that
| doesn't mean we're experts in everything. The context does
| matter, and the context is that this is a proposal. The other
| context is, this is pretty fucking hard. If it seems simple,
| that's because it was communicated well or you simplified what
| he said. Another alternative is that you're right; if so,
| please implement it and write a paper, as it'll be quite useful
| to the community, since a lot of people are suggesting quite
| similar ideas to this. Ruling out what doesn't work is pretty
| much how science works (which is why I find it absurd that we
| use and protect a system that disincentivizes communicating
| negative results).
|
| It's also worth mentioning that if you go to the author's about
| page[0] that you'll see that he has a video lecture where he
| discusses this and literally says that he's building it. So...
| he is? Just not in secret.
|
| Edit: I'll add that several of the ideas here are abstract. I
| thought I'd clarify this around the "not invented yet" part. So
| the work he's doing, and effectively asking for help with
| (which is why you put something like this out), is getting
| higher resolution on these ideas. Criticism is helpful in doing
| that, but criticism is not just complaints; it is specific, and
| a clear point of improvement can be drawn from it. If you've ever
| submitted a paper to a journal/conference, you're probably
| familiar with how Reviewer 2 just makes complaints that aren't
| addressable and can leave you more confused asking what paper
| they read. Those are complaints, not critiques.
|
| [0] https://aithought.com/about/
| tudorw wrote:
| Xanadu.
| layer8 wrote:
| While I agree that autonomous iteration will be important to AGI,
| I somehow have trouble taking seriously an author who presents
| tables as screenshot images containing red spell-checking
| squiggles.
| dgadj38998 wrote:
| This is an absolutely perfect archetypal HN comment
|
| The way someone can post an article on a really complex topic
| and the comment instead talks about the style or formatting of
| the article.
|
| That to me is pure HN distilled
| optimalsolver wrote:
| Great. Is there a prototype I can play around with?
| IceMichael wrote:
| The title alone should receive a downvote. We don't need more of
| this hype.
| paxys wrote:
| How do you build a thinking AI? First come up with your own
| definition of what a thinking AI is, then design one that meets
| it.
| ryanklee wrote:
| > simulate human-like thought processes
|
| It ought to be clear to a cognitive scientist (which the author
| is) that we do not know how human thought processes work except
| at a very coarse level.
|
| The idea that we have an understanding refined enough to take the
| next step and "simulate" these processes is just pure crackpot
| bunk.
| gremlinsinc wrote:
| If we only get it partially figured out, we can still get a
| vastly more intelligent artificial intelligence system, and
| then IT will figure out what we missed, especially if it is
| self-improving.
| zaking17 wrote:
| I like the process that goes into these "imagine the architecture
| of AGI" articles. It's all hypothetical, but it's really fun.
|
| But it's a missed opportunity if you don't embed LLMs in some of
| the core modules -- and highlight where they excel. LLMs aren't
| identical to any part of the human brain, but they do a
| remarkable job of emulating elements of human cognition:
| language, obviously, but also many types of reasoning and idea
| exploration.
|
| Where LLMs fail is in lookup, memory, and learning. But we've all
| seen how easy it is to extend them with RAG architectures.
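|
| To make that concrete, here's a minimal sketch of a RAG loop.
| The embedding is a stub (a real system would call an embedding
| model), and the final LLM call is left out, so this only shows
| retrieval and prompt assembly:
|
|   import math
|
|   def embed(text: str) -> list[float]:
|       """Stub embedding: normalized letter histogram. A real
|       system would call an embedding model here."""
|       v = [0.0] * 26
|       for ch in text.lower():
|           if "a" <= ch <= "z":
|               v[ord(ch) - ord("a")] += 1.0
|       norm = math.sqrt(sum(x * x for x in v)) or 1.0
|       return [x / norm for x in v]
|
|   def cosine(a: list[float], b: list[float]) -> float:
|       return sum(x * y for x, y in zip(a, b))
|
|   DOCS = [
|       "Paris is the capital of France.",
|       "Transformers process tokens in parallel.",
|   ]
|   INDEX = [(embed(d), d) for d in DOCS]  # build once, query often
|
|   def retrieve(query: str, k: int = 1) -> list[str]:
|       """Return the k stored snippets nearest to the query."""
|       q = embed(query)
|       ranked = sorted(INDEX, key=lambda p: cosine(q, p[0]),
|                       reverse=True)
|       return [doc for _, doc in ranked[:k]]
|
|   def build_prompt(query: str) -> str:
|       """Prepend retrieved context; a real system would then
|       send this prompt to the LLM."""
|       context = "\n".join(retrieve(query))
|       return f"Context:\n{context}\n\nQuestion: {query}"
|
|   print(build_prompt("What is the capital of France?"))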
|
| My personal, non-scientific predictions for the basic modules of
| AGI are:
|
| - LLMs to do basic reasoning
|
| - a scheduling system that runs planning and execution tasks
|
| - sensory events that can kick off reasoning, but with clever
| filters and shortcuts
|
| - short term memory to augment and improve reasoning
|
| - tools (calculators etc.) for common tasks
|
| - a flexible and well _designed_ memory system -- much iteration
| required to get this right, and I don't see a lot of work being
| done on it, which is interesting
|
| - finally, a truly general intelligence would have the capability
| to mutate many of the above elements based on learning (LLM
| weights, scheduling parameters, sensory filters, and memory
| configurations). But not everything needs to be mutable; many
| elements of human cognition are probably immutable as well. (A
| rough sketch of how these modules might compose follows.)
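|
| Here's that sketch: everything is stubbed and every name is
| hypothetical, just to show how the scheduler, sensory filter,
| short-term memory, tools, and LLM might wire together:
|
|   from collections import deque
|
|   def llm(prompt: str) -> str:
|       """Stub for the basic-reasoning LLM."""
|       return f"reasoned about: {prompt[:50]}"
|
|   # Tools for common tasks; arithmetic goes here, not to the LLM.
|   TOOLS = {"calc": lambda e: str(eval(e, {"__builtins__": {}}))}
|
|   class Agent:
|       def __init__(self) -> None:
|           self.short_term = deque(maxlen=8)  # small working memory
|           self.queue = deque()               # scheduler task queue
|
|       def sense(self, event: str) -> None:
|           """Sensory filter: only novel events get scheduled."""
|           if event not in self.short_term:
|               self.queue.append(event)
|
|       def step(self):
|           """One scheduler tick: pop a task, reason with memory."""
|           if not self.queue:
|               return None
|           event = self.queue.popleft()
|           memory = "; ".join(self.short_term)
|           thought = llm(f"memory: [{memory}] event: {event}")
|           self.short_term.append(event)  # augments later steps
|           return thought
|
|   agent = Agent()
|   agent.sense("user asked: what is 17 * 23?")
|   print(TOOLS["calc"]("17 * 23"))  # 391, via a tool shortcut
|   print(agent.step())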
| hathawsh wrote:
| I like to think we could quickly create a next-level AI (maybe
| AGI?) if we simply model it on the Pixar movie "Inside Out".
| The little characters inside the girl's brain are different
| LLMs with different biases. They follow a kind of script that
| adapts to the current environment. They converse with each
| other and suggest to the girl what she should do or say.
|
| I'd try the idea myself, but I have a job. :-)
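|
| Something like this, if anyone with more spare time wants to
| pick it up (the persona prompts are placeholders and llm() is a
| stub for a real model call):
|
|   PERSONAS = {
|       "Joy": "Answer optimistically: ",
|       "Fear": "Answer cautiously and list the risks: ",
|       "Disgust": "Answer critically: ",
|   }
|
|   def llm(prompt: str) -> str:
|       """Stub for a real model call."""
|       return f"<reply to: {prompt[:30]}...>"
|
|   def deliberate(situation: str) -> str:
|       """Each persona proposes a line; a mediator call picks
|       one to suggest to the 'girl'."""
|       proposals = [llm(bias + situation)
|                    for bias in PERSONAS.values()]
|       return llm("Pick the best suggestion: " +
|                  " / ".join(proposals))
|
|   print(deliberate("Should she try out for the hockey team?"))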
| ilaksh wrote:
| I think LLMs did not exist or barely existed at the time he
| wrote this.
| gremlinsinc wrote:
| One important thing you left out: the ability to reproduce and
| thus "evolve" naturally, and at scale, essentially to keep
| improving its own brain to the point that it outpaces current
| human researchers in self-improvement. If not reproduce, maybe
| reincarnate itself as version 2.0, 3.0, etc...
| davelacy wrote:
| What was used to make these animated diagrams?! Anyone know?
| turnsout wrote:
| This might as well be a patent for a perpetual motion machine...
| Until there's code, it's hot air.
| alchemist1e9 wrote:
| It's a bit buried, but eventually there are references to SOAR
| and ACT-R, which, in my crude attempt to broach cognitive
| architectures, I had understood to be the two leading models
| with tangible applied results and working code.
|
| If anybody with an understanding of that field knows some good
| open source frameworks or libraries, I suspect many beyond
| myself would be interested.
|
| It's not considered a cognitive framework, but in applied
| learning I've developed a fascination with the MuZero algorithm,
| and I've also been trying to better understand factor graphs as
| used in another less-known cognitive architecture called Sigma.
| It feels like some mashup of LLMs, RAG and vector search,
| cognitive architectures (SOAR, ACT-R, Sigma), ReAct/OPA/VOYAGER,
| with proven algorithms like MuZero might be on the verge of
| producing the next leap forward.
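|
| For anyone unfamiliar, the decision cycle those architectures
| share is a match-select-apply production loop over working
| memory. A bare-bones illustration (not any particular
| framework's API, just the shape of the cycle):
|
|   # Match rules against working memory, select one, apply it,
|   # repeat until no rule fires or the goal is done.
|   working_memory = {"goal": "make-tea", "kettle": "cold"}
|
|   # Each rule: (name, condition on WM, action mutating WM).
|   RULES = [
|       ("boil", lambda wm: wm.get("kettle") == "cold",
|        lambda wm: wm.update(kettle="hot")),
|       ("steep", lambda wm: wm.get("kettle") == "hot",
|        lambda wm: wm.update(tea="ready", goal="done")),
|   ]
|
|   def cycle(wm, max_steps=10):
|       for _ in range(max_steps):
|           if wm.get("goal") == "done":
|               break
|           matches = [(n, act) for n, cond, act in RULES
|                      if cond(wm)]
|           if not matches:
|               break
|           name, act = matches[0]  # conflict resolution: first wins
|           act(wm)
|           print(f"fired {name}: {wm}")
|
|   cycle(working_memory)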
| ilaksh wrote:
| Take a look at David Shapiro's YouTube channel. He has some
| interesting videos about building thought processes on top of
| LLMs (like GPT-4).
| visarga wrote:
| Over the years hundreds of variants of RNN, CNN and Transformer
| have been proposed, but under the same budget of weights and
| compute, and with the same dataset, they are very close in
| performance. Models don't matter, gradient descent finds a way.
| The real hero is the dataset.
|
| And what is in the training set? Language is our best repository
| for past experience. We have painstakingly collected our lessons,
| over thousands of years, and transmitted them through culture and
| books. To recreate them from scratch would take a similarly long
| time. Our culture is smarter than us, it is the result of our
| history.
|
| For these reasons, I believe the real secret is in the
| training set. I don't think the problem was the model, but
| everything else around it.
___________________________________________________________________
(page generated 2024-01-05 23:01 UTC)