[HN Gopher] Project Sid: Many-agent simulations toward AI civili...
___________________________________________________________________
Project Sid: Many-agent simulations toward AI civilization
Author : talms
Score : 155 points
Date : 2024-11-03 19:09 UTC (3 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| jlaneve wrote:
| Here's their blog post announcement too:
| https://digitalhumanity.substack.com/p/project-sid-many-agen...
| isoprophlex wrote:
| Now these seem to be truly artificially intelligent agents.
| Memory, volition, autonomy, something like an OODA loop or
| whatever you want to call it, and a persistent environment. Very
| nice concept, and I'm positive the learnings can be applied to
| more mundane business problems, too.
|
| If only I could get management to understand that a bunch of
| prompts shitting into each other isn't "cutting-edge agentic
| AI"...
|
| But then again _their_ jobs probably depend on selling something
| that looks like real innovation happening to the C-levels...
| Carrok wrote:
| > If only I could get management to understand that a bunch of
| prompts shitting into each other isn't "cutting-edge agentic
| AI"...
|
| It's unclear to me how the linked project is different from
| what you described.
|
| Plenty of existing agents have "memory" and many other things
| you named.
| jsemrau wrote:
| >If only I could get management to understand that a bunch of
| prompts shitting into each other isn't "cutting-edge agentic
| AI"...
|
| It should never be this way. Even with narrow AI, there needs
| to be a governance framework that helps measure the output and
| capture potential risks (hallucinations, wrong data or links,
| wrong summaries, etc.).
| echelon wrote:
| All of their domains and branding are .al
|
| I had no idea .al was even a domain name. That's wild. I wonder
| how many of those are going to take off.
| semanticc wrote:
| .al is just the TLD for Albania, just as .ai is for Anguilla.
| No idea why anyone would choose the former.
| aithrowawaycomm wrote:
| Reading the paper, this seems like putting the cart before the
| horse: the agents individually are not actually capable of
| playing Minecraft and cannot successfully perform the tasks
| they've been assigned or volunteered for, so in some sense the authors
| are having dogs wear human clothes and declaring it's a human-
| like civilization. Further, crucial things are essentially hard-
| coded: what types of societies are available and (I believe) the
| names of the roles. I am not exactly sure what the social
| organization is supposed to imply: the strongest claim you could
| make is that the agent framework could work for video game NPCs
| because the agents stick to their roles and factions. The claim
| that agents "can use legal structures" strikes me as especially
| specious, since "use the legal structure" is hard-wired into the
| various agents' behavior. Trying to extend all this to actual
| human society seems ridiculous, and it does not help that the
| authors blithely ignore sociology and anthropology.
|
| There are some other highly specious claims:
|
| - I said "I believe" the names of the roles are hard-coded, but
| unless I missed something the information is unacceptably vague.
| I don't see anything in the agent prompts that would make them
| create new roles, or assign themselves to roles at all. Again I
| might be missing something, but the more I read the more confused
| I get.
|
| - the claim that the agents formed long-term social relationships
| over the course of 12 Minecraft days: that's only four real
| hours, and the agents experience real time, so the length of a
| Minecraft day is immaterial! I think "form long-term social
| relationships" and "use legal structures" aren't merely immodest,
| they're dishonest.
|
| - the meme / religious transmission stuff totally ignores
| training data contamination with GPT-4. The summarized meme
| clearly indicates awareness of the real-world Pastafarian meme,
| so it is simply wrong to conclude that this meme is being
| "transmitted," when it is far more likely that it was _evoked_ in
| an agent that already knew the meme. Why not run this experiment
| with a truly novel fake religion? Some of the meme examples do
| seem novel, like "oak log crafting syndrome," but others like
| "meditation circle" or "vintage fashion and retro projects" have
| nothing to do with Minecraft and are almost certainly GPT-4
| hallucinations.
|
| In general using GPT-4 for this seems like a terrible mistake (if
| you are interested in doing honest research).
| jsemrau wrote:
| You are on the right track in my opinion. The key is to encode
| the interface between the game and the agent so that the agent
| can make a straightforward choice. For example, by giving the
| agent the state of an n x n board as the world model, and then a
| finite set of choices, an agent is capable of playing the game
| robustly and explaining the decision to the game master. This
| gives the illusion that the agent reasons. I guess my point is
| that it's an encoding problem: break the world model down into
| a simple choice.
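|
| A minimal sketch of what that encoding could look like in
| Python (the ask_llm() callable is a stand-in for whatever LLM
| client is used; it is an illustrative assumption, not anything
| from the paper):
|
|   import random
|
|   def render_board(board):
|       # Serialize an n x n board into a plain-text grid
|       # ("." = empty, "X"/"O" = occupied).
|       return "\n".join(" ".join(row) for row in board)
|
|   def legal_moves(board):
|       # The finite choice set: every empty cell.
|       n = len(board)
|       return [(r, c) for r in range(n) for c in range(n)
|               if board[r][c] == "."]
|
|   def choose_move(board, ask_llm):
|       # ask_llm(prompt) -> str is a placeholder LLM call.
|       moves = legal_moves(board)
|       menu = "\n".join(f"{i}: row {r}, col {c}"
|                        for i, (r, c) in enumerate(moves))
|       prompt = ("You are playing X on this board:\n"
|                 f"{render_board(board)}\n"
|                 "Reply with the number of exactly one of "
|                 f"these moves:\n{menu}")
|       reply = ask_llm(prompt)
|       try:
|           return moves[int(reply.strip())]
|       except (ValueError, IndexError):
|           # Invalid reply: fall back to a random legal move.
|           return random.choice(moves)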
|
| [1] https://jdsemrau.substack.com/p/evaluating-consciousness-
| and...
| airstrike wrote:
| I've thought about this a lot. I'm no philosopher or AI
| researcher, so I'm just spitballing... but if I were to try my
| hand at it, I think I'd like to start from "principles" and let
| systems evolve or at least be discoverable over time.
|
| Principles would be things like self-preservation, food, shelter,
| procreation, communication, and memory, all seen through a
| risk-reward calculation prism. Maybe establishing what is
| "known" vs what is
| "unknown" is a key component here too, but not in such a binary
| way.
|
| "Memory" can mean many things, but if you codify it as a function
| of some type of subject performing some type of action leading to
| some outcome with some ascribed "risk-reward" profile compared to
| the value obtained from empirical testing that spans from very
| negative to very positive, it seems both wide encompassing and
| generally useful, both to the individual and to the collective.
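|
| A rough sketch of memory codified that way (the field names and
| the simple running-average update are illustrative assumptions,
| not something specified above):
|
|   from dataclasses import dataclass
|
|   @dataclass
|   class Memory:
|       subject: str        # who acted, e.g. "self", "wolf"
|       action: str         # what was done, e.g. "ate berries"
|       outcome: str        # what happened, e.g. "got sick"
|       risk_reward: float  # empirical value in [-1.0, 1.0]
|       observations: int = 1
|
|       def update(self, observed: float) -> None:
|           # Each new empirical observation nudges the stored
|           # risk-reward estimate (a running average).
|           self.observations += 1
|           self.risk_reward += (
|               (observed - self.risk_reward) / self.observations)
|
|   # Example: eating red berries seemed neutral, then went badly.
|   m = Memory("self", "eat red berries", "stomach ache", 0.0)
|   m.update(-0.8)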
|
| From there you derive the need to connect with others, disputes
| over resources, the need to take risks, explore the unknown,
| share what we've learned, refine risk-rewards, etc. You can guide
| the civilization to discover certain technologies, inventions,
| or locations we've defined ex ante as their godlike DM, which is
| a bit like cheating because it puts their development "on rails"
| but also makes it more useful, interesting, and relatable.
|
| It sounds computationally prohibitive, but the game doesn't need
| to play out in real time anyway...
|
| I just think that you can describe _a lot_ of the human condition
| in terms of "life", "liberty", "love/connection" and "greed".
|
| Looking at the video in the repo, I don't like how this throws
| "cultures", "memes" and "religion" into the mix instead of
| letting them emerge from the need to communicate and share the
| belief systems that grow out of our collective memories; as it
| stands, it seems like a distinction without a difference for the
| purposes of analyzing this. Also, "taxes are high!" without the
| underlying "I don't have enough resources to get by" seems too
| much like a Mechanical Turk.
| grugagag wrote:
| Many of these projects are an inch deep in intelligence and
| miles deep in the current technology. Some things will see
| tremendous benefits, but as far as artificial intelligence goes,
| we're not there yet. I'm thinking gaming will benefit a lot from
| these.
| farias0 wrote:
| You mean we're not there in simulating an actual human brain?
| Sure. But we're seeing AI work like a human well enough to be
| useful; isn't that the point?
| jsemrau wrote:
| Memory is really interesting. For example, if you play 100,000
| rounds of 5x5 Tic Tac Toe, do you really need to remember game
| 51247, or do you recognize and remember a winning pattern? In
| Reinforcement Learning you would revise the policy based on each
| win. How would that work for genAI?
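|
| A toy sketch of the RL version of that idea, remembering a value
| per board pattern rather than per game (a simplified, assumed
| tabular update, not anything specific to 5x5 Tic Tac Toe):
|
|   from collections import defaultdict
|
|   # Value table keyed by board pattern, not by game number:
|   # 100,000 games collapse into the distinct patterns visited.
|   values = defaultdict(float)
|
|   def board_key(board):
|       # Flatten a board into a hashable pattern string.
|       return "".join("".join(row) for row in board)
|
|   def update_pattern(key, result, lr=0.1):
|       # result: +1 win, -1 loss, 0 draw. Nudge the remembered
|       # value of this pattern toward the observed outcome
|       # (a crude policy-revision step).
|       values[key] += lr * (result - values[key])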
| Tiberium wrote:
| Honestly I'm really excited about this. I've always dreamed of
| full-blown sandbox games with extremely advanced NPCs (which
| current LLMs can already kinda emulate), but on a bigger scale.
| In just a few decades this will finally be made into proper
| games. I can't wait.
| aleph_minus_one wrote:
| > Honestly I'm really excited about this. I've always dreamed
| of full-blown sandbox games with extremely advanced NPCs (which
| current LLMs can already kinda emulate), but on a bigger scale.
|
| I don't believe that you want this. Even really good players
| don't have a chance against super-advanced NPCs (think how
| chess grandmasters have barely any chance against modern chess
| programs running on a fast computer). You will simply get
| crushed.
|
| What you likely want are NPCs that "behave more human-like (or
| animal-like)" - whatever that means.
| Tiberium wrote:
| Oh, I should've clarified - I don't want to _fight_ against
| them, I just want to watch and sometimes interfere to see how
| the agents react ;) A god game like WorldBox/Galimulator, if
| you will. Or observer mode in tons of games like almost all
| Paradox ones.
| aleph_minus_one wrote:
| > I just want to watch and sometimes interfere to see how
| the agents react ;)
|
| Even there, I am not sure whether, if the AI becomes too
| advanced, it will be of interest to many players ( _you_
| might of course nevertheless be interested):
|
| Here, the relevant comparison is to watching (the past)
| games of AlphaGo against Go grandmasters, where even the
| highly qualified commentators had insane difficulties
| explaining AlphaGo's moves because many of the moves were
| so different from the strategy of any Go game before. The
| commentators could only accept that these highly advanced
| moves _did_ crush the Go grandmaster opponents.
|
| In my opinion, the "typical" sandbox game player wants to
| watch something that he still can "somewhat" grasp.
| com2kid wrote:
| I'm working on something similar:
| https://www.generativestorytelling.ai/tinyllmtown/index.html
| - a small town where all NPCs are simulated using a small
| LLM. They react to everything the hero does, which means no
| more killing a dragon and having no one even mention it.
|
| Once I release it, I'll have it simulate 4 hours every 2
| hours or so of real time, and visitors can vote on what
| quest the hero undertakes next.
|
| The simulation is simpler; I am aiming to keep everything
| at a size that can run on a local GPU with a small model.
|
| Right now you can just watch the NPCs try to figure out
| love triangles, hide their drinking problems, complain
| about carrots, and celebrate when the hero saves the town
| yet again.
| kgeist wrote:
| >Even really good players don't have a chance against super-
| advanced NPCs
|
| I guess you can make them dumber by randomly switching to
| hardcoded behavioral trees (without modern AI) once in a
| while so that they make mistakes (while feeling pretty
| intelligent overall), and the player would then have a chance
| to outsmart them.
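|
| Roughly, a per-decision coin flip between the scripted tree and
| the model; a toy sketch (both policies here are placeholders):
|
|   import random
|
|   def scripted_policy(state):
|       # Placeholder for a hardcoded behaviour tree: cheap,
|       # predictable, occasionally exploitable.
|       return "patrol"
|
|   def npc_action(state, llm_policy, blunder_rate=0.2):
|       # With probability blunder_rate, use the scripted tree so
|       # the NPC sometimes makes human-beatable mistakes.
|       if random.random() < blunder_rate:
|           return scripted_policy(state)
|       return llm_policy(state)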
| ted_bunny wrote:
| Game designers have barely scratched the surface of NPC
| modeling even as it is. RimWorld is considered deep, but it
| doesn't come close.
| jsemrau wrote:
| I think it can be quite interesting, especially if you consider
| different character types (in Anthropic lingo, "personality").
| The only problem right now is that using a
| proprietary LLM is incredibly expensive. Therefore having a
| local LLM might be the best option. Unfortunately, these are
| still not on the same level as their larger brethren.
|
| [1] https://jdsemrau.substack.com/p/evaluating-consciousness-
| and...
| aleph_minus_one wrote:
| The video cannot be played in Mozilla Firefox (Windows); the
| browser claims that the file is damaged.
| wslh wrote:
| I cannot open the PDF, is it available somewhere else?
| NoboruWataya wrote:
| This seems very cool - I am sceptical of the supposed benefits
| for "civilization" but it could at least make for some very
| interesting sim games. (So maybe it will be good for Civilization
| more so than civilization.)
| caseyy wrote:
| Indeed sounds better for Civilization than civilization. This
| could be quite exciting for gaming.
| dmix wrote:
| GTA6 suddenly needs another 2 years :)
| bbor wrote:
| Yeah, I was disappointed (and thrilled, from a p(doom)
| perspective) to see it implemented in Minecraft instead of
| Civilization VI, Humankind, or any of the main Paradox grand
| strategies (namely Stellaris, Victoria, Crusader Kings, and
| Europa Universalis). To say the least, the stakes are higher
| and more realistic than "let's plan a feast" / "ok, I'll gather
| some wood!"
|
| To be fair, they might tackle this in the paper -- this is a
| preprint of a preprint, somehow...
| m0llusk wrote:
| Interesting context, but it highlights all the problems of machine
| learning models: the lack of reasoning and abstraction and so on.
| Hard to say yet how much of an issue this might be, but the
| medium will almost certainly reveal something about our potential
| options for social organization.
| zombiwoof wrote:
| Agentic is an annoying word.
| catlifeonmars wrote:
| This looks like it is a really cool toy.
|
| It does not strike me as particularly useful from a scientific
| research perspective. There does not appear to be much thought
| put into experimental design and really no clear objectives. Is
| the bar really this low for academic research these days?
| gmuslera wrote:
| They will probably fall fast into tragedy-of-the-commons
| situations. We developed most of our civilization while there was
| enough room for growth and big decisions were centralized, and
| started to get into real trouble when things became global
| enough.
|
| With AIs some of those "protections" may not be there. And
| hardcoding strategies to avoid this may already put a limit on
| what we are simulating.
| interstice wrote:
| Does this mean that individual complexity is a natural enemy of
| group cohesiveness? Or is individual 'selfishness' more a
| product of evolutionary background?
|
| On our planet we don't have ant colony dynamics at the physical
| scale of high intelligence (that I know of), but there are very
| physical limitations to things like food sources.
|
| Virtual simulations don't have the same limitations, so the
| priors may be quite different.
| gmuslera wrote:
| Taking the "best" course of action from your own point of
| view could not be so good from a more broad perspective. We
| might have evolved some small group collaboration approaches
| that in the long run plays better, but in large groups that
| doesn't go that well. And for AIs trying to optimize
| something without some big picture vision, things may go
| wrong faster.
| nachoab wrote:
| Really interesting, but I'm curious how civilization here holds
| up without deeper human-like complexity; it feels like it might
| lean more toward scripted behaviors than real societies.
| userbinator wrote:
| _feels like it might lean more toward scripted behaviors than
| real societies_
|
| Guess what's happening with "real societies" now... There's a
| reason "NPC" is used as an insult.
| luxuryballs wrote:
| Just yesterday I was wondering how the Midjourney-equivalent
| world-gen mod for Minecraft might be coming along. Imagine
| prompting the terrain gen?? That could be pretty mind-blowing.
|
| Describe the trees, hills, vines, tree colors/patterns, castles,
| towns, details of all buildings and other features, and have it
| generate output in Minecraft as high quality as image gen can be
| in Stable Diffusion?
| caetris2 wrote:
| I've reviewed the paper and I'm confident it was fabricated
| around a collection of false claims. The claims made are not
| genuine and should not be taken at face value without peer
| review. In many cases, when you review and vet their
| applicability to the claims made, the provided charts and
| graphics are sophisticated forgeries.
|
| It is currently not possible for any kind of LLM to do what is
| being proposed. Maybe the intentions are good with regard to
| commercial interests, but I want to be clear: this paper seems
| to indicate that election-related activities were coordinated by
| groups of AI agents in a simulation. These kinds of claims
| require substantial evidence, and that was not provided.
|
| The prompts that are provided are not in any way connected to
| the applied usage of LLMs that is described.
| bitwize wrote:
| I'm reminded of Dwarf Fortress, which simulates thousands of
| years of dwarf world time, the changing landscapes and the rise
| and fall and rise and fall of dwarf kingdoms, then drops seven
| player-controlled dwarves on the map and tells the player "have
| fun!" It'd be a useful toy model perhaps for identifying areas of
| investigation to see if it can predict behavior of real
| civilizations, but I'm not seeing any AI breakthroughs here.
|
| Maybe when Project Sid 6.7 comes out...
| sweetkimchi wrote:
| interesting
___________________________________________________________________
(page generated 2024-11-03 23:00 UTC)